MAP SEEKING CIRCUITS

MAP SEEKING CIRCUITS: A NOVEL METHOD OF DETECTING AUDITORY EVENTS USING ITERATIVE TEMPLATE MAPPING

B. Jerry G. Gregoire and Robert C. Maher
Department of Electrical and Computer Engineering, Montana State University, Bozeman MT 59717
[email protected], [email protected]

ABSTRACT

This paper reports on a new algorithm to detect the presence of a known acoustic signal in an unknown source. The algorithm, Map Seeking Circuits, has been successfully used in the visual domain. It seeks an appropriate transform that will match a stored template to an unknown signal. The algorithm uses superposition to significantly reduce the computational complexity of searching for a given feature in a signal. As a result, the computational cost grows linearly rather than exponentially as the complexity of the signal increases. The algorithm was tested with a corpus of six instruments. Results varied from 66% for the piano to 94% for the horn.

Index Terms: map seeking circuits, acoustic, detection, template matching

1. INTRODUCTION

The human auditory system is quite adept at recognizing sounds. It appears that the auditory system uses both source separation and recognition to accomplish this task. Acoustic source separation often falls in the category of Computational Auditory Scene Analysis (CASA). Although progress has been made in CASA research, nothing has come close to the capability of the human auditory system.

In the past several years, research has focused on sound identification or classification. Most of the proposed techniques rely on traditional pattern classification techniques, which require a clean signal with little noise. This work proposes using a new algorithm with the goal of detecting acoustic events in a noisy background. In this context, noise is defined as any signal added to the sought-after target. This paper reports on work showing that the algorithm is capable of identifying a clean target.
Subsequent work will test the algorithm with noise added to the input signal.

Classification and detection techniques fall into two major categories: feature-based classifiers and template matching. Feature-based classifiers extract a feature vector from a signal and typically use a clustering algorithm such as k-means to discriminate between groups for classification or between individuals for detection. Template methods use a known representation of a target signal and attempt to match it to a pattern. A common problem for template matching techniques is computational complexity that increases exponentially with the dimensionality of the data set. To overcome this, transforms are often used to produce invariance along one or more dimensions [1].

Arathorn proposed a novel template matching technique, Map Seeking Circuits (MSCs), to overcome the combinatorial explosion of template matching without the need to define invariant transforms [2]. An MSC seeks an appropriate set of transforms that map a stored template to an unknown signal. The algorithm uses superposition with an iterative matching process to converge on the best set of transforms that map a template to a target in an input signal.

An MSC consists of one or more layers and a set of templates. Each layer represents a dimension and an associated transform such as translation, scale, or rotation. The algorithm performs a set of transforms at each layer and sums the result. The result is then sent to the following layer, where the process is repeated for another dimension. The algorithm depends on the Ordering Principle of Superposition [3]. The principle states that if matches are computed between a pattern, A, and a superposition of a set of patterns, the match will be greatest for the pattern within the superposition that is most like A. The use of superposition reduces the computational complexity from exponential growth to linear growth, thus making the problem tractable.
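As a rough illustration of how layered superposition avoids testing every combination of transforms, the following Python sketch runs a two-layer MSC on a one-dimensional toy signal. It is not the authors' implementation. Everything specific in it is an assumption made for the example: circular shifts as the transforms, an inner product as the match measure, the gain update `g - k * (1 - q / max(q))` as the competition rule, and all function and parameter names.

```python
def shift(x, s):
    """Circularly shift the list x to the right by s samples."""
    n = len(x)
    return [x[(i - s) % n] for i in range(n)]


def dot(a, b):
    """Match measure (assumed): plain inner product."""
    return sum(p * q for p, q in zip(a, b))


def superpose(vectors, gains):
    """Gain-weighted sum (superposition) of equal-length vectors."""
    return [sum(g * v[i] for g, v in zip(gains, vectors))
            for i in range(len(vectors[0]))]


def compete(gains, matches, k):
    """Assumed competition rule: the best match keeps its gain, others decay."""
    best = max(matches)
    return [max(0.0, g - k * (1.0 - q / best)) for g, q in zip(gains, matches)]


def msc_two_layer(signal, template, coarse, fine, k=0.5, max_iter=100):
    """Return (coarse_shift, fine_shift) mapping signal onto template, or None."""
    g1 = [1.0] * len(coarse)
    g2 = [1.0] * len(fine)
    for _ in range(max_iter):
        # Forward path: transform the input, superpose, pass to the next layer.
        t1 = [shift(signal, a) for a in coarse]
        f1 = superpose(t1, g1)
        t2 = [shift(f1, b) for b in fine]
        # Backward path: inverse layer-2 transforms of the template, superposed.
        b1 = superpose([shift(template, -b) for b in fine], g2)
        # Each transform is matched against a superposition, not each combination.
        q1 = [dot(v, b1) for v in t1]
        q2 = [dot(v, template) for v in t2]
        if max(q1) <= 0 or max(q2) <= 0:
            return None  # null condition: no transform produces a match
        g1, g2 = compete(g1, q1, k), compete(g2, q2, k)
        alive1 = [a for a, g in zip(coarse, g1) if g > 0]
        alive2 = [b for b, g in zip(fine, g2) if g > 0]
        if len(alive1) == 1 and len(alive2) == 1:
            return alive1[0], alive2[0]
    return None


template = [1, 0, 0, 2, 0, 1, 0, 0, 0, 0, 0, 0]
signal = shift(template, -6)  # the template displaced by 6 samples
result = msc_two_layer(signal, template, coarse=[0, 4, 8], fine=[0, 1, 2, 3])
print(result)  # (4, 2): a total shift of 4 + 2 = 6 realigns the signal
```

In this toy setup each iteration evaluates 3 + 4 = 7 transforms instead of the 3 × 4 = 12 combined shifts an exhaustive search would test. With more layers the per-iteration count grows as a sum of the layer sizes rather than as their product.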
MSCs employ a nonlinear competition function to cull the poorer transforms. The process is iterative and continues until it converges to a solution. The ordering principle of superposition ensures that the MSC finds the best transform set that maps the template to the target within the input signal. Work has been submitted showing that the MSC algorithm will converge to either a set of unique transforms, one for each layer, or a null condition, which indicates that a mapping of the template to the test signal is not possible with the given set of transforms [4].

1-4244-0535-1/06/$20.00 ©2006 IEEE

In this paper we expand on previous work that applies the MSC concept to acoustic signals: Acoustic Map Seeking Circuits (AMSCs). Previously, we demonstrated a single-layer AMSC using the amplitude of an instrument’s spectrum to identify an input signal [5]. The template and signal were limited to the sustained portion of the signal. In this paper we demonstrate a three-layer AMSC whose layers apply amplitude, time, and frequency transforms, respectively. We also allow the signal to have an attack and decay portion. Research has shown that the attack portion of an acoustic signal is important for its recognition by humans [6, 7, 8] and is valuable for identification of instruments by automatic means [9, 10, 11].

The AMSC uses a gammatone filter bank to create a time versus frequency representation, or gammagram, of both the input signal and the template. The gammagram is a biologically inspired frequency versus time representation of an acoustical signal. Since the bandwidth of each gammatone filter increases with the filter’s center frequency, its use also leads to a more compact representation along the frequency dimension. An instrument’s template is created by combining several gammagrams of adjacent semitones produced by the instrument. This results in a characteristic surface that represents the instrument’s resonances and temporal evolution in the time-frequency plane.
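The gammagram idea can be sketched in Python as follows. The text here does not give the filter bank parameters, so the choices below are assumptions for illustration only: fourth-order gammatone filters, the Glasberg and Moore ERB formula for bandwidth, center frequencies equally spaced on the ERB-rate scale, a bandwidth factor of 1.019, 10 ms energy frames, and all function names.

```python
import math


def erb(fc):
    """Equivalent rectangular bandwidth in Hz (Glasberg and Moore formula)."""
    return 24.7 * (4.37 * fc / 1000.0 + 1.0)


def erb_rate(f):
    """Map a frequency in Hz to the ERB-rate scale."""
    return 21.4 * math.log10(4.37 * f / 1000.0 + 1.0)


def inv_erb_rate(e):
    """Map an ERB-rate value back to a frequency in Hz."""
    return (10.0 ** (e / 21.4) - 1.0) * 1000.0 / 4.37


def center_frequencies(f_lo, f_hi, n):
    """Return n center frequencies equally spaced on the ERB-rate scale."""
    lo, hi = erb_rate(f_lo), erb_rate(f_hi)
    return [inv_erb_rate(lo + i * (hi - lo) / (n - 1)) for i in range(n)]


def gammatone_ir(fc, fs, duration=0.025, order=4):
    """Sampled gammatone impulse response, scaled to roughly unit gain at fc."""
    b = 1.019 * erb(fc)
    times = [i / fs for i in range(int(duration * fs))]
    env = [t ** (order - 1) * math.exp(-2.0 * math.pi * b * t) for t in times]
    scale = 0.5 * sum(env)
    return [e * math.cos(2.0 * math.pi * fc * t) / scale
            for e, t in zip(env, times)]


def gammagram(signal, fs, cfs, frame=0.010):
    """Rows are filter channels, columns are frames, cells are frame energies."""
    hop = int(frame * fs)
    rows = []
    for fc in cfs:
        h = gammatone_ir(fc, fs)
        # Direct-form causal FIR filtering of the signal with this channel.
        y = [sum(h[k] * signal[n - k] for k in range(min(n + 1, len(h))))
             for n in range(len(signal))]
        rows.append([sum(v * v for v in y[s:s + hop])
                     for s in range(0, len(y) - hop + 1, hop)])
    return rows


fs = 8000
cfs = center_frequencies(200.0, 3000.0, 8)
# Test tone placed exactly on the center frequency of channel 3.
tone = [math.sin(2.0 * math.pi * cfs[3] * n / fs) for n in range(400)]
gram = gammagram(tone, fs, cfs)
energy = [sum(row) for row in gram]
print([round(erb(fc), 1) for fc in cfs])  # bandwidths grow with center frequency
print(energy.index(max(energy)))  # index of the channel with the most energy
```

Because the channels widen as their center frequency rises, eight filters cover 200 Hz to 3 kHz here, which is the compactness along the frequency axis that the text refers to. A template in the sense described above would combine several such arrays computed from adjacent semitones; how they are combined is not specified in this excerpt, so that step is left out of the sketch.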
The AMSC then uses simple shifts along the time, amplitude, and frequency axes to align the template with the test signal’s gammagram. The algorithm requires that, along each dimension, at least one transform reach a minimum match value and one transform consistently produce a better match throughout the iterative process. If either of these conditions fails, the algorithm produces a null condition. A null condition indicates that the AMSC failed to find a possible mapping between the target and

