Columbia ELEN E4896 - Lecture 3 - Perception

E4896 Music Signal Processing (Dan Ellis) 2013-02-04

Lecture 3: Perception
Dan Ellis
Dept. Electrical Engineering, Columbia University
[email protected] http://www.ee.columbia.edu/~dpwe/e4896/

1. Ear Physiology
2. Auditory Psychophysics
3. Pitch Perception
4. Music Perception

ELEN E4896 MUSIC SIGNAL PROCESSING

1. Ear Physiology
• The ear is a very sensitive transducer of air pressure variations into nerve firings
    just above Brownian motion !?
• The cochlea is largely understood
    the brain is more difficult
[Figure: signal path. Outer ear, Middle ear, Inner ear (cochlea), Auditory nerve, Midbrain, Cortex]

The Ear
• Impedance matching & transduction
    pinna acts as horn
    eardrum + bones match impedance
    cochlea transduces to nerve firings
[Figure: ear anatomy. Pinna, Ear canal, Eardrum (tympanum), Middle ear bones, Cochlea (inner ear)]

The Cochlea
• Complex resonant structure
    active feedback to maintain near-ringing state
    efferent fibers?
[Figure: Cochlea, Oval window (from ME bones), Basilar Membrane (BM), Travelling wave. Resonant frequency vs. position: 16 kHz at 0 mm to 50 Hz at 35 mm. http://www.wadalab.mech.tohoku.ac.jp/FEM_BM-e.html]

Hair Cells
• Transduce mechanical motion to nerve firings
    3,000 IHCs driving 20,000 nerves
    easily damaged
[Figure: Cochlea, Basilar membrane, Tectorial membrane, Inner Hair Cell (IHC), Outer Hair Cell (OHC), Auditory nerve]

Auditory Nerve
• IHC fires near maximum displacement
    cannot fire every cycle
    some “noise”
[Figure: Firing count vs. cycle angle; local BM displacement; typical nerve signal (mV) vs. time / ms]

Nerve Responses
• Onset enhancement
• Frequency selectivity
• Dynamic range
    One fiber: ~ 25 dB dynamic range
    Hearing dynamic range > 100 dB
[Figure: Spike count vs. time for a 100 ms tone burst; threshold in dB SPL vs. (log) frequency, 100 Hz to 10 kHz; Spikes/sec vs. intensity / dB SPL]
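The figure on the Cochlea slide labels the basilar membrane's resonant frequency as running from 16 kHz at 0 mm to 50 Hz at 35 mm. A minimal sketch of such a place-to-frequency map follows. It assumes that log-frequency falls linearly with distance between those two endpoints, which is a simplification chosen here for illustration and is not stated on the slide. The names `resonant_frequency` and `position_of` are hypothetical.

```python
import math

# Endpoint values taken from the axis labels of the slide's figure.
BM_LENGTH_MM = 35.0
F_BASE_HZ = 16000.0  # resonance at the oval-window end (0 mm)
F_APEX_HZ = 50.0     # resonance at the far end (35 mm)


def resonant_frequency(position_mm):
    """ASSUMPTION: log-frequency falls linearly with distance along the BM."""
    if not 0.0 <= position_mm <= BM_LENGTH_MM:
        raise ValueError("position must lie on the basilar membrane")
    fraction = position_mm / BM_LENGTH_MM
    return F_BASE_HZ * (F_APEX_HZ / F_BASE_HZ) ** fraction


def position_of(frequency_hz):
    """Inverse map: distance in mm from the oval window for a given frequency."""
    return (BM_LENGTH_MM * math.log(F_BASE_HZ / frequency_hz)
            / math.log(F_BASE_HZ / F_APEX_HZ))


if __name__ == "__main__":
    # Under this assumed map, each octave occupies the same length of membrane.
    for f in (16000.0, 8000.0, 4000.0, 2000.0, 1000.0):
        print(f"{f:8.0f} Hz -> {position_of(f):5.2f} mm")
```

Under this assumption every octave occupies an equal length of membrane, which is one way to read the later remark that the nerve ensemble resembles a constant-Q spectrogram.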

Auditory Nerve Ensemble
• Ensemble of nerves provides full information
    similar to constant-Q log-intensity spectrogram
[Figure: freq / 8ve re 100 Hz vs. time / ms. PatSla rectsmoo on bbctmp2 (2001-02-18). Secker-Walker & Searle ’90]

Auditory Models
• Filterbank + nonlinearity
    varying (but broad) bandwidth
[Figure: block diagram. Sound, Outer/middle ear filtering, Cochlea filterbank, IHC per channel. Plot: channel vs. time / s, SlaneyPatterson 12 chans/oct from 180 Hz, BBC1tmp (20010218)]

2. Auditory Psychophysics
• Extensive study of relationship between physical (Φ) and psychological (Ψ) values
    perception is not “direct”!
• Common across all perceptual modalities
    proprioceptive force, body positioning
    vision
    hearing
• Φ - Ψ distinction is important!

Loudness perception
• Perception of physical parameter
    just noticeable difference - jnd
    magnitude scaling
• Weber’s law: $\Delta I \propto I \;\Rightarrow\; \log(L) \propto \log(I)$
• Loudness: $L \propto I^{0.3}$
    $\log(L) = 0.3 \log(I)$
    $\log_{10}(L) = 0.03\,\mathrm{dB}(I)$
    $\mathrm{dB}(I) = 33.3 \log_{10}(L)$
[Figure: Log(loudness rating) vs. sound level / dB. Hartmann (1993) classroom loudness scaling data. Power law fit: $L \propto I^{0.22}$. Textbook figure: $L \propto I^{0.3}$. Tracks 19+20. Hartmann ’90]

Equal Loudness
• Fletcher-Munson curves (1937)
    match intensity to specific 1 kHz tone
    loudness growth
[Figure: Intensity / dB SPL (0 to 120) vs. freq / Hz (100 to 10,000)]

Masking
• Limited dynamic range in cochlea
    effect within frequency “critical bands”
    basis of MPEG Audio
• Forward/backward temporal effects
[Figure: Intensity / dB vs. log freq, showing masking tone, masked threshold, absolute threshold. Surface plot: level / dB vs. time / ms and freq / Bark, showing masking tone and elevated masking threshold “skirt”. Tracks 23-25]
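The textbook power law on the Loudness perception slide, L ∝ I^0.3, together with dB(I) = 10 log10(I), gives the slide's relation dB(I) = 33.3 log10(L). The sketch below works that arithmetic in both directions. The function names are hypothetical, and the `exponent` parameter is added here so the classroom fit of 0.22 can be compared with the textbook 0.3.

```python
import math


def loudness_ratio(delta_db, exponent=0.3):
    """Loudness ratio L2/L1 for an intensity change of delta_db, if L ∝ I**exponent."""
    intensity_ratio = 10.0 ** (delta_db / 10.0)
    return intensity_ratio ** exponent


def db_for_loudness_ratio(ratio, exponent=0.3):
    """Intensity change in dB needed to scale loudness by `ratio`.

    With exponent = 0.3 the factor 10 / exponent is the slide's 33.3.
    """
    return (10.0 / exponent) * math.log10(ratio)


if __name__ == "__main__":
    # dB required to double perceived loudness under each exponent.
    for exponent in (0.3, 0.22):
        needed = db_for_loudness_ratio(2.0, exponent)
        print(f"exponent {exponent}: {needed:.1f} dB to double loudness")
```

With the textbook exponent this reproduces the common rule of thumb that roughly 10 dB doubles loudness; the smaller classroom exponent implies that a larger level change is needed for the same doubling.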

Limits of Hearing
• Test what listeners can discriminate
• Roughly...
    timing: 2 ms difference, 20 ms ordering
    tuning: ~1%
    spectral profile: single components ~ 2 dB
    phase?
    tones vs. noise...
[Figure: “two-interval forced-choice”: intervals A, B, X along time; X = A or B?]

3. Pitch Perception
• Complex (non-sinusoidal) tones give a single, fused percept
    despite harmonics resolved by cochlea
    percept is of a single pitch
    .. but pitch does NOT rely on the fundamental
[Figure: freq. chan. vs. time/s. Track 37]

“Place” models of pitch
• Hypothesis: Pitch results from activation pattern
    support: low harmonics are important
    but: pitch of noisy signals
[Figure: AN excitation vs. frequency channel, with resolved harmonics marked; broader HF channels cannot resolve harmonics. Correlate with harmonic ‘sieve’ to give pitch strength vs. frequency channel. Duifhuis et al. ’82]

“Time” models of pitch
• Autocorrelation neatly unifies pitch phenomena
    but: high-frequency modulation evokes weak pitch
[Figure: per-channel autocorrelation (freq vs. lag / ms, over time) and summary autocorrelation, lag 0 to 30 ms, with common period (pitch) marked. Meddis & Hewitt ’91]

Competing Cues
• Perhaps brains use both place & time cues
    common perceptual strategy: opportunistic combination of information
• e.g. Probabilistic combination
    if x1, x2 are independent given θ:
    $\arg\max_{\theta} \Pr(\theta \mid x) = \arg\max_{\theta} \dfrac{\Pr(\theta \mid x_1)\,\Pr(\theta \mid x_2)}{\Pr(\theta)}$
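The claims that pitch does not rely on the fundamental and that autocorrelation unifies pitch phenomena can be illustrated with a small experiment: build a tone from harmonics 2, 3 and 4 of 200 Hz, with no energy at 200 Hz itself, and look for the lag that maximizes the autocorrelation. This is a sketch under simplifying assumptions. It runs one broadband autocorrelation on the raw waveform rather than the per-channel filterbank and summary autocorrelation shown on the “Time” models slide, and the function names, sample rate and search range are all choices made here.

```python
import math


def harmonic_tone(f0_hz, harmonics, sample_rate, duration_s):
    """Sum of equal-amplitude cosines at the given harmonic numbers of f0_hz."""
    n_samples = int(sample_rate * duration_s)
    return [
        sum(math.cos(2.0 * math.pi * h * f0_hz * n / sample_rate) for h in harmonics)
        for n in range(n_samples)
    ]


def autocorrelation_pitch(signal, sample_rate, fmin_hz=50.0, fmax_hz=500.0):
    """Return sample_rate / lag for the lag with the largest raw autocorrelation.

    Only lags whose periods fall between fmax_hz and fmin_hz are searched.
    """
    min_lag = int(sample_rate / fmax_hz)
    max_lag = int(sample_rate / fmin_hz)
    best_lag, best_value = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        value = sum(signal[n] * signal[n + lag] for n in range(len(signal) - lag))
        if value > best_value:
            best_lag, best_value = lag, value
    return sample_rate / best_lag


if __name__ == "__main__":
    sample_rate = 8000
    # Harmonics 2, 3, 4 of 200 Hz: components at 400, 600, 800 Hz only.
    tone = harmonic_tone(200.0, (2, 3, 4), sample_rate, duration_s=0.1)
    print("estimated pitch / Hz:", autocorrelation_pitch(tone, sample_rate))
```

The waveform repeats every 5 ms even though no component has that period, so the strongest autocorrelation peak in the search range sits at the period of the missing fundamental.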

4. Music Perception
• Hearing music involves
    instruments
    notes
    rhythm
• Can study with subjective experiments
[Figure: Grey ’75]

Scene Analysis
• Detect separate events
    common onset
    common harmonicity
    instruments & timbre
[Figure: freq / Hz (0 to 8000) vs. time / s (0 to 9). Pierce ’83]

Consonance
• Musical intervals relate to harmonic proximity
• Pitch Helix
[Figure: Warren et al. 2003]

Rhythm
• Sensitive to periodicity
    speech? breathing? brain?
• Onsets + autocorrelation?
    variations in tapping
    4/4 vs 3/4
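The Rhythm slide's suggestion of "onsets + autocorrelation" can be sketched as follows: place detected onsets on a coarse time grid, autocorrelate that onset train, and read the beat period from the strongest lag. Everything specific here is an assumption for illustration: the onset times are made up, the 10 ms bin and the 40 to 200 BPM search range are arbitrary choices, and the function names are hypothetical. Onset detection itself is not shown.

```python
def onset_train(onset_times_s, bin_s, duration_s):
    """Binary onset train: 1.0 in every time bin that contains an onset."""
    n_bins = int(round(duration_s / bin_s))
    train = [0.0] * n_bins
    for t in onset_times_s:
        index = int(round(t / bin_s))
        if 0 <= index < n_bins:
            train[index] = 1.0
    return train


def tempo_bpm(train, bin_s, min_bpm=40.0, max_bpm=200.0):
    """Tempo whose beat period maximizes the autocorrelation of the onset train."""
    min_lag = int(round(60.0 / max_bpm / bin_s))
    max_lag = int(round(60.0 / min_bpm / bin_s))
    best_lag, best_value = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        value = sum(train[n] * train[n + lag] for n in range(len(train) - lag))
        if value > best_value:
            best_lag, best_value = lag, value
    return 60.0 / (best_lag * bin_s)


if __name__ == "__main__":
    # Hypothetical onsets: a beat every 0.5 s for 8 s, plus three off-beat notes.
    beats = [0.5 * k for k in range(16)]
    off_beats = [1.25, 3.25, 5.25]
    train = onset_train(beats + off_beats, bin_s=0.01, duration_s=8.0)
    print("estimated tempo / BPM:", tempo_bpm(train, bin_s=0.01))
```

Multiples of the beat period also produce autocorrelation peaks, which is one reason a raw autocorrelation alone cannot settle questions such as 4/4 versus 3/4; in this sketch the shortest period wins only because longer lags overlap fewer onset pairs.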

