Columbia ELEN E4896 - Chroma and Chords

E4896 Music Signal Processing (Dan Ellis), 2013-04-08

Lecture 11: Chroma and Chords
Dan Ellis
Dept. Electrical Engineering, Columbia [email protected]
http://www.ee.columbia.edu/~dpwe/e4896/

ELEN E4896 MUSIC SIGNAL PROCESSING

1. Features for Music Audio
2. Chroma Features
3. Chord Recognition


1. Features for Music Audio

• Challenges of large music databases
  - how to find "what we want"...
• Euclidean metaphor
  - music tracks as points in space
• What are the dimensions?
  - "sound" - timbre, instruments → MFCC
  - melody, chords → Chroma
  - rhythm, tempo → Rhythmic bases


MFCCs

• The standard feature for speech recognition (Logan 2000)

[Figure: MFCC processing chain: Sound → FFT X[k] → Mel scale freq. warp → log |X[k]| → IFFT → Truncate → MFCCs, with intermediate signals labeled spectra, audspec, and cepstra. Panels plotted against time / s, freq / Hz, freq / Mel, and quefrency.]


MFCC Example

• Resynthesize by imposing spectrum on noise
  - MFCCs capture instruments, not notes

[Figure: three panels against time / sec: "Let It Be - log-freq specgram (LIB-1)" (freq / Hz), "MFCCs" (coefficient), "Noise excited MFCC resynthesis (LIB-2)" (freq / Hz).]


MFCC Artist Classification

• 20 Artists x 6 albums each
  - train models on 5 albums, classify tracks from last
• Model as MFCC mean + covariance per artist
  - "single Gaussian" model
  - 20 (mean) + 10 x 19 (covariance) parameters
  - 55% correct (guessing ~5%)

[Figure: "Confusion: MFCCs (acc 55.13%)", true artist vs. recognized artist, for aerosmith, beatles, creedence_c_r, cure, dave_matthews_b, depeche_mode, fleetwood_mac, garth_brooks, green_day, led_zeppelin, madonna, metallica, prince, queen, radiohead, roxette, steely_dan, suzanne_vega, tori_amos, u2. (Ellis 2007)]


2. Chroma Features

• What about modeling tonal content (notes)?
  - melody spotting
  - chord recognition
  - cover songs...
• MFCCs exclude tonal content
• Polyphonic transcription is too hard
  - e.g. sinusoidal tracking: confused by harmonics
• Chroma features as solution...

[Figure: MIDI note number vs. time / s, two panels: Recognized and True.]


Chroma Features

• Idea: Project all energy onto 12 semitones regardless of octave
  - maintains main "musical" distinction
  - invariant to musical equivalence
  - no need to worry about harmonics?

$$C(b) = \sum_{k=0}^{N_M} B\left(12 \log_2(k/k_0) - b\right) W(k) \, |X[k]|$$

  - W(k) is a weighting; B(b) selects every b ≈ 0 mod 12

[Figure: spectrogram (freq / kHz vs. time / sec) and chroma (chroma bin A to G vs. time / frame), with the mapping from fft bin to chroma. (Fujishima 1999)]


Better Chroma

• Problems:
  - blurring of bins close to edges
  - limitation of FFT bin resolution
• Solutions:
  - peak picking - only keep energy at center of peaks
  - Instantaneous Frequency - high-resolution estimates
  - adapt tuning center based on histogram of pitches

[Figure: spectrogram (freq / kHz vs. time / sec) and chroma (chroma bin vs. time / frame), with a spectrum slice against freq / Hz.]


Chroma Resynthesis

• Chroma describes the notes in an octave... but not the octave
• Can resynthesize by presenting all octaves... with a smooth envelope
  - "Shepard tones" - octave is ambiguous
  - endless sequence illusion

$$y_b(t) = \sum_{o=1}^{M} W\left(o + \frac{b}{12}\right) \cos\left(2^{\,o + b/12} \, w_0 t\right)$$

[Figure: "12 Shepard tone spectra" (level / dB vs. freq / Hz) and "Shepard tone resynth" (freq / kHz vs. time / sec). (Ellis & Poliner 2007)]


Chroma Example

• Simple Shepard tone resynthesis
  - can also reimpose broad spectrum from MFCCs

[Figure: four panels against time / sec: "Let It Be - log-freq specgram (LIB-1)" (freq / Hz), "Chroma features" (chroma bin), "Shepard tone resynthesis of chroma (LIB-3)" (freq / Hz), "MFCC-filtered shepard tones (LIB-4)" (freq / Hz).]


Beat-Synchronous Chroma

• Drastically reduce data size by recording one chroma frame per beat

[Figure: four panels against time / sec: "Let It Be - log-freq specgram (LIB-1)" (freq / Hz), "Onset envelope + beat times", "Beat-synchronous chroma" (chroma bin), "Beat-synchronous chroma + Shepard resynthesis (LIB-6)" (freq / Hz). (Bartsch & Wakefield 2001)]


3. Chord Recognition

• Beat synchronous chroma look like chords
  - can we transcribe them?
• Two approaches
  - manual templates (prior knowledge)
  - learned models (from training data)

[Figure: beat-synchronous chroma (chroma bin vs. time / sec) annotated with note sets C-E-G, B-D-G, A-C-E, A-C-D-F...]


Chord Recognition System

• Analogous to speech recognition
  - Gaussian models of features for each chord
  - Hidden Markov Models for chord transitions

[Figure: system block diagram with train and test paths. Blocks: Audio, Labels, 100-1600 Hz BPF, 25-400 Hz BPF, Chroma, Beat track, Resample, Root normalize, Gaussian, Unnormalize, Count transitions, HMM Viterbi. Signals: beat-synchronous chroma features, chord labels, 24x24 transition matrix, 24 Gauss models. Insets show C maj and C min chroma templates. (Sheh & Ellis 2003)]


HMMs

• Hidden Markov Models are good for inferring hidden states
  - underlying Markov "generative model"
  - each state has emission distribution
  - observations tell us something about state...
  - infer smoothed state sequence

Transition probabilities p(q_{n+1} | q_n), with start state S and end state E:

    q_n \ q_{n+1}    S     A     B     C     E
    S                0     1     0     0     0
    A                0    .8    .1    .1     0
    B                0    .1    .8    .1     0
    C                0    .1    .1    .7    .1
    E                0     0     0     0     1

Example state sequence: S A A A A A A A A B B B B B B B B B C C C C B B B B B B C E

[Figure: state diagram for S, A, B, C, E; "Emission distributions" p(x|q) for q = A, q = B, q = C; "State sequence" and "Observation sequence" x_n against time step n.]


HMM Inference

• HMM defines emission distribution p(x|q) and transition probabilities p(q_n|q_{n-1})
• Likelihood of observed given state sequence:

$$p(\{x_n\} \mid \{q_n\}) = \prod_n p(x_n \mid q_n) \, p(q_n \mid q_{n-1})$$

Worked example over the four state sequences q0 q1 q2 q3 q4:

    State sequence   Transitions                     Emissions                    Product
    S A A A E        .9 x .7 x .7 x .1 = 0.0441      2.5 x 0.2 x 0.1 = 0.05       0.0022
    S A A B E        .9 x .7 x .2 x .2 = 0.0252      2.5 x 0.2 x 2.3 = 1.15       0.0290
    S A B B E        .9 x .2 x .8 x .2 = 0.0288      2.5 x 2.2 x 2.3 = 12.65      0.3643
    S B B B E        .1 x .8 x .8 x .2 = 0.0128      0.1 x 2.2 x 2.3 = 0.506      0.0065
                     Σ = 0.1109                                                   Σ = p(X | M) = 0.4020
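The arithmetic on the HMM Inference slide can be checked with a short script. This is a minimal sketch, not the lecture's own code: the transition and emission values are read off the path products on the slide, the B → A transition is assumed to be 0 (the slide lists no path that returns to A), and all names (`START`, `TRANS`, `END`, `EMIT`, and the function names) are hypothetical. It sums the joint likelihood over every state path by brute force and compares that with the forward recursion, which computes the same total without enumerating paths.

```python
from itertools import product

STATES = ("A", "B")
# Values read off the slide's path products; B -> A = 0 is an assumption.
START = {"A": 0.9, "B": 0.1}                      # p(q1 | S)
TRANS = {("A", "A"): 0.7, ("A", "B"): 0.2,
         ("B", "A"): 0.0, ("B", "B"): 0.8}        # p(q_n | q_{n-1})
END = {"A": 0.1, "B": 0.2}                        # p(E | q3)
# Emission likelihoods p(x_n | q) for the three observations, one dict per step.
EMIT = [{"A": 2.5, "B": 0.1},
        {"A": 0.2, "B": 2.2},
        {"A": 0.1, "B": 2.3}]


def path_likelihood(path):
    """Product of transitions and emissions along one state path."""
    p = START[path[0]] * EMIT[0][path[0]]
    for n in range(1, len(path)):
        p *= TRANS[(path[n - 1], path[n])] * EMIT[n][path[n]]
    return p * END[path[-1]]


def brute_force_likelihood():
    """Sum the path likelihood over every possible state sequence."""
    return sum(path_likelihood(p) for p in product(STATES, repeat=len(EMIT)))


def forward_likelihood():
    """Same total via the forward recursion, one time step at a time."""
    alpha = {q: START[q] * EMIT[0][q] for q in STATES}
    for n in range(1, len(EMIT)):
        alpha = {q: sum(alpha[r] * TRANS[(r, q)] for r in STATES) * EMIT[n][q]
                 for q in STATES}
    return sum(alpha[q] * END[q] for q in STATES)


def best_path():
    """Most likely state path and its likelihood, found by exhaustive search."""
    scored = ((p, path_likelihood(p)) for p in product(STATES, repeat=len(EMIT)))
    return max(scored, key=lambda item: item[1])


if __name__ == "__main__":
    print(round(forward_likelihood(), 4))   # total likelihood p(X | M)
    print(best_path())
```

Exhaustive search is only workable for a toy example like this one; for real chord sequences the Viterbi algorithm named on the Chord Recognition System slide finds the best path with the same step-by-step structure as the forward recursion, replacing the sum with a max.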

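The chroma projection C(b) from the Chroma Features slide can be sketched in a few lines. This is an illustration under stated assumptions rather than the method used for the lecture's figures: B is taken to be a hard nearest-semitone selector, W(k) defaults to 1, `k0` is the (possibly fractional) FFT bin index of the reference pitch, and the function name `chroma_from_magnitudes` is hypothetical.

```python
import math


def chroma_from_magnitudes(mags, k0, weight=None):
    """Fold FFT magnitudes |X[k]| into 12 chroma bins.

    mags   : sequence of magnitudes, index = FFT bin k
    k0     : bin index of the reference pitch (assumed, may be fractional)
    weight : optional function W(k); defaults to 1 for every bin (assumed)
    """
    chroma = [0.0] * 12
    for k in range(1, len(mags)):          # skip k = 0, where log2 is undefined
        semitones = 12.0 * math.log2(k / k0)
        b = int(round(semitones)) % 12     # nearest semitone, octave discarded
        w = 1.0 if weight is None else weight(k)
        chroma[b] += w * abs(mags[k])
    return chroma


if __name__ == "__main__":
    spectrum = [0.0] * 129
    spectrum[32] = 1.0    # one octave above k0 = 16
    spectrum[64] = 0.5    # two octaves above k0
    spectrum[24] = 2.0    # about 7 semitones above k0
    print(chroma_from_magnitudes(spectrum, k0=16))
```

The hard assignment shows the "blurring of bins close to edges" problem from the Better Chroma slide: a peak that falls near a semitone boundary lands entirely in one bin or the other depending on small tuning differences.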

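The Shepard tone formula y_b(t) from the Chroma Resynthesis slide can be turned into samples directly. The sketch below assumes a Gaussian envelope W over the log-frequency axis, a base frequency of 27.5 Hz for w0, six octaves, and an 8 kHz sample rate; none of these values come from the slides, and the name `shepard_tone` is hypothetical.

```python
import math


def shepard_tone(b, duration=0.5, sr=8000, f0=27.5, n_octaves=6,
                 center=3.5, width=1.0):
    """Samples of y_b(t): octave-spaced cosines under a smooth envelope.

    b         : chroma bin, 0..11
    f0        : base frequency in Hz, so w0 = 2*pi*f0 (assumed value)
    center    : envelope peak position in octaves above f0 (assumed)
    width     : envelope spread in octaves (assumed)
    """
    w0 = 2.0 * math.pi * f0

    def envelope(x):
        # Gaussian bump over log-frequency; an assumed shape for W.
        return math.exp(-0.5 * ((x - center) / width) ** 2)

    # One (weight, angular frequency) pair per octave o = 1..M.
    partials = [(envelope(o + b / 12.0), 2.0 ** (o + b / 12.0) * w0)
                for o in range(1, n_octaves + 1)]
    n_samples = int(duration * sr)
    return [sum(w * math.cos(omega * i / sr) for w, omega in partials)
            for i in range(n_samples)]


if __name__ == "__main__":
    samples = shepard_tone(b=3)
    print(len(samples), max(abs(s) for s in samples))
```

To resynthesize from a chroma sequence, one option is to scale the twelve tones by the chroma values in each frame and add them up; with the defaults above the highest partial stays below the 4 kHz Nyquist limit.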