Multidimensional Representation of Personal Quality of Vowels and its Acoustical Correlates
(Matsumoto, Hiki, Sone, Nimura; 1973)
Pedro Davalos
CPSC 689-604
Feb 27, 2007

Outline
• Introduction
• Test 1: /a/
• Test 2: Hybrid
• Test 3: Vowels
• Conclusions

Intro: Goal
• Determine how acoustical properties influence the recognition of speakers.

Intro: Background
• “Personal Quality”
– Does NOT mean high vocal quality, as required to perform at an opera
– Refers to the speaker’s characteristics and the voice attributes that allow speaker recognition

Intro: Approach
[Diagram: processing flow]
• Voice Samples (v1, v2, …, vn) → Classified by Listeners → matrix X (voices v1 … vn by voices v1 … vn)
• X → Kruskal’s Scaling: S = mdscale(X,d) → Psychological Auditory Space (PAS)
• Voice Samples → Feature Extraction → Acoustical Parameters (p1 … p6 for each voice v1 … vn) → Sensory Auditory Space (Physical Space)
• Multiple Correlation: C = xcorr2(S,P)
• Linear Regression: V = regress(p,C)
• Graph Theory

Intro: Acoustical Parameters
• Mean Fundamental Pitch Frequency – log F0
• Fluctuation of Fundamental Pitch Period – σ(ΔT/T)
• Slope of Glottal Source Spectrum – α
• Formant Frequencies – F1, F2, F3
Glottal Source Characteristics - U(s)
Vocal Tract Characteristics - T(s)

Intro: ___ Recognition
• Speech Recognition: They are all the same! → “Hello World”
• Speaker Recognition: They are all different! → s1, s2, s3, & s4

Test 1 - /a/: Specs
• Data Samples:
– 8 speakers, vowel /a/ at 3 frequencies: 120, 140, & 160 Hz → 24 samples
• Listener Testing:
– 6 listeners each hear every pair 9 times in both presentation orders (6 × 9 × 2) → 108 values/pair
– Listeners classify voice pairs as “same talker” or “different talker”

Test 1 - /a/: Results
[Figure: 3D-PAS; labels 120, 140, 160]
[Figure: Correlation between PAS and Acoustical Parameters]
• F1 & F2 are related to both A1 & A2
• Lower F0, greater contribution to personal quality

Test 1 - /a/: ROC
[Figure: Receiver Operating Characteristics]
• Lower F0, greater contribution to personal quality

Test 1 - /a/: Results (var)
[Figure]

Test 2 – Hybrid: Specs
• Data Samples:
– 5 speakers, vowel /a/ at 140 Hz
– Data set altered by generating a fixed glottal source (removing the fluctuation of fundamental pitch period variable)
• 6 listeners repeat 10 trials for each pair

Test 2 – Hybrid: Results
• F3 became similar to F1 & F2
• Vocal Tract has a greater contribution than Glottal (other than F0), since hybrid voices tend to be closer to the original with the same formants.
[Figure: 2D PAS of Hybrid Voices. Vg: V-Formant, g-glottal source]

Test 3 – Vowels: Specs
• Data Samples
– 8 Speakers, 5 vowels (40 Voices), all at 164 Hz
• Listeners
– 13 people listened 3 times to all voice pairs (78 Samples)

Test 3 – Vowels: Results (PAS)
• Since talkers are clustered, the perceptual cues of personal quality common to different vowels are involved in listener judgment

Test 3 – Vowels: Results (ROC)
[Figure: Receiver Operating Characteristics]

Test 3 – Vowels: Results (xcorr)
• α: Slope of glottal source spectrum
• σ(ΔT/T): Rapid fluctuation of pitch period
• F1, F2, F3
• Large correlations and similar directions

Test 3 – Vowels: Results (var)
[Figure]

Conclusions (1)
• F0 is relatively the most significant contributor to the perception of personal quality
• With F0 held constant, Vocal Tract and Glottal Characteristics contribute to different perceptual dimensions from each other
• Vocal Tract contributions to the perception of personal quality vary with different vowels
• The perceptual dimensions of F0, F1, α (slope of glottal source spectrum), and fluctuation of F0 period are independent of vowel

Conclusions (2)
• Authors claim success because…
– Talkers cluster on the A1-A2 PAS
– The P(c) from the listeners was about 60-70%
– The results are uniform despite different listeners
– Acoustical parameters were found to influence perception of personal quality
• Future Work:
– Evaluate other
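
The listener protocol of Test 1 (6 listeners, 9 repetitions, both presentation orders) feeds the matrix X that is passed to Kruskal's scaling. As a rough illustration only, the sketch below counts the judgments collected per pair and turns "different talker" counts into a symmetric dissimilarity matrix. Using the fraction of "different talker" responses as the dissimilarity is an assumption made here, not something the slides state, and the function names and example counts are hypothetical.

```python
def trials_per_pair(listeners, repetitions, orders=2):
    """Number of same/different judgments collected for one voice pair."""
    return listeners * repetitions * orders


def dissimilarity_matrix(n_voices, different_counts, total_per_pair):
    """Build a symmetric matrix X where X[i][j] is the fraction of trials
    in which voices i and j were judged 'different talker'.

    Assumption: this fraction stands in for perceptual dissimilarity;
    the slides do not say how X was actually derived.
    """
    X = [[0.0] * n_voices for _ in range(n_voices)]
    for (i, j), count in different_counts.items():
        X[i][j] = X[j][i] = count / total_per_pair
    return X


# Test 1 setup from the slides: 6 listeners, 9 repetitions, 2 presentation orders.
total = trials_per_pair(listeners=6, repetitions=9)

# Made-up 'different talker' counts for three voices (illustrative data only).
counts = {(0, 1): 27, (0, 2): 81, (1, 2): 54}
X = dissimilarity_matrix(3, counts, total)

for row in X:
    print(row)
```

A matrix of this shape is the kind of input a non-metric scaling routine such as the `mdscale(X,d)` call named on the Approach slide would take, with the diagonal left at zero because a voice is not compared with itself.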