TAMU CSCE 689 - matsumoto1973acousticCorrelatesMDSSLIDES


Multidimensional Representation of Personal Quality of Vowels and its Acoustical Correlates
(Matsumoto, Hiki, Sone, Nimura; 1973)
Pedro Davalos
CPSC 689-604
Feb 27, 2007

Outline
• Introduction
• Test 1: /a/
• Test 2: Hybrid
• Test 3: Vowels
• Conclusions

Intro: Goal
• Determine how acoustical properties influence the recognition of speakers.

Intro: Background
• “Personal Quality”
– Is NOT "high quality" in the sense required to perform at an opera
– It refers to the speaker’s characteristics and the voice attributes that allow speaker recognition

Intro: Approach
[Flow diagram; recoverable labels:]
– Voice Samples: v1, v2, …, vn
– Sensory Auditory Space (Physical Space)
– Classified by Listeners: X
– Kruskal’s Scaling: S = mdscale(X,d)
– Psychological Auditory Space (PAS)
– Graph Theory
– Feature Extraction: Acoustical Parameters p1, p2, …, p6
– Multiple Correlation: C = xcorr2(S,P)
– Linear Regression: V = regress(p,C)

Intro: Acoustical Parameters
• Mean Fundamental Pitch Frequency
– log F0
• Fluctuation of Fundamental Pitch Period
– σ(ΔT/T)
• Slope of Glottal Source Spectrum
– α
• Formant Frequencies
– F1, F2, F3
Glottal Source Characteristics - U(s)
Vocal Tract Characteristics - T(s)

Intro: ___ Recognition
Speech Recognition: They are all the same! → “Hello World”
Speaker Recognition: They are all different! → s1, s2, s3, & s4

Test 1 - /a/: Specs
• Data Samples:
– 8 speakers, vowel /a/ at 3 frequencies: 120, 140, & 160 Hz → 24 samples
• Listener Testing:
– 6 listeners, each hearing every pair 9 times in each of the 2 presentation orders → 108 values/pair
– Listeners classify voice pairs as “same talker” or “different talker”

Test 1 - /a/: Results
[Figures: 3D-PAS; Correlation between PAS and Acoustical Parameters; plot labels 160, 140, 120]
F1 & F2 are related to both A1 & A2
Lower F0, greater contribution to personal quality

Test 1 - /a/: ROC
[Figure: Receiver Operating Characteristics]
Lower F0, greater contribution to personal quality

Test 1 - /a/: Results (var)

Test 2 – Hybrid: Specs
• Data Samples:
– 5 speakers, vowel /a/ at 140 Hz
– Data set altered by generating a fixed glottal source (removing the fluctuation of fundamental pitch period variable)
• 6 listeners repeat 10 trials for each pair

Test 2 – Hybrid: Results
• F3 became similar to F1 & F2
• Vocal Tract has a greater contribution than Glottal (other than F0), since hybrid voices tend to be closer to the original with the same formants.
[Figure: 2D PAS of Hybrid Voices. Vg: V = formant, g = glottal source]

Test 3 – Vowels: Specs
• Data Samples
– 8 Speakers, 5 vowels (40 Voices), all at 164 Hz
• Listeners
– 13 people listened 3 times to all voice pairs – (78 Samples)

Test 3 – Vowels: Results (PAS)
Since talkers are clustered, perceptual cues of personal quality common to different vowels are involved in listener judgment.

Test 3 – Vowels: Results (ROC)
[Figure: Receiver Operating Characteristics]

Test 3 – Vowels: Results (xcorr)
[Figure labels: α: Slope of glottal source spectrum; σ(ΔT/T): Rapid fluctuation of pitch period; F1; F2; F3]
Large correlations and similar directions

Test 3 – Vowels: Results (var)

Conclusions (1)
• F0 is, relatively, the most significant contributor to perception of personal quality
• With F0 held constant, Vocal Tract and Glottal Characteristics contribute to different perceptual dimensions from each other
• Vocal Tract contributions to perception of personal quality vary with different vowels
• The perceptual dimensions of F0, F1, α (slope of glottal source spectrum), and fluctuation of F0 period are independent of vowel

Conclusions (2)
• Authors claim success because…
– Talkers cluster on the A1-A2 PAS
– The P(c) from the listeners was about 60-70%
– There is uniformity of the results despite different listeners
– Acoustical parameters were found to influence perception of personal quality
• Future Work:
– Evaluate other
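
The Test 1 listener bookkeeping (6 listeners, 9 repetitions, 2 presentation orders) and the matrix X that feeds the scaling step can be illustrated with a minimal Python sketch. The slides do not say how X is computed from the "same talker" / "different talker" responses, so the definition used here (fraction of "different talker" responses per pair) is an assumption, as are the function name `dissimilarity_matrix`, the pair count (which assumes every unordered pair of voices is judged), and the toy counts, which are made up for illustration and are not data from the paper.

```python
from itertools import combinations


def dissimilarity_matrix(n_voices, different_counts, trials_per_pair):
    """Build a symmetric n_voices x n_voices dissimilarity matrix.

    different_counts maps an unordered pair (i, j), i < j, to the number of
    "different talker" responses for that pair. Dissimilarity is that count
    divided by trials_per_pair (an assumed definition, not from the slides).
    Pairs missing from different_counts are treated as 0 responses.
    """
    X = [[0.0] * n_voices for _ in range(n_voices)]
    for i, j in combinations(range(n_voices), 2):
        d = different_counts.get((i, j), 0) / trials_per_pair
        X[i][j] = X[j][i] = d  # the matrix is symmetric, diagonal stays 0
    return X


# Bookkeeping taken from the Test 1 slide.
n_voices = 8 * 3              # 8 speakers x 3 F0 values = 24 samples
trials_per_pair = 6 * 9 * 2   # 6 listeners x 9 repetitions x 2 orders = 108

# Number of unordered voice pairs, assuming every pair is judged.
n_pairs = len(list(combinations(range(n_voices), 2)))

# Hypothetical counts for three voices only (illustration, not real data).
toy_counts = {(0, 1): 27, (0, 2): 108, (1, 2): 81}
X = dissimilarity_matrix(3, toy_counts, trials_per_pair)

for row in X:
    print(row)
```

A matrix of this shape is the kind of input a multidimensional scaling routine such as the `mdscale(X,d)` call named on the Approach slide would take; the scaling itself is not reproduced here.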

