Speech Processing 15-492/18-492
Human Speech Processing
Phonetics and Phonology

The vocal tract
From meat to voice
• Blow air out from the lungs
• Vibrate larynx
• Vocal tract shape defines resonance
• Obstructions modify sound
    • Tongue, teeth, lips, velum (nasal passage)

The ear
From sound to brain waves
• Sound waves
• Vibrate ear drum
• Cause fluid in cochlea to vibrate
    • Spiral cochlea
• Vibrate hairs inside cochlea
    • Different frequencies vibrate different hairs
• Converts time domain to frequency domain

From grunts to meaning
• Grunts and vocalization
    • Lots of variation available (continuous systems, not discrete)
• Noises become distinct, recognizable
• Grow into languages, dialects and idiolects
• What are the fundamental units?

Articulatory Movements
• Electromagnetic Articulograph

Phonemes
• Defined as fundamental units of speech
• If you change it, it (can) change the meaning
    • “pat” to “bat”
    • “pat” to “pam”

Vowel Space
• One or two banded frequencies (formants)

English (US) Vowels
AA    wAshington
AE    fAt, bAd
AH    bUt, hUsh
AO    lAWn, mAll
AW    hOW, sOUth
AX    About, cAnoe
AY    hIde, bUY
EH    gEt, fEAther
ER    makER, sEARch
EY    gAte, EIght
IH    bIt, shIp
IY    bEAt, shEEp
OW    lOne, nOse
OY    tOY, OYster
UH    fUll
UW    fOOl

English Consonants
• Stops: P, B, T, D, K, G
• Fricatives: F, V, HH, S, Z, SH, ZH
• Affricates: CH, JH
• Nasals: N, M, NG
• Glides: L, R, Y, W
• Note: unvoiced vs voiced: P vs B, F vs V

Number of Phonemes in Language
• US English: 43
• UK English: 44
• Japanese: 25
• Hindi: 81
• Numbers aren’t definite though
    • Depends on who you ask,
    • And what you want it for

Not all variation is Phonemic
• Phonology: linguistically discrete units
    • May be a number of different ways to say them
    • /r/ trill (Scottish or Spanish) vs US way
• Phonetics vs Phonemics
    • Phonetics: all sounds
    • Phonemics: discrete units
• /t/ in US English: becomes “flap”
    • “water” / w ao t er /
    • “water” / w ao dx er /

Dialect and Idiolect
• Variation within language (and speakers)
• Phonetic
    • “Don” vs “Dawn”, “Cot” vs “Caught”
    • R deletion (Haavaad vs Harvard)
• Word choice:
    • Y’all, Yins
• Politeness levels

Not all languages use the same set
• Aspirated stops (Korean, Hindi)
    • P vs PH
• English uses both, but doesn’t care
    • Pot vs sPot (place hand over mouth)
• L-R in Japanese not phonological
• US English dialects:
    • Mary, Merry, Marry
• Scottish English vs US English
    • No distinction between “pull” and “pool”
    • Distinction between “for” and “four”

Different language dimensions
• Vowel length
    • Bit vs beat
    • Japanese: shujin (husband) vs shuujin (prisoner)
• Tones
    • F0 (tune) used phonetically
    • Chinese, Thai, Burmese
• Clicks
    • Xhosa

Co-articulation
• Voicing actually doesn’t always stop
    • “have honey”, “impossible”
• Nasalized voices, lip rounding
    • “min” vs “bit”, “sow” vs “see”
• Lexical stress:
    • EMphasis, emPHAsis
    • PROject, proJECT
• Reduction, contraction
    • “A boy is riding a bike”
    • “I want to go to Disneyland.”
    • “I will go tomorrow”

Prosody
• Intonation
    • Tune
• Duration
    • How long/short each phoneme is
• Phrasing
    • Where the breaks are

Intonation (F0)
• Rate of vibration during voiced speech
    • Males: 80-140 times a second
    • Females: 130-220 times a second
    • Children: 180-320 times a second
• Used for:
    • Emphasis
    • Style: questions, statements, confidence etc.

Intonation Contour

Intonation Information
• Large pitch range (female)
• Authoritative since goes down at the end
    • News reader
• Emphasis for Finance H*
• Final has a rise - more information to
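
The "Intonation (F0)" slide describes F0 as the rate of vocal fold vibration and quotes typical ranges for males, females and children. As a rough illustration only, the sketch below estimates F0 from a short frame by autocorrelation and checks which of the quoted ranges it falls in. The function names, the 8 kHz sample rate, the 0.9 peak threshold and the pure sine tone standing in for voiced speech are all assumptions made for this example; none of them come from the slides, and real speech would need windowing and voicing detection on top of this.

```python
import math

# F0 ranges in Hz as quoted on the slide; note that the ranges overlap.
F0_RANGES_HZ = {"male": (80, 140), "female": (130, 220), "child": (180, 320)}


def synth_tone(f0_hz, sample_rate=8000, n_samples=800):
    """Pure sine wave used as a stand-in for a voiced frame (an assumption)."""
    return [math.sin(2 * math.pi * f0_hz * i / sample_rate)
            for i in range(n_samples)]


def estimate_f0(samples, sample_rate=8000, f0_min=80.0, f0_max=320.0,
                rel_threshold=0.9):
    """Estimate F0 in Hz by autocorrelation, or None if no periodicity is found.

    Searches lags corresponding to f0_max..f0_min and picks the first local
    peak that reaches rel_threshold of the global maximum, which avoids
    locking onto a multiple of the true period (an octave error).
    """
    min_lag = int(sample_rate / f0_max)
    max_lag = int(sample_rate / f0_min)
    scores = []
    for lag in range(min_lag, max_lag + 1):
        n = len(samples) - lag
        # Mean product of the signal with a copy of itself delayed by `lag`.
        scores.append(sum(samples[i] * samples[i + lag] for i in range(n)) / n)
    peak = max(scores)
    if peak <= 0:
        return None  # silent or aperiodic frame
    # First lag whose score is close to the best one...
    k = next(i for i, s in enumerate(scores) if s >= rel_threshold * peak)
    # ...then climb to the top of that local peak.
    while k + 1 < len(scores) and scores[k + 1] > scores[k]:
        k += 1
    return sample_rate / (min_lag + k)


def speaker_groups(f0_hz):
    """Return every slide range that contains f0_hz (may be more than one)."""
    return [name for name, (lo, hi) in F0_RANGES_HZ.items() if lo <= f0_hz <= hi]


if __name__ == "__main__":
    for true_f0 in (100, 200):
        f0 = estimate_f0(synth_tone(true_f0))
        print(true_f0, f0, speaker_groups(f0))
```

Because the quoted ranges overlap, a 200 Hz tone sits inside both the female and the child range, so F0 alone does not identify the speaker group.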