Representing Intonational Variation
Julia Hirschberg
CS 4706
01/13/2019

Today
How can we represent differences in how people produce speech that influence interpretation?
- Expanded vs. compressed pitch range
- Louder vs. softer speech
- Faster vs. slower speech
- Differences in intonational prominence
- Differences in intonational phrasing
- Differences in pitch contours

Joseph Steele (1775)

Limitations
- Hard to represent similarities between contours
- Too tied to particular instances

Language Learning Approaches
- A simpler approach:
    IS it INteresting?
    d'you feel ANGry?
    WHAT'S the PROBlem?
    (McCarthy 1991: 106)
- Too general: doesn't capture differences beyond rising and falling contours, accented and unaccented words

Goal
- Capture sufficient variation to explain both similarities and differences in prosodic meaning
- How much detail do we need to capture?

Prosodic Prominence
- Terms: prominence, emphasis, pitch accent, stress
- Example: "there's ALSO some SHOPPING" (BDC h1s9)
- Prominence is an acoustic excursion used to make a word or syllable stand out from its surroundings
- Used to draw a listener's attention to some quality of an utterance:
    Topic
    Contrast
    Focus
    Information status
(Interspeech 2011 Tutorial M1: More Than Words Can Say)

Prosodic Phrasing
- An acoustic/perceived disjuncture between words
- Physiologically necessary: a speaker cannot produce sound indefinitely
- Used to structure the information in an utterance, grouping words into regions
- Phrasing structure may be related to syntactic structure
- Example: "finally we will get off at Park Street and get on the Red Line" (BDC h1s9)

Pitch Contour Example
[Figure: pitch contour with a doubling error and a halving error marked]
- Pitch (fundamental frequency) is estimated by finding the length of the period of the speech signal
- If a cycle is missed, the period appears to be twice as long: pitch halving
- If an extra cycle is found, the period appears to be half as long: pitch doubling

Tone Sequence Models
- Intonation is generated from sequences of categorically different, phonologically distinctive tones
- Basic unit of intonational description: intonation phrase (tone unit, breath group)
    Delimited by pauses, phrase-final lengthening, pitch
- Syllables may be stressed or accented
    Accent aligned with primary stress: telephone
    Indicated by F0, duration, intensity, voice quality

Pierrehumbert 1980
- Contours = pitch accents + phrase accents + boundary tones
- Pitch accents: H*, L*, L*+H, L+H*, H*+L, H+L*
- Phrase accents: L-, H-
- Boundary tones: L%, H%

ToBI: Tones and Break Indices
- Based on Pierrehumbert's intonational phonology (Silverman et al. 1992)
- Prosody is described by high (H) and low (L) tones that are associated with prosodic events (pitch accents, phrase accents, and boundary tones), and by break indices, which describe the degree of disjuncture between words
- ToBI is inherently categorical in its description of prosody
- ToBI variants exist for at least American English, German, Japanese, Korean, Portuguese, Greek, and Catalan

ToBI Accenting
- Words are accented or not
- 5 possible pitch accent types in SAE: H*, L*, L*+H, L+H*, H+!H*
- High tones can be produced in a compressed pitch range: catathesis or downstepping

ToBI Phrasing
- ToBI describes phrasing as a hierarchy of two levels
    Intermediate phrases contain one or more words
    Intonational phrases contain one or more intermediate phrases
- Word boundaries are marked with a degree of disjuncture, or break index
- Break indices range from 0 to 4
    3: intermediate phrase boundary
    4: intonational phrase boundary

ToBI Phrase Ending Types
- Intermediate phrase boundaries have associated phrase accents describing the pitch movement from the last accent to the phrase boundary
    Phrase accents: H-, !H-, or L-
- Intonational phrase boundaries have boundary tones describing the pitch movement immediately before the boundary
    Boundary tones: H% or L%
- L-L%, L-H%, H-H%, H-L%, !H-L%

ToBI Example in Praat
[Figures: Praat annotation example and two tone-labeled contours; the tone labels are not recoverable]

Evaluation
- Online training material available at http://anita.simmons.edu/~tobi/index.html
- Good inter-labeler reliability for expert and naive labelers: 88% agreement on presence/absence of tonal category, 81% agreement on category label, 91% agreement on break indices to within 1 level (Silverman et al. '92, Pitrelli et al. '94)

Superpositional models
- Pitch pattern of intonation is modeled with two components: a phrase component and an accent component
- The phrase has a basic shape, and the pitch movements for individual accents are superimposed over that basic shape
- Example: "Apples, oranges, and tomatoes"

Fujisaki model
- Superpositional view of intonation (Fujisaki & Hirose 1982)
- Prosody is described by a phrase command which is modified by accent commands
- In the Fujisaki model this is an additive process in log Hz space

- Good for modeling utterance-level trends
    Declination: downtrend in F0 over the course of an utterance
- Successful in speech synthesis for languages like Japanese (little variation in accent type)
- e.g. "Lily and Rosa thought this was divine. Prince William was gorgeous and he was looking for a bride. They dreamed of wedding bells."

Superpositional vs. Sequential
- Superpositional models require identification of a phrase signal
- Sequential models describe one prosodic event (phrasing or prominence) at a time
- Similarities
    Both describe phrasing and accenting
    If the phrasal context can be accommodated by a sequential model, there are no analytical reasons to suspect that
- Differences
    Categorical vs. continuous accent types
    Superpositional model is tightly coupled with pitch

F0 Modeling for TTS
- Generation or DB retrieval
- Event detection
- ToBI
- Fujisaki
- TILT
- Tonal Center of Gravity
- Quantized Contour Modeling

TILT
Describes an F0
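
The Fujisaki slides above describe the F0 contour as a phrase command modified by accent commands, combined additively in log Hz space. A minimal sketch of that additive structure follows, assuming the commonly cited form of the model's two response functions. The function names and the constants `alpha`, `beta`, and `gamma` are illustrative assumptions, not values taken from the slides.

```python
import math


def phrase_response(t, alpha=2.0):
    """Phrase component: impulse response alpha^2 * t * exp(-alpha * t) for t >= 0.

    alpha is an assumed, illustrative constant.
    """
    if t < 0:
        return 0.0
    return alpha * alpha * t * math.exp(-alpha * t)


def accent_response(t, beta=20.0, gamma=0.9):
    """Accent component: step response 1 - (1 + beta*t) * exp(-beta*t), capped at gamma.

    beta and gamma are assumed, illustrative constants.
    """
    if t < 0:
        return 0.0
    return min(1.0 - (1.0 + beta * t) * math.exp(-beta * t), gamma)


def fujisaki_f0(t, base_hz, phrase_cmds, accent_cmds):
    """Return F0 in Hz at time t (seconds).

    phrase_cmds: list of (onset_time, amplitude) pairs
    accent_cmds: list of (onset_time, offset_time, amplitude) triples
    All contributions are summed in the log domain, then exponentiated.
    """
    log_f0 = math.log(base_hz)
    for onset, amplitude in phrase_cmds:
        log_f0 += amplitude * phrase_response(t - onset)
    for onset, offset, amplitude in accent_cmds:
        log_f0 += amplitude * (accent_response(t - onset) - accent_response(t - offset))
    return math.exp(log_f0)


if __name__ == "__main__":
    # Hypothetical utterance: one phrase command at 0 s, one accent from 0.3 s to 0.6 s.
    phrase = [(0.0, 0.5)]
    accent = [(0.3, 0.6, 0.4)]
    for step in range(0, 21):
        time_s = step * 0.1
        print(f"{time_s:.1f} s  {fujisaki_f0(time_s, 100.0, phrase, accent):.1f} Hz")
```

With only a phrase command, the contour rises briefly and then decays toward the base frequency, which is one way to see the declination trend mentioned in the slides. Each accent command adds a local bump on top of that curve.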



Columbia CS 4706 - Representing Intonational Variation
