Columbia COMS W4706 - Prosodic Prediction from Text and Prosodic Recognition for Speech Corpora

This preview shows page 1-2-3-19-20-39-40-41 out of 41 pages.

Slide 1
Today
Assigning Prosodic Variation in TTS
Goal: Predict How Some Native Speaker Would Produce a Sentence
A Simple Approach
Problems
More Interesting Approaches
Statistical learning methods
Where do we get the features: phrasing?
What other features may be linked to phrasing?
Slide 11
Integrating More Syntactic Information
Where do we get the features: accent?
What phenomena are associated with accent?
Slide 15
How can we approximate such information?
How do we evaluate the result of prosodic event assignment?
How well can we do?
Slide 19
The Standard Corpus-Based Approach
Available ToBI Labeled Corpora
Reciprosody
Shape Modeling – Feature Generation
Pitch Normalization
TILT
Tonal Center of Gravity (ToG)
Quantized Contour Modeling
Pitch Contour Smoothing
Piecewise Linear Fit
Bezier Curve Fitting
Momel Stylization
Fujisaki model
Estimating Fujisaki model parameters
Normalization of Phone Durations
Analyzing Rhythm
Classification Techniques
Correcting Classifier Diagram
Correcting Classifier Performance
AuToBI
AuToBI Performance
Next Class

Prosodic Prediction from Text and Prosodic Recognition for Speech Corpora
Julia Hirschberg
CS 4706
01/14/2019

Today
•Motivation:
  –Why predict prosody from text?
  –Why recognize prosody in speech?
•Predicting prosodic events from text: prosodic assignment to text input for TTS
•Recognizing prosodic events: automatic analysis of TTS and other corpora

Assigning Prosodic Variation in TTS
A car bomb attack on a police station in the northern Iraqi city of Kirkuk early Monday killed four civilians and wounded 10 others U.S. military officials said. A leading Shiite member of Iraq's Governing Council on Sunday demanded no more "stalling" on arranging for elections to rule this country once the U.S.-led occupation ends June 30. Abdel Aziz al-Hakim a Shiite cleric and Governing Council member said the U.S.-run coalition should have begun planning for elections months ago.
-- Loquendo

Goal: Predict How Some Native Speaker Would Produce a Sentence
•What words to accent?
  –What kind of accent to use?
•Where to place prosodic phrase boundaries?
  –What sort of boundaries?
    •Intonational or intermediate?
    •What tonal markings?
      –L-L% (falling)
      –H-L% (plateau)
      –H-H% (rising)
      –L-H% (continuation rise)
      –!H-L% (calling contour)

A Simple Approach
•Default solution
  –Accent:
    •Accent content words
    •Deaccent function words
    •Use only simple H* accents unless the input is a yes-no question, in which case use L*
  –Phrasing:
    •Place phrase boundaries at any non-sentence-final punctuation (e.g. , : ;) using L-H%
    •Use H-H% for yes-no questions
    •Otherwise use L-L% for sentence-final punctuation (! .)

Problems
•Simple accent strategy is only acceptable ~70% of the time
•Simple phrasing strategy is right ~80% of the time
•So, in a 20-word sentence you will probably have about 6 accent errors and 4 phrasing errors: uh-oh!

More Interesting Approaches
•Learn accent and phrasing assignment from large corpora: Machine Learning
•Problems:
  –There may be many acceptable ways to utter a given sentence: how do we choose which to train on?
  –Requires the hand labeling of large corpora in some accepted convention
  –ToBI is accepted and can be labeled reliably, but labeling takes a long time (about 60 times real time)

Statistical learning methods
•Classification and regression trees (CART), Rule induction (Ripper), Support Vector Machines, HMMs, Neural Nets
•Input: Vector of independent variables and one dependent (predicted) variable, e.g. ‘there’s a phrase boundary here’ or ‘there’s not’
  Feat1 Feat2 … FeatN DepVar
•Input from hand-labeled dependent variable and automatically extracted independent variables
•Result can be integrated into TTS text processor

Where do we get the features: phrasing?
•Timing
  –Pause
  –Lengthening of final syllable of preceding word
•F0 changes: high and low
•Vocal fry/glottalization

What other features may be linked to phrasing?
•Syntactic information
  –Abney ’91 chunking major constituents
  –Steedman ’90, Oehrle ’91 CCGs …
  –Which ‘chunks’ tend to stick together?
  –Which ‘chunks’ tend to be separated intonationally?
    •Largest constituent dominating w(i) but not w(j)
      NP[The man in the moon] |? VP[looks down on you]
    •Smallest constituent dominating w(i),w(j)
      NP[The man PP[in |? moon]]
  –Part-of-speech of words around potential boundary site
      The/DET man/NN |? in/Prep moon/NN
•Sentence-level information
  –Length of sentence
  –Position in sentence

      This is a very |? very very long sentence ?| which thus might have a lot of phrase boundaries in ?| it ?| don’t you think?
      This |? isn’t.
•Orthographic information
  –They live in Butte, ?| Montana, ?| don’t they?
•Word co-occurrence information
      Vampire ?| bat … powerful ?| but benign …
•Are words on each side accented or not?
      The cat in |? the
•Where is the most recent previous phrase boundary?
      He asked for pills | but |?

Integrating More Syntactic Information
•Incremental improvements:
  –Adding higher-accuracy parsing (Koehn et al. ’00)
    •Collins or Charniak parsing in real time
•Different syntactic representations: relational grammar? Tree-Adjoining Grammar?
•Ranking vs. classification?

Where do we get the features: accent?
•F0 excursion
•Durational lengthening
•Voice quality
•Vowel quality
•Loudness

What phenomena are associated with accent?
•Word class: content vs. function words
•Information status:
  –Given/new: He likes dogs and dogs like him.
  –Topic/Focus: Dogs he likes.
  –Contrast: He likes dogs but not cats.
•Grammatical function
  –The dog ate his kibble.
•Surface position: Today George is hungry.

•Association with focus:
  –John only introduced Mary to Sue.
•Semantic parallelism
  –John likes beer but Mary prefers wine.
•How many of these are easy to compute automatically?

How can we approximate such information?
•POS window
•Position of candidate word in sentence
•Location of prior phrase boundary
•Pseudo-given/new
•Location of word in complex nominal and stress prediction for that nominal
      City hall, parking lot, city hall parking lot
•Word co-occurrence
      Blood vessel, blood orange

How do we evaluate the result of prosodic event assignment?
•How to define a Gold Standard?
  –Natural speech corpus
  –Multi-speaker/same text
  –Subjective judgments
•No simple mapping from text to
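The default rules from the "A Simple Approach" slide, and the error estimate from the "Problems" slide, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function-word list, the tokenizer, the `yes_no_question` flag, and the names `assign_default_prosody` and `expected_errors` are all hypothetical and not part of the lecture. Treating a sentence-final "?" on a non-yes-no question as L-L% is also an assumption, since the slide only names "!" and "." for that case.

```python
import re

# Hypothetical, deliberately tiny function-word list (an assumption);
# a real system would more likely rely on a part-of-speech tagger.
FUNCTION_WORDS = {"a", "an", "the", "of", "in", "on", "to", "for", "and",
                  "but", "or", "is", "are", "he", "it", "you"}


def assign_default_prosody(sentence, yes_no_question=False):
    """Return (word, accent, boundary_tone) triples using the default rules.

    accent is "H*", "L*", or None (deaccented); boundary_tone is the tone
    placed after the word, or None when no phrase boundary follows it.
    """
    # Naive tokenizer (assumption): words and single punctuation marks.
    # It would mis-handle abbreviations such as "U.S.".
    tokens = re.findall(r"[A-Za-z']+|[,:;.!?]", sentence)
    accent_type = "L*" if yes_no_question else "H*"
    result = []
    for i, tok in enumerate(tokens):
        if re.fullmatch(r"[,:;.!?]", tok):
            continue  # punctuation only informs the preceding word
        # Accent content words, deaccent function words.
        accent = None if tok.lower() in FUNCTION_WORDS else accent_type
        nxt = tokens[i + 1] if i + 1 < len(tokens) else ""
        if nxt in {",", ":", ";"}:
            boundary = "L-H%"   # non-sentence-final punctuation
        elif nxt in {".", "!", "?"}:
            boundary = "H-H%" if yes_no_question else "L-L%"
        else:
            boundary = None
        result.append((tok, accent, boundary))
    return result


def expected_errors(n_words, accuracy):
    """Expected error count, assuming one independent decision per word."""
    return n_words * (1.0 - accuracy)


for triple in assign_default_prosody("He asked for pills, but he left."):
    print(triple)
print(round(expected_errors(20, 0.70)), round(expected_errors(20, 0.80)))
```

With the slide's rough accuracies of 70% and 80%, `expected_errors` reproduces the figures of about 6 accent errors and 4 phrasing errors for a 20-word sentence.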

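The "Feat1 Feat2 … FeatN DepVar" layout on the statistical learning slide can be made concrete with a small sketch that turns one hand-labeled sentence into training rows, one row per candidate boundary site. The feature names, the function name `boundary_feature_rows`, and the tags VB, RB and PRP are assumptions chosen for illustration (the slides only show DET, NN and Prep); the features themselves follow the ones listed above: POS around the site, sentence length, position in sentence, and distance from the previous boundary.

```python
def boundary_feature_rows(tagged_words, boundaries_after):
    """Build one feature row per candidate boundary site.

    tagged_words: list of (word, pos) pairs for a single sentence.
    boundaries_after: set of word indices i such that a hand-labeled
        phrase boundary follows word i (the dependent variable).
    """
    n = len(tagged_words)
    rows = []
    last_boundary = -1  # index of the word before the most recent boundary
    for i in range(n - 1):  # candidate site between word i and word i + 1
        rows.append({
            "pos_left": tagged_words[i][1],
            "pos_right": tagged_words[i + 1][1],
            "sentence_length": n,
            "relative_position": (i + 1) / n,
            "words_since_boundary": i - last_boundary,
            "boundary": i in boundaries_after,  # DepVar
        })
        if i in boundaries_after:
            last_boundary = i
    return rows


# Example sentence from the slides; tags beyond DET/NN/Prep are hypothetical.
sentence = [("The", "DET"), ("man", "NN"), ("in", "Prep"), ("the", "DET"),
            ("moon", "NN"), ("looks", "VB"), ("down", "RB"), ("on", "Prep"),
            ("you", "PRP")]
rows = boundary_feature_rows(sentence, boundaries_after={4})
for row in rows:
    print(row)
```

Rows like these could then be passed to any of the learners named on the slide (CART, Ripper, SVMs, and so on). At prediction time the "words_since_boundary" feature would have to come from previously predicted boundaries rather than hand labels.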
