Columbia COMS W4706 - Accenting and Information Status



2/28/2011

Slide 1: Accenting and Information Status
Julia Hirschberg
CS 4706

Slide 2: Information Status
• Topic/comment, theme/rheme
    The orangutan we wanted to buy escaped from the pet store.
• Focus of attention
    I only bought candy for that orangutan.
• Given/new
    I only bought candy for that orangutan. I would never buy an ape drugs!
• All commonly signaled in human speech by intonation

Slide 3: Today: Accent and Given/New
• Motivation in speech technology
• Models of Given/New
• Experiments on Given/New and pitch accent
• Possible models of intonation wrt given/new entities
• How might we identify given/new information automatically?
• How should we produce given/new information appropriately?
• Why is this important?

Slide 4: A Simple Definition
• Given: Recoverable from some form of context or, what a Speaker believes to be in a Hearer’s consciousness
• New: Not recoverable from context or, what a Speaker believes is not in a Hearer’s consciousness

Slide 5: Role in Speech Technologies
• TTS: Natural production
  – Given information is often deaccented
  – New information is usually accented
• ASR: Improved recognition
  – Given information may already have been recognized earlier
  – New information may be an important cue to topic shift
• Summarization: Improved precision
  – Given information is less likely to be included in a summary; new information is more likely

Slide 6
• Spoken Dialogue Systems: Grounding
  – Critical for system to convey what is given and what is new to facilitate Hearer comprehension

Slide 7: Prince ’81: A More Complex Model
• Speaker (S) and Hearer (H), in a discourse, construct a discourse model
  – Includes discourse entities, attributes, and links between entities
  – Discourse entities: individuals, classes, exemplars, substances, concepts (NPs)
• Entities when first introduced are new
  – Brand-new (H must create a new entity)
      My dog bit a rhinoceros this morning.

Slide 8
  – Unused (H already knows of this entity)
      The sun came out this morning.
• Evoked entities are old, or ‘given’: already in the discourse
  – Explicitly evoked (in text or speech)
      The rhinoceros was wearing suspenders. Rather unusual for a rhino.
  – Situationally evoked
      Watch out for the snake!
• Inferables are also old, or ‘given’
      I bought a new car. The gear shift is a bit tricky.

Slide 9: Prince ’92: A Still More Complex Model
• Hearer-centric information status:
  – Given: what S believes H has in his/her consciousness
  – New: what S believes H does not have in his/her consciousness
• But discourse entities may also be given and new wrt the current discourse
  – Discourse-old: already evoked in the discourse
  – Discourse-new: not evoked

Slide 10
The stars are very bright tonight. (Hearer-given; Discourse-new)
When I see stars this bright, I think of my vacations in the mountains. (Hearer-given; Discourse-given)
My friend Buddy and I would sneak out late at night. (Hearer-new; Discourse-new)
I said, “My friend BUDDY…” (Hearer-new; Discourse-given)

Slide 11: Given/New and Pitch Accent
• New information is often accented and given information is often deaccented (Halliday ’67, Brown ’83, Terken ’84)
  – But there are many exceptions: a simple TTS rule (accent ‘new’ and deaccent ‘given’) will make 25-30% errors
  – How can we reduce these errors, to produce human-like intonation?

Slide 12: Brown ’83: Accent Status and Subclasses of Given/New
• Speech elicitation in laboratory
  – 12 Scottish-English undergrads
  – A describes a diagram for B to draw, which B cannot see
      Draw a black triangle.
      Draw a circle in the middle.
      Draw a blue triangle next to the black one with a line from the top angle to the bottom.
• Analysis: based on Prince ’81 categories with modifications

Slide 13
  – Brand-new (a triangle), given: inferable (middle, angle), given: contextually evoked (the page), given: ‘textually’ evoked (divided into current topic vs. earlier mention)
  – Accent status of all entity-referring NPs
• Results:
  – Brand-new information accented (87%)
    • Note: new entity/old expression issue
  – Given: contextually evoked information deaccented (98%)
  – Given: ‘textually’ evoked deaccented (current topic 100%; earlier: 96%)
  – Given: inferable information accented (79%)

Slide 14: Boston Directions Corpus (Hirschberg & Nakatani ’96)
• Experimental Design
• 12 speakers: 4 used
• Spontaneous and read versions of 9 direction-giving tasks (monologues)
• Corpus: 50m read; 67m spon
• Labeling
  – Prosodic: ToBI intonational labeling
  – Given/new (Prince ’92), grammatical function, p.o.s.,…

Slide 15: Boston Directions Corpus: Describe how to get to MIT from Harvard
d1: dsp1: step 1: enter and get token
    first
    enter the Harvard Square T stop
    and buy a token
d2: dsp2: inbound on red line
    then
    proceed to get on the
    inbound
    um
    Red Line
    uh subway

Slide 16
dp3 dsp3: take subway from hs, to cs to ks
    and
    take the subway
    from Harvard Square
    to Central Square
    and then to Kendall Square
dp4: dsp4: get off T.
    then get off the T

Slide 17: Hearer and Discourse Given/New Labeling
first
enter the Harvard Square T stop
and buy a token
then
proceed to get on the
inbound
um
Red Line
uh subway
and
take the subway
from Harvard Square
to Central Square
and then to Kendall Square
then get off the T

Slide 18: Hearer and Discourse Given/New Labeling
first
enter <HG/DN the Harvard Square T stop>
and buy <HI/DN a token>
then
proceed to get on <HI/DN the
inbound
um
Red Line
uh subway>
and
take <HG/DG the subway>
from <HG/DG Harvard Square>
to <HG/DN Central Square>
and then to <HG/DN Kendall Square>
then get off <HG/DG the T>

Slide 19: Does Given/New Status Predict Deaccenting?

NPs          HG      HI      HN      DG      DN
Deaccented   37.1%   53.9%   26.2%   43.3%   38.8%
Total        1009    406     130     596     950

HG: Hearer Given   HI: Hearer Inferable   HN: Hearer New   DG: Discourse Given   DN: Discourse New

39.4% of (H or D) Given items are deaccented…
36.9% of (H or D) New items are deaccented…

Slide 20: And… Bard ’99: Givenness, deaccenting and intelligibility
• Speech elicited in laboratory
  – Glasgow Scottish-English Map Task
    • Each has a slightly different map
    • A traces a route described by B
• Analysis
  – Compare repeated mentions of same items (i.e. given items) wrt accent status
    • Within dialogue
    • Across dialogue
• Findings

Slide 21
  – Deaccenting rare in repeated mentions (15% within dialogues, 6% across dialogues)
  – But repeated mentions were ‘less intelligible’
• Caveats:
  – Were they really identifying ‘deaccenting’ (the absence of a pitch accent)?
  –
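
The simple TTS rule mentioned under Given/New and Pitch Accent (accent ‘new’, deaccent ‘given’) can be combined with the discourse-old/discourse-new distinction from Prince ’92 in a few lines of Python. What follows is only an illustrative sketch, not a method from the lecture: the `Mention` class, the hand-assigned `entity_id` values, and both function names are assumptions made for this example, and it tracks discourse status only (no hearer status, no inferables). It is the kind of baseline the slides say makes 25-30% errors.

```python
from dataclasses import dataclass


@dataclass
class Mention:
    text: str       # surface form of the noun phrase
    entity_id: str  # hypothetical, hand-assigned id of the entity referred to


def label_discourse_status(mentions):
    """Label a mention 'discourse-new' the first time its entity appears,
    and 'discourse-given' on every later mention of the same entity."""
    seen = set()
    labels = []
    for mention in mentions:
        if mention.entity_id in seen:
            labels.append("discourse-given")
        else:
            labels.append("discourse-new")
            seen.add(mention.entity_id)
    return labels


def simple_accent_rule(labels):
    """Naive baseline: accent new items, deaccent given items."""
    return ["accented" if label == "discourse-new" else "deaccented"
            for label in labels]


# Noun phrases adapted from the rhinoceros examples in the Prince '81 slides.
mentions = [
    Mention("my dog", "dog"),
    Mention("a rhinoceros", "rhino"),
    Mention("the rhinoceros", "rhino"),
    Mention("suspenders", "suspenders"),
    Mention("a rhino", "rhino"),
]

labels = label_discourse_status(mentions)
accents = simple_accent_rule(labels)

for mention, label, accent in zip(mentions, labels, accents):
    print(f"{mention.text:15s} {label:16s} {accent}")
```

The sketch assumes coreference is already resolved (that is what `entity_id` stands in for); deciding automatically that "a rhino" and "the rhinoceros" name the same entity is a separate problem that the example does not address.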

