Error Detection and Correction in SDS
Julia Hirschberg
CS 4706
01/14/19

Today
- Avoiding errors
- Detecting errors
  - From the user side: what cues does the user provide to indicate an error?
  - From the system side: how likely is it the system made an error?
- Dealing with errors: what can the system do when it thinks an error has occurred?
- Evaluating SDS: evaluating 'problem' dialogues

Avoiding misunderstandings
- The problem
- By imitating human performance
  - Timing and grounding (Clark '03)
  - Confirmation strategies
  - Clarification and repair subdialogues

Learning from Human Behavior
- Features in repetition corrections (KTH)
[Figure: bar chart. Y-axis: percentage of all repetitions (0 to 50). Bars for adults and children. Categories: more clearly articulated, increased loudness, shifting of focus.]

Learning from Human Behavior (Krahmer et al '01)
- Learning from human behavior: 'go on' and 'go back' signals in grounding situations (implicit/explicit verification)
- Positive: short turns, unmarked word order, confirmation, answers, no corrections or repetitions, new info
- Negative: long turns, marked word order, disconfirmation, no answer, corrections, repetitions, no new info

Hypotheses supported, but...
- Can these cues be identified automatically?
- How might they affect the design of SDS?

Systems Have Trouble Knowing When They've Made a Mistake
- Hard for humans to correct system misconceptions (Krahmer et al '99)
    User: I want to go to Boston.
    System: What day do you want to go to Baltimore?
- Easier: answering explicit requests for confirmation or responding to ASR rejections
    System: Did you say you want to go to Baltimore?
    System: I'm sorry, I didn't understand you. Could you please repeat your utterance?
- But constant confirmation or over-cautious rejection lengthens dialogue and decreases user satisfaction

And Systems Have Trouble Recognizing User Corrections
- Probability of recognition failures increases after a misrecognition (Levow '98)
- Corrections of system errors are often hyperarticulated (louder, slower, more internal pauses, exaggerated pronunciation), leading to more ASR error (Wade et al '92, Oviatt et al '96, Swerts & Ostendorf '97, Levow '98, Bell & Gustafson '99)

Can Prosodic Information Help Systems Perform Better?
- If errors occur where speaker turns are prosodically 'marked'...
  - Can we recognize turns that will be misrecognized by examining their prosody?
  - Can we modify our dialogue and recognition strategies to handle corrections more appropriately?

Approach
- Collect corpus from interactive voice response system
- Identify speaker 'turns'
  - incorrectly recognized (misrecognition)
  - where speakers are first aware of error (aware site)
  - that correct misrecognitions (correction)
- Identify prosodic features of turns in each category and compare to other turns
- Use machine learning techniques to train a classifier to make these distinctions automatically

Turn Types
    TOOT: Hi. This is AT&T Amtrak Schedule System. This is TOOT. How may I help you?
    User: Hello. I would like trains from Philadelphia to New York leaving on Sunday at ten thirty in the evening.   [misrecognition]
    TOOT: Which city do you want to go to?
    User: New York.   [aware site, correction]

TOOT Dialogues
- Collected to study effects of differences in dialogue strategy on user performance and satisfaction (Litman & Pan '99)
  - type of initiative (system, user, mixed)
  - type of confirmation (explicit, implicit, none)
  - adaptability condition
- Subjects
  - 39 summer students
  - 16/23 (F/M)
  - 20/19 (native speaker/non)
- Platform: combined over-the-phone ASR and TTS (Kamm et al '97) with web access to train information
- Task: find train information for 4 scenarios
- Corpus for current study
  - 2328 speaker turns
  - 52 dialogues
- Misrecognitions
  - Overall word accuracy: 61%
  - Overall concept accuracy (CA): 71%
  - "I want to go to Boston from Philadelphia" (2 domain concepts) recognized as "I want to go to Boston" (one concept) = 50%

A Successful Dialogue
    S: Hi, this is AT&T Amtrak schedule system. This is TOOT. How may I help you?
    U: I would like a train to New York City from Philadelphia on Sunday at ten thirty P M
    S: I heard you say "go to New York City from Philadelphia on Sunday approximately at ten thirty p.m." Do you want me to find the trains from Philadelphia to New York City on Sunday approximately at ten thirty p.m. now?
    U: Yes.
    S: I am going to get the train schedule for you...

Are Misrecognitions, Aware Turns, Corrections Measurably Different from Other Turns?
- For each type of turn:
  - For each speaker, for each prosodic feature, calculate mean values for e.g. all correctly recognized speaker turns and for all incorrectly recognized turns
  - Perform paired t-tests on these speaker pairs of means (e.g., for each speaker, pairing mean values for correctly and incorrectly recognized turns)

How: Prosodic Features Examined per Turn
- Raw prosodic/acoustic features
  - f0 maximum and mean (pitch excursion/range)
  - rms maximum and mean (amplitude)
  - total duration
  - duration of preceding silence
  - amount of silence within turn
  - speaking rate (estimated from syllables of recognized string per second)
- Normalized versions of each feature (compared to first turn in task, to previous turn in task, Z scores)

Distinguishing Correct Recognitions from Misrecognitions (NAACL '00)
- Misrecognitions differ prosodically from correct recognitions in
  - F0 maximum (higher)
  - RMS maximum (louder)
  - turn duration (longer)
  - preceding pause (longer)
  - slower
- Effect holds up across speakers and even when hyperarticulated turns are excluded

WER-Based Results
- Misrecognitions are higher in pitch, louder, longer, with more preceding pause and less internal silence

    Feature        T-stat   Mean Misrec - Mean Rec   P
    F0 Max         5.78     25.84 Hz                 0.000
    F0 Mean        1.52     1.56 Hz                  0.140
    RMS Max        2.52     150.56                   0.020
    RMS Mean       1.82     25.05                    0.080
    Duration       9.94     2.13 sec                 0.000
    Prior Pause    5.586    0.29 sec                 0.000
    Tempo          4.71     0.54 sps                 0.000
    Sil in Turn    1.48     .02                      0.150

Predicting Turn Types Automatically
- Ripper (Cohen '96) automatically induces rule sets for predicting
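
The per-speaker paired t-test described under "Are Misrecognitions, Aware Turns, Corrections Measurably Different from Other Turns" can be sketched roughly as below. This is only an illustration, not the analysis code used for the TOOT corpus: the record fields (`speaker`, `misrecognized`, `f0_max`) and the toy values are assumptions made up for the example. It computes the t statistic and degrees of freedom only; a p-value would need a t-distribution routine (for instance `scipy.stats.ttest_rel`, if SciPy is available).

```python
import math
from collections import defaultdict
from statistics import mean, stdev


def speaker_means(turns, feature):
    """Map each speaker to (mean over misrecognized turns, mean over correct turns).

    Speakers lacking either kind of turn are dropped, since they cannot
    contribute a pair to the paired test.
    """
    groups = defaultdict(lambda: {True: [], False: []})
    for turn in turns:
        groups[turn["speaker"]][turn["misrecognized"]].append(turn[feature])
    return {
        spk: (mean(g[True]), mean(g[False]))
        for spk, g in groups.items()
        if g[True] and g[False]
    }


def paired_t(pairs):
    """Return (t statistic, degrees of freedom) for a list of (a, b) pairs."""
    diffs = [a - b for a, b in pairs]
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two speakers with both turn types")
    sd = stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    return mean(diffs) / (sd / math.sqrt(n)), n - 1


# Hypothetical toy data: F0 maximum (Hz) per turn, three invented speakers.
toy_turns = [
    {"speaker": "A", "misrecognized": True, "f0_max": 260.0},
    {"speaker": "A", "misrecognized": True, "f0_max": 240.0},
    {"speaker": "A", "misrecognized": False, "f0_max": 220.0},
    {"speaker": "A", "misrecognized": False, "f0_max": 200.0},
    {"speaker": "B", "misrecognized": True, "f0_max": 300.0},
    {"speaker": "B", "misrecognized": False, "f0_max": 280.0},
    {"speaker": "B", "misrecognized": False, "f0_max": 270.0},
    {"speaker": "C", "misrecognized": True, "f0_max": 190.0},
    {"speaker": "C", "misrecognized": True, "f0_max": 210.0},
    {"speaker": "C", "misrecognized": False, "f0_max": 180.0},
]

means = speaker_means(toy_turns, "f0_max")
t_stat, df = paired_t(list(means.values()))
print(f"t = {t_stat:.2f}, df = {df}")
```

Pairing the two means within each speaker, rather than pooling all turns, keeps between-speaker differences (e.g. naturally higher or lower pitch) from masking the within-speaker effect.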


Columbia CS 4706 - Error Detection and Correction in SDS
