MIT 6 893 - Speech Recognition - D111074

School: Massachusetts Institute of Technology

Course: 6.893

Pages 11

This preview shows page 1-2-3-4 out of 11 pages.

Speech Recognition
MIT 6.893, SMA 5508
Spring 2004
Larry Rudolph (MIT)

6.893 5508 Spring 2004: Speech Recognition, Larry Rudolph
DRAFT -- To Be Revised Shortly


A long term goal

  - Since 1950, AI researchers claimed
  - Crucial problem
  - Will be solved within the decade
  - Finally, it appears true
  - Failure rates still too high
      - 90% hit rate is 10% error rate
      - want 98% or 99% success rate


Spectrum of choices

                        Constrained Domain         Unconstrained Domain
  Speaker Dependent     Voice tags (e.g. phone)    Trained dictation (ViaVoice)
  Speaker Independent   Galaxy (we are here)       What everyone wants


Waveform to phonemes

  - Waveform is very fuzzy
  - We think there is a large break between words and sentences
      - hard to see from waveform
  - Mapping waveform segments to phonemes is not accurate


Phonemes to words

  - Group phonemes into words
      - not always 1-1 mapping
      - missing phonemes
      - false phonemes (extra ones)
      - accents
      - many possible choices
  - Word should be known to system
      - domain or dictionary


Words to sentences

  - People do not always speak grammatically
      - some invariant rules (for speech)
      - extra or missing words
      - phrases not always sentences
  - Easier when sentence is in domain
      - domain specified by grammar


Sentences into meaning

  - Dictation system: want sentences
  - Other system: want to understand
  - Integrate high-level processing
  - Most applications need it anyway
  - Helps with recognition
      - useful to disambiguate input


Meaning into action

  - What happens after meaning?
  - Respond to user (even a beep)
  - Usually generate more substantial response
  - Action should be valid in context


Disambiguation

  - Each transformation is rarely highly accurate
  - Lots of choices
  - Subsequent steps can rule out choices from previous steps


Disambiguation strategy

  - Select “n-best” choices and pass on
  - Each step restricts possible meaning
  - Make heavy use of probability
  - Viterbi search
      - state transitions along with probabilities
      - push through n choices at once


After domain dependent

  - Handling out-of-vocabulary words
  - Multimodal input
      - improve recognition rates
      - e.g. lip reading
      - sometimes easier to point than
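
The slides mention Viterbi search over state transitions with probabilities but do not show it. The following is a minimal sketch of the standard 1-best Viterbi recurrence, not the course's or Galaxy's implementation. The two phoneme states ("ah", "s"), the three acoustic labels ("voiced", "mixed", "noisy"), and every probability are made-up values for illustration only. An n-best variant, as the slides suggest, would keep several candidates per state instead of one.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return (probability, path) of the most likely state sequence.

    Assumes `observations` is non-empty and every probability table
    has an entry for each state / observation used.
    """
    # best[t][s] = (probability of the best path ending in state s at
    #               time t, predecessor state on that path)
    first = observations[0]
    best = [{s: (start_p[s] * emit_p[s][first], None) for s in states}]

    for obs in observations[1:]:
        column = {}
        for s in states:
            # Pick the predecessor that maximizes the path probability.
            prob, prev = max((best[-1][p][0] * trans_p[p][s], p) for p in states)
            column[s] = (prob * emit_p[s][obs], prev)
        best.append(column)

    # Trace back from the most probable final state.
    last = max(states, key=lambda s: best[-1][s][0])
    prob = best[-1][last][0]
    path = [last]
    for column in reversed(best[1:]):
        path.append(column[path[-1]][1])
    path.reverse()
    return prob, path


# Hypothetical toy model: two phoneme states, three acoustic labels.
states = ["ah", "s"]
start_p = {"ah": 0.6, "s": 0.4}
trans_p = {
    "ah": {"ah": 0.7, "s": 0.3},
    "s": {"ah": 0.4, "s": 0.6},
}
emit_p = {
    "ah": {"voiced": 0.5, "mixed": 0.4, "noisy": 0.1},
    "s": {"voiced": 0.1, "mixed": 0.3, "noisy": 0.6},
}

prob, path = viterbi(["voiced", "mixed", "noisy"], states, start_p, trans_p, emit_p)
print(path)  # ['ah', 'ah', 's']
```

Plain probabilities are used here to keep the arithmetic easy to follow. For utterances of realistic length the products underflow, so a practical decoder would sum log-probabilities instead.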
