MIT 6.893 - Speech Recognition

Speech Recognition
MIT 6.893, SMA 5508
Spring 2004
Larry Rudolph (MIT)

6.893 5508 Spring 2004: Speech Recognition, Larry Rudolph
DRAFT -- To Be Revised Shortly

A long term goal
  - Since 1950, AI researchers claimed
      - Crucial problem
      - Will be solved within the decade
  - Finally, it appears true
  - Failure rates still too high
      - 90% hit rate is 10% error rate
      - want 98% or 99% success rate

Spectrum of choices

                        Constrained Domain         Unconstrained Domain
  Speaker Dependent     Voice tags (e.g. phone)    Trained Dictation (ViaVoice)
  Speaker Independent   Galaxy (we are here)       What everyone wants

Waveform to phonemes
  - Waveform is very fuzzy
  - We think there is a large break between words and sentences
      - hard to see from waveform
  - Mapping waveform segments to phonemes is not accurate

Phonemes to words
  - Group phonemes into words
      - not always 1-1 mapping
      - missing phonemes
      - false phonemes (extra ones)
      - accents
      - many possible choices
  - Word should be known to system
      - domain or dictionary

Words to sentences
  - People do not always speak grammatically
      - some invariant rules (for speech)
      - extra or missing words
      - phrases not always sentences
  - Easier when sentence is in domain
      - domain specified by grammar

Sentences into meaning
  - Dictation system: want sentences
  - Other system: want to understand
  - Integrate high-level processing
      - Most applications need it anyway
      - Helps with recognition
      - useful to disambiguate input

Meaning into action
  - What happens after meaning?
      - Respond to user (even a beep)
      - Usually generate more substantial response
  - Action should be valid in context

Disambiguation
  - Each transformation is rarely highly accurate
  - Lots of choices
  - Subsequent steps can rule out choices from previous steps

Disambiguation strategy
  - Select “n-best” choices and pass on
  - Each step restricts possible meaning
  - Make heavy use of probability
  - Viterbi search
      - state transitions along with probabilities
      - push through n choices at once

After domain dependent
  - Handling out-of-vocabulary words
  - Multimodal input
      - improve recognition rates
      - e.g. lip reading
      - sometimes easier to point than
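
The Viterbi search named on the disambiguation strategy slide can be illustrated with a minimal sketch. This is not the course's recognizer: the two phoneme states ("s", "z"), the three acoustic labels ("hiss", "mixed", "buzz"), and every probability below are made-up toy values chosen only to show how state transitions and probabilities combine. The sketch keeps, for every state at every time step, the best-scoring path that ends there, so all candidates are pushed forward together and later observations can overrule an earlier choice.

```python
import math

def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return (best_path, log_probability) of the most likely state sequence.

    Log-probabilities are summed to avoid underflow on long inputs.
    Assumes every probability used is strictly positive.
    """
    # One column per time step: state -> (best log-score, predecessor state).
    columns = [{
        s: (math.log(start_p[s]) + math.log(emit_p[s][observations[0]]), None)
        for s in states
    }]
    for obs in observations[1:]:
        prev_col = columns[-1]
        col = {}
        for s in states:
            # Best way to arrive in state s from any previous state.
            score, pred = max(
                (prev_col[p][0] + math.log(trans_p[p][s]), p) for p in states
            )
            col[s] = (score + math.log(emit_p[s][obs]), pred)
        columns.append(col)

    # Backtrack from the best final state through the stored predecessors.
    last = max(states, key=lambda s: columns[-1][s][0])
    path = [last]
    for col in reversed(columns[1:]):
        path.append(col[path[-1]][1])
    path.reverse()
    return path, columns[-1][last][0]

# Hypothetical toy model: hidden phonemes and noisy acoustic labels.
states = ["s", "z"]
start_p = {"s": 0.6, "z": 0.4}
trans_p = {"s": {"s": 0.7, "z": 0.3}, "z": {"s": 0.4, "z": 0.6}}
emit_p = {
    "s": {"hiss": 0.5, "mixed": 0.4, "buzz": 0.1},
    "z": {"hiss": 0.1, "mixed": 0.3, "buzz": 0.6},
}
observations = ["hiss", "mixed", "buzz"]

path, logp = viterbi(observations, states, start_p, trans_p, emit_p)
print(path)  # ['s', 's', 'z']
```

In this toy run the ambiguous middle label "mixed" is resolved to "s" because of the transition probabilities around it, which is the sense in which each step restricts the choices left open by the previous one. Extending the sketch to keep the n best paths per state, rather than one, would be one way to pass "n-best" choices on to a later stage.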

