CMU CS 15492 - sds_pda

Speech Processing 15-492/18-492
Spoken Dialog Systems
Case-study: Personal Digital Assistants

Speech-based Personal Digital Assistant
  - Build a speech enabled PDA
      - Speech in/out for individual use
  - Goals
      - Control schedule
      - Control messaging
      - Replace personal assistant
  - Any similarity to any existing product is purely coincidental
  - Disclaimer: Much of this is relevant to Apple’s Siri, but this information is general and may or may not be what is in Siri.

SPDA: Scope
  - Schedule
  - Calls (in and out?)
  - Navigation
  - Finding local businesses
      - With reviews
  - Open questions
  - Reminders/Alarms

SPDA: Scope
  - “Call John”
  - “Call John, Bill and Mary and set up a meeting sometime next week about Plan B that fits my schedule”
  - “Make a reservation at a local Chinese restaurant for 4 at 8pm.”
  - “You should call your mom as it’s her birthday”
  - “I have sent flowers to your mom as it’s her birthday”

CALO (DARPA)
  - Cognitive Assistant that Learns and Organizes
  - DARPA project (2003-2008)
  - Led by SRI (involved many sites, including CMU)
  - Personalized Assistant that Learns (PAL)
      - Answers questions
      - Learn from experience
      - Take initiative
  - Spin-off company -> SIRI
      - Acquired by Apple in April 2010

SPDA: Platform
  - Desktop
      - Computational power
  - Phone (non-smartphone)
      - General Magic
      - Was handheld, became phone based
      - Led into GM’s OnStar
  - Smartphone
      - Local to device
      - With Cloud

Smartphone + Cloud
  - Smartphone
      - Know about user
      - Contacts, schedule, etc.
      - Same speaker
      - Some computation possible on device
  - Cloud
      - Learn from multiple examples
      - Retrain acoustic/language/understanding models

Voice Search and User Feedback
  - Voice Search
      - Google, Bing, Vlingo, Apple
  - Get users to help label the data
      - Listen to user
      - Show best options
      - They select which one is correct
  - Find out how users actually speak
      - Full sentences vs “search terms”
      - How do English speakers say ethnic names

Voice Search: Simplifications
  - Too many words …
  - Context
      - Where you are (location: home/not home)
      - What is on your phone (contacts)
      - What you’ve said before

Personality
  - Have a character
      - Calls you by name (you choose)
      - Pushy, helpful, nagging …
  - Allow user choice
      - Personalize it
      - May form better relationship with it
      - e.g. Siri US and UK are female/male

Make it do things well
  - Targeted apps
      - Choose what it will do well
      - Say, 12 different apps
  - Have target (hand written) interaction
      - Choose what fields you need, and how to interact with the back end data
  - If all else fails dump result in Google
  - Hardware aid
      - Infra-red detector for VAD

Marketing
  - Make sure people know it’s there
      - (Voice search has been on PDAs for years)
  - Get a *lot* of people to use it
  - Give “silly” examples
      - People will repeat them, you can adapt your system and expect them to say them

Know Your Users
  - Young educated
  - Standard English speakers
      - (Non-native too?)
  - Can you train them to use it better
      - Get them to adapt

What is Missing?
  - Add an SDK
      - Other app developers will want to allow speech
      - May make it harder to distinguish
  - Dialog context
      - What was said in the previous utterance
  - Others …

Will it work?
  - Will people talk in public
      - Talking on the phone is now acceptable
      - Talking to the phone …
  - Will people continue to use it
      - Cool at first, but easier to use menus
      - Only use for setting alarms
      - Long term use …
  - But others may join in anyway

