MSU ECE 4522 - Automatic Speaker Recognition

Slide titles
• Automatic Speaker Recognition: Recent Progress, Current Applications, and Future Trends
• Outline
• Extracting Information from Speech
• Introduction: Identification
• Introduction: Verification/Authentication/Detection
• Introduction: Speech Modalities
• Introduction: Voice as a Biometric
• Introduction: Applications
• General Theory: Components of Speaker Verification System
• General Theory: Phases of Speaker Verification System
• General Theory: Features for Speaker Recognition
• General Theory: Speech Production
• General Theory: Speaker Models
• General Theory: Verification Decision
• Verification Performance: Evaluating Speaker Verification Systems
• Verification Performance: NIST Speaker Verification Evaluations
• Verification Performance: Range of Performance
• Verification Performance: Human vs. Machine
• Verification Performance: Application Deployments
• Verification Performance: Speaker + Knowledge Verification
• Conclusions
• Future Directions

MIT Lincoln Laboratory, Nuance Communications

Automatic Speaker Recognition
Recent Progress, Current Applications, and Future Trends
Douglas A. Reynolds, PhD, Senior Member of Technical Staff, M.I.T. Lincoln Laboratory
Larry P. Heck, PhD, Speaker Verification R&D, Nuance Communications

Outline
• Introduction and applications
• General theory
• Performance
• Conclusion and future directions

Extracting Information from Speech
Goal: Automatically extract information transmitted in the speech signal
[Figure: one speech signal feeds three recognizers]
• Speech recognition → words (“How are you?”)
• Language recognition → language name (English)
• Speaker recognition → speaker name (James Wilson)

Introduction: Identification
• Determines who is talking from a set of known voices
• No identity claim from the user (many-to-one mapping)
• Often assumed that the unknown voice must come from the set of known speakers; this is referred to as closed-set identification
[Figure: several candidate speakers, captioned “Whose voice is this?”]

Introduction: Verification/Authentication/Detection
• Determines whether a person is who they claim to be
• User makes an identity claim (one-to-one mapping)
• Unknown voice could come from a large set of unknown speakers; this is referred to as open-set verification
• Adding a “none-of-the-above” option to closed-set identification gives open-set identification
[Figure: one candidate speaker, captioned “Is this Bob’s voice?”]

Introduction: Speech Modalities
Application dictates different speech modalities:
• Text-dependent recognition
– Recognition system knows the text spoken by the person
– Examples: fixed phrase, prompted phrase
– Used for applications with strong control over user input
– Knowledge of the spoken text can improve system performance
• Text-independent recognition
– Recognition system does not know the text spoken by the person
– Examples: user-selected phrase, conversational speech
– Used for applications with less control over user input
– More flexible system, but also a more difficult problem
– Speech recognition can provide knowledge of the spoken text

Introduction: Voice as a Biometric
• Biometric: a human-generated signal or attribute for authenticating a person’s identity
• Voice is a popular biometric:
– natural signal to produce
– does not require a specialized input device
– ubiquitous: telephones and microphone-equipped PCs
• Voice biometric combined with other forms of security
– Something you have, e.g., badge
– Something you know, e.g., password
– Something you are, e.g., voice
[Figure: overlapping regions labeled Have, Know, and Are; the overlap is labeled “Strongest security”]

Introduction: Applications
• Access control
– Physical facilities
– Data and data networks
• Transaction authentication
– Toll fraud prevention
– Telephone credit card purchases
– Bank wire transfers
• Monitoring
– Remote time and attendance logging
– Home parole verification
– Prison telephone usage
• Information retrieval
– Customer information for call centers
– Audio indexing (speech skimming device)
• Forensics
– Voice sample matching

General Theory: Components of Speaker Verification System
[Figure: input speech (“My name is Bob”) passes through feature extraction; the features are scored against the speaker model (Bob’s “voiceprint”, selected by the identity claim “Bob”) and against the impostor model (impostor “voiceprints”); a decision stage outputs ACCEPT or REJECT]

General Theory: Phases of Speaker Verification System
Two distinct phases to any speaker verification system:
• Enrollment phase: enrollment speech for each speaker (Bob, Sally) → feature extraction → model training → voiceprints (models) for each speaker
• Verification phase: input speech with a claimed identity (Sally) → feature extraction → verification decision (“Accepted!”)

General Theory: Features for Speaker Recognition
• Humans use several levels of perceptual cues for speaker recognition
Hierarchy of Perceptual Cues, from high-level cues (learned traits, difficult to automatically extract) to low-level cues (physical traits, easy to automatically extract):
– Semantics, diction, pronunciations, idiosyncrasies: socio-economic status, education, place of birth
– Prosodics, rhythm, speed, intonation, volume modulation: personality type, parental influence
– Acoustic aspect of speech (nasal, deep, breathy, rough): anatomical structure of vocal apparatus
• There are no exclusive speaker-identifying cues
• Low-level acoustic cues are most applicable for automatic systems

General Theory: Features for Speaker Recognition
Desirable attributes of features for an automatic system (Wolf ’72):
• Occur naturally and frequently in speech
• Easily measurable
• Not change over time or be affected by the speaker’s health
• Not be affected by reasonable background noise nor depend on specific transmission characteristics
• Not be subject to
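
The enrollment and verification phases described in the General Theory slides, together with the distinction between one-to-one verification and closed-set identification, can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the slides: each "voiceprint" is a single diagonal Gaussian, the two-dimensional feature frames are made-up numbers rather than real speech features, and the function names (`train_model`, `verify`, `identify`) and the threshold of 0.0 are hypothetical.

```python
import math
from statistics import fmean, pvariance

def train_model(frames):
    """Enrollment: fit a diagonal Gaussian (mean, variance per dimension)."""
    dims = list(zip(*frames))
    means = [fmean(d) for d in dims]
    # Floor the variance so a constant dimension cannot cause division by zero.
    variances = [max(pvariance(d), 1e-6) for d in dims]
    return means, variances

def avg_log_likelihood(frames, model):
    """Average per-frame log-likelihood of the frames under the model."""
    means, variances = model
    total = 0.0
    for frame in frames:
        for x, m, v in zip(frame, means, variances):
            total += -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
    return total / len(frames)

def verify(frames, speaker_model, impostor_model, threshold=0.0):
    """One-to-one decision: claimed speaker model vs. impostor model."""
    score = (avg_log_likelihood(frames, speaker_model)
             - avg_log_likelihood(frames, impostor_model))
    return score >= threshold, score

def identify(frames, models):
    """Closed-set identification: pick the best-scoring known speaker."""
    return max(models, key=lambda name: avg_log_likelihood(frames, models[name]))

# Toy enrollment data (assumed values, not real speech features).
enrollment = {
    "Bob": [[1.0, 2.0], [1.2, 1.8], [0.8, 2.2], [1.1, 2.1]],
    "Sally": [[4.0, -1.0], [4.2, -0.8], [3.8, -1.2], [4.1, -1.1]],
}
impostor_pool = [[0.0, 0.0], [5.0, 5.0], [-3.0, 2.0],
                 [2.0, -4.0], [6.0, -2.0], [-1.0, 4.0]]

models = {name: train_model(frames) for name, frames in enrollment.items()}
impostor_model = train_model(impostor_pool)

bob_test = [[1.05, 1.95], [0.95, 2.05]]
sally_test = [[4.05, -0.95], [3.95, -1.05]]

bob_claim_ok, bob_score = verify(bob_test, models["Bob"], impostor_model)
sally_as_bob_ok, sally_as_bob_score = verify(sally_test, models["Bob"], impostor_model)
print("Bob claiming Bob accepted:", bob_claim_ok)
print("Sally claiming Bob accepted:", sally_as_bob_ok)
print("Closed-set identification of Sally's test speech:", identify(sally_test, models))
```

In this sketch, `verify` needs an impostor model because the unknown voice may come from outside the enrolled set, whereas `identify` simply assumes the speaker is one of the enrolled voices. A deployed system would likely use richer models and a threshold tuned on held-out data.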

