CMU CS 15492 - multilingual - D1894408

School name Carnegie Mellon University

Course Cs 15492- Special Topic: Speech Processing

Pages 23

This preview shows page 1-2-22-23 out of 23 pages.


Speech Processing 15-492/18-492
Multilinguality

Dealing with *all* Languages
  Over 6000 languages
    Maybe not all commercially interesting … now
  Major languages (economic)
    Cell phone manufacturers list 46 languages
    But even those not all covered

What you need
  ASR
    Acoustic model (lots of speakers)
    Pronunciation lexicon
    Language model
  TTS
    Acoustic model (one speaker)
    Pronunciation lexicon
    Text analysis

Writing Systems
  Romanized writing systems
    Latin-1 (ISO-8859-1)
    Covers many Western European languages
  Cyrillic
    Covers many Eastern European languages
  Arabic scripts
    Arabic(s), Farsi, Urdu, etc.
  Devanagari
    Covers many Northern Indian languages
  Chinese Hanzi
    Covers some Chinese dialects but different versions
  Many other scripts, some non-standard

Writing Systems
  Letter based
    Latin, Cyrillic
  Consonant based
    Arabic, Hebrew
  Mora based
    Half syllable or syllable
    Indian scripts, Japanese native scripts
  Syllable based
    Hangul, Chinese

Standards
  Writing standards
    Taught at schools, newspapers, computer support
    Typically standardized spelling
  May be mostly spoken
    Occasionally written

Language Specific Issues
  No explicit markings
    Stress, accent, tones
  No word boundaries
    Chinese, Thai
  No (short) vowels
    Arabic, Hebrew
  Rich morphology
    Many different words in the languages
    Finnish, Turkish, Greenlandic

Genre Specific Issues
  No capitals, punctuation
    Unpunctuated
  Plain vs polite form
  Speech vs text form
  Many foreign phrases
    (technology directed genres)
  Many new abbreviations
    E.g. SMS messages

Character Encoding
  Unicode vs utf8 vs latin
    Documents mix them
  Sometimes accents omitted
    For ease of typing
  Lots of standards
    Unicode, EUC, BIG5, TIS42, …
  Everyone has their own standard
  Some create their own standards
  Mixed character sets

Phoneme Sets
  Hard to find consensus for new languages
    Typically lots of different dialects
  What level of distinction?
    Some good for speech but not really phonetic
    /t/ vs /dx/ in "water"
  Often doesn't include foreign phones
    /w/ in German is common for younger people

Words
  May be hard to define
    No word boundaries
  Rich morphology
    Words have many variations of compounds
    Yomenakatta -> could not read
    Yomemasendeshita -> could not read (polite)
  Gender specific speech
    Boku vs atashi
  Language mixtures

Pronunciation lexicons
  "Proper" speech vs "actual" speech
  Hard to generalize
    Chinese
  Cross lingual pronunciations
    "Human" (English/German)

"Industry" way
  Collect at least 100 hours of spoken speech
    At least 20 different speakers
    Mixture of gender, age, etc.
    Through desired channel (phone/desktop)
  Collect at least 5 hours from one speaker
    High quality recording studio
  Data should be targeted to application
  Build pronunciation lexicon
    Expert phonologist

Industry way
  Probably 3-6 months
    Lead developer
    Local language expert
    Lots of human transcribers
  Costs?
    Many hundreds of thousands

Or cheaper (?) …
  Find existing data
    Linguistic Data Consortium (UPenn)
    ELRA (European equivalent)
    Appen, Australia
  Find local people who have collected data
  Found data might be in wrong format
    Data cleaning is often the most expensive

Actual way
  Often mixture
    Found data for initial model
    Collect data with actual/initial application

Multilingual
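
The Character Encoding slide notes that documents mix UTF-8 and Latin-1 and that accents are sometimes omitted for ease of typing. The following is a minimal Python sketch of one way to cope with both when matching text against a pronunciation lexicon. It is not taken from the lecture: the function names (decode_mixed, fold_accents, build_index) are hypothetical, the UTF-8-then-Latin-1 fallback is only a heuristic, and folding accents is an assumption that is unsafe for languages where accented letters are distinct letters.

```python
import unicodedata


def decode_mixed(raw: bytes) -> str:
    """Decode bytes as UTF-8, falling back to Latin-1 (heuristic, hypothetical helper)."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Latin-1 maps every byte to a code point, so this branch never raises.
        return raw.decode("latin-1")


def fold_accents(text: str) -> str:
    """Return a case-folded, accent-free lookup key (hypothetical helper)."""
    # NFD splits letters such as U+00E9 into a base letter plus a combining mark.
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", stripped).casefold()


def build_index(words):
    """Map each folded key to the properly accented lexicon entries that share it."""
    index = {}
    for word in words:
        canonical = unicodedata.normalize("NFC", word)
        index.setdefault(fold_accents(canonical), []).append(canonical)
    return index


if __name__ == "__main__":
    # The same word arriving in two encodings, plus a form typed without the accent.
    inputs = ["caf\u00e9".encode("utf-8"), "caf\u00e9".encode("latin-1"), b"cafe"]
    index = build_index(["caf\u00e9", "na\u00efve"])
    for raw in inputs:
        text = decode_mixed(raw)
        print(repr(raw), "->", index.get(fold_accents(text)))
```

Keeping the accented form as the stored value and using the folded form only as a lookup key means the lexicon itself stays correct, while unaccented input can still find its candidates. When two distinct words fold to the same key, the index returns both and a later stage has to choose between them.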
