Johns Hopkins EN 600 465 - Machine translation: Word-based models and the EM algorithm


Machine translation: Word-based models and the EM algorithm
Chris Callison-Burch (slides borrowed from Philipp Koehn)
December 3, 2007

Machine translation
• Task: make sense of foreign text like [figure: foreign-text sample, not preserved]
• One of the oldest problems in Artificial Intelligence
• Solutions may encompass many other NLP applications: parsing, generation, word sense disambiguation, named entity recognition, transliteration, pronoun resolution, etc.

The Rosetta stone
• Egyptian language was a mystery for centuries
• In 1799 a stone with Egyptian text and its translation into Greek was found
⇒ Allowed people to learn how to translate Egyptian

Modern day Rosetta stone
English:
sooner or later we will have to be sufficiently progressive in terms of own resources as a basis for this fair tax system . we plan to submit the first accession partnership in the autumn of this year . it is a question of equality and solidarity . the recommendation for the year 1999 has been formulated at a time of favourable developments and optimistic prospects for the european economy . that does not , however , detract from the deep appreciation which we have for this report . what is more , the relevant cost dynamic is completely under control .
German:
früher oder später müssen wir die notwendige progressivität der eigenmittel als grundlage dieses gerechten steuersystems zur sprache bringen . wir planen , die erste beitrittspartnerschaft im herbst dieses jahres vorzulegen . hier geht es um gleichberechtigung und solidarität . die empfehlung für das jahr 1999 wurde vor dem hintergrund günstiger entwicklungen und einer für den kurs der europäischen wirtschaft positiven perspektive abgegeben . im übrigen tut das unserer hohen wertschätzung für den vorliegenden bericht keinen abbruch . im übrigen ist die diesbezügliche kostenentwicklung völlig unter kontrolle .

Parallel data
• Lots of translated text available: hundreds of millions of words of translated text for some language pairs
  – a book has a few hundred thousand words
  – an educated person may read 10,000 words a day
    → 3.5 million words a year
    → 300 million a lifetime
  → soon computers will be able to see more translated text than humans read in a lifetime
⇒ Machines can learn how to translate foreign languages

Statistical Machine Translation
• Components: Translation model, language model, decoder
[Figure: foreign/English parallel text → statistical analysis → Translation Model; English text → statistical analysis → Language Model; Translation Model and Language Model → Decoding Algorithm]

Lexical translation
• How to translate a word → look up in dictionary
  Haus: house, building, home, household, shell.
• Multiple translations
  – some more frequent than others
  – for instance: house and building most common
  – special cases: Haus of a snail is its shell

Collect statistics
• Look at a parallel corpus (German text along with English translation)

  Translation of Haus    Count
  house                  8,000
  building               1,600
  home                     200
  household                150
  shell                     50

Estimate translation probabilities
• Maximum likelihood estimation

$$p_f(e) = \begin{cases} 0.8 & \text{if } e = \text{house}, \\ 0.16 & \text{if } e = \text{building}, \\ 0.02 & \text{if } e = \text{home}, \\ 0.015 & \text{if } e = \text{household}, \\ 0.005 & \text{if } e = \text{shell}. \end{cases}$$

Alignment
• In a parallel text (or when we translate), we align words in one language with the words in the other

  German:   das(1)  Haus(2)   ist(3)  klein(4)
  English:  the(1)  house(2)  is(3)   small(4)

• Word positions are numbered 1–4
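The maximum likelihood estimates for Haus can be reproduced from the counts in the "Collect statistics" table by dividing each count by the total. The snippet below is a minimal sketch of that relative-frequency calculation; the names `haus_counts` and `mle_from_counts` are illustrative and do not come from the slides.

```python
# Counts of English translations of "Haus", copied from the slide table.
haus_counts = {
    "house": 8000,
    "building": 1600,
    "home": 200,
    "household": 150,
    "shell": 50,
}


def mle_from_counts(counts):
    """Relative-frequency (maximum likelihood) estimate: count / total."""
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


haus_probs = mle_from_counts(haus_counts)

for word, prob in haus_probs.items():
    print(f"{word:10s} {prob:.3f}")
```

Because the counts sum to 10,000, each probability is simply the count divided by 10,000, and the estimates sum to 1.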
Alignment function
• Formalizing alignment with an alignment function
• Mapping an English target word at position i to a German source word at position j with a function a : i → j
• Example
  a : {1 → 1, 2 → 2, 3 → 3, 4 → 4}

Reordering
• Words may be reordered during translation

  German:   klein(1)  ist(2)    das(3)  Haus(4)
  English:  the(1)    house(2)  is(3)   small(4)

  a : {1 → 3, 2 → 4, 3 → 2, 4 → 1}

One-to-many translation
• A source word may translate into multiple target words

  German:   das(1)  Haus(2)   ist(3)  klitzeklein(4)
  English:  the(1)  house(2)  is(3)   very(4)  small(5)

  a : {1 → 1, 2 → 2, 3 → 3, 4 → 4, 5 → 4}

Dropping words
• Words may be dropped when translated
  – The German article das is dropped

  German:   das(1)    Haus(2)  ist(3)    klein(4)
  English:  house(1)  is(2)    small(3)

  a : {1 → 2, 2 → 3, 3 → 4}

Inserting words
• Words may be added during translation
  – The English just does not have an equivalent in German
  – We still need to map it to something: special null token

  German:   NULL(0)  das(1)    Haus(2)  ist(3)   klein(4)
  English:  the(1)   house(2)  is(3)    just(4)  small(5)

  a : {1 → 1, 2 → 2, 3 → 3, 4 → 0, 5 → 4}

IBM Model 1
• Generative model: break up translation process into smaller steps
  – IBM Model 1 only uses lexical translation
• Translation probability
  – for a foreign sentence $f = (f_1, ..., f_{l_f})$ of length $l_f$
  – to an English sentence $e = (e_1, ..., e_{l_e})$ of length $l_e$
  – with an alignment of each English word $e_j$ to a foreign word $f_i$ according to the alignment function a : j → i

$$p(e, a \mid f) = \prod_{j=1}^{l_e} t(e_j \mid f_{a(j)})$$

Example
Sentence pair: das Haus ist klein / the house is small

  f = das                f = Haus
  e        t(e|f)        e          t(e|f)
  the      0.7           house      0.8
  that     0.15          building   0.16
  which    0.075         home       0.02
  who      0.05          household  0.015
  this     0.025         shell      0.005

  f = ist                f = klein
  e        t(e|f)        e          t(e|f)
  is       0.8           small      0.4
  ’s       0.16          little     0.4
  exists   0.02          short      0.1
  has      0.015         minor      0.06
  are      0.005         petty      0.04

p(e, a|f) = t(the|das) × t(house|Haus) × t(is|ist) × t(small|klein)
          = 0.7 × 0.8 × 0.8 × 0.4
          = 0.1792

Learning lexical translation models
• We would like to estimate the lexical translation probabilities t(e|f) from
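
The IBM Model 1 formula from the slides, p(e, a|f) as a product of t(e_j|f_a(j)) over the English positions j, can be evaluated directly for the worked example. The following is a minimal sketch under stated assumptions: the function name `model1_prob` and the nested-dictionary layout of `t` are illustrative choices, the table values are copied from the Example slide, and the NULL token (position 0) is left out because the tables list no probabilities for it.

```python
# Lexical translation tables t(e|f) from the Example slide,
# stored as t[foreign_word][english_word].
t = {
    "das": {"the": 0.7, "that": 0.15, "which": 0.075, "who": 0.05, "this": 0.025},
    "Haus": {"house": 0.8, "building": 0.16, "home": 0.02,
             "household": 0.015, "shell": 0.005},
    "ist": {"is": 0.8, "'s": 0.16, "exists": 0.02, "has": 0.015, "are": 0.005},
    "klein": {"small": 0.4, "little": 0.4, "short": 0.1, "minor": 0.06,
              "petty": 0.04},
}


def model1_prob(english, foreign, alignment, table):
    """Product over English positions j of t(e_j | f_a(j)).

    `alignment` maps 1-based English positions to 1-based foreign
    positions, matching the a : j -> i notation on the slides.
    Word pairs missing from the table contribute probability 0.
    """
    prob = 1.0
    for j, e_word in enumerate(english, start=1):
        f_word = foreign[alignment[j] - 1]
        prob *= table[f_word].get(e_word, 0.0)
    return prob


foreign = ["das", "Haus", "ist", "klein"]
english = ["the", "house", "is", "small"]
alignment = {1: 1, 2: 2, 3: 3, 4: 4}

p = model1_prob(english, foreign, alignment, t)
print(f"p(e, a | f) = {p:.4f}")
```

Swapping in a different English word, for example "little" in place of "small", only changes the single factor that looks up that word, which is what makes the model purely lexical.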

