CS 188: Artificial Intelligence
Spring 2006
Lecture 28: Machine Translation
5/2/2006
Dan Klein – UC Berkeley

Machine Translation: Examples

Levels of Transfer (Vauquois triangle)
  [Figure: the Vauquois triangle. Levels from bottom to top: word structure (direct transfer), syntactic structure (syntactic transfer), semantic structure (semantic transfer), interlingua. The source text ascends through morphological analysis, syntactic analysis, semantic analysis, and semantic composition; the target text is reached by descending through semantic decomposition, semantic generation, syntactic generation, and morphological generation.]

General Approaches
  Rule-based approaches
    Expert system style rewrite systems
    Interlingua methods (analyze and generate)
    Lexicons come from humans or dictionaries
    Can be very fast, and can accumulate a lot of knowledge over time (e.g. Systran)
  Statistical approaches
    Noisy channel systems
    Lower-level transfer
    Lexicons discovered using parallel corpora
    Require little human declaration of knowledge

The Coding View
  “One naturally wonders if the problem of translation could conceivably be treated as a problem in cryptography. When I look at an article in Russian, I say: ‘This is really written in English, but it has been coded in some strange symbols. I will now proceed to decode.’ ”
  Warren Weaver (1955:18, quoting a letter he wrote in 1947)

MT System Components
  Language Model: the source P(e) produces e
  Translation Model: the channel P(f|e) turns e into the observed f
  Decoder: recovers the best e from the observed f
    e_best = argmax_e P(e|f) = argmax_e P(f|e) P(e)
  Finds an English translation which is both fluent and semantically faithful to the French source

Language Models
  Language models
    Any probabilistic model capable of assigning probabilities to sentences
    Usually n-gram models, but also PCFGs
    Exact same technology (and software) as in ASR
    Train on a huge collection of monolingual corpora (documents in the target language)
  [Figure: word chain START, w1, w2, ..., wn-1, STOP]

Parallel Corpora
  Parallel corpora (or bitexts)
    Collection of source-target translation pairs
    Main resource for learning a translation model
    Either naturally occurring (e.g. parliamentary proceedings, news translation services) or commissioned

Building a Translation Model
  Steps in building a simple statistical translation model
    Match up words in training sentence pairs (word alignment)
    Learn a lexicon from these alignments
    Learn larger phrases
  [Figure: word alignment between the two sentences below]
    What is the anticipated cost of collecting fees under the new proposal?
    En vertu de les nouvelles propositions, quel est le coût prévu de perception de les droits?

1-to-Many Alignments

Many-to-Many Alignments

The HMM Alignment Model
  The HMM model (Vogel 96)
  Re-estimate using the forward-backward algorithm
  Handling nulls requires some care
  Note: alignments are not provided, but induced
  [Figure: plot with horizontal axis from -2 to 3]

Examples: Translation and Fertility

Phrases vs Word Models
  il hoche la tête
  he is nodding

Extracting Phrases

Basic Phrase-Based Model [Koehn et al, 2003]
  Segmentation
  Translation
  Distortion

Decoding
  Now we have a phrase table:
    A huge list of translation phrases (e.g. 1M phrases)
    Each phrase has a probability P(f|e)
  When we see a new input sentence:
    Grow a translation left to right
    Extend translation using known phrases
    Also multiply by language model score

The Pharaoh Decoder
  Probabilities at each step include LM and TM

Some Output
  Madame la présidente, votre présidence de cette institution a été marquante.
    Mrs Fontaine, your presidency of this institution has been outstanding.
    Madam President, president of this house has been discoveries.
    Madam President, your presidency of this institution has been impressive.
  Je vais maintenant m'exprimer brièvement en irlandais.
    I shall now speak briefly in Irish.
    I will now speak briefly in Ireland.
    I will now speak briefly in Irish.
  Nous trouvons en vous un président tel que nous le souhaitions.
    We think that you are the type of president that we want.
    We are in you a president as the wanted.
    We are in you a president as we the wanted.

Translations
  Even human translators aren’t perfect:
    In an Austrian ski hotel: Not to perambulate the corridors in the hours of repose in the boots of ascension.
    In a Copenhagen airline ticket office: We take your bags and send them in all directions.
    From a brochure of a car rental firm in Tokyo: When passenger of foot heave in sight, tootle the horn. Trumpet him melodiously at first, but if he still obstacles your passage then tootle him with
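Returning to the noisy-channel rule on the MT System Components slide, argmax over e of P(f|e) P(e), the following is a minimal sketch of how the two models combine at decoding time. Everything in it is an assumption made for illustration: the three-sentence English corpus, the add-one smoothed bigram language model, and the channel scores in CANDIDATES are invented, and the decoder simply scores an explicit candidate list rather than searching. It is not how Pharaoh or any real system is implemented.

```python
import math
from collections import Counter

# Toy monolingual English corpus (invented for illustration).
CORPUS = [
    "i will now speak in irish",
    "she will speak briefly in irish",
    "he lives in ireland",
]

def train_bigram_lm(sentences):
    """Count contexts and bigrams, padding with START/STOP markers."""
    contexts, bigrams = Counter(), Counter()
    for s in sentences:
        words = ["START"] + s.split() + ["STOP"]
        contexts.update(words[:-1])            # every word that has a successor
        bigrams.update(zip(words, words[1:]))
    vocab = {w for s in sentences for w in s.split()} | {"STOP"}
    return contexts, bigrams, len(vocab)

def lm_logprob(sentence, lm):
    """log P(e) under an add-one smoothed bigram model."""
    contexts, bigrams, v = lm
    words = ["START"] + sentence.split() + ["STOP"]
    return sum(
        math.log((bigrams[(a, b)] + 1) / (contexts[a] + v))
        for a, b in zip(words, words[1:])
    )

def decode(candidates, lm):
    """Return the candidate e maximizing log P(f|e) + log P(e)."""
    return max(
        candidates,
        key=lambda e: math.log(candidates[e]) + lm_logprob(e, lm),
    )

# Hypothetical channel scores P(f|e) for one fixed French input f.
# The translation model slightly prefers "ireland" here on purpose.
CANDIDATES = {
    "i will now speak briefly in ireland": 0.03,
    "i will now speak briefly in irish": 0.02,
}

lm = train_bigram_lm(CORPUS)
best = decode(CANDIDATES, lm)
print(best)
```

With these toy numbers the language model outweighs the channel model's slight preference, so the decoder picks the "irish" candidate. This mirrors the slide's point that the chosen translation has to be both faithful (channel score) and fluent (language model score).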
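The first two steps on the Building a Translation Model slide (match up words, then learn a lexicon) can be sketched with EM on a tiny bitext. Note the substitution: the slides name the HMM alignment model trained with forward-backward, whereas this sketch uses the simpler IBM Model 1, which ignores word positions and needs no dynamic programming. The bitext, the function names, and the omission of a NULL word are all assumptions made to keep the example short.

```python
from collections import defaultdict

# Tiny invented bitext: (English, French) sentence pairs.
BITEXT = [
    ("the house", "la maison"),
    ("the flower", "la fleur"),
    ("house", "maison"),
]

def train_model1(bitext, iterations=10):
    """Estimate a lexicon t(f|e) with EM; alignments are induced, not given."""
    pairs = [(e.split(), f.split()) for e, f in bitext]
    f_vocab = {f for _, fs in pairs for f in fs}
    # Uniform initialization of t(f|e), keyed by (f, e).
    t = defaultdict(lambda: 1.0 / len(f_vocab))
    for _ in range(iterations):
        count = defaultdict(float)  # expected number of (f, e) links
        total = defaultdict(float)  # expected number of links from e
        for es, fs in pairs:
            for f in fs:
                # E-step: posterior over which English word generated f.
                norm = sum(t[(f, e)] for e in es)
                for e in es:
                    p = t[(f, e)] / norm
                    count[(f, e)] += p
                    total[e] += p
        # M-step: renormalize expected counts into probabilities.
        for (f, e), c in count.items():
            t[(f, e)] = c / total[e]
    return dict(t)

def best_translation(t, e):
    """Most probable French word for English word e under the lexicon."""
    options = {f: p for (f, e2), p in t.items() if e2 == e}
    return max(options, key=options.get)

lexicon = train_model1(BITEXT)
for word in ("the", "house", "flower"):
    print(word, "->", best_translation(lexicon, word))
```

Although "flower" co-occurs equally often with "la" and "fleur", EM shifts probability toward "fleur" because "la" is better explained by "the". This is the sense in which alignments are induced rather than provided.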