MIT 6 863J - Lecture 20: Machine translation 4


6.863J Natural Language Processing
Lecture 20: Machine translation 4
Robert C. Berwick
6.863J/9.611J Lecture 20 Sp03

The Menu Bar
• Administrivia:
• final projects –
• Agenda:
• Combining statistics with language knowledge in MT
• MT – the statistical approach (the “low road”)
• Evaluation
• Where does it go wrong? Beyond the “talking dog”
• MT – Middleuropa ground
• Transfer Approach: using syntax to help
How to combine w/ statistical information
Can we attain the Holy Grail?

How well does stat MT do?
• What happens if the sentence is already seen (part of training pair)?
• Then the system works just as hard
• Remembrance of translations past…?
• We get “only” 60% accuracy (but better than Systran…)
• Let’s see how to improve this by adding knowledge re syntax
• Probably even better to add knowledge re semantics… as we shall see

The game plan to get better
[Figure: MT Fluency & Adequacy Bake-off; axes ADEQUACY and FLUENCY, each 0 to 100]
Direct/bigram: 30/24%
Systran (transfer): 54/74%
IBM Model 5 (statistical): 58/67%
Human: 84/86%
Stat+Syntax transfer

Problemos
• F in: L’atmosphère de la Terre rend un peu myopes mêmes les meilleurs de leur télèscopes
• E out: The atmosphere of the Earth returns a little myopes same the best ones of their telescopes
• (Systran): The atmosphere of the Earth makes a little short-sighted same the best of their télèscopes
• (Better) The earth’s atmosphere makes even the best of their telescopes a little ‘near sighted’
• Why?

Let’s take a look at some results…

Should
e = should

f           t(f|e)    phi   n(phi|e)
devrait     0.330     1     0.649
devraient   0.123     0     0.336
devrions    0.109     2     0.014
faudrait    0.073
faut        0.058
doit        0.058
aurait      0.041
doivent     0.024
devons      0.017
devrais     0.013

What about…
• In French, what is worth saying is worth saying in many different ways
• He is nodding:
• Il fait signe que oui
• Il fait un signe de la tête
• Il fait un signe de tête affirmatif
• Il hoche la tête affirmativement

Nodding hill…
e = nodding

f           t(f|e)    phi   n(phi|e)
signe       0.164     4     0.342
la          0.123     3     0.293
tête        0.097     2     0.167
oui         0.086     1     0.163
fait        0.073     0     0.023
que         0.073
hoche       0.054
hocher      0.048
faire       0.030
me          0.024
approuve    0.019
qui         0.019
un          0.012
faites      0.011

Best of 1.9 x 10^26 alignments!

Best of 8.4 x 10^29 alignments!

5.6 x 10^31 alignments!

Morals? ¿Moralejas?
• Always works hard – even if the input sentence is one of the training examples
• Ignores morphology – so what happens?
• Ignores phrasal chunks – can we include this? (Do we?)…
• Can we include syntax and semantics?
• (why not?)

Other languages…
• Aligning corpus – a cottage industry
• Instruction Manuals
• Hong Kong Legislation - Hansards
• Macao Legislation
• Canadian Parliament Hansards
• United Nations Reports
• Official Journal of the European Communities

How can we do better?
• Systran: transfer approach
• Q: What’s that?
• A: transfer rules for little bits of syntax
• Then: combine these rules with the statistical method
• Even doing this a little will improve us to about 65%
• Gung ho – we can get to 70%
• Can we get to the magic number?

The golden (Bermuda?) triangle
[Figure: triangle diagram. Labels: Source (eg, Spanish), s; Target (eg, English), t; levels of increasing abstraction, low to high: word-word, syntactic, thematic; Interlingual meaning (universal)]

The Bermuda triangle revisited
[Figure: Vauquois Triangle. Labels: Analysis, Transfer, Interlingua, Generation, Transfer cost]

Transfer station
• Transfer: Contrasts are fundamental to translation. Statements in one theory (source language) are mapped into statements in another theory (target language)
• Interlingua: Meanings are language independent and can be encoded. They are extracted from Source sentences and rendered as Target sentences.

Transfer approach
• Analysis using a morphological analyser, parser and a grammar
• Depending on approach, grammar must build syntactic and/or semantic representation
• Transfer: mapping between S and T
• Generation using grammar and morphological synthesizer (from analysis?)

Transfer system: 2 languages

Transfer – multiple languages

Syntactic Transfer
[Figure: two trees.
Spanish: [NP [Det el] [N1 [N perro] [Adj gris]]]
English: [NP [Det the] [N1 [Adj grey] [N dog]]]]

Syntactic transfer
5 transfer rules: 3 syntax, 2 lexical

Syntactic transfer
• Maps trees to trees
• No need for ‘generation’ except morphology
• Method: top-down recursive, non-deterministic match of transfer rules (where tv is a variable) against tree in source language
• Output is tree in target language (w/o word morphology)

Simple syntactic transfer example
• Rules (English-Spanish) – 3 in previous example
• 1 for NP → NP; 1 for N1 → N1’; 1 for Det → Det
• Lexical correspondences
• Suppose input is as in preceding example – trace through matching

Syntactic transfer

Handling other differences
• E: You like her
• S: Ella te gusta
• Lit: She you-accusative pleases
(Grammatical object in English is subject in Spanish, and v.v.)

Tree mapping rule for this
[Figure: tree mapping rule. Labels: S, S, tv(subj), tv(obj), VP, VP]

Is this systematic?
• Yes, and taxonomic too…
• Roughly 8-9 such ‘classes’ of divergence:
1. Thematic
2. Head switching
3. Structural
4. Lexical Gap
5. Lexicalization
6. Categorial
7. Collocational
8. Multi-lexeme/idiomatic
9. Generalization/morphological

Other divergences – systematic
• E: The baby just ate
• S: El bebé acaba de comer
• Lit: The baby finish of to-eat
(Head-switching)
• E: Luisa entered the house
• S: Luisa entró a la casa
• Lit: Luisa entered to the house
(Structural)

Divergences diverging
• E: Camilio got up early
• S: Camilio madrugó
(Lexical gap)
• E: Susan swam across the channel
• S: Susan cruzó el canal nadando
• (Systran: Susan nadó a través del canal)
• Lit: Susan crossed the channel swimming
(manner & motion combined in verb E, path in across; in S, verb cruzó has motion & path, motion
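
The syntactic transfer slides describe a top-down recursive match of transfer rules against the source tree. Below is a minimal Python sketch of that idea for the "el perro gris" / "the grey dog" example. It is an illustration under stated assumptions, not the lecture's implementation: the tuple tree encoding, the names SYNTAX_RULES, DET_RULE, LEXICON, transfer and leaves are all hypothetical, the direction (Spanish to English) is chosen for convenience, and matching is deterministic (one rule per pattern) rather than non-deterministic.

```python
# Trees are nested tuples: (label, child, ...) for internal nodes,
# (label, word) for preterminals. This encoding is an assumption.

# Syntactic rules (hypothetical encoding): a source label plus its child
# labels maps to a target label and the order in which the source
# children appear in the target tree.
SYNTAX_RULES = {
    ("NP", ("Det", "N1")): ("NP", (0, 1)),  # NP -> NP, same order
    ("N1", ("N", "Adj")): ("N1", (1, 0)),   # N1 -> N1', noun/adjective swap
}
# Third syntactic rule from the slide: Det -> Det.
DET_RULE = {"el": "the"}
# The two lexical correspondences.
LEXICON = {"perro": "dog", "gris": "grey"}


def is_preterminal(children):
    """True when the node directly dominates a single word."""
    return len(children) == 1 and isinstance(children[0], str)


def transfer(tree):
    """Top-down recursive transfer of a source tree into a target tree."""
    label, *children = tree
    if is_preterminal(children):
        word = children[0]
        table = DET_RULE if label == "Det" else LEXICON
        return (label, table[word])
    # Match the rule on the node label and the labels of its children.
    key = (label, tuple(child[0] for child in children))
    target_label, order = SYNTAX_RULES[key]
    # Recurse into the children in target order.
    return (target_label, *(transfer(children[i]) for i in order))


def leaves(tree):
    """Read the words off a tree, left to right."""
    label, *children = tree
    if is_preterminal(children):
        return [children[0]]
    return [word for child in children for word in leaves(child)]


source = ("NP", ("Det", "el"), ("N1", ("N", "perro"), ("Adj", "gris")))
target = transfer(source)
print(target)
print(" ".join(leaves(target)))  # the grey dog
```

The output is a target-language tree without word morphology, as in the slides; a fuller system would try several candidate rules per node and add a morphological generation step.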

