CMU LTI 11731 - Machine Translation Word Alignment

Slide 1: Machine Translation - Word Alignment
- Stephan Vogel
- Spring Semester 2011

Outline: Overview; Fertility Models; The Generative Story; Fertility Model; Fertility Model: Constraints; Slide 7; Fertility Model: Some Issues; Fertility Model: Empty Position; Slide 10; Deficiency; IBM 4: 1st Order Distortion Model; Inverted Alignment; Characteristics of Alignment Models; Consideration: Overfitting; Extension: Using Manual Dictionaries; Extension: Using POS; And Much More ...; Alignment Results; Unaligned Words; Alignment Errors for Most Frequent Words (CH-EN); Sentence Length Distribution; Summary

Slide 2: Overview
- IBM 3: Fertility
- IBM 4: Relative Distortion
- Acknowledgement: these slides are based on slides by Hermann Ney and Franz Josef Och

Slide 3: Fertility Models
- Basic concept: each word in one language can generate multiple words in the other language:
  deseo - I would like
  übermorgen - the day after tomorrow
  departed - fuhr ab
- The same word can generate a different number of words -> probability distribution over fertilities
- The alignment is a function -> fertility only on one side
- In my terminology: target words have fertility, i.e. each target word can cover multiple source words
  - Others say the source word generates multiple target words
- Some source words are aligned to the NULL word, i.e. the NULL word has fertility
- Many target words are not aligned, i.e. have fertility 0

Slide 4: The Generative Story
- [Diagram] The source words e_0 e_1 e_2 e_3 e_4 e_5 receive fertilities 1 2 0 1 3 0 and generate the tablets f_01; f_11 f_12; f_31; f_41 f_42 f_43, which are permuted into the target sentence f_1 ... f_7
- Three steps: fertility generation, word generation, permutation generation

Slide 5: Fertility Model
- Alignment model: Pr(f_1^J | e_1^I) = \sum_{a_1^J} Pr(f_1^J, a_1^J | e_1^I)
- Select a fertility for each English word: \phi_i = \phi(e_i)
- For each English word select a tablet of French words: \tilde{f}_i = \tilde{f}_{i,1}, ..., \tilde{f}_{i,\phi_i}
- Select a permutation for the entire sequence of French words: \pi: (i, k) -> j = \pi_{i,k}
- Sum over all realizations:
  Pr(f_1^J, a_1^J | e_0^I) = \sum_{(\tilde{f}, \pi) -> (f_1^J, a_1^J)} Pr(\tilde{f}_0^I, \pi_0^I | e_0^I)

Slide 6: Fertility Model: Constraints
- Fertility is bound to the alignment: \phi_i = \sum_{j=1}^{J} \delta(a_j, i)
- Permutation: a_{\pi_{i,k}} = i for k = 1, ..., \phi_i
- French words: f_{\pi_{i,k}} = \tilde{f}_{i,k}

Slide 7: Fertility Model
- Decomposition into factors:
  Pr(\tilde{f}_0^I, \pi_0^I | e_0^I) = Pr(\phi_0^I | e_0^I) \cdot Pr(\tilde{f}_0^I | \phi_0^I, e_0^I) \cdot Pr(\pi_0^I | \tilde{f}_0^I, \phi_0^I, e_0^I)
- Apply the chain rule to each factor and limit the dependencies:
  - Fertility generation (IBM 3, 4, 5): Pr(\phi_0^I | e_0^I) = p(\phi_0 | \phi_1^I) \prod_{i=1}^{I} p(\phi_i | e_i)
  - Word generation (IBM 3, 4, 5): Pr(\tilde{f}_0^I | \phi_0^I, e_0^I) = \prod_{i=0}^{I} \prod_{k=1}^{\phi_i} p(\tilde{f}_{i,k} | e_i)
  - Permutation generation (only IBM 3): Pr(\pi_0^I | \tilde{f}_0^I, \phi_0^I, e_0^I) = \frac{1}{\phi_0!} \prod_{i=1}^{I} \prod_{k=1}^{\phi_i} p(\pi_{i,k} | i, I, J)
- Note: the factor 1/\phi_0! results from the special model for i = 0.

Slide 8: Fertility Model: Some Issues
- The permutation model cannot guarantee that \pi is a permutation
  -> words can be stacked on top of each other
  -> this leads to deficiency
- Position i = 0 is not a real position
  -> special alignment and fertility model for the empty word

Slide 9: Fertility Model: Empty Position
- Alignment assumptions for the empty position i = 0:
  - uniform position distribution for each of the \phi_0 French words generated from e_0
  - place these French words only after all other words have been placed
- Alignment model for the positions aligned to the empty position:
  - One position: p(\pi_{0,k} = j | i = 0, I, J) = 0 if j is occupied, 1 / (\phi_0 - k + 1) if j is vacant
  - All positions: \prod_{k=1}^{\phi_0} p(\pi_{0,k} | i = 0, I, J) = \frac{1}{\phi_0!}

Slide 10: Fertility Model: Empty Position
- Fertility model for the words generated by e_0, i.e. by the empty position
- We assume that each word from f_1^J requires the empty word with probability [1 - p_0]
- Probability that exactly \phi_0 of the J words in f_1^J require the empty word:
  p(\phi_0 | e_0^I) = \binom{J'}{\phi_0} \, p_0^{J' - \phi_0} \, [1 - p_0]^{\phi_0}, with J' := \sum_{i=1}^{I} \phi_i and J = J' + \phi_0
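
Aside (not part of the original slides): to make the generative story of slides 4-10 concrete, here is a minimal Python sketch that samples one target sentence from a toy Model 3. The tables FERTILITY and TRANSLATION and the constant P1 are invented illustrative values, and the distortion step is simplified to a uniform permutation instead of a trained p(j | i, I, J) table.

import random

# Hypothetical model parameters (illustrative values, not trained).
P1 = 0.1                      # probability that a generated word requires the empty word
FERTILITY = {                 # p(phi | e): fertility distribution per English word
    "i": {1: 1.0},
    "would": {0: 0.6, 1: 0.4},
    "like": {1: 0.3, 2: 0.7},
}
TRANSLATION = {               # p(f | e): lexical translation probabilities
    "NULL": {"pues": 1.0},
    "i": {"yo": 1.0},
    "would": {"quisiera": 1.0},
    "like": {"deseo": 0.5, "quisiera": 0.5},
}


def sample(dist):
    """Draw a key from a {value: probability} dictionary."""
    r, acc = random.random(), 0.0
    for value, p in dist.items():
        acc += p
        if r <= acc:
            return value
    return value  # guard against floating-point rounding


def generate(english):
    """Sample one target sentence following the Model 3 generative story."""
    # 1) Fertility generation: phi_i ~ p(phi | e_i) for every real source word.
    fertilities = [sample(FERTILITY[e]) for e in english]
    j_prime = sum(fertilities)

    # Fertility of the empty word e_0: each of the J' words generated by real
    # words requires an additional empty-word translation with probability P1
    # (the binomial model from the empty-position slide).
    phi_0 = sum(random.random() < P1 for _ in range(j_prime))

    # 2) Word generation: a tablet of target words for every source word.
    tablets = [[sample(TRANSLATION[e]) for _ in range(phi)]
               for e, phi in zip(english, fertilities)]
    null_tablet = [sample(TRANSLATION["NULL"]) for _ in range(phi_0)]

    # 3) Permutation generation: place the real-word translations first
    #    (uniform distortion here, for simplicity), then drop the empty-word
    #    translations into the remaining vacant positions.
    J = j_prime + phi_0
    target = [None] * J
    positions = list(range(J))
    random.shuffle(positions)
    real_positions, null_positions = positions[:j_prime], positions[j_prime:]
    for pos, word in zip(real_positions, [w for t in tablets for w in t]):
        target[pos] = word
    for pos, word in zip(null_positions, null_tablet):
        target[pos] = word
    return target


if __name__ == "__main__":
    random.seed(0)
    print(generate(["i", "would", "like"]))

Running the script prints one sampled target sentence; rerunning with a different seed illustrates how fertility, empty-word insertion, and permutation each contribute randomness.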
Slide 14: Deficiency
- The distortion model for real words is deficient
- The distortion model for the empty word is non-deficient
- Deficiency can be reduced by aligning more words to the empty word
- The training-corpus likelihood can be increased by aligning more words with the empty word
- Play with p_0!

Slide 15: IBM 4: 1st Order Distortion Model
- Introduce more detailed dependencies into the alignment (permutation) model
- First-order dependency along the e-axis
- [Diagram contrasting the HMM and IBM4 first-order dependencies]

Slide 16: Inverted Alignment
- Consider inverted alignments B: i -> B_i \subseteq {1, ..., j, ..., J}
- Dependency along the I axis: jumps along the J axis
- Two first-order models, p_{=1}(\Delta j | ...) for aligning the first word in a set and p_{>1}(\Delta j | ...) for aligning the remaining words
- We skip the math :-)

Slide 17: Characteristics of Alignment Models

  Model  Alignment  Fertility  E-step   Deficient
  IBM1   Uniform    No         Exact    No
  IBM2   0-order    No         Exact    No
  HMM    1-order    No         Exact    No
  IBM3   0-order    Yes        Approx.  Yes
  IBM4   1-order    Yes        Approx.  Yes
  IBM5   1-order    Yes        Approx.  No

Slide 18: Consideration: Overfitting
- Training on data always carries the danger of overfitting:
  - the model describes the training data in too much detail
  - but does not perform well on unseen test data
- Solution: smoothing
- Lexicon: distribute some of the probability mass from seen events to unseen events
  - for p(f | e), do this for each e
  - for unseen e: uniform distribution or ???
- Distortion: interpolate with a uniform distribution (a small sketch of this interpolation is given after the last slide below):
  p'(a_j | a_{j-1}, I) = (1 - \alpha) \, p(a_j | a_{j-1}, I) + \alpha / I
- Fertility: for many languages 'longer word' = 'more content', e.g. compounds or agglutinative morphology
  - train a fertility model p(\phi | g(e)) given the word length g(e) and interpolate it with the word-based fertility model
  - interpolate the fertility estimates based on word frequency: for a frequent word use the word model, for a low-frequency word bias towards the length model

Slide 19: Extension: Using Manual Dictionaries
- Adding manual dictionaries:
  - Simple method 1: add as bilingual data
  - Simple method 2: interpolate the manual with the trained dictionary
  - Use constraint GIZA (Gao, Nguyen, Vogel, WMT 2010)
  - Can put a higher weight on word pairs from the dictionary (Och, ACL 2000)
  - Not so simple: "But dictionaries are data too" (Brown et al., HLT 93)
- Problem: manual dictionaries do not have inflected forms
- Possible ...
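
Aside (not part of the original slides): the sketch below illustrates the "interpolate with uniform distribution" smoothing from the overfitting slide. The function name and the default alpha are illustrative choices, not values from the lecture.

def smoothed_jump_prob(p_jump, a_j, a_prev, I, alpha=0.1):
    """Interpolate a trained first-order jump/distortion probability with a
    uniform distribution over the I positions:

        p'(a_j | a_{j-1}, I) = (1 - alpha) * p(a_j | a_{j-1}, I) + alpha / I

    `p_jump` is any callable returning the trained probability; the default
    `alpha` is a made-up smoothing weight, not a value from the lecture."""
    return (1.0 - alpha) * p_jump(a_j, a_prev, I) + alpha / I

Even a jump the trained table has never seen keeps probability alpha / I, which is exactly the point of the smoothing.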

