11-731 Machine Translation
Syntax-Based Translation Models – Principles, Approaches, Acquisition
Alon Lavie
16 March 2011

Outline
• Syntax-based Translation Models: Rationale and Motivation
• Resource Scenarios and Model Definitions
    • String-to-Tree, Tree-to-String and Tree-to-Tree
• Hierarchical Phrase-based Models (Chiang’s Hiero)
• Syntax-Augmented Hierarchical Models (Venugopal and Zollmann)
• String-to-Tree Models
    • Phrase-Structure-based Model (Galley et al., 2004, 2006)
• Tree-to-Tree Models
    • Phrase-Structure-based Stat-XFER Model (Lavie et al., 2008)
    • DCU Tree-bank Alignment method (Zhechev, Tinsley et al.)
• Tree-to-String Models
    • Tree Transduction Models (Yamada and Knight, Gildea et al.)

Syntax-based Models: Rationale
• Phrase-based models model translation at very shallow levels:
    • Translation equivalence is modeled at the multi-word lexical level
    • Phrases capture some cross-language local reordering, but only for phrases that were seen in training: no effective generalization
    • Non-local cross-language reordering is modeled only by permuting the order of phrases during decoding
    • No explicit modeling of syntax, structural divergences or syntax-to-semantic mapping differences
• Goal: Improve translation quality using syntax-based models
    • Capture generalizations, reorderings and divergences at appropriate levels of abstraction
    • Models direct the search during decoding to more accurate translations
• Still Statistical MT: Acquire translation models automatically from (annotated) parallel data and model them statistically!

Syntax-based Statistical MT
• Building a syntax-based Statistical MT system:
    • Similar in concept to simpler phrase-based SMT methods:
        • Model acquisition from bilingual sentence-parallel corpora
        • Decoders that, given an input string, can find the best translation according to the models
• Our focus today will be on the models and their acquisition
• Next week: Chris Dyer will cover decoding for hierarchical and syntax-based MT

Syntax-based Resources vs. Models
• Important Distinction:
    1. What structural information for the parallel data is available during model acquisition and training?
    2. What type of translation models are we acquiring from the annotated parallel data?
• Structure available during Acquisition, Main Distinctions:
    • Syntactic/structural information for the parallel training data:
        • Given by external components (parsers) or inferred from the data?
        • Syntax/structure available for one language or for both?
        • Phrase-Structure or Dependency-Structure?
• What do we extract from parallel sentences?
    • Sub-sentential units of translation equivalence annotated with structure
    • Rules/structures that determine how these units combine into full transductions

Syntax-based Translation Models
• String-to-Tree:
    • Models explain how to transduce a string in the source language into a structural representation in the target language
    • During decoding:
        • No separate parsing on the source side
        • Decoding results in a set of possible translations, each annotated with syntactic structure
        • The best-scoring string+structure can be selected as the translation
• Example:
    ne VB pas → (VP (AUX does) (RB not) x2)

Syntax-based Translation Models
• Tree-to-String:
    • Models explain how to transduce a structural representation of the source language input into a string in the target language
    • During decoding:
        • Parse the source string to derive its structure
        • Decoding explores various ways of decomposing the parse tree into a sequence of composable models, each generating a translation string on the target side
        • The best-scoring string can be selected as the translation
• Examples: [figure]

Syntax-based Translation Models
• Tree-to-Tree:
    • Models explain how to transduce a structural representation of the source language input into a structural representation in the target language
    • During decoding:
        • The decoder synchronously explores alternative ways of parsing the source-language input string and transducing it into corresponding target-language structural output
        • The best-scoring structure+string can be selected as the translation
• Example:
    NP::NP [VP 北 CD 北 北北] → [one of the CD countries that VP]
    (
    ;; Alignments
    (X1::Y7)
    (X3::Y4)
    )

Structure Available During Acquisition
• What information/annotations are available for the bilingual sentence-parallel training data?
    • (Symmetrized) Viterbi word alignments (e.g., from GIZA++)
    • (Non-syntactic) extracted phrases for each parallel sentence
    • Parse-trees/dependencies for the “source” language
    • Parse-trees/dependencies for the “target” language
• Some major potential issues and problems:
    • GIZA++ word alignments are not aware of syntax; word-alignment errors can have bad consequences for the extracted syntactic models
    • Using external monolingual parsers is also problematic:
        • Using the single-best parse for each sentence introduces parsing errors
        • Parsers were designed for monolingual parsing, not translation
        • Parser design decisions for each language may be very different:
            • Different notions of constituency and structure
            • Different sets of POS and constituent labels

Hierarchical Phrase-Based Models
• Proposed by David Chiang
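
To make the Tree-to-Tree example rule from the earlier slide more concrete, the following is a minimal Python sketch of how such a rule and its constituent alignments might be represented and applied. The class name TransferRule, its field names and the apply method are illustrative assumptions, not the actual Stat-XFER implementation, and the English fillers passed in for the VP and CD constituents are hypothetical sub-translations chosen only to exercise the rule.

```python
from dataclasses import dataclass


@dataclass
class TransferRule:
    """Hypothetical container for one tree-to-tree transfer rule."""
    src_label: str      # constituent label on the source side
    tgt_label: str      # constituent label on the target side
    src_rhs: tuple      # source-side symbols, addressed as X1..Xn
    tgt_rhs: tuple      # target-side symbols, addressed as Y1..Ym
    alignments: dict    # source position i -> target position j, i.e. (Xi::Yj)

    def apply(self, translated):
        """Fill aligned target slots with already-translated constituents.

        `translated` maps a 1-based source position to the target-language
        string produced for that constituent. Unaligned target symbols are
        emitted literally.
        """
        out = list(self.tgt_rhs)
        for src_pos, tgt_pos in self.alignments.items():
            out[tgt_pos - 1] = translated[src_pos]
        return " ".join(out)


# The rule from the slide: NP::NP with alignments (X1::Y7) and (X3::Y4).
rule = TransferRule(
    src_label="NP",
    tgt_label="NP",
    src_rhs=("VP", "北", "CD", "北", "北北"),
    tgt_rhs=("one", "of", "the", "CD", "countries", "that", "VP"),
    alignments={1: 7, 3: 4},
)

# Hypothetical translations of the aligned sub-constituents (assumed input).
result = rule.apply({
    1: "have diplomatic relations with North Korea",
    3: "few",
})
print(result)
```

The alignments carry the reordering: the source VP is first on the source side but last on the target side, while the unaligned target words are inserted by the rule itself.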