MEMT: Multi-Engine Machine Translation Guided by Explicit Word Matching


Alon Lavie
Language Technologies Institute
Carnegie Mellon University

Joint work with: Gregory Hanneman, Justin Merrill, Shyamsundar Jayaraman, Satanjeev Banerjee, Jaime Carbonell

March 22, 2006 GALE: MEMT

MEMT Goals and Approach
• Scientific Challenge:
  – How to combine the output of multiple MT engines into a synthetic output that outperforms the originals in translation quality
  – Synthetic combination of the output from the original systems, NOT just selecting the best system
• Engineering Challenge:
  – How to integrate multiple distributed translation engines and the MEMT combination engine in a common framework that supports ongoing development and evaluation

Synthetic Combination MEMT
Two Stage Approach:
1. Identify common words and phrases across the translations provided by the engines
2. Decode: search the space of synthetic combinations of words/phrases and select the highest scoring combined translation
Example:
1. announced afghan authorities on saturday reconstituted four intergovernmental committees
2. The Afghan authorities on Saturday the formation of the four committees of government
MEMT: the afghan authorities announced on Saturday the formation of four intergovernmental committees

The Word Alignment Matcher
• Developed by Satanjeev Banerjee as a component in our METEOR Automatic MT Evaluation metric
• Finds maximal alignment match with minimal “crossing branches”
• Allows alignment of:
  – Identical words
  – Morphological variants of words
  – Synonymous words (based on WordNet synsets)
• Implementation: Clever search algorithm for best match using pruning of sub-optimal sub-solutions

Matcher Example
the sri lanka prime minister criticizes the leader of the country
President of Sri Lanka criticized by the country’s Prime Minister

Scoring MEMT Hypotheses
• Scoring:
  – Word confidence score [0,1] based on engine confidence and reinforcement from alignments of the words
  – LM score based on trigram LM
  – Log-linear combination: weighted sum of logs of confidence score and LM score
  – Select best scoring hypothesis based on:
    • Total score (bias towards shorter hypotheses)
    • Average score per word

Demo

Example
IBM: victims russians are one man and his wife and abusing their eight year old daughter plus a ( 11 and 7 years ) man and his wife and driver , egyptian nationality . : 0.6327
ISI: The victims were Russian man and his wife, daughter of the most from the age of eight years in addition to the young girls ) 11 7 years ( and a man and his wife and the bus driver Egyptian nationality. : 0.7054
CMU: the victims Cruz man who wife and daughter both critical of the eight years old addition to two Orient ( 11 ) 7 years ) woman , wife of bus drivers Egyptian nationality . : 0.5293
MEMT Sentence:
Selected: the victims were russian man and his wife and daughter of the eight years from the age of a 11 and 7 years in addition to man and his wife and bus drivers egyptian nationality . 0.7647 -3.25376
Oracle: the victims were russian man and wife and his daughter of the eight years old from the age of a 11 and 7 years in addition to the man and his wife and bus drivers egyptian nationality young girls . 0.7964 -3.44128

System Development
• Initial development tests performed on TIDES 2003 Arabic-to-English MT data, using IBM, ISI and CMU SMT system output
• Evaluation tests performed on Arabic-to-English EBMT Apptek and SYSTRAN system output and on three Chinese-to-English COTS systems
• Tests on GALE dry-run data currently in progress:
  – MT systems from IBM, CMU, UMD

Experimental Results: Arabic-to-English
System                               METEOR Score
Apptek                               .4241
EBMT                                 .4231
Systran                              .4405
Choosing best online translation     .4432
MEMT                                 .5185
Best hypothesis generated by MEMT    .5883

Architecture and Engineering
• Challenge: How do we construct an effective architecture for running MEMT within large-scale distributed projects?
  – Example: GALE Project
  – Multiple MT engines running at different locations
  – Input may be text or output of speech recognizers; output may go downstream to other applications (IE, Summarization, TDT)
• Approach: Using IBM’s UIMA: Unstructured Information Management Architecture
  – Provides support for building robust processing “workflows” with heterogeneous components
  – Components act as “annotators” at the character level within documents

UIMA-based MEMT
• MT engines and MEMT engine are set up as distributed servers:
  – Communication over socket connections
  – Sentence-by-sentence translation
• Java “wrappers” convert these into UIMA-style annotator components
• UIMA-based “workflows” implement a variety of asynchronous tasks, with results stored in a common Annotations Database (ADB)
  – Translation workflows
  – MEMT workflow
  – Evaluation/scoring workflow
• ADB and ADB Collection Reader/Consumer components developed at CMU by Eric Nyberg’s group
• MEMT Workflow:
  – Retrieve document translation annotations labeled by X, Y, Z from ADB
  – “Annotate” the document with a new MEMT annotation
  – Write back MEMT annotation into ADB

Conclusions
• New sentence-level MEMT approach with nice properties and encouraging results
• Easy to run on both research and COTS systems
• UIMA-based architecture design for effective integration in large distributed systems/projects
  – Pilot study has been
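
The log-linear scoring described under "Scoring MEMT Hypotheses" can be illustrated with a minimal sketch. The slides do not specify the weights, the form of the LM score, or how zero confidences are handled, so the function names, the weights w_conf and w_lm, the EPS floor, and the toy numbers below are all assumptions made for illustration. The sketch only shows why selecting on total score favors shorter hypotheses while selecting on average score per word does not.

```python
import math

EPS = 1e-9  # assumed floor so that a confidence of 0 does not produce log(0)


def score_hypothesis(confidences, lm_probs, w_conf=1.0, w_lm=1.0):
    """Return (total, per-word average) of a weighted sum of log scores.

    confidences: one word confidence in [0, 1] per word
    lm_probs:    one language-model probability per word (assumed form)
    """
    if not confidences or len(confidences) != len(lm_probs):
        raise ValueError("need one confidence and one LM probability per word")
    total = sum(
        w_conf * math.log(max(c, EPS)) + w_lm * math.log(max(p, EPS))
        for c, p in zip(confidences, lm_probs)
    )
    return total, total / len(confidences)


def select_best(hypotheses, per_word=False):
    """Pick the hypothesis name with the highest total or per-word score."""
    index = 1 if per_word else 0
    return max(hypotheses, key=lambda name: score_hypothesis(*hypotheses[name])[index])


# Toy data (hypothetical): a short, weak hypothesis and a longer, stronger one.
hypotheses = {
    "short": ([0.5, 0.5], [0.1, 0.1]),
    "long": ([0.9, 0.9, 0.9, 0.9], [0.2, 0.2, 0.2, 0.2]),
}

print(select_best(hypotheses))                 # total score
print(select_best(hypotheses, per_word=True))  # average score per word
```

Because every log term is negative, each additional word lowers the total, which is the length bias the slide mentions; dividing by the word count removes that effect.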

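The matching objective from "The Word Alignment Matcher" (as many aligned words as possible, with as few crossing branches as possible) can be sketched as follows. This is not the METEOR matcher: it aligns identical words only, leaving out morphological variants and WordNet synonyms, and it uses exhaustive backtracking rather than the pruned search the slides mention. The function names are hypothetical.

```python
from itertools import combinations


def count_crossings(pairs):
    """Count pairs of alignment links (i, j) that cross each other."""
    return sum(
        1
        for (i1, j1), (i2, j2) in combinations(pairs, 2)
        if (i1 - i2) * (j1 - j2) < 0
    )


def align(sentence_a, sentence_b):
    """One-to-one exact-word alignment: most links first, then fewest crossings.

    Returns a list of (index_in_a, index_in_b) pairs ordered by index_in_a.
    Exhaustive search, so only suitable for short sentences.
    """
    a = sentence_a.lower().split()
    b = sentence_b.lower().split()
    best = {"key": None, "pairs": []}

    def search(i, used, pairs):
        if i == len(a):
            key = (-len(pairs), count_crossings(pairs))
            if best["key"] is None or key < best["key"]:
                best["key"], best["pairs"] = key, list(pairs)
            return
        search(i + 1, used, pairs)  # option 1: leave word i unaligned
        for j, word in enumerate(b):  # option 2: link it to a free identical word
            if word == a[i] and j not in used:
                used.add(j)
                pairs.append((i, j))
                search(i + 1, used, pairs)
                pairs.pop()
                used.remove(j)

    search(0, set(), [])
    return best["pairs"]


links = align(
    "announced afghan authorities on saturday reconstituted four intergovernmental committees",
    "The Afghan authorities on Saturday the formation of the four committees of government",
)
print(links)
```

On the two engine outputs from the "Synthetic Combination MEMT" slide, the sketch links the shared words "afghan authorities on saturday", "four", and "committees" with no crossings, which is the common material the decoder can then build on.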
