CMU LTI 11731 - HISTORY OF MACHINE TRANSLATION

HISTORY OF MACHINE TRANSLATION
LTI MT Graduate Class
Jaime Carbonell
January 22, 2007

OUTLINE:
• Origins of MT
• MIT and Georgetown Experiments
• ALPAC Report
• The MT Winter
• MT in Europe and Japan
• Resurgence of MT
• Current approaches to MT

Origins of MT: Early “Successes”
• 1933 – Smirnov-Troyanskii patent for a word translation & printing machine
• 1939-1941 – Troyanskii added memory (first Russian computer)
• 1946 – MT as code-breaking (ENIAC in US), Weaver et al.
• 1946-1947 – Weaver, Booth, Wiener… Weaver realizes complexity
• 1949 – Weaver Memorandum (what it would take for MT)

Origins of MT: Early “Successes”
• 1951 – Bar-Hillel survey → Human/machine is best
• 1952 – MIT Conference on MT (first small scale E-F, F-E mostly)
• 1954 – Mechanical Translation Journal (Yngve)
• 1954 – Georgetown-IBM Experiment (50 sentences R-E) → massive US funding

Origins of MT: Early “Successes”
• 1956-1962 – Massive MT efforts at U of Washington, IBM, Georgetown, MIT, Harvard, Oak Ridge, Rand, using any and all hardware including Mark II, ILLIAC, …
• 1960-1964 – Kuno (Harvard) and Oettinger (Georgetown) parser
• 1955-1967 – UK active in MT (Booth, Cambridge group)
• 1956-1965 – MT in Japan starts (Wada at ETL, Fukuoka at Kyushu, …)
• 1960s onward – GETA in Grenoble (Vauquois)

Origins of MT: End of Optimism
• 1960 – Bar-Hillel report and the FAHQT (Fully Automatic High-Quality Translation) Myth
• 1964, April – ALPAC Report

The MIT Early History: Bar-Hillel
• Philosopher & Mathematician, but turned Linguist & MT booster
• First-ever full-time MT researcher (MIT: 1951-1953)
• Recognized lexical ambiguity as largest challenge for MT
• Identified other MT challenges

Ambiguity Makes MT Hard (not Bar-Hillel’s examples)
• Syntactic
  I saw the Grand Canyon flying to New York.
  Observe the man with the telescope with care.
• Word Sense (i.e., “polysemy”)
  Power line (cable)
  Subway line (track)
  Be on line (be connected to internet)
  Be on the line (be on telephone)
  Line up (verb: to form a straight line)
  Line one’s pockets (verb: to get rich)
  Line one’s jacket (verb: add layer)
  Actor’s line (what an actor says)
  Get a line on someone (verb: get info)

Ambiguity Makes MT Hard
• Word Sense (even more senses in multiple English-Japanese Dictionaries)
  Power line – densen (電線)
  Subway line – chikatetsu (地下鉄)
  (Be) on line – onrain (オンライン)
  (Be) on the line – denwachuu (電話中)
  Line up – narabu (並ぶ)
  Line one’s pockets – kanemochi ni naru (金持ちになる)
  Line one’s jacket – uwagi o nijuu ni suru (上着を二重にする)
  Actor’s line – serifu (セリフ)
  Get a line on – joho o eru (情報を得る)

Types of Machine Translation
[Figure: diagram with labels Interlingua; Syntactic Parsing; Semantic Analysis; Sentence Planning; Text Generation; Source (Arabic); Target (English); Transfer Rules; Direct: SMT, EBMT]

The MIT Early History: Victor Yngve
• High-Energy Physicist turned Linguist
• 2nd-ever full-time MT researcher (MIT: 1953-1961)
• Word-for-word MT => syntax matters (for resolving homonyms, e.g. “block”, and for word-order inversion)
• Recognized phrasal lexicon

The MIT Early History: Victor Yngve
• Invented analysis-transfer-generation method
• Invented COMIT (operational grammar encoding)
• Implemented Chomsky’s TG in COMIT (which proved a dismal failure for analysis)

The Georgetown Early History: Leon Dostert
• Linguist & Interpreter during WWII
• Attracted most MT funding (military)
• Focused on Russian => English
• Strongest advocate for MT research

The Georgetown Early History: Other Contributors
• Peter Toma – system builder
• Muriel Vasconcellos – later PanAm MT
• M. Zarechnak – Linguist

The Georgetown Early History: First “large-scale” MT
• About 100,000-word Russian text MTed in demo, adding out-of-dictionary words (1958)
• System scaled further in next 5 years
• GAT (Georgetown Automated Translator) → well-known SYSTRAN in later years

The ALPAC Report: Members
• Pierce (Chair), Bell Labs
• Several discouraged MT researchers (Oettinger, Hays)
• Linguists (Hamp, Hockett)
• Token Computer Scientist (Alan Perlis from Carnegie Tech)

The ALPAC Report: Findings
• Myth – MT does not and cannot work
• Reality – MT is more difficult than originally envisioned
• Reality – Basic Research in NLP should be done before doing MT
• Reality – MT is too expensive (computers cost more than people)

The ALPAC Report: Net Effect
• The end of Government-funded MT research in US for 10+ years
• Continuation of private MT (e.g. Systran, Logos) in US
• Not much effect on Japan or France (efforts continued)
• USSR and UK followed US example, it appears

MT: 1967-1985
ALPAC Myth Fades Away in US
• SYSTRAN quite successful in E-R (Air Force at Wright-Patterson etc.)
• Partial success E-S, E-F, E-G (SYSTRAN, Logos, Weidner)
• SYSTRAN → use in Europe (later by EC)
• Knowledge-Based MT (KBMT) concept advanced (Carbonell, Nirenburg, …)

MT: 1967-1985 (II)
ALPAC Myth Fades Away in US
• “Underground MT” in US Universities dares to seek funding again
• Machine-aided Translation (MAT) concept advanced (Kay, …)
• Very-narrow-domain MT demonstrated (Kittredge et al., METEO)

MT: 1975-1985
Golden Age of MT in Japan: 1980s
• Nagao proposes Example-Based MT (not taken seriously then)
• Nagao proposes Transfer-Based MT for E-J (Mu project)
• Mu’s success triggers MT-mania in giant Japanese companies, e.g., ATLAS in Fujitsu, PIVOT in NEC, HICATS in Hitachi, …
• Japanese MT Research budgets soar, US and Europe take note
• JEIDA Report paints upbeat future for MT

Types of Machine Translation
[Figure: diagram with labels Interlingua; Syntactic Parsing; Semantic Analysis; Sentence Planning; Text Generation; Source (e.g., Arabic); Target (e.g., English); Transfer Rules; Direct: SMT, EBMT]

MT: 1975-1985
MT in Europe, not as Rosy
• “Interlingua” approach tried (ROSETTA, DLT)
• First language-neutral Interlingua (Yale-MT, Carbonell & Cullingford 1979, 1981)
• Eurotra proposed and started to build ultimate collaborative MT system, but later tanks due to incompatible transfer paradigms
• …but SYSTRAN adopted by EC for volume internal translations

MT Matures 1985-1995: MT Spring in US
• Center for Machine Translation at CMU opens in 1986
• Interlingual KBMT success at CMU for domain-oriented MT (KANT) with controlled-language input, but did not generalize to open-ended and uncontrolled domains (PANGLOSS)
• Resurgence of statistical corpus MT at IBM (Brown et al.), which also succeeds for E-F but needs huge training

