C SC 620
Advanced Topics in Natural Language Processing
Lecture 18
3/30

Reading List
• Readings in Machine Translation, Eds. Nirenburg, S. et al. MIT Press 2003.
• Reading list:
  – 12. Correlational Analysis and Mechanical Translation. Ceccato, S.
  – 13. Automatic Translation: Some Theoretical Aspects and the Design of a Translation System. Kulagina, O. and I. Mel’cuk
  – 16. Automatic Translation and the Concept of Sublanguage. Lehrberger, J.
  – 17. The Proper Place of Men and Machines in Language Translation. Kay, M.

Paper 16. Automatic Translation and the Concept of Sublanguage. Lehrberger, J.
• Concept of sublanguage
  – language of X is a sublanguage of English
    • where X = (physics, aeronautics, electronics, etc.)
  – It is within the domain of sublanguages that automatic translation appears to be practical
• Example:
  – Taum-meteo: English -> French for weather reports
  – aviation maintenance manuals

• 2 Description of a Particular Sublanguage
• 2.1 The Corpus
  – TAUM = Traduction Automatique Université de Montréal
  – Instructions for aircraft maintenance
    • 70,000 words in English
    • 3,548 different words
    • nouns 1714, verbs 667, adjectives 664, adverbs 168
    • prepositions 134, numerals 63, quantifiers 46, pronouns 35
    • 571 idioms, 443 of which are technical

• 2.2 Restrictions
• 2.2.1 Lexical Restrictions
  – 4,876 different lexical items in 70,000 words
  – Estimate for full set of texts: 40,000 lexical items
  – Compare to Webster’s 3rd: 450,000
  – Vocabulary of this sublanguage is highly restricted
    • contains: aileron, motor, compressor, jack, filter, check, axial, quick-disconnect
    • not present: parsley, meson, seduce, endocrine, hope, think, believe, personal pronouns (I, me, we, us, he, she)
    • categories noun, verb, adjective and adverb are the most limited
    • all articles and coordinate conjunctions present
    • 80% of one-word prepositions present, but not apropos, notwithstanding

• 2.2.2 Syntactic Restrictions
  – Direct questions do not occur at all
    • Do you have your tool kit?
    • Is the motor turned off?
  – Tag questions inappropriate
    • Check the batteries, won’t you?
    • The switch should not be on, should it?
  – No simple past tense
    • The engine stopped
    • High temperatures caused buckling
  – No exclamatory sentences
    • How powerful the engine is!
    • What a complex hydraulic system this plane has!

  – Full range of constructions present:
    • passive, restrictive and non-restrictive relative clauses, extraposition, nominalization
  – Long and complicated sentences common:
    • “This unit contains the fuel metering section, shutoff valve, and a mechanical governor that functions as either an over speed governor for the high pressure rotor or provides manual control when the electronic computer section of the fuel control system is deactivated.”
    • “… as lightweight, two-spool geared transonic-stage, front-fan, jet propulsion engine.”

  – Difficult problems
    • conjunction scope
      – “Disconnect pressure and return lines from pump”
    • compound bracketing
      – “The stability augmentor pitch axis actuator housing support”

• 2.2.3 Semantic Restrictions
• 2.2.3.1 Categorization and Subcategorization
  – Reduction in polysemy: word classes
    • case (N) *case the joint
    • lug (N) *they lugged the equipment from the plane
    • cake (V) *the pilot likes banana cake
    • jerky (A) *carry a pound of jerky on long flights
    • fine (A) *fine them for smoking *a fine for smoking
    • cable (N) *cable the forward compartment

  – Reduction in polysemy: senses within classes
    • eccentric (A) [-animate] *eccentric pilot
    • ball (N) *the annual ball
    • check (N) [+abstract] *cash this check
    • bore (V) [-animate object] *inaction may bore the crew
    • bore (N) *the pilot is a bore
      – cylindrical hole, inside diameter of a cylinder
  – Categorial ambiguity
    • check pump case drain fitting
    • N N N N N
    • V V V V V
    • 2^5 = 32, but case is N only in corpus => 16

    • Case ejection door locks immediately
    • N V N N Adv
    • V V
    • Case N only =>
    • subject: case ejection door =>
    • locks only candidate for a verb
  – Semantic range reduction
    • A small heat exchanger uses engine fuel for cooling purposes
      – cooling modifies purposes
      – cooling takes purposes as object
      – only concrete objects are cooled in corpus (not tempers etc.)
      – cool (V): direct object [+concrete]
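
The count in the categorial ambiguity example above (32 readings in general, 16 once "case" is restricted to N) can be reproduced with a short enumeration. This is only a sketch under stated assumptions: the two lexicons, the names GENERAL, SUBLANGUAGE and tag_sequences, and the simplification that each word is either N or V are illustrative and not taken from Lehrberger's paper or from the TAUM system.

```python
from itertools import product

# Assumption for illustration: outside the sublanguage, each of the five
# words can be read as a noun (N) or a verb (V).
GENERAL = {w: {"N", "V"} for w in ["check", "pump", "case", "drain", "fitting"]}

# In the corpus "case" occurs only as a noun, so its entry shrinks to {N}.
SUBLANGUAGE = dict(GENERAL, case={"N"})

def tag_sequences(words, lexicon):
    """Return every category sequence the lexicon allows for the words."""
    return list(product(*(sorted(lexicon[w]) for w in words)))

phrase = "check pump case drain fitting".split()
general = tag_sequences(phrase, GENERAL)
restricted = tag_sequences(phrase, SUBLANGUAGE)

print(len(general), len(restricted))  # 32 16
```

Each word restricted to a single category halves the number of candidate sequences a parser has to consider, which is the practical benefit of the lexical restriction.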

• 2.2.3.2 Specificity
  – the + N specific only
    • the oil tank is not a component of the engine
    • the computer provides increased fuel scheduling
  – no generic reference as in:
    • the dolphin is a mammal
    • the invention of the wheel was a crucial step
  – differs from a textbook
    • the motor is a machine that converts electrical into mechanical energy vs.
    • the motor is a constant-displacement piston type
  – Omission of articles
    • clean (the) reservoir system
    • French translation requires a definite article

• 2.2.3.3 Semantic Features
  – [+concrete] use only in this sublanguage
    • air, battery, dirt, machine, flap, flash, post, rod, solution, speed, spring, tool, net, web, race
  – [-human] use only
    • agent, body, boss, buffer, crank, elbow, governor, joint, nut, page, selector, starter
  – Subject/object restrictions
    • charge object [+concrete]
    • circulate subject [+fluid] (intransitive)
    • divert object [+fluid]
    • function subject [+part] (part of aircraft or related equip.)
    • top object
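
One way to picture how the subject/object restrictions above could be used is a table lookup that accepts or rejects a noun for a given verb role. The sketch below is hypothetical throughout: the RESTRICTIONS table copies only the four complete entries listed above, while the FEATURES table and the function name satisfies are invented for illustration. Of the feature assignments, only "battery" as [+concrete] comes from the slide; "fuel" and "pump" are assumptions.

```python
# Verb -> (restricted role, required feature), from the four complete entries above.
RESTRICTIONS = {
    "charge": ("object", "+concrete"),
    "circulate": ("subject", "+fluid"),
    "divert": ("object", "+fluid"),
    "function": ("subject", "+part"),
}

# Hypothetical noun features; only "battery" [+concrete] is given in the slide.
FEATURES = {
    "battery": {"+concrete"},
    "fuel": {"+concrete", "+fluid"},
    "pump": {"+concrete", "+part"},
}

def satisfies(verb, role, noun):
    """Return True if the noun may fill the given role of the verb."""
    restriction = RESTRICTIONS.get(verb)
    if restriction is None or restriction[0] != role:
        return True  # no restriction recorded for this verb/role pair
    return restriction[1] in FEATURES.get(noun, set())

print(satisfies("charge", "object", "battery"))   # True
print(satisfies("divert", "object", "battery"))   # False
print(satisfies("circulate", "subject", "fuel"))  # True
```

A check of this kind would let an analyzer discard parses in which, for example, a non-fluid noun is taken as the object of "divert".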


UA CSC 620 - CSC 620 Lecture Notes
