UT CS 388 - Natural Language Processing Introduction


Slide titles:
CS 388: Natural Language Processing Introduction
Natural Language Processing
Related Areas
Communication
Communication (cont)
Slide 6
Syntax, Semantics, Pragmatics
Modular Comprehension
Ambiguity
Ambiguity is Ubiquitous
Ambiguity is Explosive
Humor and Ambiguity
Why is Language Ambiguous?
Natural Languages vs. Computer Languages
Natural Language Tasks
Syntactic Tasks
Word Segmentation
Morphological Analysis
Part Of Speech (POS) Tagging
Phrase Chunking
Syntactic Parsing
Semantic Tasks
Word Sense Disambiguation (WSD)
Semantic Role Labeling (SRL)
Semantic Parsing
Textual Entailment
Textual Entailment Problems from PASCAL Challenge
Pragmatics/Discourse Tasks
Anaphora Resolution/Co-Reference
Ellipsis Resolution
Other Tasks
Information Extraction (IE)
Question Answering
Text Summarization
Machine Translation (MT)
Ambiguity Resolution is Required for Translation
Resolving Ambiguity
Manual Knowledge Acquisition
Automatic Learning Approach
Learning Approach
Advantages of the Learning Approach
The Importance of Probability
Human Language Acquisition
Pipelining Problem
Pipelining Problem (cont.)
Increasing Module Bandwidth
Global Integration/Joint Inference
Early History: 1950’s
History: 1960’s
History: 1970’s
History: 1980’s
History: 1990’s
History: 2000’s
Relevant Scientific Conferences

CS 388: Natural Language Processing
Introduction
Raymond J. Mooney
University of Texas at Austin

Natural Language Processing
• NLP is the branch of computer science focused on developing systems that allow computers to communicate with people using everyday language.
• Also called Computational Linguistics
  – Also concerns how computational methods can aid the understanding of human language

Related Areas
• Artificial Intelligence
• Formal Language (Automata) Theory
• Machine Learning
• Linguistics
• Psycholinguistics
• Cognitive Science
• Philosophy of Language

Communication
• The goal in the production and comprehension of natural language is communication.
• Communication for the speaker:
  – Intention: Decide when and what information should be transmitted (a.k.a. strategic generation). May require planning and reasoning about agents’ goals and beliefs.
  – Generation: Translate the information to be communicated (in an internal logical representation or “language of thought”) into a string of words in the desired natural language (a.k.a. tactical generation).
  – Synthesis: Output the string in the desired modality, text or speech.

Communication (cont)
• Communication for the hearer:
  – Perception: Map the input modality to a string of words, e.g. optical character recognition (OCR) or speech recognition.
  – Analysis: Determine the information content of the string.
    • Syntactic Interpretation (parsing): Find the correct parse tree showing the phrase structure of the string.
    • Semantic Interpretation: Extract the (literal) meaning of the string (logical form).
    • Pragmatic Interpretation: Consider the effect of the overall context on altering the literal meaning of a sentence.
  – Incorporation: Decide whether or not to believe the content of the string and add it to the KB.

Communication (cont)
[Figure not recoverable from the extracted text]

Syntax, Semantics, Pragmatics
• Syntax concerns the proper ordering of words and its effect on meaning.
  – The dog bit the boy.
  – The boy bit the dog.
  – * Bit boy dog the the.
  – Colorless green ideas sleep furiously.
• Semantics concerns the (literal) meaning of words, phrases, and sentences.
  – “plant” as a photosynthetic organism
  – “plant” as a manufacturing facility
  – “plant” as the act of sowing
• Pragmatics concerns the overall communicative and social context and its effect on interpretation.
  – The ham sandwich wants another beer. (co-reference, anaphora)
  – John thinks vanilla. (ellipsis)

Modular Comprehension
sound waves → Acoustic/Phonetic → words → Syntax → parse trees → Semantics → literal meaning → Pragmatics → meaning (contextualized)

Ambiguity
• Natural language is highly ambiguous and must be disambiguated.
  – I saw the man on the hill with a telescope.
  – I saw the Grand Canyon flying to LA.
  – Time flies like an arrow.
  – Horse flies like a sugar cube.
  – Time runners like a coach.
  – Time cars like a Porsche.

Ambiguity is Ubiquitous
• Speech Recognition
  – “recognize speech” vs. “wreck a nice beach”
  – “youth in Asia” vs. “euthanasia”
• Syntactic Analysis
  – “I ate spaghetti with chopsticks” vs. “I ate spaghetti with meatballs.”
• Semantic Analysis
  – “The dog is in the pen.” vs. “The ink is in the pen.”
  – “I put the plant in the window” vs. “Ford put the plant in Mexico”
• Pragmatic Analysis
  – From “The Pink Panther Strikes Again”:
    Clouseau: Does your dog bite?
    Hotel Clerk: No.
    Clouseau: [bowing down to pet the dog] Nice doggie. [Dog barks and bites Clouseau in the hand]
    Clouseau: I thought you said your dog did not bite!
    Hotel Clerk: That is not my dog.

Ambiguity is Explosive
• Ambiguities compound to generate enormous numbers of possible interpretations.
• In English, a sentence ending in n prepositional phrases has over 2^n syntactic interpretations (cf. Catalan numbers).
  – “I saw the man with the telescope”: 2 parses
  – “I saw the man on the hill with the telescope”: 5 parses
  – “I saw the man on the hill in Texas with the telescope”: 14 parses
  – “I saw the man on the hill in Texas with the telescope at noon”: 42 parses
  – “I saw the man on the hill in Texas with the telescope at noon on Monday”: 132 parses

Humor and Ambiguity
• Many jokes rely on the ambiguity of language:
  – Groucho Marx: One morning I shot an elephant in my pajamas. How he got into my pajamas, I’ll never know.
  – She criticized my apartment, so I knocked her flat.
  – Noah took all of the animals on the ark in pairs. Except the worms, they came in apples.
  – Policeman to little boy: “We are looking for a thief with a bicycle.” Little boy: “Wouldn’t you be better using your eyes?”
  – Why is the teacher wearing sunglasses? Because the class is so bright.

Why is Language Ambiguous?
• Having a unique linguistic expression for every possible conceptualization that could be conveyed would make language overly complex and linguistic expressions unnecessarily long.
• Allowing resolvable ambiguity permits shorter linguistic expressions, i.e. data compression.
• Language relies on people’s ability to use their knowledge and inference abilities to properly resolve ambiguities.
• Infrequently, disambiguation fails, i.e. the compression is lossy.

Natural Languages vs. Computer Languages
• Ambiguity is the primary difference between natural and
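
The parse counts listed under "Ambiguity is Explosive" (2, 5, 14, 42, 132) can be checked with a short Python sketch. It assumes that the count for a sentence ending in n prepositional phrases is the Catalan number C(n+1); that offset is inferred from the listed numbers, not stated on the slide. The function names `catalan` and `pp_parse_count` are hypothetical and chosen only for illustration.

```python
from math import comb


def catalan(k: int) -> int:
    """Return the k-th Catalan number, C(k) = (2k choose k) / (k + 1)."""
    return comb(2 * k, k) // (k + 1)


def pp_parse_count(n: int) -> int:
    """Parse count for a sentence ending in n prepositional phrases.

    Assumption: the count equals C(n + 1), which reproduces the
    figures quoted on the slide for n = 1..5.
    """
    return catalan(n + 1)


if __name__ == "__main__":
    # Compare the Catalan-based count with the 2^n bound from the slide.
    for n in range(1, 6):
        print(f"n={n}: parses={pp_parse_count(n)}, 2^n={2 ** n}")
```

Under this assumption the count equals 2^n at n = 1 and exceeds it for every larger n, so the gap between the two widens quickly as prepositional phrases are added.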

