CMSC 723 / LING 645: Intro to Computational Linguistics
September 1, 2004 (Dorr)

Overview, History, Goals, Problems, Techniques; Intro to MT (J&M 1, 21)

Prof. Bonnie J. Dorr
Dr. Christof Monz
TA: Adam Lee

Administrivia

http www umiacs umd edu christof courses cmsc723 fall04

IMPORTANT:
- For Today: Chapters 1 and 21
- For Next Time: Chapter 2

Other Important Stuff

- This course is interdisciplinary: it cuts across different areas of expertise. Expect that a subset of the class will be learning new material at any time, while others will have to be patient. (The subsets will swap frequently.)
- Project 1 and Project 2 are designed differently. Be prepared for this distinction.
  - P1 will focus on the fundamentals, getting your feet wet with software. By the end, you should feel comfortable using/testing certain types of NLP software.
  - P2 will require a significantly deeper level of understanding, critique, analysis. You'll be expected to think deeply and write a lot in the second project. What you write will be a major portion of the grade.
- No solutions will be handed out. Written comments will be sent to you by the TA.
- All email correspondence MUST HAVE "CMSC 723" in the Subject line.
- Submission format for assignments/projects: plain ascii, pdf
- Assignment 1 will be posted next week.

CL vs NLP

Why "Computational Linguistics" (CL) rather than "Natural Language Processing" (NLP)?
- Computational Linguistics
  - Computers dealing with language
  - Modeling what people do
- Natural Language Processing
  - Applications on the computer side

Relation of CL to Other Disciplines

- Artificial Intelligence (AI) (notions of rep, search, etc.)
- Machine Learning (particularly probabilistic or statistical ML techniques)
- Human Computer Interaction (HCI)
- Electrical Engineering (EE) (Optical Character Recognition)
- Linguistics (Syntax, Semantics, etc.)
- Psychology
- Philosophy of Language, Formal Logic
- Theory of Computation
- Information Retrieval

A Sampling of "Other Disciplines"

- Linguistics: formal grammars, abstract characterization of what is to be learned.
- Computer Science: algorithms for efficient learning or online deployment of these systems in automata.
- Engineering: stochastic techniques for characterizing regular patterns for learning and ambiguity resolution.
- Psychology: insights into what linguistic constructions are easy or difficult for people to learn or to use.

History

1940-1950's
- Development of formal language theory (Chomsky, Kleene, Backus)
  - Formal characterization of classes of grammar (context-free, regular)
  - Association with relevant automata
- Probability theory: language understanding as decoding through noisy channel (Shannon)
  - Use of information theoretic concepts like entropy to measure success of language models

1957-1983: Symbolic vs. Stochastic
- Symbolic
  - Use of formal grammars as basis for natural language processing and learning systems (Chomsky, Harris)
  - Use of logic and logic based programming for characterizing syntactic or semantic inference (Kaplan, Kay, Pereira)
  - First toy natural language understanding and generation systems (Woods, Minsky, Schank, Winograd, Colmerauer)
  - Discourse Processing: Role of Intention, Focus (Grosz, Sidner, Hobbs)
- Stochastic Modeling
  - Probabilistic methods for early speech recognition, OCR (Bledsoe and Browning, Jelinek, Black, Mercer)

1983-1993: Return of Empiricism
- Use of stochastic techniques for part of speech tagging, parsing, word sense disambiguation, etc.
- Comparison of stochastic, symbolic, more or less powerful models for language understanding and learning tasks

1993-Present
- Advances in software and hardware create NLP needs for information retrieval (web), machine translation, spelling and grammar checking, speech recognition and synthesis
- Stochastic and symbolic methods combine for real world applications

Language and Intelligence: Turing Test

- Turing test: machine, human, and human judge
- Judge asks questions of computer and human.
  - Machine's job is to act like a human; human's job is to convince judge that he's not the machine.
  - Machine judged "intelligent" if it can fool judge.
- Judgement of "intelligence" linked to appropriate answers to questions from the system.

ELIZA

- Remarkably simple "Rogerian Psychologist"
- Uses Pattern Matching to carry on limited form of conversation
- Seems to "Pass the Turing Test" (McCorduck, 1979, pp. 225-226)
- Eliza Demo: http www lpa co uk pws dem4 htm

What's involved in an "intelligent" Answer?

Analysis: Decomposition of the signal (spoken or written) eventually into meaningful units. This involves:

Speech/Character Recognition
- Decomposition into words, segmentation of words into appropriate phones or letters
- Requires knowledge of phonological patterns:
  - I'm enormously proud.
  - I mean to make you proud.

Morphological Analysis
- Inflectional
  - duck + s: N duck + plural s
  - duck + s: V duck + 3rd person s
- Derivational
  - kind, kindness
- Spelling changes
  - drop, dropping
  - hide, hiding

Syntactic Analysis
- Associate constituent structure with string
- Prepare for semantic interpretation
- [Figure: two alternative analyses of "I watched the terrapin". Constituent tree: S with NP (I) and VP (V watched, NP with det the and N terrapin). OR dependency structure: watch with Subject I and Object terrapin, and Det the on terrapin.]

Semantics
- A way of representing meaning
- Abstracts away from syntactic structure
- Example: First-Order Logic: watch(I, terrapin)
- Can be: "I watched the terrapin" or "The terrapin was watched by me"
- Real language is complex: Who did I watch?

Lexical Semantics
- The Terrapin is who I watched.
- Watch the Terrapin is what I do best.
- Terrapin is what I watched the
- I: experiencer; Watch the Terrapin: predicate; The Terrapin: patient

Compositional Semantics
- Association of parts of a proposition with semantic roles
- Scoping
- [Figure: Proposition with Experiencer: I (1st pers, sg); Predicate: Be (perc); pred: saw; patient: the Terrapin]

Word-Governed Semantics
- Any verb can add "able" to form an adjective.
  - I taught the class. The class is teachable.
  - I rejected the idea. The idea is rejectable.
- Association of particular words with specific semantic forms.
  - John: (masculine)
  - The boys: (masculine, plural, human)

Pragmatics
- Real world knowledge, speaker intention, goal of utterance.
- Related to sociology.
- Example 1:
  - Could you turn in your assignments now? (command)
  - Could you finish the homework? (question, command)
- Example 2:
  - I couldn't decide how to catch the crook. Then I decided to spy on the crook with binoculars.
  - To my surprise, I found out he had them too. Then I knew to just follow the crook with binoculars.
  - [the crook [with binoculars]]
  - [the crook] [with binoculars]

Discourse Analysis
- Discourse: How propositions fit together in a conversation; multi-sentence processing.
- Pronoun reference: The professor told the student to finish the assignment. He was pretty aggravated at how long it was taking to pass it in.
- Multiple reference to same entity: George W. Bush, president of the U.S.
- Relation
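
The ELIZA slide above says the program "uses pattern matching to carry on a limited form of conversation." As a rough illustration of that idea only, here is a minimal Python sketch. The rules, the reflection table, and the names RULES, REFLECTIONS, reflect, and respond are all hypothetical and invented for this example; they are not taken from the original ELIZA script or from the course materials.

```python
import re

# Hypothetical (pattern, response template) rules, invented for illustration.
# The first rule whose pattern matches the input is used.
RULES = [
    (re.compile(r"\bI need (.+)", re.IGNORECASE), "Why do you need {0}?"),
    (re.compile(r"\bI am (.+)", re.IGNORECASE), "How long have you been {0}?"),
    (re.compile(r"\bmy (mother|father)\b", re.IGNORECASE), "Tell me more about your {0}."),
]

# Swap first- and second-person words so an echoed fragment reads naturally.
REFLECTIONS = {
    "i": "you", "me": "you", "my": "your", "am": "are",
    "you": "I", "your": "my",
}

# Fallback used when no rule matches.
DEFAULT_REPLY = "Please go on."


def reflect(fragment):
    """Replace each word that has a reflection; leave other words unchanged."""
    return " ".join(REFLECTIONS.get(word.lower(), word) for word in fragment.split())


def respond(utterance):
    """Return a canned reply built from the first matching rule."""
    text = utterance.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.search(text)
        if match:
            return template.format(reflect(match.group(1)))
    return DEFAULT_REPLY


if __name__ == "__main__":
    for line in ["I need my coffee.", "I am sad", "My mother ignores me", "Hello"]:
        print(line, "->", respond(line))
```

Nothing in this sketch analyzes syntax, semantics, or discourse: it only matches surface strings and echoes part of the input back, which is why the conversation it supports is so limited.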


UMD CMSC 723 - Intro to Computational Linguistics
