• CMSC 723 / LING 645: Intro to Computational Linguistics
• Administrivia
• Other Important Stuff
• CL vs NLP
• Relation of CL to Other Disciplines
• A Sampling of “Other Disciplines”
• History: 1940-1950’s
• 1957-1983 Symbolic vs. Stochastic
• 1983-1993: Return of Empiricism
• 1993-Present
• Language and Intelligence: Turing Test
• ELIZA
• What’s involved in an “intelligent” Answer?
• Speech/Character Recognition
• Morphological Analysis
• Syntactic Analysis
• Semantics
• Lexical Semantics
• Compositional Semantics
• Word-Governed Semantics
• Pragmatics
• Discourse Analysis
• NLP Pipeline
• Relation to Machine Translation
• Ambiguity
• Syntactic Disambiguation
• Part of Speech Tagging and Word Sense Disambiguation
• Resources for NLP Systems
• Some NLP Applications
• What is MT?
• An Old Example
• Machine Translation History
• What happened between ALPAC and Now?
• Three MT Approaches: Direct, Transfer, Interlingual
• Examples of Three Approaches
• MT Systems: 1964-1990
• Statistical MT and Hybrid Symbolic/Stats MT: 1990-Present
• Direct MT: Pros and Cons
• Transfer MT: Pros and Cons
• Interlingual MT: Pros and Cons
• Approximate IL Approach
• Approximating IL: Handling Divergences
• Interlingual vs. Approximate IL
• Mapping from Input Dependency to English Dependency Tree
• Statistical Extraction
• Benefits of Approximate IL Approach
• What Resources are Required?

CMSC 723 / LING 645: Intro to Computational Linguistics
September 1, 2004: Dorr
Overview, History, Goals, Problems, Techniques; Intro to MT (J&M 1, 21)
Prof. Bonnie J. Dorr
Dr. Christof Monz
TA: Adam Lee

Administrivia
http://www.umiacs.umd.edu/~christof/courses/cmsc723-fall04/
IMPORTANT:
• For Today: Chapters 1 and 21
• For Next Time: Chapter 2

Other Important Stuff
• This course is interdisciplinary: it cuts across different areas of expertise. Expect that a subset of the class will be learning new material at any time, while others will have to be patient! (The subsets will swap frequently!)
• Project 1 and Project 2 are designed differently. Be prepared for this distinction!
  – P1 will focus on the fundamentals, getting your feet wet with software.
    By the end, you should feel comfortable using/testing certain types of NLP software.
  – P2 will require a significantly deeper level of understanding, critique, analysis. You’ll be expected to think deeply and write a lot in the second project. What you write will be a major portion of the grade!
• No solutions will be handed out. Written comments will be sent to you by the TA.
• All email correspondence MUST HAVE “CMSC 723” in the Subject line!!!
• Submission format for assignments, projects: plain ascii, pdf
• Assignment 1 will be posted next week.

CL vs NLP
Why “Computational Linguistics (CL)” rather than “Natural Language Processing” (NLP)?
• Computational Linguistics
  – Computers dealing with language
  – Modeling what people do
• Natural Language Processing
  – Applications on the computer side

Relation of CL to Other Disciplines
CL is related to:
• Artificial Intelligence (AI) (notions of representation, search, etc.)
• Machine Learning (particularly, probabilistic or statistical ML techniques)
• Linguistics (Syntax, Semantics, etc.)
• Psychology
• Electrical Engineering (EE) (Optical Character Recognition)
• Philosophy of Language, Formal Logic
• Information Retrieval
• Theory of Computation
• Human Computer Interaction (HCI)

A Sampling of “Other Disciplines”
• Linguistics: formal grammars, abstract characterization of what is to be learned.
• Computer Science: algorithms for efficient learning or online deployment of these systems in automata.
• Engineering: stochastic techniques for characterizing regular patterns for learning and ambiguity resolution.
• Psychology: insights into what linguistic constructions are easy or difficult for people to learn or to use.

History: 1940-1950’s
• Development of formal language theory (Chomsky, Kleene, Backus).
  – Formal characterization of classes of grammar (context-free, regular)
  – Association with relevant automata
• Probability theory: language understanding as decoding through noisy channel (Shannon)
  – Use of information theoretic concepts like entropy to measure success of language models.

1957-1983 Symbolic vs.
Stochastic
• Symbolic
  – Use of formal grammars as basis for natural language processing and learning systems. (Chomsky, Harris)
  – Use of logic and logic based programming for characterizing syntactic or semantic inference (Kaplan, Kay, Pereira)
  – First toy natural language understanding and generation systems (Woods, Minsky, Schank, Winograd, Colmerauer)
  – Discourse Processing: Role of Intention, Focus (Grosz, Sidner, Hobbs)
• Stochastic Modeling
  – Probabilistic methods for early speech recognition, OCR (Bledsoe and Browning, Jelinek, Black, Mercer)

1983-1993: Return of Empiricism
• Use of stochastic techniques for part of speech tagging, parsing, word sense disambiguation, etc.
• Comparison of stochastic, symbolic, more or less powerful models for language understanding and learning tasks.

1993-Present
• Advances in software and hardware create NLP needs for information retrieval (web), machine translation, spelling and grammar checking, speech recognition and synthesis.
• Stochastic and symbolic methods combine for real world applications.

Language and Intelligence: Turing Test
• Turing test:
  – machine, human, and human judge
• Judge asks questions of computer and human.
  – Machine’s job is to act like a human, human’s job is to convince judge that he’s not the machine.
  – Machine judged “intelligent” if it can fool judge.
• Judgement of “intelligence” linked to appropriate answers to questions from the system.

ELIZA
• Remarkably simple “Rogerian Psychologist”
• Uses Pattern Matching to carry on limited form of conversation.
• Seems to “Pass the Turing Test!” (McCorduck, 1979, pp. 225-226)
• Eliza Demo: http://www.lpa.co.uk/pws_dem4.htm

What’s involved in an “intelligent” Answer?
Analysis: Decomposition of the signal (spoken or written) eventually into meaningful units.
This involves …

Speech/Character Recognition
• Decomposition into words, segmentation of words into appropriate phones or letters
• Requires knowledge of phonological patterns:
  – I’m enormously proud.
  – I mean to make you proud.

Morphological Analysis
• Inflectional
  – duck + s = [N duck] + [plural s]
  – duck + s = [V duck] + [3rd person s]
• Derivational
  – kind, kindness
• Spelling changes
  – drop, dropping
  – hide, hiding

Syntactic Analysis
• Associate constituent structure with string
• Prepare for semantic interpretation
[Parse tree, truncated in source: S; NP VP; I; V NP; watched; det N]
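The examples on the Morphological Analysis slide above (the ambiguous "duck + s", derivational "kindness", and the spelling changes in "dropping" and "hiding") could be handled roughly as in the following sketch. This is only an illustration under stated assumptions: the toy lexicon, the function names `candidate_stems` and `analyze`, and the bracket notation for the "ing" and "ness" suffixes are hypothetical and not part of the course material.

```python
# Toy morphological analyzer: a sketch, not the course's method.
# LEXICON is a hypothetical stem -> parts-of-speech table (N noun, V verb, A adjective).
LEXICON = {
    "duck": {"N", "V"},
    "drop": {"V"},
    "hide": {"V"},
    "kind": {"A"},
}


def candidate_stems(base):
    """Return possible stems for the material left after stripping 'ing'."""
    stems = [base]  # no spelling change
    if len(base) >= 2 and base[-1] == base[-2]:
        stems.append(base[:-1])  # undo consonant doubling: dropp -> drop
    stems.append(base + "e")  # undo e-deletion: hid -> hide
    return stems


def analyze(word):
    """Return every analysis of `word` licensed by the toy lexicon."""
    analyses = []
    # The bare word may itself be a stem.
    for pos in sorted(LEXICON.get(word, ())):
        analyses.append(f"[{pos} {word}]")
    # Inflectional -s is ambiguous between noun plural and verb 3rd person.
    if word.endswith("s"):
        stem = word[:-1]
        tags = LEXICON.get(stem, set())
        if "N" in tags:
            analyses.append(f"[N {stem}] + [plural s]")
        if "V" in tags:
            analyses.append(f"[V {stem}] + [3rd person s]")
    # Inflectional -ing, with spelling changes undone.
    if word.endswith("ing"):
        for stem in candidate_stems(word[:-3]):
            if "V" in LEXICON.get(stem, set()):
                analyses.append(f"[V {stem}] + [ing]")
    # Derivational -ness attaches to adjectives.
    if word.endswith("ness"):
        stem = word[:-4]
        if "A" in LEXICON.get(stem, set()):
            analyses.append(f"[A {stem}] + [ness]")
    return analyses


if __name__ == "__main__":
    for w in ["ducks", "dropping", "hiding", "kindness"]:
        print(w, analyze(w))
```

Returning a list rather than a single answer keeps the ambiguity of "ducks" visible, leaving the choice between the noun and verb readings to later stages such as syntactic analysis.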