CS 188: Artificial Intelligence
Spring 2007
Lecture 27: NLP: Language Understanding
4/26/2007
Srini Narayanan – ICSI and UC Berkeley

Announcements
• Othello tournament: please submit by Friday 4/27
• 1% extra credit for winner and runner-up
• (Optional) free lunch with CS 188 staff for winner
• In-class face-off on 5/3 between top (2?)
• Participation points for those who submit
• Extra office hours: Tuesday 11-1 (will post on the announcements list) and the following week; please consider coming if your midterm grade was < 55%
• Reinforcement Learning tutorial: Monday 5-7, place TBA
• Upside-down helicopter control talk next week
• Final exam review page (including previous and current midterm solutions) up by the weekend

What is NLP?
• Fundamental goal: deep understanding of broad language, not just string processing or keyword matching!
• End systems that we want to build:
• Ambitious: speech recognition, machine translation, information extraction, dialog interfaces, question answering, ...
• Modest: spelling correction, text categorization, ...

Why is Language Hard?
Ambiguity:
• EYE DROPS OFF SHELF
• MINERS REFUSE TO WORK AFTER DEATH
• KILLER SENTENCED TO DIE FOR SECOND TIME IN 10 YEARS
• LACK OF BRAINS HINDERS RESEARCH

Syntactic Ambiguities I
• Prepositional phrases: They cooked the beans in the pot on the stove with handles.
• Particle vs. preposition: A good pharmacist dispenses with accuracy.
• Complement structures: The tourists objected to the guide that they couldn't hear. / She knows you like the back of her hand.
• Gerund vs. participial adjective: Visiting relatives can be boring. / Changing schedules frequently confused passengers.

Suggestive facts about language comprehension
Language is noisy, ambiguous, and unsegmented. How might humans interpret noisy input?
• Human visual processing: probabilistic models (Rao et al. 2001; Weiss and Fleet 2001)
• Categorization: probabilistic models (Tenenbaum 2000; Tenenbaum and Griffiths 2001a, 2001b; Griffiths 2004)
• Human understanding of causation: probabilistic models (Rehder 1999; Glymour and Cheng 1998; Gopnik et al. 2004)

Why probabilistic models of language comprehension?
• Principled methodology for weighing and combining evidence to choose between competing hypotheses/interpretations
• Coherent semantics
• Learnable from interaction with the world
• Bounded optimality

When do we choose between multiple things?
• Comprehension: segmenting speech input; lexical ambiguity; syntactic ambiguity; semantic ambiguity; pragmatic ambiguity
• Production: choice of words (or syntactic structure, or phonological form, etc.)
• Learning: choosing between different grammars, and between possible lexical entries for new words

Studying Ambiguities
Linguists study the role of various factors in processing language by constructing garden path sentences:
• Carefully construct sentences which are not ambiguous but have ambiguous regions where more than one interpretation is possible.
• Often the ambiguous region has a preferred interpretation which becomes dispreferred at the end of the input.
• Using eye-tracking, behavioral, and imaging studies, behavior at various regions of the input
is recorded.

Studying sentence comprehension: garden path sentences
Main Verb (MV) versus Reduced Relative (RR) ambiguity:
• The horse raced past the barn fell. (Bever 1970)
• The horse raced past the barn stumbled.
• The horse ridden past the barn stumbled.
• The crook arrested by the police confessed.
• The cop arrested by the police confessed.
• The complex houses married and single students.
• The warehouse fires many employees in the spring.

Probabilistic Factors: Summary of evidence in comprehension
Word level:
• Lexeme frequencies (Tyler 1984; Salasoo and Pisoni 1985; inter alia)
• Lemma frequencies (Hogaboam and Perfetti 1975; Ahrens 1998)
• Phonological probabilities (Pierrehumbert 1994; Hay et al., in press; Pitt et al. 1998)
Word relations:
• Dependency (word-to-word) probabilities (MacDonald 1993, 2001; Bod 2001)
• Lexical category frequencies (Burgess; MacDonald 1993; Trueswell et al. 1996; Jurafsky 1996)
Grammatical/semantic:
• Grammatical probabilities (Mitchell et al. 1995; Croft 1995; Jurafsky 1996; Corley and Crocker 1996, 2000; Narayanan and Jurafsky 1998, 2001; Hale 2001)
• Subcategorization probabilities (Ford, Bresnan, and Kaplan 1982; Clifton, Frazier, and Connine 1984; Trueswell et al. 1993)
• Idiom frequencies (d'Arcais 1993)
• Thematic role probabilities (Trueswell et al. 1994; Garnsey et al. 1997; McRae et al. 1998; McRae, Hare, and Elman 2004)

Summary: Probabilistic factors and sentence comprehension
What we know:
• Lots of kinds of knowledge interact probabilistically to build interpretations.
What we don't know:
• How are probabilistic aspects of linguistic knowledge represented?
• How are these probabilities combined?
• How are interpretations selected?
• What is the relationship between probability and behavioral information like reading time?

A Bayesian model of sentence comprehension (Narayanan and Jurafsky 2002, 2007, in press)
How do we do linguistic decision-making under uncertainty?
Proposal: use on-line probabilistic reasoners. The Bayesian approach tells us:
• How to combine structure and probability.
• What probability to assign to a particular belief/interpretation/structure.
• How these beliefs should be updated in the light of new evidence.
Processing: in processing a sentence, humans consider possible interpretations (constructions) in
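The Bayes-rule belief updating proposed on the last slide can be illustrated with a toy calculation for the MV/RR garden path above. This is a minimal sketch, not the model from Narayanan and Jurafsky: the two-hypothesis setup and every probability below are illustrative assumptions chosen only to show the update mechanics.

```python
# Toy Bayes-rule update over two interpretations of an ambiguous region.
# All numbers are illustrative assumptions, not estimates from the paper.

def normalize(dist):
    """Scale a dict of scores so they sum to 1."""
    total = sum(dist.values())
    return {h: v / total for h, v in dist.items()}

def update(prior, likelihoods):
    """Posterior P(interp | evidence) is proportional to P(evidence | interp) * P(interp)."""
    return normalize({h: prior[h] * likelihoods[h] for h in prior})

# Two hypotheses after "The horse raced": main verb (MV) vs. reduced relative (RR).
# "raced" is far more often a simple past than a passive participle, so the
# MV reading starts out strongly preferred -- this is the garden path.
belief = normalize({"MV": 0.92, "RR": 0.08})

# Evidence from successive input regions: P(region | interpretation).
evidence = [
    {"MV": 0.5, "RR": 0.5},   # "past the barn": compatible with both readings
    {"MV": 0.01, "RR": 0.9},  # "fell": a second main verb is near-impossible under MV
]

for likelihoods in evidence:
    belief = update(belief, likelihoods)

# The initially dispreferred RR reading now dominates, modeling the
# reanalysis readers experience at "fell".
print(belief)
```

With these numbers the posterior flips to favor RR only when "fell" arrives, which is the qualitative pattern the reading-time evidence above is taken to reflect: the harder the flip, the longer the measured disruption.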