Speech and Language Processing
Formal Grammars, Parsing
Chapter 12, start 13
10/13/2008. Speech and Language Processing - Jurafsky and Martin (with minor modifications by Dorr)

Today
  - Formal Grammars
  - Context-free grammar
  - Grammars for English
  - Treebanks
  - Start Parsing

Syntax
  - By grammar, or syntax, we have in mind the kind of implicit knowledge of your native language that you had mastered by the time you were 3 years old without explicit instruction
  - Not the kind of stuff you were later taught in “grammar” school

Syntax
  - Why should you care?
  - Grammars (and parsing) are key components in many applications
    - Grammar checkers
    - Dialogue management
    - Question answering
    - Information extraction
    - Machine translation

Syntax
  - Key notions that we’ll cover
    - Constituency
    - Grammatical relations and Dependency
    - Heads
  - Key formalism
    - Context-free grammars
  - Resources
    - Treebanks

Constituency
  - The basic idea here is that groups of words within utterances can be shown to act as single units.
  - And in a given language, these units form coherent classes that can be shown to behave in similar ways
    - With respect to their internal structure
    - And with respect to other units in the language

Constituency
  - Internal structure
    - We can describe an internal structure to the class (might have to use disjunctions of somewhat unlike sub-classes to do this).
  - External behavior
    - For example, we can say that noun phrases can come before verbs

Constituency
  - For example, it makes sense to say that the following are all noun phrases in English...
  - Why? One piece of evidence is that they can all precede verbs.
    - This is external evidence

Grammars and Constituency
  - Of course, there’s nothing easy or obvious about how we come up with the right set of constituents and the rules that govern how they combine...
  - That’s why there are so many different theories of grammar and competing analyses of the same data.
  - The approach to grammar, and the analyses, adopted here are very generic (and don’t correspond to any modern linguistic theory of grammar).

Context-Free Grammars
  - Context-free grammars (CFGs)
    - Also known as
      - Phrase structure grammars
      - Backus-Naur form
  - Consist of
    - Rules
    - Terminals
    - Non-terminals

Context-Free Grammars
  - Terminals
    - We’ll take these to be words (for now)
  - Non-Terminals
    - The constituents in a language
      - Like noun phrase, verb phrase and sentence
  - Rules
    - Rules are equations that consist of a single non-terminal on the left and any number of terminals and non-terminals on the right.

Some NP Rules
  - Here are some rules for our noun phrases
  - Together, these describe two kinds of NPs.
    - One that consists of a determiner followed by a nominal
    - And another that says that proper names are NPs.
  - The third rule illustrates two things
    - An explicit disjunction
      - Two kinds of nominals
    - A recursive definition
      - Same non-terminal on the right and left-side of the rule

L0 Grammar

Generativity
  - As with FSAs and FSTs, you can view these rules as either analysis or synthesis machines
    - Generate strings in the language
    - Reject strings not in the language
    - Impose structures (trees) on strings in the language

Derivations
  - A derivation is a sequence of rules applied to a string that accounts for that string
    - Covers all the elements in the string
    - Covers only the elements in the string

Definition
  - More formally, a CFG consists of

Parsing
  - Parsing is the process of taking a string and a grammar and returning a (multiple?) parse tree(s) for that string
  - It is completely analogous to running a finite-state transducer with a tape
    - It’s just more powerful
    - Remember this means that there are languages we can capture with CFGs that we can’t capture with finite-state methods
    - More on this when we get to Ch. 13.

An English Grammar Fragment
  - Sentences
  - Noun phrases
    - Agreement
  - Verb phrases
    - Subcategorization
  - Prepositional Phrases

Sentence Types
  - Declaratives: A plane left.
      S → NP VP
  - Imperatives: Leave!
      S → VP
  - Yes-No Questions: Did the plane leave?
      S → Aux NP VP
  - WH Questions: When did the plane leave?
      S → WH-NP Aux NP VP
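
The sentence-type rules and the NP rules described on these slides can be tried out with a small recognizer. What follows is only a sketch under stated assumptions, not the parser the course goes on to develop. The four S rules come from the Sentence Types slide. The NP and Nominal rules are reconstructed from the prose description (determiner plus nominal, proper name, and a disjunctive, recursive nominal rule). The rule VP → Verb, the treatment of WH-NP as a word category, the toy lexicon, and the names GRAMMAR, LEXICON and recognize are all hypothetical and exist only for illustration.

```python
from functools import lru_cache

# Non-terminal -> list of right-hand sides (tuples of symbols).
# S rules are from the Sentence Types slide; the rest are assumptions.
GRAMMAR = {
    "S": [("NP", "VP"), ("VP",), ("Aux", "NP", "VP"),
          ("WH-NP", "Aux", "NP", "VP")],
    "NP": [("Det", "Nominal"), ("ProperNoun",)],
    "Nominal": [("Noun",), ("Nominal", "Noun")],  # disjunction + recursion
    "VP": [("Verb",)],                            # assumed minimal VP rule
}

# Hypothetical toy lexicon: word category -> words (the terminals).
LEXICON = {
    "Det": {"a", "the"},
    "Noun": {"plane", "morning", "flight"},
    "ProperNoun": {"houston"},
    "Verb": {"left", "leave"},
    "Aux": {"did"},
    "WH-NP": {"when"},  # simplification: WH-NP treated as a single word
}


def recognize(words, start="S"):
    """Return True if the grammar derives exactly the given word list."""
    words = tuple(w.lower() for w in words)

    @lru_cache(maxsize=None)
    def derives(symbol, i, j):
        # Does `symbol` derive words[i:j]?
        if symbol in LEXICON:
            return j - i == 1 and words[i] in LEXICON[symbol]
        return any(covers(rhs, i, j) for rhs in GRAMMAR[symbol])

    def covers(rhs, i, j):
        # Does the symbol sequence `rhs` derive words[i:j]?
        if len(rhs) == 1:
            return derives(rhs[0], i, j)
        # Every symbol must cover at least one word, so the first symbol
        # always gets a strictly shorter span. This is what keeps the
        # left-recursive rule Nominal -> Nominal Noun from looping.
        return any(
            derives(rhs[0], i, k) and covers(rhs[1:], k, j)
            for k in range(i + 1, j - len(rhs) + 2)
        )

    return len(words) > 0 and derives(start, 0, len(words))


if __name__ == "__main__":
    for sentence in ["a plane left", "leave", "did the plane leave",
                     "when did the plane leave", "plane the left"]:
        print(sentence, "->", recognize(sentence.split()))
```

The recognizer only answers yes or no; it covers "all and only" the words of the input in the sense of the Derivations slide, but it does not build the parse tree. Splitting on word spans rather than expanding rules top-down is a design choice made here so that the recursive nominal rule terminates.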