Speech and Language Processing
Formal Grammars, Parsing
Chapter 12 (start 13)

Today
  Formal Grammars
    Context-free grammar
    Grammars for English
    Treebanks
  Start Parsing

10/13/2008. Speech and Language Processing, Jurafsky and Martin, with minor modifications by Dorr.

Syntax
  By grammar, or syntax, we have in mind the kind of implicit knowledge of your native language that you had mastered by the time you were 3 years old, without explicit instruction.
  Not the kind of stuff you were later taught in "grammar" school.

Syntax
  Why should you care?
  Grammars (and parsing) are key components in many applications:
    Grammar checkers
    Dialogue management
    Question answering
    Information extraction
    Machine translation

Syntax
  Key notions that we'll cover:
    Constituency
    Grammatical relations and dependency
    Heads
  Key formalism:
    Context-free grammars
  Resources:
    Treebanks

Constituency
  The basic idea here is that groups of words within utterances can be shown to act as single units.
  And in a given language, these units form coherent classes that can be shown to behave in similar ways:
    With respect to their internal structure
    And with respect to other units in the language

Constituency
  Internal structure
    We can describe an internal structure for the class (we might have to use disjunctions of somewhat unlike sub-classes to do this).
  External behavior
    For example, we can say that noun phrases can come before verbs.

Constituency
  For example, it makes sense to say that the following are all noun phrases in English:
    [example noun phrases not preserved in this copy]
  Why? One piece of evidence is that they can all precede verbs.
    This is external evidence.

Grammars and Constituency
  Of course, there's nothing easy or obvious about how we come up with the right set of constituents and the rules that govern how they combine.
  That's why there are so many different theories of grammar and competing analyses of the same data.
  The approach to grammar, and the analyses, adopted here are very generic (and don't correspond to any modern linguistic theory of grammar).

Context-Free Grammars
  Context-free grammars (CFGs)
    Also known as:
      Phrase structure grammars
      Backus-Naur form
  Consist of:
    Rules
    Terminals
    Non-terminals

Context-Free Grammars
  Terminals
    We'll take these to be words (for now).
  Non-terminals
    The constituents in a language, like noun phrase, verb phrase, and sentence.
  Rules
    Rules are equations that consist of a single non-terminal on the left and any number of terminals and non-terminals on the right.

Some NP Rules
  Here are some rules for our noun phrases:
    [rules not preserved in this copy]
  Together, these describe two kinds of NPs:
    One that consists of a determiner followed by a nominal
    And another that says that proper names are NPs
  The third rule illustrates two things:
    An explicit disjunction (two kinds of nominals)
    A recursive definition (the same non-terminal on the right and left side of the rule)

L0 Grammar
  [grammar table not preserved in this copy]

Generativity
  As with FSAs and FSTs, you can view these rules as either analysis or synthesis machines:
    Generate strings in the language
    Reject strings not in the language
    Impose structures (trees) on strings in the language

Derivations
  A derivation is a sequence of rules applied to a string that accounts for that string:
    Covers all the elements in the string
    Covers only the elements in the string

Definition
  More formally, a CFG consists of:
    [definition not preserved in this copy]

Parsing
  Parsing is the process of taking a string and a grammar and returning one or more parse trees for that string.
  It is completely analogous to running a finite-state transducer with a tape.
    It's just more powerful.
    Remember, this means that there are languages we can capture with CFGs that we can't capture with finite-state methods.
    More on this when we get to Ch. 13.

An English Grammar Fragment
  Sentences
  Noun phrases
    Agreement
  Verb phrases
    Subcategorization
  Prepositional phrases

Sentence Types
  Declaratives: A plane left.
    S → NP VP
  Imperatives: Leave!
    S → VP
  Yes-no questions: Did the plane leave?
    S → Aux NP VP
  WH questions: When did the plane leave?
    S → WH-NP Aux NP VP

Noun Phrases
  Let's consider the following rule in more detail:
    NP → Det Nominal
  Most of the complexity of English noun phrases is hidden in this rule.
  Consider the derivation for the following example:
    All the morning flights from Denver to Tampa leaving before 10

Noun Phrases
  [parse tree figure not preserved in this copy]

NP Structure
  Clearly this NP is really about flights. That's the central, critical noun in this NP. Let's call that the head.
  We can dissect this kind of NP into the stuff that can come before the head and the stuff that can come after it.

Determiners
  Noun phrases can start with determiners.
  Determiners can be:
    Simple lexical items: the, this, a, an, etc.
      A car
    Or simple possessives
      John's car
    Or
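A sketch of the ideas so far
  The rules on the Sentence Types slide, together with the NP rules described earlier (a determiner followed by a nominal, proper names as NPs, and a recursive nominal rule), can be written down as data and used to impose trees on strings. The Python below is a minimal sketch, not the parsing algorithm from the book: it tries every way of splitting a span among the symbols on a rule's right-hand side and memoizes the results. The exact form of the Nominal rule, the VP → Verb rule, the tiny lexicon, and the names GRAMMAR, parse, trees and expand are all assumptions made for illustration. The sketch also assumes a grammar with no empty right-hand sides and no unary cycles.

```python
from functools import lru_cache

# Non-terminal -> list of right-hand sides. Any symbol that is not a key
# of this dict is treated as a terminal (a word).
GRAMMAR = {
    "S": [("NP", "VP"), ("VP",), ("Aux", "NP", "VP")],
    "NP": [("Det", "Nominal"), ("ProperNoun",)],
    # Assumed form of the recursive rule: a disjunction with the same
    # non-terminal on both sides.
    "Nominal": [("Noun",), ("Nominal", "Noun")],
    "VP": [("Verb",)],  # assumption: the simplest possible verb phrase
    # Hypothetical toy lexicon.
    "Det": [("a",), ("the",)],
    "Noun": [("plane",), ("morning",), ("flight",)],
    "ProperNoun": [("Denver",)],
    "Verb": [("left",), ("leave",)],
    "Aux": [("did",)],
}


def parse(tokens, start="S"):
    """Return every tree rooted in `start` that covers all of `tokens`."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def trees(symbol, i, j):
        """All trees with root `symbol` spanning exactly tokens[i:j]."""
        if symbol not in GRAMMAR:  # terminal: must match a single word
            return (symbol,) if j == i + 1 and tokens[i] == symbol else ()
        found = []
        for rhs in GRAMMAR[symbol]:
            for children in expand(rhs, i, j):
                found.append((symbol,) + children)
        return tuple(found)

    def expand(rhs, i, j):
        """Yield tuples of child trees, one per way `rhs` covers tokens[i:j]."""
        if not rhs:
            if i == j:
                yield ()
            return
        first, rest = rhs[0], rhs[1:]
        # Every remaining symbol needs at least one word, so the first
        # symbol may end anywhere from i + 1 up to j - len(rest).
        for m in range(i + 1, j - len(rest) + 1):
            for tree in trees(first, i, m):
                for others in expand(rest, m, j):
                    yield (tree,) + others

    return list(trees(start, 0, len(tokens)))


if __name__ == "__main__":
    for sentence in ["a plane left", "did the plane leave", "plane a left"]:
        print(sentence, "->", parse(sentence.split()))
```

  Each tree is a nested tuple whose first element is the constituent label. An empty result means the string is rejected, so the same rules both accept strings and impose structure on them. The sketch also accepts "a plane leave", which is one reason agreement gets its own entry in the grammar fragment outline.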