Lecture 7: Context-free parsing & statistical context-free parsing
Professor Robert C. [email protected]
6.863J/9.611J SP11

The Menu Bar
• Administrivia: RR#2 out today
• Efficient context-free parsing: Three Loops to Rule them All, Three Loops to Bind them
• Removing redundancy redundancy
• Statistical context-free parsing
• Parsing tricks; the Earley algorithm

Review: Context-free grammar (CFG), more precisely
A context-free grammar (CFG) is a 4-tuple (N, Σ, P, S) where:
N is a finite set of nonterminal symbols (phrase names, categories);
Σ is a finite set of terminal symbols (‘lexical items’, ‘parts of speech’, e.g., noun, verb, auxiliary verb, …);
P is a set of production rules <A ∈ N, α>, where α is a sequence of terminals or nonterminals; sometimes also written in the form X → Y1 Y2 … Yn for n ≥ 0, X ∈ N, Yi ∈ (N ∪ Σ);
S ∈ N is a designated start symbol.
We write the productions as A → α (‘is-a’)
Note: this is already off the mark from human language in at least two ways! (What ways?)

Languages, grammars, and all that
• A (terminal) string s ∈ Σ* is in the language generated by the context-free grammar G iff there is at least one derivation that yields s
• The set of strings derivable from (in the language of) G is denoted L(G), the grammar’s weak generative capacity
• A string may have more than one derivation (as in the FSA case!) = ambiguity
• Parsing = given an input sentence w and a CFG G, recover the (set of) derivations of w wrt G
• And that is what makes parsing hard!
Why?

Canonical derivations in CFGs
• Left-most derivation in a CFG: a sequence of derivation lines s1, s2, …, sn where
  s1 = Start symbol
  sn ∈ Σ*, that is, all terminal symbols
  For i = 2, …, n, each si is derived from si-1 by selecting the leftmost nonterminal X in si-1 and replacing it by some β, the right-hand side of a rule X → β in P
(Can you tell me how to map derivations to ‘hierarchical structures’? No, no, do not use the ‘T’ word…)

CFGs, languages, and derivations
• The derives relation ⇒
• Define wrt grammar G = (N, Σ, P, S) as follows: α ⇒ β iff ∃ α1, α2 s.t. α = α1 A α2; β = α1 γ α2; and A → γ ∈ P. (Some rule rewrites α as β)
• The reflexive, transitive closure of ⇒ is ⇒∗
If (α, β) is in ⇒∗ then we say that α derives β (by 0 or more steps)
The language generated by a CFG G is the set of terminal strings w ∈ Σ* s.t. S ⇒∗ w (also called the yield of S)

Derives relation & added info in context-free grammar
• Relates all elts by either dominance or precedence
• Induces a (derivation) tree (Q: do we lose any information in this tree?)
• FSA represents pure linear relation: what can precede (or follow) what
• CFG adds a single new predicate: dominate
• Claim: The dominance and precedence relations amongst the words of a sentence exhaustively describe its syntactic structure
• When we parse, we are recovering these predicates

Derivation induces 2 relations, dominance & precedence
• Binary relation D, dominance: A D v iff ∃ α1, α2 s.t. α ⇒ β via A → α1 v α2
• Binary relation <, precedence: v < w iff ∃ α1, α2 s.t. α = α1 v w α2 or β = α1 v w α2 & α ⇒ β
Confirm that our derivation steps previously induce such relations… note that all elts are related by < or D. (Suppose not…?)

A context-free grammar
S → NP V
S → NP VP      N → guy
NP → Det N     N → caviar
NP → NP PP     N → spoon
VP → VP PP     V → spoon
VP → V NP      V → ate
PP → P NP      P → with
Det → a        Det → the | a

A derivation of: Madoff ate caviar with a spoon
1. Root
2. S
3. NP VP
4. NP V NP
5. NP V NP PP
6. NP V NP P NP
7. NP V NP P Det N
8. NP V NP P Det spoon
9. NP V NP P a spoon
10. NP V NP with a spoon
11. NP V Det N with a spoon
12. NP V Det caviar with a spoon
13. NP V the caviar with a spoon
14. NP ate the caviar with a spoon
15. Det N ate the caviar with a spoon
16. Det guy ate the caviar with a spoon
17. The guy ate the caviar with a spoon
Now parse this…

Some simple methods
• Shift-reduce parsing: stack + two operations: shift item onto stack, or reduce (via grammar rule); bottom up
• What’s the problem with this?
• Recursive descent parsing: top-down

Marxist analysis
[Figure: parse tree for “The guy ate the caviar with a spoon”, with nodes S, NP, VP, V, N, Det, PP, Prep]
How can we “re-use” the PP for both analyses? Otherwise, we will do exponential work!

Local & Global ambiguity makes parsing hard
• Local ambiguity
• Plural/singular: the sheep are/is…
• The chair is too wobbly for the woman to sit on/on it
• Global ambiguity
NP → NP PP, PP → P NP: the guy on the hill with the telescope
n  1  2  3   4    5    6     7     8
#  2  5  14  132  469  1430  4862  16796
Noun compounds: NP → NN NN, water meter cover screw
Conjunctions: NP → NP and NP; NP → NP and NP and NP
n  1  2  3  4   5   6    7    8     9
#  1  1  3  11  45  197  903  4279  20793

Why is parsing hard? Martha Stewart’s revenge
• If you write cookbooks… this from an actual example in a 30M word corpus…
Combine grapefruit with bananas, strawberries and bananas, bananas and melon balls, raspberries or strawberries and melon balls, seedless white grapes and melon balls, or pineapple cubes with orange slices...
# of parses with 10 conjuncts is 103,049 (grows approx as 6^(# conjuncts))

Use dynamic programming (aka ‘memoization’) to avoid building the same phrase more than once
(actually: A* search through phrase space… when we add some notion of “distance”)
Let’s see how…

A context-free grammar
S → NP VP      NP → N
S → NP V       N → guy | Papa
NP → Det N     N → caviar
NP → NP PP     N → spoon
VP → VP PP     V → spoon
VP → V NP      V → ate
PP → P NP      P → with
Det → a        Det → the
(suppose VP rule was VP → V NP.. What then?)

Tabular, bottom-up parsing: CKY, recognition version
Uses a ‘table’, aka a ‘chart’
Grammar: binary branching form, rules Nk → Nl Nm
Input: string of n words
Output: yes/no; table(i, j) of phrases spanning i, j; split(i, j, k) of backptrs
Data structure: n x n table
rows labeled 1 to n (indexed by j)
columns labeled 1 to n (indexed by i)
table(i, j) lists constituents found covering words i thru j

CKY algorithm
[Figure: empty CKY chart for “the guy ate the caviar with a spoon”, rows and columns numbered 1 to 8 and indexed by i and j, first cell labeled 1,1]

The CKY algorithm: 3 Loops to Rule Them All
function CKY-PARSE(words, grammar) returns table
for j ← from 1 to n (# words) do
table[j, j] ← {A| A
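
The three-loop idea behind the recognition version of CKY can be sketched in Python as follows. This is only an illustrative sketch, not the lecture's own code: the names `cky_recognize`, `close_unary`, `BINARY`, `UNARY` and `LEXICON` are hypothetical, the rules are copied from the second grammar in this lecture, and handling the non-binary rule NP → N by a unary closure step is an assumption made here so that the grammar can be used without first converting it to binary branching form.

```python
from collections import defaultdict

# Binary rules A -> B C, copied from the lecture's second grammar.
BINARY = [
    ("S", "NP", "VP"), ("S", "NP", "V"),
    ("NP", "Det", "N"), ("NP", "NP", "PP"),
    ("VP", "VP", "PP"), ("VP", "V", "NP"),
    ("PP", "P", "NP"),
]
# Unary rule NP -> N; handled by closure (an assumption of this sketch).
UNARY = [("NP", "N")]
# Lexical rules, word -> set of preterminal categories.
LEXICON = {
    "guy": {"N"}, "Papa": {"N"}, "caviar": {"N"},
    "spoon": {"N", "V"}, "ate": {"V"}, "with": {"P"},
    "a": {"Det"}, "the": {"Det"},
}


def close_unary(cell):
    """Add A to the cell whenever A -> B is a unary rule and B is present."""
    changed = True
    while changed:
        changed = False
        for parent, child in UNARY:
            if child in cell and parent not in cell:
                cell.add(parent)
                changed = True
    return cell


def cky_recognize(words, start="S"):
    """Return (accepted, table); table[(i, j)] holds the categories
    that span words i through j inclusive, with 1-based positions."""
    n = len(words)
    table = defaultdict(set)
    for j in range(1, n + 1):                  # loop 1: right edge of span
        table[(j, j)] = close_unary(set(LEXICON.get(words[j - 1], ())))
        for i in range(j - 1, 0, -1):          # loop 2: left edge, widening
            for k in range(i, j):              # loop 3: split point
                for parent, left, right in BINARY:
                    if left in table[(i, k)] and right in table[(k + 1, j)]:
                        table[(i, j)].add(parent)
            close_unary(table[(i, j)])
    return start in table[(1, n)], table


if __name__ == "__main__":
    accepted, chart = cky_recognize("the guy ate the caviar with a spoon".split())
    print(accepted, sorted(chart[(3, 8)]))
```

Each cell is filled once and then shared by every larger span that uses it, so the PP "with a spoon" is built a single time even though it takes part in both the NP and the VP attachment. The loop order guarantees that both sub-spans of a split are complete before the cell that combines them is filled.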