
Natural Language Processing
Lecture 13
10/15/2013
Jim Martin

Today
CKY Parsing
  Bottom-up parsing with dynamic programming
  That is, parsing by filling in a table, bottom up

01/14/19  Speech and Language Processing - Jurafsky and Martin

Dynamic Programming
DP search methods fill tables with partial results and thereby:
  Avoid doing avoidable repeated work
  Solve exponential problems in polynomial time (well, no, not really)
  Efficiently store ambiguous structures with shared sub-parts
We'll cover two approaches that roughly correspond to bottom-up and top-down approaches:
  CKY
  Earley

Bottom-Up Parsing
We want trees that cover the input words.
So we might start with trees that link up with the words in the right way.
Then we work our way up from there to larger and larger trees.

Bottom-Up Search
[Five figure slides; the images are not included in this text.]

CKY Parsing
First we'll limit our grammar to epsilon-free, binary rules.
Consider the rule A → B C.
  If there is an A somewhere in the input generated by this rule, then there must be a B followed by a C in the input.
  If the A spans from i to j in the input, then there must be some k such that i < k < j.
  I.e., the B splits from the C someplace.

CKY
Let's build a table so that an A spanning from i to j in the input is placed in cell [i, j] in the table.
So a non-terminal spanning an entire string will sit in cell [0, n].
  Hopefully it will be an S.
Now we know that the parts of the A must go from i to k and from k to j, for some k.
Which k? Check them all.
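
To make "check them all" concrete, here is a minimal sketch of enumerating the split points for one span. It assumes fencepost indexing (position 0 sits before the first word and position n after the last), which is the usual convention for this table; the function names `split_points` and `sub_spans` are made up for illustration and are not from the lecture.

```python
def split_points(i, j):
    """Return every k with i < k < j.

    Each k is a place where a B covering [i, k] could meet
    a C covering [k, j] under a rule A -> B C.
    """
    return list(range(i + 1, j))


def sub_spans(i, j):
    """Pair up the left and right sub-spans for every split of [i, j]."""
    return [((i, k), (k, j)) for k in split_points(i, j)]
```

For a three-word span, `sub_spans(1, 4)` yields the two candidate splits `((1, 2), (2, 4))` and `((1, 3), (3, 4))`. A one-word span such as `[2, 3]` has no split points, which is why single words have to be handled by the lexical rules instead.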

CKY
Meaning that for a rule like A → B C, we should look for a B in [i, k] and a C in [k, j].
In other words, if we think there might be an A spanning [i, j] in the input, AND A → B C is a rule in the grammar, THEN there must be a B in [i, k] and a C in [k, j] for some i < k < j.

CKY
So to fill the table, loop over the cell [i, j] values in some systematic way.
For each cell, loop over the appropriate k values to search for things to add.

CKY Table
[Figure slide; the image is not included in this text.]

CKY Algorithm
[Figure slide; the image is not included in this text. Annotations on the algorithm:]
  Looping over the columns
  Filling the bottom cell
  Filling row i in column j
  Looping over the possible split locations between i and j
  Check the grammar for rules that link the constituents in [i, k] with those in [k, j]. For each rule found, store the LHS of the rule in cell [i, j].

Example
[Six figure slides, one captioned "Filling column 5"; the images are not included in this text.]

CKY Parsing
Is that really a parser?
Not without a backtrace.

Note
We arranged the loops to fill the table a column at a time, from left to right, bottom to top.
  This assures us that whenever we're filling a cell, the parts needed to fill it are already in the table (to the left and below).
  It's somewhat natural in that it processes the input left to right, a word at a time.
    Known as online processing.
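
The loop structure described above (columns left to right, rows bottom to top, then split points) can be sketched roughly as follows. This is a recognizer only: it stores non-terminals without back-pointers, so, as the slide says, it is not yet a parser. The grammar and lexicon are a hypothetical toy example in binary form, not the grammar used in the lecture, and the dictionary layout (`BINARY`, `LEXICON`) is just one convenient assumption.

```python
from collections import defaultdict

# Hypothetical toy grammar (not from the lecture), binary rules only.
# BINARY maps a right-hand side (B, C) to every A with a rule A -> B C.
BINARY = {
    ("NP", "VP"): {"S"},
    ("Det", "N"): {"NP"},
    ("V", "NP"): {"VP"},
}
# LEXICON maps a word to the non-terminals that can rewrite to it.
LEXICON = {
    "the": {"Det"},
    "dog": {"N"},
    "cat": {"N"},
    "chased": {"V"},
}


def cky_recognize(words, binary=BINARY, lexicon=LEXICON):
    """Fill the CKY table one column at a time, left to right."""
    n = len(words)
    table = defaultdict(set)  # (i, j) -> set of non-terminals spanning i..j
    for j in range(1, n + 1):              # looping over the columns
        # Filling the bottom cell of column j from the lexicon.
        table[(j - 1, j)] |= lexicon.get(words[j - 1], set())
        for i in range(j - 2, -1, -1):     # filling row i, bottom to top
            for k in range(i + 1, j):      # possible split locations
                for b in table[(i, k)]:
                    for c in table[(k, j)]:
                        # Store the LHS of every rule linking B and C.
                        table[(i, j)] |= binary.get((b, c), set())
    return table
```

With the toy grammar, `cky_recognize("the dog chased the cat".split())` puts `S` in cell `(0, 5)`, `NP` in `(0, 2)`, and `VP` in `(2, 5)`, while a cell such as `(1, 3)` ("dog chased") stays empty. Because column j is finished before column j + 1 begins, every cell to the left and below is complete when it is consulted.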

Note
An alternative is to fill a diagonal at a time.
  That still satisfies our requirement that the component parts of each constituent/cell will already be available when it is filled in.
Filling a diagonal at a time corresponds naturally to what task?

CKY Notes
Since it's bottom-up, CKY populates the table with a lot of phantom constituents.
  Segments that by themselves are constituents but cannot really occur in the context in which they are being suggested.
To avoid this, we can switch to a top-down control strategy.
Or we can add some kind of filtering that blocks constituents where they cannot happen in a final analysis.

Back to Ambiguity
Did we solve it?

Ambiguity
No...
Given an ambiguous input, CKY will result in multiple S structures for the [0, n] table entry.
But sub-parts are shared between multiple parses, so we avoid re-deriving those sub-parts.
But there's no way to tell which one is right.

Next Time
Partial parsing
Chunking
Dependency parsing
Start on probabilistic parsing
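
As a footnote to the ambiguity and diagonal-filling slides above, the sketch below counts how many distinct trees end up in each cell, filling the table one diagonal (span length) at a time. It is only an illustration under stated assumptions: the function name `count_parses`, the table layout, and the deliberately ambiguous toy grammar (S → S S, S → a) are all hypothetical and not from the lecture. Each cell's count is computed once and reused by every larger span, which is the sharing of sub-parts the slide refers to; the count says nothing about which tree is right.

```python
from collections import defaultdict

# Hypothetical, deliberately ambiguous toy grammar: S -> S S, S -> "a".
AMBIG_BINARY = {("S", "S"): {"S"}}
AMBIG_LEXICON = {"a": {"S"}}


def count_parses(words, binary=AMBIG_BINARY, lexicon=AMBIG_LEXICON, start="S"):
    """Count distinct trees rooted in `start` that span the whole input."""
    n = len(words)
    counts = defaultdict(int)  # (i, j, A) -> number of trees for A over i..j
    # First diagonal: one-word spans come straight from the lexicon.
    for i, word in enumerate(words):
        for a in lexicon.get(word, ()):
            counts[(i, i + 1, a)] += 1
    # Remaining diagonals: all spans of length 2, then 3, and so on.
    for length in range(2, n + 1):
        for i in range(0, n - length + 1):
            j = i + length
            for k in range(i + 1, j):
                for (b, c), parents in binary.items():
                    left = counts.get((i, k, b), 0)
                    right = counts.get((k, j, c), 0)
                    if left and right:
                        for a in parents:
                            # Every left tree combines with every right tree.
                            counts[(i, j, a)] += left * right
    return counts[(0, n, start)]
```

For the toy grammar, `count_parses(["a"] * 3)` returns 2 and `count_parses(["a"] * 4)` returns 5: the number of analyses grows quickly with input length even though the table itself has only on the order of n² cells.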



CU-Boulder CSCI 5832 - Lecture 13
