Unformatted text preview:

Speech and Language Processing SLP Chapter 13 Parsing Rest of Today Parsing with CFGs Bottom up top down Ambiguity Shared sub problems 10 13 2008 Speech and Language Processing Jurafsky and Martin with minor modifications by Dorr 2 Parsing Parsing with CFGs refers to the task of assigning proper trees to input strings Proper here means a tree that covers all and only the elements of the input and has an S at the top It doesn t actually mean that the system can select the correct tree from among all the possible trees 10 13 2008 Speech and Language Processing Jurafsky and Martin with minor modifications by Dorr 3 Parsing As with everything of interest parsing involves a search which involves the making of choices We ll start with some basic meaning bad methods before moving on to the one or two that you need to know 10 13 2008 Speech and Language Processing Jurafsky and Martin with minor modifications by Dorr 4 For Now Assume You have all the words already in some buffer The input isn t POS tagged We won t worry about morphological analysis All the words are known These are all problematic in various ways and would have to be addressed in real applications 10 13 2008 Speech and Language Processing Jurafsky and Martin with minor modifications by Dorr 5 Top Down Search Since we re trying to find trees rooted with an S Sentences why not start with the rules that give us an S Then we can work our way down from there to the words 10 13 2008 Speech and Language Processing Jurafsky and Martin with minor modifications by Dorr 6 Top Down Space 10 13 2008 Speech and Language Processing Jurafsky and Martin with minor modifications by Dorr 7 Bottom Up Parsing Of course we also want trees that cover the input words So we might also start with trees that link up with the words in the right way Then work your way up from there to larger and larger trees 10 13 2008 Speech and Language Processing Jurafsky and Martin with minor modifications by Dorr 8 Bottom Up Search 10 13 2008 Speech and Language Processing Jurafsky and Martin with minor modifications by Dorr 9 Bottom Up Search 10 13 2008 Speech and Language Processing Jurafsky and Martin with minor modifications by Dorr 10 Bottom Up Search 10 13 2008 Speech and Language Processing Jurafsky and Martin with minor modifications by Dorr 11 Bottom Up Search 10 13 2008 Speech and Language Processing Jurafsky and Martin with minor modifications by Dorr 12 Bottom Up Search 10 13 2008 Speech and Language Processing Jurafsky and Martin with minor modifications by Dorr 13 Top Down and Bottom Up Top down Only searches for trees that can be answers i e S s But also suggests trees that are not consistent with any of the words Bottom up Only forms trees consistent with the words But suggests trees that make no sense globally 10 13 2008 Speech and Language Processing Jurafsky and Martin with minor modifications by Dorr 14 Control Of course in both cases we left out how to keep track of the search space and how to make choices Which node to try to expand next Which grammar rule to use to expand a node One approach is called backtracking Make a choice if it works out then fine If not then back up and make a different choice 10 13 2008 Speech and Language Processing Jurafsky and Martin with minor modifications by Dorr 15 Problems Even with the best filtering backtracking methods are doomed because of two inter related problems Ambiguity Shared subproblems 10 13 2008 Speech and Language Processing Jurafsky and Martin with minor modifications by Dorr 16 Ambiguity 10 13 2008 Speech and Language Processing Jurafsky and Martin with minor modifications by Dorr 17 Shared Sub Problems No matter what kind of search top down or bottom up or mixed that we choose We don t want to redo work we ve already done Unfortunately na ve backtracking will lead to duplicated work 10 13 2008 Speech and Language Processing Jurafsky and Martin with minor modifications by Dorr 18 Shared Sub Problems Consider A flight from Indianapolis to Houston on TWA 10 13 2008 Speech and Language Processing Jurafsky and Martin with minor modifications by Dorr 19 Shared Sub Problems Assume a top down parse making choices among the various Nominal rules In particular between these two Nominal Noun Nominal Nominal PP Statically choosing the rules in this order leads to the following bad results 10 13 2008 Speech and Language Processing Jurafsky and Martin with minor modifications by Dorr 20 Shared Sub Problems 10 13 2008 Speech and Language Processing Jurafsky and Martin with minor modifications by Dorr 21 Shared Sub Problems 10 13 2008 Speech and Language Processing Jurafsky and Martin with minor modifications by Dorr 22 Shared Sub Problems 10 13 2008 Speech and Language Processing Jurafsky and Martin with minor modifications by Dorr 23 Shared Sub Problems 10 13 2008 Speech and Language Processing Jurafsky and Martin with minor modifications by Dorr 24 Dynamic Programming DP search methods fill tables with partial results and thereby Avoid doing avoidable repeated work Solve exponential problems in polynomial time well no not really Efficiently store ambiguous structures with shared sub parts We ll cover two approaches that roughly correspond to top down and bottom up approaches CKY Earley 10 13 2008 Speech and Language Processing Jurafsky and Martin with minor modifications by Dorr 25 CKY Parsing First we ll limit our grammar to epsilonfree binary rules more later Consider the rule A BC If there is an A somewhere in the input then there must be a B followed by a C in the input If the A spans from i to j in the input then there must be some k st i k j Ie The B splits from the C someplace 10 13 2008 Speech and Language Processing Jurafsky and Martin with minor modifications by Dorr 26 Problem What if your grammar isn t binary As in the case of the TreeBank grammar Convert it to binary any arbitrary CFG can be rewritten into Chomsky Normal Form automatically What does this mean The resulting grammar accepts and rejects the same set of strings as the original grammar But the resulting derivations trees are different 10 13 2008 Speech and Language Processing Jurafsky and Martin with minor modifications by Dorr 27 Problem More specifically we want our rules to be of the form A BC Or A w That is rules can expand to either 2 nonterminals or to a single terminal 10 13 2008 Speech and Language Processing Jurafsky and Martin with minor modifications by Dorr 28 Binarization Intuition Eliminate chains of unit


View Full Document

UMD CMSC 723 - Speech and Language Processing

Documents in this Course
Lecture 9

Lecture 9

12 pages

Smoothing

Smoothing

15 pages

Load more
Loading Unlocking...
Login

Join to view Speech and Language Processing and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Speech and Language Processing and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?