Speech and Language Processing
SLP Chapter 13: Parsing
10/13/2008
Speech and Language Processing - Jurafsky and Martin (with minor modifications by Dorr)

Contents: Rest of Today; Parsing; For Now; Top-Down Search; Top-Down Space; Bottom-Up Parsing; Bottom-Up Search; Top-Down and Bottom-Up; Control; Problems; Ambiguity; Shared Sub-Problems; Dynamic Programming; CKY Parsing; Problem; Binarization Intuition; Sample L1 Grammar; CNF Conversion; Conversion to CNF

Rest of Today
- Parsing with CFGs
- Bottom-up, top-down
- Ambiguity
- Shared sub-problems

Parsing
- Parsing with CFGs refers to the task of assigning proper trees to input strings.
- "Proper" here means a tree that covers all and only the elements of the input and has an S at the top.
- It doesn't actually mean that the system can select the correct tree from among all the possible trees.

Parsing
- As with everything of interest, parsing involves a search, which involves making choices.
- We'll start with some basic (meaning bad) methods before moving on to the one or two that you need to know.

For Now
Assume...
- You have all the words already in some buffer.
- The input isn't POS tagged.
- We won't worry about morphological analysis.
- All the words are known.
These are all problematic in various ways and would have to be addressed in real applications.

Top-Down Search
- Since we're trying to find trees rooted with an S (sentences), why not start with the rules that give us an S?
- Then we can work our way down from there to the words.

Top-Down Space
(figure: the top-down search space)

Bottom-Up Parsing
- Of course, we also want trees that cover the input words, so we might instead start with trees that link up with the words in the right way.
- Then work our way up from there to larger and larger trees.

Bottom-Up Search
(figures on slides 9-13: successive stages of a bottom-up search)

Top-Down and Bottom-Up
- Top-down: only searches for trees that can be answers (i.e. S's), but also suggests trees that are not consistent with any of the words.
- Bottom-up: only forms trees consistent with the words, but suggests trees that make no sense globally.

Control
- In both cases we left out how to keep track of the search space and how to make choices:
  - Which node to try to expand next
  - Which grammar rule to use to expand a node
- One approach is called backtracking: make a choice; if it works out, fine; if not, back up and make a different choice.

Problems
- Even with the best filtering, backtracking methods are doomed because of two inter-related problems:
  - Ambiguity
  - Shared sub-problems

Ambiguity
(figure: an ambiguous parse)

Shared Sub-Problems
- No matter what kind of search (top-down, bottom-up, or mixed) we choose, we don't want to redo work we've already done.
- Unfortunately, naïve backtracking will lead to duplicated work.

Shared Sub-Problems
- Consider: "A flight from Indianapolis to Houston on TWA"

Shared Sub-Problems
- Assume a top-down parse making choices among the various Nominal rules, in particular between these two:
    Nominal -> Noun
    Nominal -> Nominal PP
- Statically choosing the rules in this order leads to the following bad results...

Shared Sub-Problems
(figure: the parser re-deriving the same sub-trees)
Shared Sub-Problems
(figures on slides 22-24: the same prepositional phrases being re-parsed)

Dynamic Programming
- DP search methods fill tables with partial results and thereby:
  - Avoid doing avoidable repeated work
  - Solve exponential problems in polynomial time (well, no, not really)
  - Efficiently store ambiguous structures with shared sub-parts
- We'll cover two approaches that roughly correspond to bottom-up and top-down approaches: CKY and Earley.

CKY Parsing
- First we'll limit our grammar to epsilon-free, binary rules (more later).
- Consider the rule A -> B C: if there is an A somewhere in the input, then there must be a B followed
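That binary restriction is what makes a table-filling parser possible: an A spanning words i..j must split at some k into a B over i..k followed by a C over k..j, so every span can be built from two shorter ones already in the table. A minimal CKY recognizer sketch, using a made-up CNF grammar (hypothetical symbols and lexicon, not the book's L1 grammar):

```python
# Toy CNF grammar: epsilon-free binary rules plus lexical rules.
BINARY = {                       # A -> B C, indexed as (B, C) -> {A, ...}
    ("NP", "VP"): {"S"},
    ("Det", "Noun"): {"NP"},
    ("Verb", "NP"): {"VP"},
}
LEXICAL = {                      # A -> word
    "the": {"Det"}, "a": {"Det"},
    "pilot": {"Noun"}, "flight": {"Noun"},
    "booked": {"Verb"},
}

def cky(words):
    """Fill table[i][j] with every symbol that derives words[i:j]."""
    n = len(words)
    table = [[set() for _ in range(n + 1)] for _ in range(n)]
    for i, w in enumerate(words):                 # length-1 spans from the lexicon
        table[i][i + 1] = set(LEXICAL.get(w, ()))
    for width in range(2, n + 1):                 # shorter spans are filled first
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):             # split point: B over i..k, C over k..j
                for B in table[i][k]:
                    for C in table[k][j]:
                        table[i][j] |= BINARY.get((B, C), set())
    return table

words = "the pilot booked a flight".split()
table = cky(words)
print("S" in table[0][len(words)])   # True: an S spans the whole input
```

Each span is computed once and stored, so a PP (or here, an NP) shared by several analyses sits in one cell rather than being re-derived; the whole table fills in O(n^3) time for a fixed grammar.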


UMD CMSC 723 - Speech and Language Processing
