DOC PREVIEW
Berkeley COMPSCI 294 - Parsing: Search

This preview shows page 1 out of 2 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 2 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 2 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

1CS 294-5: StatisticalNatural Language ProcessingParsing: SearchDan KleinGeneral Problem Someone gives you a PCFG G For any given sentence you might want to: Find the best parse according to G Find a bunch of reasonable parses Find the total probability of all parses Techniques: Beam search Agenda-based search The CKY algorithmBeam Search State space search States are partial parses Find a way to ensure that all parses of a sentence have the same number N steps Leftmost top-down CFG derivations in CNF Shift-reduce derivations in CNF (Use a binary grammar, or binarize what you’ve got)Beam Search Time-synchronous beam searchBeam at time iSuccessors of beam elementsBeam at time i+1Kinds of Beam Search Constant beam size K Constant beam width Additive Multiplicative Sometimes do fancier stuff, like try to keep beam diverse Beam search can be made very fast No measure of how optimal it is Correct hypothesis trickAgenda-Based Parsing For general grammars Start with a table recording δ(X,i,j) The best score of a parse of X over [i,j] All entries start at ∞ Can be a sparse or dense map Sometimes record backtraces, too Step I: Hit the lexicon For each word w, and each tag t, set δ(t,i,j) = tag-score(w,t)2Agenda-Based Parsing Keep a list of edges called an agenda Edges are triples [X,i,j] Agenda is a priority queue Every time some δ(X,i,j) lowers: Stick the edge [X,i,j] into the agenda Update the backtrace for δ(X,i,j)Agenda-Based Parsing Step II: While agenda not empty: Get the “next” edge [X,i,j] from the agenda Fetch all compatible neighbors [Y,j,k] or [Z,k,i] Compatible means there are rules A→XY or B→ZX Build parent edges [A,i,k] or [B,k,j] δ(A,i,k) ≤δ(X,i,j) + δ(Y,j,k) + P(XY|A) If we’ve improved δ(A,i,k), stick [A,i,k] on the agenda Also project unary rules: When do we know we have a parse for the root?Agenda-Based Parsing Open questions: Agenda priority: What did “next” mean? Efficiency: how do we do as little work as possible? Optimality: how do we know when we find the best parse of a sentence? If we use δ(X,i,j) as the priority: Each edge goes on the agenda at most once When an edge pops off the agenda, its best parse score is known (why?) This is basically uniform- cost searchSpeeding Up Agenda Parsers Two options for doing less work The optimal way: A* Parsing The ugly (but possibly faster) way: Best-First Parsing CKY Parsing Assuming: You’ve got a lot of memory You’re willing to do exhaustive parsing Your grammar is in CNF There’s an easy solution: CKY parsingNext Time Grammars beyond PCFGs Reading: M+S 11 (over next few classes) J+M 12 (over next few


View Full Document

Berkeley COMPSCI 294 - Parsing: Search

Documents in this Course
"Woo" MAC

"Woo" MAC

11 pages

Pangaea

Pangaea

14 pages

Load more
Download Parsing: Search
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Parsing: Search and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Parsing: Search 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?