MSU CSE 842 - CSE 842 Natural Language Processing
CSE 842: Natural Language Processing
Lecture 11: Probabilistic Parsing
2/21/2011, CSE 842, Spring 2011, MSU

Dynamic Programming
• Create a table of solutions to sub-problems (e.g., subtrees) as the parse proceeds
• Look up subtrees for each constituent rather than re-parsing, avoiding repeated work
• Since all parses are implicitly stored, all are available for later disambiguation
• We will look at two approaches, corresponding to top-down and bottom-up:
– CKY: Cocke-Kasami-Younger (1960)
– Earley: Earley (1970)

Earley Parsing
• Allows arbitrary CFGs
• Top-down control
• Fills a table in a single sweep over the input
– Table is length N+1; N is the number of words
– Think of chart entries as sitting between words in the input string, keeping track of the states of the parse at those positions
– For each word position, the chart contains a set of states representing all partial parse trees generated to date:
• Completed constituents and their locations
• In-progress constituents
• Predicted constituents

States
• The table entries are called states and are represented with dotted rules:
S → · VP            A VP is predicted
NP → Det · Nominal  An NP is in progress
VP → V NP ·         A VP has been found

States/Locations
• S → · VP [0,0]
• NP → Det · Nominal [1,2]
• VP → V NP · [0,3]
• A VP is predicted at the start of the sentence
• An NP is in progress; the Det goes from 1 to 2
• A VP has been found, starting at 0 and ending at 3
– [x,y] tells us where the state begins (x) and where the dot lies (y) with respect to the input

S → · VP, [0,0]
– The first 0 means the S constituent begins at the start of the input
– The second 0 means the dot is there too
– So this is a top-down prediction
NP → Det · Nom, [1,2]
– The NP begins at position 1
– The dot is currently at position 2
– So Det has been successfully parsed
– Nom is predicted next
Positions: 0 Book 1 that 2 flight 3
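The dotted-rule states above map naturally onto a small data structure. Here is a minimal sketch in Python; the class and method names are my own choices for illustration, not from the lecture:

```python
# A dotted-rule state as in the slides: LHS -> rhs with a dot, plus
# [start, end] giving where the constituent began and where the dot lies.
from dataclasses import dataclass

@dataclass(frozen=True)
class State:
    lhs: str      # e.g. "NP"
    rhs: tuple    # e.g. ("Det", "Nominal")
    dot: int      # how many rhs symbols are complete
    start: int    # input position where this constituent began
    end: int      # input position of the dot

    def is_complete(self):
        """Dot at the right end: the constituent has been found."""
        return self.dot == len(self.rhs)

    def next_symbol(self):
        """Symbol to the right of the dot, or None if complete."""
        return None if self.is_complete() else self.rhs[self.dot]

    def __str__(self):
        syms = list(self.rhs)
        syms.insert(self.dot, ".")
        return f"{self.lhs} -> {' '.join(syms)} [{self.start},{self.end}]"

# The three example states from the slides:
predicted   = State("S",  ("VP",),            0, 0, 0)
in_progress = State("NP", ("Det", "Nominal"), 1, 1, 2)
found       = State("VP", ("V", "NP"),        2, 0, 3)

print(predicted)    # S -> . VP [0,0]
print(in_progress)  # NP -> Det . Nominal [1,2]
print(found)        # VP -> V NP . [0,3]
```

`is_complete` and `next_symbol` are exactly the two questions the parsing operators introduced below need to ask of a state.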
Successful Parse
• The final answer is found by looking at the last entry in the chart
• If an entry resembles S → α · [0,N], the input was parsed successfully
• But note that the chart will also contain a record of all possible parses of the input string, given the grammar, not just the successful one(s)

Parsing Procedure for the Earley Algorithm
• Move through each set of states in order, applying one of three operators to each state:
– Predictor: add predictions to the chart
– Scanner: read input and add the corresponding state to the chart
– Completer: move the dot to the right when a new constituent is found
• Results (new states) are added to the current or next set of states in the chart
• No backtracking, and no states are removed: keep the complete history of the parse

Core Earley Code
[Figure: Earley algorithm pseudocode; not captured in this text extract]

Predictor
• Intuition: create new states representing top-down expectations
• Applied when a non-part-of-speech non-terminal is to the right of the dot:
S → · VP [0,0]
• Adds new states to the current chart
– One new state for each expansion of the non-terminal in the grammar:
VP → · V [0,0]
VP → · V NP [0,0]

Scanner
• New states for predicted parts of speech
• Applicable when a part of speech is to the right of the dot:
VP → · V NP [0,0]   ('Book…')
• Looks at the current word in the input
• If it matches, adds state(s) to the next chart:
V → book · [0,1]

Completer
• Intuition: the parser has discovered a constituent, so it must find and advance all states that were waiting for it
• Applied when the dot has reached the right end of a rule:
NP → Det Nom · [1,3]
• Find all states with their dot at 1 that expect an NP:
VP → V · NP [0,1]
• Adds new (completed) state(s) to the current chart:
VP → V NP · [0,3]

Earley Code
[Figure: detailed Earley pseudocode; not captured in this text extract]

CFG for a Fragment of English
S → NP VP         Det → that | this | a
S → Aux NP VP     N → book | flight | meal | money
S → VP            V → book | include | prefer
NP → Det Nom      Aux → does
NP → PropN        PropN → Houston | TWA
Nom → N           Prep → from | to | on
Nom → Nom N
VP → V
VP → V NP
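The conditions under which each of the three operators fires can be written as a tiny dispatch rule. This is a hypothetical helper of my own, assuming the lexical (part-of-speech) categories of the grammar fragment above:

```python
# Deciding which Earley operator applies to a state, per the slides:
# Completer when the dot has reached the right end of the rule,
# Scanner when a part of speech follows the dot, Predictor otherwise.

POS = {"Det", "N", "V", "Aux", "PropN", "Prep"}  # lexical categories

def operator_for(rhs, dot):
    """rhs: tuple of symbols; dot: number of completed symbols."""
    if dot == len(rhs):
        return "completer"  # dot at right end: constituent found
    return "scanner" if rhs[dot] in POS else "predictor"

print(operator_for(("VP",), 0))         # predictor  (S -> . VP)
print(operator_for(("V", "NP"), 0))     # scanner    (VP -> . V NP)
print(operator_for(("Det", "Nom"), 2))  # completer  (NP -> Det Nom .)
```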
Note: the example here uses fewer rules than the example in the book.

Book that flight (Chart[0])
Seed the chart with top-down predictions for S from the grammar:
γ → · S [0,0]            Dummy start state
S → · NP VP [0,0]        Predictor
S → · Aux NP VP [0,0]    Predictor
S → · VP [0,0]           Predictor
NP → · Det Nom [0,0]     Predictor
NP → · PropN [0,0]       Predictor
VP → · V [0,0]           Predictor
VP → · V NP [0,0]        Predictor

• When the dummy start state is processed, it is passed to the Predictor, which produces states representing every possible expansion of S, and adds these and every expansion of the left corners of these trees to the bottom of Chart[0]
• When VP → · V, [0,0] is reached, the Scanner is called; it consults the first word of the input, Book, and adds the first state to Chart[1]: V → Book ·, [0,1]
• Note: when VP → · V NP, [0,0] is reached in Chart[0], the Scanner does not need to add V → Book ·, [0,1] to Chart[1] again

Chart[1]
V → book · [0,1]         Scanner
VP → V · [0,1]           Completer
VP → V · NP [0,1]        Completer
S → VP · [0,1]           Completer
NP → · Det Nom [1,1]     Predictor
NP → · PropN [1,1]       Predictor

• V → book · is passed to the Completer, which finds two states in Chart[0] whose left corner is V and adds them to Chart[1], moving their dots to the right
• When VP → V · is itself processed by the Completer, S → VP · is added to Chart[1], since VP is a left corner of S
• The last two rules in Chart[1] are added by the Predictor when VP → V · NP is processed
• And so on…

Chart[2]
Det → that · [1,2]       Scanner
NP → Det · Nom [1,2]     Completer
Nom → · N [2,2]          Predictor
Nom → · Nom N [2,2]      Predictor

Chart[3]
N → flight · [2,3]       Scanner
Nom → N · [2,3]          Completer
NP → Det Nom · [1,3]     Completer
Nom → Nom · N [2,3]      Completer
VP → V NP · [0,3]        Completer
S → VP · [0,3]           Completer

How do we retrieve the parses at the end?
• Augment the Completer to add pointers to the prior states it advances, as a field in the current state
– i.e., what state did we advance here?
– Read the pointers back from the final state
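The whole procedure, seeding Chart[0] and then applying the Predictor, Scanner, and Completer over each chart, can be sketched end to end. This is a minimal illustration rather than the lecture's own code (which sits in the slides' unrecovered figures); the GRAMMAR and LEXICON encodings and the function names are mine, and the backpointers for parse retrieval from the final slide are omitted for brevity:

```python
# Compact Earley parser for the lecture's grammar fragment.
# A state is (lhs, rhs, dot, start); charts[k] holds states whose
# dot sits at input position k, so end = k implicitly.

GRAMMAR = {
    "S":   [("NP", "VP"), ("Aux", "NP", "VP"), ("VP",)],
    "NP":  [("Det", "Nom"), ("PropN",)],
    "Nom": [("N",), ("Nom", "N")],
    "VP":  [("V",), ("V", "NP")],
}
LEXICON = {
    "book": {"N", "V"}, "flight": {"N"}, "meal": {"N"}, "money": {"N"},
    "include": {"V"}, "prefer": {"V"}, "that": {"Det"}, "this": {"Det"},
    "a": {"Det"}, "does": {"Aux"}, "Houston": {"PropN"}, "TWA": {"PropN"},
    "from": {"Prep"}, "to": {"Prep"}, "on": {"Prep"},
}
POS = {"Det", "N", "V", "Aux", "PropN", "Prep"}

def earley(words):
    charts = [[] for _ in range(len(words) + 1)]
    def add(k, state):              # no duplicates, no removals:
        if state not in charts[k]:  # the chart keeps the full history
            charts[k].append(state)
    add(0, ("GAMMA", ("S",), 0, 0))  # dummy start state
    for k in range(len(words) + 1):
        i = 0
        while i < len(charts[k]):    # chart may grow while we process it
            lhs, rhs, dot, start = charts[k][i]
            i += 1
            if dot < len(rhs):
                nxt = rhs[dot]
                if nxt in POS:       # Scanner: POS right of the dot
                    if k < len(words) and nxt in LEXICON.get(words[k], set()):
                        add(k + 1, (nxt, (words[k],), 1, k))
                else:                # Predictor: expand the non-terminal
                    for prod in GRAMMAR[nxt]:
                        add(k, (nxt, prod, 0, k))
            else:                    # Completer: advance states that were
                for lhs2, rhs2, dot2, start2 in charts[start]:  # waiting
                    if dot2 < len(rhs2) and rhs2[dot2] == lhs:
                        add(k, (lhs2, rhs2, dot2 + 1, start2))
    return charts
```

Running `earley("book that flight".split())` reproduces the chart contents traced above; in particular, the final chart contains VP → V NP · [0,3] and S → VP · [0,3], so the parse succeeds.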

