Massachusetts Institute of Technology
6.863J/9.611J, Natural Language Processing, Spring, 2001
Department of Electrical Engineering and Computer Science
Department of Brain and Cognitive Sciences

Handout 8: Computation & Hierarchical parsing II

1 FTN parsing redux

2 Earley's algorithm

Earley's algorithm is like the state set simulation of a nondeterministic FTN presented earlier, with the addition of a single new integer representing the starting point of a hierarchical phrase (since now phrases can start at any point in the input). Note that with simple linear concatenation this information is implicitly encoded via the word position we are at; the stopping or end point of a phrase will likewise be encoded by the word position. To proceed, given an input of n words, a series of state sets S0, S1, ..., Sn is built, where Si contains all the valid items after reading i words. The algorithm as presented is a simple recognizer; as usual, parsing involves more work.

In theorem-proving terms, the Earley algorithm selects the leftmost nonterminal (phrase) in a rule as the next candidate to see if one can find a "proof" for it in the input.
(By varying which nonterminal is selected, one can come up with a different parsing strategy.)

To recognize a sentence using a context-free grammar G and Earley's algorithm:

1   Compute the initial state set, S0:
    1a  Put the start state, (Start → •S, 0, 0), in S0.
    1b  Execute the following steps until no new state triples are added.
        1b1  Apply Complete to S0.
        1b2  Apply Predict to S0.
2   For each word wi, i = 1, 2, ..., n, build state set Si given state set Si−1.
    2a  Apply Scan to state set Si.
    2b  Execute the following steps until no new state triples are added to state set Si.
        2b1  Apply Complete to Si.
        2b2  Apply Predict to Si.
    2c  If state set Si is empty, reject the sentence; else increment i.
    2d  If i < n then go to Step 2a; else go to Step 3.
3   If state set Sn includes the accept state (Start → S•, 0, n), then accept; else reject.

Defining the basic operations on items

Definition 1: Scan: For all states (A → α • tβ, k, i−1) in state set Si−1, if wi = t, then add (A → αt • β, k, i) to state set Si.

Definition 2: Predict (Push): Given a state (A → α • Bβ, k, i) in state set Si, add all states of the form (B → •γ, i, i) to state set Si.

Definition 3: Complete (Pop): If state set Si contains the triple (B → γ•, k, i), then, for all rules in state set k of the form (A → α • Bβ, l, k), add (A → αB • β, l, i) to state set Si. (If the return value is empty, then do nothing.)
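The pseudocode and the three definitions above can be rendered directly as a short recognizer. The following is an illustrative Python sketch, not the course's Lisp parser; the names earley_recognize, GRAMMAR, and LEXICON are assumptions made for the sketch. An item is a tuple (A, rhs, dot, k), where k is the return pointer.

```python
# Illustrative Earley recognizer following the handout's pseudocode (a sketch,
# not the lab's GPSG parser).  An item (A, rhs, dot, k) stands for the dotted
# rule [A -> rhs[:dot] . rhs[dot:], k], with return pointer k.

def earley_recognize(words, grammar, lexicon, start="Start"):
    n = len(words)
    S = [set() for _ in range(n + 1)]

    def closure(i):
        # e-closure: apply Predict and Complete until no new triples are added.
        changed = True
        while changed:
            changed = False
            for (A, rhs, dot, k) in list(S[i]):
                if dot < len(rhs) and rhs[dot] in grammar:        # Predict (push)
                    for gamma in grammar[rhs[dot]]:
                        if (rhs[dot], gamma, 0, i) not in S[i]:
                            S[i].add((rhs[dot], gamma, 0, i))
                            changed = True
                elif dot == len(rhs):                             # Complete (pop)
                    for (B, beta, d, l) in list(S[k]):
                        if d < len(beta) and beta[d] == A:
                            if (B, beta, d + 1, l) not in S[i]:
                                S[i].add((B, beta, d + 1, l))
                                changed = True

    for rhs in grammar[start]:                                    # (Start -> .S, 0, 0)
        S[0].add((start, rhs, 0, 0))
    closure(0)

    for i, w in enumerate(words, start=1):
        for (A, rhs, dot, k) in S[i - 1]:                         # Scan
            if dot < len(rhs) and rhs[dot] in lexicon.get(w, ()):
                S[i].add((A, rhs, dot + 1, k))
        if not S[i]:                                              # Step 2c: reject
            return False
        closure(i)

    # Step 3: accept iff (Start -> S., 0, n) is in S_n.
    return any(A == start and dot == len(rhs) for (A, rhs, dot, _) in S[n])


# The example grammar and dictionary used later in this handout:
GRAMMAR = {
    "Start": [("S",)],
    "S": [("NP", "VP")],
    "NP": [("Name",), ("Name", "PP"), ("Det", "Noun")],
    "VP": [("V", "NP"), ("V", "NP", "PP")],
    "PP": [("Prep", "NP")],
}
LEXICON = {"ate": {"V"}, "John": {"Name"}, "table": {"Noun"},
           "ice-cream": {"Name", "Noun"}, "on": {"Prep"}, "the": {"Det"}}
```

With this grammar, earley_recognize("John ate ice-cream on the table".split(), GRAMMAR, LEXICON) accepts; note that, as in Step 2c, an empty state set rejects immediately without reading the rest of the input.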
                FTN Parser                            Earley Parser

Initialize:     Compute initial state set S0:         Compute initial state set S0:
                1. S0 ← q0                            1. S0 ← q0
                2. S0 ← eta-closure(S0)               2. S0 ← eta-closure(S0)
                q0 = [Start → •S, 0]                  q0 = [Start → •S, 0, 0]
                eta-closure = transitive              eta-closure = transitive closure
                closure of jump arcs                  of Predict and Complete

Loop:           Compute Si from Si−1:                 Compute Si from Si−1:
                For each word wi, i = 1, ..., n       For each word wi, i = 1, ..., n
                Si ← ∪q∈Si−1 δ(q, wi)                 Si ← ∪q∈Si−1 δ(q, wi) = Scan(Si−1), q = item
                Si ← e-closure(Si)                    Si ← e-closure(Si)
                                                      e-closure = closure(Predict, Complete)

Accept/reject:  If qf ∈ Sn then accept;               If qf ∈ Sn then accept;
                else reject                           else reject

Final:          qf = [Start → S•, 0]                  qf = [Start → S•, 0, n]

3 Comparison of FTN and Earley state set parsing

The FTN and Earley parsers are almost identical in terms of representations and algorithmic structure. Both construct a sequence of state sets S0, S1, ..., Sn. Both algorithms consist of three parts: an initialization stage, a loop stage, and an acceptance stage. The only difference is that, since the Earley parser must handle an expanded notion of an item (it is now a partial tree rather than a partial linear sequence), one must add a single new integer index to mark the return address in the hierarchical structure.

Note that Predict and Complete both act like ε-transitions: they spark parser operations without consuming any input; hence, one must close each state set construction under these operations (that is, we must add all the states we can reach after reading i words, including those reached under ε-transitions).

Question: where is the stack in the Earley algorithm? (Since we need a stack for return pointers.)

4 A simple example of the algorithm in action

Let's now see how this works with a simple grammar and then examine how parses may be retrieved.
There have been several schemes proposed for parse storage and retrieval.

Here is a simple grammar plus an example parse for John ate ice-cream on the table (ambiguous as to the placement of the Prepositional Phrase on the table).

Start → S
S → NP VP
NP → Name
NP → Name PP
NP → Det Noun
VP → V NP
VP → V NP PP
PP → Prep NP
V → ate
Name → John
Name → ice-cream
Noun → ice-cream
Noun → table
Det → the
Prep → on

Let's follow how this parse works using Earley's algorithm and the parser used in laboratory 2. (The headings and running count of state numbers aren't supplied by the parser. Also note that Start is replaced by *DO*. Some additional duplicated states that are printed during tracing have been removed for clarity, and comments added.)

(in-package 'gpsg)

(remove-rule-set 'testrules)
(remove-rule-set 'testdict)
(add-rule-set 'testrules 'CFG)
(add-rule-list 'testrules
 '((S ==> NP VP)
   (NP ==> name)
   (NP ==> Name PP)
   (VP ==> V NP)
   (NP ==> Det Noun)
   (PP ==> Prep NP)
   (VP ==> V NP PP)))

(add-rule-set 'testdict 'DICTIONARY)
(add-rule-list 'testdict
 '((ate V)
   (John Name)
   (table Noun)
   (ice-cream Noun)
   (ice-cream Name)
   (on Prep)
   (the Det)))

(create-cfg-table 'testg 'testrules 's 0)

? (pprint (p "john ate ice-cream on the table"
             :grammar 'testg :dictionary 'testdict :print-states t))

State set    Return ptr    Dotted
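To see what the first column of that trace contains, the initial state set S0 for this grammar can be reproduced with a short sketch. This is illustrative Python (the name initial_state_set is an assumption), not the lab's GPSG parser: it closes the start item under Predict, exactly as in Step 1 of the algorithm.

```python
# Illustrative sketch (not the lab's GPSG parser): compute S0 for the grammar
# above by closing {(Start -> .S, 0)} under Predict (Steps 1a-1b).

GRAMMAR = {
    "Start": [("S",)],
    "S": [("NP", "VP")],
    "NP": [("Name",), ("Name", "PP"), ("Det", "Noun")],
    "VP": [("V", "NP"), ("V", "NP", "PP")],
    "PP": [("Prep", "NP")],
}

def initial_state_set(grammar, start="Start"):
    # Items are (lhs, rhs, dot, return-pointer); every item in S0 starts at 0.
    s0 = {(start, rhs, 0, 0) for rhs in grammar[start]}
    changed = True
    while changed:                        # transitive closure of Predict
        changed = False
        for (lhs, rhs, dot, k) in list(s0):
            if dot < len(rhs) and rhs[dot] in grammar:
                for gamma in grammar[rhs[dot]]:
                    if (rhs[dot], gamma, 0, 0) not in s0:
                        s0.add((rhs[dot], gamma, 0, 0))
                        changed = True
    return s0

for (lhs, rhs, dot, k) in sorted(initial_state_set(GRAMMAR)):
    print(f"[{lhs} -> . {' '.join(rhs)}, {k}]")
```

This prints five items: the start item, S → •NP VP, and the three NP expansions. Only NP rules are predicted into S0, since the dot never sits before VP or PP there; VP and PP items first appear after the first Scan.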

