MIT 6 863J - Parsing with hierarchical structures – context-free parsing

Unformatted text preview:

6.863J Natural Language Processing Lecture 7: parsing with hierarchical structures – context-free parsing Robert C. Berwick The Menu Bar • Administrivia: • Schedule alert: Lab2 due Weds; Lab 3 out – Monday (chunk parsing to ‘real’ parsing) • Lab time today, tomorrow • Please read notes3.pdf, englishgrammar.pdf (on web) • Agenda: • Marxist analysis – simple & post-modern • What: hierarchical representations; constituents, representation • How: constituent or ‘context-free’ parsing (next time – how to do it fast) • Why: to extract ‘meaning’ 6.863J/9.611J Lecture 7 Sp03Motivation • What, How, and Why word chunks behave as units, like• What: words or endings (morphemes), like ing we have to recover these from input chunks used to discover meaning • How: • Why: • Parsing: mapping from strings to structured representation 6.863J/9.611J Lecture 7 Sp03 Programming languages printf ("/charset [%s", (re_opcode_t) *(p - 1) == charset_not ? "^" : ""); assert (p + *p < pend); for (c = 0; c < 256; c++)if (c / 8 < *p && (p[1 + (c/8)] & (1 << (c % 8)))) {/* Are we starting a range? */if (last + 1 == c && ! inrange) { putchar ('-'); inrange = 1; }/* Have we broken a range? */else if (last + 1 != c && inrange) { putchar (last); inrange = 0; } if (! inrange) putchar (c); last = c; � Easy to parse.} � Designed that way! 6.863J/9.611J Lecture 7 Sp03Natural languages printf "/charset %s", re_opcode_t *p - 1 == charset_not ? "^" : ""; assert p + *p < pend; for c = 0; c < 256; c++ if c / 8 < *p && p1 + c/8 & 1 << c % 8 Are we starting a range? if last +1 == c && ! inrange putchar '-'; inrange = 1; Have we broken a range? else if last + 1 != c && inrange putchar last; inrange = 0; if ! inrange putchar c; last = c; � No {} () [] to indicate scope & precedence � Lots of overloading (arity varies) � Grammar isn’t known in advance! � Context-free grammar not best formalism 6.863J/9.611J Lecture 7 Sp03 6.863J/9.611J Lecture 7 Sp03 How: The parsing problem P A R S E R Grammar s c o r e r correct test trees test sentences accuracy Recent parsers quite accurateSyntactic Parsing • Declarative formalisms like CFGs define the legal strings of a language but don’t specify how to recognize or assign structure to them • Parsing algorithms specify how to recognize the strings of a language and assign each string one or more syntactic structures • Parse trees useful for grammar checking, semantic analysis, MT, QA, information extraction, speech recognition…and almost every task in NLP 6.863J/9.611J Lecture 7 Sp03 Applications of parsing (1/2) � Machine translation (Alshawi 1996, Wu 1997, ...) English treeoperations Chinese � Speech synthesis from parses (Prevost 1996) The government plans to raise income tax. The government plans to raise income tax the imagination. � Speech recognition using parsing (Chelba et al 1998) Put the file in the folder. Put the file and the folder. 6.863J/9.611J Lecture 7 Sp03Applications of parsing � Grammar checking (Microsoft) � Indexing for information retrieval (Woods 72-1997) ... washing a car with a hose ... vehicle maintenance � Information extraction (Keyser, Chomsky ’62 to 6.863J/9.611J Lecture 7 Sp03 Hobbs 1996) �NY Times �archive �query �DatabaseWhy: Q&A systems (lab 4) (top-level) Shall I clear the database? (y or n) y >John saw Mary in the park OK. >Where did John see Mary IN THE PARK. >John gave Fido to Mary OK. >Who gave John Fido I DON'T KNOW >Who gave Mary Fido JOHN >John saw Fido OK. >Who did John see FIDO AND MARY 6.863J/9.611J Lecture 7 Sp03Why: express ‘long distance’ relationships via adjacency • The guy that we know in Somerville likes ice-cream • Who did the guy who lives in Somerville see __? S NP+sg VP+sg The guy S that we know in Som. V likes NP ice-cream 6.863J/9.611J Lecture 7 Sp03 Why: recover meaning from structure John ate ice-cream fi ate(John, ice-cream) -This must be done from structure -Actually want something like lxly ate(x,y) How? 6.863J/9.611J Lecture 7 Sp03Why: recover meaning from structure S VP(NP )= ate (john , icecream) john NP VP= ly.ate (y, ice-cream) V NP ice-cream lxly.ate (y, x) John ate ice-cream 6.863J/9.611J Lecture 7 Sp03 Why: Parsing for the Turing Test � Most linguistic properties are defined over hierarchical structure � One needs to parse to see subtle distinctions Sara likes her. (her „ Sara) Sara thinks that someone likes her. (her = or „ Sara) Sara dislikes anyone’s criticism of her. (her = Sara or her „ Sara) Who did John see? fi For which x, x a person, likes(Bill, x) Distinction here is based on hierarchical structure = scope in natural language 6.863J/9.611J Lecture 7 Sp03Structure must be recovered S S who VP did NP ‘gap’ orV empty element see x 6.863J/9.611J Lecture 7 Sp03 What is the structure that matters? S Turns out to be SCOPE for natural languages! 6.863J/9.611J Lecture 7 Sp03The elements 1. What: hierarchical representations (anything with recursion) using phrases AKA “constituents” 2. How: context-free parsing (plus…) 3. Why: (meaning) 6.863J/9.611J Lecture 7 Sp03 Networks to context-free grammars correspondence (CFGs) and back: 1-1 Sentence: NP: VP: NP VP Det Noun Name Verb NP 6.863J/9.611J Lecture 7 Sp03 SfiNP VP NPfiName NPfiDet Noun VPfiVerb NP + terminal expansion rulesAdded information • FSA represents pure linear relation: what can precede or (follow) what • CFG/RTN adds a new predicate: dominate • Claim: The dominance and precedence relations amongst the words exhaustively describe its syntactic structure • When we parse, we are recovering these predicates 6.863J/9.611J Lecture 7 Sp03 How do we move from linear to hierarchical? sawSentence: Noun guythe “splice out” common phrase: subnets Bush We already have the machinery for this… 6.863J/9.611J Lecture 7 Sp03Use of epsilon transitions (‘jump’ arcs) – they consume no input Sentence: …note that no input is consumed during jump verb determiner noun e e e NP VP e e Verb phrase subnet e Noun phrase subnet S-0 S-1 S-2 NP-0 NP-1 NP-3 VP-0 VP-1 VP-2 6.863J/9.611J Lecture 7 Sp03 This will work… with one catch • Consider tracing through “the guy ate the ice-cream” • What happens when we get to the second noun phrase???? • Where do we return to? • Epsilon transition takes us back to different points 6.863J/9.611J Lecture 7 Sp03What: Context-free grammars (CFG)


View Full Document

MIT 6 863J - Parsing with hierarchical structures – context-free parsing

Documents in this Course
N-grams

N-grams

42 pages

Semantics

Semantics

75 pages

Semantics

Semantics

82 pages

Semantics

Semantics

64 pages

Load more
Download Parsing with hierarchical structures – context-free parsing
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Parsing with hierarchical structures – context-free parsing and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Parsing with hierarchical structures – context-free parsing 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?