Columbia COMS W4705 - Basic Parsing with Context-Free Grammars - D877079

School name Columbia University

Course Coms W4705- Natural Language Processing

Pages 25

This preview shows page 1-2-24-25 out of 25 pages.

CS 4705
Basic Parsing with Context-Free Grammars

Analyzing Linguistic Units
• Morphological parsing:
  – analyze words into morphemes and affixes
  – rule-based, FSAs, FSTs
• Phonological parsing:
  – analyze sounds into words and phrases
• POS tagging
• Syntactic parsing:
  – identify component parts and how they are related
  – to see if a sentence is grammatical
  – to assign an abstract representation of meaning

Syntactic Parsing
• Declarative formalisms like CFGs define the legal strings of a language, but don’t specify how to recognize or assign structure to them
• Parsing algorithms specify how to recognize the strings of a language and assign each string one or more syntactic structures
• Parsing is useful for grammar checking, semantic analysis, MT, QA, information extraction, speech recognition … and almost every task in NLP

Parsing as a Form of Search
• Searching FSAs
  – Finding the right path through the automaton
  – Search space defined by structure of FSA
• Searching CFGs
  – Finding the right parse tree among all possible parse trees
  – Search space defined by the grammar
• Constraints provided by the input sentence and the automaton or grammar

CFG for Fragment of English
S -> NP VP          VP -> V
S -> Aux NP VP      PP -> Prep NP
S -> VP             N -> book | flight | meal | money
NP -> Det Nom       V -> book | include | prefer
NP -> PropN         Aux -> does
Nom -> N Nom        Prep -> from | to | on
Nom -> N            PropN -> Houston | TWA
Nom -> Nom PP       Det -> that | this | a
VP -> V NP

Parse Tree for ‘Book that flight’ for Prior CFG
[S [VP [V Book] [NP [Det that] [Nom [N flight]]]]]

Rule Expansion
S -> NP VP          VP -> V
S -> Aux NP VP      PP -> Prep NP
S -> VP (1)         N -> book | flight | meal | money
NP -> Det Nom (3)   V -> book | include | prefer
NP -> PropN         Aux -> does
Nom -> N Nom        Prep -> from | to | on
Nom -> N (4)        PropN -> Houston | TWA
Nom -> Nom PP       Det -> that | this | a
VP -> V NP (2)

Top-Down Parser
• Builds from the root S node to the leaves
• Assuming we build all trees in parallel:
  – Find all trees with root S (or all rules with lhs S)
  – Next expand all constituents in these trees/rules
  – Continue until leaves are POS
  – Candidate trees failing to match POS of input string are rejected (e.g. Book that flight matches only one subtree)

Top-Down Search Space for CFG (expanding only leftmost leaves)
Level 1: S
Level 2: [S NP VP]   [S Aux NP VP]   [S VP]
Level 3: [S [NP Det Nom] VP]   [S [NP PropN] VP]   [S Aux [NP Det Nom] VP]   [S Aux [NP PropN] VP]   [S [VP V NP]]   [S [VP V]]
Continuing down the [S [VP V NP]] branch: [S [VP V [NP Det Nom]]], then [S [VP V [NP Det [Nom N]]]]

Bottom-Up Parsing
• Parser begins with words of input and builds up trees, applying grammar rules whose rhs match
  – Book that flight
      N    Det  N          V    Det  N
      Book that flight     Book that flight
  – ‘Book’ is ambiguous (2 POS appear in grammar)
  – Parse continues until an S root node is reached or no further node expansion is possible

Two Candidates: One Successful Parse
[VP [V Book]] [NP [Det that] [Nom [N flight]]]        (no S root)
[S [VP [V Book] [NP [Det that] [Nom [N flight]]]]]    (successful parse)

What’s right/wrong with….
• Top-Down parsers: they never explore illegal parses (e.g. ones which can’t form an S), but waste time on trees that can never match the input
• Bottom-Up parsers: they never explore trees inconsistent with input, but waste time exploring illegal parses (with no S root)
• For both: control strategy, i.e. how to explore the search space?
  – Pursue all parses in parallel, or backtrack, or …?
  – Which rule to apply next?
  – Which node to expand next?

A Top-Down Parsing Strategy
• Depth-first search:
  – Agenda of search states: expand search space incrementally, exploring most recently generated state (tree) each time
  – When you reach a state (tree) inconsistent with input, backtrack to most recent unexplored state (tree)
• Which node to expand?
  – Leftmost or rightmost
• Which grammar rule to use?
  – Order in the grammar??

Top-Down, Depth-First, Left-Right Strategy
• Initialize agenda with ‘S’ tree and ptr to first word and make this current search state (cur)
• Loop until successful parse or empty agenda
  – Apply all applicable grammar rules to leftmost unexpanded node of cur
    • If this node is a POS category and matches that of the current input, push this onto agenda
    • Otherwise push new trees onto agenda
  – Pop new cur from agenda
• Does this flight include a meal?

Fig 10.7

Left Corners: Top-Down Parsing with Bottom-Up Filtering
• We saw: Top-Down, depth-first, L2R parsing
  – Expands non-terminals along the tree’s left edge down to leftmost leaf of tree
  – Moves on to expand down to next leftmost leaf…
  – Note: In a successful parse, the current input word will be the first word in the derivation of the node the parser is currently processing
  – So….look ahead to left-corner of the tree
    • B is a left-corner of A if A =*=> B α
    • Build table with left-corners of all non-terminals in grammar and consult before applying rule

Left Corners

Left-Corner Table for CFG
Category    Left Corners
S           Det, PropN, Aux, V
NP          Det, PropN
Nom         N
VP          V

Left Recursion
• Depth-first search will never terminate if grammar is left recursive (e.g. NP -> NP PP), i.e. A =*=> α A β with α =*=> ε
• Solutions:
  – Rewrite the grammar (automatically?) to a weakly equivalent one which is not left-recursive
    e.g. The man {on the hill with the telescope…}
      NP -> NP PP (Nom plus a sequence of PPs)
      NP -> Nom PP
      NP -> Nom
    …becomes…
      NP -> Nom NP'
      NP' -> PP NP' (a sequence of PPs)
      NP' -> ε
    • This may make rules unnatural
  – Harder to detect and eliminate non-immediate left recursion
      NP -> Nom PP
      Nom -> NP
  – Fix depth of search explicitly
  – Rule ordering: non-recursive rules first
      NP -> Det Nom
      NP -> NP PP

Structural ambiguity:
• Multiple legal structures
  – Attachment (e.g. I saw a man on a hill with a telescope)
  – Coordination (e.g. younger cats and dogs)
  – NP bracketing (e.g. Spanish language teachers)
• Solution?
  – Return all possible parses and disambiguate using “other methods”

Summing Up
• Parsing is a search problem which may be implemented with many control strategies
  – Top-Down or Bottom-Up approaches each have problems
    • Combining the two solves
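
As an illustration of the Top-Down, Depth-First, Left-Right strategy, the following Python sketch recognizes sentences with the fragment grammar from the slides. It is a minimal sketch under stated assumptions, not the course's implementation: the names GRAMMAR, LEXICON and recognize are hypothetical, and the agenda holds (remaining symbols, word position) pairs rather than full trees. A length check stands in for "fix depth of search explicitly", so that the left-recursive rule Nom -> Nom PP cannot loop. That check relies on the grammar having no empty productions.

```python
# Fragment grammar from the slides; rule order is the order tried.
GRAMMAR = {
    "S": [["NP", "VP"], ["Aux", "NP", "VP"], ["VP"]],
    "NP": [["Det", "Nom"], ["PropN"]],
    "Nom": [["N", "Nom"], ["N"], ["Nom", "PP"]],
    "VP": [["V", "NP"], ["V"]],
    "PP": [["Prep", "NP"]],
}

# POS categories and the (lowercased) words they cover.
LEXICON = {
    "N": {"book", "flight", "meal", "money"},
    "V": {"book", "include", "prefer"},
    "Aux": {"does"},
    "Prep": {"from", "to", "on"},
    "PropN": {"houston", "twa"},
    "Det": {"that", "this", "a"},
}


def recognize(sentence):
    """Return True if the sentence can be derived from S (hypothetical helper)."""
    words = sentence.lower().split()
    # Each search state: (symbols still to expand, index of next input word).
    agenda = [(("S",), 0)]
    while agenda:
        symbols, pos = agenda.pop()  # most recently generated state first
        if not symbols:
            if pos == len(words):
                return True  # all symbols expanded, all input consumed
            continue  # input left over: backtrack
        # Assumption: no empty productions, so every symbol needs >= 1 word.
        # This bound also stops the left-recursive rule Nom -> Nom PP.
        if len(symbols) > len(words) - pos:
            continue
        first, rest = symbols[0], symbols[1:]
        if first in LEXICON:
            # POS category: must match the current input word.
            if words[pos] in LEXICON[first]:
                agenda.append((rest, pos + 1))
        else:
            # Push in reverse so the first grammar rule is popped first.
            for rhs in reversed(GRAMMAR[first]):
                agenda.append((tuple(rhs) + rest, pos))
    return False
```

With this sketch, recognize("book that flight") and recognize("does this flight include a meal") both return True, while recognize("flight that book") returns False. Removing the length check would let the search expand Nom -> Nom PP indefinitely, which is the non-termination problem described under Left Recursion.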

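
The left-corner table can also be derived from the grammar rather than written by hand. The sketch below shows one possible way to do that, assuming the fragment grammar from the slides; left_corner_table, GRAMMAR and POS_TAGS are hypothetical names. It repeatedly propagates the first symbol of each rule's right-hand side until nothing changes, and it records only POS categories, as the slide's table does.

```python
# Non-terminal rules of the fragment grammar from the slides.
GRAMMAR = {
    "S": [["NP", "VP"], ["Aux", "NP", "VP"], ["VP"]],
    "NP": [["Det", "Nom"], ["PropN"]],
    "Nom": [["N", "Nom"], ["N"], ["Nom", "PP"]],
    "VP": [["V", "NP"], ["V"]],
    "PP": [["Prep", "NP"]],
}

# POS categories (pre-terminals) of the same grammar.
POS_TAGS = {"N", "V", "Aux", "Prep", "PropN", "Det"}


def left_corner_table(grammar, pos_tags):
    """Map each non-terminal to the POS categories that can start it."""
    table = {cat: set() for cat in grammar}
    changed = True
    while changed:  # iterate to a fixed point
        changed = False
        for cat, rules in grammar.items():
            for rhs in rules:
                first = rhs[0]
                # A POS tag is its own left corner; a non-terminal
                # contributes whatever left corners it has so far.
                new = {first} if first in pos_tags else table[first]
                if not new <= table[cat]:
                    table[cat] |= new
                    changed = True
    return table


if __name__ == "__main__":
    for cat, corners in left_corner_table(GRAMMAR, POS_TAGS).items():
        print(cat, sorted(corners))
```

For S, NP, Nom and VP this yields the same entries as the Left-Corner Table for CFG in the slides; PP, which that table omits, gets Prep. A top-down parser could consult such a table before applying a rule and skip any rule whose left-hand side cannot start with the POS of the current input word.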