UT CS 388 - Natural Language Processing: Statistical Parsing


CS 388: Natural Language Processing: Statistical Parsing
Raymond J. Mooney
University of Texas at Austin

Slide titles in this deck:
Statistical Parsing; Probabilistic Context Free Grammar (PCFG); Simple PCFG for ATIS English; Sentence Probability; Syntactic Disambiguation; Three Useful PCFG Tasks; PCFG: Most Likely Derivation; Probabilistic CKY; Probabilistic Grammar Conversion; Probabilistic CKY Parser; PCFG: Observation Likelihood; Inside Algorithm; Probabilistic CKY Parser for Inside Computation; PCFG: Supervised Training; Estimating Production Probabilities; PCFG: Maximum Likelihood Training; Inside-Outside; Vanilla PCFG Limitations; Example of Importance of Lexicalization; Head Words; Lexicalized Productions; Parameterizing Lexicalized Productions; Collins' Parser; Sample Production Generation; Estimating Production Generation Parameters; Missed Context Dependence; Splitting Non-Terminals; Parent Annotation Example; Split and Merge; Treebanks; First WSJ Sentence; WSJ Sentence with Trace (NONE); Parsing Evaluation Metrics; Computing Evaluation Metrics; Treebank Results; Discriminative Parse Reranking; 2-Stage Reranking Approach; Parse Reranking; Sample Parse Tree Features; Evaluation of Reranking; Other Discriminative Parsing; Human Parsing; Garden Path Sentences; Center Embedding; Dependency Grammars; Dependency Graph from Parse Tree; Unification Grammars; Mildly Context-Sensitive Grammars; Statistical Parsing Conclusions

Statistical Parsing
• Statistical parsing uses a probabilistic model of syntax to assign a probability to each parse tree.
• It provides a principled approach to resolving syntactic ambiguity.
• It allows supervised learning of parsers from treebanks of parse trees provided by human linguists.
• It also allows unsupervised learning of parsers from unannotated text, but the accuracy of such parsers has been limited.

Probabilistic Context Free Grammar (PCFG)
• A PCFG is a probabilistic version of a CFG in which each production has a probability.
• The probabilities of all productions rewriting a given non-terminal must sum to 1, defining a probability distribution for each non-terminal.
• String generation is now probabilistic: production probabilities are used to non-deterministically select a production for rewriting a given non-terminal.

Simple PCFG for ATIS English

Grammar:
  S → NP VP               0.8
  S → Aux NP VP           0.1
  S → VP                  0.1
  NP → Pronoun            0.2
  NP → Proper-Noun        0.2
  NP → Det Nominal        0.6
  Nominal → Noun          0.3
  Nominal → Nominal Noun  0.2
  Nominal → Nominal PP    0.5
  VP → Verb               0.2
  VP → Verb NP            0.5
  VP → VP PP              0.3
  PP → Prep NP            1.0
(The probabilities for each left-hand-side non-terminal sum to 1.0.)

Lexicon:
  Det → the | a | that | this              0.6 0.2 0.1 0.1
  Noun → book | flight | meal | money      0.1 0.5 0.2 0.2
  Verb → book | include | prefer           0.5 0.2 0.3
  Pronoun → I | he | she | me              0.5 0.1 0.1 0.3
  Proper-Noun → Houston | NWA              0.8 0.2
  Aux → does                               1.0
  Prep → from | to | on | near | through   0.25 0.25 0.1 0.2 0.2

Sentence Probability
• Assume the production applied at each node is chosen independently.
• The probability of a derivation is then the product of the probabilities of its productions.

Derivation D1 of "book the flight through Houston" (PP attached inside the NP, "the flight through Houston"):
  P(D1) = 0.1 × 0.5 × 0.5 × 0.6 × 0.6 × 0.5 × 0.3 × 1.0 × 0.2 × 0.2 × 0.5 × 0.8 = 0.0000216

Syntactic Disambiguation
• Resolve ambiguity by picking the most probable parse tree.

Derivation D2 (PP attached to the VP):
  P(D2) = 0.1 × 0.3 × 0.5 × 0.6 × 0.5 × 0.6 × 0.3 × 1.0 × 0.5 × 0.2 × 0.2 × 0.8 = 0.00001296

Sentence Probability (continued)
• The probability of a sentence is the sum of the probabilities of all of its derivations.
  P("book the flight through Houston") = P(D1) + P(D2)
                                       = 0.0000216 + 0.00001296 = 0.00003456

Three Useful PCFG Tasks
• Observation likelihood: to classify and order sentences.
• Most likely derivation: to determine the most likely parse tree for a sentence.
• Maximum likelihood training: to train a PCFG to fit empirical training data.

PCFG: Most Likely Derivation
• There is an analog of the Viterbi algorithm that efficiently determines the most probable derivation (parse tree) for a sentence.

Example grammar:
  S → NP VP     0.9
  S → VP        0.1
  NP → Det A N  0.5
  NP → NP PP    0.3
  NP → PropN    0.2
  A → ε         0.6
  A → Adj A     0.4
  PP → Prep NP  1.0
  VP → V NP     0.7
  VP → VP PP    0.3

For the input "John liked the dog in the pen.", the PCFG parser returns only the single most probable tree; in the slides, the parse attaching "in the pen" to the VP is rejected in favor of the one attaching it inside the NP ("the dog in the pen").

Probabilistic CKY
• CKY can be modified for PCFG parsing by including in each chart cell a probability for each non-terminal.
• Cell[i,j] must retain the most probable derivation of each constituent (non-terminal) covering words i+1 through j, together with its probability.
• When transforming the grammar to Chomsky Normal Form (CNF), production probabilities must be set so that the probability of each derivation is preserved.

Probabilistic Grammar Conversion

Original grammar:                   Chomsky Normal Form:
  S → NP VP               0.8        S → NP VP                       0.8
  S → Aux NP VP           0.1        S → X1 VP                       0.1
                                     X1 → Aux NP                     1.0
  S → VP                  0.1        S → book | include | prefer     0.01 0.004 0.006
                                     S → Verb NP                     0.05
                                     S → VP PP                       0.03
  NP → Pronoun            0.2        NP → I | he | she | me          0.1 0.02 0.02 0.06
  NP → Proper-Noun        0.2        NP → Houston | NWA              0.16 0.04
  NP → Det Nominal        0.6        NP → Det Nominal                0.6
  Nominal → Noun          0.3        Nominal → book | flight | meal | money  0.03 0.15 0.06 0.06
  Nominal → Nominal Noun  0.2        Nominal → Nominal Noun          0.2
  Nominal → Nominal PP    0.5        Nominal → Nominal PP            0.5
  VP → Verb               0.2        VP → book | include | prefer    0.1 0.04 0.06
  VP → Verb NP            0.5        VP → Verb NP                    0.5
  VP → VP PP              0.3        VP → VP PP                      0.3
  PP → Prep NP            1.0        PP → Prep NP                    1.0

Probabilistic CKY Parser

First cells of the chart for "Book the flight through Houston":
  Book:        S: .01, VP: .1, Verb: .5, Nominal: .03, Noun: .1
  the:         Det: .6
  flight:      Nominal: .15, Noun: .5
  Book the:    (none)
  the flight:  NP: .6 × .6 × .15 = .054
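The derivation probabilities above can be checked mechanically: each is just a product over the productions used. A minimal sketch in Python; the factor-to-rule annotations are my reading of the slide's factor order (the product is the same in any order):

```python
from math import prod

# Factors of derivation D1 (PP attached inside the NP).
d1 = [
    0.1,  # S -> VP
    0.5,  # VP -> Verb NP
    0.5,  # Verb -> book
    0.6,  # NP -> Det Nominal
    0.6,  # Det -> the
    0.5,  # Nominal -> Nominal PP
    0.3,  # Nominal -> Noun
    1.0,  # PP -> Prep NP
    0.2,  # Prep -> through
    0.2,  # NP -> Proper-Noun
    0.5,  # Noun -> flight
    0.8,  # Proper-Noun -> Houston
]

# Factors of derivation D2 (PP attached to the VP).
d2 = [
    0.1,  # S -> VP
    0.3,  # VP -> VP PP
    0.5,  # VP -> Verb NP
    0.6,  # NP -> Det Nominal
    0.5,  # Verb -> book
    0.6,  # Det -> the
    0.3,  # Nominal -> Noun
    1.0,  # PP -> Prep NP
    0.5,  # Noun -> flight
    0.2,  # Prep -> through
    0.2,  # NP -> Proper-Noun
    0.8,  # Proper-Noun -> Houston
]

p_d1 = prod(d1)        # ≈ 0.0000216
p_d2 = prod(d2)        # ≈ 0.00001296
sentence_prob = p_d1 + p_d2  # sum over all derivations ≈ 0.00003456
```

Note that D1 and D2 use the same lexicon probabilities and differ only in where the PP-introducing rule sits, so the ratio P(D1)/P(D2) reduces to the ratio of the two attachment choices.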
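The CNF probabilities in the conversion table come from folding each eliminated unit production's probability into the rules it can expand to, so that every derivation keeps its original probability. A small check, assuming the ATIS grammar numbers above (the dictionary layout is mine):

```python
from math import isclose

# Original-grammar probabilities involved in the unit-production folding.
orig = {
    "S -> VP": 0.1,
    "VP -> Verb": 0.2,
    "VP -> Verb NP": 0.5,
    "VP -> VP PP": 0.3,
    "Verb -> book": 0.5,
    "Nominal -> Noun": 0.3,
    "Noun -> flight": 0.5,
    "NP -> Proper-Noun": 0.2,
    "Proper-Noun -> Houston": 0.8,
}

# Eliminating a unit production A -> B multiplies P(A -> B) into every
# rule that B expands by; the products match the CNF column.
assert isclose(orig["S -> VP"] * orig["VP -> Verb"] * orig["Verb -> book"], 0.01)  # S -> book
assert isclose(orig["S -> VP"] * orig["VP -> Verb NP"], 0.05)                      # S -> Verb NP
assert isclose(orig["S -> VP"] * orig["VP -> VP PP"], 0.03)                        # S -> VP PP
assert isclose(orig["VP -> Verb"] * orig["Verb -> book"], 0.1)                     # VP -> book
assert isclose(orig["Nominal -> Noun"] * orig["Noun -> flight"], 0.15)             # Nominal -> flight
assert isclose(orig["NP -> Proper-Noun"] * orig["Proper-Noun -> Houston"], 0.16)   # NP -> Houston
```

The same folding explains the three-factor entries such as S → book: the chain S → VP → Verb → book collapses into one CNF rule.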
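The chart-filling procedure described above can be sketched as a minimal Viterbi-style CKY over the CNF grammar, keeping in each cell the best probability per non-terminal. Function and variable names are mine, and only the lexical entries and binary rules needed for the example sentence are included:

```python
from collections import defaultdict

# Lexical entries read off the CNF grammar and the slide's chart cells:
# word -> [(non-terminal, probability)].
lexicon = {
    "book":    [("S", 0.01), ("VP", 0.1), ("Verb", 0.5),
                ("Nominal", 0.03), ("Noun", 0.1)],
    "the":     [("Det", 0.6)],
    "flight":  [("Nominal", 0.15), ("Noun", 0.5)],
    "through": [("Prep", 0.2)],
    "Houston": [("NP", 0.16)],
}

# Binary CNF rules as (parent, left child, right child, probability).
rules = [
    ("S", "NP", "VP", 0.8),
    ("S", "X1", "VP", 0.1),
    ("X1", "Aux", "NP", 1.0),
    ("S", "Verb", "NP", 0.05),
    ("S", "VP", "PP", 0.03),
    ("NP", "Det", "Nominal", 0.6),
    ("Nominal", "Nominal", "Noun", 0.2),
    ("Nominal", "Nominal", "PP", 0.5),
    ("VP", "Verb", "NP", 0.5),
    ("VP", "VP", "PP", 0.3),
    ("PP", "Prep", "NP", 1.0),
]

def cky(words):
    """Fill chart[(i, j)]: best probability of each non-terminal
    deriving words[i:j] (0-indexed, j exclusive)."""
    n = len(words)
    chart = defaultdict(dict)
    for i, w in enumerate(words):           # length-1 spans from the lexicon
        for nt, p in lexicon.get(w, []):
            chart[(i, i + 1)][nt] = p
    for span in range(2, n + 1):            # longer spans, shortest first
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):       # split point
                for parent, left, right, p in rules:
                    pl = chart[(i, k)].get(left)
                    pr = chart[(k, j)].get(right)
                    if pl is not None and pr is not None:
                        cand = p * pl * pr
                        if cand > chart[(i, j)].get(parent, 0.0):
                            chart[(i, j)][parent] = cand
    return chart

chart = cky("book the flight through Houston".split())
```

On this sentence the chart reproduces the slide's cell for "the flight" (NP: 0.6 × 0.6 × 0.15 = 0.054), and the best S over the whole sentence equals P(D1) = 0.0000216, i.e. probabilistic CKY selects the NP-attachment derivation D1.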

