UT CS 378 - Lexicalized and Probabilistic Parsing

Contents

Lexicalized and Probabilistic Parsing
Using Probabilities
It's Mostly About Semantics
How to Add Semantics to Parsing?
The "Modern" Approach
Probabilistic Context-Free Grammars
A Toy Example
How Can We Use These?
The Probability of Some Parse T
An Example
An Example – The Probabilities
Using Probabilities for Language Modeling
Adding Probabilities to a Parser
Limitations to Attaching Probabilities Just to Rules
Often the Probabilities Depend on Lexical Choices
The Dogs in Houses Example
The Fix – Use the Lexicon
Lexicalized Trees
Adding Lexical Items to the Rules
Collapsing These Cases
Computing Probabilities of Heads
Revised Rule for Probability of a Parse
So We Can Solve the Dumped Sacks Problem
It's Mostly About Semantics But It's Also About Psychology
Garden Path Sentences
Embedding Limitations
Building Deterministic Parsers

Lexicalized and Probabilistic Parsing

Read J & M Chapter 12.

Using Probabilities

• Resolving ambiguities:
  I saw the Statue of Liberty flying over New York.
• Predicting for recognition:
  I have to go. vs. I half to go. vs. I half way thought I'd go.

It's Mostly About Semantics

He drew one card.
I saw the Statue of Liberty flying over New York.
I saw a plane flying over New York.
Workers dumped sacks into a bin.
Moscow sent more than 100,000 soldiers into Afghanistan.
John hit the ball with the bat.
John hit the ball with the autograph.
Visiting relatives can be trying.
Visiting museums can be trying.

How to Add Semantics to Parsing?

The classic approach to this problem: ask a semantics module to choose. There are two ways to do that:
• Cascade the two systems: build all the parses, then pass them to semantics to rate them. Combinatorially awful.
• Do semantics incrementally: pass constituents along as they are built, get ratings back, and filter.
In either case, we need to reason about the world.

The "Modern" Approach

The modern approach: skip "meaning" and the corresponding need for a knowledge base and an inference engine. Notice that the facts about meaning manifest themselves in the probabilities of observed sentences, if there are enough sentences.
Why is this approach in vogue?
• Building world models is a lot harder than early researchers realized.
• But we do have huge text corpora from which we can draw statistics.

Probabilistic Context-Free Grammars

A PCFG is a context-free grammar in which each rule has been augmented with a probability:

  A → β  [p]

where p is the probability that a given nonterminal symbol A will be rewritten as β via this rule. Another way to think of this is:

  p = P(A → β | A)

So the probabilities of all rules with left-hand side A must sum to 1.

A Toy Example

(The example grammar is shown as a figure in the slides and is not captured in this text preview.)

How Can We Use These?

In a top-down parser, we can follow the more likely path first. In a bottom-up parser, we can build all the constituents and then compare them.

The Probability of Some Parse T

  P(T) = ∏_{n ∈ T} p(r(n))

where p(r(n)) is the probability that rule r(n) applies to expand the nonterminal n. Note the independence assumption. So what we want is:

  T̂(S) = argmax_{T ∈ τ(S)} P(T)

where τ(S) is the set of possible parses for S.

An Example

Can you book TWA flights?
(The two parse trees are shown as figures and are not captured in this text preview.)

An Example – The Probabilities

The two parses come out at 1.5 × 10⁻⁶ and 1.7 × 10⁻⁶. Note how small the probabilities are, even with this tiny grammar.
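
To make this concrete, here is a minimal Python sketch of a PCFG and of scoring a parse as ∏_{n ∈ T} p(r(n)). The grammar, probabilities, and tree encoding are all invented for illustration (this is not the slides' toy grammar), and p_parse/best_parse are hypothetical helper names.

from math import prod

# A PCFG as a map from (lhs, rhs) to P(lhs -> rhs | lhs).  Rules and
# numbers are made up; for each lhs the probabilities sum to 1.
PCFG = {
    ("S", ("NP", "VP")): 0.8,
    ("S", ("Aux", "NP", "VP")): 0.2,
    ("NP", ("Pronoun",)): 0.8,
    ("NP", ("Noun",)): 0.2,
    ("VP", ("Verb", "NP")): 1.0,
    ("Pronoun", ("you",)): 1.0,
    ("Verb", ("book",)): 1.0,
    ("Noun", ("flights",)): 1.0,
}

def p_parse(tree):
    """P(T) = product over nodes n of p(r(n)): one factor per rule
    application, each assumed independent of the rest of the tree."""
    if isinstance(tree, str):          # a leaf word contributes no rule
        return 1.0
    lhs, children = tree[0], tree[1:]
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    return PCFG.get((lhs, rhs), 0.0) * prod(map(p_parse, children))

def best_parse(parses):
    """T-hat(S) = argmax over T in tau(S) of P(T), given candidates."""
    return max(parses, key=p_parse)

t = ("S", ("NP", ("Pronoun", "you")),
          ("VP", ("Verb", "book"), ("NP", ("Noun", "flights"))))
print(p_parse(t))    # 0.8 * 0.8 * 1.0 * 1.0 * 1.0 * 0.2 * 1.0 = 0.128

Summing p_parse over all parses of S, instead of maximizing, gives exactly the language-modeling quantity in the next section.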
Using Probabilities for Language Modeling

Since there are fewer grammar rules than there are word sequences, it can be useful, in language modeling, to use grammar probabilities instead of flat n-gram frequencies. The probability of some sentence S is then the sum of the probabilities of its possible parses:

  P(S) = Σ_{T ∈ τ(S)} P(T)

Contrast this with the n-gram chain rule:

  P(S) = P(w1) P(w2 | w1) P(w3 | w1 w2) P(w4 | w1 w2 w3) …

Adding Probabilities to a Parser

• Adding probabilities to a top-down parser, e.g., Earley: this is easy; since we are going top-down, we can choose which rule to prefer.
• Adding probabilities to a bottom-up parser: at each step, build the pieces, then attach probabilities to them. (A probabilistic CKY sketch appears at the end of these notes.)

Limitations to Attaching Probabilities Just to Rules

Sometimes it is enough to know that one rule applies more often than another:
  Can you book TWA flights?
But often it matters what the context is. Consider:
  S  → NP VP
  NP → Pronoun  [.8]
  NP → LexNP    [.2]
But when the NP is the subject, the true probability of a pronoun is .91; when the NP is the direct object, it is .34. (A sketch of context-conditioned rule probabilities appears at the end of these notes.)

Often the Probabilities Depend on Lexical Choices

I saw the Statue of Liberty flying over New York.
I saw a plane flying over New York.
Workers dumped sacks into a bin.
Workers dumped sacks of potatoes.
John hit the ball with the bat.
John hit the ball with the autograph.
Visiting relatives can be trying.
Visiting museums can be trying.
There were dogs in houses and cats.
There were dogs in houses and cages.

The Dogs in Houses Example

The problem is that both parses use the same rules, so they will be assigned the same probability.

The Fix – Use the Lexicon

The lexicon is an approximation to a knowledge base. It lets us treat "into" and "of" differently with respect to dumping, without any clue what dumping means or what "into" and "of" mean. Note the difference between this approach and subcategorization rules, e.g.:

  dump  [SUBCAT NP] [SUBCAT LOCATION]

Subcategorization rules specify requirements, not preferences.

Lexicalized Trees

Key idea: each constituent has a HEAD word. (The lexicalized trees are shown as figures and are not captured in this text preview.)

Adding Lexical Items to the Rules

  VP(dumped) → VBD(dumped) NP(sacks) PP(into)   3 × 10⁻¹⁰
  VP(dumped) → VBD(dumped) NP(cats)  PP(into)   8 × 10⁻¹⁰
  VP(dumped) → VBD(dumped) NP(hats)  PP(into)   4 × 10⁻¹⁰
  VP(dumped) → VBD(dumped) NP(sacks) PP(above)  1 × 10⁻¹²

We need fewer numbers than we would for n-gram frequencies:
  The workers dumped sacks of potatoes into a bin.
  The workers dumped sacks of onions into a bin.
  The workers dumped all the sacks of potatoes into a bin.
But there are still too many, and most will be 0 in any given corpus.

Collapsing These Cases

Instead of caring about specific rules like:
  VP(dumped) → VBD(dumped) NP(sacks) PP(into)   3 × 10⁻¹⁰
or about very general rules like:
  VP → VBD NP PP
we will do something partway in between:
  VP(dumped) → VBD NP PP   with probability p(r(n) | n, h(n))

Computing Probabilities of Heads

We will let the probability of some node n having head h depend on two factors:
• the syntactic category of the node, and
• the head of the node's mother, h(m(n)).
So we will compute:

  P(h(n) = word_i | n, h(m(n)))

For example (from the figure on this slide):
  PP headed by "into", mother VP(dumped): p = p1
  PP headed by "of",   mother VP(dumped): p = p2
  PP headed by "of",   mother NP(sacks):  p = p3

So now we've got … (the text preview breaks off here)
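The head and rule probabilities above are typically estimated by counting events in a treebank. Here is a hedged Python sketch of that estimation; the event records, categories, and counts are invented, and p_head/p_rule are hypothetical names for raw maximum-likelihood ratios.

from collections import Counter

# Hypothetical head-annotated events, as if harvested from a treebank.
# Each record is (category, head, mother_head, rhs).  Illustrative only.
events = [
    ("PP", "into",   "dumped", ("P", "NP")),
    ("PP", "of",     "dumped", ("P", "NP")),
    ("PP", "of",     "sacks",  ("P", "NP")),
    ("VP", "dumped", "ROOT",   ("VBD", "NP", "PP")),
]

head_counts = Counter()   # (cat, mother_head, head)
mother_ctx  = Counter()   # (cat, mother_head)
rule_counts = Counter()   # (cat, head, rhs)
head_ctx    = Counter()   # (cat, head)

for cat, head, mother_head, rhs in events:
    head_counts[(cat, mother_head, head)] += 1
    mother_ctx[(cat, mother_head)] += 1
    rule_counts[(cat, head, rhs)] += 1
    head_ctx[(cat, head)] += 1

def p_head(head, cat, mother_head):
    """MLE of P(h(n) = head | n = cat, h(m(n)) = mother_head)."""
    denom = mother_ctx[(cat, mother_head)]
    return head_counts[(cat, mother_head, head)] / denom if denom else 0.0

def p_rule(rhs, cat, head):
    """MLE of p(r(n) | n, h(n)): how cat expands, given its head."""
    denom = head_ctx[(cat, head)]
    return rule_counts[(cat, head, rhs)] / denom if denom else 0.0

print(p_head("into", "PP", "dumped"))    # 0.5 on this toy sample

On real data these raw ratios are sparse, so practical parsers smooth them, e.g., by backing off from p(r(n) | n, h(n)) to the plain p(r(n) | n).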

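As promised under "Adding Probabilities to a Parser", here is a sketch of the bottom-up route: probabilistic (Viterbi) CKY, which builds every constituent over every span and keeps the best probability per nonterminal. The grammar is a made-up PCFG in Chomsky normal form, and viterbi_cky is a hypothetical name.

# Hypothetical CNF grammar; every number here is made up, and for each
# left-hand side the rule probabilities sum to 1.
BINARY = {                      # (B, C) -> [(A, P(A -> B C | A)), ...]
    ("NP", "VP"):    [("S", 1.0)],
    ("Verb", "NP"):  [("VP", 1.0)],
    ("Det", "Noun"): [("NP", 0.5)],
}
LEXICAL = {                     # word -> [(A, P(A -> word | A)), ...]
    "workers": [("NP", 0.5)],
    "dumped":  [("Verb", 1.0)],
    "the":     [("Det", 1.0)],
    "sacks":   [("Noun", 1.0)],
}

def viterbi_cky(words):
    """Bottom-up: fill a chart of spans, keeping the best probability
    for each nonterminal, and compare the finished constituents."""
    n = len(words)
    chart = [[{} for _ in range(n + 1)] for _ in range(n + 1)]
    for i, w in enumerate(words):                  # width-1 spans
        for a, p in LEXICAL.get(w, []):
            chart[i][i + 1][a] = p
    for width in range(2, n + 1):                  # wider spans
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):              # split point
                for b, pb in chart[i][k].items():
                    for c, pc in chart[k][j].items():
                        for a, pr in BINARY.get((b, c), []):
                            cand = pr * pb * pc
                            if cand > chart[i][j].get(a, 0.0):
                                chart[i][j][a] = cand
    return chart[0][n].get("S", 0.0)   # best parse probability, or 0

print(viterbi_cky(["workers", "dumped", "the", "sacks"]))   # 0.25

Replacing "keep the max" with "sum the candidates" turns this into the inside algorithm, whose root value is exactly P(S) = Σ_{T ∈ τ(S)} P(T).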

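And as promised under "Limitations to Attaching Probabilities Just to Rules", one way to repair the pronoun example is to condition the NP rule probability on context, here the grammatical role. The counts below are invented so the ratios come out at the slide's .91 and .34; splitting the NP symbol itself (e.g., NP-subj vs. NP-obj) achieves the same effect inside an ordinary PCFG.

from collections import Counter

# Invented counts of how NP is realized in each grammatical role,
# chosen so the ratios reproduce the .91 / .34 contrast above.
NP_COUNTS = Counter({
    ("subject", "Pronoun"): 91, ("subject", "LexNP"): 9,
    ("object",  "Pronoun"): 34, ("object",  "LexNP"): 66,
})

def p_np_rule(role, rhs):
    """P(NP -> rhs | NP filling the given role), by relative frequency."""
    total = sum(c for (r, _), c in NP_COUNTS.items() if r == role)
    return NP_COUNTS[(role, rhs)] / total

print(p_np_rule("subject", "Pronoun"))   # 0.91
print(p_np_rule("object",  "Pronoun"))   # 0.34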