CMSC723/LING645 FINAL REVIEW: 12/08/04

1. Part-of-Speech Tagging: What is the difference between open and closed word classes? What is the relation between these classes and word frequency in arbitrary documents? POS Ambiguity: How do we tell the difference between “that” and “that” in “I consider that odd”?

2. Rule-Based Tagger: Provide some rules to correctly tag a word with an appropriate POS given its context. Compare this process (of POS tagging) to context-free parsing. (What do they have in common? How do they differ?)

3. Stochastic Tagger: Why should the HMM approach do better than choosing the most frequent tag?

4. Consider the following data:
   a. Secretariat/NNP is/VBZ expected/VBN to/TO race/VB tomorrow/NN
   b. People/NNS continue/VBP to/TO inquire/VB the/DT reason/NN for/IN the/DT race/NN for/IN outer/JJ space/NN
   Suppose we know this from the Brown Corpus:
      P(NN|TO) = .021      P(NN|IN) = .03
      P(VB|TO) = .34       P(VB|IN) = .002
      P(race|NN) = .00041  P(race|VB) = .00003
   Find the most probable POS for a word if the preceding tag is known. For example, suppose we know the POS of the word preceding “race”. What is the most likely POS for “race” in each of these cases? Explain your answers.
      i. to/TO race/?
      ii. the/DT race/?
   Now what if the preceding tag is not known?

5. Transformation-Based Tagger (TBL): A combination of rule-based and stochastic tagging methodologies.
   a. What does it require?
   b. How does it work?
   c. What does it learn?
   d. How is it iterative?
   e. What does this system return?

6. Problem: Given a particular tagged corpus (that I present), give the highest-frequency POS for the word “race”. Show the corpus re-tagged according to highest-frequency tags. Give two example rules that might be learned by a TBL technique to change incorrectly tagged words (in the re-tagged corpus) to their correct tags.

7. Context-Free Grammars and Parsing
   Subcategorization frames:
      eat: 0
      prefer: NP
      show: NP NP
      give: NP PPto
      fly: PPfrom PPto
   What happens when there is movement? How does this complicate the CFG?
   Example: Where do you want to fly to ___?

8. How do we represent the auxiliary system in CFG?
   modal < perfect < {progressive, passive}

9. Earley: How does one retrieve a parse tree from an Earley parse?

10. Provide a first-order predicate logic representation for each of the following sentences:
   a. All of John’s friends own a car.
   b. Some of John’s friends don’t own a car.
   c. A friend or relative of John lends John a car.

11. Provide the truth table for (P∨Q)→R, for all Boolean values of P, Q, and R.

12. Lexical Semantics
   a. How is polysemy different from synonymy?
   b. What semantic relations are used in WordNet? Give an example of each relation.

13. What is the principle of compositionality? Why is it relevant?

14. What is the difference between precision and recall? Under what circumstances might one be more important than the other?

15. What are the advantages of using a parser for entity extraction and what are the problems?

16. What is a question target, and what are ways to determine
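
Study aid for problem 4: a minimal sketch of the known-previous-tag comparison, assuming the usual bigram scoring P(tag | previous tag) × P(word | tag). The function name `best_tag` and the table names are hypothetical and used only for illustration; the probabilities are the ones listed in problem 4. The sheet gives no transition probabilities from DT, so the sketch declines to answer for that context rather than guessing a value.

```python
# Probabilities copied from problem 4 (Brown Corpus estimates).
# Transition keys are (tag, previous_tag); emission keys are (word, tag).
TRANSITION = {
    ("NN", "TO"): 0.021,
    ("NN", "IN"): 0.03,
    ("VB", "TO"): 0.34,
    ("VB", "IN"): 0.002,
}
EMISSION = {
    ("race", "NN"): 0.00041,
    ("race", "VB"): 0.00003,
}


def best_tag(word, prev_tag, candidates=("NN", "VB")):
    """Return (tag, score) maximizing P(tag | prev_tag) * P(word | tag).

    Returns None when the tables lack an entry for any candidate,
    as is the case for a preceding DT in this problem.
    """
    scores = {}
    for tag in candidates:
        transition = TRANSITION.get((tag, prev_tag))
        emission = EMISSION.get((word, tag))
        if transition is None or emission is None:
            return None
        scores[tag] = transition * emission
    winner = max(scores, key=scores.get)
    return winner, scores[winner]


# Compare the candidate tags for "race" after each preceding tag.
for prev in ("TO", "IN", "DT"):
    print(prev, best_tag("race", prev))
```

With these numbers the sketch selects VB after TO and NN after IN. In both contexts the transition probability outweighs the emission probability, which by itself always favors NN.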
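
Study aid for problem 11: a small sketch that enumerates every Boolean assignment for (P∨Q)→R, assuming material implication (A→B is false only when A is true and B is false). The helper name `implies` is hypothetical and used only for illustration.

```python
from itertools import product


def implies(antecedent, consequent):
    """Material implication: false only when antecedent is true and consequent is false."""
    return (not antecedent) or consequent


# One row per assignment of (P, Q, R), with the value of (P or Q) -> R.
rows = [
    (p, q, r, implies(p or q, r))
    for p, q, r in product([True, False], repeat=3)
]

print("P      Q      R      (P∨Q)→R")
for p, q, r, value in rows:
    print(f"{p!s:<6} {q!s:<6} {r!s:<6} {value}")
```

The formula is false exactly in the rows where R is false and at least one of P or Q is true.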