CMSC 723 / LING 645: Intro to Computational Linguistics
November 3, 2004

Administrivia
  - Assignment 2 extension: now due one week later, November 17, 2004

Lecture 9 (Dorr)
  - Word Classes, POS Tagging (Chapter 8)
  - Intro to Syntax (start Chapter 9)
  Prof. Bonnie J. Dorr, Dr. Christof Monz
  TA: Adam Lee

Word Classes and Part-of-Speech Tagging
  - Definition and Example
  - Motivation
  - Word Classes
  - Rule-based Tagging
  - Stochastic Tagging
  - Transformation-Based Tagging
  - Tagging Unknown Words

Definition
  "The process of assigning a part-of-speech or other lexical class marker to each word in a corpus" (Jurafsky and Martin)

  WORDS: the girl kissed the boy on the cheek
  TAGS:  N, V, P, DET

An Example
  WORD    the   girl   kissed   the   boy    on     the   cheek
  LEMMA   the   girl   kiss     the   boy    on     the   cheek
  TAG     DET   NOUN   VPAST    DET   NOUN   PREP   DET   NOUN

  From: http://www.xrce.xerox.com/competencies/content-analysis/fsnlp/tagger.en.html

Motivation
  - Speech synthesis: pronunciation
  - Speech recognition: class-based N-grams
  - Information retrieval: stemming, selection of high-content words
  - Word-sense disambiguation
  - Corpus analysis of language and lexicography

Word Classes
  - Basic word classes: Noun, Verb, Adjective, Adverb, Preposition, ...
  - POS based on morphology and syntax
  - Open vs. closed classes
      Open:   Nouns, Verbs, Adjectives, Adverbs
      Closed: determiners: a, an, the
              pronouns: she, he, I
              prepositions: on, under, over, near, by, ...

Open Class Words
  - Every known human language has nouns and verbs
  - Nouns: people, places, things
      Classes of nouns: proper vs. common, count vs. mass
  - Verbs: actions and processes
  - Adjectives: properties, qualities
  - Adverbs: hodgepodge!
      Unfortunately, John walked home extremely slowly yesterday

Closed Class Words
  - Idiosyncratic
  - Examples:
      prepositions: on, under, over, ...
      particles: up, down, on, off, ...
      determiners: a, an, the, ...
      pronouns: she, who, I, ...
      conjunctions: and, but, or, ...
      auxiliary verbs: can, may, should, ...
      numerals: one, two, three, third, ...

[Table: Prepositions from CELEX]
[Table: English Single-Word Particles]
[Table: Pronouns in CELEX]
[Table: Conjunctions]
[Table: Auxiliaries]

Word Classes: Tag set example
[Figure: tag set example]

Word Classes: Tag Sets
  - Vary in number of tags: a dozen to over 200
  - Size of tag sets depends on language, objectives and purpose
  - Some tagging approaches (e.g., constraint-grammar based) make fewer distinctions, e.g., conflating prepositions, conjunctions, particles
  - Simple morphology = more ambiguity = fewer tags

Example of Penn Treebank Tagging of Brown Corpus Sentence
  The/DT grand/JJ jury/NN commented/VBD on/IN a/DT number/NN of/IN other/JJ topics/NNS ./.
  Book/VB that/DT flight/NN ./.
  Does/VBZ that/DT flight/NN serve/VB dinner/NN ?/.

The Problem
  - Words often have more than one word class, e.g., "this":
      This is a nice day      = PRP
      This day is nice        = DT
      You can go this far     = RB

Word Class Ambiguity in the Brown Corpus (DeRose, 1988)
  Unambiguous (1 tag)       35,340
  Ambiguous (2-7 tags)       4,100
      2 tags                 3,760
      3 tags                   264
      4 tags                    61
      5 tags                    12
      6 tags                     2
      7 tags                     1

Part-of-Speech Tagging
  - Rule-Based Tagger: ENGTWOL
  - Stochastic Tagger: HMM-based
  - Transformation-Based Tagger: Brill

Rule-Based Tagging
  - Basic idea:
      Assign all possible tags to words
      Remove tags according to a set of rules of the type:
        if word+1 is an adj, adv, or quantifier, and the following is a sentence boundary, and word-1 is not a verb like "consider",
        then eliminate non-adv, else eliminate adv.
  - Typically more than 1000 hand-written rules, but may be machine-learned

[Figure: Sample ENGTWOL Lexicon]

Stage 1 of ENGTWOL Tagging
  - First stage: run words through a Kimmo-style morphological analyzer to get all parts of speech.
  - Example: Pavlov had shown that salivation ...

      Pavlov       PAVLOV N NOM SG PROPER
      had          HAVE V PAST VFIN SVO
                   HAVE PCP2 SVO
      shown        SHOW PCP2 SVOO SVO SV
      that         ADV
                   PRON DEM SG
                   DET CENTRAL DEM SG
                   CS
      salivation   N NOM SG

Stage 2 of ENGTWOL Tagging
  - Second stage: apply constraints.
  - Constraints are used in a negative way (they eliminate tags).
  - Example: adverbial "that" rule

      Given input: "that"
      If   (+1 A/ADV/QUANT)
           (+2 SENT-LIM)
           (NOT -1 SVOC/A)
      Then eliminate non-ADV tags
      Else eliminate ADV

Stochastic Tagging
  - Based on the probability of a certain tag occurring, given various possibilities
  - Necessitates a training corpus
  - No probabilities for words not in corpus
  - Training corpus may be too different from
    test corpus
  - Simple method: choose the most frequent tag in the training text for each word
      Result: 90% accuracy
      Why? Baseline: others will do better
      HMM is an example

HMM Tagger
  - Intuition: pick the most likely tag for this word.
  - HMM taggers choose the tag sequence that maximizes this formula:
      P(word | tag) × P(tag | previous n tags)
  - Let $T = t_1, t_2, \ldots, t_n$
    Let $W = w_1, w_2, \ldots, w_n$
  - Find POS tags that generate a sequence of words, i.e., look for the most probable sequence of tags T underlying the observed words W.

Start with Bigram-HMM Tagger
  $\arg\max_T P(T \mid W)$
  $= \arg\max_T P(T)\, P(W \mid T)$
  $= \arg\max_{t_1 \ldots t_n} P(t_1 \ldots t_n)\, P(w_1 \ldots w_n \mid t_1 \ldots t_n)$
  $\approx \arg\max_{t_1 \ldots t_n} \left[ P(t_1)\, P(t_2 \mid t_1) \cdots P(t_n \mid t_{n-1}) \right] \left[ P(w_1 \mid t_1)\, P(w_2 \mid t_2) \cdots P(w_n \mid t_n) \right]$

  The last step uses the bigram assumption (each tag depends only on the previous tag) and assumes each word depends only on its own tag.

  - To tag a single word: $t_i = \arg\max_j P(t_j \mid t_{i-1})\, P(w_i \mid t_j)$
  - How do we compute $P(t_i \mid t_{i-1})$?
      $c(t_{i-1}\, t_i) \,/\, c(t_{i-1})$
  - How do we compute $P(w_i \mid t_i)$?
      $c(w_i, t_i) \,/\, c(t_i)$
  - How do we compute the most probable tag sequence?
      Viterbi

An Example
  Secretariat/NNP is/VBZ expected/VBN to/TO race/VB tomorrow/NN
  People/NNS continue/VBP to/TO inquire/VB the/DT reason/NN for/IN the/DT race/NN for/IN outer/JJ space/NN

  to/TO race/???
  the/DT race/???

  $t_i = \arg\max_j P(t_j \mid t_{i-1})\, P(w_i \mid t_j)$
  max[ P(VB|TO) × P(race|VB), P(NN|TO) × P(race|NN) ]

  Brown:
      P(NN|TO) = .021      P(race|NN) = .00041
      P(VB|TO) = .34       P(race|VB) = .00003

      P(NN|TO) × P(race|NN) ≈ .0000086 (given as .000007 on the slide)
      P(VB|TO) × P(race|VB) ≈ .00001

  So "race" after "to" is tagged VB.

An Early Approach to Statistical POS Tagging
  - PARTS tagger (Church, 1988): stores the probability of tag given word instead of word given tag:
      P(tag | word) × P(tag | previous n tags)
  - Compare to:
      P(word | tag) × P(tag | previous n tags)
  - Consider this alternative (on your own).

  http://www.comp.lancs.ac.uk/ucrel/claws/trial.html

Transformation-Based Tagging (Brill Tagging)
  - Combination of rule-based and stochastic tagging methodologies
      Like rule-based because rules are used to specify tags in a certain environment
      Like the stochastic approach because machine learning is used, with a tagged corpus as input
  - Input:
      tagged corpus
      dictionary (with most frequent tags)

TBL Rule Application
  - Tagger labels every word with its most likely tag
      For example, "race" has the following probabilities in the Brown corpus:
        P(NN|race) = .98
        P(VB|race) = .02
  - Transformation rules make changes to tags
      Change NN to VB
      when the previous tag is TO:
        ... is/VBZ expected/VBN to/TO race/NN tomorrow/NN
      becomes
        ... is/VBZ expected/VBN to/TO race/VB tomorrow/NN

TBL: The Algorithm
  - Step 1: Label every word with its most likely tag (from the dictionary)
  - Step 2: Check every possible transformation and select the one which most improves tagging
  - Step 3: Re-tag the corpus applying the rules
  - Repeat 2-3 until some criterion is reached, e.g., X% correct with respect to the training corpus
  - RESULT: sequence of transformation rules

Transformation-Based Tagging: Learning Algorithm
  Basic
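The three TBL steps listed above can be sketched in a few lines of Python. This is only a minimal illustration, not Brill's actual implementation: it assumes a single rule template ("change tag A to tag B when the previous tag is P"), whereas a real transformation-based tagger uses many templates, and the names `apply_rule` and `learn_rules` are hypothetical. The toy data is the "to race" fragment from the slides.

```python
def apply_rule(tags, rule):
    """Apply one rule of the form (from_tag, to_tag, prev_tag) to a tag list."""
    from_tag, to_tag, prev_tag = rule
    out = list(tags)
    for i in range(1, len(tags)):
        # Conditions are checked against the tags as they were before this rule.
        if tags[i] == from_tag and tags[i - 1] == prev_tag:
            out[i] = to_tag
    return out


def learn_rules(initial, gold, tagset, max_rules=10):
    """Greedy learner: repeatedly keep the rule that removes the most errors."""
    def errors(tags):
        return sum(t != g for t, g in zip(tags, gold))

    # Every instantiation of the single (assumed) template.
    candidates = [(a, b, p) for a in tagset for b in tagset if a != b for p in tagset]
    current, learned = list(initial), []
    for _ in range(max_rules):
        best = min(candidates, key=lambda r: errors(apply_rule(current, r)))
        if errors(apply_rule(current, best)) >= errors(current):
            break  # stopping criterion: no rule improves the tagging
        current = apply_rule(current, best)  # Step 3: re-tag with the new rule
        learned.append(best)
    return learned, current


# Step 1: label every word with its most likely tag ("race" is most often NN).
words = ["is", "expected", "to", "race", "tomorrow"]
most_likely = {"is": "VBZ", "expected": "VBN", "to": "TO", "race": "NN", "tomorrow": "NN"}
initial = [most_likely[w] for w in words]
gold = ["VBZ", "VBN", "TO", "VB", "NN"]

# Steps 2 and 3: search the candidate rules and re-tag.
learned, retagged = learn_rules(initial, gold, ["VBZ", "VBN", "TO", "NN", "VB"])
print(learned)
print(retagged)
```

On this five-word fragment the only rule that removes the error is the one from the slide, "change NN to VB when the previous tag is TO". The output of learning is an ordered list of rules, which matches the RESULT line of the algorithm.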
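For comparison with the TBL treatment of "race", the bigram HMM decision for the same word (to/TO race/???) can be checked numerically. The sketch below only plugs the four Brown-corpus probabilities quoted in the lecture into $t_i = \arg\max_j P(t_j \mid t_{i-1})\, P(w_i \mid t_j)$; the dictionary layout and the helper name `best_tag` are assumptions made for illustration.

```python
# Probabilities quoted in the lecture (Brown corpus estimates).
transition = {("TO", "VB"): 0.34, ("TO", "NN"): 0.021}          # P(tag | previous tag)
emission = {("race", "VB"): 0.00003, ("race", "NN"): 0.00041}   # P(word | tag)


def best_tag(word, prev_tag, candidates):
    """Return the candidate tag maximizing P(tag | prev_tag) * P(word | tag)."""
    scores = {t: transition[(prev_tag, t)] * emission[(word, t)] for t in candidates}
    return max(scores, key=scores.get), scores


tag, scores = best_tag("race", "TO", ["VB", "NN"])
print(tag)
for t, s in scores.items():
    print(t, f"{s:.3g}")
```

Even though NN is by far the more frequent tag for "race" overall, the high probability of a verb following TO tips the product in favor of VB.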