600.465 - Intro to NLP - J. Eisner

Slide 1: Splitting Words, a.k.a. "Word Sense Disambiguation"

Slides 2-8: examples (slides courtesy of D. Yarowsky)

Slide 9: Representing a Word as a Vector
- Could average over many occurrences of the word ...
- Each word type has a different vector?
- Each word token has a different vector?
- Each word sense has a different vector?
  (for this one, we need sense-tagged training data)
  (is this more like a type vector or a token vector?)
- What is each of these good for?

Slide 10: Each word type has a different vector
- We saw this yesterday
- It's good for grouping words
  - similar semantics?
  - similar syntax?
  - depends on how you build the vector

Slide 11: Each word token has a different vector
- Good for splitting words - unsupervised WSD
- Cluster the tokens: each cluster is a sense!
- Example contexts for "party":
  have turned it into the hot dinner-party topic. The comedy is the
  selection for the World Cup party, which will be announced on May 1
  the by-pass there will be a street party. "Then," he says, "we are going
  in the 1983 general election for a party which, when it could not bear to
  to attack the Scottish National Party, who look set to seize Perth and
  number-crunchers within the Labour party, there now seems little doubt
  that had been passed to a second party who made a financial decision
  A future obliges each party to the contract to fulfil it by

Slide 12: Each word sense has a different vector
- Represent each new word token as a vector, too
- Now assign each token the closest sense
- (could lump together all tokens of the word in the same document: assume they all have the same sense)
- Example contexts for "party":
  have turned it into the hot dinner-party topic. The comedy is the
  selection for the World Cup party, which will be announced on May 1
  the by-pass there will be a street party. "Then," he says, "we are going
  let you know that there's a party at my house tonight. Directions: Drive
  in the 1983 general election for a party which, when it could not bear to
  to attack the Scottish National Party, who look set to seize Perth and
  number-crunchers within the Labour party, there now seems little doubt

Slide 13: Where can we get sense-labeled training data?
- To do supervised WSD, we need many examples of each sense in context
- (same example contexts as on the previous slide)

Slide 14: Where can we get sense-labeled training data?
- To do supervised WSD, we need many examples of each sense in context
- Sources of sense-labeled training text:
  - Human-annotated text - expensive
  - Bilingual text (Rosetta stone) - can figure out which sense of "plant" is meant by how it translates
  - Dictionary definition of a sense is one sample context
  - Roget's thesaurus entry of a sense is one sample context
  (the last two give hardly any data per sense - but we'll use them later to get unsupervised training started)

Slide 15: A problem with the vector model
- Bad idea to treat all context positions equally
- Possible solutions:
  - Faraway words don't count as strongly?
  - Words in different positions relative to "plant" are different elements of the vector?
    (i.e., (pesticide, -1) and (pesticide, +1) are different features)
  - Words in different syntactic relationships to "plant" are different elements of the vector?

Slide 16: Just one cue is sometimes enough ... (slide courtesy of D. Yarowsky, modified)

Slide 17: An assortment of possible cues ... (slide courtesy of D. Yarowsky, modified)
- generates a whole bunch of potential cues - use data to find out which ones work best

Slide 18: An assortment of possible cues ... (slide courtesy of D. Yarowsky, modified)
- merged ranking of all cues of all these types
- only a weak cue ... but we'll trust it if there's nothing better

Slide 19: Final decision list for "lead" (abbreviated) (slide courtesy of D. Yarowsky, modified)
- To disambiguate a token of "lead":
  - Scan down the sorted list
  - The first cue that is found gets to make the decision all by itself
  - Not as subtle as combining cues, but works well for WSD
- A cue's score is its log-likelihood ratio:
  log [ p(cue | sense A) [smoothed] / p(cue | sense B) ]

Slide 20: Unsupervised learning!
- Very readable paper at http://cs.jhu.edu/~yarowsky/acl95.ps - sketched on the following slides ...

Slide 21: First, find a pair of "seed words" that correlate well with the 2 senses
- If "plant" really has 2 senses, it should appear in:
  - 2 dictionary entries: pick content words from those
  - 2 thesaurus entries: pick synonyms from those
  - 2 different clusters of documents: pick representative words from those
  - 2 translations in parallel text: use the translations as seed words
- Or just have a human name the seed words (maybe from a list of words that occur unusually often near "plant") - make that "minimally supervised"

Slide 22: Yarowsky's bootstrapping algorithm (table taken from Yarowsky (1995))
- Target word: "plant"; seed words: (life, manufacturing)
- Initially the seeds label only a sliver of the data: life (1%), manufacturing (1%), 98% still unlabeled

Slide 23: Yarowsky's bootstrapping algorithm (figure taken from Yarowsky (1995))
- Learn a classifier that distinguishes sense A from sense B.
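The sense-assignment step on slide 12 can be sketched in a few lines. This is a toy illustration, not the course's implementation: the contexts, the sense names, and the choice of raw-count vectors with cosine similarity are all my assumptions. Each sense vector is just the average of the context vectors of its sense-tagged tokens, and a new token of "party" is assigned to the closest sense.

```python
import math
from collections import Counter

def context_vector(tokens, target):
    """Bag-of-words count vector over the context, excluding the target word."""
    return Counter(t for t in tokens if t != target)

def average(vectors):
    """Componentwise mean of count vectors: the sense vector for a sense."""
    total = Counter()
    for v in vectors:
        total.update(v)
    n = len(vectors)
    return {w: c / n for w, c in total.items()}

def cosine(u, v):
    dot = sum(u[w] * v.get(w, 0.0) for w in u)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def closest_sense(token_context, sense_vectors, target="party"):
    """Assign a new token to the sense whose vector is closest."""
    v = context_vector(token_context, target)
    return max(sense_vectors, key=lambda s: cosine(v, sense_vectors[s]))

# Toy sense-tagged contexts (invented for illustration).
social = [
    "a street party with music and dancing".split(),
    "a dinner party at my house tonight".split(),
]
political = [
    "the labour party won the general election".split(),
    "the national party will contest the election".split(),
]
sense_vectors = {
    "social": average([context_vector(c, "party") for c in social]),
    "political": average([context_vector(c, "party") for c in political]),
}
print(closest_sense("directions to the party at my house tonight".split(),
                    sense_vectors))  # prints "social"
```

Lumping together all tokens of the word in one document, as the slide suggests, would just mean averaging their context vectors before calling `closest_sense`.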
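The decision-list procedure on slide 19 can be sketched as follows. The training sentences, the cue definition (one feature per context word), and the smoothing constant `alpha` are invented for illustration; Yarowsky's actual cue templates and smoothing are richer than this. The key points from the slide are preserved: each cue is scored by a smoothed log-likelihood ratio, the list is sorted by strength, and the first matching cue decides all by itself.

```python
import math
from collections import Counter

def train_decision_list(examples, alpha=0.1):
    """examples: (context_words, sense) pairs with senses "A"/"B".
    Score each cue by its smoothed log-likelihood ratio
    log[(count_A + alpha) / (count_B + alpha)] and sort, strongest first."""
    count = {"A": Counter(), "B": Counter()}
    for words, sense in examples:
        count[sense].update(set(words))
    cues = set(count["A"]) | set(count["B"])
    scored = []
    for cue in cues:
        llr = math.log((count["A"][cue] + alpha) / (count["B"][cue] + alpha))
        scored.append((abs(llr), cue, "A" if llr > 0 else "B"))
    scored.sort(reverse=True)
    return [(cue, sense) for _, cue, sense in scored]

def classify(decision_list, words, default="A"):
    """Scan down the sorted list; the first cue that is present
    makes the decision all by itself."""
    present = set(words)
    for cue, sense in decision_list:
        if cue in present:
            return sense
    return default

# Toy sense-tagged contexts for "lead" (invented):
# sense A = the metal, sense B = "being ahead".
examples = [
    ("exposure to lead paint".split(), "A"),
    ("lead pipes in old homes".split(), "A"),
    ("took the lead in the race".split(), "B"),
    ("a narrow lead in the polls".split(), "B"),
]
dl = train_decision_list(examples)
print(classify(dl, "lead poisoning from old paint".split()))  # prints "A"
```

Note how smoothing matters: a cue seen only with one sense would otherwise get an infinite ratio, and the weak cues at the bottom of the list (as slide 18 says) are trusted only when nothing stronger is present.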
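A stripped-down sketch of the bootstrapping loop from slides 21-23, under heavy assumptions: the contexts, the confidence `threshold`, and the single-strongest-cue scoring are invented, and refinements from Yarowsky (1995) such as the one-sense-per-discourse constraint and cue pruning are omitted. The loop's shape matches the slides: seed words label a tiny fraction of tokens, a cue scorer is retrained on whatever is labeled so far, and confident predictions become new labels.

```python
import math
from collections import Counter

def bootstrap(contexts, seed_a, seed_b, rounds=3, alpha=0.1, threshold=1.0):
    """contexts: word lists, each containing the ambiguous target word.
    Labels grow outward from the two seed words; tokens that never
    reach the confidence threshold stay unlabeled (None)."""
    labels = [None] * len(contexts)
    for i, words in enumerate(contexts):           # seed step: label by seed word
        if seed_a in words:
            labels[i] = "A"
        elif seed_b in words:
            labels[i] = "B"
    for _ in range(rounds):
        count = {"A": Counter(), "B": Counter()}   # retrain cue counts on labeled data
        for words, lab in zip(contexts, labels):
            if lab is not None:
                count[lab].update(set(words))
        for i, words in enumerate(contexts):       # relabel only confident tokens
            if labels[i] is None:
                best = 0.0
                for cue in set(words):
                    llr = math.log((count["A"][cue] + alpha) /
                                   (count["B"][cue] + alpha))
                    if abs(llr) > abs(best):
                        best = llr
                if abs(best) >= threshold:
                    labels[i] = "A" if best > 0 else "B"
    return labels

# Toy contexts for "plant" (invented); seeds as on slide 22: life / manufacturing.
contexts = [
    "plant and animal life in the forest".split(),
    "the manufacturing plant closed its assembly line".split(),
    "animal species depend on plant growth in the forest".split(),
    "workers at the plant ran the assembly line".split(),
]
print(bootstrap(contexts, "life", "manufacturing"))  # prints "['A', 'B', 'A', 'B']"
```

In this toy run, "forest" and "animal" spread the A label to the third context and "assembly"/"line" spread B to the fourth, even though neither contains a seed word: that cascade is exactly the unsupervised learning the slides advertise.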


Johns Hopkins EN 600 465 - Splitting Words
