UT CS 388 - Vector space models of word meaning


Vector space models of word meaning
Katrin Erk

Geometric interpretation of lists of feature-value pairs
- In cognitive science, a concept can be represented through a list of feature-value pairs.
- Geometric interpretation: treat each feature as a dimension and each value as the coordinate on that dimension. A list of feature-value pairs can then be viewed as a point in space.
- Example (Gärdenfors): color represented through three dimensions, (1) brightness, (2) hue, (3) saturation.

Where do the features come from?
- How can we construct geometric meaning representations for a large number of words?
- Have a lexicographer come up with features: a lot of work.
- Run an experiment and have subjects list features: also a lot of work.
- Is there any way of coming up with features and feature values automatically?

Vector spaces: representing word meaning without a lexicon
- Context words are a good indicator of a word's meaning.
- Take a corpus, for example Austen's Pride and Prejudice, and a word, for example "letter". Count how often each other word co-occurs with "letter" in a context window of 10 words on either side.
- Some co-occurrence counts for "letter" in Pride and Prejudice: jane 12, when 14, by 15, which 16, him 16, with 16, elizabeth 17, but 17, he 17, be 18, s 20, on 20, not 21, for 21, mr 22, this 23, as 23, you 25, from 28, i 28, had 32, that 33, in 34, was 34, it 35, his 36, she 41, her 50, a 52, and 56, of 72, to 75, the 102.

Using context words as features, co-occurrence counts as values
- Count co-occurrences for multiple target words and arrange them in a table of target words (rows) against context words (columns).
- Each target word then has a vector of counts: the context words are the dimensions, and the co-occurrence counts are the coordinates.
- For each target word, its co-occurrence counts define a point in vector space.

Vector space representations
- View "letter" and "surprise" as vectors (points) in vector space, and measure the similarity between them as distance in that space.

What have we gained?
- The representation of a target word in context space can be computed completely automatically from a large amount of text.
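The counting step described above can be sketched in a few lines of Python. This is a toy illustration, not code from the slides: a one-sentence corpus stands in for the novel, a window of 3 stands in for the 10-word window, and all names are my own.

```python
from collections import Counter

def cooccurrence_vector(tokens, target, window=10):
    """Count how often each word co-occurs with `target`
    within `window` tokens on either side."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != target:
            continue
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                counts[tokens[j]] += 1
    return counts

# Toy corpus; the slides use the full text of Pride and Prejudice
# and a 10-word window instead.
text = "she read the letter from jane and the letter made her cry"
vec = cooccurrence_vector(text.split(), "letter", window=3)
```

Running this over a whole novel for many target words yields exactly the target-word-by-context-word table of counts described next.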
- As it turns out, similarity of vectors in context space is a good predictor of semantic similarity: words that occur in similar contexts tend to be similar in meaning.
- The dimensions are not meaningful by themselves, in contrast to dimensions like hue, brightness, and saturation for color, which raises the question of the cognitive plausibility of such a representation.

What do we mean by similarity of vectors?
- Euclidean distance, for example between the vectors for "letter" and "surprise".
- Cosine similarity: the cosine of the angle between the two vectors.

Parameters of vector space models
- W. Lowe (2001), "Towards a theory of semantic space", defines a semantic space as a tuple (A, B, S, M):
- B, the base elements: we have seen context words.
- A, a mapping from raw co-occurrence counts to something else, for example to correct for frequency effects: we shouldn't base all our similarity judgments on the fact that every word co-occurs frequently with "the".
- S, the similarity measure: we have seen cosine similarity and Euclidean distance.
- M, a transformation of the whole space to different dimensions, typically dimensionality reduction.

A variant on B, the base elements: the term x document matrix
- Represent a document as a vector of weighted terms.
- Represent a term as a vector of weighted documents.

Another variant on B, the base elements
- Dimensions are not words in a context window but dependency paths starting from the target word (Padó & Lapata 2007).

A possibility for A, the transformation of raw counts
- Problem with vectors of raw counts: distortion through the frequency of the target word.
- Weight the counts: the count on the dimension "and" will not be as informative as the count on the dimension "angry". For example, use Pointwise Mutual Information (PMI) between the target and the context word.

A possibility for M, the transformation of the whole space
- Singular Value Decomposition (SVD): dimensionality reduction.
- Latent Semantic Analysis (LSA), also called Latent Semantic Indexing (LSI): do SVD on the term x document representation to induce latent dimensions that correspond to topics a document can be about (Landauer & Dumais 1997).
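The two similarity measures above can be sketched for sparse count vectors as follows. The counts are made up for illustration (not the real Austen counts), and all names are my own.

```python
import math

def cosine(u, v):
    """Cosine similarity of sparse vectors (dicts mapping dimension -> count)."""
    dot = sum(x * v.get(k, 0) for k, x in u.items())
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def euclidean(u, v):
    """Euclidean distance between sparse vectors."""
    dims = set(u) | set(v)
    return math.sqrt(sum((u.get(k, 0) - v.get(k, 0)) ** 2 for k in dims))

# Made-up count vectors for two target words.
letter = {"her": 50, "jane": 12, "wrote": 4}
surprise = {"her": 30, "jane": 5, "great": 8}

# Cosine ignores overall magnitude: doubling every count
# (e.g. counting over twice as much text) leaves it unchanged,
# while the Euclidean distance changes.
letter_doubled = {k: 2 * x for k, x in letter.items()}
```

The scale-invariance of cosine is one reason it is often preferred over Euclidean distance when raw counts are distorted by word frequency, the problem the mapping A is meant to address.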
Using similarity in vector spaces: search (information retrieval)
- Given a query and a document collection, use the term x document representation: each document is a vector of weighted terms.
- Represent the query as a vector of weighted terms as well.
- Retrieve the documents that are most similar to the query.

Using similarity in vector spaces: finding synonyms
- Synonyms tend to have more similar vectors than non-synonyms, because synonyms occur in the same contexts.
- But the same holds for antonyms: in vector spaces, "good" and "evil" are more or less the same.
- So vector spaces can be used to build a thesaurus automatically.

Using similarity in vector spaces: cognitive science
- Predicting human judgments of how similar pairs of words are on a scale of 1 to 10.
- Priming.

An automatically extracted thesaurus
- Dekang Lin (1998): for each word, automatically extract similar words.
- Vector space representation based on the syntactic context of the target (dependency parses); similarity measure based on mutual information (Lin's measure).
- A large thesaurus that is used often in NLP applications.

Automatically inducing word senses
- All the models discussed up to now use one vector per word (word type).
- Schütze (1998): one vector per word occurrence (token), e.g.
  - She wrote an angry letter to her niece.
  - He sprayed the word in big letters.
  - The newspaper gets 100 letters from readers every day.
- Make a token vector by adding up the vectors of all other content words in the sentence.
- Cluster the token vectors: the clusters are induced word senses.

Summary: vector space models
- Count the words, parse-tree snippets, or documents in which the target word occurs.
- View the context items as dimensions and the target word as a vector (a point) in semantic space.
- Distance in semantic space models similarity between words.
- Uses: search, inducing ontologies, modeling human judgments of word similarity.
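The retrieval use of the term x document representation can be sketched like this. The documents are toy examples, raw counts stand in for the weighted terms a real system would use (e.g. tf-idf), and all names are my own.

```python
import math

def cosine(u, v):
    """Cosine similarity of sparse vectors (dicts mapping term -> weight)."""
    dot = sum(x * v.get(k, 0) for k, x in u.items())
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def term_vector(text):
    """Bag-of-words vector with raw counts (a real system would
    weight the terms, e.g. with tf-idf)."""
    vec = {}
    for tok in text.lower().split():
        vec[tok] = vec.get(tok, 0) + 1
    return vec

# Toy document collection.
docs = {
    "d1": "the letter from jane arrived this morning",
    "d2": "stock markets fell sharply on tuesday",
    "d3": "she wrote a long letter to her sister",
}

# The query is represented in the same space as the documents,
# and documents are ranked by their similarity to it.
query = term_vector("letter from her sister")
ranked = sorted(docs, key=lambda d: cosine(query, term_vector(docs[d])),
                reverse=True)
```

The document sharing the most query terms ranks first and the unrelated one last, which is the whole retrieval scheme in miniature.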

