Semantic Distance
CMSC 723: Computational Linguistics I, Session #10
Jimmy Lin
The iSchool
University of Maryland
Wednesday, November 4, 2009
Material drawn from slides by Saif Mohammad and Bonnie Dorr

Progression of the Course
- Words
  - Finite-state morphology
  - Part-of-speech tagging (TBL + HMM)
- Structure
  - CFGs + parsing (CKY, Earley)
  - N-gram language models
- Meaning!

Today’s Agenda
- Lexical semantic relations
- WordNet
- Computational approaches to word similarity

Lexical Semantic Relations

What’s meaning?
- Let’s start at the word level…
- How do you define the meaning of a word?
- Look it up in the dictionary!
Well, that really doesn’t help…

Approaches to meaning
- Truth conditional
- Semantic network

Word Senses
- “Word sense” = distinct meaning of a word
- Same word, different senses
  - Homonyms (homonymy): unrelated senses; identical orthographic form is coincidental
    - Example: “financial institution” vs. “side of river” for bank
  - Polysemes (polysemy): related, but distinct senses
    - Example: “financial institution” vs. “sperm bank”
  - Metonyms (metonymy): “stand in”; technically, a sub-case of polysemy
    - Examples: author for works of author, building for organization, capital city for government
- Different word, same sense
  - Synonyms (synonymy)

Just to confuse you…
- Homophones: same pronunciation, different orthography, different meaning
  - Examples: would/wood, to/too/two
- Homographs: distinct senses, same orthographic form, different pronunciation
  - Examples: bass (fish) vs.
    bass (instrument)

Relationship Between Senses
- IS-A relationships
  - From specific to general (up): hypernym (hypernymy)
    - Example: bird is a hypernym of robin
  - From general to specific (down): hyponym (hyponymy)
    - Example: robin is a hyponym of bird
- Part-Whole relationships
  - wheel is a meronym of car (meronymy)
  - car is a holonym of wheel (holonymy)

WordNet Tour
Material drawn from slides by Christiane Fellbaum

What is WordNet?
- A large lexical database developed and maintained at Princeton University
- Includes most English nouns, verbs, adjectives, adverbs
- Electronic format makes it amenable to automatic manipulation: used in many NLP applications
- “WordNets” generically refers to similar resources in other languages

WordNet: History
- Research in artificial intelligence:
  - How do humans store and access knowledge about concepts?
  - Hypothesis: concepts are interconnected via meaningful relations
  - Useful for reasoning
- The WordNet project started in 1986
  - Can most (all?) of the words in a language be represented as a semantic network where words are interlinked by meaning?
  - If so, the result would be a large semantic network…

Synonymy in WordNet
- WordNet is organized in terms of “synsets”
  - Unordered set of (roughly) synonymous “words” (or multi-word phrases)
- Each synset expresses a distinct meaning/concept

WordNet: Example
Noun
  {pipe, tobacco pipe} (a tube with a small bowl at one end; used for smoking tobacco)
  {pipe, pipage, piping} (a long tube made of metal or plastic that is used to carry water or oil or gas etc.)
  {pipe, tube} (a hollow cylindrical shape)
  {pipe} (a tubular wind instrument)
  {organ pipe, pipe, pipework} (the flues and stops on a pipe organ)
Verb
  {shriek, shrill, pipe up, pipe} (utter a shrill cry)
  {pipe} (transport by pipeline) “pipe oil, water, and gas into the desert”
  {pipe} (play on a pipe) “pipe a tune”
  {pipe} (trim with piping) “pipe the skirt”
Observations about sense granularity?

The “Net” Part of WordNet
[Figure: fragment of the WordNet graph. Links labeled “hyperonym” run down the chain {conveyance; transport}, {vehicle}, {motor vehicle; automotive vehicle}, {car; auto; automobile; machine; motorcar}, and from there to {cruiser; squad car; patrol car; police car; prowl car} and {cab; taxi; hack; taxicab; }. Links labeled “meronym” connect the part nodes {bumper}, {car door}, {car window}, {car mirror}, {doorlock}, {hinge; flexible joint}, and {armrest}.]

WordNet: Size
Part of speech    Word forms    Synsets
Noun                 117,798     82,115
Verb                  11,529     13,767
Adjective             21,479     18,156
Adverb                 4,481      3,621
Total                155,287    117,659
http://wordnet.princeton.edu/

MeSH
- Medical Subject Headings: another example of a thesaurus
  - http://www.nlm.nih.gov/mesh/MBrowser.html
- Thesauri, ontologies, taxonomies, etc.

Word Similarity

Intuition of Semantic Similarity
Semantically close
  - bank–money
  - apple–fruit
  - tree–forest
  - bank–river
  - pen–paper
  - run–walk
  - mistake–error
  - car–wheel
Semantically distant
  - doctor–beer
  - painting–January
  - money–river
  - apple–penguin
  - nurse–fruit
  - pen–river
  - clown–tramway
  - car–algebra

Why?
- Meaning
  - The two concepts are close in terms of their meaning
- World knowledge
  - The two concepts have similar properties, often occur together, or occur in similar contexts
- Psychology
  - We often think of the two concepts together

Two Types of Relations
- Synonymy: two words are (roughly) interchangeable
- Semantic similarity (distance): somehow “related”
  - Sometimes, explicit lexical semantic relationship; often, not

Validity of Semantic Similarity
- Is semantic distance a valid linguistic phenomenon?
- Experiment (Rubenstein and Goodenough, 1965)
  - Compiled a list of word pairs
  - Subjects asked to judge semantic distance (from 0 to 4) for each of the word pairs
- Results:
  - Rank correlation between subjects is ~0.9
  - People are consistent!

Why do this?
- Task: automatically compute semantic similarity between words
- Theoretically useful for many applications:
  - Detecting paraphrases (e.g., automatic essay grading, plagiarism detection)
  - Information retrieval
  - Machine translation
  - …
- Solution in search of a problem?

Types of Evaluations
- Intrinsic
  - Internal to the task itself
  - With respect to some pre-defined criteria
- Extrinsic
  - Impact on end-to-end task
Analogy with cooking…

Evaluation: Correlation with Humans
- Ask automatic method to rank word pairs in order of semantic distance
- Compare this ranking with human-created ranking
- Measure
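
Sketch: comparing a system ranking with a human ranking. The slides do not name a specific statistic at this point, so this example assumes Spearman's rank correlation, computed as the Pearson correlation of the two rank lists with tied scores sharing an average rank. The word pairs come from the intuition slide; the human and system scores are invented for illustration only, and the function names are hypothetical.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to the end of the run of values equal to values[order[i]].
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the rank lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two score lists of equal length >= 2")
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when all scores are equal")
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Word pairs from the slides; all scores below are made-up toy values.
    pairs = ["mistake-error", "car-wheel", "apple-fruit",
             "doctor-beer", "clown-tramway"]
    human = [3.9, 3.8, 3.1, 0.4, 0.1]       # hypothetical 0-4 judgments
    system = [0.80, 0.95, 0.60, 0.05, 0.20]  # hypothetical system scores
    for pair, h, s in zip(pairs, human, system):
        print(f"{pair:15s} human={h:.1f} system={s:.2f}")
    print("Spearman rank correlation:", round(spearman(human, system), 3))
```

Because only ranks are compared, the human scale (0 to 4) and the system scale need not match; a value near 1 means the system orders the pairs much as people do.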