Stanford CS 224 - Automatic Hypernym Classification - D756228

School name Stanford University

Course CS 224N Natural Language Processing with Deep Learning

Pages 9

CS 224N Class Project

Automatic Hypernym Classification

Rion L. Snow and Kayur D. Patel
Department of Computer Science
Stanford University
Stanford, CA 94305
{rion,kdpatel}@cs.stanford.edu

Abstract

Hypernym classification is the task of deciding whether, given two words, one word “is a kind of” the other. We present a classifier that learns the noun hypernym relation based on automatically-discovered lexico-syntactic patterns between a set of provided hyponym/hypernym noun pairs. This classifier is shown to outperform two previous methods for automatically identifying hypernym pairs (using WordNet as a gold standard), and is shown to outperform those methods as well as WordNet on a hand-labeled data set.

1 Introduction

The classification of general relationships between concepts has long been a subject of intense study in linguistics. WordNet [9] is one of the largest such projects for cataloguing relationships between concepts in English and several other languages. Chief among the relationships catalogued between pairs of nouns are the synonym (same meaning as), antonym (opposite meaning as), hypernym (is a kind of), holonym (is a part of), and coordinate term (is the same kind of thing as) relations. Machine learning techniques have been applied to capture a subset of these relationships automatically; in particular, the problem of discovering sets of synonyms and coordinate terms has been studied at length (see, for example, [5]). Work in automatically inducing other relationships has been less successful; in particular, in the study of the hypernym relationship some hand-designed patterns have been found to be effective in discovering novel hyponym/hypernym pairs [4]; further, preliminary hypernym ontologies have been constructed using a small number of hand-constructed patterns [1], [2]. However, no reliable method for automatically discovering such patterns has yet been implemented.
Some algorithms have been sketched, however; most notably the algorithm originally proposed for discovering new patterns in [4]:

“...In order to find new patterns automatically, we sketch the following procedure:
1. Decide on a lexical relation, R, that is of interest, e.g., “group/member” (in our formulation this is a subset of the hyponymy relation).
2. Gather a list of terms for which this relation is known to hold, e.g., “England-country”. This list can be found automatically using the method described here, bootstrapping from patterns found by hand, or by bootstrapping from an existing lexicon or knowledge base.
3. Find places in the corpus where these expressions occur syntactically near one another and record the environment.
4. Find the commonalities among these environments and hypothesize that common ones yield patterns that indicate the relation of interest.
5. Once a new pattern has been positively identified, use it to gather more instances of the target relation and go to Step 2.”

We apply this algorithm for discovering the lexico-syntactic patterns relating noun hypernym pairs. As mentioned in [4], the problem of “finding the commonalities” in word “environments” is underdetermined, i.e., there is no canonical way to reliably capture the ‘commonalities’ among ‘word environments’ in natural text. Nonetheless, recent advances in automatic parsing technology allow us to represent natural language sentences in a structurally reliable way; we propose one possible method of reliably ‘finding commonalities’ in the following section.

2 Automatically Discovering Hypernym Relationships

Recording the lexico-syntactic environment between a specific pair of words has thus far been limited to collecting the counts of hand-designed features, for example, [12], [3]. Unfortunately, this method of lexicon construction is tedious and subject to the bias of the designer; further, these lexicons are necessarily a very small subset of the actual ‘patterns’ found to occur in natural text.
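
To make concrete what a hand-designed pattern looks like, the sketch below encodes one well-known pattern from the hypernym-extraction literature, “X such as Y”, as a regular expression. The choice of pattern, the name such_as_pairs, and the use of single lowercase words in place of noun phrases are all illustrative assumptions; the text above does not list the patterns actually used in the cited work.

```python
import re

# Hand-written pattern: "<hypernym> such as <hyponym>".
# Assumption: single lowercase words stand in for noun phrases; a real
# system would match parsed noun phrases and lemmatize them.
SUCH_AS = re.compile(r"\b([a-z]+) such as ([a-z]+)\b")


def such_as_pairs(sentence):
    """Return (hyponym, hypernym) pairs matched by the 'such as' pattern."""
    return [(m.group(2), m.group(1)) for m in SUCH_AS.finditer(sentence.lower())]


# The pattern fires on its own surface form...
print(such_as_pairs("They studied elements such as oxygen."))
# -> [('oxygen', 'elements')]

# ...but misses a sentence that expresses the same relation differently.
print(such_as_pairs("Oxygen is the most abundant element on the moon."))
# -> []
```

The second call returns nothing even though the sentence states that oxygen is an element, which illustrates why a small hand-built pattern set covers only part of what occurs in natural text.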
Some recent attempts have applied a novel method for automatically discovering relationships between pairs of nouns in the context of automatic inference rule discovery [5], using the dependency relations produced by MINIPAR, a broad-coverage principle-based parser for English described at length in [7]. We propose a similar, entirely automatic method of capturing the word environment between related words in a repeatable, consistent fashion. In particular, we use Lin’s MINIPAR parser to produce directed dependency trees of sentences in text, and then for each sentence record as our ‘environment’ the shortest path in the dependency tree between the nouns of interest (with optional ‘satellite’ nodes). For example, given the sentence fragment (from the TIPSTER 1 corpus) “Oxygen is the most abundant element on the moon,” MINIPAR yields the dependency tree partially depicted in Figure 1:

[Figure 1 nodes and edge labels: be; oxygen (VBE:s:N); element (VBE:pred:N); the (N:det:Det); most (N:post:PostDet); abundant (N:mod:A); on (N:mod:Prep)]
Figure 1: Dependency Tree from MINIPAR

We then remove the noun pair information to produce a general path; for example, the first feature in Table 1 becomes simply “-N:VBE:V,be,be,V:VBE:N”. The full list of extracted features is given in Table 1.

We apply MINIPAR in this way to a corpus of over 6 million newswire sentences (consisting of articles from the Associated Press, Wall Street Journal, and Los Angeles Times, drawn from the TIPSTER 1, 2, 3, and TREC 5 corpora). From these we construct a feature lexicon containing all dependency paths discovered between all pairs of nouns, such that the path occurs between at least five unique noun pairs in our corpus. This feature lexicon consists of about 70,000 dependency paths.
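
The path extraction described above can be sketched with a breadth-first search over the dependency tree. This is a minimal illustration under stated assumptions, not the project's implementation: the edge list is hand-copied from the node labels in Figure 1 rather than produced by MINIPAR, nodes are identified by their word (so each word is assumed to occur once per sentence), and the “^”/“v” direction prefixes and the function names are hypothetical. The output notation therefore differs from the notation used for the paths in Table 1.

```python
from collections import deque

# Hypothetical edge list (head, label, dependent), hand-copied from the
# node labels in Figure 1; this is not MINIPAR's real output format.
EDGES = [
    ("be", "VBE:s:N", "oxygen"),
    ("be", "VBE:pred:N", "element"),
    ("element", "N:det:Det", "the"),
    ("element", "N:post:PostDet", "most"),
    ("element", "N:mod:A", "abundant"),
    ("element", "N:mod:Prep", "on"),
]


def build_graph(edges):
    """Undirected adjacency map: node -> list of (neighbour, step_label)."""
    graph = {}
    for head, label, dep in edges:
        # "^" marks a step from dependent up to head, "v" the reverse.
        graph.setdefault(dep, []).append((head, "^" + label))
        graph.setdefault(head, []).append((dep, "v" + label))
    return graph


def shortest_path(graph, source, target):
    """Breadth-first search for the path between two nouns.

    Returns [label, word, label, ...] with both endpoint nouns left out,
    or None when the nouns are not connected.
    """
    queue = deque([(source, [])])
    seen = {source}
    while queue:
        node, path = queue.popleft()
        if node == target:
            return path[:-1]  # drop the target word itself
        for neighbour, label in graph.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, path + [label, neighbour]))
    return None


def with_satellites(graph, source, target):
    """Return the bare path plus one variant per off-path neighbour of an endpoint."""
    path = shortest_path(graph, source, target)
    if path is None:
        return []
    on_path = {source, target, *path[1::2]}  # words sit at odd indices
    variants = [tuple(path)]
    for role, end in (("source", source), ("target", target)):
        for neighbour, label in graph.get(end, []):
            if neighbour not in on_path:
                variants.append(tuple(path) + (f"{role}({neighbour},{label})",))
    return variants


graph = build_graph(EDGES)
print(shortest_path(graph, "oxygen", "element"))
# -> ['^VBE:s:N', 'be', 'vVBE:pred:N']
for variant in with_satellites(graph, "oxygen", "element"):
    print(variant)
```

On this toy tree the sketch yields five variants (the bare path and one per satellite of “element”), the same number of rows as Table 1. Referring to the endpoints by role rather than by word is what keeps the resulting feature independent of the particular noun pair.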
We then construct feature count vectors for every pair of nouns occurring within any sentence in our corpus; each entry in a vector is simply the number of occurrences of a particular feature in conjunction with the noun pair. Using this formalism we have been able to capture a wide variety of repeatable patterns between hyponym/hypernym noun pairs; in particular, we have been able to ‘rediscover’ many of the hand-designed patterns proposed in [4], in addition to a number of new patterns not discussed by Hearst.

Table 1: Paths extracted from Figure 1
OXYGEN,-N:VBE:V,BE,BE,V:VBE:N,ELEMENT]
OXYGEN,-N:VBE:V,BE,BE,V:VBE:N,ELEMENT,(THE,DET:DET:N)]
OXYGEN,-N:VBE:V,BE,BE,V:VBE:N,ELEMENT,(MOST,POSTDET:POST:N)]
OXYGEN,-N:VBE:V,BE,BE,V:VBE:N,ELEMENT,(ABUNDANT,A:MOD:N)]
OXYGEN,-N:VBE:V,BE,BE,V:VBE:N,ELEMENT,(ON,PREP:MOD:N)]

Table 2: Exact Corpora
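
The feature lexicon and the per-pair count vectors described in this section can be sketched as follows. This is an illustration under assumptions rather than the project's code: the function names, the sparse dictionary representation, and the toy observations are hypothetical, and the demo lowers the threshold to two noun pairs (the text uses five) so that the tiny example keeps at least one path.

```python
from collections import Counter, defaultdict


def build_lexicon(observations, min_pairs=5):
    """Index every path seen with at least `min_pairs` distinct noun pairs."""
    pairs_per_path = defaultdict(set)
    for pair, path in observations:
        pairs_per_path[path].add(pair)
    kept = sorted(p for p, pairs in pairs_per_path.items() if len(pairs) >= min_pairs)
    return {path: index for index, path in enumerate(kept)}


def count_vectors(observations, lexicon):
    """Map each noun pair to a sparse vector {feature index: count}."""
    vectors = defaultdict(Counter)
    for pair, path in observations:
        if path in lexicon:  # paths outside the lexicon are ignored
            vectors[pair][lexicon[path]] += 1
    return {pair: dict(counts) for pair, counts in vectors.items()}


# Toy (noun pair, path) observations, invented for illustration only.
COPULA = "-N:VBE:V,be,be,V:VBE:N"
observations = [
    (("oxygen", "element"), COPULA),
    (("oxygen", "element"), COPULA),
    (("england", "country"), COPULA),
    (("oxygen", "moon"), "PLACEHOLDER_PATH"),
]

lexicon = build_lexicon(observations, min_pairs=2)
vectors = count_vectors(observations, lexicon)
print(lexicon)  # -> {'-N:VBE:V,be,be,V:VBE:N': 0}
print(vectors)  # -> {('oxygen', 'element'): {0: 2}, ('england', 'country'): {0: 1}}
```

A sparse mapping is used here because, with a lexicon of roughly 70,000 paths, most entries of any single noun pair's vector would be zero. Note that the threshold counts distinct noun pairs, not raw occurrences, so a path repeated many times with one pair is still excluded.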
