Slide outline:
Extracting relations from text
Extracting Relation Triples from Text
Why Relation Extraction?
Automated Content Extraction (ACE)
UMLS: Unified Medical Language System
Extracting UMLS relations from a sentence
Databases of Wikipedia Relations
Relation databases that draw from Wikipedia
Ontological relations
How to build relation extractors
Rules for extracting IS-A relation
Hearst’s Patterns for extracting IS-A relations
Extracting Richer Relations Using Rules
What relations hold between 2 entities?
Extracting Richer Relations Using Rules and Named Entities
Hand-built patterns for relations
Supervised machine learning for relations
How to do classification in supervised relation extraction
Automated Content Extraction (ACE)
Relation Extraction
Word Features for Relation Extraction
Parse Features for Relation Extraction
Gazetteer and trigger word features for relation extraction
Classifiers for supervised methods
Evaluation of Supervised Relation Extraction
Summary: Supervised Relation Extraction
Seed-based or bootstrapping approaches to relation extraction
Relation Bootstrapping (Hearst 1992)
Bootstrapping
Dipre: Extract <author,book> pairs
Snowball
Distant Supervision
Distant supervision paradigm
Distantly supervised learning of relation extraction patterns
Unsupervised relation extraction

Relation Extraction
What is relation extraction?

Dan Jurafsky

Extracting relations from text
• Company report: “International Business Machines Corporation (IBM or the company) was incorporated in the State of New York on June 16, 1911, as the Computing-Tabulating-Recording Co. (C-T-R)…”
• Extracted Complex Relation:
  Company-Founding
    Company         IBM
    Location        New York
    Date            June 16, 1911
    Original-Name   Computing-Tabulating-Recording Co.
• But we will focus on the simpler task of extracting relation triples
  Founding-year(IBM, 1911)
  Founding-location(IBM, New York)

Extracting Relation Triples from Text
The Leland Stanford Junior University, commonly referred to as Stanford University or Stanford, is an American private research university located in Stanford, California … near Palo Alto, California… Leland Stanford…founded the university in 1891
  Stanford EQ Leland Stanford Junior University
  Stanford LOC-IN California
  Stanford IS-A research university
  Stanford LOC-NEAR Palo Alto
  Stanford FOUNDED-IN 1891
  Stanford FOUNDER Leland Stanford

Why Relation Extraction?
• Create new structured knowledge bases, useful for any app
• Augment current knowledge bases
  • Adding words to WordNet thesaurus, facts to FreeBase or DBPedia
• Support question answering
  • The granddaughter of which actor starred in the movie “E.T.”?
    (acted-in ?x “E.T.”) (is-a ?y actor) (granddaughter-of ?x ?y)
• But which relations should we extract?

Automated Content Extraction (ACE)
17 relations from 2008 “Relation Extraction Task”
  PHYSICAL: Located, Near
  PART-WHOLE: Geographical, Subsidiary
  PERSON-SOCIAL: Business, Family, Lasting Personal
  ORG AFFILIATION: Founder, Employment, Membership, Ownership, Student-Alum, Investor, Sports-Affiliation
  GENERAL AFFILIATION: Citizen-Resident-Ethnicity-Religion, Org-Location-Origin
  ARTIFACT: User-Owner-Inventor-Manufacturer

Automated Content Extraction (ACE)
• Physical-Located (PER-GPE): He was in Tennessee
• Part-Whole-Subsidiary (ORG-ORG): XYZ, the parent company of ABC
• Person-Social-Family (PER-PER): John’s wife Yoko
• Org-AFF-Founder (PER-ORG): Steve Jobs, co-founder of Apple…

UMLS: Unified Medical Language System
• 134 entity types, 54 relations
  Injury                    disrupts      Physiological Function
  Bodily Location           location-of   Biologic Function
  Anatomical Structure      part-of       Organism
  Pharmacologic Substance   causes        Pathological Function
  Pharmacologic Substance   treats        Pathologic Function

Extracting UMLS relations from a sentence
Doppler echocardiography can be used to diagnose left anterior descending artery stenosis in patients with type 2 diabetes
  Echocardiography, Doppler DIAGNOSES Acquired stenosis

Databases of Wikipedia Relations
Relations extracted from the Wikipedia Infobox:
  Stanford state California
  Stanford motto “Die Luft der Freiheit weht”
  …

Relation databases that draw from Wikipedia
• Resource Description Framework (RDF) triples
  subject predicate object
  Golden Gate Park location San Francisco
  dbpedia:Golden_Gate_Park dbpedia-owl:location dbpedia:San_Francisco
• DBPedia: 1 billion RDF triples, 385 million from English Wikipedia
• Frequent Freebase relations:
  people/person/nationality, location/location/contains
  people/person/profession, people/person/place-of-birth
  biology/organism_higher_classification, film/film/genre

Ontological relations
Examples from the WordNet Thesaurus
• IS-A (hypernym): subsumption between classes
  • Giraffe IS-A ruminant IS-A ungulate IS-A mammal IS-A vertebrate IS-A animal…
• Instance-of: relation between individual and class
  • San Francisco instance-of city

How to build relation extractors
1. Hand-written patterns
2. Supervised machine learning
3. Semi-supervised and unsupervised
  • Bootstrapping (using seeds)
  • Distant supervision
  • Unsupervised learning from the web

Relation Extraction
Using patterns to extract relations

Rules for extracting IS-A relation
Early intuition from Hearst (1992)
• “Agar is a substance prepared from a mixture of red algae, such as Gelidium, for laboratory or industrial use”
• What does Gelidium mean?
• How do you know?

Hearst’s Patterns for extracting IS-A relations
(Hearst, 1992): Automatic Acquisition of Hyponyms
  “Y such as X ((, X)* (, and|or) X)”
  “such Y as X”
  “X or other Y”
  “X and other Y”
  “Y including X”
  “Y, especially X”

Hearst’s Patterns for extracting IS-A relations
  Hearst pattern    Example occurrences
  X and other Y     ...temples, treasuries, and other important civic buildings.
  X or other Y      Bruises, wounds, broken bones or other injuries...
  Y such
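
The Hearst patterns quoted above can be tried out with a few regular expressions. The following is a minimal Python sketch, not Hearst's implementation: it assumes that noun phrases have already been marked with square brackets by some upstream chunker (the bracketing in the examples was done by hand), and the names NP, NP_LIST, PATTERNS and extract_is_a are hypothetical, introduced only for illustration.

```python
import re

# Assumption of this sketch: noun phrases arrive pre-bracketed, e.g. "[red algae]".
NP = r"\[([^\]]+)\]"  # one bracketed noun phrase; captures the text inside
# One or more noun phrases joined by commas and an optional "and"/"or".
NP_LIST = r"(\[[^\]]+\](?:,? (?:and |or )?\[[^\]]+\])*)"

# Each entry: (compiled pattern, True if the hypernym Y comes before the X list).
PATTERNS = [
    (re.compile(NP + r",? such as " + NP_LIST), True),        # Y such as X, X and X
    (re.compile(r"such " + NP + r" as " + NP_LIST), True),    # such Y as X
    (re.compile(NP + r",? including " + NP_LIST), True),      # Y including X
    (re.compile(NP + r",? especially " + NP_LIST), True),     # Y, especially X
    (re.compile(NP_LIST + r",? (?:and|or) other " + NP), False),  # X and/or other Y
]


def extract_is_a(chunked_text):
    """Return (X, "IS-A", Y) triples found by the patterns above."""
    triples = []
    for pattern, hypernym_first in PATTERNS:
        for match in pattern.finditer(chunked_text):
            if hypernym_first:
                hypernym, hyponyms = match.group(1), match.group(2)
            else:
                hyponyms, hypernym = match.group(1), match.group(2)
            # Split the matched X list back into its individual noun phrases.
            for hyponym in re.findall(NP, hyponyms):
                triples.append((hyponym, "IS-A", hypernym))
    return triples


if __name__ == "__main__":
    examples = [
        "[Agar] is [a substance] prepared from [a mixture] of [red algae], "
        "such as [Gelidium], for [laboratory or industrial use]",
        "[temples], [treasuries], and other [important civic buildings]",
        "[Bruises], [wounds], [broken bones] or other [injuries]",
    ]
    for sentence in examples:
        print(extract_is_a(sentence))
```

On the Agar sentence the sketch yields the single triple (Gelidium, IS-A, red algae), which is the inference the slide asks the reader to make. Unbracketed text would need a real noun-phrase chunker first; the regular expressions alone cannot tell where a noun phrase starts or ends.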


Stanford CS 124 - Relation Extraction
