Stanford CS 157 - Lecture 17 - Information Integratio

Information Integration; Product Search; Product Information; Slide 4; Syntactic Search Engines; Too Many Results; Too Few Results; No Integration; Content versus Form; Structured Data; Slide 11; Complications; Databases; Fragmentation; Conceptual Heterogeneity; Relational Logic for Mapping; Rules in Epilog; Translation Between Schemas; Master Schema; Reference Schema; View Inversion; Slide 22; Example of Complete Inversion; Examples of Semi-complete Inversion; Disjunctive Source Description; Query Plans; Nicer Query; Complexity Results; Potential Application Areas; Demonstration Architecture; PowerPoint Presentation; Slide 32 through Slide 45

1. Information Integration
Computational Logic, Lecture 17
Michael Genesereth, Spring 2004

2. Product Search
Give me a list of 15 inch aluminum skillets with nonstick coating rated at least 4 out of 5 by Consumer Reports that retail for under $30 and are currently in stock.

3. Product Information
Diagram labels: Demographic Data, Company Data, Price Sheets, Vendor Catalogs, Consumer Reports Ratings, Retailer Product Data, Currency Conversion Tables, Inventory Data

4. Information Integration
Diagram labels: Side-by-side Comparison, Infomaster, Manufacturer 1, Manufacturer 2, Marketplace Data, Integrated Search, Product analysis, Satisfaction Ratings, Supplier 1, Supplier 2, Supplier 3, Supplier 4

5. Syntactic Search Engines
Diagram labels: Google, Document (×5), Search Words, Document References

6. Too Many Results
Query: Who is older -- Jane or John?
Search Words: John Jane older
Document Fragments:
...John is older than Jane...
...John is older than Jill...
...Jim is older than Jane...
...Jill wants to know whether John is older than Jane...

7. Too Few Results
Query: Is it the case that John is older than Jane?
Document fragments:
...Jane is younger than John...
...John is more advanced in years than Jane...
...John is the father of Jane...

8. No Integration
Query: Is it the case that John is older than Jane?
Documents:
...John is older than Jill...
...Jill is older than Jane...

9. Content versus Form
Those who will not reason
Perish in the act;
Those who will not act
Perish for that reason.

ThosewhowillnotreasonPerishintheact;ThosewhowillnotactPerishforthatreason.

Panel labels: Semantic View, Syntactic View

10. Structured Data
Free Form Text
  Easy to Use but limited capability
  Too Few answers, too many answers
  Impossible to Aggregate effectively
Structured Data
  Taxonomy, Attributes, Typed Values
  Powerful search possible
  Aggregation possible
Adding tags allows machine to understand so we can search and integrate.

11. Structured Data
• Much Data Managed in “Structured” Form
  – Files (tab-delimited text, …)
  – Databases (Catalogs, Directories, …)
  – Application Programs (SAP, Baan, Peoplesoft, Siebel…)
• Trend Toward More Structure
  – File Standards, e.g. XML
  – Database Protocols, e.g. ODBC
  – Application Protocols, e.g. LDAP, ADAP, etc.

12. Complications
• Difficulties of Using Multiple Sources
  – Distribution
    • Network delays
    • Part-time operation
  – Platform Heterogeneity (dozens)
    • Differences in protocol
    • Differences in format (XML, HTML, spreadsheet, ODBC, …)
  – Conceptual Heterogeneity (thousands)
    • Differences in schema and vocabulary
    • Relative incompleteness
• Life is Change
  – Technological innovation
  – New Buyers, Suppliers, Market Makers
“73% of business managers surveyed said they could not access data in their own corporate databases.” Gartner Group Report

13. Databases
name   manager  office   phone
John   Jill     MJH222   38086
Jane   Jerry    Cedar12  57493
Jill            MJH222
Jerry           420-032  56777

14. Fragmentation
Horizontal fragmentation
name   manager  office   phone
John   Jill     MJH222   38086
Jane   Jerry    Cedar12  57493

name   manager  office   phone
Jill            MJH222
Jerry           420-032  56777

Vertical Fragmentation
name   office   phone
John   MJH222   38086
Jane   Cedar12  57493
Jill   MJH222
Jerry  420-032  56777

name   manager
John   Jill
Jane   Jerry
Jill
Jerry

15. Conceptual Heterogeneity
name   manager  office   phone
John   Jill     MJH222   38086
Jane   Jerry    Cedar12  57493
Jill            MJH222
Jerry           420-032  56777

name   employee  location  telephone
John             MJH222    7238086
Jane             Cedar12   7257493
Jill   John      MJH222
Jerry  Jane      420-032   7256777

“The biggest problem facing anyone who wants to search multiple structured databases. . .is that many organizations use different words to describe the same thing.” Martin Marshall, Communications Week

16. Relational Logic for Mapping
employee(X,Y) :- manager(Y,X).

name   manager  office   phone
John   Jill     MJH222   38086
Jane   Jerry    Cedar12  57493
Jill            MJH222
Jerry           420-032  56777

name   employee  location  telephone
John             MJH222    7238086
Jane             Cedar12   7257493
Jill   John      MJH222
Jerry  Jane      420-032   7256777

17. Rules in Epilog
Safe, Horn Rules
  grandparent(X,Z) :- parent(X,Y), parent(Y,Z).
Existential Variables
  parent(X,f(X,Z)) :- grandparent(X,Z).
Disjunction/Classical Negation
  father(X,Y) | mother(X,Y) :- parent(X,Y).
  father(X,Y) :- parent(X,Y), ~mother(X,Y).
  mother(X,Y) :- parent(X,Y), ~father(X,Y).
Recursion
  ancestor(X,Y) :- parent(X,Y).
  ancestor(X,Z) :- parent(X,Y), ancestor(Y,Z).

18. Translation Between Schemas
Diagram labels: Schema (×6)

19. Master Schema
Diagram labels: Schema (×6), Master Schema

20. Reference Schema
Diagram labels: Schema (×7), Reference Schema, Rules (×7)

21. View Inversion
Diagram labels: Client (×3), Source (×4), Reference Schema, Rules (×7)

22. View Inversion
Example
  employee(X,Y) :- manager(Y,X).
  manager(Y,X) :- employee(X,Y).
Automated View Inversion
  Predicate Completion on relations in Reference Model
  Simplification within and across rules using Model Elimination
  Correct (all answers justified by data)
  Complete (all such answers)

23. Example of Complete
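
The mapping rule employee(X,Y) :- manager(Y,X), its inverse, and the recursive ancestor rules from the slides can be read as operations on sets of tuples. The following is a minimal Python sketch of that reading, not the Epilog or Infomaster implementation. The function names are hypothetical, the manager facts are read off the slide tables under the assumption that manager(Name, Manager) pairs the name column with the manager column, and the parent facts are invented purely for illustration.

```python
# Relations are modeled as sets of tuples.
# Assumption: manager(Name, Manager), taken from the name/manager columns.
manager = {("John", "Jill"), ("Jane", "Jerry")}


def employee_from_manager(manager_rel):
    """employee(X,Y) :- manager(Y,X).  Swap the two argument positions."""
    return {(x, y) for (y, x) in manager_rel}


def manager_from_employee(employee_rel):
    """manager(Y,X) :- employee(X,Y).  The inverted view."""
    return {(y, x) for (x, y) in employee_rel}


def ancestors(parent_rel):
    """Least fixpoint of the two recursive rules:
         ancestor(X,Y) :- parent(X,Y).
         ancestor(X,Z) :- parent(X,Y), ancestor(Y,Z).
    """
    ancestor = set(parent_rel)  # base rule
    while True:
        # One application of the recursive rule: join parent with ancestor.
        derived = {
            (x, z)
            for (x, y) in parent_rel
            for (y2, z) in ancestor
            if y == y2
        }
        if derived <= ancestor:  # nothing new, fixpoint reached
            return ancestor
        ancestor |= derived


employee = employee_from_manager(manager)

# Hypothetical parent facts, not from the slides.
parent = {("Ann", "Bob"), ("Bob", "Cid")}
ancestor = ancestors(parent)
```

Here employee contains ("Jill", "John") and ("Jerry", "Jane"), matching the employee column of the second table, and applying manager_from_employee to it recovers the original manager relation. This simple case round-trips because the mapping only swaps argument positions; rules that project away columns or combine several sources would not invert this cleanly.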

