UTD CS 7301 - LECTURE NOTES


Outline
1. Ontology Alignment
2. Problem Statement
3. Challenges & Solutions
4. LCS Algorithm for Multiple Ontologies
5. Largest Common Subgraph (LCS) Algorithm between two Ontologies
6. Data Structure for LCS Algorithm
7. Node Similarity: Instance-based, Representing types using N-grams*
8. Node Similarity: Instance-based, Visualizing Entropy and Conditional Entropy
9. Node Similarity: Faults of this Method
10-13. (untitled slides)
14. Structural Similarity
15. (untitled slide)
16. Similarity Results of Pairwise Ontology Matching (I3CON Benchmark)
17. Ontology Matching Vector Space Model (VSM)
18. (untitled slide)
19. Aligned Concepts
20. Aligned Concepts
21. (untitled slide)
22. Reification
23. OWL - 2
24. Ontology Extraction from Text Documents
25. (untitled slide)
26. Concept Assignment
27-28. (untitled slides)

Ontology Alignment

Problem Statement
Given N Ontologies (O1, …, On)
◦ In a Particular Domain
◦ Different Level of Coverage
Goal
◦ Evaluate Commonality of Entities
◦ Rank Entities

Challenges & Solutions
Ontology Alignments
◦ Largest Common Subgraph (LCS)
◦ Vector Space Model (TF/IDF)
Accuracy of Entities in Aligned Concepts
◦ Ranking Entities

LCS Algorithm for Multiple Ontologies
Find the LCS for two Ontologies
Align LCS with other Ontologies

Largest Common Subgraph (LCS) Algorithm between two Ontologies

Data Structure for LCS Algorithm
[Figure: nodes C1 to C7 and C’1 to C’6]
Similarity Measure for Corresponding Entities
Node Similarity + Structural Similarity
C1: (C1,C’1,.95), (C1,C’6,.77), (C1,C’3,.71), (C1,C’4,.65), (C1,C’5,.54), (C1,C’2,.34)
C2: (C2,C’3,.85), (C2,C’2,.67), (C2,C’1,.51), (C2,C’4,.45), (C2,C’5,.24), (C2,C’6,.14)
C3: (C3,C’4,.90), (C3,C’1,.67), (C3,C’3,.51), (C3,C’2,.45), (C3,C’5,.34), (C3,C’6,.24)
C4: (C4,C’2,.95), (C4,C’1,.65), (C4,C’3,.51), (C4,C’4,.45), (C4,C’5,.23), (C4,C’6,.14)
C5: (C5,C’4,.80), (C5,C’1,.67), (C5,C’3,.65), (C5,C’2,.35), (C5,C’5,.34), (C5,C’6,.24)
C6: (C6,C’1,.20), (C6,C’1,.15), (C6,C’3,.12), (C6,C’2,.12), (C6,C’5,.09), (C6,C’6,.08)
C7: (C7,C’4,.31), (C7,C’1,.25), (C7,C’3,.23), (C7,C’2,.15), (C7,C’5,.14), (C7,C’6,.12)

Node Similarity: Instance-based
Representing types using N-grams*
Node Similarity (Name-Match)
◦ Find Common N-gram (N = 2) for corresponding columns

Table A (CA):
StrName            FENAME          Status
LOCUST-GROVE DR    LOCUST GROVE    BUILT
LOUISE LN          LOUISE          BUILT

Table B (CB):
Street             Laddress    Raddress
TRAIL RANGE DR     1600        1798
CR45/MANET CT      2500        2598

N-gram types from A.StrName = {LO, OC, CU, ST, …}
N-gram types from B.Street = {TR, RA, R4, 5/, …}

*Jeffrey Partyka, Neda Alipanah, Latifur Khan, Bhavani Thuraisingham & Shashi Shekhar, "Content Based Ontology Matching for GIS Datasets", ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems (ACM GIS 2008), pp. 407-410, Irvine, California, USA, November 2008.

Node Similarity: Instance-based
Visualizing Entropy and Conditional Entropy
$H(C) = -\sum_i p_i \log p_i$, for all $x \in C_1 \cup C_2$
$H(C \mid T) = H(C, T) - H(T)$, for all $x \in C_1 \cup C_2$ and $t \in T$

Node Similarity: Faults of this Method
• Semantically similar columns are not guaranteed to have a high similarity score

Table A ∈ O1:
City           Country
Dallas         USA
Houston        USA
Kingston       Jamaica
Halifax        Canada
Mexico City    Mexico

Table B ∈ O2:
ctyName         country
Shanghai        China
Beijing         China
Tokyo           Japan
New Delhi       India
Kuala Lumpur    Malaysia

2-grams extracted from A: {Da, al, la, as, Ho, ou, us, …}
2-grams extracted from B: {Sh, ha, an, ng, gh, ha, ai, Be, ei, ij, …}
[Figure legend: Column 1, Column 2]

Node Similarity: Instance-based
K-medoid + NGD instance similarity
Step 1: Extract distinct keywords from compared columns
Keywords extracted from columns = {Johnson, Rd., School, 15th, …}
Step 2: Group distinct keywords together into semantic clusters
{"Rd.", "Dr.", "St.", "Pwy", …}
{"Johnson", "School", "Dr.", …}
Step 3: Calculate Similarity
$\text{Similarity} = H(C \mid T) / H(C)$, with $C_1 \in O_1$, $C_2 \in O_2$, computed over $C_1 \cup C_2$

C1:
roadName        City
Johnson Rd.     Plano
School Dr.      Richardson
Zeppelin St.    Lakehurst

C2:
Road           County
Custer Pwy     Collin
15th St.       Collin
Parker Rd.     Collin

Node Similarity: Instance-based
Problems with K-medoid + NGD*
Two different geographic entities in the same location (e.g., Dallas, TX and Dallas County) may have a very low computed NGD value and thus be mistaken for being similar:

roadName        City
Johnson Rd.     Plano
School Dr.      Richardson
Zeppelin St.    Lakehurst
Alma Dr.        Richardson
Preston Rd.     Addison
Dallas Pkwy     Dallas

Road                 County
Custer Pwy           Cooke
15th St.             Collin
Parker Rd.           Collin
Alma Dr.             Collin
Campbell Rd.         Denton
Harry Hines Blvd.    Dallas

*Jeffrey Partyka, Latifur Khan, Bhavani Thuraisingham, "Semantic Schema Matching Without Shared Instances," to appear in Third IEEE International Conference on Semantic Computing, Berkeley, CA, USA, September 14-16, 2009.

Node Similarity: Instance-based
Using geographic type information*
We use a gazetteer to determine the geographic type of an instance:
[Figure: O1, O2, Geotypes]

*Jeffrey Partyka, Latifur Khan, Bhavani Thuraisingham, "Geographically-Typed Semantic Schema Matching," to appear in ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems (ACM GIS 2009), Seattle, Washington, USA, November 2009.

Node Similarity: Instance-based
Results of Geographic Matching Over 2 Separate Road Network Data Sources

Structural Similarity
◦ Structural Similarity Measurement
I. Neighbor Similarity
[Figure: nodes C’1, C’3, C’4, C’5 and C1, C2, C3, C5, C6]

Structural Similarity
Structural Similarity Measurement
I. Properties Similarity
[Figure: nodes C1 to C7 and C’1 to C’6, with edges labeled isA, subClass, hasFlavor, hasColor, hasFood, hasDrink, hasTopping]
RTC1 = [3 isA, 2 subClass, 1 hasFlavor, 1 hasColor, 0 hasFood, 1 hasTopping]
RTC2 = [1 isA, 1 subClass, 2 hasFlavor, 0 hasColor, 1 hasFood]

Similarity Results of Pairwise Ontology Matching (I3CON Benchmark)
Matching using Name Similarity + RTS
Matching using Name Similarity +
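
The name-match step on the N-gram slides (extract 2-grams from each column, then find the ones the columns share) can be sketched as follows. This is only an illustration: the function names are hypothetical, and it assumes that n-grams are taken inside whitespace-separated tokens and are case-sensitive, which the slides do not state.

```python
def ngrams(token, n=2):
    """Return the contiguous character n-grams of a single token."""
    return [token[i:i + n] for i in range(len(token) - n + 1)]


def column_ngram_types(values, n=2):
    """Collect the distinct n-gram types over all instances of a column.

    Assumption (not from the slides): each value is split on whitespace
    and n-grams never span two tokens.
    """
    types = set()
    for value in values:
        for token in value.split():
            types.update(ngrams(token, n))
    return types


# City columns from the "Faults of this Method" slide.
city_a = ["Dallas", "Houston", "Kingston", "Halifax", "Mexico City"]
city_b = ["Shanghai", "Beijing", "Tokyo", "New Delhi", "Kuala Lumpur"]

common = column_ngram_types(city_a) & column_ngram_types(city_b)
print(sorted(common))
```

Under these assumptions the two City columns share only a handful of 2-grams even though they mean the same thing, which is the fault the slide points out.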

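The entropy-based score from the slides, Similarity = H(C|T) / H(C), can be sketched from raw counts. This sketch assumes each observation is a (column, type) pair, where the type is an n-gram or a keyword cluster, and it computes H(C|T) with the chain rule H(C,T) - H(T). The function names and sample data are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(counts):
    """Shannon entropy (in bits) of a distribution given as raw counts."""
    counts = list(counts)
    total = sum(counts)
    return -sum((c / total) * log2(c / total) for c in counts if c > 0)


def entropy_similarity(observations):
    """Return H(C|T) / H(C) for an iterable of (column, type) pairs."""
    obs = list(observations)
    h_c = entropy(Counter(col for col, _ in obs).values())
    h_t = entropy(Counter(typ for _, typ in obs).values())
    h_ct = entropy(Counter(obs).values())
    h_c_given_t = h_ct - h_t  # chain rule: H(C|T) = H(C,T) - H(T)
    return h_c_given_t / h_c if h_c > 0 else 0.0


# Types spread evenly over both columns: knowing T says nothing about C.
shared = [("C1", "Rd."), ("C2", "Rd."), ("C1", "St."), ("C2", "St.")]
# Every type occurs in one column only: T fully determines C.
disjoint = [("C1", "Da"), ("C1", "al"), ("C2", "Sh"), ("C2", "ha")]

print(entropy_similarity(shared), entropy_similarity(disjoint))
```

The ratio approaches 1 when the two columns draw on the same types in the same proportions and falls to 0 when their types do not overlap.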

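The relationship-type vectors RTC1 and RTC2 on the Properties Similarity slide can be compared numerically. The slides do not say which measure is used, so this sketch uses cosine similarity purely as an assumption. It also treats a relationship type missing from a vector (hasTopping in RTC2) as a count of 0.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine similarity of two sparse count vectors stored as dicts."""
    keys = set(a) | set(b)
    dot = sum(a.get(k, 0) * b.get(k, 0) for k in keys)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Relationship-type counts as listed on the slide.
rt_c1 = {"isA": 3, "subClass": 2, "hasFlavor": 1,
         "hasColor": 1, "hasFood": 0, "hasTopping": 1}
rt_c2 = {"isA": 1, "subClass": 1, "hasFlavor": 2,
         "hasColor": 0, "hasFood": 1}

rts = cosine_similarity(rt_c1, rt_c2)
print(round(rts, 3))
```

Storing the vectors as dicts keyed by relationship name avoids depending on a fixed ordering of relationship types across ontologies.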