UT Dallas SE 5V81 - Sample Problem Question and Answer



Sample Ontology Alignment and Semantic Similarity Problem

You are given the following 2 ontologies describing pizza:

Ontology O1
> Pizza
>> Cheese
>> Pepperoni
>> Sausage
>> Veggie
>>> Spinach
>>> Broccoli
>> Sicilian

Ontology O2
> Pizza
>> Plain
>> Pepperoni
>> Hamburger
>> Vegetable
>>> Spinach
>>> Broccoli
>>> Salad
>> Square

Notes:
- Each word above within an ontology represents a label for a concept. For example, "Cheese" in O1 represents the concept with the label "Cheese".
- In O1, any concept with a ">>" to its left is a child of the concept labeled "Pizza", and "Spinach" and "Broccoli" are children of "Veggie". The markers ">", ">>", and ">>>" have the same meaning among the concepts in O2.
- Square pizza and Sicilian pizza are two different names for the same kind of pizza.

(a) Looking at the two ontologies and their concept labels, which of the following matching techniques would be most suitable for calculating the semantic similarity between a given concept in O1 and a given concept in O2?
  (a) Hamming distance
  (b) Jaro similarity
  (c) WordNet similarity measures
  (d) Stanford NER-based entity extraction

Answer: (c) WordNet similarity measures. Hamming distance and Jaro similarity only compare the label strings character by character, and NER extracts entities rather than comparing meanings; WordNet measures compare the concepts semantically.

(b) Calculate the Hamming distance between the concept "Veggie" in O1 and the concept "Vegetable" in O2.

Hamming distance = (3 mismatched positions + 3 extra characters) / 9 = 6/9 = .667

(The first three characters "Veg" match, positions 4-6 differ, and "Vegetable" has 3 additional characters; the total is divided by the length of the longer string.)

(c) Calculate the N-gram similarity between the combined labels of all child concepts of "Veggie" in O1 and the combined labels of all child concepts of "Vegetable" in O2.

The child concepts of "Veggie" form the concatenated string "SpinachBroccoli", and the child concepts of "Vegetable" form the concatenated string "SpinachBroccoliSalad". Let N = 2 (this wasn't stated in the original problem).

2-grams of the child concepts of "Veggie" = {"Sp", "pi", "in", "na", "ac", "ch", "hB", "Br", "ro", "oc", "cc", "co", "ol", "li"}
2-grams of the child concepts of "Vegetable" = {"Sp", "pi", "in", "na", "ac",
"ch", "hB", "Br", "ro", "oc", "cc", "co", "ol", "li", "iS", "Sa", "al", "la", "ad"}

N-gram similarity = 14 / (15 - 2 + 1) = 14/14 = 1

Even though the two strings are not identical, this formula gives an N-gram similarity of 1: because the count of shared N-grams is divided by the number of N-grams in the shorter string, any string that is a substring of another is treated as identical to it.

(d) We are told the following:
- The concept "Sicilian" in O1 has a property called "description" with the string value "Our square, delicious, deep dish pizza is one of a kind!"
- The concept "Square" in O2 has a property called "description" with the string value "The cheesiest, yummiest, square pizza of its kind!"
- The stopword list used in our ontology alignment algorithm contains the following words: {"and", "or", "its", "not", "but", "the", "a", "this", "that", "an", "is", "of", "our"}

Calculate the Jaccard similarity between Sicilian.description and Square.description, being sure to apply the appropriate normalization.

First, we apply normalization: make all words lowercase and remove all commas and other punctuation. Then we filter out stopwords.

After normalization and stopword filtering, Sicilian.description has the following words: {"square", "delicious", "deep", "dish", "pizza", "one", "kind"} - 7 unique words.

Square.description has the following words: {"cheesiest", "yummiest", "square", "pizza", "kind"} - 5 words, but only 2 new unique words, since "square", "pizza", and "kind" already appear in Sicilian.description.

Jaccard similarity = |intersection| / |union| = 3 / 9 = .33

(e) Calculate the cosine similarity between Sicilian.description and Square.description.

We apply normalization to convert all words to lowercase first.
Next, we create the word vectors for each property string value.

Sicilian.description vector = ["square":1, "delicious":1, "deep":1, "dish":1, "pizza":1, "one":1, "kind":1, "cheesiest":0, "yummiest":0]
Square.description vector = ["square":1, "delicious":0, "deep":0, "dish":0, "pizza":1, "one":0, "kind":1, "cheesiest":1, "yummiest":1]

Dot product of the Sicilian.description and Square.description vectors = (1 * 1) + (1 * 0) + (1 * 0) + (1 * 0) + (1 * 1) + (1 * 0) + (1 * 1) + (0 * 1) + (0 * 1) = 3

Norm of Sicilian.description = √7
Norm of Square.description = √5

Cosine similarity = 3 / (√7 * √5) = 3 / √35, which is about equal to .5 (the vectors have some similarity to each other)
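Part (b)'s normalized Hamming computation can be sketched in Python. This is not code from the course; the function name and structure are my own, implementing the formula the answer uses: (character mismatches in the overlapping positions + the length difference) divided by the longer length.

```python
def hamming_distance(s1: str, s2: str) -> float:
    """Normalized Hamming distance: 0.0 = identical, 1.0 = nothing in common."""
    shorter, longer = sorted((s1, s2), key=len)
    # Mismatches over the positions both strings share.
    mismatches = sum(a != b for a, b in zip(shorter, longer))
    # Unmatched trailing characters of the longer string also count as differences.
    return (mismatches + len(longer) - len(shorter)) / len(longer)

# "Veggie" vs "Vegetable": 3 mismatched positions + 3 extra characters, over 9.
print(round(hamming_distance("Veggie", "Vegetable"), 3))  # 0.667
```

Note this is a distance, not a similarity: lower values mean closer strings.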
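The N-gram similarity from part (c) can be sketched the same way, assuming the formula used above: shared N-grams divided by the N-gram count of the shorter string (function names are mine).

```python
def ngrams(s: str, n: int = 2) -> list:
    """All overlapping substrings of length n."""
    return [s[i:i + n] for i in range(len(s) - n + 1)]

def ngram_similarity(s1: str, s2: str, n: int = 2) -> float:
    """|shared n-grams| / (|shorter string| - n + 1)."""
    shared = set(ngrams(s1, n)) & set(ngrams(s2, n))
    return len(shared) / (min(len(s1), len(s2)) - n + 1)

# 14 shared bigrams, shorter string has 15 - 2 + 1 = 14 bigrams.
print(ngram_similarity("SpinachBroccoli", "SpinachBroccoliSalad"))  # 1.0
```

The substring effect noted above falls out directly: every bigram of "SpinachBroccoli" also occurs in "SpinachBroccoliSalad", so the ratio is exactly 1.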
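Part (d)'s normalization, stopword filtering, and Jaccard computation can be sketched as follows (a minimal version, assuming punctuation stripping and lowercasing as the only normalization; names are mine).

```python
import string

# Stopword list given in the problem statement.
STOPWORDS = {"and", "or", "its", "not", "but", "the", "a", "this",
             "that", "an", "is", "of", "our"}

def normalize(text: str) -> set:
    """Lowercase, strip punctuation, drop stopwords; return the unique words."""
    cleaned = text.lower().translate(str.maketrans("", "", string.punctuation))
    return {w for w in cleaned.split() if w not in STOPWORDS}

def jaccard(a: str, b: str) -> float:
    wa, wb = normalize(a), normalize(b)
    return len(wa & wb) / len(wa | wb)

sicilian = "Our square, delicious, deep dish pizza is one of a kind!"
square = "The cheesiest, yummiest, square pizza of its kind!"
print(round(jaccard(sicilian, square), 2))  # 0.33
```

The intersection is {"square", "pizza", "kind"} and the union has 9 words, matching the 3/9 worked above.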
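Part (e)'s cosine similarity over binary word vectors can be sketched as below, building the vectors over the combined vocabulary exactly as the worked answer does (the function is my own sketch, taking the already-normalized word sets as input).

```python
import math

def cosine_similarity(words_a: set, words_b: set) -> float:
    """Cosine similarity of two binary bag-of-words vectors."""
    vocab = sorted(words_a | words_b)
    va = [1 if w in words_a else 0 for w in vocab]
    vb = [1 if w in words_b else 0 for w in vocab]
    dot = sum(x * y for x, y in zip(va, vb))
    # For 0/1 vectors, the sum of squares equals the plain sum.
    return dot / (math.sqrt(sum(va)) * math.sqrt(sum(vb)))

sicilian = {"square", "delicious", "deep", "dish", "pizza", "one", "kind"}
square = {"cheesiest", "yummiest", "square", "pizza", "kind"}
print(round(cosine_similarity(sicilian, square), 3))  # 0.507
```

This reproduces the dot product of 3 and norms √7 and √5 from the worked answer, giving 3/√35 ≈ 0.507.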

