DOC PREVIEW
LEHIGH CSE 335 - Forms of Retrieval

This preview shows page 1-2-3 out of 10 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 10 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 10 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 10 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 10 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

Forms of RetrievalSlide 2Range Searchk-d TreesDefinition: k-d TreesBWB-CheckBOB-CheckExampleUsing Decision Trees as IndexProperties of Retrieval with Indexed CasesForms of Retrieval•Sequential Retrieval•Two-Step Retrieval•Retrieval with Indexed CasesRetrieval with Indexed CasesSources:–Textbook, Chapter 7–Davenport & Prusack’s book on Advanced Data Structures–Samet’s book on Data StructuresRange SearchRed light on? YesBeeping? Yes…Transistor burned!Space of known problemsk-d Trees•Idea: Partition of the case base in smaller fragments•Representation of a k-dimensional space in a binary tree•Similar to a decision tree: comparison with nodes•During retrieval:Search for a leaf, butUnlike decision trees backtracking may occurDefinition: k-d Trees•Given:K types: T1, …, Tk for the attributes A1, …, AkA case base CB containing cases in T1 …  Tk A parameter b (size of bucket)•A K-D tree T(CB) for a case base CB is a binary tree defined as follows:If |CB| < b then T(CB) is a leaf node (a bucket)Else T(CB) defines a tree such that:The root is marked with an attribute Ai and a value v in Ai andThe 2 k-d trees T({c  CB: c.i-attribute < v}) and T({c  CB: c.i-attribute  v}) are the left and right subtrees of the rootBWB-Check•Ball-With in-Bounds check:Suppose that algorithm reaches a leave node M (with at most b cases) while searching for the most similar case to PLet c be a case in B such that dist(c,P) is the smallestThen c is a candidate NN for P For each boundary B of M, dist(P,B) > dist(c,P) then c is the NNBut if for any boundary B of M, if dist(P,B) < dist(c,P) then the algorithm needs to backtrack and check if in the regions of B, there is a better candidateFor computing distance, simply use: f-1 be the inverse of the distance-similarity compatible function:distance(P,C) = f-1(sim(P,C))BOB-Check•Ball-Out of-Bounds check:Used during backtrackingChecks if for the boundary B defined in the node: dist(P,B) < dist(c,P) Where c is our current candidate for best case (e.g., the closest case to P in the initial bucket)If the condition is true, The algorithm needs to check if in those boundary’s regions, there is a better candidateExample(0,0)(0,100)(25,35)Omaha(5,45)Denver(35,40)Chicago(50,10)Mobile(90,5)MiamiAtlanta (85,15)(80,65)Buffalo(60,75)Toronto(100,0)A1<3535DenverOmahaA2<4040A1<8585MobileAtlantaMiamiA1<6060ChicagoTorontoBuffaloNotes:•Priority lists are used for computing kNN P(32,45)Using Decision Trees as IndexAiv1v2…vnStandard Decision TreeAiv1v2…vnVariant: InReCA TreeunknownCan be combined with numeric attributesAiv1>v1v2…>vnunknownNotes:•Supports Hamming distance•May require backtracking (using BOB-check) Operates in a similar fashion as k-d trees•Priority lists are used for computing kNNProperties of Retrieval with Indexed Cases•Advantage:•Disadvantages:Efficient retrievalIncremental: don’t need to rebuild index again every time a new case is entered-error does not occurCost of construction is highOnly work for monotonic similarity


View Full Document

LEHIGH CSE 335 - Forms of Retrieval

Download Forms of Retrieval
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Forms of Retrieval and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Forms of Retrieval 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?