Unformatted text preview:

CSCI585 Nearest Neighbor Queries Instructor Cyrus Shahabi CSCI585 Nearest Neighbor Search Retrieve the nearest neighbor of query point Q Simple Strategy convert the nearest neighbor search to range search Guess a range around Q that contains at least one object say O if the current guess does not include any answers increase range size until an object found Compute distance d between Q and O re execute the range query with the distance d around Q Compute distance of Q from each retrieved object The object at minimum distance is the nearest neighbor 2 Na ve Approach CSCI585 A B A B C 1 F 2 E B 3 4 F 5 6 G C F E 4 5 E H G 1 2 3 10 11 12 H 7 8 9 6 Query Point Q C G 11 12 10 7 8 9 H Issues how to guess range The retrieval may be sub optimal if incorrect range guessed Would be a problem in high dimensional spaces CSCI585 Given a query location q find the nearest object a q Depth First and Best First Search using R trees Goal avoid visiting nodes that cannot contain results 4 CSCI585 Basic Pruning Metric MINDIST Minimum distance between q and an MBR p E1 mindist E1 q q mindist E1 q is a lower bound of d o q for every object o in E1 If we have found a candidate NN p we can prune every MBR whose mindist d p q 5 CSCI585 MINDIST Property MINDIST is a lower bound of any k NN distance p1 p2 t1 t2 p1 p2 p1 p2 s1 s2 p1 p2 p1 p2 6 Depth First DF NN Algorithm Roussoulos et al SIGMOD 1995 CSCI585 y axis 10 g k E5 i E4 E 1 query d E3 a b 2 Main idea Starting from the root visit nodes according to their mindist in depth first manner l E6 e f 4 E7 h 8 6 m E2 j c x axis 0 2 4 Root E 1 1 E 3 5 E 1 a 75 E 3 b E 4 9 c d E 4 10 8 6 Note distances not actually stored inside nodes Only for illustration E 2 2 E 5 5 E 6 2 e f E 5 g E 7 13 i 2 h E 6 E 2 j 10 k 13 l E 7 m DF Search Visit E1 CSCI585 y axis 10 g h 8 6 b 2 E7 l k E5 E4 4 m E6 e f d E2 E1 E3 i j query a c x axis 0 2 4 Root E 1 1 E 3 5 E 1 a 85 E 3 b E 4 9 c d E 4 6 10 8 E 2 2 E 5 5 E 6 2 e f E 5 g E 7 13 i 2 h E 6 E 2 j k l E 7 m DF Search Find Candidate NN a CSCI585 y axis 10 g h 8 6 b 2 E7 l k E5 E4 4 m E6 e f d E2 E1 E3 i j query a c x axis 0 2 4 Root E 1 1 E 3 5 E 1 a 95 E 3 b E 4 9 c d E 4 6 10 8 First Candidate NN a with distance 5 E 2 2 E 5 5 E 6 2 e f E 5 g E 7 13 i 2 h E 6 E 2 j k l E 7 m CSCI585 DF Search Backtrack to Root and Visit E2 y axis 10 g h 8 6 b 2 E7 l k E5 E4 4 m E6 e f d E2 E1 E3 i j query a c x axis 0 2 4 Root E 1 1 E 3 5 E 1 a 5 10 E 3 b E 4 9 c d E 4 6 10 8 First Candidate NN a with distance 5 E 2 2 E 5 5 E 6 2 e f E 5 g E 7 13 i 2 h E 6 E 2 j k l E 7 m DF Search Find Actual NN i CSCI585 y axis 10 g h 8 6 b 2 E7 l k E5 E4 4 m E6 e f d E2 E1 E3 i j query a First Candidate NN a with distance 5 c x axis 0 2 4 Root E 1 1 E 3 5 E 1 a 5 11 E 3 b E 4 9 c d E 4 6 10 8 E 2 2 E 5 5 E 6 2 e f E 5 Actual NN i with distance 2 g E 7 13 i 2 h E 6 E 2 j k l E 7 m Optimality CSCI585 y axis E2 10 g b 2 k E5 E4 4 l E6 e f d E7 h 8 6 m E1 E3 i j query a c x axis 0 2 4 6 8 10 Question Which is the minimal set of nodes that must be visited by any NN algorithm Answer The set of nodes whose MINDIST is smaller than or equal to the distance between q and its NN e g E1 E2 E6 12 CSCI585 A Better Strategy for KNN search A sorted priority queue based on MINDIST Nodes traversed in order Stops when there is an object at the top of the queue 1 NN found k NN can be computed incrementally I O optimal 13 Priority Queue CSCI585 A A B B C 1 B E F 2 E E F 4 3 C G F 5 6 H G 1 2 3 10 11 12 H 7 8 9 4 5 A 6 Query Point Q B C E C F C 5 6 4 F H 5 G 6 4 F 7 5 8 G 9 6 C 1211 10 G 7 8 9 H 1NN 4 F 14 CSCI585 Best First BF NN Algorithm Optimal Hjaltason and Samet TODS 1998 Keep a heap H of index entries and objects ordered by MINDIST Initially H contains the root While H Extract the element with minimum MINDIST If it is an index entry insert its children into H If it is an object return it as NN End while 15 BF Search Visit root CSCI585 y axis 10 g E2 h 8 6 E4 d E1 E3 Action l i Heap E 1 E 2 1 2 Visit Root k E5 b 2 E7 E6 e f 4 m j query a c x axis 0 2 4 10 8 6 Root E 1 1 E 3 5 E 1 a b E 4 9 c d E 5 5 E 6 2 e f 16 E 3 E 4 E 2 2 E 5 g E 7 13 i 2 h E 6 E 2 j 10 k 13 l E 7 m BF Search Visit E1 CSCI585 y axis 10 g E2 h 8 6 E4 d E1 E3 Action l i Heap E 1 E 2 …


View Full Document

USC CSCI 585 - Session16

Loading Unlocking...
Login

Join to view Session16 and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Session16 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?