LEHIGH CSE 335 - K-Nearest Neighbors (29 pages)

K-Nearest Neighbors




Pages: 29
School: Lehigh University
Course: CSE 335 - Intelligent Decision Support Systems

K-Nearest Neighbors (kNN)

Given a case base CB, a new problem P, and a similarity metric sim, obtain the k cases in CB that are most similar to P according to sim. Reminder: we used a priority list holding the top k most similar cases obtained so far.

Forms of Retrieval

- Sequential retrieval
- Two-step retrieval
- Retrieval with indexed cases

Retrieval with Indexed Cases

Sources: Bergmann's book; Davenport and Prusak's book; Samet's books on (advanced) data structures.

Range Search

Search over the space of known problems; e.g., retrieve the cases matching "Red light on? Yes; Beeping? Yes" (fault: transistor burned).

K-D Trees

Idea:
- Partition the case base into smaller fragments.
- Represent a k-dimensional space in a binary tree.
- Similar to a decision tree (comparisons at the nodes).

During retrieval:
- Search descends to a leaf, but
- unlike decision trees, backtracking may occur.

Definition (K-D Tree)

Given:
- k types T1, ..., Tk for the attributes A1, ..., Ak,
- a case base CB containing cases in T1 x ... x Tk, and
- a parameter b (the bucket size),

a k-d tree T(CB) for the case base CB is a binary tree defined as follows:
- If |CB| <= b, then T(CB) is a leaf node (a bucket).
- Else T(CB) is a tree whose root is marked with an attribute Ai and a value v of Ai, and whose left and right subtrees are the two k-d trees T({c in CB : c.Ai < v}) and T({c in CB : c.Ai >= v}).

Example

[Figure: a k-d tree partitioning the plane A1 x A2 (coordinates in [0, 100]) over the cities Denver, Chicago, Omaha, Mobile, Toronto, Buffalo, Atlanta, and Miami; the query asks for the closest city to P = (32, 45).]

Notes:
- Supports Euclidean distance.
- May require backtracking.
- Priority lists are used for computing kNN.

Using Decision Trees as Index

- Standard decision tree: each inner node tests an attribute Ai and has one branch per value v1, v2, ..., vn.
- Variant (InReCA tree): adds an extra "unknown" branch alongside v1, ..., vn.
- Can be combined with numeric attributes.

Notes:
- Supports Hamming distance.
- May require backtracking.
- Operates in a similar fashion to k-d trees.
- Priority lists are used for computing kNN.

Variation: Point Quadtree

Particularly suited for performing range search.
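The k-d tree construction (with buckets of size b) and the backtracking kNN search with a priority list can be sketched as follows. This is a minimal sketch, not the lecture's exact algorithm: the dict-based node layout, the cycling median split, and the Euclidean metric are illustrative assumptions, and the median split only approximates the "< v" / ">= v" partition when duplicate key values occur.

```python
import heapq
import math

def build_kdtree(cases, depth=0, bucket_size=2):
    """Build a k-d tree over equal-length numeric tuples.

    Leaves are buckets holding at most bucket_size cases (parameter b);
    inner nodes split on one attribute (cycled by depth) at a chosen value v.
    """
    if len(cases) <= bucket_size:
        return {"bucket": list(cases)}
    axis = depth % len(cases[0])              # cycle through attributes A1..Ak
    cases = sorted(cases, key=lambda c: c[axis])
    mid = len(cases) // 2
    return {
        "axis": axis,
        "value": cases[mid][axis],            # split value v for this attribute
        "left": build_kdtree(cases[:mid], depth + 1, bucket_size),   # ~ c.Ai < v
        "right": build_kdtree(cases[mid:], depth + 1, bucket_size),  # ~ c.Ai >= v
    }

def knn(tree, p, k):
    """Return the k nearest neighbours of p (Euclidean), with backtracking."""
    best = []  # the 'priority list': a max-heap of the top-k so far (negated distances)

    def visit(node):
        if "bucket" in node:                  # leaf: score every case in the bucket
            for c in node["bucket"]:
                d = math.dist(p, c)
                if len(best) < k:
                    heapq.heappush(best, (-d, c))
                elif d < -best[0][0]:         # closer than the current k-th best
                    heapq.heapreplace(best, (-d, c))
            return
        axis, v = node["axis"], node["value"]
        near, far = ((node["left"], node["right"]) if p[axis] < v
                     else (node["right"], node["left"]))
        visit(near)                           # descend toward the leaf containing p
        # Backtracking: the far subtree may still hold a closer case if the
        # splitting hyperplane is nearer than the current k-th distance.
        if len(best) < k or abs(p[axis] - v) < -best[0][0]:
            visit(far)

    visit(tree)
    return [c for _, c in sorted(best, key=lambda t: -t[0])]
```

A small usage example with made-up points (not the cities from the slide, whose coordinates are not recoverable from the preview): `knn(build_kdtree([(0, 0), (1, 1), (2, 2), (10, 10)]), (0, 0), 2)` returns the two cases nearest the query, closest first.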


