UT Dallas CS 6375 - knn


Slide 1: Machine Learning, CS 6375 (Spring 2015)
Instance-Based Learning (kNN)
Instructor: Yang Liu

Slide 2: Instance-Based Learning (IBL)
• Unlike most learning algorithms, case-based approaches (also called exemplar-based or instance-based) do not construct an abstract hypothesis; instead they base the classification of test instances on similarity to specific training instances.
• Training is typically very simple: just store the training instances.
• Generalization is postponed until a new instance must be classified, which is why case-based methods are sometimes called "lazy" learners.
• Given a new instance, its relationship to the stored examples is examined in order to assign a target function value to the new instance.

Slide 3: Distance Metrics
• Case-based methods assume a function for calculating the (dis)similarity of two instances:
  – similarity metric: computes similarity between instances
  – distance metric: computes dissimilarity between instances
• For continuous feature vectors, just use Euclidean distance:
  Dist(c_1, c_2) = \sqrt{\sum_{i=1}^{N} (attr_i(c_1) - attr_i(c_2))^2}
• For discrete features, just assume the distance between two values is 0 if they are the same and 1 if they differ.
• To compensate for differences in units, scale all continuous values so that they lie between 0 and 1.

Slide 4: 1-Nearest Neighbor
• Works reasonably well if there is no attribute or class noise.
• Find the nearest stored instance, j^* = \arg\min_j Dist(c_j, c_{test}), and predict its label: prediction(c_{test}) = class_{j^*} (or value_{j^*}).

Slide 5: 1-Nearest Neighbor (graphically)
(figure slide)

Slide 6: Decision Boundary for k = 1
• The boundary is piecewise linear; each piece is a hyperplane perpendicular to the bisector of a pair of points from different classes (a Voronoi diagram).

Slide 7: k-Nearest Neighbor
• Calculate the distance between the test instance and every training instance.
• Pick the k closest training examples and assign the test example to the most common category (or the average value) among these "nearest" neighbors.
• Voting over multiple neighbors helps increase resistance to noise.
• For binary classification tasks, odd values of k are normally used to avoid ties.

Slide 8: k-Nearest Neighbor Using a Majority Voting Scheme
(figure slide)

Slide 9: Effect of "k"
• Large k:
  – less sensitive to noise
  – better probability estimates for discrete classes
  – larger training sets allow larger values of k
• Small k:
  – captures the fine structure of the space better
  – may be necessary with small training sets
• A balance must be struck between large and small k.

Slide 10: Effect of "k"
(figure slide)

Slide 11: (figure from Hastie, Tibshirani, and Friedman 2001, p. 419)

Slide 12: Distance-Weighted kNN
• The tradeoff between small and large k can be difficult: use a large k, but place more emphasis on nearer neighbors.
• prediction_{test} = \frac{\sum_{i=1}^{k} w_i \cdot class_i}{\sum_{i=1}^{k} w_i}  (or \frac{\sum_{i=1}^{k} w_i \cdot value_i}{\sum_{i=1}^{k} w_i} for regression),  with w_i = \frac{1}{Dist(c_i, c_{test})}
• With distance weighting we can even consider using all the training instances (Shepard's method). (A code sketch of plain and distance-weighted k-NN follows slide 14 below.)

Slide 13: k-NN Using a Weighted-Sum Voting Scheme
(figure slide)

Slide 14: Nearest Neighbor Classification
Key components:
• "Similar" item: we need a functional definition of "similarity" if we want to apply this automatically.
• How many neighbors do we consider?
• Does each neighbor get the same weight?
• How do we make the final decision?
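The pieces from slides 3, 4, 7, and 12 (Euclidean distance, 1-NN, majority-vote k-NN, and distance-weighted voting) fit together in a few lines. Below is a minimal NumPy sketch under our own naming (euclidean, knn_predict, and the toy arrays are illustrative assumptions, not from the slides); it assumes features are already scaled and labels are discrete classes.

```python
# Minimal k-NN sketch (assumed names; illustrative only, not the course implementation).
import numpy as np
from collections import Counter

def euclidean(c1, c2):
    """Dist(c1, c2) = sqrt(sum_i (attr_i(c1) - attr_i(c2))^2), as on slide 3."""
    return np.sqrt(np.sum((c1 - c2) ** 2))

def knn_predict(X_train, y_train, x_test, k=1, weighted=False):
    """Majority vote over the k nearest neighbors (slide 7); k=1 gives 1-NN (slide 4).
    weighted=True uses w_i = 1 / Dist(c_i, c_test), as on slide 12."""
    dists = np.array([euclidean(x, x_test) for x in X_train])
    nearest = np.argsort(dists)[:k]           # indices of the k closest training points
    if not weighted:
        # Most common class among the k neighbors (ties broken arbitrarily).
        return Counter(y_train[i] for i in nearest).most_common(1)[0][0]
    # Distance-weighted vote: accumulate w_i per class, predict the heaviest class.
    votes = {}
    for i in nearest:
        w = 1.0 / (dists[i] + 1e-12)          # small epsilon avoids division by zero
        votes[y_train[i]] = votes.get(y_train[i], 0.0) + w
    return max(votes, key=votes.get)

# Toy usage with two 2-D classes.
X = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1]])
y = np.array([0, 0, 1, 1])
print(knn_predict(X, y, np.array([0.2, 0.1]), k=3))                  # expected: 0
print(knn_predict(X, y, np.array([0.8, 0.9]), k=3, weighted=True))   # expected: 1
```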
Slide 15: k-NN vs. Other Techniques
• Instance-based methods do not need a training phase, unlike decision trees and Bayes classifiers.
• However, the nearest-neighbor-search step can be expensive for large or high-dimensional datasets.
• Instance-based learning is non-parametric, i.e., it makes no prior model assumptions.
• There is no foolproof way to pre-select k … but it can be selected using cross-validation (later lectures).

Slide 16: Problems
• Can be slow to find the nearest neighbor in a high-dimensional space (must check against each instance).
• Need to store all the training data.
• Need to specify the distance function.
• No probabilistic output.

Slide 17: Improve Speed and Space
• Compute partial distances using a subset of the dimensions, and eliminate points whose partial distance already exceeds the full distance to the current closest points.
• Use hierarchically structured search trees so that only a subset of the training points is considered during search (a SciPy-based sketch appears after slide 21).
• Edit the training set by eliminating points that are surrounded by other training points with the same class label (keep those close to the boundaries).

Slide 18: Comment on Similarity
• Similarity is hard to define.
• For real-valued feature vectors, Euclidean distance can be used.
• Scale matters.

Slide 19: Similarity
• The Mahalanobis distance can put different weights on different dimensions (Σ is the covariance matrix):
  d(u, v) = \sqrt{(u - v)^T \Sigma^{-1} (u - v)}
• Euclidean distance is the special case Σ = I.
• If the covariance matrix is diagonal:
  d(u, v) = \sqrt{\sum_{i=1}^{N} \frac{(u_i - v_i)^2}{\sigma_i^2}}
  (see the Mahalanobis sketch below)

Slide 20: Feature Normalization
• Be careful about scaling.
• Feature normalization can be used to approximately equalize the ranges of the features so that they have roughly the same effect in the distance computation.
• Methods (see the normalization sketch below):
  – linear scale to unit range
  – linear scale to unit variance
  – rank normalization

Slide 21: Lazy and Eager Learning
• Lazy: wait for the query before generalizing
  – kNN
• Eager: generalize before seeing the query
  – ID3, Naïve Bayes, neural networks
• Difference?
  – An eager learner must create a global approximation.
  – A lazy learner can create many local approximations and may represent more complex functions.
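Slide 17's search-tree idea is available off the shelf. The sketch below uses scipy.spatial.cKDTree (assuming SciPy is installed) as one possible realization of a hierarchically structured search; the data, labels, and variable names are our own toy choices, not anything from the course.

```python
# Sketch of the search-tree speed-up from slide 17, assuming SciPy is available.
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(0)
X_train = rng.random((10_000, 5))                  # 10k training points in 5-D
y_train = rng.integers(0, 2, size=10_000)          # binary labels

tree = cKDTree(X_train)                            # build the tree once
x_test = rng.random(5)

# Query the k nearest neighbors without scanning every training point.
dist, idx = tree.query(x_test, k=5)
prediction = np.bincount(y_train[idx]).argmax()    # majority vote over the 5 neighbors
print(prediction)
```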

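Slide 19's Mahalanobis distance follows directly from the formula. The helper below is our own illustrative code (the name mahalanobis and the toy matrix are assumptions); it estimates Σ from data with np.cov and shows that Σ = I reduces to plain Euclidean distance.

```python
# Mahalanobis distance sketch for slide 19 (assumed helper name, NumPy only).
import numpy as np

def mahalanobis(u, v, cov):
    """d(u, v) = sqrt((u - v)^T Sigma^{-1} (u - v))."""
    diff = u - v
    return float(np.sqrt(diff @ np.linalg.inv(cov) @ diff))

# Toy data with very different feature scales (columns = features).
X = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 33.0], [4.0, 41.0]])
cov = np.cov(X, rowvar=False)          # estimate Sigma from the data

u, v = X[0], X[3]
print(mahalanobis(u, v, cov))          # accounts for the different feature scales
print(mahalanobis(u, v, np.eye(2)))    # Sigma = I recovers Euclidean distance
```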

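For slide 20, each listed normalization is a short transform. The sketch below (assumed helper names, NumPy only) implements linear scaling to unit range, linear scaling to unit variance, and one common reading of rank normalization; the slides do not pin down the exact rank variant.

```python
# Feature-normalization sketch for slide 20 (assumed helper names).
import numpy as np

def unit_range(X):
    """Linear scale to unit range: each column mapped to [0, 1]."""
    lo, hi = X.min(axis=0), X.max(axis=0)
    return (X - lo) / (hi - lo)

def unit_variance(X):
    """Linear scale to unit variance: zero mean, standard deviation 1 per column."""
    return (X - X.mean(axis=0)) / X.std(axis=0)

def rank_normalize(X):
    """Rank normalization: replace each value by its within-column rank, scaled to [0, 1]."""
    ranks = X.argsort(axis=0).argsort(axis=0)
    return ranks / (X.shape[0] - 1)

X = np.array([[1.0, 100.0], [2.0, 400.0], [3.0, 900.0]])
print(unit_range(X))      # both columns now span [0, 1]
print(unit_variance(X))   # both columns now have standard deviation 1
print(rank_normalize(X))  # ranks 0, 0.5, 1 in each column
```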