WSU CSE 6363 - Instance Based Learning

Instance-Based Learning

Topics:
• k-Nearest Neighbor
• Locally weighted regression
• Radial basis functions
• Case-based reasoning
• Lazy and eager learning

Instance-Based Learning

Key idea: simply store all training examples ⟨x_i, f(x_i)⟩.

Nearest neighbor: given a query instance x_q, first locate the nearest training example x_n, then estimate

  f̂(x_q) ← f(x_n)

k-Nearest neighbor: given x_q,
• take a vote among its k nearest neighbors (if the target function is discrete-valued), or
• take the mean of the f values of its k nearest neighbors (if real-valued):

  f̂(x_q) ← ( Σ_{i=1}^{k} f(x_i) ) / k

When To Consider Nearest Neighbor
• Instances map to points in ℝⁿ
• Fewer than 20 attributes per instance
• Lots of training data

Advantages:
• Training is very fast
• Can learn complex target functions
• Doesn't lose information

Disadvantages:
• Slow at query time
• Easily fooled by irrelevant attributes

Voronoi Diagram

[Figure: the decision surface induced by 1-nearest neighbor over a set of + and − training points, with a query point x_q; the surface is the Voronoi diagram of the training set.]

Behavior in the Limit

Let p(x) be the probability that instance x is labeled 1 (positive) rather than 0 (negative).

Nearest neighbor: as the number of training examples → ∞, it approaches the Gibbs algorithm.
• Gibbs: with probability p(x) predict 1, else 0.

k-Nearest neighbor: as the number of training examples → ∞ and k grows large, it approaches Bayes optimal.
• Bayes optimal: if p(x) > 0.5 then predict 1, else 0.

Note: Gibbs has at most twice the expected error of Bayes optimal.

Distance-Weighted kNN

We might want to weight nearer neighbors more heavily:

  f̂(x_q) ← ( Σ_{i=1}^{k} w_i f(x_i) ) / ( Σ_{i=1}^{k} w_i ),  where  w_i ≡ 1 / d(x_q, x_i)²

and d(x_q, x_i) is the distance between x_q and x_i.

Note that it now makes sense to use all training examples instead of just k → Shepard's method.

Curse of Dimensionality

Imagine instances described by 20 attributes, of which only 2 are relevant to the target function. Curse of dimensionality: nearest neighbor is easily misled when X is high-dimensional.

One approach:
• Stretch the jth axis by weight z_j, where z_1, ..., z_n are chosen to minimize prediction error
• Use cross-validation to automatically choose the weights z_1, ..., z_n
• Note that setting z_j to zero eliminates that dimension altogether

See [Moore and Lee, 1994].
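The distance-weighted rule above can be sketched in a few lines. This is a minimal sketch, not part of the slides: `weighted_knn_predict` is a hypothetical helper name, and Euclidean distance with a real-valued target are illustrative assumptions.

```python
import numpy as np

def weighted_knn_predict(X_train, y_train, x_q, k=3):
    """Distance-weighted k-NN estimate f^(x_q) for a real-valued target.

    Each of the k nearest neighbors is weighted by w_i = 1 / d(x_q, x_i)^2.
    """
    d = np.linalg.norm(X_train - x_q, axis=1)   # d(x_q, x_i) for every training point
    nearest = np.argsort(d)[:k]                 # indices of the k nearest neighbors
    if d[nearest[0]] == 0.0:                    # exact match: f^(x_q) <- f(x_n)
        return float(y_train[nearest[0]])
    w = 1.0 / d[nearest] ** 2                   # w_i = 1 / d(x_q, x_i)^2
    return float(np.sum(w * y_train[nearest]) / np.sum(w))
```

With w_i = 1/d², an exact match would divide by zero, so the sketch returns the stored value directly in that case, which is what the nearest-neighbor rule implies when d = 0.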
Locally Weighted Regression

Note that kNN forms a local approximation to f for each query point x_q. Why not form an explicit approximation f̂(x) for the region surrounding x_q?
• Fit a linear function to the k nearest neighbors
• Fit a quadratic, ...
• Produces a "piecewise approximation" to f

Several choices of error to minimize:
• Squared error over the k nearest neighbors:

  E1(x_q) ≡ (1/2) Σ_{x ∈ k nearest nbrs of x_q} (f(x) − f̂(x))²

• Distance-weighted squared error over all neighbors:

  E2(x_q) ≡ (1/2) Σ_{x ∈ D} (f(x) − f̂(x))² K(d(x_q, x))

• ...

Radial Basis Function Networks
• Global approximation to the target function, in terms of a linear combination of local approximations
• Used, e.g., for image classification
• A different kind of neural network
• Closely related to distance-weighted regression, but "eager" instead of "lazy"

Radial Basis Function Networks

[Figure: an RBF network with inputs a_1(x), ..., a_n(x), one hidden unit per kernel, and output weights w_0, w_1, ..., w_k feeding f(x).]

Here the a_i(x) are the attributes describing instance x, and

  f(x) = w_0 + Σ_{u=1}^{k} w_u K_u(d(x_u, x))

One common choice for K_u(d(x_u, x)) is the Gaussian

  K_u(d(x_u, x)) = e^( −d²(x_u, x) / (2σ_u²) )

Training Radial Basis Function Networks

Q1: What x_u to use for each kernel function K_u(d(x_u, x))?
• Scatter them uniformly throughout instance space
• Use one for each cluster of instances (use prototypes)
• Or use the training instances themselves (reflects the instance distribution)

Q2: How to train the weights (assume here Gaussian K_u)?
• First choose the variance (and perhaps the mean) for each K_u
  – e.g., use EM
• Then hold the K_u fixed, and train the linear output layer
  – efficient methods exist to fit a linear function

Case-Based Reasoning

Instance-based learning can be applied even when X ≠ ℝⁿ → we then need a different "distance" metric. Case-based reasoning is instance-based learning applied to instances with symbolic logic descriptions:

((user-complaint error53-on-shutdown)
 (cpu-model PowerPC)
 (operating-system Windows)
 (network-connection PCIA)
 (memory 48meg)
 (installed-applications Excel Netscape VirusScan)
 (disk 1gig)
 (likely-cause ???))

Case-Based Reasoning in CADET

CADET: 75 stored examples of mechanical devices.
• Each training example: ⟨qualitative function, mechanical structure⟩
• New query: desired function
• Target value: mechanical structure for this function
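CADET's qualitative-function matching is too involved for a short sketch, but the flavor of case retrieval over symbolic descriptions, as in the help-desk record above, can be shown with a simple attribute-overlap distance. Everything here (function names, the toy metric, the case data) is an illustrative assumption, not CADET's actual method.

```python
def case_distance(case_a, case_b):
    """Toy symbolic distance: fraction of shared attributes whose values disagree."""
    shared = set(case_a) & set(case_b)
    if not shared:
        return 1.0  # no attributes in common: treat as maximally distant
    return sum(case_a[k] != case_b[k] for k in shared) / len(shared)

def retrieve_nearest_case(library, query):
    """Return the stored case minimizing the toy distance to the query."""
    return min(library, key=lambda case: case_distance(case, query))
```

A query with a partial symptom description then retrieves the stored case whose symbolic attributes overlap it best; real CBR systems combine several retrieved cases rather than returning one.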
Distance metric: match qualitative function descriptions.

Case-Based Reasoning in CADET

[Figure: a stored case, a T-junction pipe, described by a qualitative function graph over waterflow Q and temperature T together with its structure; and a problem specification, a water faucet, giving the desired function over control signals C_t and C_f, cold and hot inputs (Q_c, T_c) and (Q_h, T_h), and mixed output (Q_m, T_m), with structure: ?]

Case-Based Reasoning in CADET
• Instances are represented by rich structural descriptions
• Multiple cases are retrieved (and combined) to form a solution to the new problem
• Tight coupling between case retrieval and problem solving

Bottom line:
• Simple matching of cases is useful for tasks such as answering help-desk queries
• This is an area of ongoing research

Lazy and Eager Learning

Lazy: wait for the query before generalizing
• k-Nearest Neighbor, case-based reasoning

Eager: generalize before seeing the query
• Radial basis function networks, ID3, C4.5, Backpropagation, Naive Bayes, ...

Does it matter?
• An eager learner creates one global approximation
• A lazy learner can create many local approximations
• If they use the same H, lazy can represent more complex functions (e.g., consider H = linear functions)
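The lazy/eager contrast can be made concrete with locally weighted regression, the lazy learner that fits a fresh local linear function per query. A minimal sketch, assuming a Gaussian kernel K of width tau for the distance weighting and a hypothetical function name; nothing is fit until x_q arrives, and the fit is used only at x_q.

```python
import numpy as np

def lwr_predict(X, y, x_q, tau=1.0):
    """Locally weighted linear regression evaluated at a single query point.

    Minimizes a distance-weighted squared error over all training points,
    with K(d(x_q, x)) = exp(-d^2 / (2 tau^2)), via weighted least squares.
    """
    d2 = ((X - x_q) ** 2).sum(axis=1)
    k = np.exp(-d2 / (2.0 * tau ** 2))          # kernel weights K(d(x_q, x))
    A = np.hstack([np.ones((len(X), 1)), X])    # design matrix with intercept
    W = np.sqrt(k)[:, None]                     # sqrt-weights for least squares
    beta, *_ = np.linalg.lstsq(W * A, W[:, 0] * y, rcond=None)
    return float(beta[0] + np.dot(beta[1:], x_q))
```

Calling this once per query is exactly the "many local approximations" behavior above; an eager learner with the same H = linear functions would instead fit one global line before any query is seen.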

