Instance-Based Learning
CS472/CS473 – Fall 2005

What is Learning?
• Examples:
– Riding a bike (motor skills)
– Telephone number (memorizing)
– Reading a textbook (memorizing and operationalizing rules)
– Playing backgammon (strategy)
– Developing a scientific theory (abstraction)
– Language
– Recognizing fraudulent credit card transactions
– Etc.

(One) Definition of Learning
Definition [Mitchell]: A computer program is said to learn from
• experience E with respect to some class of
• tasks T and
• performance measure P,
if its performance at tasks in T, as measured by P, improves with experience E.

Examples
• Spam filtering
– T: classify emails as HAM / SPAM
– E: examples (e1, HAM), (e2, SPAM), (e3, HAM), (e4, SPAM), ...
– P: probability of error on new emails
• Personalized retrieval
– T: find the documents the user wants for a query
– E: watch the person use Google (queries / clicks)
– P: number of relevant docs in the top 10
• Playing checkers
– T: play checkers
– E: games against self
– P: percentage of wins

How can an Agent Learn?
Learning strategies and settings:
• rote learning
• learning from instruction
• learning by analogy
• learning from observation and discovery
• learning from examples
– Carbonell, Michalski & Mitchell

Inductive Learning / Concept Learning
• Task:
– Learn (to imitate) a function f: X → Y
• Training examples:
– The learning algorithm is given the correct value of the function for particular inputs → training examples
– An example is a pair (x, f(x)), where x is the input and f(x) is the output of the function applied to x.
• Goal:
– Learn a function h: X → Y that approximates f: X → Y as well as possible.

Concept Learning Example

  Food (3)   Chat (2)  Fast (2)  Price (3)  Bar (2)  BigTip
  great      yes       yes       normal    no       yes
  great      no        yes       normal    no       yes
  mediocre   yes       no        high      no       no
  great      yes       yes       normal    yes      yes

• Instance Space X: the set of all possible objects described by attributes (often called features)
• Target Function f: the mapping from attributes to the target feature (often called the label); f is unknown
• Hypothesis Space H: the set of all classification rules h_i we allow
• Training Data D: the set of instances labeled with the target feature

Classification and Regression Tasks
• Naming: if Y is the real numbers, the task is called "regression"; if Y is a discrete set, it is called "classification".
• Examples:
– Steering a vehicle: image in windshield → direction to turn the wheel (and how far)
– Medical diagnosis: patient symptoms → has disease / does not have disease
– Forensic hair comparison: image of two hairs → match or not
– Stock market prediction: closing prices of the last few days → market will go up or down tomorrow (and by how much)
– Noun phrase coreference: description of two noun phrases in a document → do they refer to the same real-world entity?

Inductive Learning Algorithm
• Task:
– Given: a collection of examples
– Return: a function h (hypothesis) that approximates f
• Inductive Learning Hypothesis: any hypothesis found to approximate the target function well over a sufficiently large set of training examples will also approximate the target function well over other, unobserved examples.
• Assumptions of inductive learning:
– The training sample represents the population.
– The input features permit discrimination.

Inductive Learning Setting
• Task: the learner induces a general rule h: X → Y from a set of observed examples; h is then applied to new examples and should classify them accurately.

Instance-Based Learning
• Idea:
– Similar examples have similar labels.
– Classify new examples like similar training examples.
• Algorithm:
– Given a new example x for which we need to predict its class y,
– find the training examples most similar to x,
– and classify x "like" these most similar examples.
• Questions:
– How do we determine similarity?
– How many similar training examples should we consider?
– How do we resolve inconsistencies among the training examples?

K-Nearest Neighbor (KNN)
• Given: training data
– Attribute vectors: x_1, ..., x_n
– Target attribute values: y_1, ..., y_n
• Parameters:
– Similarity function: a distance D(x, x') between examples
– Number of nearest neighbors to consider: k
• Prediction rule (sketched in code below):
– Given a new example x', find its k nearest neighbors: the k training examples x_i with the smallest distance D(x_i, x')
– Predict the target value that is most common among these k neighbors
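The prediction rule above fits in a few lines of code. The following is a minimal sketch, not from the original slides: it assumes Python, a generic user-supplied distance function, and ties broken arbitrarily by Counter.most_common.

    from collections import Counter

    def knn_predict(train_x, train_y, distance, x_new, k):
        """Predict the label of x_new by majority vote among its
        k nearest training examples under the given distance."""
        # Sort training pairs by distance to the new example.
        neighbors = sorted(zip(train_x, train_y),
                           key=lambda pair: distance(pair[0], x_new))
        # Majority vote over the labels of the k nearest neighbors.
        votes = Counter(label for _, label in neighbors[:k])
        return votes.most_common(1)[0][0]

Sorting the whole training set is O(n log n) per query; it keeps the sketch short, and it is also why the slides later note that KNN is slow at prediction time for large training sets.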
KNN Example
• New examples:
– (great, no, no, normal, no)
– (mediocre, yes, no, normal, no)

    Food (3)   Chat (2)  Fast (2)  Price (3)  Bar (2)  BigTip
  1 great      yes       yes       normal    no       yes
  2 great      no        yes       normal    no       yes
  3 mediocre   yes       no        high      no       no
  4 great      yes       yes       normal    yes      yes

(A worked version of this example appears after these slides.)

Types of Attributes
• Symbolic (nominal)
– EyeColor ∈ {brown, blue, green}
• Boolean
– anemic ∈ {TRUE, FALSE}
• Numeric
– Integer: age ∈ [0, 105]
– Real: length
• Structural
– Natural language sentence: parse tree
– Protein: sequence of amino acids

KNN for Real-Valued Attributes
• Similarity functions:
– Gaussian
– Cosine
(Standard forms of both are sketched after these slides.)
[Figure: training examples from two classes, + and o, plotted in attribute_1 / attribute_2 space]

Selecting the Number of Neighbors
• Increasing k makes KNN less sensitive to noise.
• Decreasing k allows capturing finer structure of the instance space.
⇒ Pick k not too large, but not too small (depends on the data; a selection sketch follows after these slides).
[Figure: the same two-class scatter plot, illustrating the effect of k]

Example: Effect of k
[Figure: KNN decision boundaries for different values of k; from Hastie, Tibshirani & Friedman, 2001]

Advantages and Disadvantages of KNN
• Simple algorithm.
• Needs a similarity measure and attributes that "match" the target function.
• For large training sets, it requires a lot of memory and is slow when making a prediction.
• Prediction accuracy can quickly degrade as the number of attributes grows.

Curse of Dimensionality
• Prediction accuracy can quickly degrade as the number of attributes grows.
– Irrelevant attributes easily "swamp" the information from relevant attributes.
⇒ With many irrelevant attributes, the similarity measure becomes less reliable.
• Remedies:
– Try to remove irrelevant attributes in a pre-processing step.
– Weight attributes differently (sketched after these slides).
– Increase k (but not too much).

Remarks on KNN
• KNN memorizes all observed instances and their classes.
• Is this rote learning?
• Is this really learning?
• When does the induction take place?
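A worked version of the KNN Example slide, reusing the knn_predict sketch above. The slides do not specify the similarity function for these symbolic attributes; Hamming distance (the number of attribute values on which two examples disagree) is one plausible choice and is assumed here.

    # Training data from the KNN Example table: (Food, Chat, Fast, Price, Bar) -> BigTip
    train_x = [("great",    "yes", "yes", "normal", "no"),
               ("great",    "no",  "yes", "normal", "no"),
               ("mediocre", "yes", "no",  "high",   "no"),
               ("great",    "yes", "yes", "normal", "yes")]
    train_y = ["yes", "yes", "no", "yes"]

    # Hamming distance: count of mismatched attribute values,
    # so "smallest distance" means "most matching values".
    def hamming(a, b):
        return sum(ai != bi for ai, bi in zip(a, b))

    for x_new in [("great",    "no",  "no", "normal", "no"),
                  ("mediocre", "yes", "no", "normal", "no")]:
        print(x_new, "->", knn_predict(train_x, train_y, hamming, x_new, k=1))

With k = 1, the first new example matches training example 2 on four of five attributes (distance 1) and is classified "yes"; the second matches training example 3 on four of five attributes and is classified "no".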


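The "KNN for Real-Valued Attributes" slide names a Gaussian and a cosine similarity; the formulas themselves did not survive in this preview, so the sketch below uses the standard textbook forms. The bandwidth sigma is an assumed parameter, not a value from the slides. Note that these are similarities (larger means more alike), so the k nearest neighbors are the k examples with the largest score; alternatively, 1 minus the similarity can serve as a distance for knn_predict above.

    import math

    def gaussian_similarity(x, xp, sigma=1.0):
        """Standard Gaussian (RBF) similarity: exp(-||x - x'||^2 / (2 sigma^2))."""
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, xp))
        return math.exp(-sq_dist / (2 * sigma ** 2))

    def cosine_similarity(x, xp):
        """Standard cosine similarity: (x . x') / (||x|| ||x'||).
        Compares directions of the attribute vectors, ignoring magnitude."""
        dot = sum(a * b for a, b in zip(x, xp))
        norms = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in xp))
        return dot / norms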
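The "Selecting the Number of Neighbors" slide leaves open how to pick k. A common way to operationalize "not too large, not too small", and an assumption here rather than anything the slides prescribe, is to score candidate values on held-out data; leave-one-out error is a natural fit for KNN since no retraining is needed.

    def loo_error(train_x, train_y, distance, k):
        """Leave-one-out error: predict each training example from
        all the others and return the fraction of mistakes."""
        errors = 0
        for i in range(len(train_x)):
            rest_x = train_x[:i] + train_x[i + 1:]
            rest_y = train_y[:i] + train_y[i + 1:]
            if knn_predict(rest_x, rest_y, distance, train_x[i], k) != train_y[i]:
                errors += 1
        return errors / len(train_x)

    # Pick the k with the lowest leave-one-out error, e.g.:
    # best_k = min([1, 3, 5, 7], key=lambda k: loo_error(train_x, train_y, hamming, k))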

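Among the curse-of-dimensionality remedies, "weight attributes differently" can be made concrete with a weighted distance. A hedged sketch: the weights here are hypothetical and would in practice be set by hand or tuned on validation data, for instance with loo_error above.

    def weighted_euclidean(weights):
        """Build a distance that scales each real-valued attribute by a
        weight; a weight of 0 removes an attribute entirely."""
        def distance(x, xp):
            return sum(w * (a - b) ** 2
                       for w, a, b in zip(weights, x, xp)) ** 0.5
        return distance

    # e.g. for three real-valued attributes where the third is suspected irrelevant:
    # d = weighted_euclidean([1.0, 1.0, 0.0])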
