UT CS 343 - Machine Learning

Contents
CS 343: Artificial Intelligence: Machine Learning
What is Learning?
Classification
Problem Solving / Planning / Control
Sample Category Learning Problem
Hypothesis Selection
Generalization
Hypothesis Space
Inductive Learning Hypothesis
Evaluation of Classification Learning
Category Learning as Search
Learning by Enumeration
Efficient Learning
Conjunctive Rule Learning
Limitations of Conjunctive Rules
Decision Trees
Properties of Decision Tree Learning
Top-Down Decision Tree Induction
Slide 19
Decision Tree Induction Pseudocode
Picking a Good Split Feature
Entropy
Entropy Plot for Binary Classification
Information Gain
Hypothesis Space Search
Another Red-Circle Data Set
Weka J48 Trace 1
A Data Set Requiring Disjunction
Weka J48 Trace 2
Weka J48 Trace 3
Evaluating Inductive Hypotheses
K-Fold Cross Validation
K-Fold Cross Validation Comments
Learning Curves
Cross Validation Learning Curves

Slide 1: CS 343: Artificial Intelligence: Machine Learning
Raymond J. Mooney
University of Texas at Austin

Slide 2: What is Learning?
• Herbert Simon: “Learning is any process by which a system improves performance from experience.”
• What is the task?
  – Classification
  – Problem solving / planning / control

Slide 3: Classification
• Assign object/event to one of a given finite set of categories.
  – Medical diagnosis
  – Credit card applications or transactions
  – Fraud detection in e-commerce
  – Worm detection in network packets
  – Spam filtering in email
  – Recommended articles in a newspaper
  – Recommended books, movies, music, or jokes
  – Financial investments
  – DNA sequences
  – Spoken words
  – Handwritten letters
  – Astronomical images

Slide 4: Problem Solving / Planning / Control
• Performing actions in an environment in order to achieve a goal.
  – Solving calculus problems
  – Playing checkers, chess, or backgammon
  – Balancing a pole
  – Driving a car or a jeep
  – Flying a plane, helicopter, or rocket
  – Controlling an elevator
  – Controlling a character in a video game
  – Controlling a mobile robot

Slide 5: Sample Category Learning Problem
• Instance language: <size, color, shape>
  – size ∈ {small, medium, large}
  – color ∈ {red, blue, green}
  – shape ∈ {square, circle, triangle}
• C = {positive, negative}
• D:
    Example  Size   Color  Shape     Category
    1        small  red    circle    positive
    2        large  red    circle    positive
    3        small  red    triangle  negative
    4        large  blue   circle    negative

Slide 6: Hypothesis Selection
• Many hypotheses are usually consistent with the training data.
  – red & circle
  – (small & circle) or (large & red)
  – (small & red & circle) or (large & red & circle)
  – not [ (red & triangle) or (blue & circle) ]
  – not [ (small & red & triangle) or (large & blue & circle) ]
• Bias
  – Any criterion other than consistency with the training data that is used to select a hypothesis.

Slide 7: Generalization
• Hypotheses must generalize to correctly classify instances not in the training data.
• Simply memorizing training examples is a consistent hypothesis that does not generalize.
• Occam’s razor:
  – Finding a simple hypothesis helps ensure generalization.

Slide 8: Hypothesis Space
• Restrict learned functions a priori to a given hypothesis space, H, of functions h(x) that can be considered as definitions of c(x).
• For learning concepts on instances described by n discrete-valued features, consider the space of conjunctive hypotheses represented by a vector of n constraints <c1, c2, …, cn> where each ci is either:
  – ?, a wild card indicating no constraint on the i-th feature
  – a specific value from the domain of the i-th feature
  – Ø, indicating no value is acceptable
• Sample conjunctive hypotheses:
  – <big, red, ?>
  – <?, ?, ?> (most general hypothesis)
  – <Ø, Ø, Ø> (most specific hypothesis)
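The constraint-vector representation on the Hypothesis Space slide maps directly onto code. The following is a minimal sketch, not taken from the slides; the names `covers` and `EMPTY` and the example calls are illustrative assumptions for the <size, color, shape> instance language defined above.

```python
# Sketch of the conjunctive-hypothesis representation described above.
# Each hypothesis is a tuple of per-feature constraints:
#   "?"    -> wild card: no constraint on this feature
#   value  -> the feature must equal this specific value
#   EMPTY  -> no value is acceptable (the slides' Ø)

EMPTY = object()  # sentinel standing in for Ø

def covers(hypothesis, instance):
    """Return True if the conjunctive hypothesis matches the instance."""
    return all(
        c == "?" or (c is not EMPTY and c == v)
        for c, v in zip(hypothesis, instance)
    )

instance = ("small", "red", "circle")               # a <size, color, shape> instance

print(covers(("?", "red", "circle"), instance))     # True:  red & circle
print(covers(("?", "?", "?"), instance))            # True:  most general hypothesis
print(covers((EMPTY, EMPTY, EMPTY), instance))      # False: most specific hypothesis
```

Using a distinct sentinel object for Ø keeps it from colliding with any legitimate feature value or with the "?" wild card.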
Slide 9: Inductive Learning Hypothesis
• Any function that is found to approximate the target concept well on a sufficiently large set of training examples will also approximate the target function well on unobserved examples.
• Assumes that the training and test examples are drawn independently from the same underlying distribution.
• This is a fundamentally unprovable hypothesis unless additional assumptions are made about the target concept and the notion of “approximating the target function well on unobserved examples” is defined appropriately (cf. computational learning theory).

Slide 10: Evaluation of Classification Learning
• Classification accuracy (% of instances classified correctly).
  – Measured on independent test data.
• Training time (efficiency of training algorithm).
• Testing time (efficiency of subsequent classification).

Slide 11: Category Learning as Search
• Category learning can be viewed as searching the hypothesis space for one (or more) hypotheses that are consistent with the training data.
• Consider an instance space consisting of n binary features, which therefore has 2^n instances.
• For conjunctive hypotheses, there are 4 choices for each feature (Ø, T, F, ?), so there are 4^n syntactically distinct hypotheses.
• However, all hypotheses with one or more Øs are equivalent, so there are 3^n + 1 semantically distinct hypotheses.
• The target binary categorization function could in principle be any of the 2^(2^n) possible functions on n input bits.
• Therefore, conjunctive hypotheses are a small subset of the space of possible functions, but both are intractably large.
• All reasonable hypothesis spaces are intractably large or even infinite.

Slide 12: Learning by Enumeration
• For any finite or countably infinite hypothesis space, one can simply enumerate and test hypotheses one at a time until a consistent one is found (a runnable sketch of this loop appears after the last slide below):
    For each h in H do:
        if h is consistent with the training data D,
        then terminate and return h.
• This algorithm is guaranteed to terminate with a consistent hypothesis if one exists; however, it is obviously computationally intractable for almost any practical problem.

Slide 13: Efficient Learning
• Is there a way to learn conjunctive concepts without enumerating them?
• How do human subjects learn conjunctive concepts?
• Is there a way to efficiently find an unconstrained boolean function consistent with a set of discrete-valued training instances?
• If so, is it a useful/practical algorithm?

Slide 14: Conjunctive Rule Learning
• Conjunctive descriptions are easily learned by finding all commonalities shared by all positive examples.
• Must check consistency with negative examples; if inconsistent, no conjunctive rule exists. (A code sketch of this procedure also appears after the last slide below.)
    Example  Size   Color  Shape     Category
    1        small  red    circle    positive
    2        large  red    circle    positive
    3        small  red    triangle  negative
    4        large  blue   circle    negative
• Learned rule: red & circle → positive

Slide 15: Limitations of Conjunctive Rules
• If a …
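Referring back to the Learning by Enumeration slide: the sketch below is a small, hypothetical Python rendering of the enumerate-and-test loop for the conjunctive hypothesis space over the <size, color, shape> language. The function names and the decision to skip Ø-hypotheses are assumptions, not part of the slides.

```python
from itertools import product

# Learning by enumeration: test hypotheses one at a time and return the first
# one consistent with the training data D, or None if no consistent one exists.
# A hypothesis is a tuple of constraints; "?" is the wild card.

DOMAINS = [("small", "medium", "large"),       # size
           ("red", "blue", "green"),           # color
           ("square", "circle", "triangle")]   # shape

def covers(hyp, x):
    return all(c == "?" or c == v for c, v in zip(hyp, x))

def consistent(hyp, D):
    """Consistent = covers every positive example and no negative example."""
    return all(covers(hyp, x) == (label == "positive") for x, label in D)

def enumerate_and_test(D):
    # Enumerate the conjunctive hypothesis space (Ø-hypotheses are omitted for
    # brevity; they cover nothing, so they cannot be consistent with positives).
    H = product(*[values + ("?",) for values in DOMAINS])
    for h in H:
        if consistent(h, D):
            return h
    return None

D = [(("small", "red", "circle"), "positive"),
     (("large", "red", "circle"), "positive"),
     (("small", "red", "triangle"), "negative"),
     (("large", "blue", "circle"), "negative")]

print(enumerate_and_test(D))   # ('?', 'red', 'circle'), i.e. red & circle -> positive
```

Even for this toy language the loop examines up to 4^3 = 64 hypotheses, and the count grows as 4^n in the number of features, which is the intractability point made on the Category Learning as Search slide.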
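For the Conjunctive Rule Learning slide, here is a minimal sketch of the “find all commonalities among the positives, then check the negatives” procedure. It is illustrative only; the function name `learn_conjunctive_rule` and the data encoding are assumptions.

```python
# Conjunctive rule learning: generalize over the positive examples feature by
# feature (disagreements become the wild card "?"), then verify that the
# resulting rule covers no negative example.

def learn_conjunctive_rule(examples):
    """Return a conjunctive hypothesis consistent with the data, or None."""
    positives = [x for x, label in examples if label == "positive"]
    negatives = [x for x, label in examples if label == "negative"]
    if not positives:
        return None

    rule = list(positives[0])
    for x in positives[1:]:
        for i, value in enumerate(x):
            if rule[i] != value:
                rule[i] = "?"      # positives disagree here: drop the constraint

    def covers(hyp, x):
        return all(c == "?" or c == v for c, v in zip(hyp, x))

    if any(covers(rule, x) for x in negatives):
        return None                # no conjunctive rule fits this data
    return tuple(rule)

D = [(("small", "red", "circle"), "positive"),
     (("large", "red", "circle"), "positive"),
     (("small", "red", "triangle"), "negative"),
     (("large", "blue", "circle"), "negative")]

print(learn_conjunctive_rule(D))   # ('?', 'red', 'circle'), i.e. red & circle -> positive
```

This makes a single generalizing pass over the positives instead of enumerating the hypothesis space, which is the kind of efficiency the Efficient Learning slide asks about.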

