CMU CS 15381 - 041207decisionTrees2

Artificial Intelligence: Representation and Problem Solving
15-381, April 12, 2007
Decision Trees 2
Michael S. Lewicki, Carnegie Mellon

20 questions
• Consider this game of 20 questions on the web: 20Q.net Inc.

Pick your poison
• How do you decide if a mushroom is edible?
• What's the best identification strategy?
• Let's try decision trees.
[Figure: a "Death Cap" mushroom]

Some mushroom data (from the UCI machine learning repository)

  #  EDIBLE?    CAP-SHAPE  CAP-SURFACE  CAP-COLOR  ODOR   STALK-SHAPE  POPULATION  HABITAT
  1  edible     flat       fibrous      red        none   tapering     several     woods
  2  poisonous  convex     smooth       red        foul   tapering     several     paths
  3  edible     flat       fibrous      brown      none   tapering     abundant    grasses
  4  edible     convex     scaly        gray       none   tapering     several     woods
  5  poisonous  convex     smooth       red        foul   tapering     several     woods
  6  edible     convex     fibrous      gray       none   tapering     several     woods
  7  poisonous  flat       scaly       brown      fishy  tapering     several     leaves
  8  poisonous  flat       scaly       brown      spicy  tapering     several     leaves
  9  poisonous  convex     fibrous      yellow     foul   enlarging    several     paths
 10  poisonous  convex     fibrous      yellow     foul   enlarging    several     woods
 11  poisonous  flat       smooth       brown      spicy  tapering     several     woods
 12  edible     convex     smooth       yellow     anise  tapering     several     woods
 13  poisonous  knobbed    scaly        red        foul   tapering     several     leaves
 14  poisonous  flat       smooth       brown      foul   tapering     several     leaves
 15  poisonous  flat       fibrous      gray       foul   enlarging    several     woods
 16  edible     sunken     fibrous      brown      none   enlarging    solitary    urban
 17  poisonous  flat       smooth       brown      foul   tapering     several     woods
 18  poisonous  convex     smooth       white      foul   tapering     scattered   urban
 19  poisonous  flat       scaly        yellow     foul   enlarging    solitary    paths
 20  edible     convex     fibrous      gray       none   tapering     several     woods
  …  (more rows elided in the preview)
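As an aside (not from the slides), the standard way to pick the first split in a decision tree is to score each attribute by information gain. A minimal sketch over the 20-row sample above, assuming the column names and values exactly as listed in the table:

```python
from collections import Counter
from math import log2

# The 20-row sample from the slide, columns in the same order as the table.
COLUMNS = ["EDIBLE?", "CAP-SHAPE", "CAP-SURFACE", "CAP-COLOR",
           "ODOR", "STALK-SHAPE", "POPULATION", "HABITAT"]
ROWS = [
    ("edible",    "flat",    "fibrous", "red",    "none",  "tapering",  "several",   "woods"),
    ("poisonous", "convex",  "smooth",  "red",    "foul",  "tapering",  "several",   "paths"),
    ("edible",    "flat",    "fibrous", "brown",  "none",  "tapering",  "abundant",  "grasses"),
    ("edible",    "convex",  "scaly",   "gray",   "none",  "tapering",  "several",   "woods"),
    ("poisonous", "convex",  "smooth",  "red",    "foul",  "tapering",  "several",   "woods"),
    ("edible",    "convex",  "fibrous", "gray",   "none",  "tapering",  "several",   "woods"),
    ("poisonous", "flat",    "scaly",   "brown",  "fishy", "tapering",  "several",   "leaves"),
    ("poisonous", "flat",    "scaly",   "brown",  "spicy", "tapering",  "several",   "leaves"),
    ("poisonous", "convex",  "fibrous", "yellow", "foul",  "enlarging", "several",   "paths"),
    ("poisonous", "convex",  "fibrous", "yellow", "foul",  "enlarging", "several",   "woods"),
    ("poisonous", "flat",    "smooth",  "brown",  "spicy", "tapering",  "several",   "woods"),
    ("edible",    "convex",  "smooth",  "yellow", "anise", "tapering",  "several",   "woods"),
    ("poisonous", "knobbed", "scaly",   "red",    "foul",  "tapering",  "several",   "leaves"),
    ("poisonous", "flat",    "smooth",  "brown",  "foul",  "tapering",  "several",   "leaves"),
    ("poisonous", "flat",    "fibrous", "gray",   "foul",  "enlarging", "several",   "woods"),
    ("edible",    "sunken",  "fibrous", "brown",  "none",  "enlarging", "solitary",  "urban"),
    ("poisonous", "flat",    "smooth",  "brown",  "foul",  "tapering",  "several",   "woods"),
    ("poisonous", "convex",  "smooth",  "white",  "foul",  "tapering",  "scattered", "urban"),
    ("poisonous", "flat",    "scaly",   "yellow", "foul",  "enlarging", "solitary",  "paths"),
    ("edible",    "convex",  "fibrous", "gray",   "none",  "tapering",  "several",   "woods"),
]

def entropy(labels):
    """Shannon entropy of a label multiset, in bits."""
    n = len(labels)
    return -sum(c / n * log2(c / n) for c in Counter(labels).values())

def info_gain(attribute):
    """Entropy of EDIBLE? minus its expected entropy after splitting."""
    i = COLUMNS.index(attribute)
    labels = [row[0] for row in ROWS]
    remainder = 0.0
    for value in {row[i] for row in ROWS}:
        part = [row[0] for row in ROWS if row[i] == value]
        remainder += len(part) / len(ROWS) * entropy(part)
    return entropy(labels) - remainder

gains = {a: info_gain(a) for a in COLUMNS[1:]}
best = max(gains, key=gains.get)
# On these 20 rows, ODOR splits the classes perfectly ({none, anise} are all
# edible; {foul, fishy, spicy} are all poisonous), so its gain equals the
# full label entropy, about 0.934 bits.
print(best, round(gains[best], 3))
```

This anticipates the next slide: on the sample, odor alone is almost the whole story.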
An easy problem: two attributes provide most of the information

Poisonous: 44, Edible: 46
  Is ODOR almond, anise, or none?
    no  → Poisonous: 43, Edible: 0   (predict poisonous)
    yes → Poisonous: 1, Edible: 46
      Is SPORE-PRINT-COLOR green?
        yes → Poisonous: 1, Edible: 0   (predict poisonous)
        no  → Poisonous: 0, Edible: 46  (predict edible)

100% classification accuracy on 100 examples.

Same problem with no odor or spore-print-color
[Figure: a deeper decision tree that instead splits on attributes such as GILL-COLOR, GILL-SPACING, STALK-SURFACE-ABOVE-RING, CAP-COLOR, and GILL-SIZE]
100% classification accuracy on 100 examples.
Pretty good, right? What if we go off hunting with this decision tree?
Performance on another set of 100 mushrooms: 80%. Why?

Not enough examples?
[Figure: training and testing performance ("% correct on another set of the same size", roughly 80–100%) vs. number of training examples, 0–2000; both curves rise with more data, with the training curve consistently above the testing curve]
Why is the testing accuracy always lower than the training accuracy?
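The two-question tree from the slide (ODOR, then SPORE-PRINT-COLOR) translates directly into code. A sketch — the attribute values follow the UCI mushroom dataset, but the function name is ours:

```python
def classify_mushroom(odor, spore_print_color):
    """The slide's two-attribute decision tree for mushrooms."""
    if odor in ("almond", "anise", "none"):
        # The odor test alone lets one poisonous mushroom through;
        # spore print color catches it.
        if spore_print_color == "green":
            return "poisonous"
        return "edible"
    return "poisonous"

print(classify_mushroom("none", "brown"))   # edible
print(classify_mushroom("foul", "brown"))   # poisonous
```

Note that a tree this small is easy to audit by hand, which is exactly what the deeper no-odor tree on the next slide loses.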
The Overfitting Problem: Example
(The following examples are from Prof. Hebert.)
A side example with both discrete and continuous attributes: predicting MPG ('GOOD' or 'BAD') from the attributes Cylinders, Horsepower, Acceleration, Maker (discrete), and Displacement.

The Overfitting Problem: Example
• Suppose that, in an ideal world, class B is everything such that X2 >= 0.5 and class A is everything with X2 < 0.5.
• Note that attribute X1 is irrelevant.
• Generating a decision tree would be trivial, right?
[Figure: class A points below the line X2 = 0.5, class B points above it]

The Overfitting Problem: Example
• However, we collect training examples from this perfect world through some imperfect observation device.
• Our observations have variability and can be corrupted by noise.
• As a result, the training data is noisy and the observed pattern is more complex than it appears.
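In the ideal, noise-free world just described, learning really is trivial: exhaustively scoring every one-threshold split recovers "X2 >= 0.5" and ignores X1 entirely. A minimal sketch with synthetic data (all names and the data-generating code are ours, not the lecture's):

```python
import random

random.seed(1)

# Ideal world: the label depends only on X2; X1 carries no information.
points = [(random.random(), random.random()) for _ in range(500)]
labels = ["B" if x2 >= 0.5 else "A" for _, x2 in points]

def split_accuracy(feature, threshold):
    """Accuracy of the depth-1 tree: predict B iff point[feature] >= threshold."""
    predictions = ["B" if p[feature] >= threshold else "A" for p in points]
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)

# Try every observed value of every feature as a candidate threshold.
best_feature, best_threshold = max(
    ((f, t) for f in (0, 1) for t in {p[f] for p in points}),
    key=lambda ft: split_accuracy(*ft),
)
# The winning split tests X2 (feature index 1) near 0.5 and classifies
# the noise-free sample perfectly; no X1 threshold comes close.
print(best_feature, round(best_threshold, 3),
      split_accuracy(best_feature, best_threshold))
```

The next slide shows what happens to this picture once noise corrupts the labels.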
The Overfitting Problem: Example
• Because of the noise, the resulting decision tree is far more complicated than it should be.
• This is because the learning algorithm tries to classify all of the training set perfectly.
• This is a fundamental problem in learning: overfitting.
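To make the slide's point concrete, here is a small self-contained sketch (ours, not the lecture's): grow a tree on the X2 >= 0.5 problem with 10% label noise until every leaf is pure. The tree fits the training set perfectly, but on fresh data it does worse than the plain one-threshold rule it should have learned:

```python
import random
from collections import Counter
from math import log2

def entropy(ys):
    n = len(ys)
    return -sum(c / n * log2(c / n) for c in Counter(ys).values())

def grow(examples):
    """Grow a binary tree until every leaf is pure (no pruning at all)."""
    ys = [y for _, y in examples]
    if len(set(ys)) == 1:
        return ys[0]                      # pure leaf: return the class label
    best = None
    for f in (0, 1):                      # feature index
        for t in sorted({x[f] for x, _ in examples}):
            left = [(x, y) for x, y in examples if x[f] < t]
            right = [(x, y) for x, y in examples if x[f] >= t]
            if not left or not right:
                continue
            rem = (len(left) * entropy([y for _, y in left]) +
                   len(right) * entropy([y for _, y in right])) / len(examples)
            gain = entropy(ys) - rem
            if best is None or gain > best[0]:
                best = (gain, f, t, left, right)
    _, f, t, left, right = best
    return (f, t, grow(left), grow(right))

def classify(tree, x):
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if x[f] < t else right
    return tree

def sample(n, noise=0.1):
    """Points from the ideal world, with each label flipped 10% of the time."""
    examples = []
    for _ in range(n):
        x = (random.random(), random.random())
        y = "B" if x[1] >= 0.5 else "A"
        if random.random() < noise:
            y = "A" if y == "B" else "B"  # imperfect observation device
        examples.append((x, y))
    return examples

random.seed(0)
train, test = sample(100), sample(200)
tree = grow(train)

def accuracy(rule, examples):
    return sum(rule(x) == y for x, y in examples) / len(examples)

train_acc = accuracy(lambda x: classify(tree, x), train)  # 1.0: noise memorized
test_acc = accuracy(lambda x: classify(tree, x), test)    # noticeably lower
simple_acc = accuracy(lambda x: "B" if x[1] >= 0.5 else "A", test)
print(train_acc, test_acc, simple_acc)
```

The overgrown tree carves extra regions around every mislabeled training point; the standard remedies are to stop splitting early or to prune the tree back using held-out data.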

