UT Dallas CS 6375 - 2.dtree


CS 6375 Machine Learning
Decision Trees
Instructor: Yang Liu

Supervised Classifier
[Figure: a classifier maps an input feature vector (X1, X2, ..., XM) to a class label.]

Decision Tree Example (I)
• Three variables:
  – Attribute 1: Hair = {blond, dark}
  – Attribute 2: Height = {tall, short}
  – Class: Country = {G, P}
[Figure: a decision tree over Hair and Height.]

The class of a new input can be classified by following the tree all the way down to a leaf and by reporting the output of the leaf. For example:
• (B, T) is classified as …
• (D, S) is classified as …

Decision Trees
• Decision trees are classifiers.
• Instances are represented as feature vectors.
• Nodes are (equality and inequality) tests on feature values.
• There is one branch for each value of the feature.
• Leaves specify the categories (labels).
• Decision trees can categorize instances into multiple disjoint categories.

General Case (Discrete Attributes)
• We have R observations from the training data:
  – each observation has M attributes X1, .., XM;
  – each Xi can take Ni distinct discrete values;
  – each observation has a class attribute Y with C distinct (discrete) values.
• Problem: construct a sequence of tests on the attributes such that, given a new input (x'1, .., x'M), the class attribute y is correctly predicted.
• X = attributes of the training data (R x M); Y = classes of the training data (R).
• Note: there are other notations used for data representation.

General Decision Tree (Discrete Attributes)
[Figure: a decision tree whose internal nodes test discrete attributes, with one branch per attribute value.]

Decision Tree Example (II)
[Figure: a decision tree over two continuous attributes, with threshold tests at the internal nodes.]

The class of a new input can be classified by following the tree all the way down to a leaf and by reporting the output of the leaf. For example:
• (0.2, 0.8) is classified as …
• (0.8, 0.2) is classified as …

General Case (Continuous Attributes)
• We have R observations from the training data:
  – each observation has M attributes X1, .., XM;
  – each Xi can take continuous values;
  – each observation has a class attribute Y with C distinct (discrete) values.
• Problem: construct a sequence of tests of the form Xi < ti? on the attributes such that, given a new input (x'1, .., x'M), the class attribute y is correctly predicted.
• X = attributes of the training data (R x M); Y = classes of the training data (R).

General Decision Tree (Continuous Attributes)
[Figure: a decision tree whose internal nodes carry threshold tests Xi < ti.]

Side: Decision Tree Decision Boundaries
Decision trees divide the feature space into axis-parallel rectangles, and label each rectangle with one of the class labels.
[Figure: axis-parallel partition of a two-dimensional feature space induced by a decision tree.]

Decision Boundaries
• Also called decision surfaces (in a high-dimensional space).
• Take binary classification as an example: the decision boundary is a hypersurface that partitions the feature space into two sets, one for each class.
• If the decision surface is a hyperplane, then the classification problem is linear and the classes are linearly separable: instances on one side of the hyperplane belong to one class, those on the other side belong to the other class.

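To make the prediction procedure above concrete, here is a minimal sketch (mine, not the slides') of a decision tree whose internal nodes carry threshold tests Xi < ti and whose leaves carry class labels; classify follows the tests all the way down to a leaf and reports its output. The Node class, the example tree, its thresholds, and its labels are illustrative assumptions.

class Node:
    def __init__(self, feature=None, threshold=None, left=None, right=None, label=None):
        self.feature = feature      # index of the attribute tested at this node
        self.threshold = threshold  # internal nodes test x[feature] < threshold
        self.left = left            # branch taken when the test succeeds
        self.right = right          # branch taken when the test fails
        self.label = label          # class label stored at a leaf node

def classify(node, x):
    # Follow the tree all the way down to a leaf and report the leaf's output.
    while node.label is None:
        node = node.left if x[node.feature] < node.threshold else node.right
    return node.label

# Hypothetical tree over two continuous attributes X1 and X2; each leaf
# corresponds to one axis-parallel rectangle of the feature space.
tree = Node(feature=0, threshold=0.5,
            left=Node(label="class A"),
            right=Node(feature=1, threshold=0.5,
                       left=Node(label="class B"),
                       right=Node(label="class A")))

print(classify(tree, (0.2, 0.8)))   # X1 < 0.5             -> "class A"
print(classify(tree, (0.8, 0.2)))   # X1 >= 0.5, X2 < 0.5  -> "class B"
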
Basic Questions
• How to choose the attribute/value to split on at each level of the tree?
• When to stop splitting? When should a node be declared a leaf?
• If a leaf node is impure, how should the class label be assigned?

How to choose the attribute/value to split on at each level of the tree?
• Two classes (red circles / green crosses).
• Two attributes: X1 and X2.
• 11 points in the training data.
• Idea: construct a decision tree such that the leaf nodes predict the class of all the training examples correctly.
[Figure: the 11 training points plotted in the (X1, X2) plane, with candidate splits.]

• A node is "pure" when there is only one class left: there is no ambiguity in the class label.
• A node is almost "pure" when there is little ambiguity in the class label.
• Nodes that contain a mixture of classes do not disambiguate between the classes.
• We want to find the most compact, smallest-size tree (Occam's razor) that classifies the training data correctly, so we want the split choices that get us to pure nodes the fastest.

Entropy
• Entropy is a measure of the impurity of a distribution, defined as
      H = - sum_i P_i log2 P_i
  where P_i = probability of occurrence of value i.
  – High entropy: all the classes are (nearly) equally likely.
  – Low entropy: a few classes are likely; most of the classes are rarely observed.
  – Convention: 0 log2 0 = 0.
• The entropy captures the degree of "purity" of the distribution.

Example Entropy Calculation
[Figure: worked entropy values for several example class distributions.]

Conditional Entropy
• Entropy before splitting: H.
• After splitting, a fraction P_L of the data goes to the left node, which has entropy H_L; a fraction P_R of the data goes to the right node, which has entropy H_R.
• The average entropy (or "conditional entropy") after splitting is
      P_L * H_L + P_R * H_R

Information Gain
• We want nodes that are as pure as possible, so we want to reduce the entropy as much as possible; that is, we want to maximize the difference between the entropy of the parent node and the expected entropy of the children.
• Maximize:
      IG = H - (P_L * H_L + P_R * H_R)
• Information Gain (IG) is the amount by which the ambiguity is decreased by splitting the node.

Notation
• Entropy: H(Y) = entropy of the distribution of classes at a node.
• Conditional entropy:
  – Discrete: H(Y|Xj) = entropy after splitting with respect to variable j.
  – Continuous: H(Y|Xj, t) = entropy after splitting with respect to variable j with threshold t.
• Information gain:
  – Discrete: IG(Y|Xj) = H(Y) - H(Y|Xj)
  – Continuous: IG(Y|Xj, t) = H(Y) - H(Y|Xj, t)

Example: PlayTennis training data

Day  Outlook   Temperature  Humidity  Wind    PlayTennis
 1   Sunny     Hot          High      Weak    No
 2   Sunny     Hot          High      Strong  No
 3   Overcast  Hot          High      Weak    Yes
 4   Rain      Mild         High      Weak    Yes
 5   Rain      Cool         Normal    Weak    Yes
 6   Rain      Cool         Normal    Strong  No
 7   Overcast  Cool         Normal    Strong  Yes
 8   Sunny     Mild         High      Weak    No
 9   Sunny     Cool         Normal    Weak    Yes
10   Rain      Mild         Normal    Weak    Yes
11   Sunny     Mild         Normal    Strong  Yes
12   Overcast  Mild         High      Strong  Yes
13   …

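The entropy and information-gain formulas above translate directly into code. Below is a small sketch under the slides' definitions, using the convention 0 log2 0 = 0; the function names and the toy label lists are my own choices, not the lecture's.

import math
from collections import Counter

def entropy(labels):
    # H = - sum_i P_i log2 P_i, with the convention 0 log2 0 = 0
    # (classes that never occur simply do not appear in the sum).
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(parent, left, right):
    # IG = H - (P_L * H_L + P_R * H_R) for a binary split of the parent node.
    p_left = len(left) / len(parent)
    p_right = len(right) / len(parent)
    return entropy(parent) - (p_left * entropy(left) + p_right * entropy(right))

print(entropy(["+", "+", "-", "-"]))   # two equally likely classes: maximal entropy (1 bit)
print(entropy(["+", "+", "+", "+"]))   # a pure node has zero entropy
# Splitting a mixed node into two pure children removes all ambiguity: gain = 1 bit.
print(information_gain(["+", "+", "-", "-"], ["+", "+"], ["-", "-"]))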

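As a worked example of choosing the first split, the following sketch computes IG(Y|Xj) = H(Y) - H(Y|Xj) for each attribute of the PlayTennis rows shown above (only the rows visible in this preview, so the exact numbers may differ from the full table on the slide). The code and helper names are mine, not the lecture's.

import math
from collections import Counter

# The PlayTennis rows shown above: (Outlook, Temperature, Humidity, Wind, PlayTennis).
data = [
    ("Sunny",    "Hot",  "High",   "Weak",   "No"),
    ("Sunny",    "Hot",  "High",   "Strong", "No"),
    ("Overcast", "Hot",  "High",   "Weak",   "Yes"),
    ("Rain",     "Mild", "High",   "Weak",   "Yes"),
    ("Rain",     "Cool", "Normal", "Weak",   "Yes"),
    ("Rain",     "Cool", "Normal", "Strong", "No"),
    ("Overcast", "Cool", "Normal", "Strong", "Yes"),
    ("Sunny",    "Mild", "High",   "Weak",   "No"),
    ("Sunny",    "Cool", "Normal", "Weak",   "Yes"),
    ("Rain",     "Mild", "Normal", "Weak",   "Yes"),
    ("Sunny",    "Mild", "Normal", "Strong", "Yes"),
    ("Overcast", "Mild", "High",   "Strong", "Yes"),
]
attributes = ["Outlook", "Temperature", "Humidity", "Wind"]

def entropy(labels):
    # H(Y) = - sum_i P_i log2 P_i
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(rows, j):
    # IG(Y|Xj) = H(Y) - sum over values v of P(Xj = v) * H(Y | Xj = v)
    labels = [r[-1] for r in rows]
    gain = entropy(labels)
    for v in set(r[j] for r in rows):
        subset = [r[-1] for r in rows if r[j] == v]
        gain -= len(subset) / len(rows) * entropy(subset)
    return gain

for j, name in enumerate(attributes):
    print(name, round(information_gain(data, j), 3))
# On these rows, Outlook has the largest gain, so it would be chosen as the root test.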

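For continuous attributes, the notation IG(Y|Xj, t) suggests scanning candidate thresholds t for a test Xj < t and keeping the one with the largest gain. A common choice of candidates, used in this sketch, is the midpoints between consecutive distinct sorted values of the attribute; the data and function names here are illustrative assumptions, not the lecture's example.

import math
from collections import Counter

def entropy(labels):
    # H(Y) = - sum_i P_i log2 P_i
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def gain_for_threshold(xs, ys, t):
    # IG(Y|Xj, t) = H(Y) - (P_L * H_L + P_R * H_R) for the test Xj < t.
    left = [y for x, y in zip(xs, ys) if x < t]
    right = [y for x, y in zip(xs, ys) if x >= t]
    return entropy(ys) - (len(left) / len(ys) * entropy(left)
                          + len(right) / len(ys) * entropy(right))

def best_threshold(xs, ys):
    # Candidate thresholds: midpoints between consecutive distinct sorted values.
    values = sorted(set(xs))
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    return max(candidates, key=lambda t: gain_for_threshold(xs, ys, t))

# Hypothetical one-dimensional data: small values are "-", large values are "+".
xs = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
ys = ["-", "-", "-", "-", "+", "+", "+", "+"]
t = best_threshold(xs, ys)
print(t, gain_for_threshold(xs, ys, t))  # the midpoint near 0.5 separates the classes (gain = 1 bit)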