Pitt CS 2750 - Decision trees

CS 2750 Machine Learning
Lecture 21: Decision trees
Milos Hauskrecht
[email protected]
Sennott Square

Announcement
• Term projects:
  – Reports due on Wednesday, April 21 at 12:30pm
  – Project presentations: Wednesday, April 21, 12:30-4pm
  – Example project reports are on the course web site.

Decision trees
• Back to supervised learning.
• An alternative approach to what we have seen so far:
  – Partition the input space into regions
  – Regress or classify independently in every region
[Figure: a two-dimensional input space over x_1 and x_2, partitioned into rectangular regions labeled with 0s and 1s]

Decision trees
• The partitioning idea is used in the decision tree model:
  – Split the space recursively according to the inputs in x
  – Regress or classify at the bottom of the tree
• Example: binary classification with binary attributes x_1, x_2, x_3 \in \{0, 1\}
[Figure: a decision tree that first tests x_3 = 0 (branches t/f), then tests x_1 = 0 and x_2 = 0 in the subtrees, with class labels 0 and 1 at the leaves]

Decision trees
How to construct the decision tree?
• Top-down algorithm:
  – Finds the best split condition (quantified based on the impurity measure)
  – Stops when no improvement is possible
• Impurity measure:
  – Measures how well the two classes are separated
  – Ideally we would like to separate all 0s and 1s
• Splits on finite-value vs. continuous-value attributes
  – Condition for a continuous-value attribute, e.g. x_3 \le 0.5

Impurity measure
Let
  |D| – total number of data entries
  |D_i| – number of data entries classified as class i
  p_i = |D_i| / |D| – ratio of instances classified as class i
• The impurity measure defines how well the classes are separated.
• In general the impurity measure should satisfy:
  – It is largest when the data are split evenly among the classes, i.e. p_i = 1 / (number of classes)
  – It should be 0 when all data belong to the same class

Impurity measures
• There are various impurity measures used in the literature:
  – Entropy-based measure (Quinlan, C4.5):
      I(D) = Entropy(D) = -\sum_{i=1}^{k} p_i \log p_i
  – Gini measure (Breiman, CART):
      I(D) = Gini(D) = 1 - \sum_{i=1}^{k} p_i^2
[Figure: the two impurity measures plotted as a function of p_1 for the k = 2 case]

Impurity measures
• Gain due to a split – the expected reduction in the impurity measure (entropy example):
      Gain(D, A) = Entropy(D) - \sum_{v \in Values(A)} (|D_v| / |D|) Entropy(D_v)
  where D_v is the partition of D in which attribute A takes the value v.
[Figure: candidate splits on x_3 = 0 and on x_1 = 0, annotated with Entropy(D), Entropy(D_t) and Entropy(D_f) for the resulting branches]

Decision tree learning
• Greedy learning algorithm:
  Repeat until no or only small improvement in the purity:
  – Find the attribute with the highest gain
  – Add the attribute to the tree and split the set accordingly
• Builds the tree in a top-down fashion
  – Gradually expands the leaves of the partially built tree
• The method is greedy:
  – It looks at a single attribute and its gain in each step
  – May fail when a combination of attributes is needed to improve the purity (parity functions)
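The following is a minimal sketch, not part of the lecture, of the impurity and gain computations defined above. The function names (entropy, gini, gain, best_attribute), the list-of-dicts data layout, and the toy data set are illustrative assumptions; the formulas follow the slides' definitions of p_i = |D_i| / |D|, Entropy(D), Gini(D), and Gain(D, A).

```python
# Sketch of the impurity and gain computations from the slides.
# Data layout (list of dicts with a 'label' key) and function names
# are illustrative assumptions, not part of the lecture.
from collections import Counter
from math import log2

def entropy(labels):
    """Entropy(D) = -sum_i p_i log p_i, with p_i = |D_i| / |D|."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def gini(labels):
    """Gini(D) = 1 - sum_i p_i^2."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def gain(data, attribute, impurity=entropy):
    """Gain(D, A) = I(D) - sum_v (|D_v| / |D|) * I(D_v)."""
    labels = [x['label'] for x in data]
    expected = 0.0
    for v in {x[attribute] for x in data}:
        subset = [x['label'] for x in data if x[attribute] == v]
        expected += len(subset) / len(data) * impurity(subset)
    return impurity(labels) - expected

def best_attribute(data, attributes):
    """Greedy step: pick the single attribute with the highest gain."""
    return max(attributes, key=lambda a: gain(data, a))

if __name__ == '__main__':
    # Assumed toy data set with binary attributes in which x3 is the
    # most informative attribute (it separates the classes perfectly).
    D = [
        {'x1': 0, 'x2': 1, 'x3': 0, 'label': 0},
        {'x1': 1, 'x2': 0, 'x3': 0, 'label': 0},
        {'x1': 0, 'x2': 0, 'x3': 1, 'label': 1},
        {'x1': 1, 'x2': 1, 'x3': 1, 'label': 1},
    ]
    print(best_attribute(D, ['x1', 'x2', 'x3']))   # prints 'x3'
```

The greedy algorithm above would place the returned attribute at the current node, split the data by its values, and recurse on each branch until the gain becomes negligible.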
Decision tree learning
• Limitations of greedy methods
• Initial state:
[Figure: a 4×4 grid of examples with a parity (XOR-like) pattern of 0s and 1s]
  p_i = |D_i| / |D|, giving p_0 = 8/16 = 1/2 and p_1 = 8/16 = 1/2

Decision tree learning
• Limitations of greedy methods
• Complete space:
[Figure: the same parity pattern split by a single attribute; overall p_0 = 1/2 and p_1 = 1/2]
  What happens in the new regions?

Decision tree learning
• Limitations of greedy methods
• Complete space:
  What happens in the new regions? Each new region again has p_0 = 1/2 and p_1 = 1/2.
  No improvement in the impurity measure!

Decision tree learning
• Now what happens here if we evaluate this candidate split?
[Figure: the same parity pattern with another candidate single-attribute split]

Decision tree learning
• Limitations of greedy methods
[Figure: after the candidate split, both regions still have p_0 = 1/2 and p_1 = 1/2]

Decision tree learning
• Limitations of greedy methods
  – The combination of two or more attributes improves the impurity (this is checked numerically in the sketch after the overfitting slide below).
[Figure: the parity pattern split on two attributes jointly, producing pure regions]

Decision tree learning
• By reducing the impurity measure we can grow very large trees.
• Problem: overfitting
  – We may split and classify the training set very well, but we may do worse in terms of the generalization error.
• Solutions to the overfitting problem:
  – Solution 1:
    • Prune branches of the tree built in the first phase
    • Use a validation set to test for the overfit
  – Solution 2:
    • Test for the overfit during the tree-building phase
    • Stop building the tree when performance on the validation set deteriorates
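The greedy limitation illustrated by the parity pattern can be checked numerically. The sketch below is not from the lecture: an XOR labeling over two binary attributes is an assumed stand-in for the 4×4 grid on the slides, and entropy()/gain() repeat the definitions from the earlier sketch so the snippet is self-contained.

```python
# Check the greedy limitation on a parity (XOR-like) labeling.
# The data set is an assumed stand-in for the 4x4 grid on the slides;
# entropy() and gain() repeat the earlier definitions.
from collections import Counter
from math import log2

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def gain(data, attribute):
    labels = [x['label'] for x in data]
    expected = 0.0
    for v in {x[attribute] for x in data}:
        subset = [x['label'] for x in data if x[attribute] == v]
        expected += len(subset) / len(data) * entropy(subset)
    return entropy(labels) - expected

# 16 points, label = x1 XOR x2, so p_0 = p_1 = 1/2 overall.
D = [{'x1': a, 'x2': b, 'label': a ^ b}
     for a in (0, 1) for b in (0, 1) for _ in range(4)]

# Any single-attribute split leaves p_0 = p_1 = 1/2 in both regions,
# so the greedy criterion sees no improvement in the impurity measure.
print(gain(D, 'x1'), gain(D, 'x2'))        # 0.0 0.0

# Splitting on the combination of both attributes separates the classes:
for a in (0, 1):
    for b in (0, 1):
        cell = [x['label'] for x in D if (x['x1'], x['x2']) == (a, b)]
        print((a, b), set(cell))           # each cell contains a single class
```

This is the behavior shown on the "Limitations of greedy methods" slides: a learner that evaluates one attribute at a time sees zero gain everywhere, even though the two attributes together classify the data perfectly.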

