Pitt CS 2750 - Decision trees

CS 2750 Machine Learning
Lecture 21: Decision trees
Milos Hauskrecht
[email protected]
Sennott Square

Announcement
• Term projects:
  – Reports due on Wednesday, April 21 at 12:30pm
  – Project presentations: Wednesday, April 21, 12:30-4pm
  – Example project reports are on the course web site.

Decision trees
• Back to supervised learning.
• An alternative approach to what we have seen so far:
  – Partition the input space into regions
  – Regress or classify independently in every region
[Figure: a two-dimensional input space over x_1 and x_2, partitioned into rectangular regions labeled with 0s and 1s]

Decision trees
• The partitioning idea is used in the decision tree model:
  – Split the space recursively according to the inputs in x
  – Regress or classify at the bottom of the tree
• Example: binary classification with binary attributes x_1, x_2, x_3 \in \{0, 1\}
[Figure: a decision tree that first tests x_3 = 0 (branches t/f), then tests x_1 = 0 and x_2 = 0 in the subtrees, with class labels 0 and 1 at the leaves]

Decision trees
How to construct the decision tree?
• Top-down algorithm:
  – Finds the best split condition (quantified based on the impurity measure)
  – Stops when no improvement is possible
• Impurity measure:
  – Measures how well the two classes are separated
  – Ideally we would like to separate all 0s and 1s
• Splits on finite-value vs. continuous-value attributes
  – Condition for a continuous-value attribute, e.g. x_3 \le 0.5

Impurity measure
Let
  |D| – total number of data entries
  |D_i| – number of data entries classified as class i
  p_i = |D_i| / |D| – ratio of instances classified as class i
• The impurity measure defines how well the classes are separated.
• In general the impurity measure should satisfy:
  – It is largest when the data are split evenly among the classes, i.e. p_i = 1 / (number of classes)
  – It should be 0 when all data belong to the same class

Impurity measures
• There are various impurity measures used in the literature:
  – Entropy-based measure (Quinlan, C4.5):
      I(D) = Entropy(D) = -\sum_{i=1}^{k} p_i \log p_i
  – Gini measure (Breiman, CART):
      I(D) = Gini(D) = 1 - \sum_{i=1}^{k} p_i^2
[Figure: the two impurity measures plotted as a function of p_1 for the k = 2 case]

Impurity measures
• Gain due to a split – the expected reduction in the impurity measure (entropy example):
      Gain(D, A) = Entropy(D) - \sum_{v \in Values(A)} (|D_v| / |D|) Entropy(D_v)
  where D_v is the partition of D in which attribute A takes the value v.
[Figure: candidate splits on x_3 = 0 and on x_1 = 0, annotated with Entropy(D), Entropy(D_t) and Entropy(D_f) for the resulting branches]

Decision tree learning
• Greedy learning algorithm:
  Repeat until no or only small improvement in the purity:
  – Find the attribute with the highest gain
  – Add the attribute to the tree and split the set accordingly
• Builds the tree in a top-down fashion
  – Gradually expands the leaves of the partially built tree
• The method is greedy:
  – It looks at a single attribute and its gain in each step
  – May fail when a combination of attributes is needed to improve the purity (parity functions)
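The following is a minimal sketch, not part of the lecture, of the impurity and gain computations defined above. The function names (entropy, gini, gain, best_attribute), the list-of-dicts data layout, and the toy data set are illustrative assumptions; the formulas follow the slides' definitions of p_i = |D_i| / |D|, Entropy(D), Gini(D), and Gain(D, A).

```python
# Sketch of the impurity and gain computations from the slides.
# Data layout (list of dicts with a 'label' key) and function names
# are illustrative assumptions, not part of the lecture.
from collections import Counter
from math import log2

def entropy(labels):
    """Entropy(D) = -sum_i p_i log p_i, with p_i = |D_i| / |D|."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def gini(labels):
    """Gini(D) = 1 - sum_i p_i^2."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def gain(data, attribute, impurity=entropy):
    """Gain(D, A) = I(D) - sum_v (|D_v| / |D|) * I(D_v)."""
    labels = [x['label'] for x in data]
    expected = 0.0
    for v in {x[attribute] for x in data}:
        subset = [x['label'] for x in data if x[attribute] == v]
        expected += len(subset) / len(data) * impurity(subset)
    return impurity(labels) - expected

def best_attribute(data, attributes):
    """Greedy step: pick the single attribute with the highest gain."""
    return max(attributes, key=lambda a: gain(data, a))

if __name__ == '__main__':
    # Assumed toy data set with binary attributes in which x3 is the
    # most informative attribute (it separates the classes perfectly).
    D = [
        {'x1': 0, 'x2': 1, 'x3': 0, 'label': 0},
        {'x1': 1, 'x2': 0, 'x3': 0, 'label': 0},
        {'x1': 0, 'x2': 0, 'x3': 1, 'label': 1},
        {'x1': 1, 'x2': 1, 'x3': 1, 'label': 1},
    ]
    print(best_attribute(D, ['x1', 'x2', 'x3']))   # prints 'x3'
```

The greedy algorithm above would place the returned attribute at the current node, split the data by its values, and recurse on each branch until the gain becomes negligible.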
Decision tree learning
• Limitations of greedy methods
• Initial state:
[Figure: a 4×4 grid of examples with a parity (XOR-like) pattern of 0s and 1s]
  p_i = |D_i| / |D|, giving p_0 = 8/16 = 1/2 and p_1 = 8/16 = 1/2

Decision tree learning
• Limitations of greedy methods
• Complete space:
[Figure: the same parity pattern split by a single attribute; overall p_0 = 1/2 and p_1 = 1/2]
  What happens in the new regions?

Decision tree learning
• Limitations of greedy methods
• Complete space:
  What happens in the new regions? Each new region again has p_0 = 1/2 and p_1 = 1/2.
  No improvement in the impurity measure!

Decision tree learning
• Now what happens here if we evaluate this candidate split?
[Figure: the same parity pattern with another candidate single-attribute split]

Decision tree learning
• Limitations of greedy methods
[Figure: after the candidate split, both regions still have p_0 = 1/2 and p_1 = 1/2]

Decision tree learning
• Limitations of greedy methods
  – The combination of two or more attributes improves the impurity (this is checked numerically in the sketch after the overfitting slide below).
[Figure: the parity pattern split on two attributes jointly, producing pure regions]

Decision tree learning
• By reducing the impurity measure we can grow very large trees.
• Problem: overfitting
  – We may split and classify the training set very well, but we may do worse in terms of the generalization error.
• Solutions to the overfitting problem:
  – Solution 1:
    • Prune branches of the tree built in the first phase
    • Use a validation set to test for the overfit
  – Solution 2:
    • Test for the overfit during the tree-building phase
    • Stop building the tree when performance on the validation set deteriorates
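The greedy limitation illustrated by the parity pattern can be checked numerically. The sketch below is not from the lecture: an XOR labeling over two binary attributes is an assumed stand-in for the 4×4 grid on the slides, and entropy()/gain() repeat the definitions from the earlier sketch so the snippet is self-contained.

```python
# Check the greedy limitation on a parity (XOR-like) labeling.
# The data set is an assumed stand-in for the 4x4 grid on the slides;
# entropy() and gain() repeat the earlier definitions.
from collections import Counter
from math import log2

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def gain(data, attribute):
    labels = [x['label'] for x in data]
    expected = 0.0
    for v in {x[attribute] for x in data}:
        subset = [x['label'] for x in data if x[attribute] == v]
        expected += len(subset) / len(data) * entropy(subset)
    return entropy(labels) - expected

# 16 points, label = x1 XOR x2, so p_0 = p_1 = 1/2 overall.
D = [{'x1': a, 'x2': b, 'label': a ^ b}
     for a in (0, 1) for b in (0, 1) for _ in range(4)]

# Any single-attribute split leaves p_0 = p_1 = 1/2 in both regions,
# so the greedy criterion sees no improvement in the impurity measure.
print(gain(D, 'x1'), gain(D, 'x2'))        # 0.0 0.0

# Splitting on the combination of both attributes separates the classes:
for a in (0, 1):
    for b in (0, 1):
        cell = [x['label'] for x in D if (x['x1'], x['x2']) == (a, b)]
        print((a, b), set(cell))           # each cell contains a single class
```

This is the behavior shown on the "Limitations of greedy methods" slides: a learner that evaluates one attribute at a time sees zero gain everywhere, even though the two attributes together classify the data perfectly.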

