CS 2750 Machine Learning
Lecture 19: Decision trees
Milos Hauskrecht, [email protected], 5329 Sennott Square

Announcement
• Term project:
  – Reports due on Wednesday, April 23 at 2pm
  – Project presentations:
    • When: Friday, April 25, 2003 at 1pm
    • Where: 5313 Sennott Square
    • 10-minute ppt presentations
  – Example project reports are on the course web site.

Decision trees
• Back to supervised learning.
• An alternative approach to what we have seen so far:
  – Partition the input space into regions
  – Regress or classify independently in every region
[Figure: 2-D input space with axes x_1 and x_2, points labeled 0 and 1, partitioned into rectangular regions]

Decision trees
• The partitioning idea is used in the decision tree model:
  – Split the space recursively according to the inputs in x
  – Regress or classify at the bottom (the leaves) of the tree
• Example: binary classification with binary attributes x_1, x_2, x_3 \in \{0, 1\}
[Figure: a decision tree that tests x_3 = 0 at the root and x_1 = 0, x_2 = 0 below it, with true/false branches ending in class labels 0/1 at the leaves]

Decision trees
How to construct the decision tree?
• Top-down algorithm:
  – Find the best split condition (quantified based on the impurity measure)
  – Stop when no improvement is possible
• Impurity measure:
  – Measures how well the two classes are separated
  – Ideally we would like to separate all 0s and 1s
• Splits of finite vs. continuous value attributes
  – Continuous value attributes lead to threshold conditions, e.g. x_3 \le 0.5

Impurity measure
• Let
  – |D| be the total number of data entries
  – |D_i| be the number of data entries classified as i
  – p_i = |D_i| / |D| be the ratio of instances classified as i, for i = 1, \ldots, k (the number of classes)
• The impurity measure defines how well the classes are separated.
• In general the impurity measure should satisfy:
  – Largest when the data are split evenly among the classes
  – 0 when all data belong to the same class

Impurity measures
• There are various impurity measures used in the literature:
  – Entropy-based measure (Quinlan, C4.5):
    I(D) = Entropy(D) = -\sum_{i=1}^{k} p_i \log p_i
  – Gini measure (Breiman, CART):
    I(D) = Gini(D) = 1 - \sum_{i=1}^{k} p_i^2
[Figure: entropy and Gini for k = 2 plotted against p_1; both peak at p_1 = 0.5 and are 0 at p_1 = 0 and p_1 = 1]

Impurity measures
• Gain due to a split: the expected reduction in the impurity measure (entropy example):
  Gain(D, A) = Entropy(D) - \sum_{v \in Values(A)} \frac{|D_v|}{|D|} Entropy(D_v)
  where D_v is the partition of D with the value of attribute A = v.
[Figure: two candidate splits, on x_3 and on x_1, comparing Entropy(D) with the branch entropies Entropy(D_t) and Entropy(D_f)]

Decision tree learning
• Greedy learning algorithm, repeated until no or only a small improvement in the purity:
  – Find the attribute with the highest gain (see the code sketch at the end of this section)
  – Add the attribute to the tree and split the set accordingly
• Builds the tree in a top-down fashion
  – Gradually expands the leaves of the partially built tree
• The method is greedy:
  – It looks at a single attribute and its gain in each step
  – It may fail when a combination of attributes is needed to improve the purity (e.g. parity functions)

Decision tree learning
• Limitation of greedy methods:
  – Cases in which a combination of two or more attributes improves the impurity
[Figure: an XOR-like 2-D pattern of 0s and 1s in which no single-attribute split reduces the impurity]

Decision tree learning
• By reducing the impurity measure we can grow very large trees.
• Problem: overfitting
  – We may split and classify the training set very well, but do worse in terms of the generalization error.
• Solutions to the overfitting problem:
  – Solution 1:
    • Prune branches of the tree built in the first phase
    • Use a validation set to test for the overfit
  – Solution 2:
    • Test for the overfit in the tree-building phase
    • Stop building the tree when performance on the validation set deteriorates
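The following is a minimal Python sketch of the quantities above: the entropy and Gini impurity measures, the gain of a split, and the greedy choice of the best attribute. It is an illustration, not code from the lecture; the function names and the toy data are assumptions.

```python
# Minimal sketch of the impurity measures and the greedy split choice.
# Function names and the binary-attribute data layout are illustrative.
import math
from collections import Counter

def entropy(labels):
    """Entropy(D) = -sum_i p_i log2 p_i over class proportions p_i."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def gini(labels):
    """Gini(D) = 1 - sum_i p_i^2."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def gain(data, labels, attr):
    """Gain(D, A) = Entropy(D) - sum_v |D_v|/|D| * Entropy(D_v)."""
    n = len(labels)
    remainder = 0.0
    for v in set(x[attr] for x in data):
        subset = [y for x, y in zip(data, labels) if x[attr] == v]
        remainder += (len(subset) / n) * entropy(subset)
    return entropy(labels) - remainder

def best_split(data, labels, attrs):
    """Greedy step: pick the attribute with the highest gain."""
    return max(attrs, key=lambda a: gain(data, labels, a))

# Toy example: three binary attributes x1, x2, x3 (indices 0..2).
X = [(0, 0, 0), (0, 1, 0), (1, 0, 1), (1, 1, 1)]
y = [0, 0, 1, 1]
# x1 (index 0) and x3 (index 2) both separate the classes perfectly;
# max() returns the first of the tied attributes, so this prints 0.
print(best_split(X, y, attrs=[0, 1, 2]))
```

For a continuous attribute, the same gain computation would be applied to candidate threshold conditions such as x_3 \le 0.5, scanning over possible thresholds.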
Mixture of experts
• Clustering before classification/regression:
  – The reduction is not tuned towards the prediction task
  – Two or more clusters may be covered by a single simple predictor
• Solution:
  – Cover different input regions with many (simple) networks
  – A kind of predictive clustering with regard to the prediction accuracy
• Mixture of experts
  – Expert = network (learner)

Mixture of experts
• Gating network: decides which expert to use
[Figure: the input x feeds Expert 1, Expert 2, ..., Expert k and a gating network; the gating outputs g_1, g_2, ..., g_k weight the expert outputs to produce y]
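As an illustration of the combination step in the figure, here is a minimal sketch of a mixture of experts, assuming linear experts and a softmax gating network; the weights, shapes, and function names are illustrative assumptions, not a trained model or code from the lecture.

```python
# Minimal mixture-of-experts sketch: a softmax gating network produces
# weights g_1..g_k that combine the outputs of k simple (linear) experts.
import numpy as np

def softmax(z):
    z = z - np.max(z)            # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def moe_predict(x, expert_weights, gating_weights):
    """y(x) = sum_j g_j(x) * y_j(x): gating outputs weight the experts."""
    expert_outputs = np.array([w @ x for w in expert_weights])  # y_j(x)
    g = softmax(gating_weights @ x)                             # g_1..g_k
    return g @ expert_outputs

# Toy setup: 2 linear experts over 2-dimensional inputs (random weights).
rng = np.random.default_rng(0)
experts = [rng.normal(size=2) for _ in range(2)]   # one weight vector each
gating = rng.normal(size=(2, 2))                   # one gating row per expert
print(moe_predict(np.array([1.0, -0.5]), experts, gating))
```

Here the gating outputs g_1, ..., g_k form a soft selection over the experts, matching the gating weights in the figure: each expert specializes in a different input region, and the gating network decides how much each expert contributes for a given x.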

