Machine Learning: Decision Trees, Overfitting

Recommended reading: Mitchell, Chapter 3

Machine Learning 10-701
Tom M. Mitchell
Center for Automated Learning and Discovery
Carnegie Mellon University
September 13, 2005

Machine Learning
  Study of algorithms that improve their performance at some task with experience.

Learning to Predict Emergency C-Sections [Sims et al., 2000]
  9714 patient records, each with 215 features

Object Detection (Prof. H. Schneiderman)
  Example training images for each orientation

Text Classification
  Company home page vs. Personal home page vs. University home page vs. ...

Reading a noun (vs. verb) [Rustandi et al., 2005]

Growth of Machine Learning
  - Machine learning is the preferred approach to:
      - Speech recognition, natural language processing
      - Computer vision
      - Medical outcomes analysis
      - Robot control
  - This trend is accelerating:
      - Improved machine learning algorithms
      - Improved data capture, networking, faster computers
      - Software too complex to write by hand
      - New sensors / IO devices
      - Demand for self-customization to user, environment

Decision tree learning

How would you represent AB CD E?
  - Each internal node: test one attribute Xi
  - Each branch from a node: selects one value for Xi
  - Each leaf node: predict Y (or P(Y | X in leaf))

ID3, C4.5
  node = Root

Entropy
  Entropy H(X) of a random variable X.
  H(X) is the expected number of bits needed to encode a randomly drawn value of X (under the most efficient code).
  Why? Information theory: the most efficient code assigns $-\log_2 P(X=i)$ bits to encode the message X=i. So the expected number of bits is
    $H(X) = -\sum_{i} P(X=i) \log_2 P(X=i)$

Sample Entropy
  Assume X values known, labels Y encoded

What you should know
  - Well-posed function approximation problems:
      - Instance space, X
      - Sample of labeled training data D = {(xi, yi)}
      - Hypothesis space, H = {f : X -> Y}
  - Learning is a search/optimization problem over H
      - Various objective functions
      - Today: minimize training error (0-1 loss)
  - Decision tree learning
      - Greedy top-down learning of decision trees (ID3, C4.5)
      - Overfitting and tree/rule post-pruning
      - Extensions
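
The slides describe entropy H(X) as the expected number of bits under the most efficient code, where a value with probability P(X=i) costs -log2 P(X=i) bits. A minimal Python sketch of that calculation on a sample follows. It assumes probabilities are estimated by relative frequency in the sample; the function name `entropy` and the toy sequences are illustrative choices, not taken from the slides.

```python
import math
from collections import Counter


def entropy(values):
    """Sample entropy, in bits, of a sequence of discrete values."""
    n = len(values)
    if n == 0:
        return 0.0  # assumption: treat an empty sample as zero entropy
    counts = Counter(values)
    # Each value with estimated probability p = c / n costs log2(1 / p) bits,
    # so the expected cost is sum_i p_i * log2(1 / p_i).
    return sum((c / n) * math.log2(n / c) for c in counts.values())


# Toy samples (hypothetical data, for illustration only).
fair_coin = ["H", "T", "H", "T"]
constant = ["a", "a", "a", "a", "a"]
skewed = ["a", "a", "b", "c"]

print(entropy(fair_coin))  # 1.0
print(entropy(constant))   # 0.0
print(entropy(skewed))     # 1.5
```

A uniform two-valued sample needs one bit per value, a constant sample needs none, and the skewed sample falls in between per distinct value count: half the draws cost 1 bit and the other half cost 2 bits, for 1.5 bits on average.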
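
The slides name greedy top-down learning of decision trees (ID3, C4.5) and a training-error (0-1 loss) objective, but the algorithm body did not survive in this text. The sketch below is one common way to realize the idea, under stated assumptions: attributes are discrete, the split attribute is chosen by information gain (entropy reduction), and recursion stops when a node is pure or no attributes remain. The names `information_gain`, `build_tree`, `predict`, and the toy data are hypothetical; class labels are assumed not to be tuples, since tuples mark internal nodes here. It is not a reproduction of ID3 or C4.5 as published, and it does no pruning.

```python
import math
from collections import Counter


def entropy(labels):
    """Sample entropy, in bits, of a list of class labels."""
    n = len(labels)
    return sum((c / n) * math.log2(n / c) for c in Counter(labels).values())


def information_gain(rows, labels, attr):
    """Reduction in label entropy obtained by splitting on `attr`."""
    n = len(labels)
    remainder = 0.0
    for value in set(row[attr] for row in rows):
        subset = [y for row, y in zip(rows, labels) if row[attr] == value]
        remainder += len(subset) / n * entropy(subset)
    return entropy(labels) - remainder


def build_tree(rows, labels, attrs):
    """Return a leaf label, or an internal node (attr, {value: subtree})."""
    if len(set(labels)) == 1 or not attrs:
        # Leaf: predict the majority label among the examples that reach it.
        return Counter(labels).most_common(1)[0][0]
    # Greedy step: test the attribute with the highest information gain.
    best = max(attrs, key=lambda a: information_gain(rows, labels, a))
    remaining = [a for a in attrs if a != best]
    branches = {}
    for value in set(row[best] for row in rows):
        idx = [i for i, row in enumerate(rows) if row[best] == value]
        branches[value] = build_tree(
            [rows[i] for i in idx], [labels[i] for i in idx], remaining
        )
    return (best, branches)


def predict(tree, row, default=None):
    """Follow one branch per internal node until a leaf is reached."""
    while isinstance(tree, tuple):
        attr, branches = tree
        if row[attr] not in branches:
            return default  # attribute value never seen during training
        tree = branches[row[attr]]
    return tree


# Toy data (hypothetical): Y is 1 only when both A and B are 1.
rows = [{"A": 0, "B": 0}, {"A": 0, "B": 1}, {"A": 1, "B": 0}, {"A": 1, "B": 1}]
labels = [0, 0, 0, 1]

tree = build_tree(rows, labels, ["A", "B"])
# Training error under 0-1 loss: fraction of training examples misclassified.
training_error = sum(predict(tree, r) != y for r, y in zip(rows, labels)) / len(rows)
print(tree)
print(training_error)
```

Because the tree keeps splitting until each leaf is pure, it drives training error to zero whenever the attributes allow it, which is exactly the behavior that makes overfitting and post-pruning relevant.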