Statistical Classification
Rong Jin

Outline: classification problems and examples; learning from examples; k nearest neighbor (kNN); cross validation and the leave-one-out method; probabilistic interpretation of kNN; weighted kNN; optimization and gradient ascent; instance-based learning (Ch. 8); KD trees for nearest neighbor search; the curse of dimensionality

Classification Problems
f : X → Y (Input: X, Output: Y)
- Given input X = {x1, x2, ..., xm}, predict the class label y ∈ Y
- Y = {-1, +1}: binary classification problems
- Y = {1, 2, 3, ..., c}: multi-class classification problems
- Goal: learn the function f : X → Y

Examples of Classification Problems
Text categorization:
- Input features X: word frequencies, e.g. {(campaigning, 1), (democrats, 2), (basketball, 0), ...}
- Class label y: Y = +1: 'politics'; Y = -1: 'non-politics'
- Example doc: "Months of campaigning and weeks of round-the-clock efforts in Iowa all came down to a final push Sunday, ..." → Topic: politics or non-politics?

Image classification:
- Input features X: color histogram, e.g. {(red, 1004), (blue, 23000), ...}
- Class label
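To make the feature representation concrete, here is a minimal sketch of turning a document into a word-frequency vector x with a class label y. The vocabulary and the example document are illustrative, not from the slides:

```python
# Sketch: a document as a word-frequency feature vector x with class label y.
# The vocabulary and the example document below are illustrative only.
vocabulary = ["campaigning", "democrats", "basketball"]

def to_feature_vector(doc):
    """Count how often each vocabulary word occurs in the document."""
    words = doc.lower().split()
    return [words.count(w) for w in vocabulary]

doc = "months of campaigning and weeks of campaigning efforts"
x = to_feature_vector(doc)  # input features X, here [2, 0, 0]
y = +1                      # class label: +1 = 'politics', -1 = 'non-politics'
```

In practice the vocabulary would come from the full training corpus, and raw counts are often normalized (e.g. TF-IDF).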
y: Y = +1: 'bird image'; Y = -1: 'non-bird image'
- Which images are birds, and which are not?

Classification Problems
f : X → Y
- f: doc → topic (politics / not-politics)
- f: image → topic (birds / not-birds)
- How to obtain f? Learn the classification function f from examples.

Learning from Examples
Training examples are independent and identically distributed (i.i.d.):
- Each training example is drawn independently from the identical source
- Training examples are similar to testing examples
- D_train = {(x_1, y_1), (x_2, y_2), ..., (x_n, y_n)}
- n: the number of training examples
- x_i = (x_{i,1}, x_{i,2}, ..., x_{i,d}) ∈ R^d
- y ∈ Y; binary class: Y = {+1, -1}; multiple class: Y = {1, 2, ..., c}

Learning from Examples
- Given training examples D_train = {(x_1, y_1), (x_2, y_2), ..., (x_n, y_n)}
- Goal: learn a classification function f(x): X → Y that is consistent with the training examples
- What is the easiest way to do it?

K Nearest Neighbor (kNN) Approach
- Classify a test point by the majority class label among its k nearest training points (illustrated on the slides for k = 1 and k = 4)
- How many neighbors should we count?

Cross Validation
- Divide the training examples into two sets: a training set (80%) and a validation set (20%)
- Predict the class labels of the examples in the validation set using the examples in the training set
- Choose the number of neighbors k that maximizes the classification accuracy

Leave-One-Out Method
For k = 1, 2, ..., K:
  Err(k) = 0
  1. Randomly select a training data point and hide its class label
  2. Use the remaining data and the given k to predict the class label for the held-out point
  3.
Err(k) = Err(k) + 1 if the predicted label is different from the true label
  Repeat the procedure until all training examples have been tested
Choose the k whose Err(k) is minimal
- Slide example (k = 1): after the first misclassified point, Err(1) = 1
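The leave-one-out procedure above can be sketched in a few lines. The 1-D data points and the odd values of k below are made up for illustration (odd k avoids voting ties in the binary case):

```python
# Sketch of leave-one-out selection of k for a kNN classifier.
# Data: 1-D points (x, y) with y in {-1, +1}; the values are illustrative only.
from collections import Counter

def knn_predict(train, x, k):
    """Majority label among the k training points nearest to x."""
    neighbors = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

def loo_error(data, k):
    """Err(k): hold out each point in turn and count wrong predictions."""
    err = 0
    for i, (x, y) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        if knn_predict(rest, x, k) != y:
            err += 1
    return err

data = [(1.0, -1), (1.5, -1), (2.0, -1), (4.5, -1),
        (5.1, +1), (5.5, +1), (6.0, +1)]
errs = {k: loo_error(data, k) for k in (1, 3, 5)}
best_k = min(errs, key=errs.get)  # choose the k whose Err(k) is minimal
print(errs, best_k)
```

On this toy set the single outlier at 4.5 is always misclassified, and k = 5 starts dragging in far-away points from the wrong class, which previews the bias side of the tradeoff discussed below.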
Err(k) = Err(k) + 1 if the predicted label is different from the true label
Repeat the procedure until all training examples have been tested, and choose the k whose Err(k) is minimal
- Slide example result: Err(1) = 3, Err(2) = 2, Err(3) = 6, so choose k = 2

Probabilistic Interpretation of kNN
- Estimate the probability density function Pr(y|x) around the location of x
- Count the data points in class y in the neighborhood of x
- Bias and variance tradeoff
- A small neighborhood ...
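The neighborhood-count estimate of Pr(y|x) described above can be sketched as follows, on illustrative 1-D data:

```python
# Sketch: kNN as a posterior estimate. Among the k nearest neighbors of x,
# the fraction belonging to class y estimates Pr(y|x). Data are illustrative.
from collections import Counter

def knn_posterior(train, x, k):
    """Estimate Pr(y|x) as the per-class fraction of the k nearest points."""
    neighbors = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    counts = Counter(label for _, label in neighbors)
    return {label: c / k for label, c in counts.items()}

train = [(1.0, -1), (1.5, -1), (2.0, -1), (5.0, +1), (5.5, +1), (6.0, +1)]
print(knn_posterior(train, 2.5, 4))  # {-1: 0.75, 1: 0.25}
```

A small k (small neighborhood) makes the estimate follow the local data closely, giving low bias but high variance; a large k smooths the estimate, trading variance for bias.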