MSU CSE 847 - online
Course Cse 847-
Pages 32

This preview shows pages 1-2, 15-16, and 31-32 of 32 pages.


Unformatted text preview:

Online Learning
Rong Jin

Batch Learning
• Given a collection of training examples D
• Learn a classification model from D
• What if training examples are received one at a time?

Online Learning
For t = 1, 2, …, T
• Receive an instance
• Predict its class label
• Receive the true class label
• Encounter loss
• Update the classification model

Objective
• Minimize the total loss
• Loss function
  - Zero-one loss: [...]
  - Hinge loss: [...]

Loss Functions
[Figure: the zero-one loss and the hinge loss]

Linear Classifiers
• Restrict our discussion to linear classifiers
• Prediction: [...]
• Confidence: [...]

Separable Set

Inseparable Sets

Why Online Learning?
• Fast
• Memory efficient: processes one example at a time
• Simple to implement
• Formal guarantees: regret/mistake bounds
• Online-to-batch conversions
• No statistical assumptions
• Adaptive
• Not as good as a well-designed batch algorithm

Update Rules
• Online algorithms are based on an update rule which defines [...] from [...] (and possibly other information)
• Linear classifiers: find [...] from [...] based on the input
• Some update rules:
  - Perceptron (Rosenblatt)
  - ALMA (Gentile)
  - ROMMA (Li & Long)
  - NORMA (Kivinen et al.)
  - MIRA (Crammer & Singer)
  - EG (Littlestone and Warmuth)
  - Bregman based (Warmuth)

Perceptron
Initialize [...]
For t = 1, 2, …, T
• Receive an instance
• Predict its class label
• Receive the true class label
• If [...] then [...]

Geometrical Interpretation

Mistake Bound: Separable Case
• Assume the data set D is linearly separable with margin [...], i.e., [...]
• Assume [...]
• Then the maximum number of mistakes made by the Perceptron algorithm is bounded by [...]

Mistake Bound: Separable Case

Mistake Bound: Inseparable Case
• Let [...] be the best linear classifier
• We measure our progress by [...]
• Consider the case where we make a mistake for [...]

Mistake Bound: Inseparable Case
• Result 1: [...]

Mistake Bound: Inseparable Case
• Result 2: [...]

Perceptron with Projection
Initialize [...]
For t = 1, 2, …, T
• Receive an instance
• Predict its class label
• Receive the true class label
• If [...] then [...]
• If [...] then [...]

Remarks
• The mistake bound is measured for a sequence of classifiers
• The bound does not depend on the dimension of the feature vector
• The bound holds for all sequences (no i.i.d. assumption)
• It is not tight for most real-world data, but it cannot be improved in general

Perceptron
Initialize [...]
For t = 1, 2, …, T
• Receive an instance
• Predict its class label
• Receive the true class label
• If [...] then [...]
Conservative: updates the classifier only when it misclassifies

Aggressive Perceptron
Initialize [...]
For t = 1, 2, …, T
• Receive an instance
• Predict its class label
• Receive the true class label
• If [...] then [...]

Regret Bound

Learning a Classifier
• The evaluation (mistake bound or regret bound) concerns a sequence of classifiers
• But at the end of the day, which classifier should be used? The last one? One chosen by cross-validation?

Learning with Expert Advice
• Learning to combine the predictions from multiple experts
• An ensemble of d experts: [...]
• Combination weights: [...]
• Combined classifier: [...]

Hedge
Simple case
• There exists one expert, denoted by [...], who can perfectly classify all the training examples
• What is your learning strategy?
Difficult case
• What if we don't have such a perfect expert?

Hedge Algorithm
[Figure: expert predictions +1, -1, +1, +1]

Hedge Algorithm
Initialize [...]
For t = 1, 2, …, T
• Receive a training example
• Prediction: [...]
• If [...] then
  For i = 1, 2, …, d
  • If [...] then [...]

Mistake Bound

Mistake Bound
• Measure the progress
• Lower bound

Mistake Bound
• Upper bound

Mistake Bound
• Upper bound

Mistake
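
The Perceptron slides lose their formulas in this preview, so the following is only a minimal sketch of the textbook conservative Perceptron, not necessarily the exact rule on the slide. It assumes labels in {-1, +1}, a zero initial weight vector, ties at a score of 0 predicted as +1, and the standard update w ← w + y·x on a mistake. The names `perceptron`, `mistake_bound`, and `demo_examples` are hypothetical, and `mistake_bound` evaluates the classical R²/γ² bound for a unit-norm separator `u` supplied by the caller.

```python
def dot(w, x):
    """Inner product of two equal-length sequences."""
    return sum(wi * xi for wi, xi in zip(w, x))


def perceptron(examples, dim):
    """Run one online pass of the Perceptron over (x, y) pairs, y in {-1, +1}.

    Returns the final weight vector and the number of mistakes made.
    """
    w = [0.0] * dim                      # assumed initialization: zero vector
    mistakes = 0
    for x, y in examples:
        y_hat = 1 if dot(w, x) >= 0 else -1   # predict before seeing the label
        if y_hat != y:                        # conservative: update on mistakes only
            mistakes += 1
            w = [wi + y * xi for wi, xi in zip(w, x)]
    return w, mistakes


def mistake_bound(examples, u):
    """Classical separable-case bound R^2 / gamma^2 for a unit-norm separator u."""
    r_squared = max(dot(x, x) for x, _ in examples)
    gamma = min(y * dot(u, x) for x, y in examples)
    if gamma <= 0:
        raise ValueError("u does not separate the examples with positive margin")
    return r_squared / gamma ** 2


# Hypothetical toy data, separable by u = (1, 0) with margin 0.7.
demo_examples = [
    ((1.0, 0.5), 1),
    ((-1.0, 0.5), -1),
    ((0.8, -0.6), 1),
    ((-0.7, -0.2), -1),
]

if __name__ == "__main__":
    w, mistakes = perceptron(demo_examples * 2, dim=2)   # two passes over the stream
    print("weights:", w, "mistakes:", mistakes)
    print("bound:", mistake_bound(demo_examples, (1.0, 0.0)))
```

The learner stores only the current weight vector and touches each example once, which is the memory and speed argument made on the "Why Online Learning?" slide.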
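
The separable-case bound itself is not preserved in the preview. For reference, the classical argument (usually attributed to Novikoff) runs as below; the notation is an assumption rather than the slide's own: a unit vector $u$ with $y_t\,u^\top x_t \ge \gamma$ for all $t$, $\|x_t\| \le R$, $w_1 = 0$, the update $w_{t+1} = w_t + y_t x_t$ applied only when $y_t\,w_t^\top x_t \le 0$, and $M$ the number of mistakes over $T$ rounds.

```latex
\begin{align*}
u^\top w_{t+1} &= u^\top w_t + y_t\, u^\top x_t \ge u^\top w_t + \gamma
  &&\Rightarrow\quad u^\top w_{T+1} \ge M\gamma, \\
\|w_{t+1}\|^2 &= \|w_t\|^2 + 2 y_t\, w_t^\top x_t + \|x_t\|^2 \le \|w_t\|^2 + R^2
  &&\Rightarrow\quad \|w_{T+1}\|^2 \le M R^2, \\
M\gamma &\le u^\top w_{T+1} \le \|w_{T+1}\| \le \sqrt{M}\, R
  &&\Rightarrow\quad M \le \frac{R^2}{\gamma^2}.
\end{align*}
```

Neither inequality involves the dimension of $x_t$ or any distributional assumption, which matches the points made on the Remarks slide.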
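
The Hedge Algorithm slide keeps its control flow but not its formulas, so the sketch below is an assumption-laden illustration rather than the slide's exact rule. It follows the surviving structure (update only when the combined prediction is wrong, and then only for the experts that erred) and fills the gaps with common choices: unit initial weights, a weighted-vote prediction with ties broken toward +1, and a multiplicative discount `beta`. The function name `hedge_on_mistakes` and the toy data are hypothetical.

```python
def hedge_on_mistakes(expert_predictions, labels, beta=0.5):
    """Combine d experts online with multiplicative weight updates.

    expert_predictions[t][i] is expert i's prediction (-1 or +1) in round t.
    Weights change only in rounds where the combined prediction is wrong,
    and only for the experts that were wrong in that round.
    """
    d = len(expert_predictions[0])
    weights = [1.0] * d                     # assumed initialization: uniform weights
    mistakes = 0
    for preds, y in zip(expert_predictions, labels):
        score = sum(w * p for w, p in zip(weights, preds))
        y_hat = 1 if score >= 0 else -1     # weighted vote, ties go to +1
        if y_hat != y:
            mistakes += 1
            # Discount every expert that erred in this round by beta.
            weights = [w * beta if p != y else w
                       for w, p in zip(weights, preds)]
    return weights, mistakes


# Hypothetical toy stream: expert 0 is always right, expert 1 always says +1,
# expert 2 is always wrong.
demo_labels = [1, -1, 1, 1, -1, -1]
demo_predictions = [(y, 1, -y) for y in demo_labels]

if __name__ == "__main__":
    weights, mistakes = hedge_on_mistakes(demo_predictions, demo_labels)
    print("weights:", weights, "mistakes:", mistakes)
```

On this stream the perfect expert keeps its weight while the other two are discounted after each combined mistake, so the vote is eventually dominated by the reliable expert even though the learner is never told which expert that is.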

