UT Dallas CS 6375 - AdaBoosting2

Real AdaBoosting

There are many extensions to AdaBoosting. One extension that was found to outperform standard AdaBoosting in many practical cases is described here. As in the previous case we assume access to weak classifiers $h_i$ that produce as output $+1$ or $-1$ values.

Input: the training data, given as $N$ pairs $(x_i, y_i)$, where $x_i$ is the attributes vector and $y_i$ is the desired output, either $+1$ or $-1$. The number of iterations $T$.

Output: a function $f_T(x)$ that can be used to classify an attributes vector $x$: if $f_T(x) < 0$ classify $x$ as $-1$; if $f_T(x) > 0$ classify $x$ as $+1$.

Initialization: associate a probability $p_i = \frac{1}{N}$ with the $i$th example.

Iterate: for $t = 1, \ldots, T$ compute the hypothesis $h_t$, the function $g_t$, and an update to the probabilities $p_1, \ldots, p_N$ by the following steps:

a. Select (at random, with replacement) a subset $S_t$ of the training examples; the $i$th example is selected with probability $p_i$.

b. For each classifier $h_j$ compute the following values:

  $P^+_r(j) = \sum p_i$, where the summation is over examples $x_i$ satisfying $h_j(x_i) = 1$, $y_i = 1$
  $P^-_r(j) = \sum p_i$, where the summation is over examples $x_i$ satisfying $h_j(x_i) = -1$, $y_i = -1$
  $P^+_w(j) = \sum p_i$, where the summation is over examples $x_i$ satisfying $h_j(x_i) = -1$, $y_i = 1$
  $P^-_w(j) = \sum p_i$, where the summation is over examples $x_i$ satisfying $h_j(x_i) = 1$, $y_i = -1$

  $G(j) = \sqrt{P^+_r(j)\,P^-_w(j)} + \sqrt{P^+_w(j)\,P^-_r(j)}$

Choose as $h_t$ a classifier $h_j$ that minimizes $G(j)$. If $G(t) \geq 0.5$ go back to Step a.

c. Calculate the weights $c^+_t$, $c^-_t$ and the function $g_t$ as follows:

  $c^+_t = \frac{1}{2}\ln\frac{P^+_r(t)+\epsilon}{P^-_w(t)+\epsilon}$,  $c^-_t = \frac{1}{2}\ln\frac{P^+_w(t)+\epsilon}{P^-_r(t)+\epsilon}$,  $g_t(x) = \begin{cases} c^+_t & \text{if } h_t(x) = 1 \\ c^-_t & \text{if } h_t(x) = -1 \end{cases}$

where $\epsilon$ is a small positive number. For large values of $N$ the recommended value is $\epsilon = \frac{1}{4N}$, where $N$ is the sample size.

d. Update the probabilities:

  pre-normalized $p_i = p_i\,e^{-y_i g_t(x_i)}$,  normalized $p_i = \frac{\text{pre-normalized } p_i}{Z_t}$

where $Z_t$ is a normalization factor chosen so that $\sum_i p_i = 1$.

Termination:

  $f_T(x) = \sum_{t=1}^{T} g_t(x)$

As with standard AdaBoosting we can show that $f_{t+1}$ performs better than $f_t$ in terms of an upper bound on the error in classifying the training data. To measure the algorithm's performance define the following error indicator:

  $\text{wrong}(f, x_i, y_i) = \begin{cases} 1 & \text{if } f \text{ does not classify } x_i \text{ as } y_i \\ 0 & \text{if } f \text{ classifies } x_i \text{ as } y_i \end{cases}$

The fractional error of $f_T$ is: $E_T = \frac{1}{N}\sum_{i=1}^{N} \text{wrong}(f_T, x_i, y_i)$.

Theorem:

  $E_T \leq \prod_{t=1}^{T} 2\left(\sqrt{P^+_r(t)\,P^-_w(t)} + \sqrt{P^+_w(t)\,P^-_r(t)}\right)$

Proof: First observe that the following inequality holds: $\text{wrong}(f_T, x_i, y_i) \leq e^{-y_i f_T(x_i)}$, since a misclassification means $y_i f_T(x_i) \leq 0$, in which case the right-hand side is at least 1. Therefore:

  $E_T \leq \frac{1}{N}\sum_{i=1}^{N} e^{-y_i f_T(x_i)} = \frac{1}{N}\sum_{i=1}^{N} e^{-y_i \sum_{t=1}^{T} g_t(x_i)}$  (1)

Let $p^t_i$ denote the value of $p_i$ at the beginning of iteration $t$. Then:

  $p^{T+1}_i = \frac{p^T_i\,e^{-y_i g_T(x_i)}}{Z_T} = \frac{p^{T-1}_i\,e^{-y_i g_{T-1}(x_i)}\,e^{-y_i g_T(x_i)}}{Z_T Z_{T-1}} = \cdots = \frac{p^1_i\,e^{-y_i \sum_{t=1}^{T} g_t(x_i)}}{\prod_{t=1}^{T} Z_t} = \frac{1}{N}\cdot\frac{e^{-y_i \sum_{t=1}^{T} g_t(x_i)}}{\prod_{t=1}^{T} Z_t}$

Combining this with (1) we have:

  $E_T \leq \sum_{i=1}^{N} p^{T+1}_i \prod_{t=1}^{T} Z_t = \prod_{t=1}^{T} Z_t$  (2)

This bound on $E_T$ is in terms of the $Z_t$. It remains to express it in terms of the $P$ values. Define $Q_t(i) = p_i\,e^{-y_i g_t(x_i)}$. Then, splitting the sum over the four groups of examples with $(h_t(x_i), y_i)$ equal to $(1,1)$, $(-1,-1)$, $(-1,1)$, and $(1,-1)$:

  $Z_t = \sum_{i=1}^{N} Q_t(i) = e^{-c^+_t} P^+_r(t) + e^{c^-_t} P^-_r(t) + e^{-c^-_t} P^+_w(t) + e^{c^+_t} P^-_w(t)$

Direct computation (setting the derivatives of $Z_t$ with respect to $c^+_t$ and $c^-_t$ to zero) shows that $Z_t$ is minimized for $c^+_t = \frac{1}{2}\ln\frac{P^+_r(t)}{P^-_w(t)}$ and $c^-_t = \frac{1}{2}\ln\frac{P^+_w(t)}{P^-_r(t)}$, which gives:

  $Z_t = 2\left(\sqrt{P^+_r(t)\,P^-_w(t)} + \sqrt{P^+_w(t)\,P^-_r(t)}\right)$

Smoothing

With the above formulas for $c^+_t$, $c^-_t$ it may happen that some of the terms are too small or even zero, in which case the $c$ values will be very large or infinite. In practice, such large predictions may cause numerical problems as well as overfitting. The alternative smoothed formulas are:

  $c^+_t = \frac{1}{2}\ln\frac{P^+_r(t)+\epsilon}{P^-_w(t)+\epsilon}$,  $c^-_t = \frac{1}{2}\ln\frac{P^+_w(t)+\epsilon}{P^-_r(t)+\epsilon}$

where it can be shown that with $\epsilon < \frac{1}{2N}$ this has a minimal effect on the $Z$ values.
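To make the iteration concrete, here is a minimal NumPy sketch of the loop above, together with an empirical check of the theorem's bound. It assumes a hypothetical pool of axis-aligned threshold stumps as the weak classifiers $h_j$ (the notes leave the weak learner unspecified), scores every candidate directly against the current weights $p_i$ rather than resampling a subset $S_t$, and uses the smoothed formulas with $\epsilon = \frac{1}{4N}$. The names make_stumps, real_adaboost, and f_T are illustrative, not from the notes.

```python
import numpy as np

def make_stumps(X):
    """Candidate weak-classifier pool: one threshold stump per
    (feature, threshold, sign), each mapping X -> {+1, -1}."""
    stumps = []
    for f in range(X.shape[1]):
        for thr in np.unique(X[:, f]):
            for s in (1, -1):
                stumps.append(lambda Z, f=f, t=thr, s=s:
                              np.where(Z[:, f] > t, s, -s))
    return stumps

def real_adaboost(X, y, stumps, T=20):
    """Sketch of the Real AdaBoosting loop above. Step a (resampling
    S_t with probabilities p_i) is how a fresh weak learner would be
    trained each round; with a fixed candidate pool we instead score
    every stump directly against the current weights."""
    N = len(y)
    p = np.full(N, 1.0 / N)            # initialization: p_i = 1/N
    eps = 1.0 / (4 * N)                # recommended smoothing constant
    ensemble, Zs = [], []
    for _ in range(T):
        # Step b: P values and G(j) for every candidate classifier.
        best = None
        for h in stumps:
            pred = h(X)
            Ppr = p[(pred == 1) & (y == 1)].sum()
            Pmr = p[(pred == -1) & (y == -1)].sum()
            Ppw = p[(pred == -1) & (y == 1)].sum()
            Pmw = p[(pred == 1) & (y == -1)].sum()
            G = np.sqrt(Ppr * Pmw) + np.sqrt(Ppw * Pmr)
            if best is None or G < best[0]:
                best = (G, h, Ppr, Pmr, Ppw, Pmw)
        G, h, Ppr, Pmr, Ppw, Pmw = best
        if G >= 0.5:                   # no useful weak classifier left
            break
        # Step c: smoothed confidence weights c+_t and c-_t.
        cp = 0.5 * np.log((Ppr + eps) / (Pmw + eps))
        cm = 0.5 * np.log((Ppw + eps) / (Pmr + eps))
        ensemble.append((h, cp, cm))
        # Step d: reweight by e^{-y_i g_t(x_i)}, then renormalize by Z_t.
        p = p * np.exp(-y * np.where(h(X) == 1, cp, cm))
        Zs.append(p.sum())             # Z_t = sum_i Q_t(i)
        p /= Zs[-1]
    return ensemble, Zs

def f_T(ensemble, X):
    """f_T(x) = sum_t g_t(x); classify by the sign of f_T."""
    return sum(np.where(h(X) == 1, cp, cm) for h, cp, cm in ensemble)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 2))
    y = np.where(X[:, 0] + X[:, 1] > 0, 1, -1)   # toy labels
    ensemble, Zs = real_adaboost(X, y, make_stumps(X), T=10)
    E_T = np.mean(np.sign(f_T(ensemble, X)) != y)
    print(f"training error {E_T:.3f} <= bound {np.prod(Zs):.3f}")
```

Because $p$ is renormalized every round, each $Z_t$ is exactly the pre-normalization weight sum recorded in Zs, so the demo prints the training error $E_T$ next to $\prod_t Z_t$ and confirms inequality (2) on the toy data.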
Reference: R. E. Schapire and Y. Singer. Improved Boosting Algorithms Using Confidence-rated Predictions. Machine Learning, 37(3):297-336, 1999.

