UT Dallas CS 6375 - 8.CrossValidation

Machine Learning, CS 6375 --- Spring 2015
Cross-Validation
Instructor: Yang Liu

Avoiding Overfitting
• We have a choice of different techniques: decision trees, nearest neighbors, Bayes classifiers, neural networks, ...
• For each we have different levels of complexity:
– Depth of trees
– Number of neighbors in k-NN
– Number of layers and hidden units
– ...
• How do we choose the right one?
• Overfitting: a complex enough model (e.g., a large enough tree) will always be able to fit the training data well.

Example
• Construct a predictor of y from x given this training data.
[Figure: several candidate models of increasing complexity fit to the same training data.]
• Which model is best for predicting y from x? We want the model that generates the best predictions on future data, not necessarily the one with the lowest error on the training data.

Using a Test Set
1. Use a portion (e.g., 30%) of the data as test data.
2. Fit a model to the remaining training data.
3. Evaluate the error on the test data.

Using a Test Set:
+ Simple
- Wastes a large % of the data
- May get lucky with one particular subset of the data

"Leave One Out" Cross-Validation
• For k = 1 to R:
– Train on all the data, leaving out (x_k, y_k)
– Evaluate the error on (x_k, y_k)
• Report the average error after trying all the data points.

"Leave One Out" Cross-Validation:
+ Does not waste data
+ Averages over a large number of trials
- Expensive

K-Fold Cross-Validation
• Randomly divide the data set into K subsets.
• For each subset S:
– Train on the data not in S
– Test on the data in S
• Return the average error over the K subsets.
[Figure: example with K = 3, each color corresponding to one subset.]

Classification Problems
• Exactly the same approaches apply for cross-validation, except that the error is the number of data points that are misclassified.

Example: CV for k-NN
• For each candidate value of k in k-NN, evaluate the error using K-fold cross-validation.
• Choose the value of k with the minimum cross-validation error.
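The three-step test-set recipe above maps directly onto a few lines of code. A minimal sketch in Python with NumPy, assuming a toy regression dataset and a degree-3 polynomial fit as a stand-in for whichever model is being evaluated (both are illustrative choices, not part of the original slides):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression data (illustrative): y is a noisy function of x.
x = rng.uniform(0.0, 3.0, size=100)
y = np.sin(2.0 * x) + 0.3 * rng.standard_normal(100)

# 1. Use a portion (e.g., 30%) of the data as test data.
perm = rng.permutation(len(x))
n_test = int(0.3 * len(x))
test_idx, train_idx = perm[:n_test], perm[n_test:]

# 2. Fit a model to the remaining training data
#    (a degree-3 polynomial as a stand-in model).
coeffs = np.polyfit(x[train_idx], y[train_idx], deg=3)

# 3. Evaluate the error (here, mean squared error) on the test data.
pred = np.polyval(coeffs, x[test_idx])
test_mse = np.mean((pred - y[test_idx]) ** 2)
print(f"held-out test MSE: {test_mse:.4f}")
```

Because only a single split is used, the reported error depends on which 30% happened to land in the test set; that is exactly the "may get lucky with one particular subset" drawback the slide lists.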
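The "leave one out" loop in the slides translates almost line for line into code. A sketch under the same illustrative assumptions (toy data, polynomial stand-in model):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0.0, 3.0, size=50)
y = np.sin(2.0 * x) + 0.3 * rng.standard_normal(50)

R = len(x)
errors = np.empty(R)
for k in range(R):
    # Train on all the data, leaving out (x_k, y_k).
    mask = np.arange(R) != k
    coeffs = np.polyfit(x[mask], y[mask], deg=3)
    # Evaluate the squared error on the held-out point (x_k, y_k).
    errors[k] = (np.polyval(coeffs, x[k]) - y[k]) ** 2

# Report the average error after trying all the data points.
print(f"LOOCV mean squared error: {np.mean(errors):.4f}")
```

The loop fits the model R times, once per data point, which is the "expensive" drawback noted above.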
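Finally, combining the last two slides: a sketch of K-fold cross-validation used to pick the number of neighbors for a k-NN classifier, scoring by misclassification rate. The synthetic two-class data, the K = 3 folds, the candidate neighbor counts, and the helper names knn_predict and kfold_error are all hypothetical illustrations; only the fold loop and the "minimum cross-validation error" selection rule come from the slides.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic two-class data (illustrative): two Gaussian blobs in 2-D.
X = np.vstack([rng.normal(0.0, 1.0, (60, 2)),
               rng.normal(2.0, 1.0, (60, 2))])
y = np.array([0] * 60 + [1] * 60)

def knn_predict(X_train, y_train, X_query, k):
    """Predict labels by majority vote among the k nearest training points."""
    preds = np.empty(len(X_query), dtype=int)
    for i, q in enumerate(X_query):
        dist = np.linalg.norm(X_train - q, axis=1)
        nearest = y_train[np.argsort(dist)[:k]]
        preds[i] = np.bincount(nearest).argmax()
    return preds

def kfold_error(X, y, k_neighbors, K=3, seed=0):
    """Average misclassification rate over K random folds."""
    idx = np.random.default_rng(seed).permutation(len(X))
    folds = np.array_split(idx, K)          # randomly divide into K subsets
    errs = []
    for S in folds:
        train = np.setdiff1d(idx, S)        # train on the data not in S
        pred = knn_predict(X[train], y[train], X[S], k_neighbors)
        errs.append(np.mean(pred != y[S]))  # test on the data in S
    return float(np.mean(errs))

# For each candidate k in k-NN, evaluate the K-fold CV error;
# choose the one with the minimum cross-validation error.
candidates = [1, 3, 5, 7, 9, 15]
cv_errors = {k: kfold_error(X, y, k) for k in candidates}
best_k = min(cv_errors, key=cv_errors.get)
print(cv_errors)
print(f"selected k = {best_k}")
```

Note that every candidate k is scored on the same folds, so the comparison across candidates is not confounded by the randomness of the split.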

