Testing classifier accuracy
CSE634
Professor Anita Wasilewska

Overview
• Introduction
• Basic concepts of training and testing
• Resubstitution (N ; N)
• Holdout (2N/3 ; N/3)
• k-fold cross-validation (N-N/k ; N/k)
• Leave-one-out (N-1 ; 1)

Introduction: Predictive Accuracy Evaluation
The main methods of predictive accuracy evaluation are:
• Resubstitution (N ; N)
• Holdout (2N/3 ; N/3)
• k-fold cross-validation (N-N/k ; N/k)
• Leave-one-out (N-1 ; 1)
where N is the number of records (instances) in the dataset.

Training and Testing
• REMEMBER: we must know the classification (class attribute values) of all instances (records) used in the test procedure.
• Basic concepts:
Success: an instance's (record's) class is predicted correctly.
Error: an instance's class is predicted incorrectly.
Error rate: the percentage of errors made over the whole set of instances (records) used for testing.
Predictive accuracy: the percentage of correctly classified records in the test data set.

Training and Testing
• Example:
Testing Rules (testing record #1) = record #1.class - Success
Testing Rules (testing record #2) not= record #2.class - Error
Testing Rules (testing record #3) = record #3.class - Success
Testing Rules (testing record #4) = record #4.class - Success
Testing Rules (testing record #5) not= record #5.class - Error
Errors: 2 (records #2 and #5)
Error rate = 2/5 = 40%
Predictive accuracy = 3/5 = 60%
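To make the error-rate arithmetic above concrete, here is a minimal Python sketch (Python is an assumption; the slides contain no code). The class labels are hypothetical stand-ins for the five records' class attribute, chosen so that records #2 and #5 are the errors:

```python
# Error rate and predictive accuracy on a test set of 5 records.
# The labels are hypothetical; the errors fall on records #2 and #5,
# mirroring the worked example above.
actual    = ["yes", "yes", "no", "no", "yes"]  # known class attribute values
predicted = ["yes", "no",  "no", "no", "no"]   # classes predicted by the rules

errors = sum(1 for p, a in zip(predicted, actual) if p != a)

error_rate = errors / len(actual)         # 2/5 = 40%
predictive_accuracy = 1 - error_rate      # 3/5 = 60%

print(f"Error rate: {error_rate:.0%}")
print(f"Predictive accuracy: {predictive_accuracy:.0%}")
```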
Resubstitution (N ; N)

Re-substitution Error Rate
• The re-substitution error rate is obtained from the training data.
• The training data error reflects the uncertainty of the rules.
• The error rate is not always 0%, but it is usually (and hopefully) very low!
• The re-substitution error rate indicates only how good (or bad) our results (rules, patterns, NN) are on the TRAINING data; it expresses some knowledge about the algorithm used.

Re-substitution Error Rate
• The re-substitution error rate is usually used as a performance measure:
The training error rate reflects the imprecision of the training results: the lower, the better.
In the case of rules it is called rule accuracy.
Predictive accuracy reflects how good the training results are with respect to the test data: the higher, the better.
(N ; N) re-substitution does not compute predictive accuracy.
• Re-substitution error rate = training data error rate.

Why not always 0%?
• The error rate on the training data is not always 0%, because algorithms involve different (often statistical) parameters and measures that lead to uncertainties.
• The training error is used for "parameter tuning".
• The error on the training data is NOT a good indicator of performance on future data, since it does not measure any not-yet-seen data.
• Solution: split the data into a training set and a test set.

Training and test set
• Training and test data may differ in nature, but must have the same format.
Example: given customer data from two different towns A and B, we train the classifier on the data from town A and test it on the data from town B, and vice versa.

Training and test set
• It is important that the test data is not used in any way to create the training rules.
• In fact, learning schemes operate in three stages:
Stage 1: build the basic structure (training).
Stage 2: optimize the parameter settings; (N ; N) re-substitution can be used here (parameter tuning).
Stage 3: use the test data to compute the predictive accuracy / error rate.
• The proper procedure uses three sets: training data, validation data, and test data.
• Validation data, not test data, is used for parameter tuning; the validation data can be the training data or a subset of it.
• The test data cannot be used for parameter tuning!

Training and testing
• Generally, the larger the training set, the better the classifier.
• The larger the test set, the more accurate the estimate of the predictive accuracy or error.
• Remember: the error rate of re-substitution (N ; N) can tell us ONLY whether the algorithm used in training is good, and how good it is.
• Holdout procedure: a method of splitting the original data into a training set and a test set.
• Dilemma: ideally both the training and the test set should be large! What to do if the amount of data is limited?
• How do we split the data into disjoint training and test subsets?

Holdout (N - N/3 ; N/3)
• The holdout method reserves a certain amount of data for testing and uses the remainder for training, so the two sets are disjoint!
• Usually one third (N/3) of the data is used for testing and the rest (N - N/3) = (2N/3) for training.
• The choice of records for the training and test data is essential, so we usually perform a cycle: train-and-test, repeat.

Repeated Holdout
• Holdout can be made more reliable by repeating the process with different sub-samples (subsets of the data):
1. In each iteration a certain proportion of the data is randomly selected for training; the rest is used for testing.
2. The error rates (or predictive accuracies) of the different iterations are averaged to yield an overall error rate (or predictive accuracy).
• Repeated holdout is still not optimal: the different test sets overlap.

k-fold cross-validation (N-N/k ; N/k)
• Cross-validation is used to prevent the overlap of the test sets.
• First step: split the data into k disjoint subsets D1, ..., Dk of equal size, called folds.
• Second step: use each subset in turn for testing and the remainder for training.
• Training and testing are performed k times.
• Each sample (record) is used the same number of times for training and exactly once for testing.

k-fold cross-validation predictive accuracy computation
• The predictive accuracy estimate is the overall number of correct classifications from all k iterations, divided by the total number of records N in the initial data.

Stratified cross-validation
• In stratified cross-validation the folds are stratified, i.e. the class distribution of the tuples (records) in each fold is approximately the same as that in the initial data.
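The k-fold procedure and its predictive accuracy computation can be sketched in a few lines of Python. This sketch uses scikit-learn's StratifiedKFold with a decision tree on the Iris data; the library, the classifier, and the dataset are illustrative assumptions, not something the slides prescribe:

```python
# Sketch: stratified 10-fold cross-validation.
# Predictive accuracy is computed exactly as defined above: correct
# classifications summed over all k test folds, divided by N.
# scikit-learn, the decision tree, and the Iris data are illustrative
# assumptions; the slides do not prescribe an implementation.
from sklearn.datasets import load_iris
from sklearn.model_selection import StratifiedKFold
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)  # any labeled dataset works here
N = len(y)

# Stratified folds: each fold's class distribution approximates
# the class distribution of the initial data.
skf = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)

correct = 0
for train_idx, test_idx in skf.split(X, y):
    # Train on k-1 folds, test on the held-out fold; the folds are
    # disjoint, so every record is tested exactly once.
    model = DecisionTreeClassifier(random_state=0)
    model.fit(X[train_idx], y[train_idx])
    correct += int((model.predict(X[test_idx]) == y[test_idx]).sum())

predictive_accuracy = correct / N
print(f"10-fold predictive accuracy: {predictive_accuracy:.1%}")
```

Replacing StratifiedKFold with KFold gives plain (unstratified) k-fold cross-validation, and setting n_splits equal to N then degenerates to leave-one-out (N-1 ; 1).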

