UCSD CSE 252C - The Geometry of ROC Space

The Geometry of ROC Space: Understanding Machine Learning Metrics through ROC Isometrics
by Peter A. Flach, International Conference on Machine Learning (ICML-2003)
http://www.cs.bris.ac.uk/Publications/Papers/1000704.pdf
Presented by Robin Hewitt

Relevance for Computer Vision

The application context is visual object recognition in a social robot. Robots and builders: Bruce with Leaf, Alex with Rocky, and Robin with Mabel. How can we translate user feedback ("Yes Mabel, that's right!" / "No Mabel, that's not me!") into improved recognition behavior? Can we provide simple, intuitive tuning of the recognition threshold (high/low) for a complex recognition system? With more positive and negative examples, more sophisticated recognition algorithms can be deployed. When should transitions occur?

Methods based in ROC-curve analysis provide tools for
• Creating optimal hybrid classifiers
• Transitioning between classifiers
• Providing a simple, one-knob interface for "tuning" a complex, hybrid classifier
• Separating use context – cost and class distribution – from classifier design

ROC Curves

ROC stands for Receiver Operating Characteristics. The terms and concepts come from signal-detection theory.

[Figure: a signal receiver's internal X levels over time, with events interspersed with noise; p(x) curves show the noise-level and event-level distributions of X.]

Each point on a receiver's ROC curve represents one cutoff threshold, X*. The red hatched area (the part of the noise distribution above X*) corresponds to the False Positive Rate; the green hatched area (the part of the event distribution above X*) corresponds to the True Positive Rate.
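The threshold picture above can be sketched in a few lines of code. This is a minimal illustration, not from the slides: the noise and event distributions are assumed Gaussian here purely for concreteness, and the means and sigmas are made up.

```python
# Sketch (not from the slides): trace an ROC curve by sweeping the cutoff
# threshold X* over two assumed score distributions. FPR is the mass of the
# noise distribution above X*; TPR is the mass of the event distribution above X*.
from statistics import NormalDist

noise = NormalDist(mu=0.0, sigma=1.0)   # X when there is no signal (assumed)
event = NormalDist(mu=1.5, sigma=1.0)   # X when there is a signal (assumed)

def roc_point(x_star):
    """(FPR, TPR) for one cutoff threshold X*."""
    fpr = 1.0 - noise.cdf(x_star)
    tpr = 1.0 - event.cdf(x_star)
    return fpr, tpr

# Sweep thresholds from permissive (everything called positive) to strict.
curve = [roc_point(x / 10) for x in range(-40, 61)]
```

Lowering X* pushes the point toward (1, 1); raising it pushes the point toward (0, 0).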
The ROC curve portrays the (FPR, TPR) values at all possible threshold levels for a classifier.

Discriminability, d′, is the distance between the two distributions – the distribution for the value of X when there's no signal and the distribution for the value of X when there is a signal.

The (0,1) point represents the Perfect Classifier: it's never wrong. The ascending diagonal from (0,0) to (1,1) represents d′ = 0; the internal levels of X in such a classifier carry no information for distinguishing between signal and noise. Curves for real classifiers fall somewhere in between. Any point that falls below the ascending diagonal can be mirrored across the point (0.5, 0.5): the information is there, but the class labels are reversed.

Traditional practice has been to assume every classifier's internal signal and noise distributions are Gaussians with a shared standard deviation, σ:

    d′ = (μ₂ − μ₁) / σ

In that case, all distribution pairs could be rescaled to unit standard deviation, and discriminability viewed as the absolute difference between the rescaled means:

    d′ = μ₂/σ − μ₁/σ

The difference between classifiers would then reduce to the linear distance between the rescaled means. If this assumption about classifiers were true, then for every pair of classifiers, the one with the larger distance between rescaled means would have a higher TPR value at every FPR value. In this case, all possible ROC curves would nest.

ROC Metrics

Under this assumption:
• Classifier accuracy has only one degree of freedom (1 DOF).
• Area Under the Curve (AUC) is a sufficient metric for classifiers.

In fact, this is exactly how ROC curves have traditionally been used: AUC is used to compare classifiers.
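The nesting claim can be checked numerically. A minimal sketch, not from the slides, assuming the noise distribution is N(0, 1) and the signal distribution is N(d′, 1): a threshold with a given FPR sits at X* = Φ⁻¹(1 − FPR), so TPR = 1 − Φ(X* − d′), and a larger d′ yields a higher TPR at every FPR.

```python
# Sketch (not from the slides): under the equal-sigma Gaussian assumption the
# whole ROC curve is determined by d'. With noise ~ N(0, 1) and signal
# ~ N(d', 1), the threshold achieving a given FPR is X* = Phi^-1(1 - FPR),
# and the TPR at that threshold is 1 - Phi(X* - d').
from statistics import NormalDist

std = NormalDist()  # standard normal

def tpr_at(fpr, d_prime):
    """TPR at a given FPR for an equal-sigma Gaussian classifier."""
    x_star = std.inv_cdf(1.0 - fpr)
    return 1.0 - std.cdf(x_star - d_prime)

# Larger d' gives a higher TPR at every FPR, so the curves nest:
fprs = [i / 100 for i in range(1, 100)]
weak = [tpr_at(f, 0.5) for f in fprs]
strong = [tpr_at(f, 2.0) for f in fprs]
```

At d′ = 0 the function reduces to TPR = FPR, the ascending diagonal.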
AUC ranges from 0.5 along the ascending diagonal to 1.0 for the Perfect Classifier. The Gini coefficient is similar: GINI = 2·AUC − 1.

[Figure: two crossing ROC curves with equal AUC.]

Here are two classifiers with ROC curves that don't nest. Which is better? What if the normal-distribution, equal-sigma assumption isn't valid for a classifier? AUC can't answer that: in this case, the two curves have equal AUC. But even if their AUC values differed, the fact that the curves cross one another means that a 1 DOF metric isn't sufficient.

Inherent Dimensionality

The confusion matrix:

                     Assigned P         Assigned N
    Actual P     True Positives    False Negatives
    Actual N    False Positives     True Negatives

Confusion matrix definitions:
    POS = number of positive examples
    NEG = number of negative examples
    TPR = True Positive Rate = TP / POS
    FPR = False Positive Rate = FP / NEG

TPR, FPR, and NEG/POS give 3 DOF in total. (Flach uses pos = POS/(POS+NEG) in place of NEG/POS.) The overall space for quality metrics is three-dimensional. ROC curves, however, are two-dimensional: TPR vs. FPR. A confusion matrix plots as one point on an ROC curve.

ROC Metrics – Cost

Dimensionality analysis indicates cost is a 3D function. To understand its shape, we find the equation for iso-cost curves in ROC-curve (FPR, TPR) space; this is a level-curve problem. Flach merges cost and class distribution into a single skew parameter, but total expected cost is itself a potential classifier metric, and it's a convenient starting point for analyzing the geometry of 3D ROC metrics.
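The definitions above translate directly into code. A minimal sketch, not from the slides; the sample confusion-matrix counts are made up, and the curves are piecewise-linear stand-ins:

```python
# Sketch (not from the slides): the confusion-matrix rates defined above,
# plus trapezoidal AUC and the Gini coefficient for a piecewise-linear
# ROC curve given as (FPR, TPR) points.

def rates(tp, fn, fp, tn):
    """(FPR, TPR) for one confusion matrix: one point in ROC space."""
    pos, neg = tp + fn, fp + tn
    return fp / neg, tp / pos

def auc(points):
    """Trapezoidal area under a piecewise-linear ROC curve."""
    pts = sorted(points)
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(pts, pts[1:]))

def gini(points):
    return 2 * auc(points) - 1

diagonal = [(0, 0), (1, 1)]            # d' = 0: AUC = 0.5, Gini = 0
perfect = [(0, 0), (0, 1), (1, 1)]     # Perfect Classifier: AUC = 1, Gini = 1
```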
Let
    c_fp = false positive cost,
    c_fn = false negative cost,
    C_tot = total expected cost,
    L_c(FPR) = a level curve (iso-cost curve) for cost.

We want to find

    d C_tot(FPR, L_c(FPR)) / d FPR = 0.

Total cost is

    C_tot = c_fp·FP + c_fn·FN.

Replacing FP and FN with FPR and TPR, using FPR = FP/NEG, TPR = TP/POS, and FN = POS − TP:

    C_tot(FPR, TPR) = c_fp·NEG·FPR + c_fn·POS·(1 − TPR).

By the chain rule,

    d C_tot(FPR, L_c(FPR)) / d FPR = ∂C_tot/∂FPR + ∂C_tot/∂TPR · dL_c(FPR)/dFPR
                                   = c_fp·NEG − c_fn·POS · dL_c(FPR)/dFPR = 0,

so

    dL_c(FPR)/dFPR = (c_fp·NEG) / (c_fn·POS).

Use context determines the four values on the right-hand side of this expression. For a given use context, then, iso-cost curves in (FPR, TPR) space are lines with slope

    m = (c_fp·NEG) / (c_fn·POS).

The value of C_tot for a level curve is specified by its TPR intercept:

    C_tot(0, TPR) = c_fn·POS·(1 − TPR).

For a given slope, choosing the iso-cost line with the largest possible TPR intercept minimizes C_tot.

Consider a family of iso-cost lines with identical slope, m₁. The line with the highest TPR intercept that is still tangent to an ROC curve minimizes total expected cost in this use context. In this situation, the classifier's optimal operating point is the point of tangency.
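For a discrete ROC curve the tangency argument reduces to a simple search. A minimal sketch, not from the slides; the curve points, costs, and class counts are made up. Note that since C_tot = c_fn·POS·(1 − (TPR − m·FPR)), minimizing cost is the same as maximizing the TPR intercept TPR − m·FPR of the iso-cost line through each point:

```python
# Sketch (not from the slides): pick the minimum-cost operating point on a
# discrete ROC curve. C_tot = c_fp*NEG*FPR + c_fn*POS*(1 - TPR), and
# minimizing it is equivalent to maximizing the TPR intercept TPR - m*FPR,
# where m = (c_fp*NEG) / (c_fn*POS) is the iso-cost slope.

def best_operating_point(points, c_fp, c_fn, neg, pos):
    """ROC point minimizing total expected cost in this use context."""
    def cost(p):
        fpr, tpr = p
        return c_fp * neg * fpr + c_fn * pos * (1 - tpr)
    return min(points, key=cost)

curve = [(0.0, 0.0), (0.1, 0.6), (0.3, 0.85), (0.6, 0.95), (1.0, 1.0)]

# Expensive false negatives: operate permissively (high TPR, high FPR).
lenient = best_operating_point(curve, c_fp=1, c_fn=5, neg=100, pos=100)
# Expensive false positives: operate conservatively (low FPR).
strict = best_operating_point(curve, c_fp=5, c_fn=1, neg=100, pos=100)
```

Changing the use context (costs or class counts) tilts the iso-cost slope m and slides the chosen operating point along the curve, which is exactly the one-knob tuning interface the introduction asks for.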

