Optimizing F-Measure

Home> Academic Documents> Optimizing F-Measure

DOC PREVIEW

This preview shows page 1-2-3-4-5-6 out of 18 pages.

Save

View full document

Premium Document

Do you want full access? Go Premium and unlock all 18 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 18 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 18 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 18 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 18 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 18 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

Premium Document

Do you want full access? Go Premium and unlock all 18 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

Unformatted text preview:

Optimizing F-Measure with Support Vector MachinesOverviewRoadmapThe Classification ProblemSlide 5Misclassification Count SVMApprox Misclassification Count SVMStandard “Soft Margin” SVMWeighted Standard SVMMeasures of successF-measureConstructing an F-measure SVMSlide 13The F-measure Maximizing SVMWeighted misclassification count SVMImplications of resultSlide 17Conclusions / SummaryOptimizing F-Measurewith Support Vector MachinesDavid R. MusicantVipin KumarAysel OzgurFLAIRS 2003Tuesday, May 13, 2003Carleton CollegeSlide 2OverviewClassification algorithms often evaluated by test set accuracyTest set accuracy can be a poor measure when one of the classes is rareSupport Vector Machines (SVMs) are designed to optimize test set accuracySVMs have been used in an ad-hoc manner on datasets with rare classesOur new results: current ad-hoc heuristic techniques can be theoretically justified.Slide 3RoadmapThe Traditional SVM and variantsPrecision, Recall, and F-measure metricsThe F-measure Maximizing SVMEquivalence of traditional SVM and F-measure SVM (for the right parameters)Implications and ConclusionsSlide 4The Classification ProblemSeparating Surface:A+A-•= “margin”Slide 5The Classification ProblemGiven m points in the n dimensional space RnEach point represented as xiMembership of each point Ai in the classes A+ or A- is specified by yi = § 1Separate by two bounding planes such that:More succinctly:for i=1,2,…,m.Slide 6Misclassification Count SVM(¢)* is the step function (1 if  > 0, 0 otherwise)“Push the planes apart, and minimize number of misclassified points.” –C balances two competing objectives–Minimizing w 0 w pushes planes apart–Problem NP-complete, objective non-differentiableSlide 7  > 0 is an arbitrary fixed constant that determines closeness of approximation.This is still difficult to solve.Approx Misclassification Count SVM•where we use some differentiable approximation, such asSlide 8Standard “Soft Margin” SVM“Push the planes apart, and minimize distance of misclassified points.” We minimize total distances from misclassified points to bounding planes, not actual number of them.Much more tractable, does quite well in optimizing accuracyDoes poorly when one class is rareSlide 9Weighted Standard SVM“Push the planes apart, and minimize weighted distance of misclassified points.” Allows one to choose different C values for the two classes.Often used to weight rare class more heavily.How do we measure success when one class is rare? Assuming that A+ is the rare class…Slide 10Measures of successPrecision and Recall are better descriptors when one class is rare.Slide 11F-measureF-measure: commonly used “average” of precision and recallCan C+ and C- in the weighted SVM be balanced to optimize F-measure?Can we start over and invent an SVM to optimize F-measure?Slide 12Constructing an F-measure SVMHow do we appropriately represent F-measure in an SVM?Substitute P and R into F:Thus to maximize F-measure, we minimizeSlide 13Want to minimizeFP = # misclassified A-FN = # misclassified A+New F-measure maximizing SVM:Constructing an F-measure SVMSlide 14The F-measure Maximizing SVMApproximate with sigmoid:Can we connect with standard SVM?Slide 15Weighted misclassification count SVMHow do these two formulations relate?We show:–Pick a parameter C.–Find classifier to optimize F-measure SVM.–There exist parameters C+ and C- such that misclassification counting SVM has same solution.–Proof and formulas to obtain C+ and C- in paper.F-measure maximizing SVMSlide 16Implications of resultSince there exist C+, C- to yield same solution as F-measure maximizing SVM, finding best C+ and C- for the weighted standard SVM is “the right thing to do.”(modulo approximations)In practice, common trick is to choose C+, C- such that:This heuristic seems reasonable but is not optimal. (Good first guess?)Slide 17Implications of resultSuppose that SVM fails to provide good F-measure for a given problem, for a wide range of C+ and C- values.Q: Is there another SVM formulation that would yield better F-measure?A: Our evidence suggests not. Q: Is there another SVM formulation that would find best possible F-measure more directly?A: Yes, the F-measure maximizing SVM.Slide 18Conclusions / SummaryWe provide theoretical evidence that standard heuristic practices in using SVMs for optimizing F-measure are reasonable.We provide a framework for continued research in F-measure maximizing SVMs.All our results apply directly to SVMs with kernels (see paper).Future work: attacking F-measure maximizing SVM directly to find faster


School:
Email:
New Password:
Confirm Password:

This preview shows page 1-2-3-4-5-6 out of 18 pages.

Please select your school