Berkeley COMPSCI 294 - Learning to Localize Objects with Structured Output Regression


Learning to Localize Objects with Structured Output Regression
Matthew B. Blaschko and Christoph H. Lampert
Max Planck Institute for Biological Cybernetics, Tübingen, Germany
October 13th, 2008

Object (Category) Recognition

Is there a cow in this picture?
Binary classification problem: classifiers do the job.

Object (Category) Localization

Where in the picture is the cow?
Possible answers: center point, segmentation, outline, bounding box.
Many possible answers, none is binary.
How can we build a trainable system to localize objects?

From Classification to Localization: Sliding Window

Most common approach: binary training + sliding window.
Train a binary classifier f : {all images} → R.
- training images: positive = object class images, negative = background regions
- train a classifier, e.g. a support vector machine or a Viola-Jones cascade
- decision function f : {images} → R
  - f > 0 means "image shows the object."
  - f < 0 means "image does not show the object."
Use f in a sliding-window procedure:
- apply f to many subimages
- the subimage with the largest score is the object location

Problems of Sliding Window Localization

Binary training + sliding window is inconsistent:
1) We learn for the wrong task:
- Training: only the sign of f matters.
- Testing: f is used as a real-valued quality measure.
2) The training samples do not reflect the test situation:
- Training: samples show either the complete object or no object.
- Testing: many subregions show object parts.

Learning to Localize Objects

Ideal setup: learn a true localization function
  g : {all images} → {all boxes}
that predicts the object location from images.
- Train in a consistent, end-to-end way.
- The training distribution reflects the test distribution.

Object Localization as Structured Output Regression

Regression task:
- training pairs (x_1, y_1), ..., (x_n, y_n) ∈ X × Y, where the x_i are images and the y_i are bounding boxes
- learn a mapping g : X → Y that generalizes from the given examples: g(x_i) ≈ y_i for i = 1, ..., n
- prefer smooth mappings to avoid overfitting
This regression is not R → R; input and output are structured spaces:
- inputs are 2D images
- outputs are 4-tuples y = [left, top, right, bottom] ∈ R⁴ that must be predicted jointly

Alternatives: Predicting with Structured Outputs

Independent training? Learn independent functions g_left, g_top, g_right, g_bottom : X → R. Unable to integrate constraints and correlations.
Nearest neighbor? Store all examples (x_i, y_i) as prototypes; for a new image x, predict the box y_i with i = argmin_{i=1,...,n} dist(x, x_i). No invariance, e.g. against translation, and requires a lot of training data.
Probabilistic modeling? Build a model of p(x, y) or p(y | x), e.g. a Gaussian mixture. Difficult to integrate invariances, e.g. against scale changes, and requires a lot of training data to cover the 4D output space.

Background: Structured Output Support Vector Machine

The structured output SVM [Tsochantaridis 2005] generalizes SVMs to arbitrary output domains.
Goal: a prediction function g : X → Y.
- Learn a compatibility function F : X × Y → R and define g(x) := argmax_{y ∈ Y} F(x, y).
- g(x) is learned discriminatively.
- Non-linearity and domain knowledge are included by kernelizing F.
I. Tsochantaridis, T. Joachims, T. Hofmann, Y. Altun: Large Margin Methods for Structured and Interdependent Output Variables, JMLR, 2005.

Structured Output Support Vector Machine

Setup:
- Define a positive definite kernel k : (X × Y) × (X × Y) → R. k(·, ·) induces a reproducing kernel Hilbert space H with an implicit feature map φ : X × Y → H.
- Define a loss function ∆ : Y × Y → R.
SO-SVM training: solve the convex optimization problem
  min_{w,ξ} (1/2)‖w‖² + C Σ_{i=1}^{n} ξ_i
subject to margin constraints with loss function ∆:
  ∆(y_i, y) + ⟨w, φ(x_i, y)⟩ − ⟨w, φ(x_i, y_i)⟩ ≤ ξ_i
for all y ∈ Y \ {y_i} and i = 1, ..., n.

The unique solution w ∈ H defines the compatibility function
  F(x, y) := ⟨w, φ(x, y)⟩,
linear in w ∈ H but nonlinear in X and Y.
- F(x, y) measures how well the output y fits the input x.
- Analogue in a probabilistic model: F(x, y) ≈ log p(y | x); but F(x, y) is max-margin trained, not probabilistic.
The best prediction for a new x is the most compatible y: g(x) := argmax_{y ∈ Y} F(x, y).
Evaluating g : X → Y is like a generalized sliding window: for fixed x, we evaluate a quality function for every box y ∈ Y.
- approximate: sliding window
- exact: branch-and-bound [Lampert & Blaschko, CVPR 2008]
- other parameterizations: min-cut, (loopy) BP, ...

SO-SVM Training: Interpretation

Hard-margin case: solve the convex optimization problem
  min_w (1/2)‖w‖²
subject to margin constraints with loss function ∆ ≥ 0:
  ⟨w, φ(x_i, y_i)⟩ − ⟨w, φ(x_i, y)⟩ ≥ ∆(y_i, y)
for all y ∈ Y \ {y_i} and i = 1, ..., n. (The left-hand side is F(x_i, y_i) − F(x_i, y).)
This implies F(x_i, y_i) > F(x_i, y) for all y ∈ Y \ {y_i}. Because g(x) := argmax_y F(x, y), it follows that g(x_i) = y_i for all training pairs (x_i, y_i).

General case: solve the convex optimization problem
  min_{w,ξ} (1/2)‖w‖² + C Σ_{i=1}^{n} ξ_i
subject to margin constraints with loss function ∆:
  ∆(y_i, y) + ⟨w, φ(x_i, y)⟩ − ⟨w, φ(x_i, y_i)⟩ ≤ ξ_i
for all y ∈ Y \ {y_i} and i = 1, ..., n.
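The binary-training-plus-sliding-window baseline described above can be sketched in a few lines. This is a minimal illustration, not the deck's implementation: the `mean` scorer stands in for a trained classifier such as an SVM, the tiny nested-list image and the fixed 2x2 window size are assumptions for the example.

```python
def sliding_window_localize(image, f, window=(2, 2), stride=1):
    """Scan all windows of a fixed size over a 2D grid of pixel values,
    score each with the classifier f, and return the highest-scoring
    box as (left, top, right, bottom) together with its score."""
    h, w = len(image), len(image[0])
    wh, ww = window
    best_score, best_box = float("-inf"), None
    for top in range(0, h - wh + 1, stride):
        for left in range(0, w - ww + 1, stride):
            sub = [row[left:left + ww] for row in image[top:top + wh]]
            score = f(sub)
            if score > best_score:
                best_score, best_box = score, (left, top, left + ww, top + wh)
    return best_box, best_score

# Toy stand-in for a trained classifier: mean intensity of the window.
mean = lambda sub: sum(sum(row) for row in sub) / (len(sub) * len(sub[0]))
image = [[0, 0, 0, 0],
         [0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 0, 0]]   # bright 2x2 "object" at columns 2-3, rows 1-2
box, score = sliding_window_localize(image, mean)
# box is (2, 1, 4, 3): the window covering the bright region
```

Note how the real-valued score, not just its sign, decides the location; this is exactly the training/testing mismatch the slides point out.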
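The prediction rule g(x) = argmax_y F(x, y) with F(x, y) = ⟨w, φ(x, y)⟩ can be made concrete with a deliberately simple joint feature map. The two-bin dark/bright histogram of the region inside the box is an assumption for illustration only, standing in for real image features restricted to the candidate region; the exhaustive loop over all boxes plays the role that sliding windows or branch-and-bound play in practice.

```python
from itertools import product

def joint_feature(image, box):
    """Illustrative joint feature map phi(x, y): the fraction of dark
    and of bright pixels inside box y = (left, top, right, bottom)."""
    l, t, r, b = box
    pixels = [v for row in image[t:b] for v in row[l:r]]
    bright = sum(1 for v in pixels if v >= 0.5) / len(pixels)
    return (1.0 - bright, bright)

def F(w, image, box):
    """Compatibility function F(x, y) = <w, phi(x, y)>."""
    return sum(wi * pi for wi, pi in zip(w, joint_feature(image, box)))

def predict_box(w, image, candidates):
    """g(x) = argmax over candidate boxes y of F(x, y)."""
    return max(candidates, key=lambda y: F(w, image, y))

image = [[0, 0, 0, 0],
         [0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 0, 0]]   # bright "object" at columns 2-3, rows 1-2
# All axis-aligned boxes with at least one pixel inside a 4x4 image.
candidates = [(l, t, r, b)
              for l, r in product(range(5), repeat=2) if l < r
              for t, b in product(range(5), repeat=2) if t < b]
w = (-1.0, 1.0)   # reward bright pixels, penalize dark ones
best = predict_box(w, image, candidates)
# best lies entirely inside the bright region, so F(w, image, best) == 1.0
```

In the actual method, w is not hand-set but learned by the SO-SVM training problem from the slides.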
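The margin constraints above say that for each training pair the slack ξ_i must cover the largest violation of ∆(y_i, y) + F(x_i, y) − F(x_i, y_i) ≤ ξ_i over all wrong boxes y. A sketch of that per-example slack computation follows; the preview only states that ∆ : Y × Y → R, so using 1 − IoU as the loss, and the hand-picked compatibility function `F_x`, are assumptions for the example.

```python
def area(box):
    l, t, r, b = box
    return max(0, r - l) * max(0, b - t)

def iou(y, z):
    """Intersection-over-union (area overlap) of two (l, t, r, b) boxes."""
    inter = area((max(y[0], z[0]), max(y[1], z[1]),
                  min(y[2], z[2]), min(y[3], z[3])))
    union = area(y) + area(z) - inter
    return inter / union if union > 0 else 0.0

def delta(y_true, y):
    """Localization loss Delta(y_i, y): here 1 - IoU, an assumed but
    natural choice (0 for a perfect box, 1 for no overlap)."""
    return 1.0 - iou(y_true, y)

def slack(F_x, y_true, candidates):
    """Slack xi_i for one training example: the largest violation of
    Delta(y_i, y) + F(x_i, y) - F(x_i, y_i) <= xi_i over y != y_i,
    clamped at zero when all constraints already hold."""
    violation = max(delta(y_true, y) + F_x(y) - F_x(y_true)
                    for y in candidates if y != y_true)
    return max(0.0, violation)

y_true = (2, 1, 4, 3)
# Stand-in compatibility function that already favors the true box.
F_x = lambda y: 2.0 * iou(y_true, y)
xi = slack(F_x, y_true, [(2, 1, 4, 3), (0, 0, 2, 2), (2, 1, 3, 2)])
# xi == 0.0: every wrong box is beaten by at least its loss-scaled margin
```

This is also the quantity a cutting-plane solver inspects: the y attaining the maximum is the "most violated constraint" added during training.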

