Unformatted text preview:

Learning Structured Prediction Models: A Large Margin ApproachSlide 2Handwriting recognitionObject segmentationNatural language parsingDisulfide connectivity predictionOutlineStructured modelsChain Markov Net (aka CRF*)Slide 10Associative Markov NetsPCFGDisulfide bonds: non-bipartite matchingScoring functionSlide 15MAP inference  linear programMarkov Net Inference LPAssociative MN Inference LPOther Inference LPsSlide 20Learning wGeometric ExampleOCR ExampleLarge margin estimationSlide 25Min-max formulationSlide 27Slide 283D MappingSlide 30Slide 31Slide 32Slide 33Segmentation resultsFly-throughCertificate formulationCertificate for non-bipartite matchingSlide 38Slide 39Slide 40Known bonded stateUnknown bonded stateFormulation summaryEstimationOmissionsCurrent WorkConclusionSlide 48Duals and KernelsHandwriting RecognitionHypertext ClassificationProjected GradientMin-Cost Flow for Markov ChainsMin-Cost Flow for Bipartite MatchingsCFG ChartCFG Inference LPLearning Structured Prediction Models:A Large Margin Approach Ben TaskarU.C. BerkeleyVassil Chatalbashev Michael Collins Carlos Guestrin Dan Klein Daphne Koller Chris Manning“Don’t worry, Howard. The big questions are multiple choice.”Handwriting recognitionbraceSequential structurex yObject segmentationSpatial structurex yNatural language parsingThe screen was a sea of redRecursive structurex yDisulfide connectivity predictionRSCCPCYWGGCPWGQNCYPEGCSGPKVCombinatorial structurex yOutlineStructured prediction modelsSequences (CRFs)Trees (CFGs)Associative Markov networks (Special MRFs)Matchings Geometric ViewStructured model polytopes Linear programming inferenceStructured large margin estimationMin-max formulationApplication: 3D object segmentationCertificate formulationApplication: disulfide connectivity predictionStructured modelsMild assumption: linear combinationspace of feasible outputsscoring functionChain Markov Net (aka CRF*)a-za-za-za-za-zyx(xi,yi) = exp{ wf(xi,yi)} (yi,yi+1) = exp{ wf (yi,yi+1)} P(y|x)  i (xi,yi) i (yi,yi+1)f(y,y’) = I(y=‘z’,y’=‘a’)*Lafferty et al. 01f(x,y) = I(xp=1, y=‘z’)Chain Markov Net (aka CRF*)a-za-za-za-za-zyxi (xi,yi) = exp{ w i f(xi,yi)} i (yi,yi+1) = exp{ w i f (yi,yi+1)} P(y|x) i (xi,yi) i (yi,yi+1) *Lafferty et al. 01 = exp{wTf(x,y)}f(x,y) = #(y=‘z’,y’=‘a’)f(x,y) = #(xp=1, y=‘z’)w = [… , w , … , w, …]f(x,y) = [… , f(x,y) , … , f(x,y) , …]Associative Markov NetsPoint featuresspin-images, point heightEdge featureslength of edge, edge orientation yiyjiji“associative” restrictionPCFG#(NP  DT NN)…#(PP  IN NP)…#(NN  ‘sea’)Disulfide bonds: non-bipartite matching12 346 5RSCCPCYWGGCPWGQNCYPEGCSGPKV 1 2 3 4 5 6 612453Fariselli & Casadio `01, Baldi et al. ‘04Scoring functionRSCCPCYWGGCPWGQNCYPEGCSGPKV 1 2 3 4 5 6 RSCCPCYWGGCPWGQNCYPEGCSGPKV 1 2 3 4 5 6 12 346 5String features: residues, physical propertiesStructured modelsMild assumption: Another mild assumption:  linear programmingspace of feasible outputsscoring functionMAP inference  linear programLP inference forChainsTreesAssociative Markov NetsBipartite Matchings…Markov Net Inference LP0 0 0 00 0 0 00 1 0 00 0 0 000100 1 0 0Has integral solutions y for chains, treesGives upper bound for general networksAssociative MN Inference LPFor K=2, solutions are always integral (optimal)For K>2, within factor of 2 of optimal Constraint matrix A is linear in number of nodes and edges, regardless of the tree-width“associative” restrictionOther Inference LPsContext-free parsingDynamic programsBipartite matchingNetwork flow Many other combinatorial problemsOutlineStructured prediction modelsSequences (CRFs)Trees (CFGs)Associative Markov networks (Special MRFs)Matchings Geometric ViewStructured model polytopes Linear programming inferenceStructured large margin estimationMin-max formulationApplication: 3D object segmentationCertificate formulationApplication: disulfide connectivity predictionLearning wTraining example (x, y*) Probabilistic approach:Maximize conditional likelihoodProblem: computing Zw(x) is #P-completeGeometric ExampleTraining data:Goal: Learn w s.t. wTf( , y*) points the “right” wayOCR ExampleWe want:argmaxword wT f( ,word) = “brace”Equivalently:wT f( ,“brace”) > wT f( ,“aaaaa”)wT f( ,“brace”) > wT f( ,“aaaab”)…wT f( ,“brace”) > wT f( ,“zzzzz”)a lot!Large margin estimationGiven training example (x, y*), we want:*Taskar et al. 03Maximize marginMistake weighted margin:# of mistakes in yLarge margin estimationBrute force enumerationMin-max formulation‘Plug-in’ linear program for inferenceMin-max formulationLP inferenceAssume linear loss (Hamming):InferenceMin-max formulationBy strong LP dualityMinimize jointly over w, zMin-max formulationFormulation produces compact QP forLow-treewidth Markov networks Associative Markov networksContext free grammarsBipartite matchingsAny problem with compact LP inference3D MappingLaser Range FinderGPSIMUData provided by: Michael Montemerlo & Sebastian ThrunLabel: ground, building, tree, shrub Training: 30 thousand points Testing: 3 million pointsSegmentation resultsHand labeled 180K test pointsModelAccuracySVM 68%V-SVM73%M3N 93%Fly-throughCertificate formulationNon-bipartite matchings:O(n3) combinatorial algorithmNo polynomial-size LP knownSpanning treesNo polynomial-size LP knownSimple certificate of optimality Intuition: Verifying optimality easier than optimizing Compact optimality condition of y* wrt. 12 346 5ijklCertificate for non-bipartite matchingAlternating cycle: Every other edge is in matchingAugmenting alternating cycle: Score of edges not in matching greater than edges in matchingNegate score of edges not in


View Full Document

UMD CMSC 828G - Learning Structured Prediction Models

Documents in this Course
Lecture 2

Lecture 2

35 pages

Load more
Download Learning Structured Prediction Models
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Learning Structured Prediction Models and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Learning Structured Prediction Models 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?