
6.891 Computer Vision and Applications
Prof. Trevor Darrell

1. Lecture 14:
– Unsupervised Category Learning
– Gestalt Principles
– Segmentation by Clustering
  • K-Means
  • Graph cuts
– Segmentation by Fitting
  • Hough transform
  • Fitting
Readings: F&P Ch. 14, 15.1-15.2

2. (Un)Supervised Learning
• Methods in last two lectures presume:
  – Segmentation
  – Labeling
  – Alignment
• What can we do with unsupervised (weakly supervised) data?
• Clustering / Generative Model Approach…

3. Representation
Use a scale invariant, scale sensing feature keypoint detector (like the first steps of Lowe’s SIFT).
From: Rob Fergus http://www.robots.ox.ac.uk/%7Efergus/
[Slide from Bradski & Thrun, Stanford]

4. Features for Category Learning
A direct appearance model is taken around each located key. This is then normalized by its detected scale to an 11x11 window. PCA further reduces these features.
From: Rob Fergus http://www.robots.ox.ac.uk/%7Efergus/
[Slide from Bradski & Thrun, Stanford]

5. [Figure only]
From: Rob Fergus http://www.robots.ox.ac.uk/%7Efergus/
[Slide from Bradski & Thrun, Stanford]

6. Learning
• Fit with E-M (this example is a 3 part model)
• We start with the dual problem of what to fit and where to fit it.
Assume that an object instance is the only consistent thing somewhere in a scene. We don’t know where to start, so we use the initial random parameters.
1. (E) We find the best (consistent across images) assignment given the params.
2. (M) We refit the feature detector params. and repeat until converged.
• Note that there isn’t much consistency
3. This repeats until it converges at the most consistent assignment with maximized parameters across images.
From: Rob Fergus http://www.robots.ox.ac.uk/%7Efergus/
[Slide from Bradski & Thrun, Stanford]

7. Data
Slide from Li Fei-Fei http://www.vision.caltech.edu/feifeili/Resume.htm
[Slide from Bradski & Thrun, Stanford]

8. Learned Model
The shape model. The mean location is indicated by the cross, with the ellipse showing the uncertainty in location. The number by each part is the probability of that part being present.
From: Rob Fergus http://www.robots.ox.ac.uk/%7Efergus/

9. Recognition
From: Rob Fergus http://www.robots.ox.ac.uk/%7Efergus/

10. Result: Unsupervised Learning
Slide from Li Fei-Fei http://www.vision.caltech.edu/feifeili/Resume.htm
[Slide from Bradski & Thrun, Stanford]

11. [Figure only]
From: Rob Fergus http://www.robots.ox.ac.uk/%7Efergus/

12. Segmentation and Line Fitting
• Gestalt grouping
• Background subtraction
• K-Means
• Graph cuts
• Hough transform
• Iterative fitting
(Next time: Probabilistic segmentation)

13. Segmentation and Grouping
• Motivation: vision is often simple inference, except for segmentation
• Obtain a compact representation from an image/motion sequence/set of tokens
• Should support application
• Broad theory is absent at present
• Grouping (or clustering)
  – collect together tokens that “belong together”
• Fitting
  – associate a model with tokens
  – issues
    • which model?
    • which token goes to which element?
    • how many elements in the model?

14. General ideas
• Tokens
  – whatever we need to group (pixels, points, surface elements, etc.)
• Top down segmentation
  – tokens belong together because they lie on the same object
• Bottom up segmentation
  – tokens belong together because they are locally coherent
• These two are not mutually exclusive

15. Why do these tokens belong together?

16. What is the figure?

17. Basic ideas of grouping in humans
• Figure-ground discrimination
  – grouping can be seen in terms of allocating some elements to a figure, some to ground
  – impoverished theory
• Gestalt properties
  – A series of factors affect whether elements should be grouped together

18-22. [Figures only]

23. Occlusion is an important cue in grouping.

24. Consequence: Groupings by Invisible Completions
* Images from Steve Lehar’s Gestalt papers: http://cns-alumni.bu.edu/pub/slehar/Lehar.html

25. And the famous…

26. And the famous invisible dog eating under a tree:

27. Technique: Background Subtraction
• If we know what the background looks like, it is easy to identify “interesting bits”
• Applications
  – Person in an office
  – Tracking cars on a road
  – surveillance
• Approach:
  – use a moving average to estimate background image
  – subtract from current frame
  – large absolute values are interesting pixels
    • trick: use morphological operations to clean up pixels

28. [Figure only]

29. [Figure, 80x60; panels: low thresh, high thresh, EM (later)]

30. [Figure, 160x120; panels: low thresh, high thresh, EM (later)]

31-33. Static Background Modeling Examples
[MIT Media Lab Pfinder / ALIVE System]

34. Dynamic Background
BG pixel distribution is non-stationary:
[MIT AI Lab VSAM]

35. Mixture of Gaussian BG model
Stauffer and Grimson tracker: fit per-pixel mixture model to observed distribution.
[MIT AI Lab VSAM]

36. Background Subtraction Principles
Wallflower: Principles and Practice of Background Maintenance, by Kentaro Toyama, John Krumm, Barry Brumitt, Brian Meyers.
P1: P2: P3: P4: P5:

37. Background Techniques Compared
From the Wallflower paper

38. Segmentation as clustering
• Cluster together (pixels, tokens, etc.) that belong together…
• Agglomerative clustering
  – attach each token to the cluster it is closest to
  – repeat
• Divisive clustering
  – split cluster along best boundary
  – repeat
• Dendrograms
  – yield a picture of output as clustering process continues

39. Clustering Algorithms

40. [Figure only]

41. K-Means
• Choose a fixed number of clusters
• Choose cluster centers and point-cluster allocations to minimize error
  $$\sum_{i \in \text{clusters}} \Big\{ \sum_{j \in \text{elements of } i\text{'th cluster}} \lVert x_j - \mu_i \rVert^2 \Big\}$$
• can’t do this by search, because there are too many possible allocations.
• Algorithm
  – fix cluster centers; allocate points to closest cluster
  – fix allocation; compute best cluster centers
• x could be any set of features for which we can compute a distance (careful about scaling)

42. K-Means

43. K-means clustering using intensity alone and color alone
[Figure panels: Image; Clusters on intensity (K=5); Clusters on color (K=5)]

44. K-means using color alone, 11 segments
[Figure panels: Image; Clusters on color]

45. K-means using color alone, 11 segments.
Color alone often will not yield salient segments!

46. K-means using colour and position, 20 segments
Still misses goal of perceptually pleasing segmentation!
Hard to pick K…

47. Mean Shift Segmentation
http://www.caip.rutgers.edu/~comanici/MSPAMI/msPamiResults.html

48. Mean Shift Algorithm
1. Choose a search window
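The two alternating K-Means steps listed earlier (fix cluster centers and allocate points; fix the allocation and recompute centers) can be sketched in a few lines. This is a minimal illustration, not the lecture's implementation: the function names `sq_dist` and `kmeans`, the random choice of initial centers from the data, and the stopping rule (stop when no allocation changes) are all assumptions made for the example.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=100, seed=0):
    """Alternate allocation and center updates (hypothetical helper).

    Returns (centers, labels, error), where error is the sum over clusters
    of squared distances from each member to its cluster center.
    Assumes iters >= 1 and k <= len(points).
    """
    rng = random.Random(seed)
    # Assumption: initialize centers with k distinct data points.
    centers = [tuple(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iters):
        # Step 1: fix cluster centers; allocate each point to the closest one.
        new_labels = [
            min(range(k), key=lambda i: sq_dist(p, centers[i])) for p in points
        ]
        if new_labels == labels:
            break  # allocations stopped changing
        labels = new_labels
        # Step 2: fix allocation; the best center is the mean of the members.
        for i in range(k):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:  # keep the old center if a cluster is empty
                centers[i] = tuple(sum(c) / len(members) for c in zip(*members))
    error = sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))
    return centers, labels, error
```

Calling `kmeans(pixels, 5)` on a list of per-pixel feature tuples (intensity, color, or color plus position) would give one label per pixel. As the slides caution, the features should be scaled sensibly before distances are compared, and the result depends on the choice of K.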
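The background subtraction approach described earlier (estimate the background with a moving average, subtract it from the current frame, and flag large absolute differences) can also be sketched briefly. This is an illustration under stated assumptions: frames are grayscale images stored as lists of rows, the moving average is taken to be an exponential one with a hypothetical blending weight `alpha`, and the function names and default threshold are invented for the example. The morphological cleanup step is left out.

```python
def update_background(background, frame, alpha=0.05):
    """Blend the current frame into the background estimate.

    Assumption: an exponential moving average, where alpha controls how
    quickly the background adapts to new frames.
    """
    return [
        [(1 - alpha) * b + alpha * f for b, f in zip(bg_row, fr_row)]
        for bg_row, fr_row in zip(background, frame)
    ]


def foreground_mask(background, frame, threshold=30.0):
    """Mark pixels whose absolute difference from the background is large."""
    return [
        [abs(f - b) > threshold for b, f in zip(bg_row, fr_row)]
        for bg_row, fr_row in zip(background, frame)
    ]
```

In a loop over video frames one would typically compute `foreground_mask` first and then call `update_background`, so that the current frame is compared against a background estimated from earlier frames. A low threshold flags more pixels (including noise) and a high threshold flags fewer, which is the trade-off the low thresh / high thresh panels illustrate.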



MIT 6.891 - Computer Vision and Applications
