MIT 6.891 - Computer Vision and Applications



School name: Massachusetts Institute of Technology

Course: 6.891 - Advanced Topics in Theoretical Computer Science

Pages: 112


Unformatted text preview:

Slide titles:

6.891
(Un)Supervised Learning
Representation
Features for Category Learning
Learning
Data
Learned Model
Recognition
Result: Unsupervised Learning
Segmentation and Line Fitting
Segmentation and Grouping
General ideas
Basic ideas of grouping in humans
Consequence: Groupings by Invisible Completions
And the famous…
And the famous invisible dog eating under a tree:
Technique: Background Subtraction
Static Background Modeling Examples (×3)
Dynamic Background
Mixture of Gaussian BG model
Background Subtraction Principles
Background Techniques Compared
Segmentation as clustering
Clustering Algorithms
K-Means (×2)
Mean Shift Segmentation
Mean Shift Algorithm
Mean Shift Segmentation Results:
Graph-Theoretic Image Segmentation
Graphs Representations
Weighted Graphs and Their Representations
Boundaries of image regions defined by a number of attributes
Measuring Affinity
Eigenvectors and affinity clusters
Example eigenvector (×2)
Scale affects affinity (×2)
Some Terminology for Graph Partitioning
Minimum Cut
Minimum Cut and Clustering
Drawbacks of Minimum Cut
Normalized cuts
Solving the Normalized Cut problem
Normalized Cut As Generalized Eigenvalue problem
Normalized cuts
Comparison of Methods
Advantages/Disadvantages (×2)
Segmentation and Line Fitting
Fitting
Fitting and the Hough Transform
Mechanics of the Hough transform
Line fitting
Who came from which line?
Incremental line fitting (×5)
K-means line fitting (×7)
Robustness
Segmentation and Line Fitting
Visual learning is inefficient
Find the Mullets…
One-Shot Learning
Learn meta-parameters

Slide 1: 6.891 Computer Vision and Applications
Prof. Trevor Darrell
Lecture 14:
– Unsupervised Category Learning
– Gestalt Principles
– Segmentation by Clustering
  • K-Means
  • Graph cuts
– Segmentation by Fitting
  • Hough transform
  • Fitting
Readings: F&P Ch. 14, 15.1-15.2

Slide 2: (Un)Supervised Learning
• Methods in last two lectures presume:
  – Segmentation
  – Labeling
  – Alignment
• What can we do with unsupervised (weakly supervised) data?
• Clustering / Generative Model Approach…

Slide 3: Representation
Use a scale invariant, scale sensing feature keypoint detector (like the first steps of Lowe’s SIFT).
From: Rob Fergus http://www.robots.ox.ac.uk/%7Efergus/
[Slide from Bradsky & Thrun, Stanford]

Slide 4: Features for Category Learning
A direct appearance model is taken around each located key. This is then normalized by its detected scale to an 11x11 window. PCA further reduces these features.
From: Rob Fergus http://www.robots.ox.ac.uk/%7Efergus/
[Slide from Bradsky & Thrun, Stanford]

Slide 5:
From: Rob Fergus http://www.robots.ox.ac.uk/%7Efergus/
[Slide from Bradsky & Thrun, Stanford]

Slide 6: Learning
• Fit with E-M (this example is a 3 part model)
• We start with the dual problem of what to fit and where to fit it.
Assume that an object instance is the only consistent thing somewhere in a scene.
We don’t know where to start, so we use the initial random parameters.
1. (E) We find the best (consistent across images) assignment given the params.
2. (M) We refit the feature detector params. and repeat until converged.
• Note that there isn’t much consistency
3. This repeats until it converges at the most consistent assignment with maximized parameters across images.
From: Rob Fergus http://www.robots.ox.ac.uk/%7Efergus/
[Slide from Bradsky & Thrun, Stanford]

Slide 7: Data
Slide from Li Fei-Fei http://www.vision.caltech.edu/feifeili/Resume.htm
[Slide from Bradsky & Thrun, Stanford]

Slide 8: Learned Model
The shape model. The mean location is indicated by the cross, with the ellipse showing the uncertainty in location. The number by each part is the probability of that part being present.
From: Rob Fergus http://www.robots.ox.ac.uk/%7Efergus/

Slide 9: Recognition
From: Rob Fergus http://www.robots.ox.ac.uk/%7Efergus/

Slide 10: Result: Unsupervised Learning
Slide from Li Fei-Fei http://www.vision.caltech.edu/feifeili/Resume.htm
[Slide from Bradsky & Thrun, Stanford]

Slide 11:
From: Rob Fergus http://www.robots.ox.ac.uk/%7Efergus/

Slide 12: Segmentation and Line Fitting
• Gestalt grouping
• Background subtraction
• K-Means
• Graph cuts
• Hough transform
• Iterative fitting
(Next time: Probabilistic segmentation)

Slide 13: Segmentation and Grouping
• Motivation: vision is often simple inference, but for segmentation
• Obtain a compact representation from an image/motion sequence/set of tokens
• Should support application
• Broad theory is absent at present
• Grouping (or clustering)
  – collect together tokens that “belong together”
• Fitting
  – associate a model with tokens
  – issues
    • which model?
    • which token goes to which element?
    • how many elements in the model?

Slide 14: General ideas
• Tokens
  – whatever we need to group (pixels, points, surface elements, etc., etc.)
• Top down segmentation
  – tokens belong together because they lie on the same object
• Bottom up segmentation
  – tokens belong together because they are locally coherent
• These two are not mutually exclusive

Slide 15: Why do these tokens belong together?

Slide 16: What is the figure?

Slide 17: Basic ideas of grouping in humans
• Figure-ground discrimination
  – grouping can be seen in terms of allocating some elements to a figure, some to ground
  – impoverished theory
• Gestalt properties
  – A series of factors affect whether elements should be grouped together

Slides 18-22: [no text in preview]

Slide 23:
Occlusion is an important cue in grouping.

Slide 24: Consequence: Groupings by Invisible Completions
* Images from Steve Lehar’s Gestalt papers: http://cns-alumni.bu.edu/pub/slehar/Lehar.html

Slide 25: And the famous…

Slide 26: And the famous invisible dog eating under a tree:

Slide 27: Technique: Background Subtraction
• If we know what the background looks like, it is easy to identify “interesting bits”
• Applications
  – Person in an office
  – Tracking cars on a road
  – surveillance
• Approach:
  – use a moving average to estimate background image
  – subtract from current frame
  – large absolute values are interesting pixels
    • trick: use morphological operations to clean up pixels

Slide 28: [no text in preview]

Slide 29: [Figure panels: low thresh, high thresh, EM (later); 80x60]

Slide 30: [Figure panels: low thresh, high thresh, EM (later); 160x120]

Slides 31-33: Static Background Modeling Examples
[MIT Media Lab Pfinder / ALIVE System]

Slide 34: Dynamic
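
The approach listed on the Background Subtraction slide (a moving-average background estimate, subtraction from the current frame, thresholding of large absolute differences, and a morphological clean-up) can be sketched in plain Python as follows. This is only an illustrative sketch, not the lecture's implementation: the function names, the blending weight `alpha`, the `threshold` value, the choice of an exponential moving average, and the use of a 3x3 erosion as the morphological operation are all assumptions made for the example.

```python
from typing import List

Frame = List[List[float]]  # grayscale image stored as rows of intensities
Mask = List[List[bool]]


def update_background(background: Frame, frame: Frame, alpha: float) -> Frame:
    """Exponential moving average (assumed form): B <- (1 - alpha) * B + alpha * frame."""
    return [
        [(1.0 - alpha) * b + alpha * f for b, f in zip(b_row, f_row)]
        for b_row, f_row in zip(background, frame)
    ]


def foreground_mask(background: Frame, frame: Frame, threshold: float) -> Mask:
    """Flag pixels whose absolute difference from the background exceeds threshold."""
    return [
        [abs(f - b) > threshold for b, f in zip(b_row, f_row)]
        for b_row, f_row in zip(background, frame)
    ]


def erode(mask: Mask) -> Mask:
    """3x3 binary erosion: a pixel survives only if every in-bounds
    pixel of its 3x3 neighbourhood is set. Removes isolated noise pixels."""
    h, w = len(mask), len(mask[0])
    return [
        [
            all(
                mask[rr][cc]
                for rr in range(max(r - 1, 0), min(r + 2, h))
                for cc in range(max(c - 1, 0), min(c + 2, w))
            )
            for c in range(w)
        ]
        for r in range(h)
    ]


# Hypothetical 6x6 scene: flat background, a 3x3 bright object, one noise pixel.
background = [[10.0] * 6 for _ in range(6)]
current = [[10.0] * 6 for _ in range(6)]
for r in range(1, 4):
    for c in range(1, 4):
        current[r][c] = 200.0
current[5][5] = 200.0  # isolated noise pixel

raw = foreground_mask(background, current, threshold=50.0)
clean = erode(raw)  # noise pixel is dropped; the object shrinks to its core
background = update_background(background, current, alpha=0.1)
```

A small `alpha` makes the background adapt slowly, so a briefly visible object barely contaminates the estimate; erosion also shrinks genuine regions, which is why it is often paired with a dilation in practice.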
