
Computer Vision: Summary and Anti-summary
Computer Vision, CS 543 / ECE 549, University of Illinois
Derek Hoiem
05/04/2010

Today’s class
• Administrative stuff
  – HW 4 due
  – Posters next Tues + reports due
  – Feedback and Evaluation (end of class)
• Review of important concepts
• Anti-summary: important concepts that were not covered
• Wide open problems

Review
• Geometry
• Matching
• Grouping
• Categorization

Geometry (on board)
• Projection matrix P = K [R t] relates 2D point x and 3D point X in homogeneous coordinates: x = P X
• Parallel lines in 3D converge at the vanishing point in the image
  – A 3D plane has a vanishing line in the image
• In two views, points that correspond to the same 3D point are related by the fundamental matrix: x'^T F x = 0

Matching
• Does this patch match that patch?
  – In two simultaneous views? (stereo)
  – In two successive frames? (tracking, flow, SFM)
  – In two pictures of the same object? (recognition)

Matching
Representation: be invariant/robust to expected deformations but nothing else
• Change in viewpoint
  – Rotation invariance: rotate and/or affine warp patch according to dominant orientations
• Change in lighting or camera gain
  – Average intensity invariance: oriented gradient-based matching
  – Contrast invariance: normalize gradients by magnitude
• Small translations
  – Translation robustness: histograms over small regions
But can one representation do all of this?
• SIFT: local normalized histograms of oriented gradients provide robustness to in-plane orientation, lighting, contrast, translation
• HOG: like SIFT but does not rotate to dominant orientation

Matching
Search: efficiently localize matching patches
• Interest points: find repeatable, distinctive points
  – Long-range matching: e.g., wide baseline stereo, panoramas, object instance recognition
  – Harris: points with strong gradients in orthogonal directions (e.g., corners) are precisely repeatable in x-y
  – Difference of Gaussian: points with peak response in Laplacian image pyramid are somewhat repeatable in x-y-scale
• Local search
  – Short-range matching: e.g., tracking, optical flow
  – Gradient descent on patch SSD, often with image pyramid
• Windowed search
  – Long-range matching: e.g., recognition, stereo w/ scanline

Matching
Registration: match sets of points that satisfy deformation constraints
• Geometric transformation (e.g., affine)
  – Least squares fit (SVD), if all matches can be trusted
  – Hough transform: each potential match votes for a range of parameters
    • Works well if there are very few parameters (3-4)
  – RANSAC: repeatedly sample potential matches, compute parameters, and check for inliers
    • Works well if fraction of inliers is high and few parameters (4-8)
• Other cases
  – One-to-one correspondence (Hungarian algorithm)
  – Small local deformation of ordered points
[Figure: two point sets labeled A1, A2, A3 and B1, B2, B3]

Grouping
• Clustering: group items (patches, pixels, lines, etc.) that have similar appearance
  – Discretize continuous values; typically, represent points within cluster by center
  – Improve efficiency: e.g., cluster interest points before recognition
  – Enable counting: histograms of interest points, color, texture
• Segmentation: group pixels into regions of coherent color, texture, motion, and/or label
  – Mean-shift clustering
  – Watershed
  – Graph-based segmentation: e.g., MRF and graph cuts
• EM, mixture models: probabilistically group items that are likely to be drawn from the same distribution, while estimating the distributions’ parameters

Categorization
Match objects, parts, or scenes that may vary in appearance
• Categories are typically defined by humans and may be related by function, cost, or other non-visual attributes
• Naïve matching or clustering approach will usually fail
  – Elements within category often do not have obvious visual similarities
  – Possible deformations are not easily defined
• Typically involves example-based machine learning approach
[Figure labels: Training Labels; Training Images; Classifier Training; Image Features; Trained Classifier]

Categorization
Representation: ideally should be compact, comprehensive, direct
• Histograms of quantized interest points (SIFT, HOG), color, texture
  – Typical for image or region categorization
  – Degree of spatial invariance is controllable by using spatial pyramids
• HOG features at specified position
  – Often used for finding parts or objects

Object Categorization
Sliding Window Detector
• May work well for rigid objects
• Combines window-based matching search with feature-based classifier
[Figure caption: Object or Background?]

Object Categorization
Parts-based model
• Defined by models of part appearance, geometry or spatial layout, and search algorithm
• May work better for articulated objects

Vision as part of an intelligent system
[Figure labels: 3D Scene; Feature Extraction; Interpretation; Action; Texture; Color; Optical Flow; Stereo Disparity; Grouping; Surfaces; Bits of objects; Sense of depth; Objects; Agents and goals; Shapes and properties; Open paths; Words; Walk, touch, contemplate, smile, evade, read on, pick up, …; Motion patterns]

Anti-summary
• Summary of things not covered
Term “anti-summary” from Murphy’s book “Big Book of Concepts”

Context
• Biederman’s Relations among Objects in a Well-Formed Scene (1981):
  – Support
  – Size
  – Position
  – Interposition
  – Likelihood of Appearance
Hock, Romanski, Galie, & Williams 1978

Machine learning
• Probabilistic approaches
  – Logistic regression
  – Bayesian network (special case: tree-shaped graph)
• Support vector machines
• Energy-based approaches
  – Conditional random fields (special case: lattice)
  – Graph cuts and belief propagation (BP-TRW)
Further reading:
• Learning from Data: Concepts, Theory, and Methods by Cherkassky and Mulier (2007): I have not read this, but reviews say it is good for SVM and statistical learning
• Machine Learning by Tom Mitchell (1997): Good but somewhat outdated introduction to learning
• Heckerman’s tutorial on learning with Bayesian Networks (1995)

3D Object Models
• Aspect graphs (see Forsyth and Ponce)
• Model 3D spatial configuration of parts
  – E.g., see recent work by Silvio Savarese and
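
Worked example for the Geometry review: the relation x = P X with P = K [R t] can be sketched in a few lines of Python. The intrinsics K, the identity rotation, the zero translation, and the helper names `project` and `vanishing_point` are illustrative assumptions, not values from the lecture. The vanishing point of a 3D direction d is computed here as K R d, which is the limit of the projection as a point moves infinitely far along d.

```python
def mat_vec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]


def project(K, R, t, X):
    """Project 3D point X to pixel coordinates using x = K [R t] X."""
    # Camera coordinates: Xc = R X + t
    Xc = [r + ti for r, ti in zip(mat_vec(R, X), t)]
    # Homogeneous image point, then divide by the third coordinate
    xh = mat_vec(K, Xc)
    return (xh[0] / xh[2], xh[1] / xh[2])


def vanishing_point(K, R, d):
    """Image of the point at infinity in direction d: v = K R d.

    Fails with a division by zero if d is parallel to the image plane,
    in which case the vanishing point is itself at infinity.
    """
    vh = mat_vec(K, mat_vec(R, d))
    return (vh[0] / vh[2], vh[1] / vh[2])


# Assumed camera: focal length 500 px, principal point (320, 240),
# no rotation, camera at the world origin.
K = [[500.0, 0.0, 320.0],
     [0.0, 500.0, 240.0],
     [0.0, 0.0, 1.0]]
R = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0]]
t = [0.0, 0.0, 0.0]

d = (1.0, 0.0, 1.0)                      # shared direction of two parallel lines
vp = vanishing_point(K, R, d)            # (820.0, 240.0)

# Points far along two different parallel lines project near the same pixel.
far_a = project(K, R, t, (-1.0 + 1e6, 1.0, 4.0 + 1e6))
far_b = project(K, R, t, (2.0 + 1e6, -1.0, 4.0 + 1e6))
```

Both `far_a` and `far_b` land within a fraction of a pixel of `vp`, which is the convergence of parallel lines described in the Geometry bullets.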
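
Worked example for the Matching (registration) review: the RANSAC loop of "sample potential matches, compute parameters, check for inliers" might look like the sketch below. To keep it short it assumes the simplest possible model, a pure 2D translation estimated from a single sampled match, rather than the 4-8 parameter transformations mentioned in the slides. The function name `ransac_translation`, the tolerance, and the synthetic matches are all hypothetical.

```python
import math
import random


def ransac_translation(matches, n_iters=100, tol=1.0, seed=0):
    """Estimate a 2D translation from noisy matches ((ax, ay), (bx, by)).

    Simplifying assumption: the deformation is a pure translation,
    so one match is enough to hypothesize the parameters.
    """
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(n_iters):
        # 1. Sample a minimal set of potential matches (one match here)
        (ax, ay), (bx, by) = rng.choice(matches)
        # 2. Compute the model parameters from the sample
        tx, ty = bx - ax, by - ay
        # 3. Check which matches agree with the hypothesis
        inliers = [
            (a, b) for a, b in matches
            if math.hypot(a[0] + tx - b[0], a[1] + ty - b[1]) <= tol
        ]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    if not best_inliers:
        return None, []
    # Least-squares refit on the inliers: for a translation this is the
    # mean displacement.
    n = len(best_inliers)
    tx = sum(b[0] - a[0] for a, b in best_inliers) / n
    ty = sum(b[1] - a[1] for a, b in best_inliers) / n
    return (tx, ty), best_inliers


# Synthetic data: 8 matches that follow translation (5, -3), plus 3 outliers.
matches = [((float(i), 2.0 * i), (i + 5.0, 2.0 * i - 3.0)) for i in range(8)]
matches += [((0.0, 0.0), (40.0, 40.0)),
            ((1.0, 2.0), (-20.0, 7.0)),
            ((3.0, 3.0), (3.0, 90.0))]

translation, inliers = ransac_translation(matches)
```

The final refit illustrates why the slides pair RANSAC with least squares: once the untrustworthy matches are rejected, a plain least-squares fit on the remaining inliers is appropriate.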
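
Worked example for the Grouping review: clustering that "represents points within a cluster by center" and then "enables counting" can be sketched with a small k-means routine followed by a histogram of cluster labels. The slides do not name a specific clustering algorithm in this bullet, so k-means is an assumed choice here, and the function names, initial centers, and sample points are hypothetical.

```python
from collections import Counter


def assign(points, centers):
    """Label each point with the index of its nearest center."""
    def sq_dist(p, c):
        return sum((pi - ci) ** 2 for pi, ci in zip(p, c))
    return [min(range(len(centers)), key=lambda k: sq_dist(p, centers[k]))
            for p in points]


def kmeans(points, init_centers, n_iters=20):
    """Plain k-means with caller-supplied initial centers."""
    centers = [tuple(c) for c in init_centers]
    for _ in range(n_iters):
        labels = assign(points, centers)
        new_centers = []
        for k in range(len(centers)):
            members = [p for p, l in zip(points, labels) if l == k]
            if members:
                # New center is the mean of the assigned points
                new_centers.append(
                    tuple(sum(v) / len(members) for v in zip(*members)))
            else:
                # Keep an empty cluster's center unchanged
                new_centers.append(centers[k])
        if new_centers == centers:
            break
        centers = new_centers
    # Final labels are computed against the final centers
    return centers, assign(points, centers)


points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centers, labels = kmeans(points, init_centers=[(0.0, 0.0), (10.0, 10.0)])

# Counting: a histogram over cluster indices, as in bag-of-features models.
histogram = Counter(labels)
```

Each point is now summarized by a discrete cluster index, so `histogram` is the kind of count-based representation the Categorization slides build on.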

