CS664 Lecture #15: Multi-camera stereo

Outline:
  Announcements
  Graph cuts + EM for sloped surfaces
  Multicamera stereo
  Multiview stereo
  Volumetric stereo
  Discrete formulation: Voxel coloring
  Complexity and computability
  Voxel coloring (Seitz)
  Depth ordering: occluders first!
  Panoramic Depth Ordering
  Compatible Camera Configurations
  Voxel Coloring Results
  Limitations of Depth Ordering
  Space Carving Algorithm
  Convergence
  Which shape do you get?
  Space Carving Results: Violet
  Space Carving Results: Hand
  Why use energy minimization?
  Multi-Camera Scene Reconstruction via Graph Cuts (Kolmogorov & Zabih, ECCV ’02)
  Comparison with stereo
  Key issues
  Approach
  Problem formulation
  Sample configuration
  Energy function has 3 terms: smoothness, data, visibility
  Smoothness neighborhood
  Smoothness term
  Photoconsistency constraint
  Photoconsistency neighborhood
  Data (photoconsistency) term
  Visibility constraint
  Visibility neighborhood
  Visibility term
  Tsukuba images
  Comparison

Some material taken from:
Steve Seitz, University of Washington
http://www.cs.washington.edu/homes/seitz/

Announcements
• Pick a vision paper to write a report on by next week (10/25)
  – Sources: CVPR, ICCV, ECCV, PAMI, IJCV
  – Email your choice to rdz@cs
• 1-page report should discuss the paper, list its assumptions, give some advantages and disadvantages
  – Report due on 11/15
• Next quiz: 10/25 (Tuesday)
  – Coverage through this lecture

Graph cuts + EM for sloped surfaces
[Figure: disparity 5 region “A”; disparity 5 region “B”; region “A” plane label; graph cut solution (integer labels); graph cut solution (plane labels)]

Multicamera stereo
• Obvious generalization of stereo:
  – More than two cameras
• We saw this a little in “voxel occupancy”, which we solved with a binary graph cut
• There is a lot of work on this topic, and it naturally involves other important ideas
• We’ll look at some simple, elegant methods not based on energy minimization
  – And see how energy minimization can be used to improve them!

Multiview stereo

Volumetric stereo
[Figure: scene volume V; input images (calibrated)]
Goal: Determine occupancy, “color” of points in V

Discrete formulation: Voxel coloring
[Figure: discretized scene volume; input images (calibrated)]
Goal: Assign RGBA values to voxels in V photo-consistent with images

Complexity and computability
Discretized scene volume: N^3 voxels, C colors
[Figure: nested sets: All Scenes (C^(N^3)), Photo-Consistent Scenes, True Scene]

Voxel coloring (Seitz)
1. Choose voxel
2. Project and correlate
3. Color if consistent (standard deviation of pixel colors below threshold)
Visibility Problem: in which images is each voxel visible?

Depth ordering: occluders first!
[Figure: layers; scene; traversal]
Condition: depth order is the same for all input views

Panoramic Depth Ordering
  – Cameras oriented in many different directions
  – Planar depth ordering does not apply
[Figure: layers radiate outwards from cameras]

Compatible Camera Configurations
• Outward-looking cameras inside scene
• Inward-looking cameras above scene

Voxel Coloring Results
Dinosaur Reconstruction
  – 72 K voxels colored
  – 7.6 M voxels tested
  – 7 min. to compute on a 250MHz SGI
Flower Reconstruction
  – 70 K voxels colored
  – 7.6 M voxels tested
  – 7 min. to compute on a 250MHz SGI

Limitations of Depth Ordering
• A view-independent depth order may not exist
[Figure: points p and q]
• Need more powerful general-case algorithms
  – Unconstrained camera positions
  – Unconstrained scene geometry/topology

Space Carving Algorithm
[Figure: Image 1 ... Image N]
  – Initialize to a volume V containing the true scene
  – Repeat until convergence:
    • Choose a voxel on the current surface
    • Project to visible input images
    • Carve if not photo-consistent

Convergence
• Consistency Property
  – The resulting shape is photo-consistent
    • all inconsistent points are removed
• Convergence Property
  – Carving converges to a non-empty shape
    • a point on the true scene is never removed

Which shape do you get?
• The Photo Hull is the UNION of all photo-consistent scenes in V
  – It is a photo-consistent scene reconstruction
  – Tightest possible bound on the true scene
[Figure: true scene inside V; photo hull inside V]

Space Carving Results: Violet
[Figure: input image (1 of 45); three views of the reconstruction]

Space Carving Results: Hand
[Figure: input image (1 of 100); views of reconstruction]

Why use energy minimization?
• Very similar to voxel occupancy example
  – Early hard decisions can be wrong
    • Difficult to recover from them
• No use of spatial smoothness!
  – Almost nothing in vision actually works unless it uses spatial smoothness
  – No one really understands why voxel coloring and space carving are exceptions
  – Very active area of research

Multi-Camera Scene Reconstruction via Graph Cuts
(Kolmogorov & Zabih, ECCV ’02)

Comparison with stereo
• Much harder problem than stereo
• In stereo, most scene elements are visible in both cameras
  – It is common to ignore occlusions
• Here, almost no scene elements are visible in all cameras
  – Visibility reasoning is vital

Key issues
• Visibility reasoning
• Incorporating spatial smoothness
• Computational tractability
  – Only certain energy functions can be minimized using graph cuts!
• Handle a large class of camera configurations
• Treat input images symmetrically

Approach
• Problem formulation
  – Discrete labels, not voxels
  – Carefully constructed energy function
• Minimizing the energy via graph cuts
  – Local minimum in a strong sense
  – Use the regularity construction
• Experimental results
  – Strong preliminary results

Problem formulation
• Discrete set of labels corresponding to different depths
  – For example, from a single camera
• Camera pixel plus label = 3D point
• Goal: find the best configuration
  – Labeling for each pixel in each camera
  – Minimize an energy function
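
The "camera pixel plus label = 3D point" idea in the problem formulation can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the construction from the Kolmogorov & Zabih paper: it assumes an ideal pinhole camera at the origin looking along +z, and it assumes labels index evenly spaced depths. The names PinholeCamera, label_to_depth, backproject, and project, and the parameters z_near and step, are all hypothetical.

```python
from typing import NamedTuple, Tuple

Point3D = Tuple[float, float, float]
Pixel = Tuple[float, float]


class PinholeCamera(NamedTuple):
    """Hypothetical ideal pinhole camera at the origin, looking along +z."""
    focal: float  # focal length, in pixels
    cx: float     # principal point, x coordinate (pixels)
    cy: float     # principal point, y coordinate (pixels)


def label_to_depth(label: int, z_near: float, step: float) -> float:
    """Map a discrete label to a depth (assumption: evenly spaced depths)."""
    return z_near + label * step


def backproject(cam: PinholeCamera, u: float, v: float, depth: float) -> Point3D:
    """Camera pixel (u, v) plus a depth picks one 3D point on the pixel's ray."""
    x = (u - cam.cx) * depth / cam.focal
    y = (v - cam.cy) * depth / cam.focal
    return (x, y, depth)


def project(cam: PinholeCamera, point: Point3D) -> Pixel:
    """Project a 3D point with z > 0 back into the camera image."""
    x, y, z = point
    return (cam.focal * x / z + cam.cx, cam.focal * y / z + cam.cy)


# Round trip: pixel + label -> 3D point -> same pixel.
cam = PinholeCamera(focal=100.0, cx=50.0, cy=50.0)
depth = label_to_depth(label=2, z_near=1.0, step=0.5)
point = backproject(cam, u=150.0, v=50.0, depth=depth)
pixel = project(cam, point)
```

In a multi-camera setting, the same 3D point would be projected into the other calibrated cameras (each with its own pose, which this sketch leaves out) so that photo-consistency and visibility can be checked there.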