Occlusion Boundary Detection and Figure/Ground Assignment from Optical Flow

Patrik Sundberg (1), Thomas Brox (2), Michael Maire (3), Pablo Arbeláez (1), and Jitendra Malik (1)

(1) University of California at Berkeley, {sundberg,arbelaez,malik}@eecs.berkeley.edu
(2) University of Freiburg, Germany, [email protected]
(3) California Institute of Technology, [email protected]

Abstract

In this work, we propose a contour and region detector for video data that exploits motion cues and distinguishes occlusion boundaries from internal boundaries based on optical flow. This detector outperforms the state of the art on the benchmark of Stein and Hebert [24], improving average precision from .58 to .72. Moreover, the optical flow on and near occlusion boundaries allows us to assign a depth ordering to the adjacent regions. To evaluate performance on this edge-based figure/ground labeling task, we introduce a new video dataset that we believe will support further research in the field by allowing quantitative comparison of computational models for occlusion boundary detection, depth ordering, and segmentation in video sequences.

1. Introduction

Vision systems make use of optical flow for a number of purposes, such as egomotion estimation and scene structure recovery, the latter including both metric depth estimates and ordinal relationships like figure/ground. In this paper, we focus particularly on the role of motion in grouping and figure/ground assignment. The importance of motion cues in these tasks is a classic point in the psychophysical literature. Koffka stated the Gestalt principle of "common fate", whereby similarly moving points are perceived as coherent entities [15], and grouping based on motion was emphasized by numerous other works, including Gibson [12], who also pointed out occlusion/disocclusion phenomena.
In contrast to color and stereopsis, which also help to separate different objects, motion is a cue shared by basically all visual species, a fact that emphasizes its importance in biological vision systems.

* This work was supported by the German Academic Exchange Service (DAAD) and ONR MURI N00014-06-1-0734.

Figure 1. Occlusion boundary detection benchmark. Precision-recall curves for the occlusion boundary detection task reported by Stein and Hebert [24]. We show results reported by [24], Sargin et al. [21], and He et al. [14], as well as our own results. For performance numbers, see Table 1.

The algorithm presented in this paper combines static boundary cues produced by the boundary detector from [17] with low-level motion cues and motion cues derived from optical flow [6]. For each point on the boundaries produced by the static detector, we compute the motion difference δ between the two regions adjacent to the boundary. The computation of δ involves both spatial and temporal aggregation of the optical flow. We then combine the δ feature with the static detector output into the final contour classifier f, which can be turned into a set of closed regions. In addition, by comparing the optical flow on boundary points with that of regions adjacent to the boundary, we can assign figure/ground labels to edges. Figure 2 illustrates this procedure. Compared to previous techniques, our approach is very simple and transparent. Hence, we find it particularly remarkable and rewarding that we perform so much better than previous, far more complex methodologies.

Figure 2. Overview. Top: Keyframe from the rocking horse sequence of [24], along with ground truth and the output of the static boundary detector of [2]. Bottom left: Our motion feature δ, computed on boundaries. Bottom center: Resulting boundaries combining δ and static cues.
Bottom right: Figure/ground classification results on detected regions, with figure in green.

The present paper makes three distinct contributions. First, we extend the current state-of-the-art static boundary detector from [17] to exploit motion cues in videos, using both low-level motion cues and optical flow. We evaluate our algorithm on the current leading benchmark introduced in [24] and show that it improves the state of the art, as measured by average precision, from .58 to .72, as shown in Figure 1. This is a very significant performance improvement, achieved with a very clean and transparent approach. Second, we introduce a new, larger, and more difficult dataset for occlusion boundary detection and figure/ground assignment. The dataset contains 102 HD video sequences with segmentations and depth ordering labels, separated into training and test sets. This dataset can be regarded as a video counterpart to the Berkeley Segmentation Dataset for static images [19]. Third, we show how to assign figure/ground labels to detected occlusion boundaries based only on the motion of edges and their adjacent regions, obtaining 84% accuracy on the dataset from [24] and 69% on our new, more challenging dataset.

2. Previous Work

In the computer vision literature, many works have dealt with the problem of optical flow estimation over the past three decades, and there have been numerous approaches to making use of the optical flow field for grouping [10, 22, 23, 28, 9]. Most of them are similar to the work of Wang and Adelson, which proposes to partition the image into motion layers by clustering similar optical flow vectors according to a parametric motion model [26]. While this approach is attractive in the sense that it directly provides object regions, there are many cases that are not properly captured by parametric models. To deal with this shortcoming, [27] suggested a nonparametric version of layered motion, where each layer is described by a smooth flow field.
Similar techniques based on level sets have been presented in [1, 7]. A shortcoming of these nonparametric layer models is the susceptibility of the EM procedure to local minima, particularly in areas of the image with little structure.

An alternative strategy is to detect occlusion edges and to infer layers from these edges afterwards, if needed. Such a strategy has been shown to work very well for segmentation of static images [2], and it makes even more sense for grouping based on motion cues, where additional difficulties due to the aperture problem limit the reliability of typical EM-style procedures.

Only a few works have dealt with explicitly detecting occlusion boundaries based on motion cues and assigning figure/ground labels to both sides
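As a concrete illustration of the two motion cues described in the introduction, the sketch below computes a motion-difference feature δ between the two regions adjacent to a boundary point (spatially aggregated over a window, temporally aggregated over several flow fields) and assigns the figural side as the region whose motion best matches that of the edge. This is only a schematic reconstruction under simplifying assumptions, not the authors' implementation; the function names, the window-based aggregation, and the nearest-motion rule as written here are illustrative.

```python
import numpy as np

def motion_difference(flows, labels, point, region_a, region_b, radius=10):
    """Illustrative motion-difference feature (not the paper's exact delta).

    flows    : list of (H, W, 2) optical flow fields from nearby frames;
               averaging over the list stands in for temporal aggregation
    labels   : (H, W) integer region map from the static detector's regions
    point    : (y, x) location of a point on the boundary
    region_a, region_b : labels of the two regions adjacent to the boundary
    """
    y, x = point
    H, W = labels.shape
    # Spatial aggregation: restrict to a window around the boundary point.
    y0, y1 = max(0, y - radius), min(H, y + radius + 1)
    x0, x1 = max(0, x - radius), min(W, x + radius + 1)
    win_labels = labels[y0:y1, x0:x1]
    diffs = []
    for flow in flows:
        win_flow = flow[y0:y1, x0:x1]
        # Mean flow of each adjacent region inside the window.
        mean_a = win_flow[win_labels == region_a].mean(axis=0)
        mean_b = win_flow[win_labels == region_b].mean(axis=0)
        diffs.append(np.linalg.norm(mean_a - mean_b))
    return float(np.mean(diffs))

def figure_side(edge_flow, flow_a, flow_b):
    """Illustrative figure/ground rule: an occlusion boundary moves with the
    occluding surface, so the figural side is taken to be the region whose
    mean flow best matches the motion of the edge itself."""
    da = np.linalg.norm(np.asarray(edge_flow) - np.asarray(flow_a))
    db = np.linalg.norm(np.asarray(edge_flow) - np.asarray(flow_b))
    return "a" if da <= db else "b"
```

On a synthetic example where the left region translates by one pixel per frame and the right region is static, δ evaluates to 1.0 at the boundary, and an edge moving with the left region is labeled with that region as figure.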

