A Theory of the Quasi-static World

Brandon C. S. Sanders and Randal C. Nelson
Department of Computer Science
University of Rochester
Rochester, NY 14627
[sanders,nelson]@cs.rochester.edu

Rahul Sukthankar
Compaq Research (CRL)
One Cambridge Center
Cambridge, MA
[email protected]

Abstract

We present the theory behind a novel unsupervised method for discovering quasi-static objects, objects that are stationary during some interval of observation, within image sequences acquired by any number of uncalibrated cameras. For each pixel we generate a signature that encodes the pixel’s temporal structure. Using the set of temporal signatures gathered across views, we hypothesize a global schedule of events and a small set of objects whose arrivals and departures explain the events. The paper specifies observability conditions under which the global schedule can be established and presents the QSL algorithm that generates the maximally-informative mapping of pixels’ observations onto the objects they stem from. Our framework ignores distracting motion, correctly deals with complicated occlusions, and naturally groups observations across cameras. The sets of 2D masks we recover are suitable for unsupervised training and initialization of object recognition and tracking systems.

1. Introduction

“Object Discovery” (OD) is the problem of grouping all observations springing from a single object without including any observations generated by other objects. Static OD systems, such as object recognizers and image segmenters, seek to discover objects in single, static images. Object recognizers discover objects for which they already have models, as in [5, 9], and generally require extensive training for satisfactory performance. In contrast to object recognizers, image segmenters do not require an a priori model of each object of interest. Segmenters typically rely upon local spatial homogeneity of color [3], texture [4], or a combination of these cues [2] to discover objects.
Because objects are not actually spatially homogeneous, segmenters often split objects and combine portions of different objects together.

Dynamic OD systems find objects that move independently in the world using a combination of temporal and spatial information. Many such systems depend upon spatial homogeneity of motion flow vectors [11], and are sometimes combined with texture or color [1]. Other dynamic OD systems use background subtraction [10] to separate moving objects from a static background. Unfortunately, dynamic OD systems often require high frame rates and/or cannot separate objects from the person manipulating them.

[Figure 1: image grid across Camera 0, Camera 1, and Camera 2 showing Obj0 (Bowl), Obj1 (Keyboard), Obj2 (Can), and Obj3 (Spoon); frames at t = 20.5 s, 41.2 s, and 51.2 s.]
Figure 1. OD results on a sequence in which the spoon and can arrive simultaneously, the keyboard is always partially occluded in cameras 0 and 1, and the can and keyboard mutually occlude each other in camera 1. Temporal information alone is sufficient to group the pixels in and across the uncalibrated cameras.

In this paper we present the theory behind a “quasi-static” system that achieves good results (see Figure 1) using only temporal information to cluster pixel observations. In the figure, the images in the top row are examples from sequences acquired by three different uncalibrated cameras. The other images in the bottom grid show the objects discovered (the background is also found but not shown). The complete keyboard is recovered even though it is partially occluded by either the bowl or the can in every image in which cameras 0 and 1 observe it. The can and spoon are correctly discriminated even though they arrive in the scene at the same time, a significant improvement over the results reported in [7]. Because we do not use spatial information, our approach is novel and complements the existing body of segmentation work, which generally relies upon local spatial homogeneity of color, texture, or motion flow vectors.
The advantages of our method include the following: (1) low frame rate requirements (e.g., 1 Hz); (2) entire objects are discovered even in some cases where they are always partially occluded; (3) objects that arrive or depart simultaneously are correctly distinguished if each object’s lifetime (arrival/departure pair) is distinguishable from the lifetime of every other object; (4) the approach scales naturally to, and benefits from, multiple completely uncalibrated cameras.

2. The Quasi-static World

In this section we define the quasi-static world model used throughout the remainder of the paper. This model is attractive because it imposes enough restrictions on the world to be theoretically treatable while maintaining practical application to real systems. The quasi-static model assumes that the only objects of interest are those that undergo motion on some time interval and are stationary on some other time interval (i.e., objects that stay still for a while). Thus the quasi-static world model targets objects that are picked up and set down while ignoring the person manipulating them.¹ The following definitions will be used throughout the paper in connection with the quasi-static model:

Physical object: A chunk of matter that leads to consistent observations through space and time (i.e., an object in the intuitive sense). We define physical objects in order to contrast them with quasi-static objects. A physical object is mobile if it is observed to move in the scene.

Quasi-static object: The quasi-static world interpretation of a mobile physical object that is stationary over a particular time interval.

Quasi-static object lifetime: The time interval over which a mobile physical object is stationary at a single location.
When a mobile physical object m moves around the scene and is stationary at multiple physical locations, each stationary location i is interpreted as a separate quasi-static object o_i.

¹ Of course, according to the quasi-static world model, when a person is completely stationary he/she becomes an object of interest.

Global schedule: A set of quasi-static object lifetimes.

Pixel visage: A set of observations made by a given pixel that are interpreted as stemming from a particular quasi-static object. A pixel’s visages are disjoint with each other, and each forms a history of a particular quasi-static object’s visual appearance through time according to the pixel. When an observation for a pixel p made at time t is assigned to a visage v, p is said to observe v at time t. Likewise,
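To make the definitions above concrete, they can be sketched as simple data structures. This is only an illustrative sketch: the class names (Lifetime, QuasiStaticObject, Visage) and the numeric values below are our own assumptions, not identifiers or data from the paper's QSL implementation.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Lifetime:
    """Interval over which a mobile physical object is stationary (seconds)."""
    arrival: float
    departure: float

    def contains(self, t: float) -> bool:
        return self.arrival <= t <= self.departure


@dataclass(frozen=True)
class QuasiStaticObject:
    """One stationary episode of a mobile physical object.

    The same physical object stationary at several locations yields
    several QuasiStaticObject instances (the o_i above).
    """
    object_id: int
    lifetime: Lifetime


@dataclass
class Visage:
    """One pixel's observations attributed to one quasi-static object."""
    obj: QuasiStaticObject
    observations: dict = field(default_factory=dict)  # time -> pixel value

    def observe(self, t: float, value) -> None:
        # A pixel can only observe an object during that object's lifetime.
        assert self.obj.lifetime.contains(t)
        self.observations[t] = value


# A global schedule is simply a set of quasi-static object lifetimes;
# here, a set of QuasiStaticObject instances (hypothetical times).
bowl = QuasiStaticObject(0, Lifetime(10.0, 30.0))
can = QuasiStaticObject(2, Lifetime(12.0, 40.0))
schedule = {bowl, can}

v = Visage(bowl)
v.observe(15.0, (200, 180, 160))  # hypothetical RGB value at t = 15 s
```

In this sketch the global schedule is given directly; in the paper, the QSL algorithm instead hypothesizes the schedule from the temporal signatures gathered across views.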

