Stanford EE 392J - Motion Based Foreground Segmentation

Motion-Based Foreground Segmentation
David B. Brown
Digital Video Processing, EE392J
Stanford University
March 13, 2000

ABSTRACT

A variety of useful applications demonstrate the need for precise motion-based segmentation of image data. Video compression techniques rely heavily on accurate and efficient representation of spatio-temporal regions possessing similar motion models. In addition, a variety of cinematographic effects become possible with successful separation of a moving object's image from an arbitrary background. Computer vision systems often depend on the ability to distinguish or describe a moving object in an image sequence. An algorithm is designed to segment a moving foreground based on block-matching motion methods and recursive tracing of the resulting motion vectors.

1 Introduction

The objective of this project is the creation of an algorithm that will separate moving foreground from a stationary background in a general video sequence. This problem is motivated primarily by the need to replace or remove moving persons or objects in motion pictures. As mentioned above, however, motion-based segmentation also finds a niche in the realms of video compression and computer vision. While this problem is still not completely solved, others have directly addressed it; see Giaccone and Jones for a more thorough treatment [1].

Selection of a motion estimator model represents the first step in the problem. Gradient-based methods such as optical flow have shown high performance but generally carry higher computational overhead than block-based matching. The disadvantage of block methods is an expected loss of sharpness at edge regions marking the boundary between foreground and background.

Regardless of the motion estimator, careful attention must be paid to noise effects when estimating motion. Faulty motion vectors due to image noise can lead to visually unpleasant effects such as isolated background blocks in the resulting segmented image. Noise-reduction filters may be used to alleviate this problem. Another method is to examine the resulting mean-squared error of regions with known zero motion vectors: any error there must be due solely to noise, and thus it provides information about the noise in a particular image sequence.

Accurate knowledge of all the motion vectors in a sequence theoretically provides the means to segment the images into pixels associated with a moving object and pixels associated with a rigid background. The algorithm for tracing motion vectors throughout the sequence is highly recursive and can be computationally expensive, depending on the number of non-zero motion vectors present. A video scene to be segmented should be motion traced both forwards and backwards temporally.

An interesting and difficult complication to this problem is the issue of occlusions with the background. Consider a speaker who moves their hand behind a stationary background object such as a desk or chair. Such occlusions represent a challenge and are not examined in this approach; the interested reader should again consult Giaccone and Jones for a treatment of this issue [1].

In addition to ignoring occlusions, the following analysis makes other assumptions about a given image sequence. Lighting and color changes should not be present to a high degree in the video being examined. Furthermore, the moving object should be constrained towards the interior of the image; the effect of partial or total movement off-screen is not considered.

2 Implementation

Figure 1: System Block Diagram

Figure 1 shows the block diagram for the system. A brief discussion of the details behind each block follows.

2.1 Noise Estimator

Before motion estimation is begun, an accurate estimate of the mean-squared error due to noise in pixel intensities must be obtained. With this information, incorrect non-zero motion vectors may be discarded before they are traced. Depending on the tracing algorithm used, as discussed below, the rejection of these vectors can have important consequences for the resulting visual quality.

The noise estimation is computed in a straightforward fashion. The motion vectors between a pair of successive frames are computed without considering noise. The mean-squared error of regions that have known zero motion vectors must be solely a result of any noise in the image sequence. The mean and standard deviation of the resulting collection of mean-squared errors are computed, and some number of standard deviations above the mean serves as a threshold value. In all following motion vector calculations, only minimum mean-squared errors that fall below this threshold can be considered true non-zero motion vectors.

This method relies on the assumption that a large number of zero-motion regions can be found before considering noise. The problem here is that if the images are very noisy, or a particularly noisy pair of images is used for the threshold estimate, then the collection of data points for threshold estimation is small. Also, depending on the nature of the distribution of the mean-squared errors, only a certain range of standard deviations above the mean yields a useful threshold: too far above the mean will have little effect in reducing noise, whereas too close to the mean will result in loss of true motion estimates.
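
As a rough illustration of this thresholding step, the following sketch (in Python with NumPy, a choice not taken from the report; the function and parameter names are invented for illustration) collects the matching errors of blocks whose estimated motion is zero and places the threshold a chosen number of standard deviations above their mean.

import numpy as np

def noise_threshold(motion_vectors, match_errors, num_std=2.0):
    """Estimate an MSE threshold from blocks with zero estimated motion.

    motion_vectors : (num_blocks, 2) integer displacements from a full
                     search run without any noise rejection.
    match_errors   : (num_blocks,) minimum mean-squared error of each
                     block match.
    num_std        : standard deviations above the mean; 2.0 is an
                     assumed default, not a value from the report.
    """
    mv = np.asarray(motion_vectors)
    err = np.asarray(match_errors, dtype=np.float64)

    # Matching error of stationary blocks is attributed entirely to noise.
    zero_motion = np.all(mv == 0, axis=1)
    noise_errors = err[zero_motion]
    if noise_errors.size == 0:
        raise ValueError("no zero-motion blocks available for the estimate")

    return noise_errors.mean() + num_std * noise_errors.std()

In subsequent frame pairs, a non-zero motion vector would then be accepted only when its minimum matching error falls below this threshold.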

2.2 Motion Estimation

A block-matching motion estimator is used to calculate motion vectors over each pair of images in the sequence. Since high accuracy in the motion vectors is desired, the estimator performs a full search over the search window. The minimum mean-squared error criterion gives the best block match.

The choice of the block size has a great impact on the results. A smaller block size will tend to produce more false motion vectors despite any noise estimation, but will result in finer edge definition in the resulting segmented image. Larger block sizes give coarser edges but are less plagued by noise effects. Furthermore, the number of computations necessary for motion tracing goes down as the block size increases, since there are fewer motion vectors to analyze. For the images analyzed (144 x 176 pixels), a block size of 8 pixels is used and seems to be a reasonable choice.
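
A minimal full-search matcher of this kind might look as follows; this is a sketch rather than the report's implementation, and the search-window size and border handling are assumptions.

import numpy as np

def full_search(prev, cur, block=8, search=7):
    """Exhaustive block matching under a minimum mean-squared-error criterion.

    prev, cur : 2-D grayscale frames of equal size (e.g. 144 x 176).
    block     : block size in pixels (8, as used above).
    search    : maximum displacement tested in each direction (assumed).

    Returns per-block motion vectors (dy, dx) and their minimum MSE.
    """
    h, w = cur.shape
    by, bx = h // block, w // block
    vectors = np.zeros((by, bx, 2), dtype=int)
    errors = np.full((by, bx), np.inf)

    for i in range(by):
        for j in range(bx):
            y, x = i * block, j * block
            target = cur[y:y + block, x:x + block].astype(np.float64)
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    yy, xx = y + dy, x + dx
                    # Skip candidates that fall outside the previous frame.
                    if yy < 0 or xx < 0 or yy + block > h or xx + block > w:
                        continue
                    cand = prev[yy:yy + block, xx:xx + block].astype(np.float64)
                    mse = np.mean((target - cand) ** 2)
                    if mse < errors[i, j]:
                        errors[i, j] = mse
                        vectors[i, j] = (dy, dx)
    return vectors, errors

The noise threshold from Section 2.1 is then applied to these errors: any non-zero vector whose minimum MSE does not fall below the threshold is discarded before tracing.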

2.3 Motion Tracer

This block represents the heart of the segmentation algorithm. Its function is to distinguish regions that are moving in any frame or have moved at any time throughout the sequence of images. By analyzing the motion in this way, segmentation of regions that only move briefly is possible. For instance, in
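
Purely as a hypothetical illustration of how a recursive trace over block motion vectors could be organized, the sketch below follows chains of non-zero vectors and accumulates a single block-level mask of everything that has moved. The propagation rule, the accumulated mask, and all names are assumptions rather than the report's actual algorithm; as noted in the introduction, the same pass would be run over the sequence both forwards and backwards.

import numpy as np

def ever_moved_mask(vector_fields, block=8):
    """Accumulate a block-level mask of regions that move at any time.

    vector_fields : list of (by, bx, 2) integer block-motion fields, one
                    per consecutive frame pair (re-run on the reversed
                    sequence to trace backwards as well).
    """
    by, bx, _ = vector_fields[0].shape
    mask = np.zeros((by, bx), dtype=bool)

    def trace(field, i, j):
        # Follow non-zero vectors recursively until a stationary or
        # already-marked block is reached.
        if not (0 <= i < by and 0 <= j < bx) or mask[i, j]:
            return
        dy, dx = field[i, j]
        if dy == 0 and dx == 0:
            return
        mask[i, j] = True
        # The block this vector points into belongs to the moving region too.
        ni = i + int(round(dy / block))
        nj = j + int(round(dx / block))
        trace(field, ni, nj)
        if 0 <= ni < by and 0 <= nj < bx:
            mask[ni, nj] = True

    for field in vector_fields:
        for i in range(by):
            for j in range(bx):
                trace(field, i, j)
    return mask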
