Clemson ECE 847 - Multiperson Tracking Using Kalman Filter

Multiperson Tracking Using Kalman Filter

Salil P. Banerjee, Kris Pallipuram

December 10, 2008

Abstract

This paper addresses the implementation of the Kalman filter to track multiple persons in a room. First, an occupancy map of the room is created using six cameras distributed across the room. The Kalman filter is then used to track the centroids of the persons detected in the room. The filter continues to track persons even when their blobs merge, providing increased efficiency in tracking multiple persons in the room.

Introduction

Tracking is the process of locating moving objects in time using a camera. An algorithm analyzes each frame and outputs the location of the moving targets within the video frame [1]. Such an algorithm has two basic steps: detection of the moving objects (here, people) in each frame, and filtering/tracking of those objects across consecutive frames. Detection is a bottom-up approach with low computational complexity; algorithms such as blob tracking, kernel-based tracking, contour tracking and visual feature matching can be used for this step. Filtering/tracking is a top-down process with higher computational complexity; the Kalman filter and the particle filter are two popular algorithms used for filtering/tracking moving objects.

In 1960, R. E. Kalman published his paper describing a recursive solution to the discrete-data linear filtering problem. Since that time, extensive research and development on the Kalman filter has taken place, driven by advances in digital computing [2].

The Kalman filter is an algorithm that smooths measurements by weighting them against predicted values according to their variances. In other words, the Kalman filter tries to find a balance between predicted values and noisy measurements. The weights are determined by the model chosen for the state equations. The purpose of the Kalman filter is to track the system being measured at discrete intervals of time.

For a completely known system, noise can be reduced by the repeated application of window filters such as the mean filter, the median filter, the averaging filter and the Gaussian filter. However, these filters can only be applied to a completely known system. The Kalman filter differs from such window filters in that it predicts the next state based on the previous states and needs no information about future states. Its main purpose is to predict the next state and smooth the prediction.

Our approach is to apply the Kalman filter to the centroids of the people detected in the occupancy map. The occupancy map and the centroids of the people had already been computed for the Frogger [3] game. In this problem, the state variables are the x and y coordinates of the centroid of each tracked person.

The Kalman filter has been implemented using Visual C++ version 6.0 running on an ACPI multiprocessor PC with a 3.2 GHz Intel Xeon processor, 1 GB of RAM and Windows 2000 Professional Service Pack 4 as the operating system.
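As a concrete illustration of the predict/update cycle described above, the listing below is a minimal sketch for a single tracked centroid, assuming a random-walk (constant-position) state model in which each coordinate is filtered independently against the centroid measurements taken from the occupancy map. The struct names, initial values and noise variances are hypothetical; this is not the authors' implementation.

    // Minimal sketch of the Kalman predict/update cycle for one tracked centroid.
    // Assumptions (not from the paper's code): random-walk dynamics, independent
    // x and y filters, direct measurement of the centroid position.
    #include <cstdio>

    struct ScalarKalman {
        double x;  // current estimate of one centroid coordinate
        double p;  // variance of the estimate
        double q;  // process noise variance (hand-tuned assumption)
        double r;  // measurement noise variance (hand-tuned assumption)

        void predict() {
            // Identity dynamics: the predicted state equals the previous
            // estimate, but its uncertainty grows by the process noise.
            p += q;
        }

        void update(double z) {
            // The gain weights the noisy measurement against the prediction
            // according to their variances, as described in the Introduction.
            double k = p / (p + r);
            x += k * (z - x);
            p *= (1.0 - k);
        }
    };

    struct CentroidTracker {
        ScalarKalman kx, ky;  // independent filters for the x and y coordinates

        void step(double zx, double zy) {
            kx.predict();  ky.predict();
            kx.update(zx); ky.update(zy);
        }
    };

    int main() {
        // Hypothetical initial state, variances and centroid measurements.
        CentroidTracker t;
        t.kx.x = 100.0; t.kx.p = 10.0; t.kx.q = 1.0; t.kx.r = 4.0;
        t.ky.x = 200.0; t.ky.p = 10.0; t.ky.q = 1.0; t.ky.r = 4.0;
        const double z[3][2] = { {102.0, 198.5}, {104.5, 197.0}, {103.0, 199.0} };
        for (int i = 0; i < 3; ++i) {
            t.step(z[i][0], z[i][1]);
            std::printf("frame %d: filtered centroid = (%.2f, %.2f)\n",
                        i, t.kx.x, t.ky.x);
        }
        return 0;
    }

With identity dynamics and a direct position measurement, each coordinate reduces to a scalar filter, so the gain is simply the ratio of predicted variance to total variance; a constant-velocity model would instead carry a four-element state vector and a 4x4 covariance matrix.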
Methods

System

The system is located in the sensor network laboratory in the basement of Riggs Hall. The floor of the laboratory is used as the trackable space. Six static cameras are fixed across the room, four at the corners and two at the center of the room's length. The layout of the room is shown in Figure 1. The programs for starting the system, camera calibration, acquisition of live images from the cameras, background subtraction, centroid calculation and person-identification decision making are credited to Dr. Adam Hoover and Bent Olsen [3].

Figure 1: Basic block diagram of the layout of the room and the cameras.

In order to detect people in the captured live images, segmentation is required. In our project, segmentation is achieved by creating an occupancy map from the six static cameras. An occupancy map is a two-dimensional raster image, uniformly distributed in the floor plane. Each map pixel contains a binary value, signifying whether the designated floorspace is occupied or not. A spatial frame of the occupancy map is computed from a set of intensity images, one per camera, captured simultaneously using a synchronizing signal. The system provides a 480×640 resolution occupancy map. The approach used to create the occupancy map is the image-freespace perceptual paradigm. The model starts with the assumption that every space in the image is filled, that is, has a value of 1 in the case of a binary image. It then processes this image to clear out space, effectively ignoring the observed object pixels, so that a cell of the model is transformed from occupied to empty. Thus, if a camera cannot see an object, it marks the corresponding pixel in its image with a zero value, thereby creating freespace. The union of the freespaces from the multiple cameras creates a reasonable picture of the occupied space.

Creation of the occupancy map involves three basic steps: calibration, acquisition of the background image, and finally determination of the occupancy map for each frame. The cameras are initially uncalibrated and must be calibrated so as to map points in each two-dimensional image plane to points in the three-dimensional world. The transform T_n from each camera's image space I[n, c, r] to the (x, y, z) world space is given by

\[
  T_n : [n, c, r] \;\rightarrow\; (x, y, z) + (i, j, k)\,d, \qquad d > 0 \tag{1}
\]

where d is the distance from the camera to the ground, n is the camera number, and c and r are the column and row of the image formed in the camera plane, respectively.

The next step is to acquire a background image B[n, c, r] for each camera while the floorspace to be monitored is empty. A binary mask M[n, c, r] is created for each background image; if a pixel in the mask is 0, it denotes empty floorspace. The program uses a polygon-drawing tool to create the mask. The floorspace used in our experiments is cleared of chairs, tables and people, while heavy drawers are kept near the walls of the room.

The third and final step is to determine the occupancy map cell O[x, y] that each pixel I[n, c, r] views. The basic algorithm is to detect the difference between the background image B[n, c, r] and the live image I[n, c, r] for each camera. The difference image is computed as

\[
  D[n, c, r] =
  \begin{cases}
    1 & \text{if } |I[n, c, r] - B[n, c, r]| > T \\
    0 & \text{if } |I[n, c, r] - B[n, c, r]| \le T
  \end{cases} \tag{2}
\]

where the threshold T controls the sensitivity of the algorithm. Using the above equation, an occupancy map cell is
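As a concrete illustration of equation (2) and the image-freespace paradigm described above, the sketch below shows how the per-camera difference test can drive the occupancy update: every map cell starts as occupied, and a cell is cleared whenever a camera pixel that views it matches the background. The Image struct, the cellOf lookup table (standing in for the calibrated transform T_n) and the function name are hypothetical; this is not the system's code.

    // Sketch of the difference image D[n, c, r] from equation (2) feeding an
    // image-freespace occupancy update.  Not the system's actual code; the
    // calibration is assumed to be precomputed into a per-pixel cell lookup.
    #include <algorithm>
    #include <cstdlib>
    #include <vector>

    struct Image {
        int cols, rows;
        std::vector<unsigned char> px;   // 8-bit grayscale pixels, row-major
        unsigned char at(int c, int r) const { return px[r * cols + c]; }
    };

    // live, background : one image per camera n
    // cellOf[n]        : map cell viewed by pixel (c, r) of camera n, or -1
    //                    (hypothetical stand-in for the transform T_n)
    // threshold        : T in equation (2)
    // occupancy        : one byte per map cell, 1 = occupied, 0 = free
    void updateOccupancy(const std::vector<Image>& live,
                         const std::vector<Image>& background,
                         const std::vector<std::vector<int> >& cellOf,
                         int threshold,
                         std::vector<unsigned char>& occupancy)
    {
        // Image-freespace paradigm: assume every cell is occupied, then let
        // each camera clear the cells it sees as unchanged from its background.
        std::fill(occupancy.begin(), occupancy.end(), 1);
        for (std::size_t n = 0; n < live.size(); ++n) {
            for (int r = 0; r < live[n].rows; ++r) {
                for (int c = 0; c < live[n].cols; ++c) {
                    int diff = std::abs(int(live[n].at(c, r)) -
                                        int(background[n].at(c, r)));
                    int d = (diff > threshold) ? 1 : 0;      // D[n, c, r]
                    int cell = cellOf[n][r * live[n].cols + c];
                    if (cell >= 0 && d == 0)                 // pixel is freespace
                        occupancy[cell] = 0;
                }
            }
        }
    }

In this sketch the background mask M[n, c, r] is not modeled separately; pixels that never view floorspace are assumed to be folded into the -1 entries of the hypothetical lookup.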

