OSU CS 559 - Class-Specific Hough Forests for Object Detection

Class-Specific Hough Forests for Object Detection

Juergen Gall
BIWI, ETH Zürich and MPI Informatik∗
[email protected]

Lempitsky
Microsoft Research
[email protected]

Abstract

We present a method for the detection of instances of an object class, such as cars or pedestrians, in natural images. Similarly to some previous works, this is accomplished via a generalized Hough transform, where the detections of individual object parts cast probabilistic votes for possible locations of the centroid of the whole object; the detection hypotheses then correspond to the maxima of the Hough image that accumulates the votes from all parts. However, whereas the previous methods detect object parts using generative codebooks of part appearances, we take a more discriminative approach to object part detection. Towards this end, we train a class-specific Hough forest, which is a random forest that directly maps the image patch appearance to the probabilistic vote about the possible location of the object centroid. We demonstrate that Hough forests improve the results of the Hough-transform object detection significantly and achieve state-of-the-art performance for several classes and datasets.

1. Introduction

The appearance of objects of the same class, such as cars or pedestrians, in natural images varies greatly due to intra-class differences, changes in illumination and imaging conditions, as well as object articulations. Therefore, to ease the detection (localization), most methods take the bottom-up, part-based approach, where the detections of individual object parts (features) are further integrated to reason about the positioning of the entire objects.

Toward this end, the Hough-transform based method of Leibe et al. [10, 11] learns the class-specific implicit shape model (ISM), which is essentially a codebook of interest point descriptors typical for a given class.
After the codebook is created, each entry is assigned a set of offsets with respect to the object centroid that are observed on the training data. At runtime, the interest point descriptors in the image are matched against the codebook and the matches cast probabilistic votes about possible positions of the object in the scale-space. These votes are summed up into a Hough image, the peaks of it being considered as detection hypotheses. The whole detection process can thus be described as a generalized class-specific Hough transform [4].

∗The major part of the research project was undertaken when Juergen Gall was an intern with Microsoft Research Cambridge.

Implicit shape models can integrate information from a large number of parts. They also demonstrate good generalization as they are free to combine parts observed on different training examples. Furthermore, the additive nature of the Hough transform makes the approach robust to partial occlusions and untypical unseen part appearances. However, such codebook-based Hough transform comes at a significant computational price. On the one hand, a large generative codebook is required to achieve good discrimination. On the other hand, the construction of large codebooks involves solving difficult, large-scale clustering problems. Finally, matching with the constructed codebook is time-consuming, as it is typically linear in the number of entries.

In the paper, we develop a new Hough transform-based detection method, which takes a more discriminative approach to part detection. Rather than using an explicit codebook of part appearances, we learn a direct mapping between the appearance of an image patch and its Hough vote. While learning such a mapping cannot be formalized as a standard classification or regression problem, we demonstrate that it can be efficiently accomplished within the random forest framework [2, 7].
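
The voting scheme summarized earlier in this section (parts cast weighted votes for the object centroid, the votes are summed into a Hough image, and its peaks become detection hypotheses) can be illustrated with a small sketch. This is not the authors' implementation: the `Vote` layout and the function names `accumulate_hough_image` and `strongest_hypothesis` are hypothetical, and the sketch assumes a single scale, integer pixel offsets, and no smoothing of the accumulator.

```python
from typing import List, Tuple

# Hypothetical vote layout (an assumption for illustration):
# (patch center (row, col), offset to centroid (d_row, d_col), vote weight).
Vote = Tuple[Tuple[int, int], Tuple[int, int], float]


def accumulate_hough_image(votes: List[Vote], height: int, width: int) -> List[List[float]]:
    """Sum weighted centroid votes into a 2D accumulator (the Hough image)."""
    hough = [[0.0] * width for _ in range(height)]
    for (py, px), (dy, dx), weight in votes:
        cy, cx = py + dy, px + dx  # voted centroid = patch center + offset
        if 0 <= cy < height and 0 <= cx < width:  # ignore votes outside the image
            hough[cy][cx] += weight
    return hough


def strongest_hypothesis(hough: List[List[float]]) -> Tuple[int, int, float]:
    """Return (row, col, score) of the global maximum of the Hough image."""
    best = (0, 0, hough[0][0])
    for y, row in enumerate(hough):
        for x, score in enumerate(row):
            if score > best[2]:
                best = (y, x, score)
    return best


# Toy data: two object patches agree on the centroid at (5, 5),
# while a weak background vote lands elsewhere.
votes = [
    ((2, 3), (3, 2), 0.9),
    ((7, 6), (-2, -1), 0.8),
    ((1, 1), (0, 0), 0.1),
]
hough_image = accumulate_hough_image(votes, height=10, width=10)
peak = strongest_hypothesis(hough_image)
```

In this toy case the two consistent votes reinforce each other at row 5, column 5, so the peak lies there, while the isolated low-weight vote contributes little. A practical detector would typically search several scales and look for multiple local maxima rather than a single global one.
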
Thus, given a dataset of training images with the bounding-box annotated samples of the class instances, we learn a class-specific random forest that is able to map an image patch to a probabilistic vote about the position of an object centroid. At runtime, such a class-specific Hough forest is applied to the patches in the test image and the resulting votes are accumulated in the Hough image, where the maxima are sought.

Random forests have recently attracted a lot of attention in computer vision [6, 12, 20, 25, 27]. Related to our work, the idea of replacing generative codebooks with random forests has been investigated in the context of image classification and semantic segmentation in [13, 14, 20, 25]. Most similar to Hough forests are the classification random forests used to obtain the unary potentials within the LayoutCRF method [27].

Figure 1. Panels: (a) original image with three sample patches emphasized; (b) votes assigned to these patches by the Hough forest; (c) Hough image aggregating votes from all patches; (d) the detection hypothesis corresponding to the peak in (c). For each of the three patches emphasized in (a), the pedestrian class-specific Hough forest casts a vote about the possible location of a pedestrian centroid (b) (each color channel corresponds to the vote of a sample patch). Note the weakness of the vote from the background patch (green). After the votes from all patches are aggregated into a Hough image (c), the pedestrian can be detected (d) as a peak in this image.

While Hough forests are in many aspects similar to other random forests in computer vision, they possess several interesting specific properties, motivated by their use within the generalized Hough transform framework:
• The set of leaf nodes of each tree in the Hough forest can be regarded as a discriminative codebook. Each leaf node makes a probabilistic decision whether a patch corresponds to a part of the object or to the background, and casts a probabilistic vote about the centroid position with respect to the patch center.
• The trees in a Hough forest are built directly to optimize their voting performance. In other words, the training process builds each tree so that the leaves produce probabilistic votes with small uncertainty.
• Each tree is built based on the collection of patches drawn from the training data. Importantly, the building process employs all the supervision available for the training data: namely, whether a patch comes from the background or an object and, in the latter case, which part of the object it comes from.

Our method also benefits from the advantages typical of other random forest applications. Thus:
• Random forests can be trained on large, very

