Berkeley COMPSCI 294 - Transfer Learning of Object Classes

Transfer Learning of Object Classes: From Cartoons to Photographs

Geremy Heitz, Gal Elidan, Daphne Koller
Stanford University, Stanford, CA
{gaheitz,galel,koller}@cs.stanford.edu

Abstract

We consider the important challenge of recognizing a variety of deformable objects in images. Of fundamental importance and particular difficulty in this setting is the problem of "outlining" an object, rather than simply deciding on its presence or absence. A major obstacle in learning a model that will allow us to address this task is the need for hand-segmented training images. In this paper we present a transfer learning approach that circumvents this problem by transferring the "essence" of an object from cartoon images to natural images, using a landmark-based model. The use of transfer to create an automatic model-learning pipeline greatly increases our efficiency and flexibility in learning novel objects with minimal user supervision. We show that our method is able to automatically learn, detect and localize a variety of classes.

1 Introduction

Recognition and localization of instances of object classes is an important open problem in the field of computer vision. Many papers address this problem using a variety of methods. The constellation method [3] attempts to recognize and localize objects of interest using a generative model of object "parts", which appear in the image as interest operator patches. Berg et al. [1] use a landmark-based model and perform recognition using a nearest-neighbor search relative to exemplar images; they solve the correspondence problem using a quadratic integer program. Coughlan and Ferreira [2] solve the correspondence problem using loopy belief propagation (LBP) for simple objects, including handwritten letters and stick figures.

All of these methods suffer from the fact that fully-supervised data is hard to obtain.
They solve this by either using Expectation-Maximization [3], using many image points and hoping that most of them are inside the object [1], or concentrating on simple objects with trivial segmentations [2].

In this work, we present a new transfer learning approach that circumvents this major obstacle, enabling us to "self-learn" using minimal user supervision. In the first phase of our algorithm, we automatically learn a simple landmark model from cartoon images of the object. We then correspond our model to candidate natural training images using MRF inference in order to automatically identify useful candidates. In the second phase of the algorithm, we use these candidates to construct a more elaborate landmark-based model (i.e., one that also takes appearance into account). We then use this model to identify and predict the location of objects in unseen test instances.

We show that our method effectively bootstraps the simple cartoon images and is able to localize objects in natural images surprisingly well. Interestingly, we demonstrate that learning from cartoon images is often superior to learning from hand-segmented examples, as the "essence" of the object is captured in the first phase of our algorithm without human bias.

2 Landmark-Based Object Model

We model an object as a set of landmark points lying on the object boundary. Each landmark has local information about the image appearance in its neighborhood, as well as local edge information. In addition, pairs of landmarks have information about their relative locations, rotations, and scales.

2.1 Localization using Inference

To localize an object in an image, we define a Markov Random Field (MRF) whose variables correspond to the landmarks of the model. An assignment to these variables is a correspondence between the model and image pixels, and the potentials of the MRF take into account both the local and inter-landmark features.
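The scoring of one candidate correspondence can be sketched as follows. The specific feature functions, the Gaussian form of the pairwise term, and all names here are illustrative assumptions, not the paper's actual potentials:

```python
import numpy as np

# A minimal sketch of scoring one landmark-to-pixel correspondence in
# the MRF described above. The feature functions, the Gaussian pairwise
# form, and all names are illustrative stand-ins.

def unary_potential(edge_strength):
    """Local log-potential: favor pixels with strong edge response."""
    return float(np.log(edge_strength + 1e-9))

def pairwise_potential(p_i, p_j, mean_offset, sigma=5.0):
    """Pairwise log-potential: penalize deviation of the displacement
    between two landmarks from the model's mean relative offset."""
    diff = (np.asarray(p_j, float) - np.asarray(p_i, float)
            - np.asarray(mean_offset, float))
    return float(-diff @ diff / (2.0 * sigma ** 2))

def score_assignment(points, edge_map, edges, mean_offsets):
    """Total log-score of a correspondence: landmark i is assigned to
    pixel points[i]; edge_map gives the edge strength at each pixel."""
    score = sum(unary_potential(edge_map[p]) for p in points)
    score += sum(pairwise_potential(points[i], points[j], mean_offsets[(i, j)])
                 for (i, j) in edges)
    return score
```

With log-potentials, the MAP assignment is simply the highest-scoring correspondence, so inference amounts to searching over candidate pixel sets to maximize this score.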
Thus, the problem of object localization is translated into the problem of finding the most likely assignment in a probabilistic graphical model. We search for this assignment using loopy belief propagation.

3 Transfer Learning

Our learning procedure is composed of two principal stages: one that involves only simple cartoon images, and one that involves natural images. By using this two-phase approach, we can automatically determine which landmarks to use and avoid some of the pitfalls described above. We are thus able to provide a more stable model than those produced using previous methods.

In phase 1 of the model learning (see Figure 1), we automatically extract outlines from a set of cartoon images of the object to be learned. This gives us a high-resolution contour of the outline. We then determine the correspondences between all of the training instances using a slight modification of our correspondence algorithm described above. This step produces a common parameterization for the training contours. By using MDL-like considerations, we can then select a set of landmarks that most accurately represents the shape. From the chosen landmarks, we construct our phase 1 model, which captures local edge features and pairwise interactions between the landmarks. Such a model will contain little or no appearance information, due to the artificial nature of the training images.

In phase 2 (see Figure 2), we correspond the phase 1 model to each instance in our natural-image training set. From this we select the correspondences that score most highly (i.e., the ones we are most confident about), and use these as the phase 2 training instances. By doing this, we have bootstrapped the creation of a training set using the information automatically extracted from the cartoon images. This model will now contain the full appearance model derived from the natural images.

Figure 1: Phase 1 of model learning. We begin with a set of cartoon images and extract a high-resolution outline contour.
We then correspond these contours and select a set of landmarks that best represents the shape across all of the training instances. A model is then learned from these cartoon instances.

By breaking the process into two phases, we have reduced the number of parameters that must be learned at any one time. In phase 1, the only parameters to learn are which points along the full-resolution contour to use as landmarks. In phase 2, we begin with fully-supervised training instances, so maximum-likelihood learning of the model parameters is straightforward.

4 Preliminary Results

We have applied our procedure to several object categories with promising results. Below we show a few representative results for the "car side" class in the Caltech 101 dataset.
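The inference step described in Section 2.1 can be sketched as a max-product loopy belief propagation routine over log-potentials. The toy graph, candidate sets, and all names below are our own assumptions for illustration, not the paper's model:

```python
import numpy as np

# Illustrative max-product loopy belief propagation: the kind of
# inference used to find the most likely landmark-to-pixel
# assignment. States stand in for candidate pixels per landmark.

def loopy_max_product(unary, pairwise, edges, n_iters=20):
    """unary[i]: log-potential vector over candidate states of node i.
    pairwise[(i, j)]: log-potential matrix, shape (states_i, states_j).
    edges: undirected edges (i, j). Returns one state index per node."""
    neighbors = {i: [] for i in range(len(unary))}
    msgs = {}
    for (i, j) in edges:
        neighbors[i].append(j)
        neighbors[j].append(i)
        msgs[(i, j)] = np.zeros(len(unary[j]))  # message i -> j
        msgs[(j, i)] = np.zeros(len(unary[i]))  # message j -> i
    for _ in range(n_iters):
        for (i, j) in list(msgs):
            psi = pairwise[(i, j)] if (i, j) in pairwise else pairwise[(j, i)].T
            # Combine node i's unary term with all incoming messages
            # except the one from j, then maximize over i's states.
            incoming = unary[i] + sum(msgs[(k, i)] for k in neighbors[i] if k != j)
            m = np.max(psi + incoming[:, None], axis=0)
            msgs[(i, j)] = m - m.max()  # normalize for numerical stability
    beliefs = [unary[i] + sum(msgs[(k, i)] for k in neighbors[i])
               for i in range(len(unary))]
    return [int(np.argmax(b)) for b in beliefs]
```

On a tree-structured landmark graph this recovers the exact MAP assignment; on loopy graphs it is an approximation, which is the trade-off the paper's localization step accepts.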

