UT CS 395T - Adapted Vocabularies for Generic Visual Categorization

Contents: Introduction; Universal and Adapted Vocabularies; MLE Training of the Universal Vocabulary; MAP Adaptation of Class Vocabularies; Bipartite Histograms; Computational Cost; Experimental Validation; Experimental Setup; Results; Conclusion

Adapted Vocabularies for Generic Visual Categorization

Florent Perronnin, Christopher Dance, Gabriela Csurka, and Marco Bressan
Xerox Research Centre Europe, 6, chemin de Maupertuis, 38240 Meylan, France
{Firstname.Lastname}@xrce.xerox.com

Abstract. Several state-of-the-art Generic Visual Categorization (GVC) systems are built around a vocabulary of visual terms and characterize images with one histogram of visual word counts. We propose a novel and practical approach to GVC based on a universal vocabulary, which describes the content of all the considered classes of images, and class vocabularies obtained through the adaptation of the universal vocabulary using class-specific data. An image is characterized by a set of histograms, one per class, where each histogram describes whether the image content is best modeled by the universal vocabulary or the corresponding class vocabulary. It is shown experimentally on three very different databases that this novel representation outperforms those approaches which characterize an image with a single histogram.

1 Introduction

Generic Visual Categorization (GVC) is the pattern classification problem which consists in assigning one or multiple labels to an image based on its semantic content. We emphasize the use of the word "generic" as the goal is to classify a wide variety of objects and scenes. GVC is a very challenging task, as one has to cope with variations in view, lighting and occlusion, and with typical object and scene variations.

Several state-of-the-art GVC systems [14, 1, 4, 9, 16] were inspired by the bag-of-words (BOW) approach to text categorization [13]. In the BOW representation, a text document is encoded as a histogram of the number of occurrences of each word.
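As a toy illustration of this encoding (the vocabulary and document below are invented, not from the paper), a BOW histogram simply counts how often each vocabulary word occurs:

```python
import numpy as np

def bow_histogram(tokens, vocabulary):
    """Count occurrences of each vocabulary word in a document."""
    index = {w: i for i, w in enumerate(vocabulary)}
    h = np.zeros(len(vocabulary), dtype=int)
    for t in tokens:
        if t in index:          # out-of-vocabulary tokens are ignored
            h[index[t]] += 1
    return h

vocab = ["cat", "dog", "eye", "tail"]
doc = "the cat licked its tail while the dog watched the cat".split()
print(bow_histogram(doc, vocab))  # -> [2 1 0 1]
```

The visual analogue replaces word tokens with local descriptors quantized against a learned vocabulary, as described next.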
Similarly, one can characterize an image by a histogram of visual word counts. The visual vocabulary provides a "mid-level" representation which helps to bridge the semantic gap between the low-level features extracted from an image and the high-level concepts to be categorized [1]. However, the main difference with text categorization is that there is no given visual vocabulary for the GVC problem: it has to be learned automatically from a training set.

[In: A. Leonardis, H. Bischof, and A. Pinz (Eds.): ECCV 2006, Part IV, LNCS 3954, pp. 464-475, 2006. © Springer-Verlag Berlin Heidelberg 2006]

To obtain the visual vocabulary, Sivic and Zisserman [14] and Csurka et al. [4] originally proposed to cluster the low-level features with the K-means algorithm, where each centroid corresponds to a visual word. To build a histogram, each feature vector is assigned to its closest centroid. Hsu and Chang [9] and Winn et al. [16] made use of the information bottleneck principle to obtain more discriminative vocabularies. Farquhar et al. also proposed a generative model, the Gaussian Mixture Model (GMM), to perform clustering [7]. In this case, a low-level feature is not assigned to one visual word but to all words probabilistically, resulting in a continuous histogram representation. They also proposed to build the vocabulary by training class-specific vocabularies and agglomerating them into a single vocabulary (see also the work of Leung and Malik [10] and Varma and Zisserman [15] on the related problem of texture classification). Although substantial improvements were obtained, we believe that this approach is impractical for a large number of classes C. Indeed, if N is the size of the class vocabularies, the size of the agglomerated vocabulary, and therefore of the histograms to be classified, will be C × N (cf.
the curse of dimensionality).

Our emphasis in this work is on developing a practical approach which scales with the number of classes. We define a universal vocabulary, which describes the visual content of all the considered classes, and class vocabularies, which are obtained through the adaptation of the universal vocabulary using class-specific data. While other approaches based on visual vocabularies characterize an image with a single histogram, in the proposed approach an image is represented by a set of histograms of size 2 × N, one per class. Each histogram describes whether an image is more suitably modeled by the universal vocabulary or the corresponding adapted vocabulary.

The remainder of this paper is organized as follows. In section 2, we motivate the use of a universal vocabulary and of adapted class vocabularies and describe the training of both types of vocabularies. In section 3, we show how to characterize an image by a set of histograms using these vocabularies. In section 4, we explain how to significantly reduce the computational cost of the proposed approach with a fast scoring procedure. In section 5, we show experimentally that the proposed representation outperforms those approaches which characterize an image with a single histogram. Finally, we draw conclusions.

2 Universal and Adapted Vocabularies

Let us first motivate the use of a universal vocabulary and of adapted class vocabularies with a simple two-class problem where cats have to be distinguished from dogs.

A universal vocabulary is supposed to represent the content of all possible images and it is therefore trained with data from all classes under consideration. Since cats and dogs have many similarities, cats' and dogs' low-level feature vectors are likely to cluster into similar visual words such as "eye", "ear" or "tail". Hence, a histogram representation based on such a vocabulary is not powerful enough to help distinguish between cats and dogs.
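To make the per-class 2 × N representation concrete, the following sketch (an illustration under assumed details, not the paper's exact procedure) merges an N-component universal GMM with the N-component class-adapted GMM and softly assigns every local feature to all 2N Gaussians; the particular normalization used here is an assumption:

```python
import numpy as np

def gaussian_pdf(x, mean, var):
    """Diagonal-covariance Gaussian density at a single point x."""
    d = x - mean
    return np.exp(-0.5 * np.sum(d * d / var, axis=-1)) / np.sqrt(
        np.prod(2.0 * np.pi * var))

def bipartite_histogram(features, universal, adapted):
    """2N-bin histogram for one class: accumulate, for every local feature,
    the posterior probability (responsibility) of each Gaussian in the
    merged universal+adapted model. Each mixture is a list of
    (weight, mean, variance) triples with N entries."""
    comps = list(universal) + list(adapted)
    h = np.zeros(len(comps))
    for x in features:
        lik = np.array([w * gaussian_pdf(x, m, v) for w, m, v in comps])
        h += lik / lik.sum()        # soft assignment over all 2N words
    return h / len(features)        # normalize to sum to 1
```

Bins dominated by the adapted half of the model indicate image content better explained by the class vocabulary than by the universal one.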
However, one can derive class vocabularies by adapting the universal vocabulary with class-specific data. The universal "eye" word is then likely to be specialized into "cat's eye" and "dog's eye", as depicted in Figure 1. Note that, although visual words are not guaranteed to be as meaningful as in the previous example, we believe that the combination of these universal and specific representations provides the necessary information to discriminate between classes.

As there exists a large body of work on the adaptation of GMMs, we represent a vocabulary of visual words by means of a GMM, as done in [7]. Let us denote

Fig. 1. The cats and dogs
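The adaptation equations themselves are not included in this excerpt. As an illustrative sketch only, the classical relevance-factor form of MAP mean adaptation for GMMs is shown below; the paper's exact update rule may differ, and the relevance factor `tau` is an assumed hyperparameter:

```python
import numpy as np

def map_adapt_means(features, weights, means, variances, tau=10.0):
    """MAP-adapt GMM means toward class-specific data.

    Classical relevance-factor update (not necessarily the paper's):
    new_mean_k = (n_k * E_k[x] + tau * prior_mean_k) / (n_k + tau),
    where n_k are soft counts and E_k[x] the posterior-weighted data mean.
    Mixtures use diagonal covariances; `variances` holds the diagonals."""
    K = len(weights)
    resp = np.zeros((len(features), K))
    for k in range(K):
        d = features - means[k]
        resp[:, k] = weights[k] * np.exp(
            -0.5 * np.sum(d * d / variances[k], axis=1)) / np.sqrt(
            np.prod(2.0 * np.pi * variances[k]))
    resp /= resp.sum(axis=1, keepdims=True)   # posteriors gamma_k(x_t)
    n = resp.sum(axis=0)                      # soft counts n_k
    ex = resp.T @ features / np.maximum(n, 1e-12)[:, None]  # E_k[x]
    return (n[:, None] * ex + tau * means) / (n[:, None] + tau)
```

Components with little class data (small n_k) stay close to their universal prior, while well-supported components shift toward the class-specific feature distribution, which is what specializes "eye" into "cat's eye".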