Learning an Image Manifold for Retrieval

Xiaofei He*, Wei-Ying Ma, and Hong-Jiang Zhang
Microsoft Research Asia
Beijing, China, 100080
{wyma,hjzhang}@microsoft.com

*Department of Computer Science, The University of Chicago
[email protected]

ABSTRACT
We consider the problem of learning a mapping function from low-level feature space to high-level semantic space. Under the assumption that the data lie on a submanifold embedded in a high-dimensional Euclidean space, we propose a relevance feedback scheme which is naturally conducted only on the image manifold in question rather than the total ambient space. While images are typically represented by feature vectors in R^n, the natural distance is often different from the distance induced by the ambient space R^n. The geodesic distances on the manifold are used to measure the similarities between images. However, when the number of data points is small, it is hard to discover the intrinsic manifold structure. Based on user interactions in a relevance-feedback-driven query-by-example system, the intrinsic similarities between images can be accurately estimated. We then develop an algorithmic framework to approximate the optimal mapping function by a Radial Basis Function (RBF) neural network. The semantics of a new image can be inferred by the RBF neural network. Experimental results show that our approach is effective in improving the performance of content-based image retrieval systems.

Categories and Subject Descriptors
H.3.1 [Information Storage and Retrieval]: Content Analysis and Indexing – Algorithms, Indexing methods.

General Terms
Algorithms, Management, Performance, Experimentation.

Keywords
Image Retrieval, Semantic Space, Manifold Learning, Dimensionality Reduction, Riemannian Structure

1. INTRODUCTION
Content-Based Image Retrieval (CBIR) [3][9][12][14][21] is a long-standing research problem in computer vision and information retrieval.
Most previous image retrieval techniques build on the assumption that the image space is Euclidean. However, in many cases, the image space might be a non-linear sub-manifold which is embedded in the ambient space. Intrinsically, there are two fundamental problems in image retrieval: 1) How do we represent an image? 2) How do we judge similarity?

One possible solution to these two problems is to learn a mapping function from the low-level feature space to the high-level semantic space. The former is not always consistent with human perception, while the latter is what an image retrieval system needs. Specifically, if two images are semantically similar, then they are close to each other in semantic space. In this paper, our approach is to recover semantic structures hidden in the image feature space (color, texture, etc.).

In recent years, much has been written about relevance feedback in content-based image retrieval from the perspective of machine learning [16][17][18][19][20], yet most learning methods take into account only the current query session, and the knowledge obtained from past user interactions with the system is forgotten. To compare the effects of different learning techniques, a useful distinction can be made between short-term learning within a single query session and long-term learning over the course of many query sessions [6]. Both short- and long-term learning processes are necessary for an image retrieval system, though the former has been the primary focus of research so far. We present a long-term learning method which learns a radial basis function neural network for mapping the low-level image features to high-level semantic features, based on user interactions in a relevance-feedback-driven query-by-example system. As we point out, the choice of the similarity measure is a deep question that lies at the core of image retrieval.
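
As a rough illustration of what a radial basis function network mapping low-level features to semantic features might look like, the following minimal sketch fits a Gaussian RBF network by exact interpolation. It is not the training procedure of this paper: the class name `RBFNetwork`, the helper `solve`, the width `sigma`, the choice of one center per training image, and the toy feature and semantic vectors are all assumptions made for illustration.

```python
import math


def gaussian(x, c, sigma):
    """Gaussian radial basis function centered at c with width sigma."""
    sq = sum((a - b) ** 2 for a, b in zip(x, c))
    return math.exp(-sq / (2.0 * sigma ** 2))


def solve(A, B):
    """Solve A W = B by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [list(A[i]) + list(B[i]) for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0.0:
                f = M[r][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    return [row[n:] for row in M]


class RBFNetwork:
    """Hypothetical RBF network: Gaussian hidden layer, linear output layer."""

    def __init__(self, centers, sigma):
        self.centers = centers
        self.sigma = sigma
        self.weights = None  # one row per center, one column per output dim

    def hidden(self, x):
        return [gaussian(x, c, self.sigma) for c in self.centers]

    def fit(self, X, Y):
        # Simplifying assumption: one center per training point, so the
        # hidden-layer matrix is square and can be inverted exactly.
        if len(X) != len(self.centers):
            raise ValueError("this sketch needs one center per training point")
        Phi = [self.hidden(x) for x in X]
        self.weights = solve(Phi, Y)

    def predict(self, x):
        h = self.hidden(x)
        dims = len(self.weights[0])
        return [sum(h[i] * self.weights[i][d] for i in range(len(h)))
                for d in range(dims)]


# Toy data (made up): 2-D low-level features and 2-D semantic coordinates.
X = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
Y = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]

net = RBFNetwork(centers=X, sigma=0.5)
net.fit(X, Y)
print(net.predict((0.1, 0.5)))  # semantic estimate for an unseen image
```

Exact interpolation keeps the sketch short; with many images one would more likely use fewer centers than training points and fit the output weights by least squares.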

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
MM'04, October 10–16, 2004, New York, New York, USA.
Copyright 2004 ACM 1-58113-893-8/04/0010…$5.00.

This work was done while Xiaofei He was a summer intern at Microsoft Research Asia.

In recent years, manifold learning [1][4][11][13][15] has received much attention and has been applied to face recognition [7], graphics [10], document representation [5], etc. These research efforts show that manifold structure is more powerful than Euclidean structure for data representation, even though there is no convincing evidence that such manifold structure is accurately present. Based on the assumption that the images reside on a low-dimensional submanifold, a geometrically motivated relevance feedback scheme is proposed for image ranking, which is naturally conducted only on the image manifold in question rather than the total ambient space.

It is worthwhile to highlight several aspects of the framework of analysis presented here:
(1) Throughout this paper, we denote by image space the set of all the images. Unlike most previous geometry-based work, which assumes that the image space is a Euclidean space [8][12], in this paper we make a much weaker assumption that the image space is a Riemannian manifold embedded in the feature space. In particular, we call it the image manifold. Generally, the image manifold has a lower dimensionality than the feature space. The metric structure of the image manifold is induced by, but different from, the metric structure of the feature space.
Thus, a new algorithm for image retrieval which takes into account the intrinsic metric structure of the image manifold is needed.
(2) Given enough images, it is possible to recover the image manifold. However, if the number of images is too small, then it is hard for any algorithm to discover the intrinsic metric structure of the image manifold. Fortunately, in image retrieval, we can make use of user-provided information to learn a semantic space that is locally isometric to the image manifold. This semantic space is Euclidean and hence the geodesic distances on the image manifold can be approximated by the Euclidean
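
The difference between the ambient Euclidean distance and the geodesic distance on a manifold, discussed in the aspects above, can be made concrete with a small sketch. It approximates geodesic distances by shortest paths on a k-nearest-neighbor graph, a common graph-based technique that is swapped in here for illustration and is not necessarily the procedure used in this paper. The function names, the semicircle toy data, and the choice k = 2 are all assumptions.

```python
import heapq
import math


def euclidean(p, q):
    """Distance induced by the ambient space."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def knn_graph(points, k):
    """Symmetric k-nearest-neighbor graph with Euclidean edge weights."""
    n = len(points)
    graph = {i: {} for i in range(n)}
    for i in range(n):
        dists = sorted((euclidean(points[i], points[j]), j)
                       for j in range(n) if j != i)
        for d, j in dists[:k]:
            graph[i][j] = d
            graph[j][i] = d
    return graph


def graph_geodesic(graph, source, target):
    """Dijkstra shortest-path length; math.inf if target is unreachable."""
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            return d
        if d > best.get(u, math.inf):
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < best.get(v, math.inf):
                best[v] = nd
                heapq.heappush(heap, (nd, v))
    return math.inf


# Toy "manifold" (made up): 50 points on a unit semicircle embedded in R^2.
n = 50
points = [(math.cos(math.pi * i / (n - 1)), math.sin(math.pi * i / (n - 1)))
          for i in range(n)]
graph = knn_graph(points, k=2)

ambient = euclidean(points[0], points[-1])   # straight line through the ambient space
geodesic = graph_geodesic(graph, 0, n - 1)   # path that stays near the curve
print(f"Euclidean: {ambient:.3f}, graph geodesic: {geodesic:.3f}")
```

For the two endpoints of the semicircle, the ambient distance is the diameter (about 2), while the graph path follows the arc and comes out slightly below pi. With few sample points the neighbor graph becomes a poor proxy for the curve, which is the small-sample difficulty noted in aspect (2).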

