Recognizing pictures at an exhibition using SIFT
Rahul Choudhury
Biomedical Informatics Department, Stanford University
EE 368 Project Report

Abstract—To help visitors of art centers automatically identify an object of interest, the SIFT computer vision algorithm has been implemented in MATLAB. Salient aspects of the algorithm are discussed in light of image recognition applications, test methods used to validate the implementation are detailed, and a performance improvement for SIFT is proposed.

1. INTRODUCTION

Museums and art centers display objects of interest that carry important domain-specific information for visitors. Art exhibition centers usually provide a guide or a booklet that visitors must scan manually to find general information about an object of interest. However, this manual and passive method of information retrieval is tedious and interrupts the visitor's art-viewing experience. The goal of this project is to develop an automated, software-based technique that can recognize paintings on display in an art center from snapshots taken with a camera phone.

2. BACKGROUND

Comparing images in order to establish a degree of similarity is an important computer vision problem with applications in domains such as interactive museum guides, robot localization, content-based medical image retrieval, and image registration. It remains a challenging task because of variation in illumination conditions, partial occlusion of objects, differences in image orientation, and similar factors. Global image characteristics, such as color histograms or responses to filter banks, are usually not effective at solving real-life image matching problems. Researchers have therefore turned their attention to local features that are invariant to common image transformations and variations. Any local-feature-based image matching scheme typically involves two broad steps. The first is detecting features (also referred to as keypoints or interest points) in an image in a repeatable way.
Repeatability is important in this step because robust matching cannot be performed if the detected locations of keypoints on an object vary from image to image. The second step is computing a descriptor for each detected interest point. These descriptors are used to distinguish between keypoints. The goal is to design a highly distinctive descriptor for each interest point so as to facilitate meaningful matches, while simultaneously ensuring that a given interest point receives the same descriptor regardless of the object's pose, the lighting in the environment, and so on. Both the detection and description steps therefore rely on the invariance of various properties for effective image matching. Image matching techniques based on local features are not new in the computer vision field. Van Gool [5] introduced generalized color moments to describe the shape and the intensities of different color channels in a local region. Siggelkow [2] used feature histograms for content-based image retrieval. These methods have achieved relative success with 2D object extraction and image matching. Mikolajczyk and Schmid [3] used differential descriptors to approximate a point neighborhood for image matching and retrieval. Schaffalitzky and Zisserman [4] used the Euclidean distance between orthogonal complex filters to provide a lower bound on the sum of squared differences (SSD) between corresponding image patches. Lowe [1] proposed the Scale Invariant Feature Transform (SIFT), which is resilient to several common image transforms. Mikolajczyk and Schmid [6] reported an experimental evaluation of several descriptors and found that SIFT descriptors obtain the best matching results.
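The descriptor-matching step described above can be sketched as follows. This is a minimal illustration in Python (the report's own implementation is in MATLAB), using the squared Euclidean distance (SSD) together with the nearest-neighbour distance-ratio test from [1]; the toy descriptor values and the 0.8 ratio threshold are illustrative assumptions, not the report's parameters:

```python
import numpy as np

def match_descriptors(desc_a, desc_b, ratio=0.8):
    """Nearest-neighbour matching between two descriptor sets.

    desc_a: (N, D) array, desc_b: (M, D) array with M >= 2.
    A pair is accepted only when the best match is clearly better than the
    second-best candidate (the distance-ratio test proposed in [1]).
    Returns a list of (i, j) index pairs into desc_a and desc_b.
    """
    matches = []
    for i, d in enumerate(desc_a):
        # Squared Euclidean distance (SSD) to every descriptor in the other set
        dists = np.sum((desc_b - d) ** 2, axis=1)
        order = np.argsort(dists)
        nearest, second = order[0], order[1]
        if dists[nearest] < (ratio ** 2) * dists[second]:
            matches.append((i, int(nearest)))
    return matches

# Toy data: each row stands in for a (hypothetical) keypoint descriptor.
db = np.array([[0.0, 0.0], [10.0, 10.0], [5.0, 0.0]])
query = np.array([[0.1, 0.0], [10.0, 9.9]])
print(match_descriptors(query, db))  # [(0, 0), (1, 1)]
```

Note that a query descriptor that is nearly equidistant from two database descriptors is rejected rather than matched, which is exactly the ambiguity the ratio test is designed to filter out.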
3. SELECTION OF LOCAL FEATURES

The following requirements were key in selecting a suitable local feature for the images used in this project:

(a) Invariance: the feature should be resilient to changes in illumination, image noise, uniform scaling, rotation, and minor changes in viewing direction.
(b) Distinctiveness: the feature should allow correct object identification with a low probability of mismatch.
(c) Performance: given the nature of the image recognition problem for an art center, it should be relatively easy and fast to extract the features and compare them against a large database of local features.

4. BRIEF DESCRIPTION OF THE ALGORITHM

Based on the requirements mentioned in the previous section and the robustness reported in [6], I selected the Scale Invariant Feature Transform (SIFT) [1] for this project. SIFT is an approach for detecting and extracting local feature descriptors that are reasonably invariant to changes in illumination, image noise, rotation, scaling, and small changes in viewpoint. Before any multi-resolution transformation is performed via SIFT, the image is first converted to grayscale. A complete description of SIFT can be found in [1]; an overview of the algorithm is presented here. The algorithm has four major stages:

• Scale-space extrema detection: the first stage searches over scale space using a Difference of Gaussians function to identify potential interest points.
• Keypoint localization: the location and scale of each candidate point are determined, and keypoints are selected based on measures of stability.
• Orientation assignment: one or more orientations are assigned to each keypoint based on local image gradients.
• Keypoint descriptor: a descriptor is generated for each keypoint from local image gradient information at the scale found in the second stage.

Each of these stages is elaborated in the following sections.

A. Scale-space Extrema Detection

The SIFT feature algorithm is based upon finding locations (called keypoints) within the scale space of an image which can be reliably extracted.
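As a concrete illustration of the first stage, the following Python sketch (the report itself uses MATLAB) builds a small Difference-of-Gaussians stack and scans for local extrema across space and scale. The σ values, the contrast threshold, and the image size are illustrative assumptions and not the report's parameters, and a real implementation would use an octave pyramid with downsampling rather than a single stack:

```python
import numpy as np

def gaussian_blur(img, sigma):
    """Separable Gaussian blur built from two 1-D convolutions.

    Note: the kernel must be shorter than the image side, otherwise
    np.convolve(mode="same") returns the (longer) kernel length.
    """
    radius = int(3 * sigma) + 1
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-x ** 2 / (2 * sigma ** 2))
    kernel /= kernel.sum()
    rows = np.apply_along_axis(np.convolve, 1, img, kernel, mode="same")
    return np.apply_along_axis(np.convolve, 0, rows, kernel, mode="same")

def dog_extrema(img, sigmas=(1.0, 1.6, 2.56, 4.1), threshold=0.01):
    """Detect scale-space extrema in a Difference-of-Gaussians stack.

    Each interior sample is compared against its 26 neighbours in space
    and scale; returns candidate keypoints as (x, y, sigma) tuples.
    """
    blurred = [gaussian_blur(img.astype(float), s) for s in sigmas]
    dogs = np.stack([b2 - b1 for b1, b2 in zip(blurred, blurred[1:])])
    keypoints = []
    for s in range(1, dogs.shape[0] - 1):
        for y in range(1, dogs.shape[1] - 1):
            for x in range(1, dogs.shape[2] - 1):
                patch = dogs[s - 1:s + 2, y - 1:y + 2, x - 1:x + 2]
                v = dogs[s, y, x]
                # Keep only high-contrast local maxima or minima
                if abs(v) > threshold and (v == patch.max() or v == patch.min()):
                    keypoints.append((x, y, sigmas[s]))
    return keypoints
```

Run on a synthetic image containing a single Gaussian blob, this sketch reports an extremum at the blob's centre, at the DoG scale closest to the blob's own size; the second stage would then refine such candidates and discard low-contrast and edge responses.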

