Stanford EE 368 - IMAGE GUIDED TOURS
IMAGE-GUIDED TOURS: FAST-APPROXIMATED SIFT WITH U-SURF FEATURES

Eric Chu, Erin Hsu, Sandy Yu
Department of Electrical Engineering, Stanford University
{echu508, erinhsu, snowy}@stanford.edu

Abstract – In this paper, a feature-detecting algorithm that identifies paintings from mobile-phone camera images is presented. The algorithm is capable of identifying paintings from noisy camera-phone images. By training on a set of standard painting images, paintings can be correctly identified even under the typical blurring, rotation, and mild perspective distortion that occur in a controlled museum setting. With the calculation-intensive portions of the code written in C, the algorithm runs efficiently as well as accurately: all 99 test images were identified correctly, with an average runtime of 0.9228 seconds on SCIEN machines.

I. INTRODUCTION

The prevalence of cellular phones with built-in cameras has stimulated growing interest in image recognition on mobile phones. Various applications are already in use, such as visual and bar-code scanning technologies [1], image-based search engines [2], location-disclosing photo recognition [3], and many more that are currently in development.

The EE 368 final project this quarter applies image recognition on mobile phones in a museum setting. Given a set of training images, consisting of several perspectives of 33 paintings in a gallery, and an image to be matched (referred to as the test image), the name of the painting is to be returned. In a real-world scenario, such a system could replace the numbered audio commentaries prevalent in many museums today and give visitors additional information via the Internet or another source. Each training image contains one painting without its frame plus some background, but seeking out the painting is not a difficult task, since the background is a controlled white environment.
Cell-phone camera images tend to be poorer than those from regular cameras, due to low-quality lenses, an increased likelihood of camera shake, and the lack of a flash on many models. Noise, motion blur, low contrast, and poor lighting often plague these images. For this project, the museum provides consistent lighting in each image, so the main difficulties are noise, blur, perspective, and scaling.

Several algorithms have been established for feature detection, including the Scale-Invariant Feature Transform (SIFT) [4][5][6] and Speeded Up Robust Features (SURF) [7]. Both algorithms build an image pyramid, filter the pyramid to detect points of interest, quantify the features, and then compare them with the features of another image to find matches. The algorithm described here is based on this same structure but deviates from both SIFT and SURF in different areas.

II. ALGORITHM DESIGN

Figure 1: Algorithm flowchart. Stages: Pre-Processing (RGB to grayscale; 8:1 down-sampling) → Feature Detection (difference-of-means pyramid; local-maxima detector) → Feature Descriptor (Haar coefficients; Upright SURF-64) → Feature Comparison (Euclidean distance; ratio-threshold requirement).

The algorithm applied to this project was developed by studying and experimenting with the SIFT and SURF processes as well as other topics from the course material. After much testing, the final algorithm uses the combination of methods best suited to matching paintings in a relatively consistent environment, while also optimizing for speed. The major steps of the algorithm are pre-processing the test images, finding relevant features, quantifying features with local descriptors, and comparing these features with the training paintings to correctly identify the test images, as shown in Figure 1.

A. Pre-Processing

The first step in matching two images is often to balance the color, brightness, or contrast.
Color balancing was not implemented because of the well-controlled lighting in the environment, and also because the selected feature descriptor is invariant to changes in intensity. Because of the standard museum setting and lighting, changes in brightness between images were not significant, so brightness balancing was also omitted. Finally, contrast balancing was applied at one point but did not considerably affect accuracy. Therefore, the pre-processing in this algorithm consists only of converting the RGB images to intensity (grayscale) images and an 8:1 down-sampling to speed up the algorithm.

Color information was discarded because of its unreliability in most computer-vision settings; while the LAB colorspace could have been used to simulate the color constancy of human vision, the increased runtime was unappealing, and the same task could be carried out on the intensity image alone. Because the input images are of extremely high resolution, down-sampling decreases the number of pixels used to construct the image pyramids and thereby provides a significant decrease in processing time. Prior to down-sampling, a low-pass averaging filter was applied to the grayscale image in order to prevent aliasing when sampling, as demonstrated in Figure 2. This filter was chosen to precisely zero out the zone-plate frequencies at π/8 and significantly attenuate higher frequencies.

Figure 2: The image on the left shows the aliasing that occurs when down-sampling 8:1 with no filter. The image on the right is the result with the filter applied.

B. Feature Detection

Once the images have been pre-processed, the next step is to find relevant features with which to match paintings. These features are often corners or edges in the image, but they should be scale-invariant and stable, meaning that they should be detected consistently even in a different setting.
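The pre-processing stage described above can be sketched in a few lines of NumPy. This is an illustrative reconstruction, not the authors' C code: the function names are ours, and the 16-tap filter length is an assumption, chosen because a 16-tap moving average has its first spectral null at exactly π/8, matching the stated design goal for 8:1 decimation.

```python
import numpy as np

def rgb_to_gray(rgb):
    """ITU-R BT.601 luma weights; `rgb` is an H x W x 3 float array."""
    return rgb @ np.array([0.299, 0.587, 0.114])

def box_filter_decimate(img, factor=8, taps=16):
    """Separable moving-average low-pass, then keep every `factor`-th pixel.

    A 16-tap box average has spectral nulls at multiples of pi/8, so it
    zeroes the frequencies that would alias under 8:1 decimation.
    """
    k = np.ones(taps) / taps
    # filter rows, then columns (the box filter is separable)
    rows = np.apply_along_axis(np.convolve, 1, img, k, mode="same")
    out = np.apply_along_axis(np.convolve, 0, rows, k, mode="same")
    return out[::factor, ::factor]
```

Applying the filter before decimation is what removes the zone-plate artifacts shown in Figure 2; decimating first would fold the energy above π/8 back into the pass band.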
Because the Harris corner detector is not scale-invariant, it was not a good candidate for feature detection. Instead, SIFT and SURF features, which are scale-invariant, were chosen as the features of interest. The SIFT and SURF algorithms detect features in slightly different ways. SIFT builds an image pyramid, filtering each
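The difference-of-means pyramid and local-maxima detector named in Figure 1 could be sketched as below. This is a minimal sketch under stated assumptions, not the authors' implementation: the window sizes and threshold are invented for illustration, and SciPy's `uniform_filter` stands in for the box (mean) filter.

```python
import numpy as np
from scipy import ndimage

def dom_keypoints(img, sizes=(4, 8, 16, 32), thresh=0.05):
    """Detect keypoints as local maxima of a difference-of-means pyramid.

    Each pyramid level is the difference between the image mean-filtered
    at two adjacent window sizes -- a box-filter approximation of the
    difference-of-Gaussians used by SIFT.
    """
    blurred = [ndimage.uniform_filter(img, size=s) for s in sizes]
    doms = [b1 - b2 for b1, b2 in zip(blurred, blurred[1:])]
    keypoints = []
    for level, d in enumerate(doms):
        # a pixel is a candidate if it equals the maximum of its 3x3
        # neighbourhood and its response exceeds the threshold
        peaks = (d == ndimage.maximum_filter(d, size=3)) & (d > thresh)
        ys, xs = np.nonzero(peaks)
        keypoints += [(x, y, level) for x, y in zip(xs, ys)]
    return keypoints
```

Replacing Gaussian smoothing with a running mean is what makes this "fast-approximated": a box filter can be computed in constant time per pixel with integral images, independent of window size.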