Stanford EE 368 - Lecture Notes



Recognition of Paintings at an Exhibition

Sara Bolouki
EE368 course project, spring 2007

Abstract – This report presents an automatic method for recognizing paintings in a museum. The method classifies a given input picture among 33 classes of paintings and outputs the name of the painting. The proposed method can serve as an interactive museum guide that provides visitors with more information about each painting.

I. Introduction

Recognition is considered one of the most important applications of image processing and computer vision. The human visual system performs this task outstandingly well in everyday life. For computer vision, however, there are challenges to overcome: differences in camera characteristics, changes in environmental lighting, affine transformations, noise, and changes in point of view. Any proposed recognition method should therefore be as robust as possible to these variations. In this project a robust recognition method is proposed, implemented, and evaluated on a set of 33 classes of paintings.

The rest of the report is organized as follows: Section 2 presents an overview of the algorithm. Section 3 describes the feature detection algorithm, while Section 4 contains details about generating descriptors. Section 5 presents the evaluation methodology and results, and Section 6 concludes the report.

II. Algorithm Overview

The proposed algorithm consists of four stages: preprocessing, feature detection, descriptor calculation, and matching. These stages are shown in Figure 1. Since the images are taken with a mobile phone camera under the low-light conditions of the exhibition, they suffer from a low signal-to-noise ratio.
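The preprocessing stage (grayscale conversion, low-pass filtering, and down-sampling, detailed in the next paragraph) can be sketched as follows. The filter size and sigma, and the reading of "one fourth of the original size" as half per axis (a quarter of the pixels), are assumptions; the report does not give exact parameters.

```python
import numpy as np

def preprocess(img_rgb, ksize=5, sigma=1.0):
    """Grayscale conversion, Gaussian low-pass filtering, down-sampling.

    ksize/sigma and the down-sampling factor (2 per axis) are
    illustrative assumptions, not values from the report.
    """
    # Luminance-weighted grayscale conversion.
    gray = img_rgb.astype(float) @ np.array([0.299, 0.587, 0.114])
    # Separable Gaussian low-pass filter to suppress camera noise.
    ax = np.arange(ksize) - ksize // 2
    g = np.exp(-(ax ** 2) / (2 * sigma ** 2))
    g /= g.sum()
    rows = np.apply_along_axis(np.convolve, 1, gray, g, mode="same")
    blurred = np.apply_along_axis(np.convolve, 0, rows, g, mode="same")
    # Down-sample by dropping every other row and column.
    return blurred[::2, ::2]
```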
In order to remove this noise, images are first converted to a gray-level representation and then passed through a low-pass filter. Since the resolution of the input image is 2048 x 1536, it is first down-sampled to one fourth of its original size to speed up processing. All these steps are shown as preprocessing in Figure 1. In the next step, feature detection, all feature points of the preprocessed image are extracted. In the employed algorithm, feature points are defined as corners, since corners remain stable under changes of scale and lighting conditions as well as affine transformations. In this project the Harris method [1] is used for detecting corners. The third step of our recognition algorithm is calculating a descriptor for each feature point; we use the SURF [2] algorithm to describe each feature. Finally, the last step of the recognition procedure is to match the descriptors of the given image against the training data and make the decision: the class with the highest number of matches is taken as the final match.

Figure 1. Algorithm overview: painting → preprocessing → feature detection → descriptor calculation → matching of training and query descriptors → name of the painting.

III. Feature Detection

As mentioned earlier, we consider corners as robust features in the recognition process. According to Noble [3] and Harris [1], the normalized value of the determinant of the matrix shown in Equation 1 describes the corner strength at each pixel.

$$M(x,y) = \begin{bmatrix} \left(\dfrac{\partial I(x,y)}{\partial x}\right)^{2} & \dfrac{\partial I(x,y)}{\partial x}\,\dfrac{\partial I(x,y)}{\partial y} \\[2ex] \dfrac{\partial I(x,y)}{\partial x}\,\dfrac{\partial I(x,y)}{\partial y} & \left(\dfrac{\partial I(x,y)}{\partial y}\right)^{2} \end{bmatrix} \qquad \text{(Equation 1)}$$

The gradient at each pixel is approximated with a Prewitt filter. For noise resilience, the gradients are smoothed by a Gaussian filter; I in Equation 1 denotes the smoothed version of the image. The local maxima of the determinant of M are taken as the corner points of the image.
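A minimal sketch of the corner detector just described: Prewitt gradients, Gaussian-smoothed gradient products, the normalized determinant det(M)/(trace(M) + eps) as corner strength (Noble's measure), and local maxima found by gray-level dilation plus a threshold, as in the report. The sigma and threshold values are illustrative assumptions.

```python
import numpy as np

def conv2(a, k):
    """2-D correlation with zero padding (adequate for a sketch)."""
    kh, kw = k.shape
    p = np.pad(a, ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    out = np.zeros(a.shape)
    for i in range(kh):
        for j in range(kw):
            out += k[i, j] * p[i:i + a.shape[0], j:j + a.shape[1]]
    return out

def harris_corners(img, sigma=1.5, rel_thresh=0.01):
    """Return (row, col) corner coordinates of `img`.

    sigma and rel_thresh are assumed values, not from the report.
    """
    # Prewitt gradient approximation.
    prewitt_x = np.array([[1.0, 0.0, -1.0]] * 3) / 3.0
    ix = conv2(img, prewitt_x)
    iy = conv2(img, prewitt_x.T)
    # Gaussian window for smoothing the gradient products.
    n = 2 * int(3 * sigma) + 1
    ax = np.arange(n) - n // 2
    g = np.exp(-(ax ** 2) / (2 * sigma ** 2))
    g /= g.sum()
    win = np.outer(g, g)
    ixx = conv2(ix * ix, win)
    iyy = conv2(iy * iy, win)
    ixy = conv2(ix * iy, win)
    # Normalized determinant of M (Equation 1) -- Noble's corner measure.
    strength = (ixx * iyy - ixy ** 2) / (ixx + iyy + 1e-12)
    # Gray-level dilation = 3x3 max filter; a pixel equal to its
    # dilated value is a local maximum of the corner strength.
    pad = np.pad(strength, 1, constant_values=-np.inf)
    dilated = np.max([pad[i:i + img.shape[0], j:j + img.shape[1]]
                      for i in range(3) for j in range(3)], axis=0)
    mask = (strength == dilated) & (strength > rel_thresh * strength.max())
    return np.argwhere(mask)
```

On a synthetic image containing a bright axis-aligned square, the detections cluster at the four square corners, while straight edges (where det(M) is near zero) are suppressed.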
Considering the normalized determinant of M as a gray-level image, we can use gray-level dilation to find the local maxima within a given neighborhood. The dilated image is then thresholded, and the points larger than the threshold are considered corners. Figure 2 shows the detected corners on two sample images of the same painting at different scales; the first image is also slightly rotated relative to the second. However, as demonstrated in Figure 2, the detected feature points are consistent.

Figure 2(a). Corner detection results at the smaller scale, slightly rotated.
Figure 2(b). Corner detection results at the larger scale.

In order to be independent of changes in the scale of the input, feature detection is performed at two different scales. The second scale of the input image is computed using the Haar wavelet, with the LL sub-band taken as the second scale.

IV. Descriptor Generation

The next step of the algorithm is to calculate a descriptor for each feature point. The descriptor should be invariant to scale and orientation. We use a simplified version of the SURF [2] descriptor to characterize each feature point. Since in this specific case the method is applied to paintings mounted on a wall, there is little variation in orientation; therefore the dominant orientation is not used to orient the window in which the SURF descriptor is calculated. As described in [2], ∑dx, ∑|dx|, ∑dy, and ∑|dy| are computed in each 4x4 sub-window of the window around the feature point. This descriptor represents the intensity pattern [2] around each feature point by accumulating gradients over small regions. As described before, feature points are detected at two different scales, and the window size differs per scale.
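A sketch of this simplified, upright SURF-style descriptor, using the two-scale window sizes the report specifies (a 64x64 window sub-sampled by two for first-scale points, a 32x32 window for second-scale points). With 4x4-pixel sub-windows this yields (32/4)² x 4 = 256 values, matching the descriptor length the report states.

```python
import numpy as np

def conv2(a, k):
    """2-D correlation with zero padding."""
    kh, kw = k.shape
    p = np.pad(a, ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    out = np.zeros(a.shape)
    for i in range(kh):
        for j in range(kw):
            out += k[i, j] * p[i:i + a.shape[0], j:j + a.shape[1]]
    return out

def surf_like_descriptor(img, y, x, win=32, step=1, sub=4):
    """Upright SURF-style descriptor around feature point (y, x).

    First-scale points: win=64, step=2 (64x64 window sub-sampled by 2).
    Second-scale points: win=32, step=1. Both give a 32x32 patch and a
    256-element descriptor. The point is assumed to lie at least win/2
    pixels from the image border (border handling is not in the preview).
    """
    h = win // 2
    patch = img[y - h:y + h:step, x - h:x + h:step]
    # Sobel gradients, as used in the report.
    sobel = np.array([[1.0, 0.0, -1.0], [2.0, 0.0, -2.0], [1.0, 0.0, -1.0]])
    dx = conv2(patch, sobel)
    dy = conv2(patch, sobel.T)
    n = patch.shape[0] // sub
    desc = []
    for by in range(n):
        for bx in range(n):
            sx = dx[by * sub:(by + 1) * sub, bx * sub:(bx + 1) * sub]
            sy = dy[by * sub:(by + 1) * sub, bx * sub:(bx + 1) * sub]
            # (sum dx, sum |dx|, sum dy, sum |dy|) per sub-window.
            desc += [sx.sum(), np.abs(sx).sum(), sy.sum(), np.abs(sy).sum()]
    return np.array(desc)
```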
If a feature point is detected at the first scale, the window size is 64x64; if it is detected at the second scale, the descriptor is calculated in a 32x32 window. Since we want an equal number of samples at each scale, the 64x64 window of the first scale is sub-sampled by a factor of two. The resulting descriptor has 256 elements, and a Sobel filter is used to calculate the gradients in each window.

V. Matching

The final step of the algorithm is to match the descriptors of the query image against the descriptors of the training data and find the class with the highest number of matches. To find the distance between the descriptor
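The match-and-vote step can be sketched as follows. The preview cuts off before the report names its distance measure, so Euclidean distance is assumed here; each query descriptor votes for the class of its nearest training descriptor, and the class with the most votes is the recognized painting.

```python
import numpy as np

def classify(query_descs, train_descs, train_labels, n_classes):
    """Nearest-neighbor matching with per-class voting.

    Euclidean distance is an assumption; the preview ends before the
    report specifies the measure it uses.
    """
    votes = np.zeros(n_classes, dtype=int)
    for q in query_descs:
        # Distance from this query descriptor to every training descriptor.
        dists = np.linalg.norm(train_descs - q, axis=1)
        # Vote for the class of the closest training descriptor.
        votes[train_labels[np.argmin(dists)]] += 1
    return int(np.argmax(votes)), votes
```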

