Stanford EE 368 - Visual Code Marker Detection

Visual Code Marker Detection
Kapil Rai, Student, EE368, Stanford University

Contents: Introduction; Section II; Section III: Theory of operation; Data Base Creation II: From Real Images; Coordinate Transformation I; Coordinate Transformation II; References

Abstract
In this project report, a scheme for Visual Code Marker (VCM) detection in cell-phone camera images is outlined.

Keywords
Visual Code Marker, Eigen Image, Gaussian pyramid, Linear discriminant analysis

INTRODUCTION
This report consists of five sections. Section II outlines the issues associated with VCM detection and standard approaches. Section III discusses the theory of operation and describes the algorithmic steps in detail. Section IV discusses the implementation details. In Section V, results are presented along with conclusions.

Section II
Visual Code Markers have a distinctive pattern for classification. With three corners and two guide bars, these markers are distinctively polarized and are well suited for detection. Since a VCM is always superimposed on another object (the background), a classification method must carefully eliminate any collinear and repetitive patterns in the background. Often the background consists of a combination of text and pictures, or simply a map under varying illumination. In addition, the relative size of the VCM (zoom), its orientation, and the perspective place considerable demands on the image pre-processing and registration methods that must precede the training or recognition steps.

Section III: Theory of operation
The training images are in 480x640 (JPEG) format. The preprocessing steps are shown in Figure 1. First, a grayscale image is generated. Since the VCMs are distinctive gray-level patterns, color segmentation is not of direct value; however, any background information (noise, clutter) with prominent color components can be minimized by eliminating the color information. The RGB images were converted to the HSV space, and the saturation axis was used to identify edges; on this axis the VCM markers stand out. It must be clarified, however, that all that is obtained from this step is a measure towards the region-of-interest classification. Let the regions identified, together with their associated properties, be called the feature set F1.

Histogram equalization is performed on the grayscale image to restrict any excessive illumination and to balance out the luminance (or lack of it). The equalized grayscale image is then used for image-pyramid creation: it is arranged into a Gaussian and a Laplacian pyramid to facilitate multiresolution processing.

Repetitive pattern elimination. The recognition process entails two key steps. Another feature set is generated based on an edge-detection method: the Laplacian-of-Gaussian operator is used to detect edges in the grayscale image. Regions are then connected using morphological operations; a square structuring element was used for the close and open operations. Region labeling is then applied to identify the connected regions, which are further classified on the basis of the following properties (a rough sketch of this filtering stage follows the list):

1. Area: regions exceeding a certain area threshold are retained.
2. Perimeter: regions exceeding a certain perimeter are kept.
3. Perimeter correlation: after matched filtering with the perimeter pattern, a correlation threshold must be exceeded for the region to be classified.
4. Euler number: this is a measure of the sub-objects in a region; since a large Euler number points to several small objects inside, it can be used together with the other features to eliminate large areas.
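As a rough illustration of this region-filtering stage, the MATLAB sketch below assumes an equalized grayscale image gray_eq and uses placeholder thresholds (minArea, minPerim, and maxEuler are not values from the report); the perimeter-correlation test against the marker's perimeter pattern is omitted, and the square structuring-element size is likewise an assumption.

    % Sketch of the LoG edge-detection and region-filtering stage (thresholds are placeholders).
    bw = edge(gray_eq, 'log');                 % Laplacian-of-Gaussian edge map
    se = strel('square', 5);                   % square structuring element (size assumed)
    bw = imopen(imclose(bw, se), se);          % close, then open, to connect nearby edges
    [labels, n] = bwlabel(bw);                 % label the connected regions
    props = regionprops(labels, 'Area', 'Perimeter', 'Centroid', ...
                        'BoundingBox', 'Orientation');

    minArea  = 500;                            % placeholder area threshold
    minPerim = 80;                             % placeholder perimeter threshold
    maxEuler = 4;                              % placeholder Euler-number threshold
    keep = false(1, n);
    for k = 1:n
        bb = props(k).BoundingBox;             % [x y width height]
        r1 = max(1, floor(bb(2)));  r2 = min(size(bw, 1), ceil(bb(2) + bb(4)));
        c1 = max(1, floor(bb(1)));  c2 = min(size(bw, 2), ceil(bb(1) + bb(3)));
        crop = bw(r1:r2, c1:c2);
        keep(k) = props(k).Area > minArea && ...
                  props(k).Perimeter > minPerim && ...
                  bweuler(crop) < maxEuler;    % many small sub-objects raise the Euler count
    end
    candidates = props(keep);                  % regions retained for feature set F2

Here the Euler number is evaluated on the bounding-box crop with bweuler, so that many small objects inside a candidate region push the count up, matching the rationale given in item 4 of the list.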
Using the region properties, such as the centroid and the bounding box around the centroid of every classified region of interest, the relevant pixels in the original grayscale image are first masked out. This yields a feature set F2. The set union F of the two sets F1 and F2 is used to drive the detection process.

Using a local elliptic fit on every region in the set F, the local orientation of the region with respect to the world axis is obtained. A method to adaptively compute the affine transformation was also attempted; however, it was abandoned for lack of time.

After correcting for the rotation given by the orientation of the region, a normalized correlation is performed with each image in the Eigen Image set, as well as with the Eigen Non-Image set, at every eigen-image resolution. Based on a detection threshold, the region is classified as belonging to the Object set or the Non-Object set.

The objects identified, together with their coordinates and bounding boxes, are then passed to the next level of the pyramid. A similar search is conducted for the corresponding regions, and the maximum correlation is computed against the various resolutions of the eigen images. This procedure is repeated for all levels of the pyramid. If, at the end of this procedure, no region has passed the detection test, it is concluded that there are no objects in the image. Conversely, concordant detections at every pyramid level are required to classify an object.

VCM Decoding: Following the detection step, the right resolution (pyramid level) is first determined; this allows various zoom levels in the original image to be handled. The criterion for the correct resolution level is

    [I, J] = argmax over (i, j) of Correlation(i, j),

where i indexes the pyramid levels and j indexes the eigen-image resolution levels (a rough sketch of this search is given at the end of this document). Using the combination (I, J), the guide bars are first localized using correlation on the boundary strip. By specifying some points on these bars, the coefficients of the affine transformation are deduced. The corner points are then localized. The last remaining task is made easier by reducing the marker to its native 11x11 format and extracting the 83 bits in the data fields of the VCM.

Implementation details
In this section, the various steps used to implement the above algorithm are described. First, the method used for creating the database is outlined. This is done in two different ways: synthetically and from real data. The synthetic database provides the initial training sequence and is eventually to be updated with real images and the discards, as appropriate.

Data Base Creation-I: Synthetic
Given that a VCM is an 11x11 matrix carrying an 83-element code, a large sample of 10,000 such matrices was created using a random number generator and the VCM generation script 'generate_code.m'. These binary images were used to compute the …
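The preview is cut off at this point; judging from the keywords and the detection step in Section III, the binary marker images are presumably used to compute an eigen-image basis. A minimal sketch of such a computation follows, assuming a 10000x121 matrix codes holding one vectorized 11x11 marker per row (the markers themselves would come from the course-provided generate_code.m, whose interface is not reproduced here) and an assumed number of retained eigen images.

    % Sketch: eigen images (PCA basis) from vectorized 11x11 synthetic markers.
    % 'codes' is assumed to be 10000x121, one marker per row, values in {0,1}.
    numEig    = 20;                                 % number of eigen images kept (assumed)
    meanCode  = mean(codes, 1);                     % mean marker, 1x121
    X         = codes - repmat(meanCode, size(codes, 1), 1);
    [~, ~, V] = svd(X, 'econ');                     % columns of V are principal directions
    eigImages = reshape(V(:, 1:numEig), 11, 11, numEig);   % stack of 11x11 eigen images
    meanImage = reshape(meanCode, 11, 11);          % mean image, kept for reference

A candidate region would then be reduced to 11x11 and correlated against (or projected onto) eigImages, which is roughly what the detection sketch below does.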

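Finally, a rough sketch of the rotation-corrected, multiresolution correlation test described in Section III. Everything here is an assumption layered on the report's description: pyr is taken to be a cell array of Gaussian-pyramid levels, region is one entry of the filtered region set F (with regionprops-style Orientation and BoundingBox fields), eigImages comes from the previous sketch, the Eigen Non-Image set is omitted, and detThresh is a placeholder.

    % Sketch: classify one candidate region by its best eigen-image correlation
    % over all pyramid levels (i) and eigen images (j).
    detThresh = 0.7;                                % placeholder detection threshold
    bestCorr  = -Inf;  bestI = 0;  bestJ = 0;
    for i = 1:numel(pyr)                            % i: pyramid level
        bb  = region.BoundingBox / 2^(i - 1);       % scale the box to this level
        lvl = im2double(pyr{i});
        r1 = max(1, floor(bb(2)));  r2 = min(size(lvl, 1), ceil(bb(2) + bb(4)));
        c1 = max(1, floor(bb(1)));  c2 = min(size(lvl, 2), ceil(bb(1) + bb(3)));
        patch = imrotate(lvl(r1:r2, c1:c2), -region.Orientation, 'bilinear', 'crop');
        patch = imresize(patch, [11 11]);           % bring the patch to the native marker size
        for j = 1:size(eigImages, 3)                % j: eigen image index
            c = corr2(eigImages(:, :, j), patch);   % normalized 2-D correlation coefficient
            if c > bestCorr
                bestCorr = c;  bestI = i;  bestJ = j;
            end
        end
    end
    isObject = bestCorr > detThresh;                % Object vs. Non-Object decision
    % bestI and bestJ play the role of [I, J] in the decoding criterion of Section III.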
