Visual Code Marker Detection
Daniel Blatnik, Abheek Banerjee
EE 368, Spring 2006

Abstract—This report discusses the algorithm we implemented to identify and read data from visual code markers. In addition to outlining the algorithm steps, we summarize the results our algorithm achieved on a set of training images.

I. INTRODUCTION

The problem we address is as follows. Given a JPEG image scattered with visual code markers (Fig. 1), we wish to determine the coordinates of the center of the upper-left square of each marker, as well as the bits encoded by each marker. Our strategy was to identify possible guide bars through region labeling, followed by a series of checks on each pair of regions. To keep the execution time as short as possible while limiting false positives and negatives, we tried to strike a balance between robustness and efficiency. After finding a pair of guide bars, we identify the four corners of the code marker using data extracted from the Radon transform of the guide bar regions. Finally, we apply a projective transform to map the four corner points, which form an arbitrary quadrilateral, to a square on a conventional x-y grid. The data bits can then be read by simple thresholding.

Figure 1: Input image containing three markers.

II. ALGORITHM DESCRIPTION

A. Filter/Thresholding

We start by converting the image to grayscale by retaining only the luminance of the original image. Color is not an immediate concern, since our first goal is to label the dark regions in the image, among which will be the guide bars; however, color information will be useful later for region elimination. Next, we filter the grayscale image with a 9x9 Laplacian of Gaussian filter with σ = 1.4 to sharpen edges. The filtered image is then thresholded near its mean (which is roughly zero), giving an image of all the prominent dark regions, such as that shown in Fig. 2a.

B. Region Labeling and Region Removal

The next step is to detect all of the dark regions in the image. We use 4-connectivity rather than 8-connectivity because we expect all pixels in a guide bar to be well connected to their neighbors. Before entering the pairwise guide bar search, we attempt to remove as many regions as possible that clearly do not fit the characteristics of a guide bar. The pairwise search consumes almost all of the execution time of our program and has O(n^2) complexity, so removing roughly 29% of the regions, for example, would make our algorithm about twice as fast.

We first eliminate, based on size, any region smaller than 50 pixels or larger than 8000 pixels. The lower bound is based on the size of guide bars in the 12 training images we were provided, in which we found no guide bar region containing fewer than 100 pixels. The upper bound is based on the largest possible code marker that would fit in a 640x480 image, which would occupy roughly 480x480 pixels; its short guide bar could consume no more than 5/121 of these pixels, leaving 8000 as a reasonable upper bound.

The next characteristic of guide bars we take advantage of is that they are black regions on white backgrounds. Looking at the red, green, and blue color components of the pixels in a region, we remove the region from consideration if the mean squared distance of its color components from the average of its color components exceeds a threshold. This works because we expect the RGB components of a black pixel to be very near one another.

Figure 2: Demonstration of color removal for training image #5. Thirty regions were rejected on the basis of color (bottom) that we could not reject on the basis of size (top). This reduced the total number of regions from 149 to 119, for an expected reduction in execution time of 1 - (119*118/2)/(149*148/2) ≈ 36%.
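A minimal sketch of the preprocessing and region-removal steps described in sections II-A and II-B, assuming a NumPy/SciPy/scikit-image implementation (the report includes no source code). The color-distance threshold color_thresh and the exact sign of the threshold comparison are illustrative assumptions, not values from the report.

```python
import numpy as np
from scipy import ndimage as ndi
from skimage import color

def find_candidate_regions(img_rgb, sigma=1.4, ksize=9,
                           min_px=50, max_px=8000, color_thresh=200.0):
    """LoG filtering, thresholding near the mean, 4-connected labeling,
    then size- and color-based region removal (color_thresh is assumed)."""
    gray = color.rgb2gray(img_rgb) * 255.0          # keep only luminance

    # 9x9 zero-mean Laplacian-of-Gaussian kernel with sigma = 1.4
    ax = np.arange(ksize) - (ksize - 1) / 2.0
    xx, yy = np.meshgrid(ax, ax)
    r2 = xx**2 + yy**2
    log_k = (r2 - 2 * sigma**2) / sigma**4 * np.exp(-r2 / (2 * sigma**2))
    log_k -= log_k.mean()

    filt = ndi.convolve(gray, log_k)

    # Threshold near the (roughly zero) mean of the filtered image.
    # The report does not fix a sign convention; flip the comparison if
    # dark regions come out with the opposite sign for your kernel.
    mask = filt > filt.mean()

    # 4-connected labeling (cross-shaped structuring element)
    labels, n = ndi.label(mask, structure=ndi.generate_binary_structure(2, 1))

    keep = []
    for lab in range(1, n + 1):
        region = labels == lab
        size = int(region.sum())
        if size < min_px or size > max_px:          # size-based removal
            continue
        # Color-based removal: mean squared distance of each pixel's R, G, B
        # values from that pixel's channel average; black/gray pixels score
        # near zero, strongly colored regions score high.
        rgb = img_rgb[region].astype(float)
        msd = np.mean((rgb - rgb.mean(axis=1, keepdims=True))**2)
        if msd > color_thresh:
            continue
        keep.append(lab)
    return labels, keep
```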
C. Iterative Guide Bar Search

After filtering, thresholding, and region deletion, we are ready to search for pairs of guide bars. This involves iterating over each pair of the remaining regions and putting each pair through a series of tests to eliminate false positives.

To eliminate the most obvious non-guide-bar regions with the least computational expense, we start by checking the ratio of the number of pixels in the larger region to the number of pixels in the smaller region. In a code marker's original grid, the long guide bar occupies seven squares while the short guide bar occupies five, so based on their lengths we expect a larger-to-smaller ratio of about 7/5 = 1.4. If the ratio lies too far from this value, we eliminate the pair from consideration. We also reject any two regions that lie too far apart from one another, using two measures of closeness: the midpoint-to-midpoint distance and the distance between the closest pair of points from the two regions.

The next characteristic of guide bars we use is that each bar should possess a single dominant edge orientation angle; regions containing a varied mix of edge orientation angles can thus be eliminated. To determine the approximate distribution of edge orientation angles in a region, we take a rectangular subset of the thresholded image that closely surrounds the region and perform edge detection on this black-and-white subset using the Canny detector. We chose this edge detector because it is computationally fast and because we are not worried about edge connectivity; we only need the rough distribution of edge orientation angles in a connected region. To estimate this distribution, we apply the Radon transform to the edge-detected subset, giving the strength of lines in the subset at each (ρ, θ). Because we are only interested in the dominant angle, we sum the Radon matrix over all ρ for each θ, giving a histogram of edge strength versus θ. In a true guide bar, the peak angle in this histogram has much more energy than the rest of the spectrum of angles.
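Likewise, a sketch of the pairwise screens and the Radon-based orientation histogram from section II-C, again assuming scikit-image. The tolerances ratio_tol, max_dist, and peak_factor are illustrative placeholders rather than the report's tuned values, and the specific helper names are not from the report.

```python
import numpy as np
from skimage.feature import canny
from skimage.transform import radon

def orientation_histogram(region_mask):
    """Edge-orientation energy vs. theta for one candidate region:
    Canny edges on the region's bounding-box subimage, then a Radon
    transform summed over rho at each projection angle."""
    rows, cols = np.nonzero(region_mask)
    sub = region_mask[rows.min():rows.max() + 1, cols.min():cols.max() + 1]
    edges = canny(sub.astype(float))
    thetas = np.arange(180.0)
    sinogram = radon(edges.astype(float), theta=thetas, circle=False)
    return sinogram.sum(axis=0), thetas             # energy per angle

def has_dominant_orientation(energy, peak_factor=3.0):
    """True if one angle carries much more energy than the average of the
    rest, as expected for a guide bar (peak_factor is an assumed heuristic)."""
    peak = energy.max()
    rest = (energy.sum() - peak) / (len(energy) - 1)
    return peak > peak_factor * rest

def plausible_pair(size_a, size_b, centroid_a, centroid_b,
                   ratio_tol=0.3, max_dist=250.0):
    """Cheap pairwise screens: pixel-count ratio near 7/5 and a bound on the
    midpoint-to-midpoint distance (tolerances are assumed, not the report's)."""
    big, small = max(size_a, size_b), min(size_a, size_b)
    if abs(big / small - 7.0 / 5.0) > ratio_tol:
        return False
    d = np.hypot(centroid_a[0] - centroid_b[0], centroid_a[1] - centroid_b[1])
    return d <= max_dist
```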

