Stanford EE 368 - Detection of Visual Code Markers - D3019367


School: Stanford University

Course: EE 368 - Digital Image Processing


Abstract: This paper presents a method for entering digital data into a cell phone via a visual code marker. The code is presented as a two-dimensional array of black and white bits along with position and orientation features. The algorithm works by searching for properly shaped and spaced regions in an adaptively thresholded image to locate the code data. The data are then read by computing the mapping between image coordinates and code coordinates. The method presented has a low error rate and has proven to be much faster than template-matching methods.

Index Terms: camera, cell-phone, code, detection

I. INTRODUCTION

Cellular phones with built-in cameras are nearly ubiquitous in many societies around the world. These phones have wireless connectivity, visual and audio inputs, keypad entry, visual and audio outputs, and capable processors. Because of the prevalence and capability of such camera phones, they have the potential to be a powerful tool for linking the virtual world and the real world. Unfortunately, the only way that a user can enter digital data into the phone is via a small keypad. Visual code markers offer another quick and easy method of data entry: the phone reads data presented in the camera's visual field. This paper describes one possible implementation of digital data entry through a low-quality camera.

II. VISUAL CODE MARKERS

To extract digital data from an image, the data must be presented to the camera in a structure that can be easily processed. Ease of processing and density of data are both desirable features of the presentation format. To serve both goals simultaneously, Michael Rohs has suggested a square marker layout with data in a two-dimensional array.

Figure 1: Sample code marker (left); sample code marker with non-data elements shown in color (right).

Figure 1 illustrates the data marker layout. The code on the left shows the true color scheme of black and white.
The code on the right has false colors on the elements that do not contain data. These non-data elements are used for finding the marker in a cluttered visual field and then determining its orientation and position. The names used for these elements in this paper are as follows: red is the top-left cornerstone; blue is the bottom-left cornerstone; magenta is the top-right cornerstone; green is the right guide-bar; and cyan is the bottom-right guide-bar. Each non-data element is separated from the data pixels by a white border one code-pixel wide. There are eighty-three bits of data in the marker.

III. CODE DETECTION ALGORITHM

The method for reading visual code markers presented in this paper is based on the geometry of regions in a binary image; it will be referred to as region-geometry identification (RGI). The algorithm functions in three major steps: region extraction, marker location, and data extraction. In region extraction, a color image is processed to obtain an image with a uniform background and labeled contiguous foreground regions. Marker location is the process of selecting groupings of regions that represent code markers. The last step, data extraction, reads the data from the code marker. Each of these steps is explained in detail below. The algorithm is illustrated with a sample image.

A. Region Extraction

The region extraction step of the algorithm is applied globally to the entire image and is applied only once. This step culminates in a list of labeled regions which are indexed by the code location step. Region extraction drastically reduces the dimensionality of the processing, thereby increasing the speed of the algorithm.

1) Grayscale conversion

The color information in the original image (Figure 2) is not useful for code detection since the white balance of the image is unknown. An intensity image is calculated, as suggested by Rohs, as the average of the red and green channels.
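As a minimal sketch of this conversion, assume the image is held as nested lists of 8-bit (R, G, B) tuples. Both that representation and the function name `to_intensity` are illustrative assumptions and are not taken from the paper.

```python
def to_intensity(rgb_image):
    """Return an intensity image as the mean of the red and green channels.

    rgb_image is a list of rows; each row is a list of (r, g, b) tuples.
    The blue channel is deliberately ignored.
    """
    return [[(r + g) / 2.0 for (r, g, _b) in row] for row in rgb_image]
```

For example, a pixel (200, 100, 7) maps to an intensity of 150.0 regardless of its blue value.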
This method helps reduce noise, since the blue channel is the noisiest. However, the RGI algorithm is not very sensitive to the method used to discard color information. Figure 3 shows the grayscale image.

Detection of Visual Code Markers
Gabriel Takacs

Figure 2: Color input image

Figure 3: Average of red and green channels

2) Adaptive thresholding

Proper creation of a binary image is critical to the RGI algorithm. A global threshold will erase a code marker if the marker is in bright illumination or in shadow relative to the mean image intensity. To resolve this problem, the gray level of each pixel is compared to a local threshold. The threshold for a given point is the average of the pixels in a window around the point. If the window size is chosen properly, gradual intensity variations are ignored, whereas local intensity changes are captured.

Figure 4: Adaptively thresholded binary image

Figure 4 shows the results of an adaptive threshold operation on Figure 3. The effects of the adaptive algorithm are not apparent because the sample image does not suffer from lighting variations.

3) Region Labeling

Each contiguous black region in the binary image must be individually labeled so that the properties of each region can be calculated separately. Any algorithm that labels the regions will work, as the ordering of the labels does not matter to the RGI algorithm. This implementation uses the MATLAB bwlabel function.

Figure 5: Labeled regions, colors represent label number

The labels that result from this step are illustrated in false color in Figure 5. The colors are just for visualization; the label image consists of a single channel.

B. Code Location

The code location step is the key to the RGI algorithm. Once black regions are labeled, the algorithm need only iterate over the regions and not over all pixels in the image. For each black region, a series of successive tests is performed to check whether it is the right guide-bar of a code marker.
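To make the input to this step concrete, the thresholding and labeling stages described above might be sketched as follows. This is an illustration in plain Python, not the paper's MATLAB implementation. The function names, the `half_window` parameter, the border clipping, and the use of 8-connectivity (the default of MATLAB's bwlabel, though the paper does not state which connectivity it used) are all assumptions.

```python
from collections import deque


def adaptive_threshold(gray, half_window):
    """Mark a pixel black (True) when it is darker than its local mean.

    gray is a list of equal-length rows of numbers. The local mean is taken
    over a square window of side 2*half_window + 1, clipped at the borders.
    """
    height, width = len(gray), len(gray[0])
    # Integral image with a zero top row and left column: integral[y][x] is
    # the sum of gray[0:y][0:x], so any window sum costs four lookups.
    integral = [[0.0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):
        row_sum = 0.0
        for x in range(width):
            row_sum += gray[y][x]
            integral[y + 1][x + 1] = integral[y][x + 1] + row_sum
    black = [[False] * width for _ in range(height)]
    for y in range(height):
        y0, y1 = max(0, y - half_window), min(height, y + half_window + 1)
        for x in range(width):
            x0, x1 = max(0, x - half_window), min(width, x + half_window + 1)
            total = (integral[y1][x1] - integral[y0][x1]
                     - integral[y1][x0] + integral[y0][x0])
            mean = total / ((y1 - y0) * (x1 - x0))
            black[y][x] = gray[y][x] < mean
    return black


def label_regions(black):
    """Label 8-connected black regions; return (labels, region_count).

    labels[y][x] is 0 for background and 1..region_count for regions.
    """
    height, width = len(black), len(black[0])
    labels = [[0] * width for _ in range(height)]
    count = 0
    for sy in range(height):
        for sx in range(width):
            if not black[sy][sx] or labels[sy][sx]:
                continue
            count += 1
            labels[sy][sx] = count
            queue = deque([(sy, sx)])
            while queue:  # breadth-first flood fill of one region
                y, x = queue.popleft()
                for ny in (y - 1, y, y + 1):
                    for nx in (x - 1, x, x + 1):
                        if (0 <= ny < height and 0 <= nx < width
                                and black[ny][nx] and not labels[ny][nx]):
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count
```

Because the label image is a single channel of integers, later stages can gather the pixels of region k by scanning once for entries equal to k, which is what allows iteration over regions rather than over pixels.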
The algorithm skips a region as soon as one of the tests fails. Regions that are very dissimilar from the true marker shape therefore require little computational effort to reject. For each region, the height, width, centroid, eccentricity, and number of pixels are computed. First the number of pixels in the region is counted; then the centroid is calculated by averaging the x and y coordinates of each pixel in the region. The centroid is then used in calculating the second moments of the region, as suggested by Rohs. The second moments are defined in Equation 1.

[Equation 1: second moments of the region. The equation is truncated in the source.]
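The pixel count, centroid, and second moments described above can be sketched as below. The normalized central second moments used here are the standard textbook definition; they are assumed, not confirmed, to correspond to the paper's Equation 1. The function name `region_moments` and the (x, y) pixel-list input are likewise illustrative.

```python
def region_moments(pixels):
    """Return (count, (cx, cy), (mxx, myy, mxy)) for a list of (x, y) pixels.

    The centroid is the mean pixel coordinate; the second moments are the
    mean squared (or cross) deviations from that centroid.
    """
    count = len(pixels)
    cx = sum(x for x, _ in pixels) / count
    cy = sum(y for _, y in pixels) / count
    mxx = sum((x - cx) ** 2 for x, _ in pixels) / count
    myy = sum((y - cy) ** 2 for _, y in pixels) / count
    mxy = sum((x - cx) * (y - cy) for x, y in pixels) / count
    return count, (cx, cy), (mxx, myy, mxy)
```

For a horizontal bar of four pixels, `region_moments([(0, 0), (1, 0), (2, 0), (3, 0)])` gives a centroid of (1.5, 0.0) and moments (1.25, 0.0, 0.0). One moment being much larger than the other is the kind of elongation cue that could help separate a bar-shaped region from a compact one.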
