Stanford EE 368 - Visual Code marker Segmentation - D2117356

School name Stanford University

Course EE 368 - Digital Image Processing

VISUAL CODE MARKER SEGMENTATION AND DATA EXTRAPOLATION

Bradford Bonney, Evan Millar
Electrical Engineering Department
Stanford University
Stanford, CA 94305

Abstract

The purpose of this research is to automatically segment and read visual code markers. These markers are found in standard 640x480 pixel color images acquired with standard VGA cell phone cameras. In this paper, we investigate an approach to automatically identify and read visual code markers. The experimental results show that the algorithm discussed in this paper successfully segments visual code markers and reads all the data correctly. The average time for the segmentation and data parsing of twelve test images was 2.12 seconds per image.

1. Introduction

Visual code markers have a wide range of applications, including the automated tasking of applications in cell phones via the built-in camera [1]. With the increasing number of cell phones worldwide, and the integration of CCD imagers into even the most modest of cell phones, automated visual code marker detection and reading is a valuable feature enhancement. While we do not concentrate on the applications of our algorithm, the applications of such routines should not be overlooked.

The visual code markers that we consider are 2-dimensional arrays. Each array consists of 11x11 elements, and each element is either black or white. As shown in Figure 1, we fix the elements in three of the corners to be black. One vertical guide bar (7 elements long) and one horizontal guide bar (5 elements long) are also included. The immediate neighbors of the corner elements and the guide bar elements are fixed to be white. This leaves 83 data elements, each of which can be either black or white [2]. Figure 1 shows an example of the visual code marker that the following algorithm detects and reads.
Our initial experimental results show that it is, in fact, possible to locate multiple visual code markers embedded in low-resolution color images, and subsequently to process the data they contain. We successfully located visual code markers in 23 test images, each of which contained anywhere from one to five visual code markers.

2. Corner Detection

In order to properly extract the data contained within each visual code marker, the four corners of the marker must first be identified. We refer to each corner by its corresponding compass direction: northwest, northeast, southwest, and southeast. To determine the locations of these corners, a four-step approach was used. First, the reverse “L” (the two fixed guide bars) was identified within the image. Second, the northeast and southeast coordinates were labeled. Third, the southwest coordinate was calculated using geometric and algebraic calculations. Fourth, and finally, the northwest coordinate was found. If a valid location was not identified at any stage in the process, the potential marker was discounted as a false positive. Figure 2 shows the image to which the in-depth breakdown and discussion of our algorithm will be applied.

Figure 1: Visual Code Marker
Figure 2: Sample image with visual code markers
Figure 4: Labeled regions of binary visual code marker image
Figure 5: Binary image: regions that remain have major-to-minor axis ratios greater than or equal to 3

2.1. “L” Guide Bars for Northeast / Southeast Detection

The fixed lengths and proportions of the guide bars make them a logical choice for initially identifying potential visual code markers. Additionally, the relatively solid-black on solid-white nature of the visual code markers makes working with binary images ideal [3-4]. In order to convert the color images to binary images, a thresholding algorithm needed to be applied.
However, a simple mean-value threshold could not be used because of potential shadowing and other local image intensity characteristics. To make our thresholding algorithm as invariant to image intensity as possible, local thresholds were used. Our local threshold window was 40 by 40 pixels, and was applied in a non-overlapping fashion. The threshold was applied to each of the red, green, and blue channels of the RGB color image separately. To obtain the final binary, thresholded image, all three channels were combined using a logical AND operator. This created a locally thresholded, binary version of the original input image, as seen in Fig. 3. The “L”-shaped guide bars clearly remain after the local threshold has been applied.

Unfortunately, there is still a great deal of noise in the image. To eliminate this noise, we divided each mass of connected pixels into labeled regions, as seen in Fig. 4. We cycled through each region and calculated the ratio of its major axis to its minor axis. The major axis and minor axis were calculated using an ellipse that had the same second moments as the labeled region. This calculation was performed automatically using built-in MATLAB® functionality [5].

Without any knowledge of the camera’s intrinsic characteristics, we could not be certain of any perspective transformations, nor correct for them. However, knowing that an original visual code marker had vertical guides with dimension ratios of 7:1 and horizontal guides with dimension ratios of 5:1, we kept a region as a potential guide bar if its major-to-minor axis ratio was greater than 3:1. This allowed for some leniency when dealing with distortions incurred during the local thresholding process, as well as with markers for which the object plane and the image plane did not coincide, although processing assumed that they did. Figure 5 shows the potential guide bars following this initial noise removal step.
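The two steps described above, local thresholding and the major-to-minor axis ratio, can be sketched in plain Python as follows. This is only an illustrative sketch, not the authors' MATLAB code. The function names `local_threshold` and `axis_ratio` are hypothetical, the use of each window's mean as its threshold is an assumption (the text does not state how the local threshold value is chosen), and marking dark pixels as foreground is likewise assumed. The 1/12 term in the moments, which treats each pixel as a unit square, is also an assumption.

```python
import math


def local_threshold(image, window=40):
    """Binarize an RGB image with non-overlapping local windows.

    image: list of rows, each row a list of (r, g, b) tuples.
    Returns rows of bools; True marks a pixel that is darker than its
    window mean in all three channels (per-channel results are ANDed).
    Using the window mean as the threshold is an assumption.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    mask = [[True] * width for _ in range(height)]
    for top in range(0, height, window):
        for left in range(0, width, window):
            rows = range(top, min(top + window, height))
            cols = range(left, min(left + window, width))
            count = len(rows) * len(cols)
            for channel in range(3):
                mean = sum(image[y][x][channel] for y in rows for x in cols) / count
                for y in rows:
                    for x in cols:
                        # Logical AND: one failing channel clears the pixel.
                        if image[y][x][channel] >= mean:
                            mask[y][x] = False
    return mask


def axis_ratio(pixels):
    """Major-to-minor axis ratio of the ellipse with the same second moments.

    pixels: iterable of (x, y) integer coordinates belonging to one region.
    The 1/12 term treats each pixel as a unit square (an assumption), which
    keeps the minor axis nonzero for one-pixel-wide regions.
    """
    pixels = list(pixels)
    n = len(pixels)
    mean_x = sum(x for x, _ in pixels) / n
    mean_y = sum(y for _, y in pixels) / n
    uxx = sum((x - mean_x) ** 2 for x, _ in pixels) / n + 1 / 12
    uyy = sum((y - mean_y) ** 2 for _, y in pixels) / n + 1 / 12
    uxy = sum((x - mean_x) * (y - mean_y) for x, y in pixels) / n
    spread = math.sqrt((uxx - uyy) ** 2 + 4 * uxy ** 2)
    return math.sqrt((uxx + uyy + spread) / (uxx + uyy - spread))
```

Under these assumptions, an ideal one-pixel-wide bar that is 7 pixels long has an axis ratio of 7, so a cutoff of 3 leaves considerable room for distorted guide bars while rejecting compact blobs, whose ratio is close to 1.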
To further reduce noise, we took advantage of the fact that the guide bars were consistently solid black (or, in the case of Fig. 5, solid white, as they are the regions of potential interest). Since the guide bars are solid, we removed from consideration any region that had holes: a region that is not solid is no longer a potential guide bar. Additionally, because the guide bars are straight, the areas of their convex hulls are roughly equal to the areas of the guide bar regions themselves. By eliminating from consideration all regions whose convex hull area differs greatly (by more than 30%) from the area of the region area in
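The convex hull criterion can be illustrated with a short, self-contained Python sketch. It is not the authors' implementation: the helper names are hypothetical, and because the sentence above is cut off, the way the 30% difference is measured is an assumption. The sketch compares the pixel count of a region with the area of the convex hull of its pixel corners, so a solid rectangle scores exactly 1.

```python
def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def polygon_area(vertices):
    """Shoelace formula for the area of a simple polygon."""
    total = 0
    for i in range(len(vertices)):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % len(vertices)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2


def hull_fill_ratio(pixels):
    """Region area (pixel count) divided by the area of its convex hull.

    Each pixel (x, y) is treated as the unit square [x, x+1] x [y, y+1],
    so the hull is built from pixel corners rather than pixel centers.
    """
    pixels = list(pixels)
    corners = [(x + dx, y + dy) for x, y in pixels for dx in (0, 1) for dy in (0, 1)]
    return len(pixels) / polygon_area(convex_hull(corners))


def is_hull_consistent(pixels, max_difference=0.30):
    """Keep a region only if it fills at least 70% of its convex hull (assumed reading)."""
    return hull_fill_ratio(pixels) >= 1.0 - max_difference
```

For example, a straight 7x1 bar fills its hull completely and is kept, while a bent region made of two perpendicular 5-pixel arms fills only 9/17 of its hull and is rejected under this reading of the 30% tolerance.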
