Stanford EE 368 - Lecture Notes

Identifying and Reading Visual Code Markers
Oren Feinstein, Electrical Engineering Department, Stanford University

O. Feinstein, EE368 Digital Image Processing – Final Report

I. Introduction
II. Methodology
   A. Definition of Visual Code Marker
   B. Overview of Algorithm
   C. Pre-processing
   D. Adaptive Thresholding
   E. Label and Extract Key Information
   F. Identify Visual Code Markers
   G. Read Code
III. Results
IV. Conclusion

Abstract: A visual code marker is an 11x11 element pattern with two fixed guide bars and three fixed cornerstones, which is used to uniquely label everyday objects such as CDs, DVDs, and books. Each marker has an 83-bit data code that can be captured using any digital camera and transmitted over the Internet to a central database that stores more information about the labeled object. This paper describes a method to find the locations of visual code markers in a 24-bit RGB image and read the 83-bit data embedded in each marker. Alternatives at critical stages of the algorithm are discussed. Many different images are shown to demonstrate robustness.

Index Terms: digital image processing, visual code markers, adaptive thresholding, rotation and scale invariance.

Manuscript received June 02, 2006. O. Feinstein is with the Electrical Engineering Department at Stanford University, Stanford, CA 94305 USA (email: [email protected]).

I. INTRODUCTION

The goal of this project is to use MATLAB to find all of the visual code markers in a 24-bit RGB image, regardless of orientation. Upon identifying a visual code marker, the origin is reported as well as the 83-bit data embedded in the marker.

After explaining the method used in this study, the results are shown and discussed.

II. METHODOLOGY

A. Definition of Visual Code Marker

A visual code marker is defined as an 11x11 square pattern with two guide bars and three cornerstones, as shown in Figure 1 below. The 83 data bits are enclosed in red and are read column by column, from top to bottom. A black square represents a data bit of 1, and a white square represents a data bit of 0. The vertical guide bar is seven elements long and the horizontal guide bar is five. The origin (0,0) is defined as the upper left cornerstone, while the upper right cornerstone and lower left cornerstone are (10,0) and (0,10), respectively.

Fig. 1. A visual code marker example.

B. Overview of Algorithm

The algorithm implemented can be described as a five-stage process (see Figure 2). The first stage takes the input image and pre-processes it to produce a cleaner image for the next stage. The second stage creates a binary image by using a row-by-row adaptive threshold. The third stage labels all of the unique regions and extracts useful information such as eccentricity, orientation, and area. The fourth stage searches for a candidate vertical guide bar and, if one is found, searches for the horizontal guide bar and the three cornerstones. Using the pixel coordinates of the three cornerstones as well as the centroid of the horizontal guide bar, the fifth stage defines the 11x11 element plane and outputs the 83 data bits for each marker found.

Fig. 2. Five-stage algorithm with outputs of each phase in red.

C. Pre-processing

Since color is not needed to identify the visual code markers, the first step in the algorithm is to convert the 24-bit RGB input image to a gray-scale image. However, instead of using the MATLAB function rgb2gray, [1] suggests weighting the gray-scale image as half red and half green, because the blue component of the image has poor sharpness and contrast. This optimization is also less computationally intensive and saves a fraction of a second in the overall run-time.

Occasionally the input image is noisy, so further pre-processing is needed. For this application, the most frequent degradation is blur, caused either by motion of the camera while acquiring the image or by a low quality camera, as is common in cellular telephones. With that knowledge, a high pass filter is needed to counteract the low pass characteristics of the image.

Even if there is no noise in the image, a high pass filter is used to further emphasize the transitions between black and white, which are prevalent throughout the visual code marker. A 3x3 kernel is constructed to convolve with the gray-scale image:

     0   -5    0
    -5   21   -5
     0   -5    0

However, since this high pass filter strongly amplifies high frequency components, a simple 3x3 average filter is then applied to help maintain the fidelity of the image:

    1/9  1/9  1/9
    1/9  1/9  1/9
    1/9  1/9  1/9

The combination of the high pass filter followed by the average filter preserves the quality of the image and simplifies the task of the adaptive threshold, since the visual code marker is boosted in the image. Figure 3 shows the results of pre-processing for a given input image.

An alternative to the high pass and average filter combination was to use the scale-by-max color balancing technique on the gray-scale image. The idea was to boost the white components of the visual code marker and leave the black components relatively unaffected. However, the results were poor at best: the thresholding stage did not define the transitions between black and white nearly as well as it did with the high pass and average filter combination.

Fig. 3. The pre-processing stage. (a) 24-bit RGB input image. (b) gray = (red + green) / 2. (c) Gray image after using the high pass filter. (d) Gray image after using the high pass filter and then the average filter.

D. Adaptive Thresholding

After pre-processing the image, the next step is to create a binary image as shown in Figure 4, where dark pixels are represented by the value 1 and light pixels by 0. According to [1], an adaptive threshold with a zigzag traversal of the scan lines is best. However, upon implementation, undesirable effects occur at even scan lines. Thus, a reasonable adaptation of [1] is to traverse the image row by row, without the zigzag. Specifically, a running average $g_s(n)$ is kept while passing through the image:

$$g_s(n) = g_s(n-1)\left(1 - \frac{1}{s}\right) + p_n$$

where $p_n$ is the gray value of the current pixel, s = (1/8)
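
The pre-processing chain of Section C and the running average of Section D can be illustrated with a short sketch. The Python below is not the author's MATLAB implementation, and the function names (`to_gray`, `convolve3x3`, `preprocess`, `running_average`) are hypothetical. Several details are assumptions that the text does not state: border pixels are replicated during filtering, filtered values are clipped to [0, 255], the running average starts from `initial = 0` and carries over from one row to the next, and `s` is supplied by the caller.

```python
# Sketch of the pre-processing (Section C) and running average (Section D).
# Images are plain lists of rows; this is not the author's MATLAB code.

HIGH_PASS = [[0, -5, 0],
             [-5, 21, -5],
             [0, -5, 0]]
AVERAGE = [[1 / 9] * 3 for _ in range(3)]


def to_gray(rgb):
    """gray = (red + green) / 2; the blue channel is ignored."""
    return [[(r + g) / 2 for (r, g, _b) in row] for row in rgb]


def convolve3x3(img, kernel):
    """Apply a 3x3 kernel to a gray image.

    Both kernels above are symmetric, so no kernel flip is needed.
    Assumptions: border pixels are replicated, results clipped to [0, 255].
    """
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for dy in (-1, 0, 1):
                yy = min(max(y + dy, 0), h - 1)      # replicate top/bottom rows
                for dx in (-1, 0, 1):
                    xx = min(max(x + dx, 0), w - 1)  # replicate left/right columns
                    acc += kernel[dy + 1][dx + 1] * img[yy][xx]
            out[y][x] = min(max(acc, 0.0), 255.0)
    return out


def preprocess(rgb):
    """Gray conversion, then the high pass filter, then the average filter."""
    return convolve3x3(convolve3x3(to_gray(rgb), HIGH_PASS), AVERAGE)


def running_average(gray, s, initial=0.0):
    """gs(n) = gs(n-1) * (1 - 1/s) + p_n, scanning every row left to right.

    Assumptions: s is chosen by the caller, the average starts at `initial`,
    and it is carried over from the end of one row to the start of the next.
    """
    gs = initial
    out = []
    for row in gray:
        out_row = []
        for p in row:
            gs = gs * (1 - 1 / s) + p
            out_row.append(gs)
        out.append(out_row)
    return out
```

Both kernels sum to 1 (21 - 4*5 = 1 and 9 * 1/9 = 1), so a uniform region passes through `preprocess` unchanged, while an isolated bright pixel of value 10 on a black background is raised to 210 by the high pass filter alone.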

