EE368 Project Report
Detection and Interpretation of Visual Code Markers
Taral Joglekar, Joel Darnauer and Keya Pandia
Stanford University

I. INTRODUCTION

This project aims at creating a set of image processing algorithms that can detect two-dimensional visual code markers in images and read off the bit patterns embedded in these codes. The construction of these markers is described in [1]. The images, obtained from mobile phone cameras, are of inferior quality, with high ISO noise and a light gradient that changes across the image. In addition, the markers can be located anywhere in the image, at arbitrary scale and rotation and with slight perspective distortion. This report describes the stages of our proposed pipeline, the algorithms used, and the performance results obtained for a set of training and test images.

II. PROCESS PIPELINE

Our detection process can be broadly divided into three main stages:

Stage 1: Image cleanup and thresholding
Stage 2: Visual marker detection
Stage 3: Code perspective correction and readout

The following sections describe each of these stages.

III. IMAGE CLEANUP AND THRESHOLDING

The given images are of low quality, with a lot of high-ISO speckle noise, brightness gradients and blurring. An important step before thresholding is therefore the cleanup of these images. We tried several cleanup methods, separately from the thresholding itself.

Figure 1: Original Training Images

Noise removal: We tried removing the noise using median filtering, Gaussian blurring and non-linear color aberration detection. In all cases we found that, for most of the noise, the spatial extent of the speckles was of the same order as that of the corner features of the visual code marker. Any attempt to filter out the noise selectively therefore also hurt later attempts to threshold the region of interest, so we decided not to apply any explicit speckle-removal technique.

Color to grayscale conversion: The original images are color images, so there was an initial temptation to use the extra color information to help separate possible markers from non-marker false positives. A major problem with this approach was the poor chroma capture of the mobile camera: a color that was actually uniform appeared in the image with varying color components. For example, what should have been the black of the visual marker segments was often captured as low-intensity red, green or blue. To avoid this effect we used the value channel of the HSV decomposition as the grayscale version of the image (a short sketch of this conversion follows this subsection). We also tried using the average of the red and green components as the grayscale value, as given in [1], but found that using the value channel worked better in some cases.

Brightness equalization: The varying brightness can be compensated in two ways: with a dedicated brightness equalization stage that removes the brightness gradient from the image, or with an adaptive thresholding algorithm that accounts for the slow change in brightness within and across image scan lines. Our experiments showed that using both techniques together gave the best results on our set of training images.
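As an illustration of the grayscale conversion just described, here is a minimal sketch in Python/NumPy. The function name and the normalized-float input convention are our assumptions; the report itself gives no code.

```python
def to_grayscale_value(rgb):
    """Grayscale conversion via the HSV value channel (sketch).

    The V channel of an HSV decomposition is the per-pixel maximum of
    the R, G and B components, so a marker cell captured as
    low-intensity red, green or blue still comes out uniformly dark.

    rgb: NumPy float array of shape (H, W, 3), values in [0, 1].
    """
    return rgb.max(axis=2)

# Hypothetical usage, given an 8-bit image array `img`:
# gray = to_grayscale_value(img.astype(float) / 255.0)
```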
Figure 2: Brightness Adjusted Images

The brightness equalization stage operates on parts of the image: it subdivides the image into smaller blocks and scales the elements of each block so that the highest-valued element in the block becomes one. This works very well within a block, but produces sharp edge artifacts along the block boundaries. A major problem with these edges is that they are usually more prominent than the code marker edges and create a lot of false positives. To reduce this effect we used overlapping blocks (sketched at the end of this section).

Adaptive thresholding: After the brightness equalization we first implemented straight single-value thresholding of the image, marking anything with brightness below 1/2 as black and anything above as white. This did not work at all, so we implemented a (mean - C) algorithm, in which the threshold for a particular pixel is obtained by taking the mean of the values in its neighborhood and subtracting a constant C. The problem with this approach was that no single value of C worked even within a single image. To compensate, we wrote an adaptive algorithm that varied the neighborhood size and the value of C until the number of thresholded components in the image fell below a preset value (sketched below). Although this worked much better than a fixed-C algorithm, we still needed an arbitrary value for the maximum number of components, and this value seemed to be inextricably linked to the number of code markers in the image: a value that worked very well for images with a single code marker worked quite badly for images with three markers, and vice versa.

After all these failed attempts, we went back to [2] and implemented the thresholding algorithm outlined there (sketched below). It works on the image row by row, alternating direction on each line, once from left to right and then from right to left. It maintains a running average over a history of s = width/8 pixels, initialized as

    tval_init = 0.5

and updated for each new pixel as

    tval_new = tval_old * (1 - 1/s) + pixelval_new * (1/s)

The threshold for a pixel is then set t percent below the running average, i.e.

    threshold_new = tval_new * (100 - t) / 100

The authors of [2] suggest using t = 15, and we found that the algorithm does work very well at this value. For our images, however, this alone was not enough: some markers were still too low in intensity to be detected by this algorithm. As a final addition, we applied a high-pass filter to the grayscale image, which compares the current pixel with a sampling of other pixels outside a 4x4 neighborhood, just to
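A minimal sketch of the overlapping-block brightness equalization described above, in Python/NumPy. The block size and step are illustrative (the report does not state the values it used), and the function name is ours.

```python
import numpy as np

def equalize_brightness(gray, block=64, step=32):
    """Overlapping-block brightness equalization (sketch).

    Each block is rescaled so that its brightest element becomes 1.
    Using overlapping blocks (step < block) and averaging the
    per-pixel gains softens the sharp boundary artifacts that
    non-overlapping blocks produce.

    gray: float array (H, W) with values in [0, 1].
    """
    h, w = gray.shape
    gain = np.zeros_like(gray)   # accumulated per-pixel scale factors
    hits = np.zeros_like(gray)   # how many blocks covered each pixel
    for y in range(0, h, step):
        for x in range(0, w, step):
            tile = gray[y:y + block, x:x + block]
            peak = tile.max()
            if peak > 0:
                gain[y:y + block, x:x + block] += 1.0 / peak
                hits[y:y + block, x:x + block] += 1.0
    return np.clip(gray * gain / np.maximum(hits, 1.0), 0.0, 1.0)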

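The (mean - C) scheme with its adaptive outer loop might look roughly like the following sketch. SciPy is assumed for the local mean and for connected-component labeling; the search grids for the neighborhood size and C, the component limit, and all names are our choices for illustration.

```python
import numpy as np
from scipy import ndimage

def adaptive_mean_c(gray, max_components=50):
    """(mean - C) thresholding with an adaptive outer loop (sketch).

    A pixel is marked black when it is darker than the mean of its
    neighborhood minus C. The neighborhood size and C are enlarged
    until the number of connected dark components falls below a
    preset limit, as described in the text.
    """
    binary = gray < 0.5               # fallback: single-value threshold
    for size in (7, 15, 31):          # candidate neighborhood sizes
        for c in (0.02, 0.05, 0.10):  # candidate values of C
            local_mean = ndimage.uniform_filter(gray, size=size)
            binary = gray < (local_mean - c)
            _, n_components = ndimage.label(binary)
            if n_components < max_components:
                return binary
    return binary  # no setting met the limit; return the last attempt
```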

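Finally, a direct transcription of the scan-line algorithm from [2] as reconstructed above: serpentine scan order, a running average over s = width/8 pixels initialized to 0.5, and a threshold t percent below that average. The function name is ours.

```python
import numpy as np

def threshold_scanlines(gray, t=15):
    """Row-by-row moving-average thresholding after [2] (sketch).

    Rows are scanned alternately left-to-right and right-to-left.
    tval is a running average over roughly s = width/8 pixels; a
    pixel is marked black when it falls more than t percent below
    tval.

    gray: float array (H, W) with values in [0, 1].
    """
    h, w = gray.shape
    s = max(w // 8, 1)
    out = np.ones((h, w), dtype=bool)  # True = white, False = black
    tval = 0.5                         # tval_init = 0.5
    for y in range(h):
        cols = range(w) if y % 2 == 0 else range(w - 1, -1, -1)
        for x in cols:
            p = gray[y, x]
            tval = tval * (1.0 - 1.0 / s) + p * (1.0 / s)
            if p < tval * (100.0 - t) / 100.0:
                out[y, x] = False
    return out
```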