Shadow Vision

Mark Huang
Shishir Mehrotra
Flavia Sparacino

6.837 Introduction to Computer Graphics
Fall 1999
December 6, 1999

1.0 Abstract

Shadow Vision attempts to create a virtual shadow puppet theater. A user's hand motions over an overhead projector are used to direct object creation and manipulation in a 3D Open Inventor scene.

The first task, recognizing the hand puppets, is accomplished using contour modeling; neural network based approaches are also attempted. The object classification and salient features are then packaged into standardized datagrams and passed over a network socket to the graphics engine. The graphics engine uses the object's classification, as well as other features such as rotation, to generate a 3D model. The user can then apply virtual manipulations to the object by altering their hand puppet or by using a suite of tool images.

The system successfully differentiates between a suite of approximately seven objects and manipulators, but is rather unsuccessful with the "finger" objects.

2.0 Introduction

Shadow Vision is intended to be a fun demonstration of a few fairly powerful contour recognition algorithms. Using only a small computer video camera, an overhead projector, and some creativity with the fingers, the user can make shadow pictures on a wall and have them realized in 3-D by the computer. If the computer recognizes the image the user is trying to create, it generates a 3-D model of it. The user may then interact with the model by forming various tools out of shadows as well. The model can be scaled, rotated, skewed, or discarded with these virtual tools. When the user is finished editing it, the model can be added to a scene.

Because the easiest shadows to make (and recognize) are classic animal shapes, Shadow Vision currently recognizes a bird, a crocodile, a dragon, and a cow.

3.0 Goals

The project broke roughly into three components: the computer vision component, the computer graphics component, and the communication between the two. Each of these components has its own objective.

• Vision Goal: Correctly classify a set of shapes representing animals and various virtual manipulators. Because an overhead projector is used, the hand puppets can be treated as two-dimensional black-and-white images. The classification system should be invariant to rotation and scale, but should be able to retrieve this information when necessary. The objects to be classified range from hand-made shadow puppets representing various animals to drawings on overhead transparencies representing manipulators.

• Graphics Goal: Offer an interactive shadow theater capable of displaying and manipulating a suite of different animal models. The interface to these models should be rather simple, since it must be responsive to directives from the vision component. Thus, for the most part, interactivity could be limited to a handful of functions per object.

• Communications Layer Goal: Standardize an interface between the graphics and vision components of the project. Paying special attention to this layer was necessary in order to allow parallel development of the application as well as to produce a distributed runtime environment.

The next section describes in detail how each of these components was implemented and how the associated objectives were achieved.

4.0 Implementation

4.1 Computer Vision: Recognizing the Shadow Puppets

The computer vision component breaks down into the following chain of stages. First, during the acquisition stage, the image is acquired from the camera and converted into a suitable form. Then, during segmentation, the image is thresholded and the pixels are marked as being part of the object or part of the background; also during this stage, a contour of the object is computed. Next, during feature extraction, a number of transformations are applied to the contour in order to extract salient features. Finally, these features are compared to training data during the recognition stage: the features of the object are compared to the models known to the system, and if a match of sufficient quality is found, it is reported. Together, these four stages achieve the goal of recognizing a hand puppet and returning the correct classification.

4.1.1 Acquisition

The first task for the computer vision program was to acquire and display the incoming video stream. We chose to capture the images with an infrared camera. Any video projection in the area seen by the camera is invisible to the infrared camera, which lets us project other computer-generated images on top of or around the shadow without interfering with the computer vision processing. We were not able to use any existing library as we had originally hoped, since most of the existing code we had access to was targeted at color image processing.

We also needed to make sure that the images were captured in YUV space. We chose YUV because all we need to process is the luma information, as the incoming infrared image is black and white.

In this process we found out two things:

• The only YUV encoding natively supported by the O2 is a 4:2:2 packing, which interleaves an alternating U or V value with every acquired Y byte (a sketch of unpacking this layout follows the list). Any other encoding has to be converted in software and therefore slows down the acquisition.

• In some other cases we noticed that the O2 would automatically fall back to RGB8 acquisition, so we were reading the red byte instead of the expected luma byte. Although both are one byte each, the unpacked information was wrong: it did not match the luma value we computed directly from the RGB components.
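As a concrete illustration of this layout, the following minimal sketch unpacks the luma plane from one packed 4:2:2 frame and thresholds it into the object/background mask used by the segmentation stage. It assumes a Cb Y Cr Y (UYVY) byte order, which places the luma bytes at odd offsets; the O2's actual byte order may differ, and the function name and threshold value are hypothetical.

    /* Sketch only: unpack luma from a packed 4:2:2 buffer and threshold it.
     * Assumes UYVY (Cb Y0 Cr Y1) byte order -- the O2's actual packing may
     * differ -- and a hypothetical, scene-dependent threshold. */
    #include <stddef.h>

    #define LUMA_THRESHOLD 80   /* hypothetical cutoff: shadows are dark */

    void segment_yuv422(const unsigned char *yuv,  /* w*h*2 bytes, packed 4:2:2 */
                        unsigned char *mask,       /* w*h bytes; 1 = object     */
                        size_t w, size_t h)
    {
        size_t i, n = w * h;
        for (i = 0; i < n; i++) {
            unsigned char luma = yuv[2 * i + 1];       /* Y bytes at odd offsets */
            mask[i] = (luma < LUMA_THRESHOLD) ? 1 : 0; /* dark pixel => object   */
        }
    }

Skipping the chroma bytes entirely, rather than converting the whole frame, is what makes the luma-only processing cheap enough to run per frame.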
4.1.2 Display

Another challenge was encountered in displaying the acquired YUV image on the screen in real time. One possibility was to use X and the XCreateImage() function; we wanted instead to use GL or OpenGL. We found that it is not possible to display a YUV image in GL in real time without introducing a delay: if forced to use a pixmode with a luma setting, GL performs the YUV-to-RGB conversion in software, and this slows down the display. GL can otherwise be forced to display the luma through the red channel of the graphics display, which works in real time but is ugly to look at.

We then moved to OpenGL, in which it is possible to use glDrawPixels() with the GL_YCRCB_422_SGIX option.
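A minimal sketch of that call is shown below, assuming the renderer advertises the SGIX_ycrcb extension; the raster-position and pixel-store setup are illustrative, and the window and projection initialization are omitted.

    /* Sketch only: draw one packed 4:2:2 frame via the SGIX_ycrcb extension.
     * Assumes the extension is available and that a projection mapping
     * raster coordinates to window pixels has already been established. */
    #include <GL/gl.h>
    #include <GL/glext.h>   /* supplies GL_YCRCB_422_SGIX where supported */

    void draw_yuv_frame(const unsigned char *yuv, int w, int h)
    {
        glPixelStorei(GL_UNPACK_ALIGNMENT, 1);  /* rows are tightly packed */
        glRasterPos2i(0, 0);                    /* lower-left image corner */
        glDrawPixels(w, h, GL_YCRCB_422_SGIX, GL_UNSIGNED_BYTE, yuv);
    }

Handing the packed buffer straight to glDrawPixels() this way leaves the YCrCb-to-RGB conversion to the pixel pipeline rather than to application code, which presumably avoids the software conversion that made the plain GL path too slow.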

