Berkeley COMPSCI 252 - Finding Body Parts with Vector Processing



Finding Body Parts with Vector Processing
Cynthia Bruyns, Bryan Feldman
CS 252

Introduction
• Take an existing algorithm for tracking human motion and speed it up by computing on the GPU.
• Demonstrate that many vision algorithms are prime candidates for vector processing.
[Figure: results after false candidates have been removed]

Demo

Vision Algorithms
• Often computationally expensive: searching over many pixels for objects at many orientations and scales.
  • E.g. [(1024 x 768 pixels) x 3 colors] x [12 orientations] x [5 scales]
• Very often highly parallelizable.

Limb Finding
• Goal: find candidate limbs.
• Limbs look like long dark rectangles on light backgrounds, or long light ones on dark backgrounds.
1. Convolve the image with a filter, using the FFT.
  • The response indicates how strongly pixels go from low to high intensity.
  • Convolve over all three color channels so as not to miss red-to-blue transitions of the same intensity.

Algorithm specifics
[Figure: convolution (*) of the image with the filter]
2. For every pixel location, get respconv from the "left" and "right" and put the result into a new matrix, resplimb.
[Figure: diagram relating -respconv and respconv to resplimb]

Algorithm specifics (continued)
3. Find local maxima: replace every pixel with the maximum of its local neighbors (locMax). If resplimb = locMax, the pixel is a maximum.

resplimb                  locMax
.50  .25  .40  .23        .75  .98  .98  .98
.75  .41  .98  .75        .75  .98  .98  .98
.11  .43  .15  .23        .78  .98  .98  .98
.78  .34  .13  .15        .78  .87  .23  .23

GPU
• A good choice because each operation is per pixel (SIMD-like).
• Data is stored in texture buffers, the equivalent of a local cache.
• Clean instruction set, and a developing interface language to exploit vector operations.
• Justify your gaming habits.

GPU dataflow model
• Hardware supports several data types for bandwidth optimization, e.g. 32-bit floating point, half, etc.
• Data is passed to the main memory stages via binding.
[Figure: pipeline diagram with stages Application, Vertex Processor, Assembly & Rasterization, Fragment Processor, Framebuffer Operations, Framebuffer, and Textures]

Fragment processor has high resource limits
• 1024 instructions
• 512 constants or uniform parameters
  • Each constant counts as one instruction
• 16 texture units
  • Reuse as many times as desired
• No branching
  • But can do a lot with condition codes
• No indexed reads from registers
  • Use texture reads instead
• No memory writes

The algorithm
• Draw invokes the fragment programs.
• The texture becomes a data structure; use two as framebuffers to avoid read-after-write (RAW) hazards.
[Figure: block diagram with Image and Mask inputs, two FFT fragment programs, Convolution program, Cylinder program, and Find Max program, looped "for each orientation to search"]

Results
• CPU: 2.53 GHz P4; GPU: Nvidia FX5900
• Mask size fixed (22x13), image size varied
[Plot: time (axis 0 to 700) vs. image size (256, 512, 1024); series: CPU orig, CPU FFT, GPU]
* Additional GPU optimizations possible

Results (log scale)
• CPU: 2.53 GHz P4; GPU: Nvidia FX5900
• Mask size fixed (22x13), image size varied
[Plot: log-scale version of the timing chart; annotated values: 42.7 sec and 252.1 sec]
* Additional GPU optimizations possible

Results
• Image size fixed (512x512), mask size varied
• Varying mask sizes allow for varying limb sizes on the same image.

Results
[Figure]

Comments
• GPU and image processing are a good match.
• Moving memory from CPU to GPU is cumbersome, but this can be overcome.
• Installations and products are non-uniform; exact specifications are hearsay.

Acknowledgements
Kenneth Moreland, Deva Ramanan, Okan
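The search-space example on the Vision Algorithms slide can be multiplied out to see the scale involved. The sketch below uses only the factors given on that slide; the variable names are illustrative and do not come from the original.

```python
# Factors taken from the "Vision Algorithms" slide; names are illustrative.
width, height = 1024, 768
color_channels = 3
orientations = 12
scales = 5

# Per-pixel, per-channel samples in one image.
pixel_samples = width * height * color_channels

# One filter evaluation per sample, per orientation, per scale.
filter_evaluations = pixel_samples * orientations * scales

print(pixel_samples)       # 2359296
print(filter_evaluations)  # 141557760
```

Each of these evaluations is independent of the others, which is what makes the workload a natural fit for per-pixel parallel hardware.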
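Step 1 of limb finding convolves the image with a filter by way of the FFT. The following is a minimal CPU-side sketch of that idea, not the fragment programs from the slides. It assumes circular (wrap-around) boundaries and power-of-two image dimensions; the function names and the tiny two-tap edge mask are hypothetical.

```python
import cmath

def fft(x, inverse=False):
    """Recursive radix-2 FFT (unnormalized); len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def fft2(a, inverse=False):
    """2-D FFT: transform every row, then every column."""
    rows = [fft(r, inverse) for r in a]
    cols = [fft(list(c), inverse) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]

def conv2_fft(image, mask):
    """Circular 2-D convolution via pointwise product of spectra."""
    rows, cols = len(image), len(image[0])
    padded = [[0.0] * cols for _ in range(rows)]  # zero-pad mask to image size
    for i, mask_row in enumerate(mask):
        for j, v in enumerate(mask_row):
            padded[i][j] = v
    f_img, f_mask = fft2(image), fft2(padded)
    prod = [[f_img[i][j] * f_mask[i][j] for j in range(cols)]
            for i in range(rows)]
    inv = fft2(prod, inverse=True)
    return [[(v / (rows * cols)).real for v in row] for row in inv]

def conv2_direct(image, mask):
    """Reference circular convolution, used only to check the FFT path."""
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            out[i][j] = sum(v * image[(i - a) % rows][(j - b) % cols]
                            for a, mask_row in enumerate(mask)
                            for b, v in enumerate(mask_row))
    return out

# Hypothetical test data: a dark-to-light vertical edge and a two-tap mask
# that responds positively where intensity rises from left to right.
image = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]
mask = [[1.0, -1.0]]
response = conv2_fft(image, mask)
```

The direct version costs work proportional to image size times mask size, while the FFT version's cost depends mainly on image size, which is why the FFT route pays off for larger masks.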
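Step 3 (finding local maxima) can be sketched on the resplimb grid from the slide. The slides do not state the neighborhood size, so this sketch assumes a 3x3 window clipped at the image border; under that assumption it reproduces the first three rows of the slide's locMax grid, while two entries in the last row come out differently. Function names are illustrative.

```python
def local_max_map(resp):
    """Replace each pixel with the max over its 3x3 neighborhood (assumed size)."""
    rows, cols = len(resp), len(resp[0])
    out = []
    for i in range(rows):
        out_row = []
        for j in range(cols):
            # Window includes the pixel itself and is clipped at the border.
            window = [resp[a][b]
                      for a in range(max(0, i - 1), min(rows, i + 2))
                      for b in range(max(0, j - 1), min(cols, j + 2))]
            out_row.append(max(window))
        out.append(out_row)
    return out

def find_maxima(resp):
    """A pixel is a local maximum where resp equals its local-max value."""
    loc_max = local_max_map(resp)
    return [(i, j)
            for i, row in enumerate(resp)
            for j, v in enumerate(row)
            if v == loc_max[i][j]]

# resplimb values as printed on the slide.
resplimb = [
    [0.50, 0.25, 0.40, 0.23],
    [0.75, 0.41, 0.98, 0.75],
    [0.11, 0.43, 0.15, 0.23],
    [0.78, 0.34, 0.13, 0.15],
]

print(find_maxima(resplimb))  # [(1, 0), (1, 2), (3, 0)]
```

Both passes are purely per-pixel reads of a fixed neighborhood followed by one write, which matches the fragment-processor constraints listed on the slides (no branching, no scattered memory writes).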
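The advice to use two textures as framebuffers to avoid RAW hazards is commonly realized as "ping-pong" buffering: each pass reads only from one buffer and writes only to the other, then the two swap roles. The sketch below illustrates the pattern on the CPU with a hypothetical 1-D smoothing pass; the pass itself and all names are assumptions for illustration, not part of the original algorithm.

```python
def smooth_pass(src, dst):
    """One per-element pass: reads only from src, writes only to dst."""
    n = len(src)
    for i in range(n):
        left = src[max(i - 1, 0)]
        right = src[min(i + 1, n - 1)]
        dst[i] = (left + src[i] + right) / 3.0

def run_passes(data, passes):
    """Apply smooth_pass repeatedly, alternating between two buffers."""
    front = list(data)         # buffer read during the current pass
    back = [0.0] * len(data)   # buffer written during the current pass
    for _ in range(passes):
        smooth_pass(front, back)
        front, back = back, front  # swap roles for the next pass
    return front

print(run_passes([0.0, 3.0, 0.0], 1))  # [1.0, 1.0, 1.0]
```

Updating a single buffer in place would let element 1 read the already-overwritten value of element 0, giving a different result; keeping reads and writes in separate buffers removes that ordering dependence.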

