Sum of Absolute Differences Hardware Accelerator
Mark Lodermeier
07:30 AM

Outline
- Overview of Motion Estimation and MPEG4 – Part 10 - AVC
- Approach
- Tasks Performed
- To Do

Motion Estimation
- Used for video compression: block matching between successive frames
- Search for the best matching block (find motion vectors)
- Used to create a model of the current frame using reference frame(s) from either previous or future frames
- Motion vectors are found by determining the minimum SAD: MV(r,s) = argmin[SAD(x,y,r,s)]

Motion Estimation
- Full Search algorithm produces the best results.
- However, it is computationally expensive.
- Motion Estimation accounts for 50-70% of computational complexity in MPEG-4 video encoding/decoding
- For a small search range of [-8, +7] in each direction with 16x16 macroblocks, there are 16*16 pixel comparisons performed 16*16 times = 65,536 additions of absolute differences for a single 16x16 block
- Real-time video of a 480x640

MPEG-4 Part 10 – AVC
Variable Block Sizes
- Each 16x16 macroblock can be split in half into two 16x8 or 8x16 blocks, or into four 8x8 sub-blocks
- These sub-blocks can then be split in half into two 8x4 or 4x8 blocks, or into four 4x4 blocks.
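
As a rough illustration of the full-search block matching described on the Motion Estimation slides, the sketch below evaluates SAD(x,y,r,s) for every displacement in [-8, +7] and returns the argmin. It is a software model only, not the accelerator. The function names `sad`, `full_search` and `comparisons_per_macroblock`, the list-of-rows frame layout, and the choice to skip candidates that fall outside the reference frame are all assumptions made for this example.

```python
def sad(cur, ref, x, y, r, s, n=16):
    """Sum of absolute differences between the n x n block of `cur` at
    (x, y) and the block of `ref` displaced by the candidate vector (r, s)."""
    return sum(
        abs(cur[y + j][x + i] - ref[y + s + j][x + r + i])
        for j in range(n)
        for i in range(n)
    )


def full_search(cur, ref, x, y, n=16, lo=-8, hi=7):
    """Return (min_sad, (r, s)) over all displacements in [lo, hi] x [lo, hi].

    Assumption: candidates that would read outside `ref` are skipped.
    """
    height, width = len(ref), len(ref[0])
    best = None
    for s in range(lo, hi + 1):
        for r in range(lo, hi + 1):
            if not (0 <= x + r <= width - n and 0 <= y + s <= height - n):
                continue
            cost = sad(cur, ref, x, y, r, s, n)
            # Strict comparison: on a tie the first candidate found is kept.
            if best is None or cost < best[0]:
                best = (cost, (r, s))
    return best


def comparisons_per_macroblock(n=16, lo=-8, hi=7):
    """Absolute differences accumulated for one block by full search."""
    candidates = (hi - lo + 1) ** 2
    return candidates * n * n
```

With the defaults, `comparisons_per_macroblock()` multiplies 16*16 candidate positions by 16*16 pixels, which is the 65,536 figure quoted for a single 16x16 block.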

MPEG-4 Part 10 – AVC
- Previous 16x16 macroblock split into smaller blocks

MPEG4 Variable Block Sizes
Purpose:
- Many small blocks require a large number of bits to encode
- Few large blocks may produce poor quality
- Can produce higher efficiency with the same quality
Challenges:
- Generate motion vectors for all block sizes: increases the computation cost of an already intensive algorithm
- Choose the correct block size among many choices to balance bandwidth and quality

Approach
- How to efficiently generate all MVs for variable sized blocks?
- Take full advantage of the parallel nature of both Motion Estimation and the generation of variable sized blocks
- Maintain high processor utilization

Tasks Performed
- Implemented 1-D systolic array in VHDL
- 16 Processing Elements, each with:
  - Absolute Difference unit
  - 9 to 2 Compressor
  - 3 to 1 Compressor

Absolute Difference Unit

Absolute Difference
Math behind absolute difference unit:
- Just check condition: A > B
- B + B_not = 2^n - 1
- B_not = 2^n - 1 - B
- 2^n - 1 + |A-B| is the value of the sum of the two outputs of the absolute difference unit.
- Need to add a correction term of m to get rid of the 2^n - 1, where m is equal to the number of absolute difference units used.

Single Processing Element
[Diagram labels: four Abs Diff Units; inputs A, B, C; 9 to 2 Adder Reduction Tree; 3 to 1 Reduction Tree; Latch; Correction term - 4]

Systolic Array
[Diagram labels: Shift Register and PE0, Shift Register and PE1, Shift Register and PE2, … Shift Register and PE15; control; D D D D; inputs A, B, C; four "4x4 SAD and MV" outputs; Back-End Adder Array for Variable Blocks]
Outputs:
- 16 4x4 SAD values and MVs
- 8 4x8 SAD values and MVs
- 8 8x4 SAD values and MVs
- 4 8x8 SAD values and MVs
- 2 8x16 SAD values and MVs
- 2 16x8 SAD values and MVs
- 1 16x16 SAD value and MV

Back-End Adder Array Diagram
[Diagram levels: 16 4x4 SADs and MVs; 8 8x4 SADs and MVs; 8 4x8 SADs and MVs; 4 8x8 SADs and MVs; 2 8x16 and 2 16x8 SADs and MVs; 16x16 SAD and MV]
- Each dot represents the following: Min, Latch
Macroblock with 16 4x4 sub-blocks (numbered as on the slide):
12 13 14 15
 8  9 10 11
 4  5  6  7
 0  1  2  3

SAD values for one 4x4
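
The reuse of 4x4 SADs by the back-end adder array can be sketched in software as follows. This is only a behavioural model under stated assumptions, not the VHDL design: the names `merge_4x4_sads` and `update_best` are hypothetical, the 16 inputs are assumed to be numbered four per row as in the macroblock diagram, and block sizes are assumed to be written width x height (so 8x4 pairs horizontal neighbours and 4x8 pairs vertical neighbours).

```python
def merge_4x4_sads(s):
    """Combine 16 4x4 SADs (four per row, rows 0..3) into larger-block SADs.

    Assumption: sizes are width x height, so 8x4 adds two horizontally
    adjacent 4x4 blocks and 4x8 adds two vertically adjacent ones.
    """
    if len(s) != 16:
        raise ValueError("expected 16 4x4 SAD values")
    g = [s[4 * row:4 * row + 4] for row in range(4)]
    sad_8x4 = [g[r][c] + g[r][c + 1] for r in range(4) for c in (0, 2)]
    sad_4x8 = [g[r][c] + g[r + 1][c] for r in (0, 2) for c in range(4)]
    # Each 8x8 is two 8x4 blocks from adjacent rows (index = 2*row + half).
    sad_8x8 = [sad_8x4[2 * r + q] + sad_8x4[2 * (r + 1) + q]
               for r in (0, 2) for q in (0, 1)]
    sad_16x8 = [sad_8x8[0] + sad_8x8[1], sad_8x8[2] + sad_8x8[3]]  # rows 0-1, rows 2-3
    sad_8x16 = [sad_8x8[0] + sad_8x8[2], sad_8x8[1] + sad_8x8[3]]  # cols 0-1, cols 2-3
    return {
        "4x4": list(s), "8x4": sad_8x4, "4x8": sad_4x8, "8x8": sad_8x8,
        "16x8": sad_16x8, "8x16": sad_8x16,
        "16x16": [sad_16x8[0] + sad_16x8[1]],
    }


def update_best(best, sads, mv):
    """Keep the minimum SAD and its motion vector for every block, given the
    SADs for one candidate `mv` (a software stand-in for Min plus Latch)."""
    for size, values in sads.items():
        slots = best.setdefault(size, [None] * len(values))
        for k, v in enumerate(values):
            if slots[k] is None or v < slots[k][0]:
                slots[k] = (v, mv)
    return best
```

Calling `update_best` once per candidate displacement yields 41 (SAD, MV) pairs per macroblock: 16 + 8 + 8 + 4 + 2 + 2 + 1, matching the counts listed on the Systolic Array slide.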
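
The arithmetic on the Absolute Difference slide can be checked numerically with a small model. It assumes one reading of the slide: each unit outputs the larger operand and the one's complement of the smaller one, and n is the width of the adder that accumulates the outputs (wide enough to hold the final SAD). The names `abs_diff_unit` and `sad_with_correction` are hypothetical.

```python
def abs_diff_unit(a, b, n):
    """Return the two outputs of one unit: the larger operand and the n-bit
    one's complement of the smaller. Their sum is 2^n - 1 + |a - b|."""
    mask = (1 << n) - 1
    if a > b:                      # the single comparison the slide mentions
        return a, (~b) & mask      # B_not = 2^n - 1 - B
    return b, (~a) & mask


def sad_with_correction(pairs, n):
    """Sum the outputs of m units, add the correction term m, and keep n bits.

    m * (2^n - 1) + SAD + m = m * 2^n + SAD, so the offset vanishes modulo
    2^n. Assumption: the true SAD fits in n bits.
    """
    mask = (1 << n) - 1
    m = len(pairs)
    total = 0
    for a, b in pairs:
        x, y = abs_diff_unit(a, b, n)
        total += x + y
    return (total + m) & mask
```

With four units per processing element, m is 4, which agrees with the "Correction term - 4" label on the Single Processing Element slide.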