EE 5359 PROJECT PRESENTATION
FAST INTER AND INTRA MODE DECISION IN H.264 VIDEO CODEC BASED ON THREAD-LEVEL PARALLELISM
Project Guide: Dr. K. R. Rao
Tejas Sathe (1000731145)
Email ID: tejas.sathe@mavs.uta.edu

Introduction to H.264 [1] codec
- H.264 (MPEG-4 Part 10, or AVC, Advanced Video Coding) is a standard by the ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC Moving Picture Experts Group (MPEG).
- Widely used for video compression; currently one of the most commonly used formats for the recording, compression and distribution of high-definition video.
- A new video compression scheme that has become the worldwide digital video standard for consumer electronics and personal computers.
- Significant improvement in rate-distortion efficiency, typically providing a factor of two in bit-rate savings compared with existing standards.

H.264 profiles [1]
- Baseline Profile: real-time conversational services, e.g. video conferencing and videophone
- Main Profile: digital storage media and television broadcasting
- Extended Profile: multimedia services over the Internet
- Four High Profiles: content contribution, content distribution, and studio editing and post-processing
Fig. 1: H.264 profiles [1]

How does the H.264 codec work?
- A codec is a device or computer program that encodes and/or decodes a signal or digital data stream.
- H.264 is a block-oriented, motion-compensation-based codec standard.
- An H.264 video encoder carries out prediction, transform and encoding processes to produce a compressed H.264 bit stream. The block diagram of the H.264 video encoder is shown in Fig. 2.
- A decoder carries out the complementary process of decoding, inverse transform and reconstruction to output a decoded video sequence. The block diagram of the H.264 video decoder is shown in Fig. 3.

Fig. 2: H.264 encoder block diagram [1]

Fig. 3: H.264 decoder block diagram [1]
- The received bitstream is entropy decoded and rearranged to produce a set of quantized coefficients.
- These are rescaled and inverse transformed to give a difference macroblock.
- Using the header
information decoded from the bit stream, a prediction macroblock P is created and added to the difference macroblock.
- The result is filtered to create a decoded macroblock.

Some highlighted features [2] of H.264 video codec
- Variable block-size motion compensation with small block sizes
- Quarter-sample-accurate motion compensation
- Multiple reference picture motion compensation
- Weighted prediction
- Improved "skipped" and "direct" motion inference
- Directional spatial prediction for intra coding
- In-the-loop deblocking filtering
- Context-adaptive entropy coding
- Flexible slice size
- Flexible macroblock ordering (FMO)

Intra prediction [1, 3]
- A technique of extrapolating the edges of the previously decoded parts of the current picture; it is applied in regions of pictures that are coded as intra.
- H.264 predicts intra-coded macroblocks to reduce the large number of bits needed to code the original input signal itself.
- A prediction block is formed based on previously reconstructed (unfiltered for deblocking) blocks.
- The residual signal between the current block and the prediction is finally encoded.
- One mode is selected from a total of 9 for each 4x4 and 8x8 luma block, from 4 modes for a 16x16 luma block, and from 4 modes for each chroma block.

Intra prediction modes [4]
Fig. 4: 4x4 intra prediction modes [4]
Fig. 5: 16x16 intra prediction modes [4]

Inter prediction [1, 14]
- It includes motion estimation (ME) and motion compensation (MC).
- ME/MC performs prediction: a predicted version of a rectangular array of pixels is generated by choosing another similarly sized rectangular array of pixels from a previously decoded reference picture.
- The reference array is translated to the position of the current rectangular array to compensate for the motion in the video stream.
- Different sizes of arrays for luma: 4x4, 4x8, 8x4, 8x8, 16x8, 8x16, 16x16 pixels.
Fig. 6: Macroblock partitions: 16x16, 8x16, 16x8, 8x8 [14]
Fig. 7: Sub-macroblock partitions: 8x8, 4x8, 8x4, 4x4 [14]

JM reference software [12]
- The JM reference software is used for implementation of the H.264
codec.
- The software package includes the configuration files encoder.cfg and decoder.cfg, through which input parameters such as the input sequence, frame rate, video resolution of the input sequence, bit rate, quantization parameter and profile can be set.
- The command used at the command prompt to execute the H.264 encoder is: lencod.exe -f encoder.cfg
- encoder.cfg is parsed to get all the input parameters set by the user.
- JM software version used for testing: JM 17.2
- Latest version available: JM 18.0

Test sequences
- akiyo_cif.yuv
- akiyo_qcif.yuv
- carphone_cif.yuv
- carphone_qcif.yuv
- container_cif.yuv
- container_qcif.yuv

Results obtained using original JM 17.2 reference software: akiyo_qcif, 30 FPS, 30 frames encoded

Results obtained using original JM 17.2 reference software: carphone_qcif, 30 FPS, 30 frames encoded

Results obtained using original JM 17.2 reference software: container_qcif, 30 FPS, 30 frames encoded

Results obtained using original JM 17.2 reference software: akiyo_cif, 30 FPS, 30 frames encoded

Results obtained using original JM 17.2 reference software: carphone_cif, 30 FPS, 30 frames encoded

Results obtained using original JM 17.2 reference software: container_cif, 30 FPS, 30 frames encoded

Need of fast mode decision
- Motion estimation in H.264 takes about 60 to 70 percent of the total encoding time.
- Mode selection for intra and inter prediction requires a considerable amount of computation and memory access.
- In RD optimization, all the modes are checked and the one with the least rate-distortion cost is selected.
- This increases coding efficiency, but the price is increased computational complexity.
- The H.264/AVC encoder performs 592 RDO calculations for intra prediction to select the best mode for one macroblock.
- The increase in computational complexity poses implementation limitations, especially on handheld devices with limited battery life.

How to make fast mode decision?
- The complexity of intra and inter mode selection can be reduced using a thread
level parallelism approach.
- The RDO mode decision algorithm can be implemented based on thread-level parallelism for the H.264 encoder.
- This approach can efficiently resolve the dependences and exploit thread-level parallelism for fast mode decision.
- Challenge: reduction in the total encoding time without PSNR loss or bit-rate increase.

Multicore [6]
- An architecture design that places multiple processors on a single die (computer chip).
- Each processor is called a core.
- These designs, known as Chip Multiprocessors, allow single
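
Illustrative sketch: thread-level parallel RDO mode decision

The following is a minimal sketch of the idea described under "How to make fast mode decision", not the project's or JM's actual implementation. It assumes a single 4x4 luma block, only three of the nine 4x4 intra modes (vertical, horizontal, DC), distortion measured as SSD against the prediction rather than the reconstruction, and made-up values for the neighbouring pixels, the per-mode bit counts and the Lagrange multiplier. Each candidate mode's cost J = D + lambda * R is evaluated in its own worker thread, and the mode with the least cost is kept. The helper intra_rdo_count() shows one way to arrive at the 592 figure: 4 chroma modes x (16 blocks x 9 4x4 luma modes + 4 16x16 luma modes). All function and variable names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical 4x4 luma block and its already-reconstructed neighbours.
BLOCK = [
    [52, 54, 57, 60],
    [52, 54, 57, 60],
    [53, 55, 58, 61],
    [53, 55, 58, 61],
]
TOP = [52, 54, 57, 60]   # row of pixels directly above the block
LEFT = [50, 51, 53, 54]  # column of pixels directly left of the block
LAMBDA = 10.0            # assumed Lagrange multiplier


def predict_vertical(top, left):
    # Every row repeats the pixels above the block.
    return [list(top) for _ in range(4)]


def predict_horizontal(top, left):
    # Every row is filled with the pixel to its left.
    return [[left[r]] * 4 for r in range(4)]


def predict_dc(top, left):
    # Rounded mean of the eight neighbouring pixels.
    dc = (sum(top) + sum(left) + 4) >> 3
    return [[dc] * 4 for _ in range(4)]


# mode name -> (predictor, assumed number of bits to signal the mode)
MODES = {
    "vertical": (predict_vertical, 1),
    "horizontal": (predict_horizontal, 3),
    "dc": (predict_dc, 3),
}


def rd_cost(mode):
    """Return (mode, J) with J = SSD + LAMBDA * rate for one candidate mode."""
    predictor, rate_bits = MODES[mode]
    pred = predictor(TOP, LEFT)
    ssd = sum(
        (BLOCK[r][c] - pred[r][c]) ** 2 for r in range(4) for c in range(4)
    )
    return mode, ssd + LAMBDA * rate_bits


def best_mode(max_workers=3):
    """Evaluate all candidate modes in parallel threads; keep the cheapest."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        costs = list(pool.map(rd_cost, MODES))
    return min(costs, key=lambda mode_cost: mode_cost[1])


def intra_rdo_count(chroma_modes=4, luma4x4_modes=9, luma16x16_modes=4):
    """Mode combinations per macroblock (4x4 and 16x16 luma modes only)."""
    return chroma_modes * (16 * luma4x4_modes + luma16x16_modes)


if __name__ == "__main__":
    print(best_mode())
    print(intra_rdo_count())
```

The candidate modes are independent of each other for a given block, which is why they can be handed to separate threads; dependences between neighbouring blocks are not modelled here. In CPython the global interpreter lock prevents a real speed-up for pure-Python arithmetic, so the sketch only shows how the work is partitioned. An encoder written in C, such as JM, would use native threads on a multicore processor.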