EE 5359 MULTIMEDIA PROCESSING
SPRING 2011

Final Report

IMPLEMENTATION AND ANALYSIS OF DIRECTIONAL DISCRETE COSINE TRANSFORM IN H.264

Under guidance of
DR. K. R. RAO
DEPARTMENT OF ELECTRICAL ENGINEERING
UNIVERSITY OF TEXAS AT ARLINGTON

Presented by
PRIYADARSHINI ANJANAPPA
priyadarshini.anjanappa@mavs.uta.edu

Introduction

Directional edges are a common feature of image blocks. Recognizing this, the video coding standard H.264 / advanced video coding (AVC) [2] defines a number of directional predictions, called intra predictions, for the coding of all intra blocks. However, it is still the conventional discrete cosine transform (DCT) [3] that is applied after each intra prediction.

Conventional DCT [3]

The 2D DCT of a square or rectangular block is used in almost all block-based transform schemes for image and video coding. The conventional 2D DCT is implemented separably through two 1D transforms, one along the vertical direction and the other along the horizontal direction, as shown in Fig. 1. The order of these two steps can be interchanged because the 2D DCT is a separable transform. The conventional DCT appears to be the best choice for image blocks in which vertical and/or horizontal edges dominate.

Fig. 1: 2D DCT implementation: a combination of 1D DCTs along the horizontal and vertical directions

Forward 2D DCT (N x M):

$$X^{C2}(k,l) = \frac{2}{\sqrt{NM}}\, C(k)\, C(l) \sum_{n=0}^{N-1} \sum_{m=0}^{M-1} x(n,m)\, \cos\left[\frac{(2n+1)k\pi}{2N}\right] \cos\left[\frac{(2m+1)l\pi}{2M}\right]$$

$$k = 0, 1, \ldots, N-1, \qquad l = 0, 1, \ldots, M-1$$

$$C(p) = \begin{cases} \dfrac{1}{\sqrt{2}}, & p = 0 \\ 1, & p \neq 0 \end{cases} \qquad (p = k, l)$$

x(n,m): samples in the 2D data domain
X^{C2}(k,l): coefficients in the 2D DCT domain

Inverse 2D DCT (N x M):

$$x(n,m) = \frac{2}{\sqrt{NM}} \sum_{k=0}^{N-1} \sum_{l=0}^{M-1} C(k)\, C(l)\, X^{C2}(k,l)\, \cos\left[\frac{(2n+1)k\pi}{2N}\right] \cos\left[\frac{(2m+1)l\pi}{2M}\right]$$

$$n = 0, 1, \ldots, N-1, \qquad m = 0, 1, \ldots, M-1$$

The transform used by H.264/AVC to process both intra and inter prediction residuals [4] is related to an integer 2D DCT, implemented using 1D DCTs horizontally followed by 1D DCTs vertically. This order can be interchanged. It has been found that coding efficiency can be improved by using directional transforms [1], [6], since the residuals often contain textures that exhibit directional features. A directional DCT (DDCT) framework [1] has been developed which provides a
remarkable coding gain compared with the conventional DCT. The encoder block diagram representing the basic coding structure of H.264/AVC for a macroblock is shown in Fig. 2. The H.264 profiles are illustrated in Fig. 3.

Fig. 2: H.264 encoder block diagram [2]

Fig. 3: Illustration of H.264 profiles [14]

Intra coding in AVC and where DDCT fits in [8]

The H.264 encoder forms a prediction of the current macroblock from the current frame using intra prediction, a spatial prediction technique. Intra prediction is an important technique in image and video compression because it exploits the spatial correlation within one picture. It has 4 prediction modes for 16x16 blocks (shown in Fig. 4), 9 prediction modes for 8x8 blocks and 9 prediction modes for 4x4 blocks (shown in Fig. 5) [4], [8].

a) 16x16 for Luma

Mode 0 (vertical): extrapolation from upper samples (H)
Mode 1 (horizontal): extrapolation from left samples (V)
Mode 2 (DC): mean of upper and left-hand samples (H + V)
Mode 3 (plane): a linear plane function is fitted to the upper and left-hand samples H and V. This works well in areas of smoothly varying luminance.

Fig. 4: 16x16 luma intra prediction modes [5]

b) 4x4 for Luma

Mode 0: Vertical
Mode 1: Horizontal
Mode 2: DC
Mode 3: Diagonal down-left
Mode 4: Diagonal down-right
Mode 5: Vertical-right
Mode 6: Horizontal-down
Mode 7: Vertical-left
Mode 8: Horizontal-up

Fig. 5: 4x4 luma intra prediction modes [5]

A-H: previously coded pixels of the upper macroblock, available at both the encoder and the decoder.
I-L: previously coded pixels of the left macroblock, available at both the encoder and the decoder.
M: previously coded pixel of the upper-left macroblock, available at both the encoder and the decoder.

For each intra prediction mode, an intra prediction algorithm predicts the image content of the current block from its decoded neighbors. The intra prediction errors are transformed using a 4x4 integer DCT. An additional 2x2 Hadamard transform is applied to the four DC coefficients of each chroma component. If a
macroblock is coded in intra 16x16 mode, a similar 4x4 transform is performed on the 4x4 DC coefficients of the luma signal, as shown in Fig. 5a [2]. In this framework, DDCT replaces the AVC transforms with a set of transforms that take into account the prediction mode of the current block. Hence, DDCT provides 9 transforms for 4x4, 9 transforms for 8x8 and 4 transforms for 16x16, although many of them are the same or can be simply inferred from a core transform. For each transform, the DDCT also provides a fixed scanning pattern, based on the quantization parameter (QP) and the intra prediction mode, to replace the zigzag scanning pattern of DCT coefficients in AVC.

Fig. 5a: 4x4 DC coefficients for intra 16x16 mode

Transforms [8]

DDCT provides 9 transforms for 4x4, 9 transforms for 8x8 and 4 transforms for 16x16 [4], [5], [8]. For each intra prediction mode, DDCT consists of two stages:

Stage 1 (along the prediction direction): pixels that align along the prediction direction are grouped together and a 1D DCT is applied. Note that for prediction modes that are neither horizontal nor vertical, the DCTs used are of different sizes.
Stage 2 (across the prediction direction): another stage of DCT is applied to the transform coefficients resulting from the first stage. Again, the DCTs may be of different sizes.

Six directional modes of DDCT are shown in Fig. 6. Stage 1 is illustrated in Fig. 7 and stage 2 is illustrated in Fig. 8. To make the transform sizes more balanced, the DDCT groups pixels in the corners together so that longer DCTs can be used, which are more efficient in terms of compression.

Fig. 6: Six directional modes in DDCT, defined in a similar way as in H.264, for the block size 8x8 [1]. The vertical and horizontal modes are not included here.

Fig. 7: NxN image block in which the 1D DCT is applied along the diagonal down-left direction (mode 3) [1]

Fig. 8: Arrangement of coefficients after the first 1D DCT, followed by rearrangement of coefficients after the second DCT, as well as the modified zigzag scanning [1]

Illustration of mode 3 DDCT

The mode 3 implementation of DDCT is illustrated in Fig. 9.

Fig. 9: Implementation of mode 3 DDCT

Procedure to obtain basis images

After step 3 in Fig. 9, for each basis image, repeat step 4 in Fig. 9 with the corresponding coefficient set to 1 and the remaining coefficients set to 0. For a 4x4 block, 16 basis images are obtained. The same procedure is applied for all the other DDCT modes. The basis images for the 4x4 mode 3 DDCT are shown in Fig. 11. The basis images for mode 0/1 DDCT, mode 3 DDCT and
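
The basis-image procedure described above (one coefficient set to 1, all others set to 0, then the inverse transform) can be sketched in a few lines. This is only a minimal illustration under stated assumptions: the conventional separable 2D DCT from the Conventional DCT section stands in for the directional transform of Fig. 9, which is not reproduced here, and the names `dct_matrix`, `inverse_dct_2d` and `basis_images` are hypothetical helpers, not part of the report or of any H.264 reference software.

```python
import math


def dct_matrix(n):
    """Orthonormal 1D DCT-II matrix; row k is the k-th cosine basis vector."""
    rows = []
    for k in range(n):
        c = math.sqrt(0.5) if k == 0 else 1.0  # C(p) from the 2D DCT definition
        rows.append([
            math.sqrt(2.0 / n) * c * math.cos((2 * j + 1) * k * math.pi / (2 * n))
            for j in range(n)
        ])
    return rows


def inverse_dct_2d(coeffs):
    """Separable inverse 2D DCT of an N x M coefficient block (assumed stand-in)."""
    n, m = len(coeffs), len(coeffs[0])
    a_n, a_m = dct_matrix(n), dct_matrix(m)
    # x(i, j) = sum over k, l of A_N[k][i] * X(k, l) * A_M[l][j]
    return [
        [
            sum(a_n[k][i] * coeffs[k][l] * a_m[l][j]
                for k in range(n) for l in range(m))
            for j in range(m)
        ]
        for i in range(n)
    ]


def basis_images(n, m, inverse_transform=inverse_dct_2d):
    """Return {(k, l): image}, one basis image per coefficient position."""
    images = {}
    for k in range(n):
        for l in range(m):
            coeffs = [[0.0] * m for _ in range(n)]
            coeffs[k][l] = 1.0  # unit coefficient, all others zero
            images[(k, l)] = inverse_transform(coeffs)
    return images


# A 4x4 block yields 16 basis images, as stated in the text.
images = basis_images(4, 4)
```

Passing a different `inverse_transform` (for example, an inverse of a directional transform, if one is available) would reuse the same loop for the other modes; that substitution is an assumption of this sketch rather than something the report specifies.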