Gaurav Hansda 1000721849 [email protected]
Fast Decision of Block Size, Prediction Mode and Intra Block for H.264 Intra Prediction
EE 5359

Outline
• Introduction to H.264
• Current algorithms for intra prediction
• Proposed algorithms
• Implementation results
• Conclusions

Overview of H.264
• H.264 is an industry standard. It defines a format for compressed video data.
• It provides a set of tools that can be used in a variety of ways to compress and communicate visual information.

Purpose of a standard
• Define a coded representation (or syntax) that describes visual data in a compressed form, and a method of decoding the syntax to reconstruct visual information.
• Compliant encoders and decoders can successfully interoperate with each other.

H.264 Profiles
Fig. 1. H.264 profiles [2] (Baseline, Main, and Extended profiles; tools shown in the figure: I frames, P frames, CAVLC, B frames, Interlace, CABAC, SP frames, SI frames, FMO, Redundant slices)

Applications
• Broadcast television
• Streaming video
• Video storage and playback
• Videoconferencing
• Mobile video
• Studio distribution

Coding Process
• The image is divided into macroblocks (16x16 pixels).
• The macroblocks are grouped into slice groups, which are divided into slices.
• Each slice is coded as an I, P, or B slice (there are also types called SI and SP).
• In an I slice, all blocks are coded as I blocks.
• In a P slice, blocks are coded as I or P blocks.
• In a B slice, blocks are coded as I, P, or B blocks.

Fig. 2. Typical H.264 encoder [18]
Fig. 3. Typical H.264 decoder [18]

Mode decision of H.264 encoder
Fig. 4. Mode decision hierarchy of an H.264 compliant encoder. [4]

Implication of Hierarchical Structure
1. To ensure the correctness of the decision at the upper layer.
2. To ensure early termination is executed accurately and as early as possible.
Most fast mode decision algorithms developed so far deal with only a single stage of the mode decision hierarchy [5]-[14] and therefore fail to achieve the best possible complexity reduction.

Intra-Prediction
• There are 3 macroblock (MB) modes for intra prediction of luma pixels: intra4x4 (I4MB, also written Intra4MB), intra8x8 (I8MB, Intra8MB), and intra16x16 (I16MB, Intra16MB).
• Intra4MB and Intra8MB have 9 prediction modes, as shown in Fig. 5(a).
• Intra16MB has only 4 prediction modes, as shown in Fig. 5(b).
Fig. 5. Prediction modes for (a) Intra4MB and (b) Intra16MB. [4]
Fig. 6. Prediction flow diagram [18]
Fig. 7. Intra-prediction [18]

Mode Decision
• To achieve a better tradeoff between bit rate and distortion, the H.264 encoder adopts the rate-distortion (R-D) optimization framework and the Lagrangian technique for mode decision [2].
• For intra frames, the best prediction mode of a block is the mode that, among all prediction modes of the block, gives the minimum R-D cost.
• The R-D cost of an MB mode is the sum of the minimum R-D costs of its individual blocks.

Proposed Algorithm for Block Size Decision
• Block size is highly correlated with texture complexity.
• The variance of a block corresponds to the total energy of the AC coefficients of the block, hence it is a good measure of texture complexity.
• Thus variance-based classification of texture complexity is used [16].
• If the variance is above the threshold, Intra4MB and Intra8MB are selected; otherwise, Intra8MB and Intra16MB are chosen.
• This is a simple way to skip the examination of Intra4MB for smooth MBs and of Intra16MB for highly textured MBs.
Fig. 8. Variance-based MB mode decision [4]

Improved Prediction Mode Decision
• Earlier algorithms consider only the edge information of the current block.
• The correlation between blocks is not considered.
• Hence the Most Probable Mode (MPM) is used.
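The variance-based block size decision and the minimum R-D cost rule described above can be illustrated with a short Python sketch. It is not the JM code or the implementation from [4]. The function names, the threshold value passed in by the caller, and the sample distortion and rate numbers are assumptions made for illustration only. The slides do not state the threshold, and the cost is written in the usual Lagrangian form J = D + lambda * R.

```python
from statistics import pvariance


def candidate_mb_modes(luma_mb, threshold):
    """Pick candidate intra MB modes from the variance of a luma macroblock.

    luma_mb: list of rows of luma samples (16 rows of 16 values for one MB).
    threshold: hypothetical variance threshold; its value is not given in the slides.
    """
    pixels = [p for row in luma_mb for p in row]
    variance = pvariance(pixels)  # population variance of all samples in the MB
    if variance > threshold:
        # Complex texture: examine only the smaller block sizes.
        return ["Intra4MB", "Intra8MB"]
    # Smooth texture: skip Intra4MB.
    return ["Intra8MB", "Intra16MB"]


def rd_cost(distortion, rate_bits, lam):
    """Lagrangian R-D cost J = D + lambda * R."""
    return distortion + lam * rate_bits


def best_prediction_mode(mode_stats, lam):
    """Return the mode with minimum R-D cost.

    mode_stats: dict mapping a mode name to (distortion, rate_bits);
    the numbers are supplied by the caller and are illustrative here.
    """
    return min(mode_stats, key=lambda m: rd_cost(*mode_stats[m], lam))


if __name__ == "__main__":
    flat_mb = [[128] * 16 for _ in range(16)]
    print(candidate_mb_modes(flat_mb, threshold=500.0))
    print(best_prediction_mode({"vertical": (100.0, 10), "dc": (90.0, 30)}, lam=1.0))
```

A flat macroblock has zero variance and so falls in the Intra8MB/Intra16MB branch for any positive threshold, while a high-contrast checkerboard falls in the Intra4MB/Intra8MB branch.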
• The MPM, which takes advantage of the spatial correlation of the prediction modes between the neighboring blocks and the current block for coding, is defined as the prediction mode of the left or the upper neighbor, whichever has the smaller prediction mode number.

Improved Prediction Mode Decision
• Each original image block is evenly divided into four subblocks first.
• Each subblock is represented by the average pixel magnitude of its pixels.
Fig. 9. Formation of subsampled block for a block of (a) Intra4MB, (b) Intra8MB, and (c) Intra16MB [4]
• Apply the filters shown in Fig. 10.
• Determine the dominant edge.
Fig. 10. Five sets of filter coefficients for dominant edge detection. [4]

1. Input 2x2 subsampled block
2. Pass through the filters separately
3. Determine the dominant edge
4. Choose the candidate modes
Fig. 11. Prediction mode decision [4]

Proposed algorithm for Intra Block Decision
• Intra block decision, for inter frames, occupies a considerable percentage of the total computations of inter-frame coding.
• Intra16MB takes much less computation time than the other modes.
• Hence, scaled R-D cost is used [4].
• An MB is less probable to be intra coded if the R-D cost difference between best inter mode and Intra16MB is small.
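As a rough illustration of two steps from the prediction mode decision described earlier, the following Python sketch computes the MPM from the two neighbors and forms the 2x2 subsampled block by averaging the four subblocks. The function names are hypothetical. The fallback to DC (mode 2) when a neighbor is unavailable follows the H.264 standard and is not stated in the slides. The filter coefficients of Fig. 10 are not reproduced here, so the edge detection step is left out.

```python
DC_MODE = 2  # H.264 mode number for DC prediction (Intra4x4/Intra8x8)


def most_probable_mode(left_mode, upper_mode):
    """Return the MPM: the smaller mode number of the left and upper neighbors.

    None marks an unavailable neighbor; in that case DC is used
    (behaviour taken from the standard, an assumption relative to the slides).
    """
    if left_mode is None or upper_mode is None:
        return DC_MODE
    return min(left_mode, upper_mode)


def subsample_2x2(block):
    """Average each quadrant of a square block (side 4, 8, or 16).

    Returns a 2x2 list of the mean sample value of each subblock.
    """
    half = len(block) // 2

    def quadrant_mean(row0, col0):
        total = sum(
            block[r][c]
            for r in range(row0, row0 + half)
            for c in range(col0, col0 + half)
        )
        return total / (half * half)

    return [
        [quadrant_mean(0, 0), quadrant_mean(0, half)],
        [quadrant_mean(half, 0), quadrant_mean(half, half)],
    ]


if __name__ == "__main__":
    # 4x4 block with a vertical edge between the left and right halves.
    block = [[10, 10, 30, 30] for _ in range(4)]
    print(subsample_2x2(block))
    print(most_probable_mode(left_mode=0, upper_mode=1))
```

The subsampled block keeps the left/right contrast of the vertical edge, which is the kind of structure the dominant edge filters are then applied to.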
• Denoting the scaled R-D cost difference between Intra16MB and the inter MB mode by dĴ, and based on the above observation, if dĴ is small, both I4MB and I8MB can be skipped.

Joint Model (JM)
• Reference implementation standardized in ISO/IEC JTC1/SC29/WG11
• Decoder
  • Implements almost all the features
• Encoder
  • Exercises most of the important coding tools
  • Provides an elaborate list of control parameters
  • Offers a rate-distortion optimized implementation
  • Offers several fast computation options
• Serves as a reference for the best quality possible using H.264
• Good description of the reference algorithms exists
• Currently at v17.2
• Can be downloaded from http://iphome.hhi.de/suehring/tml/download/

Sequences used
• Hall (CIF and QCIF)
• Container (CIF and QCIF)
• Mobile (CIF)

JM software implementation
The JM reference software version 17.2 is used [17]. The conditions of the experiment are as follows.
1. Run on a PC with an Intel Core i3 2.27 GHz processor and 3.00 GB RAM.
2. Set the QP values to 16, 20, 24, and 28.
3. Number of frames to be coded = 100.
4. Enable the R-D optimization.
5. Choose context-adaptive binary arithmetic coding (CABAC) as the entropy coding method.

Performance
QP   PSNR (dB)   Bit rate (kbits/s)   Time (s)   SSIM
28   38.685      697.64               20.022     0.9766
24   41.378      961.59               30.875     0.9831
20   44.118      1359.14              34.094     0.9883
16   47.085      1930.5               37.767     0.9934
Table I. Performance of