FALL 2009
EE 5359 MULTIMEDIA PROCESSING
PROJECT REPORT

RATE DISTORTION OPTIMIZATION USING SSIM IN H.264 I-FRAME ENCODER

INSTRUCTOR: DR. K. R. RAO

Babu Hemanth Kumar Aswathappa
Department of Electrical Engineering
University of Texas at Arlington
Email: babuhemanthkumar.aswathappa@mavs.uta.edu

List of acronyms

AVC: Advanced Video Coding
CABAC: Context Adaptive Binary Arithmetic Coding
CAVLC: Context Adaptive Variable Length Coding
D: Distortion
HDTV: High Definition Television
HVS: Human Visual System
I: Intra frame
ITU: International Telecommunication Union
JPEG: Joint Photographic Experts Group
JVT: Joint Video Team
MPEG: Moving Picture Experts Group
MSE: Mean Squared Error
MSSIM: Mean Structural Similarity Index Measurement
PSNR: Peak-to-peak signal-to-noise ratio
QP: Quantization Parameter
RD: Rate Distortion
SDTV: Standard Definition Television
SSD: Sum of Squared Differences
SSIM: Structural Similarity Index Measurement

ABSTRACT

In the rate distortion optimization for the H.264 I-frame encoder [1], the distortion (D) is measured as the sum of the squared differences (SSD) between the reconstructed and the original images, which is the same as the MSE up to a constant scale factor. Although peak-to-peak signal-to-noise ratio (PSNR) and MSE are currently the most widely used objective metrics due to their low complexity and clear physical meaning, they have long been criticized for correlating poorly with the human visual system (HVS) [2]. Over the past several decades, a great deal of effort has gone into developing new image quality assessment methods based on the error sensitivity theory of the HVS, but only limited success has been achieved because the HVS is not yet well understood [2]. Recently, a new philosophy for image quality measurement was proposed, based on the assumption that the human visual system is highly adapted to extract structural information from the viewing field. It follows that a measure of structural information change can provide a good approximation to perceived image distortion. In this new theory, an
item called the structural similarity index (SSIM), comprising three comparisons, is introduced to measure the structural information change. Experiments have shown that the SSIM index is easy to implement and corresponds better with human perceived quality than PSNR or MSE [4, 8, 9].

The main idea of this project is to employ SSIM in the rate distortion optimization of the H.264 I-frame encoder to choose the best prediction mode(s). The required modifications will be made to the JVT reference software JM92 [3]. Results, in terms of the size of the compressed image and the SSIM of the whole reconstructed image, will be compared between the H.264 JM92 software and the new method.

INTRODUCTION

MSE: MEAN SQUARED ERROR

MSE is a signal fidelity measure. The goal of a signal fidelity measure is to compare two signals by providing a quantitative score that describes the degree of similarity (fidelity) or, conversely, the level of error (distortion) between them. Usually, it is assumed that one of the signals is a pristine original, while the other is distorted or contaminated by errors.

Suppose that $x = \{x_i \mid i = 1, 2, \ldots, N\}$ and $y = \{y_i \mid i = 1, 2, \ldots, N\}$ are two finite-length discrete signals (e.g., visual images), where $N$ is the number of signal samples (pixels, if the signals are images) and $x_i$ and $y_i$ are the values of the $i$-th samples in $x$ and $y$, respectively. The MSE between the signals $x$ and $y$ is

\[ \mathrm{MSE}(x, y) = \frac{1}{N} \sum_{i=1}^{N} (x_i - y_i)^2 \tag{1} \]

In the MSE, we will often refer to the error signal $e_i = x_i - y_i$, which is the difference between the original and distorted signals. If one of the signals is an original signal of acceptable (or perhaps pristine) quality, and the other is a distorted version of it whose quality is being evaluated, then the MSE may also be regarded as a measure of signal quality. A more general form is the $l_p$ norm:

\[ d_p(x, y) = \left( \sum_{i=1}^{N} |e_i|^p \right)^{1/p} \tag{2} \]

MSE is often converted into a peak-to-peak signal-to-noise ratio (PSNR) measure:

\[ \mathrm{PSNR} = 10 \log_{10} \frac{L^2}{\mathrm{MSE}} \tag{3} \]

where $L$ is the dynamic range of allowable image pixel intensities. For example, for images that have allocations of 8 bits/pixel of gray scale, $L = 2^8 - 1 = 255$. The PSNR is useful if images having different dynamic ranges are being compared, but otherwise contains no new information relative to the MSE.

WHY MSE? [2]

The MSE has many attractive features:

1. It is simple. It is parameter free and inexpensive to compute, with a complexity of only one multiply and two additions per sample. It is also memoryless: the squared error can be evaluated at each sample, independent of other samples.

2. It has a clear physical meaning: it is the natural way to define the energy of the error signal. Such an energy measure is preserved after any orthogonal (or unitary) linear transformation, such as the Fourier transform (Parseval's theorem). The energy-preserving property guarantees that the energy of a signal distortion in the transform domain is the same as in the signal domain.

3. The MSE is an excellent metric in the context of optimization. Minimum-MSE (MMSE) optimization problems often have closed-form analytical solutions, and when they do not, iterative numerical optimization procedures are often easy to formulate, since the gradient and the Hessian matrix [2] of the MSE are easy to compute.

4. MSE is widely used simply because it is a convention. Historically, it has been employed extensively for optimizing and assessing a wide variety of signal processing applications, including filter design, signal compression, restoration, denoising, reconstruction, and classification. Moreover, throughout the literature, competing algorithms have most often been compared using the MSE/PSNR. It therefore provides a convenient and extensive standard against which the MSE/PSNR results of new algorithms may be compared. This saves time and effort, but further propagates the use of the MSE.

WHAT IS WRONG WITH MSE? [2]

It is apparent that the MSE possesses many favorable properties for application and analysis, but the reader might point out that a more fundamental issue has been missing. That is, does the MSE really measure signal fidelity? Given all of its
attractive features, a signal processing practitioner might opt for the MSE if it proved to be a reasonable signal fidelity measure. But is that the case? Unfortunately, the converse appears true when the MSE is used to predict human perception of image fidelity and quality. An illustrative example is shown in Figure 1 [2], where an original Einstein image is altered by different types of distortion: a contrast stretch, mean luminance shift, contamination by additive white Gaussian noise, impulsive noise distortion
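
As a minimal sketch of the MSE and PSNR definitions given in the introduction, the Python fragment below evaluates both measures for two short sample sequences and numerically checks the energy-preserving property (Parseval's theorem) with a hand-written unitary DFT. The pixel values, the function names (`mse`, `psnr`, `unitary_dft`), and the `bits` parameter are assumptions made for illustration only. They are not taken from the JM92 software or from this project's implementation.

```python
import cmath
import math


def mse(x, y):
    """Mean squared error between two equal-length sample sequences."""
    if len(x) != len(y) or not x:
        raise ValueError("signals must be non-empty and of equal length")
    return sum((xi - yi) ** 2 for xi, yi in zip(x, y)) / len(x)


def psnr(x, y, bits=8):
    """PSNR in dB, using L = 2**bits - 1 as the dynamic range."""
    err = mse(x, y)
    if err == 0:
        return math.inf  # identical signals: no error energy
    peak = 2 ** bits - 1
    return 10 * math.log10(peak ** 2 / err)


def unitary_dft(signal):
    """Naive O(N^2) DFT scaled by 1/sqrt(N) so that the transform is unitary."""
    n = len(signal)
    scale = 1 / math.sqrt(n)
    return [
        scale * sum(s * cmath.exp(-2j * cmath.pi * k * i / n)
                    for i, s in enumerate(signal))
        for k in range(n)
    ]


# Hypothetical 8-bit pixel values, chosen only for illustration.
original = [52, 55, 61, 66, 70, 61, 64, 73]
distorted = [54, 55, 60, 68, 70, 59, 64, 75]

# Error signal e_i = x_i - y_i and its energy in both domains.
error = [a - b for a, b in zip(original, distorted)]
energy_signal_domain = sum(e ** 2 for e in error)
energy_transform_domain = sum(abs(c) ** 2 for c in unitary_dft(error))

print("MSE :", mse(original, distorted))
print("PSNR:", round(psnr(original, distorted), 2), "dB")
print("error energy (signal domain)   :", energy_signal_domain)
print("error energy (transform domain):", round(energy_transform_domain, 6))
```

The two energy values agree up to floating-point rounding, which is the property that lets the distortion be evaluated in either the signal or the transform domain. The sketch uses a naive DFT to stay dependency-free; a practical implementation would more likely use an FFT library.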