UT Arlington EE 5359 - High-Performance Video Coding

Efficient Large Size Transforms for High-Performance Video Coding

Rajan Joshi, Yuriy A. Reznik*, and Marta Karczewicz
Qualcomm Inc, 5775 Morehouse Drive, San Diego, CA, USA 92121

ABSTRACT

This paper describes the design of transforms for extended block sizes for video coding. The proposed transforms are orthogonal integer transforms, based on a simple recursive factorization structure, and allow very compact and efficient implementations. We discuss techniques used for finding integer and scale factors in these transforms, and describe our final design. We evaluate the efficiency of our proposed transforms in VCEG's H.265/JMKTA framework, and show that they achieve nearly identical performance compared to much more complex transforms in the current test model.

Keywords: discrete cosine transform, DCT, factorization, multiplicative complexity, scaled transform, video coding, H.264, H.265, HEVC.

1. INTRODUCTION

The Discrete Cosine Transform of type II (DCT-II) [1-3] is a fundamental operation performed by the majority of today's image and video compression algorithms. It was first suggested by N. Ahmed, T. Natarajan, and K. R. Rao [1], and subsequent research provided a number of theoretical arguments for its use, such as its energy compaction property and the asymptotic equivalence of the DCT to the Karhunen-Loève transform for signals produced by a Markov-1 process with a high correlation coefficient (see, e.g., [3, Chapter 3] for a survey of the related results).

DCT-II of size 8 has served as the transform of choice in the H.261, JPEG, MPEG-1, MPEG-2, H.263, and MPEG-4 visual standards [2,4-9]. More recent standards, such as MPEG-4 AVC | H.264 [10], VC-1 [11], and AVS [12], have adopted integer approximations of DCT-II with transform sizes 4, 8, and 16. The emerging JPEG-XR image compression standard [13] uses overlapping transforms, which are also based on 4-point DCT-II kernels.
A new emerging standard, High Efficiency Video Coding (HEVC), currently under development by the Joint Collaborative Team of video experts from MPEG and ITU-T SG16 (JCT-VC) [14], includes a number of integer transforms of sizes ranging from 4 to 64. As video resolutions keep increasing, it is possible that even larger transforms will be considered in the future.

In this paper we describe the design of scaled integer transforms, which are numerically stable, fully recursive in structure, and remain orthogonal with perfect scaling (in the absence of quantization). As such, they are well suited for use in future video coding applications. The described transforms have been proposed to the ITU-T SG16 Q6 (VCEG) standardization committee [21], and were also included in Qualcomm's response to the JCT-VC call for proposals [22].

This paper is organized as follows. In Section 2, we describe the design of the underlying factorization that we use in the transform. Section 3 discusses conversion to integer arithmetic and other implementation aspects. Section 4 provides experimental results obtained using this transform in the ITU-T SG16 Q6 JMKTA video coding model. Conclusions are drawn in Section 5.

* Corresponding author. Email: [email protected], Phone: 858-658-1866.

2. FACTORIZATION

Let $\{x_n\}$, $n = 0, \ldots, N-1$, be a sequence of input samples (i.e., a line of pixel values). DCT-II and its inverse transform over this sequence are defined as follows:

$$X^{II}_k = \sqrt{\frac{2}{N}}\,\lambda(k) \sum_{n=0}^{N-1} x_n \cos\left(\frac{(2n+1)k\pi}{2N}\right), \quad k = 0, \ldots, N-1,$$

$$x_n = \sqrt{\frac{2}{N}} \sum_{k=0}^{N-1} \lambda(k)\, X^{II}_k \cos\left(\frac{(2n+1)k\pi}{2N}\right), \quad n = 0, \ldots, N-1,$$

where $\lambda(k) = 1/\sqrt{2}$ if $k = 0$, and 1 otherwise.
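The DCT-II pair defined above can be checked numerically with a direct O(N^2) evaluation. The following is a minimal sketch, not the fast algorithm developed in this paper: it assumes the orthonormal scaling sqrt(2/N) with lambda(0) = 1/sqrt(2), and the function names (`lam`, `dct2`, `idct2`) and the sample values in `pixels` are illustrative choices, not taken from the paper.

```python
import math


def lam(k):
    """Normalization factor lambda(k): 1/sqrt(2) for k = 0, otherwise 1."""
    return 1.0 / math.sqrt(2.0) if k == 0 else 1.0


def dct2(x):
    """Forward DCT-II, evaluated directly from the definition (O(N^2))."""
    size = len(x)
    scale = math.sqrt(2.0 / size)
    return [
        scale * lam(k) * sum(
            x[n] * math.cos((2 * n + 1) * k * math.pi / (2 * size))
            for n in range(size)
        )
        for k in range(size)
    ]


def idct2(coeffs):
    """Inverse DCT-II, evaluated directly from the definition (O(N^2))."""
    size = len(coeffs)
    scale = math.sqrt(2.0 / size)
    return [
        scale * sum(
            lam(k) * coeffs[k] * math.cos((2 * n + 1) * k * math.pi / (2 * size))
            for k in range(size)
        )
        for n in range(size)
    ]


# Hypothetical line of 8 pixel values, used only for illustration.
pixels = [52.0, 55.0, 61.0, 66.0, 70.0, 61.0, 64.0, 73.0]
coeffs = dct2(pixels)
restored = idct2(coeffs)

# Largest reconstruction error; it should be at floating-point noise level.
max_error = max(abs(a - b) for a, b in zip(pixels, restored))
print("max reconstruction error:", max_error)
```

With this scaling the transform is orthonormal, so the round trip restores the input and the sum of squared coefficients equals the sum of squared samples, which is the property the integer designs in this paper aim to preserve.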
The DCT-IV transform and its inverse are defined as follows:

$$X^{IV}_k = \sqrt{\frac{2}{N}} \sum_{n=0}^{N-1} x_n \cos\left(\frac{\pi}{4N}(2n+1)(2k+1)\right), \quad k = 0, \ldots, N-1,$$

$$x_n = \sqrt{\frac{2}{N}} \sum_{k=0}^{N-1} X^{IV}_k \cos\left(\frac{\pi}{4N}(2n+1)(2k+1)\right), \quad n = 0, \ldots, N-1.$$

We note that in video coding, we usually work with N×N matrices of input data, and so the above transforms need to be applied to all rows and columns in a separable fashion to produce the corresponding matrices of transform coefficients [2]. Hereafter we adopt this separable model in our design of integer 2D transforms, and focus mainly on speeding up computation of the component 1D transforms†.

† While faster non-separable 2D designs can possibly be created [3], their large expanded structure and the need to support multiple (including hybrid, e.g. 8x16) block sizes make this approach much less appealing in practice.

For further convenience, we omit the normalization factors ($\sqrt{2/N}$ and $\lambda(k)$) and define the matrices:

$$C^{II}_N(n,k) = \cos\left(\frac{(2n+1)k\pi}{2N}\right), \quad n, k = 0, \ldots, N-1,$$

$$C^{IV}_N(n,k) = \cos\left(\frac{(2n+1)(2k+1)\pi}{4N}\right), \quad n, k = 0, \ldots, N-1,$$

representing coefficients of the DCT-II and DCT-IV transforms, respectively.

It is well known that an even-sized DCT-II matrix can be factored into a product containing a direct sum of smaller DCT-II and DCT-IV matrices as follows [2,3]:

$$C^{II}_N = P_N \begin{pmatrix} C^{II}_{N/2} & 0 \\ 0 & C^{IV}_{N/2} J_{N/2} \end{pmatrix} \begin{pmatrix} I_{N/2} & J_{N/2} \\ J_{N/2} & -I_{N/2} \end{pmatrix}, \tag{1}$$

where $P_N$ is a permutation matrix producing the reordering

$$x'_i = x_{2i}, \quad x'_{N/2+i} = x_{2i+1}, \quad i = 0, 1, \ldots, N/2-1, \tag{2}$$

and where $I_{N/2}$ and $J_{N/2}$ denote the N/2×N/2 identity and order-reversal matrices, respectively. Chen-Smith-Fralick [15], Wang, and many other well-known DCT-II factorizations [2,3] rely on factorization (1) as a basic step in their decimation process.

We next apply the following decomposition to the DCT-IV block in (1):

$$C^{IV}_N = P_N^T \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & I_{N/2-1} & I_{N/2-1} & 0 \\ 0 & I_{N/2-1} & -I_{N/2-1} & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix} \begin{pmatrix} I_{N/2} & 0 \\ 0 & E_{N/2} J_{N/2} \end{pmatrix} \begin{pmatrix} C^{II}_{N/2} & 0 \\ 0 & C^{II}_{N/2} \end{pmatrix} \begin{pmatrix} I_{N/2} & 0 \\ 0 & E_{N/2} \end{pmatrix} R_N, \tag{3}$$

where $P_N$ is the reordering matrix (2), $E_{N/2}$ is the diagonal sign-alteration matrix

$$E_{N/2} = \mathrm{diag}\left\{(-1)^k\right\}, \quad k = 0, 1, \ldots, N/2-1, \tag{4}$$

$R_N$ is the matrix of Givens rotations

$$R_N = \begin{pmatrix}
\cos\frac{\pi}{4N} & & & & & & & \sin\frac{\pi}{4N} \\
 & \cos\frac{3\pi}{4N} & & & & & \sin\frac{3\pi}{4N} & \\
 & & \ddots & & & \cdot^{\cdot^{\cdot}} & & \\
 & & & \cos\frac{(N-1)\pi}{4N} & \sin\frac{(N-1)\pi}{4N} & & & \\
 & & & -\sin\frac{(N-1)\pi}{4N} & \cos\frac{(N-1)\pi}{4N} & & & \\
 & & \cdot^{\cdot^{\cdot}} & & & \ddots & & \\
 & -\sin\frac{3\pi}{4N} & & & & & \cos\frac{3\pi}{4N} & \\
-\sin\frac{\pi}{4N} & & & & & & & \cos\frac{\pi}{4N}
\end{pmatrix}, \tag{5}$$

and $C^{II}_{N/2}$ denotes the matrices of the remaining half-sized DCT-II transforms.

This decomposition of DCT-IV is very similar to the one derived by G. Plonka and M. Tasche [17]. In our formulation (3) all
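The even/odd split behind factorization (1) can be checked numerically. The sketch below is an illustration under stated assumptions, not the implementation used in this paper: it assumes the unnormalized transforms (no sqrt(2/N) or lambda(k) factors), treats the frequency index k as the output index, and replaces the butterfly and permutation matrices with explicit sums, differences, and interleaving. All function names are hypothetical.

```python
import math


def dct2_unscaled(x):
    """Unnormalized DCT-II: X[k] = sum_n x[n] * cos((2n+1) k pi / (2N))."""
    size = len(x)
    return [
        sum(x[n] * math.cos((2 * n + 1) * k * math.pi / (2 * size))
            for n in range(size))
        for k in range(size)
    ]


def dct4_unscaled(x):
    """Unnormalized DCT-IV: X[k] = sum_n x[n] * cos((2n+1)(2k+1) pi / (4N))."""
    size = len(x)
    return [
        sum(x[n] * math.cos((2 * n + 1) * (2 * k + 1) * math.pi / (4 * size))
            for n in range(size))
        for k in range(size)
    ]


def dct2_via_split(x):
    """One decimation step: half-size DCT-II on sums, half-size DCT-IV on differences."""
    size = len(x)
    half = size // 2
    # Butterfly stage: pair each sample with its mirror image.
    sums = [x[n] + x[size - 1 - n] for n in range(half)]
    diffs = [x[n] - x[size - 1 - n] for n in range(half)]
    out = [0.0] * size
    out[0::2] = dct2_unscaled(sums)   # even-indexed coefficients
    out[1::2] = dct4_unscaled(diffs)  # odd-indexed coefficients
    return out


# Hypothetical input of even length, used only for illustration.
samples = [52.0, 55.0, 61.0, 66.0, 70.0, 61.0, 64.0, 73.0]
direct = dct2_unscaled(samples)
fast = dct2_via_split(samples)
print("max difference:", max(abs(a - b) for a, b in zip(direct, fast)))
```

Applying the same split again to the half-size DCT-II, and a DCT-IV decomposition to the other half, is what gives the fully recursive structure described in this section.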

