UT Arlington EE 5359 - High-Performance Video Coding - D128492

School name: University of Texas at Arlington

Course: EE 5359 - Topics in Signal Processing

Pages: 7

This preview shows page 1-2 out of 7 pages.

Efficient Large Size Transforms for High-Performance Video Coding

Rajan Joshi, Yuriy A. Reznik*, and Marta Karczewicz
Qualcomm Inc, 5775 Morehouse Drive, San Diego, CA, USA 92121

ABSTRACT

This paper describes the design of transforms for extended block sizes for video coding. The proposed transforms are orthogonal integer transforms, based on a simple recursive factorization structure, and allow very compact and efficient implementations. We discuss the techniques used for finding integer and scale factors in these transforms, and describe our final design. We evaluate the efficiency of our proposed transforms in VCEG's H.265/JMKTA framework, and show that they achieve nearly identical performance compared to the much more complex transforms in the current test model.

Keywords: discrete cosine transform, DCT, factorization, multiplicative complexity, scaled transform, video coding, H.264, H.265, HEVC.

1. INTRODUCTION

The Discrete Cosine Transform of type II (DCT-II) [1-3] is a fundamental operation performed by the majority of today's image and video compression algorithms. It was first suggested by N. Ahmed, T. Natarajan, and K. R. Rao [1], and subsequent research provided a number of theoretical arguments for its use, such as its energy compaction property and the asymptotic equivalence of the DCT to the Karhunen-Loève transform for signals produced by a Markov-1 process with a high correlation coefficient (see, e.g., [3, Chapter 3] for a survey of the related results).

The DCT-II of size 8 has served as the transform of choice in the H.261, JPEG, MPEG-1, MPEG-2, H.263, and MPEG-4 Visual standards [2,4-9]. More recent standards, such as MPEG-4 AVC | H.264 [10], VC-1 [11], and AVS [12], have adopted integer approximations of the DCT-II with transform sizes 4, 8, and 16. The emerging JPEG-XR image compression standard [13] uses overlapping transforms, which are also based on 4-point DCT-II kernels.
A new emerging standard, High Efficiency Video Coding (HEVC), currently under development by the Joint Collaborative Team of video experts from MPEG and ITU-T SG16 (JCT-VC) [14], includes a number of integer transforms of sizes ranging from 4 to 64. As video resolutions keep increasing, even larger transforms may be considered in the future.

In this paper we describe the design of scaled integer transforms that are numerically stable, fully recursive in structure, and remain orthogonal with perfect scaling (in the absence of quantization). As such, they are well suited for use in future video coding applications. The described transforms have been proposed to the ITU-T SG16 Q6 (VCEG) standardization committee [21], and were also included in Qualcomm's response to the JCT-VC call for proposals [22].

This paper is organized as follows. In Section 2, we describe the design of the underlying factorization that we use in the transform. Section 3 discusses conversion to integer arithmetic and other implementation aspects. Section 4 provides experimental results obtained using this transform in the ITU-T SG16 Q6 JMKTA video coding model. Conclusions are drawn in Section 5.

* Corresponding author. Email: [email protected], Phone: 858-658-1866.

2. FACTORIZATION

Let $\{x_n\}$, $n = 0, \ldots, N-1$, be a sequence of input samples (i.e., a line of pixel values). The DCT-II and its inverse transform over this sequence are defined as follows:

$$X^{II}_k = \sqrt{\frac{2}{N}}\,\lambda(k) \sum_{n=0}^{N-1} x_n \cos\left(\frac{(2n+1)k\pi}{2N}\right), \quad k = 0, \ldots, N-1,$$

$$x_n = \sqrt{\frac{2}{N}} \sum_{k=0}^{N-1} X^{II}_k\,\lambda(k) \cos\left(\frac{(2n+1)k\pi}{2N}\right), \quad n = 0, \ldots, N-1,$$

where $\lambda(k) = 1/\sqrt{2}$ if $k = 0$, and 1 otherwise.
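As a quick numerical check of the DCT-II pair defined above, the following Python sketch evaluates both sums directly and confirms that the inverse recovers the input. It assumes the orthonormal $\sqrt{2/N}$ scaling; the function names `lam`, `dct2`, and `idct2` and the sample values are illustrative and do not come from the paper.

```python
import math

def lam(k):
    """Normalization term lambda(k): 1/sqrt(2) for k = 0, otherwise 1."""
    return 1.0 / math.sqrt(2.0) if k == 0 else 1.0

def dct2(x):
    """Direct O(N^2) evaluation of the forward DCT-II sum.

    Assumes the orthonormal sqrt(2/N) scaling.
    """
    N = len(x)
    s = math.sqrt(2.0 / N)
    return [
        s * lam(k) * sum(x[n] * math.cos((2 * n + 1) * k * math.pi / (2 * N))
                         for n in range(N))
        for k in range(N)
    ]

def idct2(X):
    """Direct O(N^2) evaluation of the inverse DCT-II sum (same scaling)."""
    N = len(X)
    s = math.sqrt(2.0 / N)
    return [
        s * sum(lam(k) * X[k] * math.cos((2 * n + 1) * k * math.pi / (2 * N))
                for k in range(N))
        for n in range(N)
    ]

if __name__ == "__main__":
    # Arbitrary sample "line of pixel values" (hypothetical data).
    x = [52.0, 55.0, 61.0, 66.0, 70.0, 61.0, 64.0, 73.0]
    X = dct2(x)
    err = max(abs(a - b) for a, b in zip(x, idct2(X)))
    print("round-trip error:", err)
```

Because the scaled transform is orthonormal, the sum of squares of the coefficients should also match that of the input samples, which gives a second cheap sanity check.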
The DCT-IV transform and its inverse are defined as follows:

$$X^{IV}_k = \sqrt{\frac{2}{N}} \sum_{n=0}^{N-1} x_n \cos\left(\frac{\pi}{4N}(2n+1)(2k+1)\right), \quad k = 0, \ldots, N-1,$$

$$x_n = \sqrt{\frac{2}{N}} \sum_{k=0}^{N-1} X^{IV}_k \cos\left(\frac{\pi}{4N}(2n+1)(2k+1)\right), \quad n = 0, \ldots, N-1.$$

We note that in video coding we usually work with N×N matrices of input data, so the above transforms need to be applied to all rows and columns in a separable fashion to produce the corresponding matrices of transform coefficients [2]. Hereafter we adopt this separable model in our design of integer 2D transforms, and focus mainly on speeding up computation of the component 1D transforms†.

† While faster non-separable 2D designs can possibly be created [3], their large expanded structure and the need to support multiple (including hybrid, e.g. 8x16) block sizes make this approach much less appealing in practice.

For further convenience, we omit the normalization factors ($\sqrt{2/N}$ and $\lambda(k)$) and define the matrices

$$C^{II}_N(n,k) = \cos\left(\frac{(2n+1)k\pi}{2N}\right), \quad n, k = 0, \ldots, N-1,$$

$$C^{IV}_N(n,k) = \cos\left(\frac{(2n+1)(2k+1)\pi}{4N}\right), \quad n, k = 0, \ldots, N-1,$$

representing the coefficients of the DCT-II and DCT-IV transforms, respectively. In the factorizations below, $k$ indexes rows and $n$ indexes columns, so that the transform of a column vector $x$ is the product $Cx$.

It is well known that an even-sized DCT-II matrix can be factored into a product containing a direct sum of smaller DCT-II and DCT-IV matrices as follows [2,3]:

$$C^{II}_N = P_N^T \begin{pmatrix} C^{II}_{N/2} & 0 \\ 0 & C^{IV}_{N/2} J_{N/2} \end{pmatrix} \begin{pmatrix} I_{N/2} & J_{N/2} \\ J_{N/2} & -I_{N/2} \end{pmatrix}, \tag{1}$$

where $P_N$ is a permutation matrix producing the reordering

$$x'_i = x_{2i}, \quad x'_{N/2+i} = x_{2i+1}, \quad i = 0, 1, \ldots, N/2-1, \tag{2}$$

and where $I_{N/2}$ and $J_{N/2}$ denote the $N/2 \times N/2$ identity and order-reversal matrices, respectively. Since (2) separates even-indexed from odd-indexed entries, its transpose $P_N^T$ interleaves the two half-size outputs back into natural order. The Chen-Smith-Fralick [15], Wang, and many other well-known DCT-II factorizations [2,3] rely on factorization (1) as a basic step in their decimation process.

We next apply the following decomposition to the DCT-IV block in (1):

$$C^{IV}_N = P_N^T \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & I_{N/2-1} & I_{N/2-1} & 0 \\ 0 & I_{N/2-1} & -I_{N/2-1} & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix} \begin{pmatrix} I_{N/2} & 0 \\ 0 & E_{N/2} J_{N/2} \end{pmatrix} \begin{pmatrix} C^{II}_{N/2} & 0 \\ 0 & C^{II}_{N/2} \end{pmatrix} \begin{pmatrix} I_{N/2} & 0 \\ 0 & E_{N/2} \end{pmatrix} R_N, \tag{3}$$

where $P_N$ is the reordering matrix (2), $E_{N/2}$ is the diagonal sign-alteration matrix

$$E_{N/2} = \mathrm{diag}\left\{(-1)^k\right\}, \quad k = 0, 1, \ldots, N/2-1, \tag{4}$$

$R_N$ is the matrix of Givens rotations

$$R_N = \begin{pmatrix}
\cos\frac{\pi}{4N} & & & & & \sin\frac{\pi}{4N} \\
& \cos\frac{3\pi}{4N} & & & \sin\frac{3\pi}{4N} & \\
& & \ddots & \iddots & & \\
& & \cos\frac{(N-1)\pi}{4N} & \sin\frac{(N-1)\pi}{4N} & & \\
& & -\sin\frac{(N-1)\pi}{4N} & \cos\frac{(N-1)\pi}{4N} & & \\
& & \iddots & \ddots & & \\
& -\sin\frac{3\pi}{4N} & & & \cos\frac{3\pi}{4N} & \\
-\sin\frac{\pi}{4N} & & & & & \cos\frac{\pi}{4N}
\end{pmatrix}, \tag{5}$$

and where $C^{II}_{N/2}$ denotes matrices of the remaining half-sized DCT-II transforms. This decomposition of DCT-IV is very similar to the one derived by G. Plonka and M. Tasche [17].

In our formulation (3) all
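Factorizations (1) and (3) can be checked numerically against the directly computed cosine matrices. The sketch below is a minimal check in plain Python, under stated assumptions: matrices act on column vectors (row index $k$, column index $n$), $P_N$ is the even/odd split of (2) so that $P_N^T$ restores natural order, and the block layout of (3) is the one read from the text here. All helper names (`perm`, `givens`, `add_sub`, `factored_c2`, `factored_c4`, and so on) are hypothetical and not taken from the paper.

```python
import math

def zeros(n):
    return [[0.0] * n for _ in range(n)]

def eye(n):
    return [[float(i == j) for j in range(n)] for i in range(n)]

def rev(n):
    """Order-reversal matrix J_n."""
    return [[float(i + j == n - 1) for j in range(n)] for i in range(n)]

def matmul(*Ms):
    """Product of one or more square matrices, multiplied left to right."""
    out = Ms[0]
    for B in Ms[1:]:
        cols = list(zip(*B))
        out = [[sum(a * b for a, b in zip(r, c)) for c in cols] for r in out]
    return out

def transpose(A):
    return [list(c) for c in zip(*A)]

def block(A, B, C, D):
    """Assemble the 2x2 block matrix [[A, B], [C, D]]."""
    return [a + b for a, b in zip(A, B)] + [c + d for c, d in zip(C, D)]

def direct_sum(A, B):
    n = len(A)  # both blocks are assumed to have the same size
    return block(A, zeros(n), zeros(n), B)

def c2(N):
    """Unnormalized DCT-II matrix (row k, column n)."""
    return [[math.cos((2 * n + 1) * k * math.pi / (2 * N))
             for n in range(N)] for k in range(N)]

def c4(N):
    """Unnormalized DCT-IV matrix (row k, column n)."""
    return [[math.cos((2 * n + 1) * (2 * k + 1) * math.pi / (4 * N))
             for n in range(N)] for k in range(N)]

def perm(N):
    """P_N of (2): even-indexed entries first, then odd-indexed entries."""
    P = zeros(N)
    for i in range(N // 2):
        P[i][2 * i] = 1.0
        P[N // 2 + i][2 * i + 1] = 1.0
    return P

def sign_alt(n):
    """E_n = diag((-1)^k) of (4)."""
    return [[(-1.0) ** k if j == k else 0.0 for j in range(n)] for k in range(n)]

def givens(N):
    """R_N of (5): rotations pairing entry k with entry N-1-k."""
    R = zeros(N)
    for k in range(N // 2):
        t = (2 * k + 1) * math.pi / (4 * N)
        R[k][k], R[k][N - 1 - k] = math.cos(t), math.sin(t)
        R[N - 1 - k][k], R[N - 1 - k][N - 1 - k] = -math.sin(t), math.cos(t)
    return R

def add_sub(N):
    """Leftmost block matrix of (3): 1, [[I, I], [I, -I]], -1 on the diagonal."""
    h, A = N // 2, zeros(N)
    A[0][0], A[N - 1][N - 1] = 1.0, -1.0
    for m in range(1, h):
        A[m][m] = A[m][h + m - 1] = A[h + m - 1][m] = 1.0
        A[h + m - 1][h + m - 1] = -1.0
    return A

def factored_c2(N):
    """Right-hand side of (1)."""
    h = N // 2
    I, J = eye(h), rev(h)
    minus_I = [[-v for v in row] for row in I]
    return matmul(transpose(perm(N)),
                  direct_sum(c2(h), matmul(c4(h), J)),
                  block(I, J, J, minus_I))

def factored_c4(N):
    """Right-hand side of (3), as reconstructed here."""
    h = N // 2
    I, J, E = eye(h), rev(h), sign_alt(h)
    return matmul(transpose(perm(N)), add_sub(N),
                  direct_sum(I, matmul(E, J)),
                  direct_sum(c2(h), c2(h)),
                  direct_sum(I, E),
                  givens(N))

def max_err(A, B):
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))

if __name__ == "__main__":
    for N in (4, 8, 16):
        print(N, max_err(factored_c2(N), c2(N)), max_err(factored_c4(N), c4(N)))
```

The check uses dense matrix products only for clarity. A practical implementation would apply each factor as a sequence of additions, sign changes, and plane rotations rather than forming the matrices.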
