UT Arlington EE 5359 - H.264-BASED DISTRIBUTED VIDEO CODER - D2000443

School name University of Texas at Arlington

Course Ee 5359- Topics in Signal Processing

Pages 4

This preview shows page 1 out of 4 pages.

PARAMETER ESTIMATION FOR AN H.264-BASED DISTRIBUTED VIDEO CODER

B. Macchiavello, R. L. de Queiroz
Departamento de Engenharia Eletrica, Universidade de Brasilia, Brasilia, DF, Brazil
Email: {bruno, queiroz}@image.unb.br

D. Mukherjee
Hewlett-Packard Labs, Palo Alto, CA, USA
Email: debargha.mukherjee@hpl.hp.com

Work supported by HP Brasil.

ABSTRACT

In this paper, we present a statistical model used to select coding parameters for a mixed-resolution Wyner-Ziv framework implemented using the H.264/AVC standard. This paper extends the results of a previous work for the H.263 case to the H.264/AVC coder, since the parameters need to be recalculated for the H.264 case. The proposed correlation estimation mechanism guides the parameter choice process and also yields the statistical model used for decoding. The mechanism is based on extracting edge information and the residual error rate in co-located blocks from the low-resolution base layer, which is available at both ends.

Index Terms: distributed video coding, Wyner-Ziv, parameter estimation

1. INTRODUCTION

Distributed source coding (DSC), which has its roots in the theory of coding correlated sources developed by Slepian and Wolf [1] for the lossless case and by Wyner and Ziv [2] for the lossy case, has recently been applied to video coding to enable a reversed-complexity coding mode [3-8]. In reversed mode, the encoder complexity is reduced by eliminating the motion estimation task or by obviating the need for a full motion search. The performance loss is partially recovered by a more complex decoding process that exploits source statistics.

In previous works [9, 10], we proposed a mixed-resolution framework that can be implemented as an optional coding mode in any existing video codec standard. In this framework, the reference frames are coded exactly as in a regular codec, as I, P, or reference B frames, at full resolution. For the non-reference P or B frames, the encoding complexity is reduced by low-resolution (LR) encoding. At the decoder, a high-quality version of the non-reference frames is generated by a multi-frame, motion-based, mixed super-resolution mechanism [9-12]. The interpolated LR reconstruction is subtracted from this frame to obtain the side information (SI), which is a Laplacian residual frame. Thereafter, the Wyner-Ziv (WZ) layer is channel decoded to obtain the final reconstruction.

In realistic usage scenarios for video communication using power-constrained devices, it is not necessary for a video decoder to reproduce the signal immediately after reception. Therefore, a feedback channel may not always be available. In the mixed-resolution approach, the LR layer can be decoded immediately for real-time communications. More importantly, since the framework does not use a feedback channel for rate estimation, it enables the enhancement layer to be decoded offline. However, the elimination of the feedback channel requires a sophisticated mechanism for estimating the correlation statistics at the encoder, followed by a mapping of the estimated statistics to actual encoding parameters. A previous work [13] presented such an estimation model using H.263 as the regular codec. In this paper, as a continuation of [13], we present a statistical model, as well as a mechanism to estimate the model parameters, for a memoryless coset code using H.264/AVC.

2. WYNER-ZIV CODING MODE ON H.264/AVC

The basic architecture for the WZ coding mode can be found elsewhere [9, 10]. In summary, at the encoder, shown in Fig. 1, the non-reference frames are decimated and coded using decimated versions of the reconstructed reference frames in the frame store. Then the Laplacian residual, obtained by taking the difference between the original frame and an interpolated version of the LR layer reconstruction, is WZ coded to form the enhancement layer. Related work [14] has also explored spatial reduction. Nevertheless, our mixed-resolution approach, while less aggressive in complexity reduction, may achieve better compression efficiency.

At the decoder, the LR image is decoded and interpolated. The optional process of
enhancement begins with the generation of the SI. The interpolated decoded frame and the reference frames are used to create a semi-super-resolution version of the current frame [11]. Then it is subtracted from the interpolated LR decoded frame. The resulting residual frame is the actual SI frame to be used for channel decoding.

[Fig. 1. Architecture for the DVC encoding mode. Block labels recovered from the figure: frame store (reconstructed reference frames), syntax element list for reference frames, 2x2 resampling blocks, syntax element transform, regular coder, current LR frame, reconstructed LR frame, interpolated reconstructed frame, current frame, residual frame, Wyner-Ziv coder, entropy coder, LR bit-stream, Wyner-Ziv bit-stream.]

2.1. Enhancement Layer

Let the random variable $X$ denote the transform coefficients of the residual error frame. Then the quantization of $X$ yields $Q = \phi(X, QP)$, $QP$ being the quantization step size. Next, the cosets $C = \psi(Q, M) = \psi(\phi(X, QP), M)$, $M$ being the coset modulus, are computed as

$$\psi(q, M) = \begin{cases} q - M\lfloor q/M \rfloor, & q - M\lfloor q/M \rfloor < M/2 \\ q - M\lfloor q/M \rfloor - M, & q - M\lfloor q/M \rfloor \ge M/2 \end{cases} \qquad (1)$$

If quantization bin $q$ corresponds to the interval $[x_l(q), x_h(q)]$, then the probabilities of the bin $q \in \Omega_q$ and of the coset index $c \in \Omega_c$ are given by

$$p(q) = \int_{x_l(q)}^{x_h(q)} f_X(x)\,dx \qquad (2)$$

$$p(c) = \sum_{q \in \Omega_q:\, \psi(q, M) = c} p(q) = \sum_{q \in \Omega_q:\, \psi(q, M) = c} \int_{x_l(q)}^{x_h(q)} f_X(x)\,dx \qquad (3)$$

The entropy coder that already exists in the regular coder can be reused for $C$, but a different entropy coder conditioned on $M$ should yield better compression. For decoding, the minimum MSE reconstruction function, based on the unquantized side information $y$ and the received coset index $c$, is given by

$$\hat{X}_{YC}(y, c) = \frac{\sum_{q \in \Omega_q:\, \psi(q, M) = c} \int_{x_l(q)}^{x_h(q)} x\, f_{X|Y}(x, y)\,dx}{\sum_{q \in \Omega_q:\, \psi(q, M) = c} \int_{x_l(q)}^{x_h(q)} f_{X|Y}(x, y)\,dx} \qquad (4)$$

The regularly coded reference frames and the LR layer frames are assumed to be coded with quantization step size $QP_t$. Therefore, the enhancement layer frames should ideally be coded such that the distortion is at about the same level as that obtained by regular coding with $QP_t$. A rate-distortion analysis to find the optimal encoding parameters $\{QP, M\}$ based on our statistical model can be found elsewhere [10, 15].

3. CORRELATION STATISTICS ESTIMATION

We assume a
general enough statistical model, $Y = \rho X + Z$, where $X$ is a Laplacian-distributed transform coefficient, $Z$ is additive Gaussian noise uncorrelated with $X$, and $0 < \rho < 1$ is an attenuation factor expected to decay at higher frequencies. It is necessary to have accurate estimates of the statistics of $X$ and $Z$, both for the encoder parameter choice and for the minimum MSE reconstruction at the decoder. Note that this is a generalization of the simpler model $Y = X + Z$ [10, 11, 15]. However, we can rewrite it as $Y/\rho = X + Z/\rho$. Then the same procedure described in [9, 11] can be
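
As a rough illustration of the enhancement-layer description above, the quantization, coset mapping, and bin and coset probabilities can be sketched in Python. This is only a sketch under stated assumptions, not the coder used in the paper: the function names are hypothetical, the quantizer is assumed to be a plain uniform one with bins [(q - 0.5) QP, (q + 0.5) QP] (the actual H.264/AVC quantizer differs), the coset mapping is assumed to be a modulo-M operation wrapped to a zero-centred range, the source is assumed to be a zero-mean Laplacian with standard deviation sigma, and the numeric values are arbitrary.

```python
import math


def quantize(x: float, qp: float) -> int:
    """Assumed uniform quantizer phi(x, QP): nearest-integer bin index."""
    return math.floor(x / qp + 0.5)


def coset_index(q: int, m: int) -> int:
    """Assumed coset mapping psi(q, M): q modulo M, wrapped into [-M/2, M/2)."""
    r = q - m * (q // m)  # non-negative remainder in 0..M-1
    return r if r < m / 2 else r - m


def laplacian_cdf(x: float, sigma: float) -> float:
    """CDF of a zero-mean Laplacian with standard deviation sigma."""
    b = sigma / math.sqrt(2.0)  # Laplacian scale parameter
    if x < 0.0:
        return 0.5 * math.exp(x / b)
    return 1.0 - 0.5 * math.exp(-x / b)


def bin_probability(q: int, qp: float, sigma: float) -> float:
    """p(q): integral of f_X over the assumed bin [(q - 0.5) QP, (q + 0.5) QP]."""
    x_lo, x_hi = (q - 0.5) * qp, (q + 0.5) * qp
    return laplacian_cdf(x_hi, sigma) - laplacian_cdf(x_lo, sigma)


def coset_probability(c: int, m: int, qp: float, sigma: float,
                      q_max: int = 200) -> float:
    """p(c): sum of p(q) over bins |q| <= q_max whose coset index equals c."""
    return sum(
        bin_probability(q, qp, sigma)
        for q in range(-q_max, q_max + 1)
        if coset_index(q, m) == c
    )


if __name__ == "__main__":
    qp, m, sigma = 4.0, 4, 10.0  # arbitrary illustrative values
    for x in (-9.0, 1.0, 9.0, 17.0):
        q = quantize(x, qp)
        print(f"x={x:6.1f}  q={q:3d}  c={coset_index(q, m):2d}")
    for c in range(-(m // 2), (m + 1) // 2):
        print(f"p(c={c:2d}) = {coset_probability(c, m, qp, sigma):.4f}")
```

Because every bin index maps to exactly one coset index, the coset probabilities sum to one (up to the truncation at q_max), which is a quick sanity check on any choice of QP and M.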

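The minimum MSE reconstruction and the correlation model can likewise be sketched numerically. The sketch below assumes the model Y = rho X + Z with a zero-mean Laplacian X and zero-mean Gaussian Z, the same hypothetical uniform bins and modulo-M coset mapping as before, and a simple midpoint-rule integration. Since the reconstruction is a ratio of two sums, the conditional density can be replaced by the unnormalized product f_X(x) f_Z(y - rho x). All names and parameter values are illustrative assumptions, not taken from the paper.

```python
import math


def coset_index(q: int, m: int) -> int:
    """Assumed coset mapping psi(q, M): q modulo M, wrapped into [-M/2, M/2)."""
    r = q - m * (q // m)
    return r if r < m / 2 else r - m


def laplacian_pdf(x: float, sigma_x: float) -> float:
    """Zero-mean Laplacian density with standard deviation sigma_x."""
    b = sigma_x / math.sqrt(2.0)
    return math.exp(-abs(x) / b) / (2.0 * b)


def gaussian_pdf(z: float, sigma_z: float) -> float:
    """Zero-mean Gaussian density with standard deviation sigma_z."""
    return math.exp(-0.5 * (z / sigma_z) ** 2) / (sigma_z * math.sqrt(2.0 * math.pi))


def mmse_reconstruct(y: float, c: int, m: int, qp: float,
                     sigma_x: float, sigma_z: float, rho: float = 1.0,
                     q_max: int = 64, steps: int = 200) -> float:
    """Ratio of sums of bin integrals over all bins q with psi(q, M) = c."""
    num = 0.0
    den = 0.0
    dx = qp / steps
    for q in range(-q_max, q_max + 1):
        if coset_index(q, m) != c:
            continue
        x_lo = (q - 0.5) * qp  # assumed lower edge of bin q
        for k in range(steps):
            x = x_lo + (k + 0.5) * dx  # midpoint rule
            w = laplacian_pdf(x, sigma_x) * gaussian_pdf(y - rho * x, sigma_z) * dx
            num += x * w
            den += w
    # Fall back to zero if the side information is so far off that all weights underflow.
    return num / den if den > 0.0 else 0.0


if __name__ == "__main__":
    # Arbitrary example: true x = 17 falls in bin q = 4, so c = 0 for M = 4.
    direct = mmse_reconstruct(8.25, 0, 4, 4.0, 20.0, 0.5, rho=0.5)
    rescaled = mmse_reconstruct(8.25 / 0.5, 0, 4, 4.0, 20.0, 0.5 / 0.5, rho=1.0)
    print(direct, rescaled)
```

The two calls in the usage example evaluate the reconstruction once with the attenuated model and once after dividing the side information and the noise standard deviation by rho. They agree, which is the numerical counterpart of rewriting the model as Y/rho = X + Z/rho.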