UT Arlington EE 5359 - H.264-BASED DISTRIBUTED VIDEO CODER - D2000443

School name University of Texas at Arlington

Course Ee 5359- Topics in Signal Processing

Pages 4

This preview shows page 1 out of 4 pages.

PARAMETER ESTIMATION FOR AN H.264-BASED DISTRIBUTED VIDEO CODER

B. Macchiavello, R. L. de Queiroz∗
Departamento de Engenharia Eletrica
Universidade de Brasilia
Brasilia, DF, Brazil
Email: bruno,[email protected]

D. Mukherjee
Hewlett Packard Labs
Palo Alto, CA
[email protected]

ABSTRACT

In this paper we present a statistical model used to select coding parameters for a mixed resolution Wyner-Ziv framework implemented using the H.264/AVC standard. This paper extends the results of a previous work for the H.263+ case to the H.264/AVC coder, since the parameters need to be recalculated for the H.264 case. The proposed correlation estimation mechanism guides the parameter choice process, and also yields the statistical model used for decoding. This mechanism is proposed based on extracting edge information and residual error rate in co-located blocks from the low resolution base layer that is available at both ends.

Index Terms: distributed video coding, Wyner-Ziv, parameter estimation

1. INTRODUCTION

Distributed source coding (DSC), which has its roots in the theory of coding correlated sources developed by Slepian and Wolf [1], for the lossless case, and Wyner and Ziv [2], for the lossy case, has been recently applied to video coding to enable a reversed complexity coding mode [3]–[8]. In reversed mode, the encoder complexity is reduced by eliminating the motion estimation task or obviating the need for full motion search. The performance loss is partially recovered by a more complex decoding process exploiting source statistics.

In previous works [9], [10] we proposed a mixed resolution framework that can be implemented as an optional coding mode in any existing video codec standard. In this framework, the reference frames are coded exactly as in a regular codec as I-, P- or reference B-frames, at full resolution. For the non-reference P- or B-frames the encoding complexity is reduced by low resolution (LR) encoding.
At the decoder, a high quality version of each non-reference frame is generated by a multi-frame motion-based mixed super-resolution mechanism [9]–[12]. The interpolated LR reconstruction is subtracted from this frame to obtain the side-information (SI), which is a Laplacian residual frame. Thereafter, the Wyner-Ziv (WZ) layer is channel decoded to obtain the final reconstruction.

∗Work supported by HP Brasil.

In realistic usage scenarios for video communication using power-constrained devices, it is not necessary for a video decoder to reproduce the signal immediately after reception. Therefore, a feedback channel may not always be available. In the mixed resolution approach, the LR layer can be immediately decoded for real-time communications. More importantly, since the framework does not use a feedback channel for rate estimation, it enables the enhancement layer to be decoded offline. However, the elimination of the feedback channel requires a sophisticated mechanism for estimating the correlation statistics at the encoder, followed by mapping the estimated statistics to actual encoding parameters. A previous work [13] presented such an estimation model using H.263+ as the regular codec. In this paper, as a continuation of [13], we present a statistical model as well as a mechanism to estimate the model parameters for a memoryless coset code using H.264/AVC.

2. WYNER-ZIV CODING MODE ON H.264/AVC

The basic architecture for the WZ coding mode can be found elsewhere [9], [10]. In summary, at the encoder (shown in Fig. 1), the non-reference frames are decimated and coded using decimated versions of the reconstructed reference frames in the frame store. Then the Laplacian residual, obtained by taking the difference between the original frame and an interpolated version of the LR layer reconstruction, is WZ coded to form the enhancement layer. Related work [14] has also explored spatial reduction.
Nevertheless, our mixed resolution approach, while less aggressive in complexity reduction, may achieve better compression efficiency.

At the decoder, the LR image is decoded and interpolated. The optional process of enhancement begins with the generation of the SI. The interpolated decoded frame and the reference frames are used to create a semi super-resolution version of the current frame [11]. Then, it is subtracted from the interpolated LR decoded frame. The resulting residual frame is the actual SI frame to be used for channel decoding.

Fig. 1. Architecture for the DVC encoding mode.

2.1. Enhancement Layer

Let the random variable X denote the transform coefficients of the residual error frame. Then, the quantization of X yields Q : Q = φ(X, QP), QP being the quantization step-size. Next, the cosets C : C = ψ(Q, M) = ψ(φ(X, QP), M), M being the coset modulus, are computed:

$$
\psi(q, M) =
\begin{cases}
q - M \lfloor q/M \rfloor, & q - M \lfloor q/M \rfloor < M/2 \\
q - M \lfloor q/M \rfloor - M, & q - M \lfloor q/M \rfloor \ge M/2
\end{cases}
\tag{1}
$$

If quantization bin q corresponds to the interval [x_l(q), x_h(q)], then the probabilities of bin q ∈ Ψ_q and coset c ∈ Ψ_c are given by:

$$
p(q) = \int_{x_l(q)}^{x_h(q)} f_X(x)\,dx
\tag{2}
$$

$$
p(c) = \sum_{q \in \Psi_q,\ \psi(q,M)=c} p(q)
     = \sum_{q \in \Psi_q,\ \psi(q,M)=c} \int_{x_l(q)}^{x_h(q)} f_X(x)\,dx
\tag{3}
$$

The entropy coder that already exists in the regular coder can be reused for C, but a different entropy coder conditioned on M should yield better compression. For decoding, the minimum MSE reconstruction function based on the unquantized side information y and the received coset index c is given by:

$$
\hat{X}_{YC}(y, c) =
\frac{\sum_{q \in \Psi_q,\ \psi(q,M)=c} \int_{x_l(q)}^{x_h(q)} x\, f_{X|Y}(x, y)\,dx}
     {\sum_{q \in \Psi_q,\ \psi(q,M)=c} \int_{x_l(q)}^{x_h(q)} f_{X|Y}(x, y)\,dx}
\tag{4}
$$

The regularly coded reference frames and the LR layer frames are assumed to be coded with quantization step-size QP_t.
Therefore, the enhancement layer frames should ideally be coded such that the distortion is at about the same level as that obtained by regular coding with QP_t. A rate-distortion analysis to find the optimal encoding parameters QP and M, based on our statistical model, can be found elsewhere [10], [15].

3. CORRELATION STATISTICS ESTIMATION

We assume a general enough statistical model: Y = ρX + Z, where X is a Laplacian distributed transform coefficient, Z is additive Gaussian noise uncorrelated with X, and 0 < ρ ≤ 1 is an attenuation factor expected to decay at higher
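
To make the coset mapping of Eq. (1) and the probabilities of Eqs. (2) and (3) in Section 2.1 concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes a uniform mid-tread quantizer whose bin q covers [(q − 0.5)·QP, (q + 0.5)·QP), which is only a stand-in for the actual H.264/AVC quantizer, and a zero-mean Laplacian density f_X(x) = (α/2)·exp(−α|x|). The names `coset`, `laplacian_cdf`, `bin_prob`, `coset_probs`, the parameter `alpha`, and the truncation limit `q_max` are all hypothetical and chosen for illustration.

```python
import math
from collections import defaultdict


def coset(q: int, M: int) -> int:
    """Eq. (1): fold quantization index q into a coset index centred on zero."""
    r = q % M  # for M > 0 this equals q - M * floor(q / M), a value in [0, M)
    return r if r < M / 2 else r - M


def laplacian_cdf(x: float, alpha: float) -> float:
    """CDF of the assumed density f_X(x) = (alpha / 2) * exp(-alpha * |x|)."""
    if x < 0:
        return 0.5 * math.exp(alpha * x)
    return 1.0 - 0.5 * math.exp(-alpha * x)


def bin_prob(q: int, qp: float, alpha: float) -> float:
    """Eq. (2): probability of bin q under the assumed uniform quantizer."""
    lo, hi = (q - 0.5) * qp, (q + 0.5) * qp
    return laplacian_cdf(hi, alpha) - laplacian_cdf(lo, alpha)


def coset_probs(M: int, qp: float, alpha: float, q_max: int = 200) -> dict:
    """Eq. (3): sum the bin probabilities of all bins mapping to each coset.

    The infinite index set is truncated to [-q_max, q_max] (an assumption);
    the neglected tail mass is tiny for a Laplacian source.
    """
    probs = defaultdict(float)
    for q in range(-q_max, q_max + 1):
        probs[coset(q, M)] += bin_prob(q, qp, alpha)
    return dict(probs)


if __name__ == "__main__":
    # Illustrative values only; they are not taken from the paper.
    for c, p in sorted(coset_probs(M=4, qp=2.0, alpha=0.5).items()):
        print(f"coset {c:+d}: p = {p:.4f}")
```

With M = 4 the cosets are {−2, −1, 0, 1}, and every fourth quantization bin shares the same index, which is why the decoder needs the side information to decide which of those bins was actually sent.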

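The minimum MSE reconstruction of Eq. (4) can be sketched numerically under the correlation model Y = ρX + Z stated in Section 3. This is an illustrative sketch under several assumptions that do not come from the paper: a uniform mid-tread quantizer with bins [(q − 0.5)·QP, (q + 0.5)·QP), a Laplacian parameter `alpha`, a Gaussian noise standard deviation `sigma`, a truncated index range `q_max`, and midpoint-rule integration with `steps` samples per bin. For a fixed y, f_{X|Y}(x, y) is proportional to f_X(x)·f_Z(y − ρx), so the normalizing constants cancel in the ratio and are omitted.

```python
import math


def coset(q: int, M: int) -> int:
    """Eq. (1): fold quantization index q into a coset index centred on zero."""
    r = q % M
    return r if r < M / 2 else r - M


def weight(x: float, y: float, alpha: float, rho: float, sigma: float) -> float:
    """Unnormalised f_X(x) * f_Z(y - rho * x), proportional to f_{X|Y} for fixed y."""
    fx = math.exp(-alpha * abs(x))   # Laplacian prior, constant factor dropped
    z = (y - rho * x) / sigma        # Gaussian noise term, constant factor dropped
    return fx * math.exp(-0.5 * z * z)


def mmse_reconstruct(y, c, M, qp, alpha, rho, sigma, q_max=64, steps=64):
    """Eq. (4): ratio of two sums of per-bin integrals over the bins in coset c."""
    num = den = 0.0
    dx = qp / steps
    for q in range(-q_max, q_max + 1):
        if coset(q, M) != c:
            continue
        lo = (q - 0.5) * qp          # lower edge of bin q (assumed quantizer)
        for k in range(steps):       # midpoint rule inside the bin
            x = lo + (k + 0.5) * dx
            w = weight(x, y, alpha, rho, sigma) * dx
            num += x * w
            den += w
    return num / den if den > 0.0 else 0.0


if __name__ == "__main__":
    # Illustrative values only: side information near x = 10, coset of bin q = 5.
    x_hat = mmse_reconstruct(y=10.3, c=coset(5, 4), M=4, qp=2.0,
                             alpha=0.1, rho=1.0, sigma=1.0)
    print(f"reconstruction: {x_hat:.3f}")
```

When the noise is small relative to M·QP, only one bin of the received coset carries significant weight, so the estimate falls inside that bin; as `sigma` grows, neighbouring bins of the same coset start to contribute and the reconstruction becomes less reliable.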