VCIP09-Lai

Improving View Rendering Quality and Coding Efficiency by Suppressing Compression Artifacts in Depth-Image Coding

PoLin Lai∗,†, Antonio Ortega∗, Camilo C. Dorea†, Peng Yin†, and Cristina Gomila†

∗Signal and Image Processing Institute, Univ. of Southern California, Los Angeles, CA 90089
†Thomson Corporate Research, 2 Independence Way, Princeton, NJ 08540

ABSTRACT

We consider the effect of depth-image compression artifacts on the quality of virtual views rendered using neighboring views. Such view rendering processes are utilized in new video applications such as 3D television (3DTV) and free viewpoint video (FVV). We first analyze how compression artifacts in compressed depth-images result in distortions in rendered views. We show that the rendering position error is a monotonic function of the coding error. For the scenario in which cameras are arranged with parallel optical axes, we further demonstrate specific properties of rendering position error. Exploiting special characteristics of depth-images, namely smooth regions separated by sharp edges, we investigate a possible solution to suppress compression artifacts by encoding depth-images with a recently published sparsity-based in-loop de-artifacting filter. Simulation results show that applying such techniques not only provides significantly higher coding efficiency for depth-image coding, but, more importantly, also improves the quality of rendered views in terms of PSNR and subjective quality.

Keywords: Depth compression, multiview video plus depth, de-artifact filter, view rendering

1. INTRODUCTION

With advances in computer vision, multiview video coding, and other related fields, new video applications such as 3D television (3DTV) and free viewpoint video (FVV) have recently drawn wide attention. 3DTV aims to provide viewers depth perception of the scene by simultaneously rendering multiple images for different viewing angles using special displays.
Instead, the goal of FVV is to allow viewers to select an arbitrary viewing angle of the scene. Currently, the most widely used video representation which enables 3DTV and FVV functionalities is the multiview video plus depth format (MVD). It consists of multiple video sequences (referred to as views), captured by synchronized cameras from different viewpoints, and the corresponding per-pixel depth maps (referred to as depth-images). The pixel values in these gray-level depth-images represent depth $Z$ in the range between $Z_{near}$ and $Z_{far}$. At pixel location $(x, y)$ in the depth-image $D$ we will have

\[
D(x, y) = 255 \cdot \frac{Z(x, y)^{-1} - (Z_{far})^{-1}}{(Z_{near})^{-1} - (Z_{far})^{-1}} \tag{1}
\]

Objects with smaller depths (closer to the camera) will appear brighter in the depth-image and farther away objects will be darker. Fig. 1 shows some examples of depth-images. Virtual views can be rendered from captured views using view-warping algorithms with the corresponding depth-images and camera parameters. In the following, we will refer to this process as view rendering.

The MVD format contains significant amounts of data and compression of MVD is essential in order to realize new 3DTV and FVV applications. For the video data, efficient multiview video coding schemes can be achieved by combining new coding tools for multiview scenarios [1] with existing techniques for conventional video coding. As for depth-images, they are not viewed by users directly, and instead they are used to render virtual views. Thus, the effect of lossy depth-image compression on view rendering should be studied. In addition to considering conventional rate-distortion criteria, the performance of depth-image coding should also be evaluated based on the quality of rendered views. For this purpose, there are two PSNR/MSE (mean-squared error) metrics employed in the literature. Let us denote by $D$ and $\hat{D}$ the original depth-image and the compressed one, respectively. Let $V_R(D)$ and $V_R(\hat{D})$ denote the views rendered using $D$ and $\hat{D}$, respectively.
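The depth-to-gray-level mapping of Eq. (1) can be illustrated with a minimal Python sketch. The function names, the rounding to an integer, the clamping to [0, 255], and the example range `z_near = 1.0`, `z_far = 100.0` are assumptions made for illustration; Eq. (1) itself only specifies the real-valued mapping.

```python
def depth_to_gray(z, z_near, z_far):
    """Gray level D for metric depth z, following Eq. (1).

    Rounding to an integer and clamping to [0, 255] are assumptions for
    8-bit storage; Eq. (1) itself gives a real-valued result.
    """
    d = 255.0 * (1.0 / z - 1.0 / z_far) / (1.0 / z_near - 1.0 / z_far)
    return max(0, min(255, round(d)))


def gray_to_depth(d, z_near, z_far):
    """Invert Eq. (1): depth z for gray level d (exact only up to rounding)."""
    inv_z = (d / 255.0) * (1.0 / z_near - 1.0 / z_far) + 1.0 / z_far
    return 1.0 / inv_z


if __name__ == "__main__":
    z_near, z_far = 1.0, 100.0  # hypothetical scene range
    for z in (1.0, 2.0, 50.0, 100.0):
        d = depth_to_gray(z, z_near, z_far)
        # Print the depth, its gray level, and the depth recovered from it.
        print(z, d, gray_to_depth(d, z_near, z_far))
```

Because the mapping is uniform in $1/Z$ rather than in $Z$, one gray level spans a larger depth interval for objects far from the camera than for objects close to it.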
The first metric, denoted PSNR($V_R(\hat{D})$, $V_R(D)$), calculates the PSNR between $V_R(\hat{D})$ and $V_R(D)$ [2]. This metric implicitly assumes that using uncompressed depth-images leads to the best rendering quality, and thus $V_R(D)$ is used as a reference to evaluate different $\hat{D}$: higher PSNR indicates that $V_R(\hat{D})$ is closer to $V_R(D)$. However, when the original depth-image $D$ is estimated (e.g. via disparity estimation) instead of obtained using active systems such as a range camera, there could be some distortion in $D$, and thus $V_R(D)$ may not serve as a good reference for rendering quality. For example in Fig. 1, we can see some false contours in the floor area. Processing such regions could lead to improved rendering quality, while it will decrease PSNR($V_R(\hat{D})$, $V_R(D)$) as the depth-image is “altered”. On the other hand, the second PSNR metric is calculated between $V_R(\hat{D})$ and the ground truth $V_C$, i.e., the corresponding view captured at the same location and orientation chosen for rendering [3] (denoted as PSNR($V_R(\hat{D})$, $V_C$)). For example, in an MVD system, one can use compressed video and depth-images from View 0 and View 2 to render View 1 and evaluate PSNR against the captured View 1. This metric directly reflects the fidelity of the rendered view. In this paper, we will use this metric to measure the rendering quality.

[Figure 1. Examples of depth-images: (a) Ballet, depth-image V4 frame 0; (b) Breakdancers, depth-image V4 frame 0.]

Using PSNR($V_R(\hat{D})$, $V_R(D)$), the effect of MVD compression has been investigated by changing the bitrate ratio between multiview video and the depth-images [4]. The results indicate that, on rendered views, the quality of compressed depth-images has a much stronger impact than the quality of compressed video. The compression artifacts in depth-images result in significant distortion around depth discontinuities, which occur at boundaries of objects with different depths. To better preserve these boundaries, the authors suggested processing the coded depth-images, and/or designing other coding techniques. Y.
Morvan et al. [5] proposed an alternative intra coding approach, called “platelet-based coding”, to encode depth-images. Considering the fact that depth-images are typically made up of smooth regions separated by depth discontinuities, the authors employed a quadtree decomposition to partition depth-images into blocks with variable sizes and represent them with piecewise linear functions (platelets). It was reported that, in certain low-bitrate scenarios, depth-images encoded with platelet-based method provide higher PSNR for the rendered
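The two rendering-quality metrics introduced earlier in this section differ only in the reference image: the view rendered from the uncompressed depth-image, or the captured ground-truth view. The following is a minimal Python sketch of the PSNR computation they share. It assumes 8-bit images stored as equally sized lists of pixel rows; the function names and the tiny example images are hypothetical, and the view rendering step itself is not shown.

```python
import math


def mse(test, ref):
    """Mean-squared error between two equally sized images (lists of rows)."""
    total, count = 0.0, 0
    for row_t, row_r in zip(test, ref):
        for p_t, p_r in zip(row_t, row_r):
            total += (p_t - p_r) ** 2
            count += 1
    return total / count


def psnr(test, ref, peak=255.0):
    """PSNR in dB of `test` against `ref`; infinite if the images are identical."""
    m = mse(test, ref)
    if m == 0:
        return math.inf
    return 10.0 * math.log10(peak * peak / m)


if __name__ == "__main__":
    # Hypothetical 2x2 images standing in for a rendered view and a reference.
    rendered_from_coded_depth = [[101, 99], [100, 100]]
    captured_view = [[100, 100], [100, 100]]
    print(psnr(rendered_from_coded_depth, captured_view))
```

Passing the view rendered from the uncompressed depth-image as `ref` gives the first metric, while passing the captured view gives the second, which is the one used in this paper.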

