VCIP09-Lai

This preview shows page 1-2-3 out of 10 pages.

Improving View Rendering Quality and Coding Efficiency by Suppressing Compression Artifacts in Depth-Image Coding

PoLin Lai∗,†, Antonio Ortega∗, Camilo C. Dorea†, Peng Yin†, and Cristina Gomila†
∗Signal and Image Processing Institute, Univ. of Southern California, Los Angeles, CA 90089
†Thomson Corporate Research, 2 Independence Way, Princeton, NJ 08540

ABSTRACT

We consider the effect of depth-image compression artifacts on the quality of virtual views rendered using neighboring views. Such view rendering processes are utilized in new video applications such as 3D television (3DTV) and free viewpoint video (FVV). We first analyze how compression artifacts in compressed depth-images result in distortions in rendered views. We show that the rendering position error is a monotonic function of the coding error. For the scenario in which cameras are arranged with parallel optical axes, we further demonstrate specific properties of rendering position error. Exploiting special characteristics of depth-images, namely smooth regions separated by sharp edges, we investigate a possible solution to suppress compression artifacts by encoding depth-images with a recently published sparsity-based in-loop de-artifacting filter. Simulation results show that applying such techniques not only provides significantly higher coding efficiency for depth-image coding, but, more importantly, also improves the quality of rendered views in terms of PSNR and subjective quality.

Keywords: Depth compression, multiview video plus depth, de-artifact filter, view rendering

1. INTRODUCTION

With advances in computer vision, multiview video coding, and other related fields, recently, new video applications such as 3D television (3DTV) and free viewpoint video (FVV) have drawn wide attention. 3DTV aims to provide viewers depth perception of the scene by simultaneously rendering multiple images for different viewing angles using special displays.
In contrast, the goal of FVV is to allow viewers to select an arbitrary viewing angle of the scene. Currently, the most widely used video representation that enables 3DTV and FVV functionalities is the multiview video plus depth (MVD) format. It consists of multiple video sequences (referred to as views), captured by synchronized cameras from different viewpoints, and the corresponding per-pixel depth maps (referred to as depth-images). The pixel values in these gray-level depth-images represent depth $Z$ in the range between $Z_{\text{near}}$ and $Z_{\text{far}}$. At pixel location $(x, y)$ in the depth-image $D$ we have

$$D(x, y) = 255 \cdot \frac{Z(x, y)^{-1} - (Z_{\text{far}})^{-1}}{(Z_{\text{near}})^{-1} - (Z_{\text{far}})^{-1}}. \tag{1}$$

Objects with smaller depths (closer to the camera) appear brighter in the depth-image, and objects farther away appear darker. Fig. 1 shows some examples of depth-images. Virtual views can be rendered from captured views using view-warping algorithms with the corresponding depth-images and camera parameters. In the following, we will refer to this process as view rendering.

The MVD format contains significant amounts of data, and compression of MVD is essential in order to realize new 3DTV and FVV applications. For the video data, efficient multiview video coding schemes can be achieved by combining new coding tools for multiview scenarios [1] with existing techniques for conventional video coding. Depth-images, in contrast, are not viewed by users directly; they are used to render virtual views. Thus, the effect of lossy depth-image compression on view rendering should be studied. In addition to considering conventional rate-distortion criteria, the performance of depth-image coding should also be evaluated based on the quality of rendered views. For this purpose, there are two PSNR/MSE (mean-squared error) metrics employed in the literature. Let $D$ and $\hat{D}$ denote the original depth-image and the compressed one, respectively, and let $V_R(D)$ and $V_R(\hat{D})$ denote the views rendered using $D$ and $\hat{D}$, respectively.
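As a rough illustration of Eq. (1), the sketch below maps a depth value to an 8-bit gray level and back, and adds a generic PSNR helper of the kind the rendered-view metrics rely on. The function names, the clamping and rounding to integers, the flat-list image representation, and the example depth range are assumptions made for illustration; they are not taken from the paper.

```python
import math


def depth_to_gray(z, z_near, z_far):
    """Eq. (1): map depth z in [z_near, z_far] to a gray level in [0, 255]."""
    inv_near, inv_far = 1.0 / z_near, 1.0 / z_far
    level = 255.0 * (1.0 / z - inv_far) / (inv_near - inv_far)
    # Clamping and rounding are assumptions; Eq. (1) itself is real-valued.
    return int(round(min(255.0, max(0.0, level))))


def gray_to_depth(d, z_near, z_far):
    """Invert Eq. (1): recover depth from a gray level d in [0, 255]."""
    inv_near, inv_far = 1.0 / z_near, 1.0 / z_far
    return 1.0 / (d / 255.0 * (inv_near - inv_far) + inv_far)


def psnr(img_a, img_b, peak=255.0):
    """PSNR in dB between two equally sized images given as flat lists."""
    mse = sum((a - b) ** 2 for a, b in zip(img_a, img_b)) / len(img_a)
    return math.inf if mse == 0 else 10.0 * math.log10(peak ** 2 / mse)
```

For instance, with a hypothetical range of z_near = 1 and z_far = 100, depth_to_gray(1.0, 1.0, 100.0) yields 255 and depth_to_gray(100.0, 1.0, 100.0) yields 0, matching the convention that closer objects appear brighter. Because Eq. (1) is linear in inverse depth, one gray level of coding error corresponds to a larger depth error for far objects than for near ones.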
The first metric, denoted $\mathrm{PSNR}(V_R(\hat{D}), V_R(D))$, calculates the PSNR between $V_R(\hat{D})$ and $V_R(D)$ [2]. This metric implicitly assumes that using uncompressed depth-images leads to the best rendering quality, and thus $V_R(D)$ is used as a reference to evaluate different $\hat{D}$: a higher PSNR indicates that $V_R(\hat{D})$ is closer to $V_R(D)$. However, when the original depth-image $D$ is estimated (e.g., via disparity estimation) instead of obtained using active systems such as range cameras, there could be some distortion in $D$, and thus $V_R(D)$ may not serve as a good reference for rendering quality. For example, in Fig. 1 we can see some false contours in the floor area. Processing such regions could lead to improved rendering quality, while it will decrease $\mathrm{PSNR}(V_R(\hat{D}), V_R(D))$ as the depth-image is “altered”. On the other hand, the second PSNR metric is calculated between $V_R(\hat{D})$ and the ground truth $V_C$, i.e., the corresponding view captured at the same location and orientation chosen for rendering [3] (denoted as $\mathrm{PSNR}(V_R(\hat{D}), V_C)$). For example, in an MVD system, one can use compressed video and depth-images from View 0 and View 2 to render View 1 and evaluate PSNR against the captured View 1. This metric directly reflects the fidelity of the rendered view. In this paper, we will use this metric to measure the rendering quality.

Figure 1. Examples of depth-images: (a) Ballet, depth-image V4 frame 0; (b) Breakdancers, depth-image V4 frame 0.

Using $\mathrm{PSNR}(V_R(\hat{D}), V_R(D))$, the effect of MVD compression has been investigated by changing the bitrate ratio between multiview video and the depth-images [4]. The results indicate that, on rendered views, the quality of compressed depth-images has a much stronger impact than the quality of compressed video. The compression artifacts in depth-images result in significant distortion around depth discontinuities, which occur at boundaries of objects with different depths. To better preserve these boundaries, the authors suggested processing the coded depth-images, and/or designing other coding techniques. Y.
Morvan et al. [5] proposed an alternative intra coding approach, called “platelet-based coding”, to encode depth-images. Considering the fact that depth-images are typically made up of smooth regions separated by depth discontinuities, the authors employed a quadtree decomposition to partition depth-images into blocks of variable sizes and represent them with piecewise linear functions (platelets). It was reported that, in certain low-bitrate scenarios, depth-images encoded with the platelet-based method provide higher PSNR for the rendered

