Clemson CPSC 863: Multimedia Systems and Applications

1. CpSc 863: Multimedia Systems and Applications
   Video Compression
   James Wang
   Some notes are adapted from Prof. Lawrence A. Rowe’s original slides at http://www.BMRC.Berkeley.EDU/~larry

2. Representations
   Composite
     - NTSC: 6 MHz (4.2 MHz video), 29.97 frames/second
     - PAL: 6-8 MHz (4.2-6 MHz video), 25 frames/second (50 fields/second)
   Component
     - Separated video (luma, chroma), called “S-Video”: S-VHS, Hi8
     - RGB, YUV, YIQ, …
     - YCBCR: used for most compressed representations

3. Analog Video Representations
   NTSC
     Y = 0.299R + 0.587G + 0.114B
     I = 0.596R - 0.275G - 0.321B
     Q = 0.212R - 0.523G + 0.311B
     composite = Y + I cos(Fsc t) + Q sin(Fsc t)
   PAL
     Y = 0.299R + 0.587G + 0.114B
     U = 0.492(B - Y)
     V = 0.877(R - Y)
     composite = Y + U sin(Fsc t) + V cos(Fsc t)
   (Fsc is the color subcarrier frequency.)

4. Digitizing
   - Analog TV is a continuous signal
   - Digital TV uses discrete numeric values
       - Signal is sampled
       - Samples are quantized
       - Small, discrete regions are digitized
   - Image is represented by a pixel array

5. Digital Video Block Structure
   4:2:2 YCBCR
     - 16x16 macroblock made of 8x8 pixel blocks
     - 8 bits/sample = 16 bits/pixel = 4 Kbits/macroblock
   4:1:1 YCBCR
     - 12 bits/pixel = 3 Kbits/macroblock
   [Figure: macroblock layouts. 4:2:2: Y1-Y4, CB1, CB2, CR1, CR2. 4:1:1: Y1-Y4, CB, CR.]

6. What is Video Data Rate?
   Digital
     - 720x483 = 347,760 pixels/frame
     - 4:2:2 sampling gives 695,520 bytes/frame, about 21 MB/s (167 Mb/s) at 29.97 frames/second
     - 4:4:4 sampling gives 250 Mb/s
   ATV (MPEG MP@ML)
     - 1280x720 = 921,600 pixels/frame
     - 4:2:0 sampling gives 1,382,400 bytes/frame, about 41 MB/s (328 Mb/s) at 30 frames/second
   (Note: MPEG coded streams are 1.5-80 Mb/s)

7. What is Video Data Rate (cont.)?
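The uncompressed rates on these data-rate slides follow from frame size, chroma sampling, and frame rate. A minimal Python sketch of that arithmetic is given here; the function name `raw_data_rate` and the lookup table are illustrative and not from the slides, and 8 bits per sample is assumed throughout.

```python
# Average bytes per pixel at 8 bits/sample for common chroma sampling schemes
# (assumption: one byte per luma or chroma sample).
BYTES_PER_PIXEL = {"4:4:4": 3.0, "4:2:2": 2.0, "4:2:0": 1.5, "4:1:1": 1.5}


def raw_data_rate(width, height, sampling, fps):
    """Return (bytes_per_frame, bytes_per_second, megabits_per_second)."""
    bytes_per_frame = width * height * BYTES_PER_PIXEL[sampling]
    bytes_per_second = bytes_per_frame * fps
    return bytes_per_frame, bytes_per_second, bytes_per_second * 8 / 1e6


frame, per_sec, mbps = raw_data_rate(720, 483, "4:2:2", 29.97)
print(f"{frame:,.0f} bytes/frame, {per_sec / 1e6:.1f} MB/s, {mbps:.0f} Mb/s")
# prints: 695,520 bytes/frame, 20.8 MB/s, 167 Mb/s
```

Calling the same function with other frame sizes, sampling schemes, and frame rates gives the remaining figures up to rounding.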
   ATSC (720P)
     - 720x1280 = 921,600 pixels per frame
     - 4:2:2 sampling = 1,843,200 bytes per frame
     - 24 fps = 44,236,800 bytes per second, about 44 MB/s = 354 Mb/s
   ATSC (studio 1080I)
     - 1080x1920 = 2,073,600 pixels per frame
     - 4:4:4 sampling = 6,220,800 bytes per frame
     - 30 fps = 186,624,000 bytes per second, about 187 MB/s = 1.5 Gb/s

8. Human Perception
   What is smooth motion?
     - Depends on source material
     - Most action is perceived as smooth at 24 fps
   Humans are most sensitive to
     - Low frequencies
     - Changes in luminance and along the blue-orange axis
   Vision emphasizes edge detection
     - Strong bias to horizontal and vertical lines
   Visual masking by large luminance changes

9. H.261
   - Developed by CCITT (International Telegraph and Telephone Consultative Committee) in 1988-1990.
   - Designed for videoconferencing and video-telephone applications over ISDN telephone lines.
   - Bit-rate is p x 64 Kb/sec, where p ranges from 1 to 30.

10. Overview of H.261
    [Figure: Frame Sequence]
    - Frame formats are CCIR 601 CIF (352 x 288) and QCIF (176 x 144) images with 4:2:0 sub-sampling.
    - Two frame types: Intra-frames (I-frames) and Inter-frames (P-frames).
    - An I-frame provides an access point; it is coded essentially like a JPEG image.
    - P-frames use "pseudo-differences" from the previous frame ("predicted"), so frames depend on each other.

11. Intra-frame Coding
    [Figure]

12. Intra-frame Coding
    - Macroblocks are 16 x 16 pixel areas on the Y plane of the original image.
    - A macroblock usually consists of 4 Y blocks, 1 Cr block, and 1 Cb block.
    - Quantization uses a constant value for all DCT coefficients (i.e., no quantization table as in JPEG).

13. Inter-frame (P-frame) Coding
    [Figure: A Coding Example (P-frame)]

14. Inter-frame (P-frame) Coding
    The previous image is called the reference image; the image to encode is called the target image.
    Points to emphasize:
      1. The difference image (not the target image itself) is encoded.
      2. The decoded image, not the original, must be used as the reference image, because the decoded image is all the decoder has.
      3. "Mean Absolute Error" (MAE) is used to decide the best block. "Mean Squared Error" (MSE) = sum(E*E)/N can also be used.

15. H.261 Encoder
    - "Control" -- controlling the bit-rate. If the transmission buffer is too full, the bit-rate is reduced by changing the quantization factors.
    - "memory" -- stores the reconstructed image (blocks) for the motion vector search for the next P-frame.

16. H.261 Encoder
    [Figure]

17. Methods for Motion Vector Searches
    [Figure]

18. Methods for Motion Vector Searches
    - C(x + k, y + l) -- pixels in the macroblock with upper left corner (x, y) in the Target frame.
    - R(x + i + k, y + j + l) -- pixels in the macroblock with upper left corner (x + i, y + j) in the Reference frame.
    - Cost function is:
        MAE(i, j) = \frac{1}{N^2} \sum_{k=0}^{N-1} \sum_{l=0}^{N-1} \left| C(x + k, y + l) - R(x + i + k, y + j + l) \right|
      where MAE stands for Mean Absolute Error and N is the macroblock side length (N = 16 for the 16 x 16 macroblocks used here).
    - Goal is to find a vector (u, v) such that MAE(u, v) is minimum.

19. Methods for Motion Vector Searches
    Full Search Method
      - Sequentially search the whole [-p, p] region --> very slow
    Two-Dimensional Logarithmic Search
      - Similar to binary search.
      - The MAE function is initially computed within a window of [-p/2, p/2] at nine locations as shown in the figure.
      - Repeat until the size of the search region is one pixel wide:
          1. Find the one of the nine locations that yields the minimum MAE.
          2. Form a new search region with half of the previous size, centred at the location found in step 1.

20. Methods for Motion Vector Searches
    [Figure]

21. Methods for Motion Vector Searches
    [Figure: Hierarchical Motion Estimation]

22. Hierarchical Motion Estimation
    1. Form several low-resolution versions of the target and reference pictures.
    2. Find the best-match motion vector in the lowest-resolution version.
    3. Refine the motion vector level by level when going up.

23. Some Important Issues
    Avoiding propagation of errors
      1. Send an I-frame every once in a while.
      2. Make sure the decoded frame is used for comparison.
    Bit-rate control
      - Simple feedback loop based on "buffer fullness"
      - If the buffer is too full, increase the quantization scale factor to reduce the data.

24. Details: How the Macroblock is Coded?
    - Many macroblocks will be exact matches (or close enough).
    - So send the address of each block in the image --> Addr
    - Sometimes no good match can be found, so send an INTRA block --> Type
    - Will want to vary the quantization to fine-tune compression, so send the quantization value --> Quant
    - Motion vector --> vector
    - Some blocks in a macroblock will match well, others poorly, so send a bitmask indicating which blocks are present (Coded Block Pattern, or CBP).
    - Send the blocks (4 Y, 1 Cr, 1 Cb) as in JPEG.

25. Details: H.261 Bitstream Structure
    [Figure]

26. Details
    - Need to delineate boundaries between pictures, so send Picture Start Code --> PSC
    - Need a timestamp for the picture (used later for audio synchronization), so send Temporal Reference --> TR
    - Is this a P-frame or an I-frame? Send Picture Type --> PType
    - Picture is divided into regions of 11 x 3 macroblocks called Groups of Blocks --> GOB
    - Might want to skip whole groups, so send Group Number (Grp #)
    - Might want to use one quantization value for whole group, so send Group
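The full search and the two-dimensional logarithmic search from the "Methods for Motion Vector Searches" slides can be sketched in Python as follows. This is a minimal illustration, not the H.261 reference procedure, and it rests on several assumptions that are not in the slides: frames are plain lists of rows, MAE is normalized by the block area, candidates that fall outside the reference frame are skipped, the first logarithmic step is p/2 rounded up, and ties keep the earlier candidate. The names `mae`, `fits`, `full_search`, and `log_search` are hypothetical.

```python
def mae(target, ref, x, y, i, j, n=16):
    """MAE between the n x n target block at (x, y) and the reference
    block displaced by (i, j). Frames are lists of rows: frame[row][col]."""
    total = 0
    for l in range(n):          # vertical offset inside the block
        for k in range(n):      # horizontal offset inside the block
            total += abs(target[y + l][x + k] - ref[y + j + l][x + i + k])
    return total / (n * n)


def fits(ref, x, y, i, j, n):
    """True if the displaced n x n block lies entirely inside the reference frame."""
    return (0 <= x + i and x + i + n <= len(ref[0])
            and 0 <= y + j and y + j + n <= len(ref))


def full_search(target, ref, x, y, n=16, p=7):
    """Exhaustively test every displacement in [-p, p] x [-p, p]."""
    best = None
    for j in range(-p, p + 1):
        for i in range(-p, p + 1):
            if fits(ref, x, y, i, j, n):
                cost = mae(target, ref, x, y, i, j, n)
                if best is None or cost < best[0]:
                    best = (cost, i, j)
    return best[1], best[2]


def log_search(target, ref, x, y, n=16, p=7):
    """2-D logarithmic search: nine candidates per round, step halved each round."""
    ci, cj = 0, 0                    # current centre of the search region
    step = max(1, (p + 1) // 2)      # first round samples [-p/2, p/2], rounded up
    while True:
        candidates = [
            (ci + di * step, cj + dj * step)
            for dj in (0, -1, 1)     # centre listed first, so ties keep the centre
            for di in (0, -1, 1)
        ]
        candidates = [
            (i, j) for i, j in candidates
            if abs(i) <= p and abs(j) <= p and fits(ref, x, y, i, j, n)
        ]
        ci, cj = min(candidates, key=lambda d: mae(target, ref, x, y, d[0], d[1], n))
        if step == 1:                # search region is one pixel wide: done
            return ci, cj
        step = (step + 1) // 2


# Tiny demo: every target block matches the reference block displaced by (3, -2).
SIZE = 24
reference = [[r * r + c * c for c in range(SIZE)] for r in range(SIZE)]
target = [[(r - 2) ** 2 + (c + 3) ** 2 for c in range(SIZE)] for r in range(SIZE)]
print(full_search(target, reference, 8, 8, n=4, p=7))  # (3, -2)
print(log_search(target, reference, 8, 8, n=4, p=7))   # heuristic; may differ from the full search
```

The full search evaluates (2p + 1)^2 displacements per macroblock, while the logarithmic search evaluates at most nine per round over roughly log2(p) rounds. The price is that the logarithmic search can settle on a local minimum when the error surface is not smooth.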

