Video-Conferencing System

Evan Broder and C. Christopher Post
Introductory Digital Systems Laboratory
November 2, 2007

Abstract

The goal of this project is to create a video/audio conferencing system. Video will be captured from a camera and then processed using JPEG compression. The resulting compressed signal will be framed for serial transmission along with an audio signal and sent to the decoder with a checksum to ensure data integrity. The decoder will then extract the separate signals, reverse the JPEG compression, display the resulting image on the screen, and output the audio to a speaker.

Contents

1 Introduction
2 Modules
2.1 NTSC Decoder (Chris)
2.2 Downsampler (Chris)
2.3 Block Splitter (Chris)
2.4 DCT (Evan and Chris)
2.5 Quantizer (Evan)
2.6 Entropy/Huffman Encoder (Chris)
2.7 Audio Capture (Chris)
2.8 Packer (Evan)
2.9 Unpacker (Evan)
2.10 Audio Playback (Chris)
2.11 Decoder
2.12 Color-Space Conversion and Display (Evan)
3 Testing
References

List of Figures

1 The top-level block diagram for the Video-Conferencing System
2 The JPEG compression algorithm is applied to each block of 8x8 pixels.
3 "Zigzag" ordering of elements in the block matrix [1].

1 Introduction

For this project, we will implement a real-time video conferencing system. To reduce the bandwidth needed for video transmission, we will perform JPEG compression on each video frame.

The target for the video system is a 320x240 image at 15 frames per second. Because the compression happens in several steps, it can be highly pipelined. In addition, JPEG operates independently on small blocks of the image, so it is highly parallelizable.

To maintain data integrity across the transmission medium, the video and audio data will be formed into packets, each carrying a checksum so that the receiving end can detect malformed packets.

[Figure 1 block labels. Video path: Camera, NTSC Decoder, Downsampler, Block Splitter, JPEG Encoding, Packing, Physical connection, Unpacking, JPEG Decoding, Block Reassembler, Upsampler, YCrCb to RGB Converter, VGA Driver, Monitor. Audio path: Microphone, AC97 Capture/Downsample, AC97 Interpolate/Playback, Speaker.]
Figure 1: The top-level block diagram for the Video-Conferencing System

2 Modules

2.1 NTSC Decoder (Chris)

This module will take the output of the ADV7185 ADC on the labkit and decode its packet sequence. The data from the ADV7185 will be interlaced, and since the transmission image size is 320x240, there is no reason to deinterlace the incoming video. Therefore, this module will output images at 640x240. Additionally, this module will reduce the frame rate of the video to the desired 15 frames per second.

2.2 Downsampler (Chris)

Our target resolution is 320x240. Since the incoming signal will be 640x240, we will, at the very least, need to downsample the entire image by a factor of two in the horizontal direction.
Additionally, the human eye is less sensitive to changes in chrominance (color difference) than it is to changes in luminance (brightness). JPEG therefore permits, and we will apply, downsampling of each of the two chroma channels by an additional factor of 2 in each direction. The result of this operation is stored in BRAM, where it is accessed by the Block Splitter.

2.3 Block Splitter (Chris)

The rest of the data flow operates on 8x8 pixel blocks. This module reads data from BRAM one 8x8 block at a time and feeds it to the DCT.

2.4 DCT (Evan and Chris)

[Figure 2 stages: Block of 8x8 pixels, DCT, Quantizer, Entropy/Huffman Encoding]
Figure 2: The JPEG compression algorithm is applied to each block of 8x8 pixels.

A DCT (Discrete Cosine Transform) translates spatial data into spatial frequency data. The human eye is most sensitive to changes in the DC (f = 0) and low-frequency AC components of the image and least sensitive to the high-frequency AC components. The low frequencies to which the eye is most sensitive are grouped in one area of the result matrix (the entries with small u and v).

In JPEG compression, each 8x8 block of pixels is first converted from unsigned to signed integers, such that the range is centered around 0 (for 8-bit samples, by subtracting 128). Then, a two-dimensional (type-II) DCT is applied to each 8x8 block of pixels. The result is defined by

\[
G_{u,v} = \alpha(u)\,\alpha(v) \sum_{x=0}^{7} \sum_{y=0}^{7} g_{x,y}
\cos\left[\frac{\pi}{8}\left(x + \frac{1}{2}\right) u\right]
\cos\left[\frac{\pi}{8}\left(y + \frac{1}{2}\right) v\right]
\]

where u and v represent the spatial frequencies in the horizontal and vertical directions, respectively, g_{x,y} is the pixel value at coordinates (x, y), G is the result matrix (still of size 8x8), and

\[
\alpha(n) =
\begin{cases}
\sqrt{1/8}, & \text{if } n = 0 \\
\sqrt{2/8}, & \text{otherwise}
\end{cases}
\]

This result is passed to the Quantizer.

Since this is the most significant and challenging module from an algorithmic standpoint, we will develop it cooperatively.

2.5 Quantizer (Evan)

If the brightness of a block varies at a high spatial frequency, the human eye has a difficult time detecting the exact strength of that variation. Therefore, it is not necessary to retain as much information about those high-frequency components. The information is reduced by dividing each element of the frequency-domain matrix by the corresponding element of a constant quantization matrix and rounding the result. This quantization matrix is the same for all 8x8 blocks of a given channel; however, the matrix for the luminance channel may be different from the matrix for the chrominance channels.

Quantization is the lossy stage of the compression. Because many of the elements of the quantization matrix are relatively large (e.g., > 50 for 8-bit data), the elements of the result matrix become very small numbers, or even zero.
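As a software reference for checking the hardware, the level shift, the 2-D DCT of Section 2.4, and the divide-and-round step of Section 2.5 can be sketched as follows. This is only an illustrative model, not the planned Verilog implementation: the function names, the `block[x][y]` indexing, and the flat quantization matrix of 16s are assumptions made for the example, and Python's `round` (which rounds halves to even) may differ from the rounding used in hardware.

```python
import math

N = 8  # JPEG block size used throughout the pipeline


def alpha(n):
    """Normalization factor from Section 2.4."""
    return math.sqrt(1 / N) if n == 0 else math.sqrt(2 / N)


def level_shift(block, bits=8):
    """Center unsigned samples around 0 (subtract 128 for 8-bit data)."""
    offset = 1 << (bits - 1)
    return [[sample - offset for sample in row] for row in block]


def dct_2d(block):
    """Direct evaluation of the type-II 2-D DCT; block is indexed block[x][y]."""
    result = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos(math.pi / N * (x + 0.5) * u)
                              * math.cos(math.pi / N * (y + 0.5) * v))
            result[u][v] = alpha(u) * alpha(v) * total
    return result


def quantize(coeffs, qmatrix):
    """Divide each coefficient by its quantization entry and round."""
    return [[round(coeffs[u][v] / qmatrix[u][v]) for v in range(N)]
            for u in range(N)]


# Hypothetical test input: a flat block with every sample equal to 200.
flat_block = [[200] * N for _ in range(N)]
# Hypothetical quantization matrix (assumption: every entry is 16).
flat_qmatrix = [[16] * N for _ in range(N)]

coeffs = dct_2d(level_shift(flat_block))
quantized = quantize(coeffs, flat_qmatrix)

# A flat block has no spatial variation, so only the DC term is nonzero:
# (1/8) * 64 * (200 - 128) = 576, which quantizes to 576 / 16 = 36.
print(round(coeffs[0][0], 6), quantized[0][0])
```

The direct quadruple loop costs 64 multiply-accumulates per coefficient, which is fine for a reference model; a hardware version would more likely exploit the separability of the transform into row and column 1-D DCTs.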