FAST INTER AND INTRA MODE DECISION ALGORITHM BASED ON THREAD-LEVEL PARALLELISM IN H.264 VIDEO CODING

Project Guide – Dr. K. R. Rao
Tejas Sathe (1000731145)
[email protected]: To reduce H.264 video encoder complexity by incorporating a fast inter and intra mode decision algorithm that uses thread-level parallelism.

Motivation:
The most recent advances in microprocessor design for desktop computers involve putting multiple processors on a single computer chip. These multicore designs are replacing the traditional single-core designs that have been the foundation of desktop computers.

The primary problem is that regular software has not been designed to take advantage of the new multicore architectures. To see any real speedup from them, currently used software will have to be redesigned.

In the H.264 encoder, the major complexity lies in the motion estimation block. With thread-level parallelization, hardware resources can be used efficiently and a significant encoding speedup can be achieved.

Introduction:

1. H.264 CODEC standard:
H.264/MPEG-4 Part 10, or AVC (Advanced Video Coding), is a standard developed by the ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC Moving Picture Experts Group (MPEG). It is used for video compression and is currently one of the most commonly used formats for the recording, compression, and distribution of high-definition video.

H.264 is a new video compression scheme that is becoming the worldwide digital video standard for consumer electronics and personal computers. H.264/AVC achieves a significant improvement in rate-distortion efficiency, typically providing a factor of two in bit-rate savings compared with existing standards such as MPEG-2. In particular, H.264 has already been selected as a key compression scheme (codec) for the next generation of optical disc formats, HD-DVD and Blu-ray Disc.

H.264 has the following profiles, as shown in Fig. 1 [1]:
   1. Baseline Profile: real-time conversational services, e.g. video conferencing and videophone.
   2. Main Profile: digital storage media and television broadcasting.
   3. Extended Profile: multimedia services over the Internet.
   4. Four High Profiles: content contribution, content distribution, and studio editing and post-processing.

Fig. 1. H.264 profiles [1]

Fig. 2. Block diagram of the H.264 algorithm [1]

As shown in Fig. 2, the encoder performs inter and intra prediction to remove temporal and spatial redundancy, respectively, in the frames. Mode selection algorithms are used to select the best prediction mode for the current macroblock within a frame.

To select the best mode for one macroblock in intra prediction, the H.264/AVC encoder carries out 592 RDO calculations. As a result, the complexity of the encoder increases sharply. This project focuses on reducing encoder complexity using thread-level parallelism.

2. Thread-level parallelism:
The focus of software design and development will have to shift from sequential programming techniques to parallel and multithreaded programming techniques.

3. Multicore [6]:
A multicore is an architecture design that places multiple processors on a single die (computer chip). Each processor is called a core. As chip capacity increased, placing multiple processors on a single chip became practical. These designs are known as Chip Multiprocessors (CMPs) because they allow for single-chip multiprocessing.

Multicore architectures are now center stage in terms of improving overall system performance. CMPs come in multiple flavors: two-processor (dual-core), four-processor (quad-core), and eight-processor (octa-core) configurations.

When implemented properly, threading can enhance performance by making better use of hardware resources. Taking advantage of multicore processors requires knowledge of the software threading model as well as the capabilities of the platform hardware.

4. Thread [7]:
A thread can be defined from both a software and a hardware point of view. In software, a thread is a discrete sequence of related instructions that is executed independently of other instruction sequences. Every program has at least one thread, called the main thread, which can in turn create other threads. At the hardware level, a thread is an execution path that remains independent of other hardware execution paths.

Goal:
Thread-level parallelization can run some of the threads within a process in parallel, but the major hurdle to implementing it is the complicated data dependences found in multimedia applications. The H.264 encoder has data dependences between the inter mode decision and the intra mode decision, especially when rate-distortion optimization (RDO) is used.

The goal is to implement an RDO mode decision algorithm based on thread-level parallelization for the H.264 encoder using the JM reference software (version 17.2), one that can efficiently resolve these dependences and exploit thread-level parallelism for fast mode decision.

The challenge in the project is to reduce the total encoding time without PSNR loss or bit-rate increase.

References:
[1] Soon-kak Kwon, A. Tamhankar and K. R. Rao, “Overview of H.264/MPEG-4 part 10”, Video/Image Processing and Multimedia Communications, 2003.
[2] T. Wiegand, et al., “Overview of the H.264/AVC video coding standard”, IEEE Trans. on Circuits and Systems for Video Technology, vol. 13, pp. 560-576, July 2003.
[3] D. Marpe, T. Wiegand and G. J. Sullivan, “The H.264/MPEG-4 AVC standard and its applications”, IEEE Communications Magazine, vol. 44, pp. 134-143, Aug. 2006.
[4] J. Kim, D. Kim, and J. Jeong, “Complexity reduction algorithm for intra mode selection in H.264/AVC video coding”, J. Blanc-Talon et al. (Eds.): ACIVS 2006, LNCS 4179, pp. 454-465, Springer-Verlag Berlin Heidelberg, 2006.
[5] Ju-Ho Hyun, “Fast mode decision algorithm based on thread-level parallelization and thread slipstreaming in H.264 video coding”, Multimedia and Expo (ICME), 2010 IEEE International Conference.
[6] Cameron Hughes and Tracey Hughes, “Professional Multicore Programming: Design and Implementation for C++ Developers”, Wiley, 2010.
[7] Shameem Akhter and Jason Roberts, “Multi-Core Programming: Increasing Performance through Software Multi-threading”, Intel Press, 2006.
[8] Eric Q. Li and Yen-Kuang Chen, “Implementation of H.264 Encoder on General-Purpose
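
Illustrative sketch:
The fork-join structure implied by the Goal section can be sketched as follows: the main thread starts one worker for the inter candidates and one for the intra candidates, waits for both, and keeps the mode with the lowest rate-distortion cost J = D + lambda * R. This is only a minimal sketch under stated assumptions, not the JM 17.2 implementation. The function names, the candidate mode names, the distortion and rate numbers, and the value of lambda are all hypothetical. The sketch also assumes the two searches are independent, whereas the Goal section notes that the real encoder has data dependences between them that must be resolved first. Python threads show the structure only and are not expected to give a speedup for CPU-bound work.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical Lagrange multiplier; a real encoder derives it from the QP.
LAMBDA = 1.0


def rd_cost(distortion, rate_bits, lam=LAMBDA):
    """Rate-distortion cost J = D + lambda * R."""
    return distortion + lam * rate_bits


def best_mode(candidates, lam=LAMBDA):
    """Return (cost, mode_name) for the cheapest candidate.

    candidates maps a mode name to a (distortion, rate_bits) pair.
    """
    return min((rd_cost(d, r, lam), name) for name, (d, r) in candidates.items())


def decide_mode(inter_candidates, intra_candidates, lam=LAMBDA):
    """Evaluate inter and intra candidates in two worker threads.

    Assumes the two searches do not depend on each other's results.
    """
    with ThreadPoolExecutor(max_workers=2) as pool:
        inter_future = pool.submit(best_mode, inter_candidates, lam)
        intra_future = pool.submit(best_mode, intra_candidates, lam)
        # The main thread joins both workers, then compares their winners.
        return min(inter_future.result(), intra_future.result())


# Made-up measurements for one macroblock: mode -> (distortion, rate in bits).
INTER_CANDIDATES = {"INTER_16x16": (1200.0, 40), "INTER_8x8": (900.0, 400)}
INTRA_CANDIDATES = {"INTRA_16x16": (1500.0, 30), "INTRA_4x4": (1000.0, 350)}
```

Calling decide_mode(INTER_CANDIDATES, INTRA_CANDIDATES) with these made-up numbers selects INTER_16x16, whose cost is 1200 + 1.0 * 40 = 1240.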