Multiplexing the elementary streams of H.264 video and MPEG4 HE AAC v2 audio using MPEG2 systems specification, demultiplexing and achieving lip synchronization during playback

by

NAVEEN SIDDARAJU

Presented to the Faculty of the Graduate School of The University of Texas at Arlington in Partial Fulfillment of the Requirements for the Degree of

MASTER OF SCIENCE IN ELECTRICAL ENGINEERING

Nov 2010

Copyright by Naveen Siddaraju 2010
All Rights Reserved

ACKNOWLEDGEMENTS

I am greatly thankful to my supervising professor, Dr. K. R. Rao, whose constant encouragement, guidance and support have helped me in the smooth completion of the project. He has always been accessible and helpful throughout. I also thank him for introducing me to the field of multimedia processing.

I would like to thank Dr. W. Alan Davis and Dr. William E. Dillon for taking interest in my project and agreeing to be part of my project defense committee.

I am forever grateful to my parents for their unconditional support at each turn of the road. I thank my brother and sisters, who have always been a source of inspiration. I would like to thank my friends, both in the US and in India, for their encouragement and support.

November 22, 2010

ABSTRACT

MULTIPLEXING THE ELEMENTARY STREAMS OF H.264 VIDEO AND MPEG4 HE AAC v2 AUDIO USING MPEG2 SYSTEMS SPECIFICATION, DEMULTIPLEXING AND ACHIEVING LIP SYNCHRONIZATION DURING PLAYBACK

Naveen Siddaraju, MS

The University of Texas at Arlington, 2010

Supervising Professor: Dr. K. R. Rao

Delivering broadcast quality content to mobile customers is one of the most challenging tasks in the world of digital broadcasting. Limited network bandwidth and the limited processing capability of handheld devices are critical factors that should be considered. Hence, selection of the compression schemes for the media content is very important from both economic and quality points of view. H.264, which is also known as Advanced Video Coding (AVC) [1], is the latest and the most advanced video codec available in the market
today. The H.264 baseline profile, which is used in applications such as mobile television (mobile DTV) broadcast, has one of the best compression ratios among the profiles and requires the least processing power at the decoder. The audio codec MPEG4 HE AAC v2 [2], which is also known as enhanced aacplus, is the latest audio codec belonging to the AAC (Advanced Audio Coding) [3] family. In addition to the core AAC, it uses the latest tools such as Spectral Band Replication (SBR) [2] and Parametric Stereo (PS) [2], resulting in the best perceived quality for the lowest bitrates. The audio and video codec standards have been chosen based on ATSC-M/H (Advanced Television Systems Committee, mobile/handheld) [17].

For television broadcasting applications such as ATSC-M/H and DVB [16], the encoded audio and video streams should be transmitted in a single transport stream containing fixed size data packets, which can be easily recognized and decoded at the receiver. The goal of the project is to implement a multiplexing scheme for the elementary streams of H.264 baseline and HE AAC v2 using the MPEG2 systems specifications [4], then demultiplex the transport stream and play back the decoded elementary streams with lip synchronization (audio-video synchronization).

The multiplexing involves two layers of packetization of the elementary streams of audio and video. The first level of packetization results in Packetized Elementary Stream (PES) packets, which are variable size packets and hence not suitable for transport. MPEG2 defines a transport stream where PES packets are logically organized into fixed size packets called Transport Stream (TS) packets, which are 188 bytes long. These packets are continuously generated to form a transport stream, which is decoded by the receiver, and the original elementary streams are reconstructed. The PES packets that are encapsulated into TS packets carry the time stamp information in their headers, which is used at the de-multiplexer to achieve synchronization between audio and video
elementary streams.

TABLE OF CONTENTS

ACKNOWLEDGEMENTS
ABSTRACT
LIST OF FIGURES
LIST OF TABLES
ACRONYMS AND ABBREVIATIONS

Chapter
1. INTRODUCTION
2. OVERVIEW OF H.264
    2.1 H.264/AVC
    2.2 Coding structure
    2.3 Profiles and levels
    2.4 Description of various profiles
        2.4.1 Baseline profile
        2.4.2 Extended profile
        2.4.3 Main profile
        2.4.4 High profiles
    2.5 H.264 encoder and decoder
        2.5.1 Intra prediction
        2.5.2 Inter prediction
        2.5.3 Transform and quantization
        2.5.4 Entropy coding
        2.5.5 Deblocking filter
    2.6 H.264 bitstream
3. OVERVIEW OF HE AAC V2
    3.1 HE AAC v2
    3.2 Spectral Band Replication (SBR)
    3.3 Parametric Stereo (PS)
    3.4 Enhanced aacplus encoder
    3.5 Enhanced aacplus decoder
    3.6 Advanced Audio Coding (AAC)
        3.6.1 AAC encoder
    3.7 HE AAC v2 bitstream formats
4. TRANSPORT PROTOCOLS
    4.1 Introduction
    4.2 Real-time protocol (RTP)
    4.3 MPEG2 systems layer
    4.4 Packetized elementary stream (PES)
        4.4.1 PES encapsulation process
    4.5 MPEG transport stream (MPEG-TS)
    4.6 Time stamps
5. MULTIPLEXING
6. DE-MULTIPLEXING
    6.1 Lip or audio-video synchronization
7. RESULTS
    7.1 Buffer fullness
    7.2 Synchronization skew calculation
8. CONCLUSIONS
9. FUTURE WORK
References

LIST OF FIGURES

Fig. 2.1 Video data organization in H.264 [42]
Fig. 2.2 Specific coding parts of the profiles in H.264 [5]
Fig. 2.3 Different YUV systems
Fig. 2.4 H.264 encoder [5]
Fig. 2.5 H.264 decoder [5]
Fig. 2.6 Intra prediction modes for 4x4 luma in H.264
Fig. 2.7 Different layers of JVT coding
Fig. 2.8 NAL formatting of VCL and non-VCL data [6]
Fig. 2.9 NAL unit format [6]
Fig. 2.10 Relationship between parameter sets and picture slices [24]
Fig. 3.1 HE AAC audio codec family
Fig. 3.2 Typical bitrate ranges of HE AAC v2, HE AAC and AAC for stereo [7]
Fig. 3.4 Original audio signal [28]
Fig. 3.5 High band reconstruction through SBR [28]
Fig. 3.6 Enhanced aacplus encoder block diagram [9]
Fig. 3.7 Enhanced aacplus decoder block diagram [9]
Fig. 3.8 AAC encoder block diagram [10]
Fig. 3.9 ADTS elementary stream
Fig. 4.1 RTP packet structure (simplified) [22]
Fig. 4.2 MPEG2 transport stream [22]
Fig. 4.3 Conversion of an elementary stream into PES packets [29]
Fig. 4.4 A standard MPEG-TS packet structure [14]
Fig. 4.5 Transport stream (TS) packet format used in this project
Fig. 5.1 Overall multiplexer flow diagram
Fig. 5.2 Flow chart of video processing block
Fig. 5.3 Flow chart of audio processing block
Fig. 6.1 Flow chart for the de-multiplexer used