Multiplexing H.264/AVC Video with MPEG AAC Audio
Harishankar Murugan
University of Texas at Arlington

Outline
- Multiplexing: areas of applications
- Why H.264 and AAC
- Multiplexing
- De-multiplexing
- Synchronization and playback
- Results
- Conclusions
- Future work
- References

Multiplexing: Areas of applications
- DVB, DVB-C, DVB-T
- ATSC
- IPTV

Why H.264 Video
- Up to 50% in bit rate savings: compared to H.263v2 (H.263+) or MPEG-2 Simple Profile
- High quality video: H.264 offers consistently good video quality at high and low bit rates
- Error resilience: H.264 provides the tools necessary to deal with packet loss in packet networks and bit errors in error-prone wireless networks
- Wide areas of application: streaming mobile TV, HDTV, and storage options for the home user

Important features of H.264
- IDR (instantaneous decoder refresh) picture: anchor picture with only I slices
- Sequence parameter set: profile and level indicator, decoding or playback order, number of reference frames, aspect ratio or color space details
- Picture parameter set: entropy coding mode used, slice data partitioning and macroblock reordering, flags indicating the usage of weighted bi-prediction, quantization parameter details

AAC Audio
- Advanced Audio Coding (AAC) is a standardized lossy compression scheme for audio
[Figure: block diagram of the AAC encoder]

AAC Audio: Profiles
- Low Complexity (LC): the simplest and most widely used
- Main Profile (MAIN): LC profile with backwards prediction
- Scalable Sampling Rate (SSR): LC profile with gain control tool

AAC Audio: Bit stream formats
- ADIF (Audio Data Interchange Format): only one header at the beginning of the file, followed by raw data blocks
- ADTS (Audio Data Transport Stream): separate header for each frame, enabling decoding from any frame

Why AAC Audio
- Supports sample frequencies from 8 kHz to 96 kHz (official MP3: 16 kHz to 48 kHz)
- Higher coding efficiency and a simpler filter bank (pure MDCT) compared to the MP3 hybrid filter bank
- Improved compression provides higher quality audio at smaller bit rates
- Superior performance at bit rates 64 kbps and at bit rates reaching as low as 16 kbps

Factors to be considered for Multiplexing and Transmission
- Split the video and audio coded bit streams into smaller data packets
- Multiplex with equal priority given to all elementary streams
- Detect packet losses and errors
- Carry additional information to help synchronize audio and video

Packetization
- Two layers of packetization:
  - Packetized Elementary Stream (PES)
  - Transport Stream (TS)
[Figure: video source -> H.264 encoder -> PES packetizer; audio source -> AAC encoder -> PES packetizer; data source -> packetizer; all packetizers feed the multiplexer, which outputs the transport stream]

Packetized Elementary Stream (PES)
- Elementary streams (ES):
  - Encoded video stream
  - Encoded audio stream
  - Data stream (optional)
- A PES contains access units that are sequentially separated and packetized
- PES headers distinguish the different ES and contain timestamp information
- Packet size varies with the size of the access units
[Figure: an audio or video elementary stream divided into PES packets, each consisting of a PES header and a PES payload]

PES Header Description
- 3 bytes of start code: 0x000001
- 1 byte of stream ID
- 2 bytes of packet length
- 2 bytes of time stamp (frame number)

Frame number as time stamp
- Video: frame rate is constant (25 or 30 fps)
  time = frame number / fps
- Audio: sampling rate is constant (8 to 96 kHz); number of samples per AAC frame = 1024
  time = 1024 x frame number / sampling rate
- Advantages over the method that uses clock samples as time stamps:
  - Saves the extra header bytes used for sending program clock reference (PCR) information periodically
  - No synchronization problem due to clock jitter
  - No propagation of delay between audio and video
  - Less complex and more suitable for software implementation

Transport Packets
- PES from the various elementary sources are broken into smaller packets called transport packets
- Transport packets have a fixed length of 188 bytes
- Constraints:
  - Each packet can carry data from only one PES
  - A PES header must start at the first byte of the transport packet payload
  - Stuffing bytes are added if the above constraints are not met
[Figure: transport stream packets carrying a PES; labels: PES header, PES payload, transport header, stuffing bytes]

Packet Header
Syntax                                     Number of bits
Sync byte                                  8
PID                                        10
Payload unit start indicator               1
Adaptation field control                   1
Continuity counter                         4
if (adaptation field control == 1)
    payload byte offset                    8
    stuffing bytes or additional header
payload

- PID (packet identifier): each elementary stream has a unique PID; some PIDs are reserved for NULL packets and PSI (Program Specific Information)
- PSI (Program Specific Information): the sequence parameter set and picture parameter set are sent as PSI at frequent intervals
- Payload unit start indicator: 1-bit flag indicating the presence of a PES header in the payload
- Adaptation field control: 1-bit flag indicating the presence of any data other than PES data in the payload
- Continuity counter: 4-bit rolling counter, incremented by 1 for each consecutive TS packet of the same PID; used to detect packet loss
- Payload byte offset: if the adaptation field control bit is 1, the byte offset of the start of the payload (that is, the length of the adaptation field) is given here
- Adaptation field:
  - Stuffing bytes, if PES data < TS packet size
  - Additional header information

Multiplexing method adopted
- The multiplexing method affects buffer fullness at the de-multiplexer and, in turn, playback
- Video and audio timing counters are used to ensure proper multiplexing
- Timing counters are incremented according to the playback time of each packet multiplexed
- The PES with the least timing counter value is always given preference during packet allocation
- Example: fps = 25, video PES length = 570 bytes
  - Number of TS packets for the video PES: 570/185, rounded up = 4 TS packets
  - Playback time of one video frame: 1/25 s = 40 ms
  - Playback time per TS packet: 40/4 = 10 ms
[Figure: multiplexed transport stream built from video PES and audio PES, with TS packets labeled by PID: P1 V 0x2, P1 A 0x4, P1 A 0x5, P1 A 0x6, P1 V 0x3, N, N, P1 A 0x7]

Demultiplexing

Buffer fullness at demultiplexer
Test criteria                         Video buffer   Video buffer   Video buffer
                                      100 kB         600 kB         600 kB
Start TS packet number                1              1              3850
End TS packet number                  802            4341           7928
Video buffer fullness (kB)            100            600            600
Audio buffer fullness (kB)            33             142            119
Number of video frames                43             211            173
Number of audio frames                82             397            326
Video buffer content playback time    1.72 sec       8.44 sec       6.92 sec
Audio buffer content playback time    1.75 sec       8.47 sec       6.94 sec

Synchronization and playback
- During playback, data is loaded from the buffer
- The IDR frame is searched for from the top of the video buffer
- The frame number of the IDR frame is extracted
- Corresponding audio frame number is calculated as
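
A minimal sketch of this synchronization step, assuming the matching audio frame is the one whose playback time equals that of the IDR frame under the frame-number timestamps defined earlier (video: time = frame number / fps; audio: time = 1024 x frame number / sampling rate). Equating the two times gives audio frame number = video frame number x sampling rate / (1024 x fps). The function names and the round-to-nearest choice are illustrative assumptions, not taken from the slides.

```python
AAC_SAMPLES_PER_FRAME = 1024  # samples per AAC frame, as stated in the slides


def video_time(frame_number: int, fps: float) -> float:
    """Playback time in seconds of a video frame (time = frame number / fps)."""
    return frame_number / fps


def audio_time(frame_number: int, sampling_rate: float) -> float:
    """Playback time in seconds of an AAC frame (time = 1024 * frame number / rate)."""
    return AAC_SAMPLES_PER_FRAME * frame_number / sampling_rate


def matching_audio_frame(video_frame_number: int, fps: float, sampling_rate: float) -> int:
    """Audio frame whose playback time is closest to the given video frame.

    Derived by setting video_time == audio_time and solving for the audio
    frame number. Rounding to the nearest frame is an assumption.
    """
    return round(video_frame_number * sampling_rate / (AAC_SAMPLES_PER_FRAME * fps))


if __name__ == "__main__":
    idr_frame = 25  # hypothetical IDR frame number found in the video buffer
    a = matching_audio_frame(idr_frame, fps=25, sampling_rate=48000)
    print(a, video_time(idr_frame, 25), audio_time(a, 48000))
```

With this choice the residual audio/video offset is at most half an AAC frame, which is about 10.7 ms at 48 kHz.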
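
The timing-counter multiplexing method described earlier can be sketched as follows. This is an illustration under stated assumptions, not the author's implementation: each TS packet is taken to carry 185 payload bytes (as in the 570/185 example), each PES is split into the number of TS packets needed, every TS packet advances its stream's counter by an equal share of the frame's playback time, and ties go to video. The names `ts_packets_needed`, `packet_schedule`, and `multiplex` are hypothetical.

```python
import math

TS_PAYLOAD_BYTES = 185        # assumed payload per 188-byte TS packet (from the 570/185 example)
AAC_SAMPLES_PER_FRAME = 1024  # samples per AAC frame, as stated in the slides


def ts_packets_needed(pes_length: int) -> int:
    """Number of TS packets required to carry one PES (rounded up)."""
    return math.ceil(pes_length / TS_PAYLOAD_BYTES)


def packet_schedule(pes_lengths, frame_time):
    """Expand each PES into (pes_index, playback_time_per_ts_packet) entries."""
    schedule = []
    for index, length in enumerate(pes_lengths):
        n = ts_packets_needed(length)
        schedule.extend([(index, frame_time / n)] * n)
    return schedule


def multiplex(video_pes_lengths, audio_pes_lengths, fps, sampling_rate):
    """Return the TS packet order as a list of (stream, pes_index) tuples."""
    queues = {
        "V": packet_schedule(video_pes_lengths, 1.0 / fps),
        "A": packet_schedule(audio_pes_lengths, AAC_SAMPLES_PER_FRAME / sampling_rate),
    }
    counters = {"V": 0.0, "A": 0.0}  # playback time already multiplexed per stream
    position = {"V": 0, "A": 0}
    order = []
    while True:
        ready = [s for s in ("V", "A") if position[s] < len(queues[s])]
        if not ready:
            break
        # The stream with the least timing counter gets the next TS packet.
        stream = min(ready, key=lambda s: counters[s])
        pes_index, duration = queues[stream][position[stream]]
        order.append((stream, pes_index))
        counters[stream] += duration
        position[stream] += 1
    return order


if __name__ == "__main__":
    # One 570-byte video PES at 25 fps; the two 300-byte audio PES are made-up sizes.
    for stream, pes_index in multiplex([570], [300, 300], fps=25, sampling_rate=48000):
        print(stream, pes_index)
```

Because each counter tracks playback time rather than byte count, the two buffers at the de-multiplexer fill with roughly equal playback durations, which matches the near-equal video and audio playback times reported in the buffer fullness table.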