UT Arlington EE 5359 - MULTIPLEXING H.264 VIDEO


MULTIPLEXING H.264 VIDEO WITH AAC AUDIO BIT STREAMS: DEMULTIPLEXING AND ACHIEVING LIP SYNC DURING PLAYBACK

Harishankar Murugan, K. R. Rao, Fellow, IEEE

Abstract - This paper develops a multiplexing/demultiplexing system for an H.264 video stream and an AAC audio stream that is based on the MPEG-2 framework and uses frame numbers as timestamps. It also provides lip sync after decoding the video and audio bit streams.

Keywords - H.264 video, AAC audio, multiplexing, demultiplexing

1. Introduction

With better quality and lower bandwidth requirements, digital television transmission has already replaced analog television transmission in a big way. With the advent of HDTV, transmission schemes aim to transmit superior-quality video with provision to view both the standard format and the widescreen 16:9 format, along with one or more audio streams per channel. Digital video broadcasting (DVB) in Europe and the Advanced Television Systems Committee (ATSC) [21] in North America are working in parallel to achieve high-quality video and audio transmission.

Choosing the right video codec and audio codec plays a very important role in meeting the bandwidth and quality requirements. H.264 (MPEG-4 Part 10, or AVC) [5] achieves about 50% bit rate savings as compared to earlier standards [12]. H.264 also provides the tools necessary to deal with packet losses in packet networks and with bit errors in error-prone wireless networks. These features make this codec the right candidate for use in transmission. Advanced audio coding (AAC) [1] is a standardized lossy compression scheme for audio used in MPEG-2 [1] and MPEG-4 [2]. This codec shows higher coding efficiency and superior performance at both low and high bit rates as compared to MP3 and AC-3.

The H.264 video and AAC audio coded bit streams need to be multiplexed in order to construct a single stream. The multiplexing process mainly focuses on splitting the individual streams into small packets, embedding information to easily realign the packets, achieving lip sync between the individual streams, and providing provision to detect and correct bit errors and packet losses. In this paper, the process of encoding the video and audio streams and multiplexing the compressed streams, followed by demultiplexing, decoding, and synchronizing the individual streams during playback, is explained in detail.

Factors to be considered for multiplexing and transmission:
- Split the video and audio coded bit streams into smaller data packets.
- Maintain buffer fullness at the demultiplexer without buffer overflow or underflow.
- Detect packet losses and errors.
- Send additional information to help synchronize audio and video.

Fig. 1: Two layers of packetization (video, audio, and data sources are encoded, packetized into PES packets, and multiplexed into an MPEG-2 transport stream).

Packetization

According to MPEG-2 systems, two layers of packetization (Fig. 1) are carried out. The first layer of packetization yields the packetized elementary stream (PES), and the second layer yields the transport stream (TS). This second layer is what is used for transmission. Multiplexing takes place after the second layer of packetization, just before transmission.

Packetized elementary stream

The packetized elementary stream (PES) packets are obtained by encapsulating coded video, coded audio, and data elementary streams (Fig. 2). This forms the first layer of packetization. The encapsulation of video and audio data is done by sequentially separating the elementary streams into access units. Access units, in the case of audio and video elementary streams, are audio and video frames respectively. Each PES packet contains data from one and only one elementary stream. PES packets may have a variable length, since the frame size in both audio and video bit streams is variable. The PES packet consists of the PES packet header followed by the PES packet payload. The header information distinguishes different elementary streams, carries the synchronization information in the form of timestamps, and carries other useful information.

Fig. 2: PES packets formed from an elementary stream (each frame becomes the payload of one PES packet, preceded by a header).

PES packet header

The PES header consists of 3 bytes of start code with a value of 0x000001, followed by 2 bytes of stream ID and 2 bytes of packet length. The PES packet length field allows explicit signaling of the size of the PES packet, up to 65535 bytes; in the case of longer video elementary streams, the size may be indicated as unbounded by setting the packet length field to zero. Finally, the last 2 bytes of the header are used for the timestamp. The frame number of the corresponding audio or video frame in the PES packet is sent as the timestamp information in the header.

Frame number as timestamp

The proposed method uses frame numbers as timestamps. Both H.264 and AAC bit streams are composed of data blocks sorted into frames. A particular video bit stream has a constant frame rate during playback, specified in frames per second (fps). So, given the frame number, one can calculate the time of occurrence of this frame in the video sequence during playback as follows:

    Time of playback = Frame number / fps    (1)

The AAC compression standard [1] defines each audio frame to contain 1024 samples. The audio data in the AAC bit stream can have any discrete sampling frequency between 8 and 96 kHz; the frame duration increases as the sampling frequency decreases from 96 kHz to 8 kHz. However, the sampling frequency, and hence the frame duration, remains constant throughout a particular audio stream. So the time of occurrence of a frame during playback is as follows:

    Playback time = (1024 x Frame number) / Sampling rate    (2)

Thus, from (1) and (2), we can find the time of playback by encoding the frame numbers as the timestamps. In other words, given the frame number of one stream, we can then find the frame number of the other streams that will be played at the same time as the frame of the first stream. This helps us synchronize the streams during playback. This idea can be extended to synchronize more than one audio stream with a single video stream, as in the case of stereo or programs with a single video and multiple audio channels.

The timestamp is assigned in the last 2 bytes of the PES packet header. This implies that the timestamp can carry frame numbers only up to 65535 (2^16 - 1). Once the frame number exceeds this, in the case of long video and audio streams, the frame number is rolled over. The rollover takes place simultaneously on both the audio and video frame numbers as soon as either one of the streams crosses the maximum allowed frame number. This will not create a conflict at the
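Equations (1) and (2), and the cross-stream mapping they enable, can be sketched as follows. This is a minimal illustration, not code from the paper; the function names and the example rates (25 fps video, 48 kHz audio) are chosen here for demonstration.

```python
# Sketch of equations (1) and (2): mapping frame numbers to playback
# time, and finding the AAC frame that plays alongside a video frame.
# Function names are illustrative, not from the paper.

AAC_SAMPLES_PER_FRAME = 1024  # fixed by the AAC standard

def video_playback_time(frame_number: int, fps: float) -> float:
    """Equation (1): playback time of a video frame, in seconds."""
    return frame_number / fps

def audio_playback_time(frame_number: int, sampling_rate: int) -> float:
    """Equation (2): playback time of an AAC frame, in seconds."""
    return (AAC_SAMPLES_PER_FRAME * frame_number) / sampling_rate

def audio_frame_for_video_frame(video_frame: int, fps: float,
                                sampling_rate: int) -> int:
    """Audio frame number scheduled at (or just before) the video frame."""
    t = video_playback_time(video_frame, fps)
    return int(t * sampling_rate // AAC_SAMPLES_PER_FRAME)

# Example: at 25 fps, video frame 250 plays at t = 10 s; with 48 kHz
# audio, the AAC frame playing then is frame 468 (starts at 9.984 s).
```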

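The 9-byte PES header layout described above (3-byte start code 0x000001, 2-byte stream ID, 2-byte packet length, and a 2-byte frame-number timestamp) can be sketched as a simple pack/parse pair. Big-endian field order is an assumption here (it is conventional in MPEG bit streams), and the function names are illustrative.

```python
import struct

START_CODE = b"\x00\x00\x01"  # 3-byte PES start code

def pack_pes_header(stream_id: int, packet_length: int,
                    frame_number: int) -> bytes:
    """Pack the 9-byte header described in the paper: start code,
    2-byte stream ID, 2-byte packet length (0 = unbounded), and the
    frame number used as a 2-byte timestamp. Big-endian byte order
    is assumed, as is conventional for MPEG bit streams."""
    return START_CODE + struct.pack(">HHH", stream_id, packet_length,
                                    frame_number)

def parse_pes_header(data: bytes):
    """Inverse of pack_pes_header; rejects a missing start code."""
    if data[:3] != START_CODE:
        raise ValueError("missing PES start code")
    return struct.unpack(">HHH", data[3:9])

# Round-trip check with an unbounded-length packet (length field = 0).
header = pack_pes_header(stream_id=0x01E0, packet_length=0, frame_number=42)
assert len(header) == 9
assert parse_pes_header(header) == (0x01E0, 0, 42)
```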

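The simultaneous rollover described at the end of the preview, where both streams' frame counters wrap as soon as either one crosses the 2-byte maximum, can be sketched as a multiplexer-side counter pair. The preview text is cut off before the full rollover rule is given, so this is only one plausible reading; the class and method names are hypothetical.

```python
MAX_TIMESTAMP = 1 << 16  # a 2-byte timestamp field holds 0..65535

class FrameCounters:
    """Tracks video and audio frame numbers and rolls both over
    together as soon as either would exceed the 2-byte timestamp
    range, per the scheme sketched in the paper (name illustrative)."""

    def __init__(self):
        self.video = 0
        self.audio = 0

    def next_video(self) -> int:
        ts = self.video          # timestamp to place in the PES header
        self.video += 1
        self._maybe_rollover()
        return ts

    def next_audio(self) -> int:
        ts = self.audio
        self.audio += 1
        self._maybe_rollover()
        return ts

    def _maybe_rollover(self):
        # Roll over BOTH counters as soon as either crosses the max,
        # so the two streams stay in the same rollover epoch.
        if self.video >= MAX_TIMESTAMP or self.audio >= MAX_TIMESTAMP:
            self.video = 0
            self.audio = 0
```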