UT Arlington EE 5359 - MULTIPLEXING H.264 VIDEO

MULTIPLEXING H.264 VIDEO WITH AAC AUDIO BITSTREAMS, DEMULTIPLEXING AND ACHIEVING LIP SYNC DURING PLAYBACK

Harishankar Murugan
K. R. Rao, Fellow, IEEE

Abstract: This paper develops a multiplexing-demultiplexing system for an H.264 video stream and an AAC audio stream that is based on the MPEG-2 systems framework and uses frame numbers as timestamps. It also provides lip sync after decoding the video and audio bit streams.

Keywords: H.264 video, AAC audio, multiplexing/demultiplexing.

1. Introduction

With better quality and lower bandwidth requirements, digital television transmission has largely replaced analog television transmission. With the advent of HDTV, transmission schemes aim to deliver superior-quality video, with provision to view both the standard format and the wide-screen (16:9) format, along with one or more audio streams per channel. Digital video broadcasting (DVB) in Europe and the Advanced Television Systems Committee (ATSC) [21] in North America are working in parallel to achieve high-quality video and audio transmission. Choosing the right video codec and audio codec plays a very important role in meeting the bandwidth and quality requirements. H.264, also known as MPEG-4 Part 10 or AVC [5], achieves about 50% bit rate savings compared to earlier standards [12]. H.264 also provides the tools necessary to deal with packet losses in packet networks and with bit errors in error-prone wireless networks. These features make this codec the right candidate for transmission.

Advanced audio coding (AAC) [1] is a standardized lossy compression scheme for audio, used in MPEG-2 [1] and MPEG-4 [2]. This codec shows higher coding efficiency and superior performance at both low and high bit rates compared to MP3 and AC-3. The H.264 video and AAC audio coded bit streams need to be multiplexed in order to construct a single stream.
The multiplexing process mainly focuses on splitting the individual streams into small packets, embedding information to easily realign the packets, achieving lip sync between the individual streams, and providing provision to detect and correct bit errors and packet losses. In this paper, the process of encoding the video and audio streams, multiplexing the compressed streams, and then demultiplexing, decoding, and synchronizing the individual streams during playback is explained in detail.

Factors to be considered for multiplexing and transmission:
- Split the video and audio coded bit streams into smaller data packets
- Maintain buffer fullness at the demultiplexer without buffer overflow or underflow
- Detect packet losses and errors
- Send additional information to help synchronize audio and video

Fig. 1: Two layers of packetization

Packetization

According to MPEG-2 systems, two layers of packetization (Fig. 1) are carried out. The first layer of packetization yields the packetized elementary stream (PES), and the second layer yields the transport stream (TS); this second layer is what is used for transmission. Multiplexing takes place after the second layer of packetization, just before transmission.

Packetized elementary stream

Packetized elementary stream (PES) packets are obtained by encapsulating coded video, coded audio, and data elementary streams (Fig. 2). This forms the first layer of packetization. The encapsulation of video and audio data is done by sequentially separating the elementary streams into access units. Access units, in the case of audio and video elementary streams, are audio and video frames, respectively. Each PES packet contains data from one and only one elementary stream.
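The first-layer encapsulation described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the dictionary header and the function name are assumptions, and the stream ID values are only placeholders.

```python
def packetize_es(frames, stream_id):
    """First-layer packetization: wrap each access unit (one audio or
    video frame) in its own PES-style packet.  Each packet carries data
    from exactly one elementary stream."""
    packets = []
    for frame_number, payload in enumerate(frames):
        header = {
            "stream_id": stream_id,        # distinguishes elementary streams
            "packet_length": len(payload), # frames are variable-length
            "timestamp": frame_number,     # frame number used as timestamp
        }
        packets.append((header, payload))
    return packets

# Separate PES packet lists for the video and audio elementary streams.
video_pes = packetize_es([b"frame0", b"frame1"], stream_id=0xE0)
audio_pes = packetize_es([b"aacframe0"], stream_id=0xC0)
```

Keeping one access unit per packet is what lets a frame number double as a per-packet timestamp later on.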
PES packets may have a variable length, since the frame size in both audio and video bit streams is variable. A PES packet consists of the PES packet header followed by the PES packet payload. The header information distinguishes the different elementary streams, carries synchronization information in the form of timestamps, and carries other useful information.

PES packet header

The PES header consists of 3 bytes of start code with the value 0x000001, followed by 2 bytes of stream ID and 2 bytes of packet length.

Fig. 2: PES from elementary stream

The PES packet length field allows explicit signaling of the size of the PES packet (up to 65535 bytes); in the case of longer video elementary streams, the size may be indicated as unbounded by setting the packet length field to zero. Finally, the last 2 bytes of the header are used for timestamps: the frame number of the corresponding audio or video frame in the PES packet is sent as the timestamp information in the header.

Frame number as timestamp

The proposed method uses frame numbers as timestamps. Both H.264 and AAC bit streams are composed of data blocks organized into frames. A particular video bit stream has a constant frame rate during playback, specified in frames per second (fps). So, given the frame number, one can calculate the time of occurrence of this frame in the video sequence during playback as follows:

Time of playback = frame number / fps    (1)

The AAC compression standard [1] defines each audio frame to contain 1024 samples. The audio data in the AAC bit stream can have any discrete sampling frequency between 8 and 96 kHz; the frame duration therefore increases as the sampling frequency decreases from 96 kHz to 8 kHz. However, the sampling frequency, and hence the frame duration, remains constant throughout a particular audio stream.
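Equation (1) and the fixed 1024-sample AAC frame can be checked numerically; this short sketch (function names are illustrative) shows the video playback time and how the audio frame duration stretches at lower sampling rates:

```python
def video_playback_time(frame_number, fps):
    # Eq. (1): playback instant of a video frame at a constant frame rate.
    return frame_number / fps

def aac_frame_duration(sampling_rate):
    # Each AAC frame carries 1024 samples, so its duration is 1024 / fs;
    # the duration grows as the sampling frequency drops.
    return 1024 / sampling_rate

print(video_playback_time(75, 25.0))  # frame 75 at 25 fps plays at t = 3.0 s
print(aac_frame_duration(96000))      # about 10.7 ms per frame at 96 kHz
print(aac_frame_duration(8000))       # 128 ms per frame at 8 kHz
```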
So, the time of occurrence of the frame during playback is as follows:

Playback time = (1024 * frame number) / sampling rate    (2)

Thus, from (1) and (2), we can find the time of playback by encoding the frame numbers as the timestamps. In other words, given the frame number of one stream, we can find the frame number of the other stream that will be played at the same time as the frame of the first stream. This helps us synchronize the streams during playback. This idea
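The cross-stream mapping implied by (1) and (2) can be sketched directly; the function name and the rounding to the nearest frame are illustrative assumptions, not the paper's stated procedure:

```python
def audio_frame_at(video_frame, fps, sampling_rate):
    """Find the audio frame scheduled at the same playback instant as a
    given video frame, by equating Eq. (1) with Eq. (2)."""
    t = video_frame / fps                   # Eq. (1): video playback time
    return round(t * sampling_rate / 1024)  # invert Eq. (2), nearest frame

# e.g. with 25 fps video and 48 kHz AAC audio, video frame 96 plays at
# t = 3.84 s, which falls in audio frame 180:
print(audio_frame_at(96, 25.0, 48000))
```

Because both mappings need only a frame number and a constant rate, no wall-clock timestamps have to be carried in the bit streams.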

