UT EE 382C - Modeling of an MPEG Audio Layer-3 Encoder in Ptolemy - D268739

School name University of Texas at Austin

Course Ee 382c- Topics in Computer Engineering

Pages 9

Modeling of an MPEG Audio Layer-3 Encoder in Ptolemy

Patrick Brown
EE382C – Embedded Software Systems
May 10, 2000

Abstract

MPEG Audio Layer-3 is a standard for the compression of high-quality digital audio. It has rapidly become very popular and has found widespread application in devices such as portable, all-electronic music players and in Internet audio. Currently, many algorithms for Layer-3 audio encoding are used. Their performances vary greatly, but the average encoding rate is approximately one second of digital audio encoded per second on a typical Windows-based desktop computer. While this rate is acceptable to most personal computer users, some applications demand a much higher encoding rate. For example, a typical radio station has thousands of CDs, each containing up to 74 minutes of digital audio. If the radio station decides to convert all of its audio to MPEG Layer-3, 74 minutes to encode each CD is unacceptable. The goal of this project is a formal modeling of an MP3 encoder, which will expose the parallelism in the encoding algorithm, facilitating the scaling of the algorithm to a multi-processor implementation. Much greater throughput is achievable by scaling the algorithm to multiple processors.

Context

MPEG (Moving Picture Experts Group) Audio Layer-3 [1], more commonly known as “MP3,” is part of the set of standards known as “MPEG-1,” which was approved by the International Organization for Standardization (ISO) in November 1992 [2]. The primary focus of this standard is the compression of high-quality, synchronized audio and video to a data rate of approximately 1.5 Mbps [3]. This standard consists of three main parts: system, video, and audio. Within the audio portion of the standard, there are three “layers.” Layer-3 provides the highest compression at a given sound quality. Table 1 [4] shows some of the common compression ratios available using MP3 compression based on the relative quality of the resulting audio.

Sound Quality   Bandwidth     Mode     Bit-Rate        Compression Ratio
Telephone       2.5 kHz       mono     8 kbps          96:1
AM Radio        7.5 kHz       mono     32 kbps         24:1
FM Radio        11 kHz        stereo   56 - 64 kbps    26:1 - 24:1
Near CD         15 kHz        stereo   96 kbps         16:1
CD Quality      Over 15 kHz   stereo   112+ kbps       Up to 12:1

Table 1: MP3 compression ratios for various output sound qualities.

In MP3, as with other source coding standards, the decoder is rigidly defined, whereas great flexibility exists in the design of the encoder. Many “freeware,” “shareware,” and commercial encoders exist, some of which are open source. The formal model will be based on the L.A.M.E. encoder [5], because it is open source, freely distributable, efficient, and produces good sound quality. As with most signal processing applications, the encoding algorithm contains a large amount of parallelism. A major benefit of building the formal model is the exposure of this parallelism, because this is what will make the algorithm scalable to multiple processors.

Objectives

The formal model of the MPEG Layer-3 encoder will be a dataflow graph consisting of various blocks, or “actors,” each containing a portion of the L.A.M.E. source code. The actors will exist within the SDF (Synchronous Data Flow) model of computation. In this model, or “domain,” each actor has a fixed number of input and output ports. Each of these ports receives or sends a fixed number of “tokens” of data, such as a single integer or floating-point number, or a matrix of values. There is no notion of time within this domain. This domain is well-suited to the modeling of an MP3 encoder, because input audio is processed sequentially, 576 samples at a time, as quickly as possible, with no consideration of time or other factors that may be present in other domains. In addition to exposing the inherent parallelism, the formal modeling also allows retargeting of the algorithm to different implementations.
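
The fixed token rates described above are what allow an SDF graph to be scheduled before it runs: the number of firings of each actor per schedule period follows from the balance equations (tokens produced on an edge must equal tokens consumed). The Python sketch below is only an illustration of that idea and is not code from Ptolemy or L.A.M.E. The actor names and the per-edge rates are assumptions chosen for the example; only the 576-sample block size comes from the text.

```python
from fractions import Fraction
from math import gcd


def repetitions(edges):
    """Return the smallest integer firing counts that balance every edge.

    Each edge is (source, tokens_produced, destination, tokens_consumed).
    The graph is assumed to be connected.
    """
    rate = {edges[0][0]: Fraction(1)}  # relative firing rate per actor
    pending = list(edges)
    while pending:
        remaining = []
        for src, produced, dst, consumed in pending:
            if src in rate and dst in rate:
                # Both ends known: the balance equation must already hold.
                if rate[src] * produced != rate[dst] * consumed:
                    raise ValueError("inconsistent token rates")
            elif src in rate:
                rate[dst] = rate[src] * produced / consumed
            elif dst in rate:
                rate[src] = rate[dst] * consumed / produced
            else:
                remaining.append((src, produced, dst, consumed))
        if len(remaining) == len(pending):
            raise ValueError("graph is not connected")
        pending = remaining
    # Scale the fractional rates to integers with the LCM of denominators.
    scale = 1
    for r in rate.values():
        scale = scale * r.denominator // gcd(scale, r.denominator)
    return {actor: int(r * scale) for actor, r in rate.items()}


# Hypothetical single-channel chain; names and rates are illustrative only.
edges = [
    ("pcm_source", 1, "polyphase_filter_bank", 32),
    ("polyphase_filter_bank", 32, "mdct", 576),
    ("mdct", 576, "spectrum_plot", 576),
]
reps = repetitions(edges)
print(reps)
```

Under these assumed rates, one schedule period fires the source 576 times for every single firing of the 576-sample stage, which is the kind of static information a scheduler can use when mapping actors onto several processors.
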
Therefore, it will be possible to apply the algorithm, which was originally a C program written to run on a single general-purpose processor, to a wide variety of platforms, such as a multiple processor workstation or custom hardware containing multiple DSPs. Converting the C code to C++ and importing it into SDF actors introduces additional processing overhead, because the domain must provide a means for communication between actors. In the original algorithm, this was simply done by function calls. Scheduling the algorithm on multiple processors also introduces overhead, because processors must spend time sending and receiving data. However, the nature of the algorithm is such that the amount of inter-actor and inter-processor communication is very small. This is the main reason that a formal model of the encoder is beneficial.

Implementation and Modeling

Figure 1: Block diagram representation of an MP3 encoder. (Blocks: PCM Audio Input; Time to Frequency Mapping Filter Bank; Psychoacoustic Model; Noise Allocation, Quantizer, and Coding; Bit Stream Formatting; Encoded Bit Stream.)

The formal model was constructed using Ptolemy [6]. Layer-3 is a very complex encoding scheme; the actual ISO standard [2] is nearly 200 pages long, and the L.A.M.E. encoder source is approximately 20,000 lines in length. Unfortunately, it was not possible to implement a fully functional encoder due to this complexity and the time constraints of the project. Figure 2 shows a screen shot of the implemented model.

Figure 2: The model

The model includes the most important filtering actors. The model does not include the final stages of the algorithm, which involve noise allocation and bit stream formatting to produce the actual MP3 output file. These portions of the algorithm do not involve a large amount of parallel data and, therefore, would benefit least from formal modeling.
A truly complete model of an encoder would include these blocks at the far right of the graph, in place of the two red blocks in Figure 2. These red blocks are currently actors that plot the output of the filters in the frequency domain.

The top and bottom halves of the graph represent the two channels (left and right) of audio. The actors on the far left are source actors, included for the purpose of simulation. They generate an input signal composed of three sine waves at different frequencies with two Gaussian random noise sources added. This input provides a wide range of frequency content, similar to what
