UT EE 382C - Programmable VLIW and SIMD Architectures for DSP and Multimedia Applications


Programmable VLIW and SIMD Architectures for DSP and Multimedia Applications

Deepu Talla
Laboratory for Computer Architecture
Department of Electrical and Computer Engineering
The University of Texas at Austin
[email protected]

Abstract – Digital Signal Processing (DSP) and multimedia workloads are expected to be the dominant workloads on future computer systems. This is true both in low-cost embedded applications that use specialized microprocessors such as DSPs and in the general-purpose processor market. Very Long Instruction Word (VLIW) architectures have multiple functional units to take advantage of the abundant Instruction Level Parallelism (ILP) in such applications. Single Instruction Multiple Data (SIMD) techniques operate on multiple data in a single instruction (exploiting data parallelism). This paper proposes to evaluate the benefits of using these two techniques for DSP and multimedia applications. Using a modern commodity processor from each category – Texas Instruments Inc.'s TMS320C6x (VLIW) and Intel's Pentium II with MMX (SIMD) – several DSP and multimedia benchmarks will be evaluated.

1. Introduction

Digital Signal Processing (DSP) and multimedia applications, where text becomes the exception rather than the rule, are becoming increasingly important for computer systems as a dominant computing workload [5][6]. Dynamic multimedia component technologies such as video conferencing, video authoring, visualization, 3D graphics, animation, realistic simulation, speech processing and recognition, and broadband communications hold great promise.
In contrast to traditional applications, multimedia and DSP-rich applications will place significant demands on the processor. With an ever-increasing proportion of CPU cycles being used to run such applications, it is pertinent to design machines that speed up the programs that account for a large portion of computation time.

Current solutions for these compute-centric applications are based principally on VLSI implementations, except for certain control functions that may be implemented on a programmable micro-controller. To make the implementation flexible and cost-effective over a variety of products and product generations, however, there is now a great deal of interest in migrating functionality from application-specific hardware into software running on a programmable CPU or DSP.

The importance of multimedia technology, services and applications is being widely recognized by microprocessor designers. A number of manufacturers are offering multimedia processors that are claimed to be able to decode coded video streams in real time in software. Most such processors, like the Trimedia processor from Philips and the Multimedia signal processor from Samsung, have hardware assists for one or more of the multimedia decoding functions. The market for these special-purpose multimedia processors will be in low-cost embedded applications such as set-top boxes, wireless terminals, digital TVs, and stand-alone entertainment devices such as DVD players. A number of general-purpose CPU manufacturers are offering multimedia-enhanced versions of their CPUs for accelerating audio and video processing. The UltraSPARC processor enhanced with the Visual Instruction Set (VIS) from Sun and the multimedia-enhanced MMX Pentium processors from Intel are examples.
Such CPUs are likely to take over multimedia and DSP functions such as audio-video decoding/encoding, modem, telephony functions, and network functions on a PC/workstation platform, along with the general-purpose computing they currently perform.

2. Objectives and Motivation

DSP and multimedia applications possess several characteristics that distinguish them from the normal workloads on desktop computing systems. Diefendorff and Dubey [5] specified the following characteristics of media-centric applications: real-time response, processing of continuous-media types, significant fine- and coarse-grained parallelism, high instruction-reference locality, and high network and memory bandwidth. There is significant data and instruction level parallelism (ILP) that can be exploited in these workloads.

Very Long Instruction Word (VLIW) architectures incorporate multiple functional units in the data path to exploit the ILP in applications. A single instruction specifies more than one concurrent operation (for example, two loads, two adds, two multiplies and two shifts all in a single instruction). The instruction width is quite large (sometimes up to 8 times that of conventional architectures) because many bits are needed to encode multiple operations. VLIW processors rely on software to pack the collection of operations (compaction), and in workloads with limited ILP, instruction bandwidth is wasted on no-operations placed in the instruction. Examples of modern VLIW processors come from major DSP vendors: TI's TMS320C6x series, Analog Devices Inc.'s TigerSHARC, and StarCore, the joint venture of Motorola and Lucent. For multimedia and DSP applications, VLIW processors seem to be an intuitive performance win over traditional single-instruction-per-cycle architectures. Figure 1 shows the CPU core of the C6x processor, which has eight functional units in the data path.

Figure 1. CPU core of the C6x processor (VLIW)

Single Instruction Multiple Data (SIMD) techniques traditionally have been instruction set architecture extensions to general-purpose superscalar processors. Such architectures exploit data parallelism as opposed to ILP: each instruction operates on multiple data (for example, four loads or four additions, but not a combination of different operations). Many DSP and multimedia applications can use vectors of packed 8-, 16- and 32-bit integers and floating-point numbers, which lets them benefit from SIMD architectures like MMX for the Pentium family of processors and the Visual Instruction Set (VIS) extensions for the UltraSPARC processors. Figure 2 shows the "multiply and accumulate" instruction operating on multiple data in the case of MMX technology.

Figure 2. Multiply and accumulate instruction operating on several data values

The objective of this project is to evaluate the effectiveness of VLIW and SIMD architectures for DSP and multimedia applications. Choosing one modern representative commodity processor from each category (TI's C6x DSP processor as a VLIW representative and Intel's Pentium II with MMX as a SIMD representative), we propose to assess DSP and multimedia
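The VLIW compaction step described in Section 2 (software packing operations into wide instructions, with no-operations filling unused slots) can be illustrated with a small sketch. This is not the C6x's actual instruction encoding or TI's scheduler. The slot layout (two loads, two adds, two multiplies, two shifts per instruction) is taken from the example in the text, while the single-cycle latency, the greedy in-order placement, and the names `SLOTS_PER_BUNDLE`, `pack_bundles` and `count_nops` are assumptions made for illustration.

```python
# Hypothetical slot layout, following the text's example of one wide instruction:
# two loads, two adds, two multiplies and two shifts.
SLOTS_PER_BUNDLE = {"load": 2, "add": 2, "mul": 2, "shift": 2}


def pack_bundles(ops):
    """Greedily pack operations into wide instructions (bundles).

    ops is a list of (name, unit, deps) tuples in program order, where deps
    names earlier operations whose results are needed. Assumption: every
    operation completes in one cycle, so a consumer may be placed in the
    bundle right after its latest producer.
    """
    bundle_of = {}  # operation name -> index of the bundle it was placed in
    bundles = []    # each bundle maps unit type -> list of operation names
    for name, unit, deps in ops:
        # Earliest legal bundle: one past the latest producer, or 0 if none.
        index = max((bundle_of[d] + 1 for d in deps), default=0)
        while True:
            while index >= len(bundles):
                bundles.append({u: [] for u in SLOTS_PER_BUNDLE})
            if len(bundles[index][unit]) < SLOTS_PER_BUNDLE[unit]:
                break
            index += 1  # all slots of this unit type are taken; try the next bundle
        bundles[index][unit].append(name)
        bundle_of[name] = index
    return bundles


def count_nops(bundles):
    """Count the slots left empty, i.e. the no-operations that waste bandwidth."""
    slots = sum(SLOTS_PER_BUNDLE.values())
    return sum(slots - sum(len(v) for v in b.values()) for b in bundles)


# Eight independent operations fill one bundle completely.
parallel_ops = [(f"{unit}{i}", unit, []) for unit in SLOTS_PER_BUNDLE for i in range(2)]

# A dependent chain (limited ILP) uses one slot per bundle; the rest are NOPs.
serial_ops = [
    ("a", "load", []),
    ("m", "mul", ["a"]),
    ("s", "add", ["m"]),
    ("r", "shift", ["s"]),
]
```

Under these assumptions, the eight independent operations fit in one fully used instruction, whereas the four-operation dependent chain spreads over four instructions with most slots empty, which is the wasted instruction bandwidth the text refers to.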
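The packed "multiply and accumulate" operation that Figure 2 refers to can be emulated in scalar code to show what a single SIMD instruction computes. The sketch below models the behavior of the MMX packed multiply-add instruction (PMADDWD): four signed 16-bit values in each operand are multiplied pairwise, and adjacent products are summed into two signed 32-bit results. The function names `wrap_int32`, `packed_multiply_add` and `dot_product_simd` are hypothetical, and plain Python lists stand in for 64-bit MMX registers.

```python
def wrap_int32(value):
    """Wrap an integer into the signed 32-bit two's-complement range."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


def packed_multiply_add(a, b):
    """Emulate one packed multiply-add on two vectors of four signed 16-bit ints.

    Returns two signed 32-bit sums: a[0]*b[0] + a[1]*b[1] and a[2]*b[2] + a[3]*b[3].
    """
    if len(a) != 4 or len(b) != 4:
        raise ValueError("each operand must hold exactly four 16-bit values")
    for x in (*a, *b):
        if not -32768 <= x <= 32767:
            raise ValueError("values must fit in a signed 16-bit integer")
    products = [x * y for x, y in zip(a, b)]
    return [wrap_int32(products[0] + products[1]),
            wrap_int32(products[2] + products[3])]


def dot_product_simd(xs, ys):
    """Dot product that consumes four element pairs per emulated instruction."""
    if len(xs) != len(ys) or len(xs) % 4 != 0:
        raise ValueError("inputs must have equal length, a multiple of four")
    total = 0
    for i in range(0, len(xs), 4):
        low, high = packed_multiply_add(xs[i:i + 4], ys[i:i + 4])
        total += low + high
    return total
```

A dot product, the core of FIR filters and many other DSP kernels, then needs one such operation per four input pairs instead of four separate multiplies and adds, which is the data parallelism the text contrasts with ILP.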

