UT EE 382C - Literature Survey On CGC6000

Literature Survey On CGC6000
Ptolemy Code Generation Domain for TMS320C6x

Sresth Kumar
Vikram Sudhir Sardesai
Hamid Rahim Sheikh

EE382C-9 Embedded Software Systems
Prof. Brian L. Evans
Department of Electrical & Computer Engineering
The University of Texas at Austin
March 2000

Abstract

Ptolemy is an environment that provides a block-diagram mechanism for representing systems described by one or more Models of Computation. It allows simulation of systems as well as synthesis of software from the block diagram for a variety of target languages, from low-level assembly to high-level languages. A broad class of Signal Processing and Communication applications can be described using the Synchronous Dataflow (SDF) model of computation. In this project, we intend to write a 'Code Generation Domain' for Ptolemy that will generate code for Texas Instruments' TMS320C6000 family of processors from a system represented as an SDF graph.

1. Introduction

Ptolemy is an object-oriented framework for simulation, prototyping and synthesis of heterogeneous systems. It provides a mechanism for representing a system described by one or more Models of Computation. In this project, we are interested only in the Ptolemy code generation environment.

The Ptolemy code generation environment is modular and extensible because of its object-oriented design. It consists of a number of domains pertaining to different target processors (or languages) and architectures, e.g. CGC (Code Generation in C), CG56 (Motorola's DSP56k), and C50 (TI's TMS320C50). Systems are described as block diagrams in the desired code generation domain. Executing the system generates code for that processor targeting a particular platform, e.g. a single-processor target, a multiprocessor target, or a simulator target.

Texas Instruments' TMS320C6x is a VLIW RISC processor that is becoming increasingly popular because of its computational power and its ability to handle demanding Signal Processing and Multimedia applications such as DVD, MPEG, and Digital TV. Writing optimized code for the C6x manually is cumbersome and time consuming. In the modern era of constrained time-to-market schedules, efficient Automatic Code Generation mechanisms are very desirable.

In this project, we intend to write a Code Generation Domain for the C6x processor. Our domain would generate efficient C6x code from an SDF description of a system. The generated code would run on a C6x evaluation board and perform better than the code generated by the CGC domain.

2. Synchronous Dataflow (SDF)

In the Dataflow model of computation, a system is represented as a graph. Operations on data are represented as nodes (blocks or actors) of the graph, and the arcs connecting the nodes are the data (or signal) paths. Communication between nodes is by means of data samples (or tokens) that travel along arcs. Dataflow is a natural paradigm for describing a large class of DSP systems. It exposes the parallelism that exists in most DSP algorithms, thus allowing designers to map the algorithm onto multiple processors if such an implementation is desired.

SDF is a dataflow model in which the number of tokens produced and consumed by each actor in the graph upon execution is known beforehand and is fixed (and finite) throughout the execution of the graph. An actor cannot be executed until all its inputs have the required number of data tokens. At each invocation (or firing) of the actor, a fixed number of tokens is consumed from the input arcs and a fixed number of tokens is produced on the output arcs [3]. These features allow bounded-memory execution and make it possible to schedule the execution of valid SDF graphs statically, i.e. at compile time.
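
The compile-time scheduling described above can be illustrated with a minimal sketch. It is not Ptolemy's scheduler. It assumes a connected graph with no initial tokens (delays) on any arc, and the names `repetition_vector`, `static_schedule`, `ACTORS`, and `ARCS` are hypothetical, introduced only for this example. The first function solves the balance equations (firings of the producer times tokens produced must equal firings of the consumer times tokens consumed on every arc). The second builds one period of a sequential schedule by repeatedly firing the first actor that has enough input tokens.

```python
from fractions import Fraction
from functools import reduce
from math import gcd

# Hypothetical example graph: each arc is
# (producer, tokens produced per firing, consumer, tokens consumed per firing).
ACTORS = ["A", "B", "C"]
ARCS = [("A", 2, "B", 3), ("B", 1, "C", 2)]


def repetition_vector(actors, arcs):
    """Smallest positive integer firing counts satisfying the balance equations.

    Assumes the graph is connected; raises ValueError otherwise, or if the
    token rates are inconsistent (no periodic schedule exists).
    """
    rate = {actors[0]: Fraction(1)}
    changed = True
    while changed:  # propagate relative firing rates along the arcs
        changed = False
        for src, prod, dst, cons in arcs:
            if src in rate and dst not in rate:
                rate[dst] = rate[src] * prod / cons
                changed = True
            elif dst in rate and src not in rate:
                rate[src] = rate[dst] * cons / prod
                changed = True
    if len(rate) != len(actors):
        raise ValueError("graph is not connected")
    for src, prod, dst, cons in arcs:
        if rate[src] * prod != rate[dst] * cons:
            raise ValueError("inconsistent token rates")
    # Scale the fractional rates to integers with the LCM of the denominators.
    scale = reduce(lambda a, b: a * b // gcd(a, b),
                   (r.denominator for r in rate.values()), 1)
    return {a: int(rate[a] * scale) for a in actors}


def static_schedule(actors, arcs, q):
    """One period of a sequential schedule, found by simulating token counts.

    Assumes no initial tokens on any arc; raises ValueError on deadlock.
    """
    tokens = [0] * len(arcs)
    remaining = dict(q)
    schedule = []
    while any(remaining.values()):
        for a in actors:
            if remaining[a] == 0:
                continue
            # Firing rule: every input arc must hold enough tokens.
            if all(tokens[i] >= cons
                   for i, (_, _, dst, cons) in enumerate(arcs) if dst == a):
                for i, (src, prod, dst, cons) in enumerate(arcs):
                    if dst == a:
                        tokens[i] -= cons
                    if src == a:
                        tokens[i] += prod
                remaining[a] -= 1
                schedule.append(a)
                break
        else:
            raise ValueError("deadlock: no actor can fire")
    return schedule


q = repetition_vector(ACTORS, ARCS)
print(q)                                 # {'A': 3, 'B': 2, 'C': 1}
print(static_schedule(ACTORS, ARCS, q))  # ['A', 'A', 'A', 'B', 'B', 'C']
```

In this hypothetical graph, A must fire three times and B twice for every firing of C. After one period every arc is empty again, which is why the schedule can be repeated indefinitely in bounded memory.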

SDF also allows convenient handling of multiple sampling rates in a system. Any node in an SDF graph is enabled for execution when a sufficient number of input tokens is available at its inputs. Thus more than one node may be enabled at any time, which implies that several nodes may be fired simultaneously. It is therefore possible to partition an SDF graph into sub-graphs to be scheduled and executed on parallel processors if desired.

Optimal scheduling of SDF graphs can be done in polynomial time for most graphs. Heuristics exist that can achieve or approach the lower bound for program and data memory requirements. The memory and computational resource requirements are also known at compile time and are static throughout the execution of the graph [6].

3. TMS320C6x VLIW RISC DSP

The Very Long Instruction Word (VLIW) architecture is very popular in the DSP world for several reasons. One important reason is its ability to take advantage of the Instruction Level Parallelism (ILP) that is available in typical DSP code while simplifying the control logic. Single Instruction Multiple Data (SIMD) is another way to exploit the parallelism in DSP code. However, VLIW has advantages over SIMD, including greater flexibility and reusability of code (a VLIW target does not need major rewriting of the application). Moreover, while compilers perform poorly for traditional DSPs, the compilers for VLIW RISC processors are quite efficient because of Trace Scheduling and Software Pipelining [2]. The compiler identifies and schedules the ILP in high-level language code. Compiler techniques exist for VLIW RISC architectures that allow programmers to write efficient code in a high-level language.

The TMS320C6x is a high-end multimedia VLIW RISC processor family whose architecture and C compiler were designed hand in hand to ease the task of programming complex applications. Programming the C6x in assembly is very cumbersome for such applications, especially because of its 8 parallel execution units and very deep pipeline ([7] & [8]). Although the compiler achieves high efficiency, for tight loops and other complex algorithms it may be important to harness the full potential of the processor. For this purpose, TI provides optimized assembly code for common DSP structures such as FIR, IIR, and FFT. It has been reported that these optimizations provide better performance than simple C code. It is our intention to use a library of such optimized code for the stars (actors or nodes) in our domain ([5] & [9]).

4. Ptolemy Code Generation Mechanism

Ptolemy's code generation framework uses a methodology wherein code segments pertaining to some functional operations in a particular target language are embedded inside stars as codeblocks. Upon execution of the Universe (the graph), the code segments from

