
Lx: A Technology Platform for Customizable VLIW Embedded Processing

Paolo Faraboschi, Geoffrey Brown, Joseph A. Fisher, Giuseppe Desoli, Fred (Mark Owen) Homewood
Hewlett-Packard Laboratories, Cambridge, MA
STMicroelectronics, Cambridge, MA
{frb, gbrown, jfisher, desoli}@hpl.hp.com, fred@bristol.st.com

ABSTRACT

Lx is a scalable and customizable VLIW processor technology platform designed by Hewlett-Packard and STMicroelectronics that allows variations in instruction issue width, the number and capabilities of structures, and the processor instruction set. For Lx we developed the architecture and software from the beginning to support both scalability (variable numbers of identical processing resources) and customizability (special-purpose resources).

In this paper we consider the following issues. When is customization or scaling beneficial? How can one determine the right degree of customization or scaling for a particular application domain? What architectural compromises were made in the Lx project to contain the complexity inherent in a customizable and scalable processor family?

The experiments described in the paper show that specialization for an application domain is effective, yielding large gains in price/performance ratio. We also show how scaling machine resources scales performance, although not uniformly across all applications. Finally, we show that customization on an application-by-application basis is today still very dangerous, and much remains to be done for it to become a viable solution.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
ISCA 00, Vancouver, British Columbia, Canada.
Copyright (c) 2000 ACM 1-58113-287-5/00/06-203 $5.00

1 INTRODUCTION

Dataquest estimates that the embedded processor market should grow from $7.5 billion in 1998 to $26 billion by 2002. This market space is seeing an increasing number of competitors, ranging from companies implementing variations of traditional embedded processor architectures (such as ARM and MIPS) to more aggressive startups introducing their own new ISA (such as ARC Cores and Tensilica).

At the same time, the complexity of embedded applications is escalating considerably, and it is not uncommon to find many hundreds of thousands of lines of high-level language code in embedded products such as printers or mobile phones. Time-to-market is also becoming a primary concern, as the lifetime of embedded products constantly shrinks to keep pace with evolving standards, new user needs and performance requirements.

The combination of application complexity and time-to-market considerations is what makes a software-based approach to embedded systems particularly appealing today. Ideally, embedded system designers would like to have a single processing platform where high-performance digital signal processing capability for real-time signal processing is coupled to microprocessor functionality for general-purpose processing tasks. This trend is what is causing the traditionally separated DSP and micro-controller domains to converge in an increasingly large number of products that are starting to be commercially offered.

Our approach is based on two concepts:

- A new clustered VLIW core architecture and microarchitecture, specialized to an application domain, that ensures scalability and customizability.
- A toolchain based on aggressive ILP compiler technology that gives the user a uniform view of the platform at the programming language level.

The technology we are developing is called Lx. We are doing it in a production environment; most pieces have already been developed, and products are expected in the near future. The reasons for developing a new ISA come from the observation that existing architectures are not scalable in width and that customization areas are limited. Existing ISAs are either too specialized (most DSP processors) or too general (general-purpose platforms like ARM and MIPS).

We believe that the combination of clustering, VLIW with precise interrupts, a slim and scalable microarchitecture, and interesting memory hierarchies constitutes a novel technology platform.

1.1 Convergence of Embedded Technologies

DSP and micro-controllers are converging in the high-end markets. This new batch of processors includes a combination of features from the DSP domain, such as low-overhead looping, a rich set of addressing modes, special-purpose arithmetic operations and formats, etc. At the same time, they usually include a more RISC-like set of instructions (sometimes in a different mode) to ease high-level (C or C++) code development, to support system code and multitasking OSs, and in general to be able to implement much larger applications in the same platform. In the following we discuss the subset of announced DSPs and configurable RISC cores that have the most commonality with the Lx architecture.

Over the past several years a number of semiconductor manufacturers have announced high-performance embedded VLIW cores and processors. These include the Motorola/Lucent StarCore, the TI C6xxx family and the Philips Trimedia. Of these, all but the StarCore are currently in production; however, only the TI C6 family has apparently shipped in large volumes. In addition, STMicroelectronics has announced the ST100 DSP, which has a VLIW mode for key inner loops.

The announced StarCore architecture [13] is a natural VLIW extension of traditional DSPs: the basic operations supported are optimized for DSP applications, with the ability to issue multiple operations simultaneously. While not as register-starved as previous DSPs, the available 16 data registers are likely to make the compiler's task difficult.

The TI C6 family [15] is significantly closer than the StarCore to a general-purpose processor. C6 presents some difficulties for real-time applications because, for example, software pipelining using modulo scheduling is evidently not interruptible, and interruptible code requires hazard-free register usage. This may cause significant register pressure for the compiler. In contrast, the Lx was designed to be interruptible, and all code generated by the compiler is hazard-free.

[...]ity uniquely position this technology. This is particularly true in a world where time-to-market is rapidly becoming the dominant [...]

