Xtensa: A Configurable and Extensible Processor

Ricardo E. Gonzalez
Tensilica, Inc.

System designers can optimize Xtensa for their embedded application by sizing and selecting features and adding new instructions. Xtensa provides an integrated solution that allows easy customization of both hardware and software. This process is simple, fast, and robust.

0272-1732/00/$10.00 © 2000 IEEE

Until a few years ago, processors were only sold as packaged individual ICs. The growing density of CMOS circuits, however, created an opportunity to incorporate the processor as part of a larger system on a chip. Initial processor designs for this market were based on the processor existing as a separate entity, and cores were handcrafted for each manufacturing process technology, resulting in costly and fixed solutions. Furthermore, it was not possible to modify these cores for the particular application, in much the same way that it was not possible to modify a stand-alone prepackaged processor.

Xtensa is a processor core designed with ease of integration, customization, and extension in mind. Unlike previous processors, Xtensa lets the system designer select and size only the features required for a given application. The configuration and generation process is straightforward and lets the designer define new system-specific instructions if preexisting features don’t provide the required functionality. Furthermore, Xtensa fits easily into the standard ASIC design flow. Xtensa is fully synthesizeable, and designers can use the most popular physical-design tools during the place-and-route process.

Processor development

Application-specific processor development is an active area of research in the CAD, computer architecture, and VLSI design communities. Early attempts to add application-specific instructions to general-purpose computer engines relied on writable microcode [1,2]. These techniques dynamically augmented the base instruction set with application-specific instructions.

More recent research focuses on automatic instruction set design [3,4] or on reconfigurable, also called retargetable, processors [5]. These groups, however, try to solve slightly different problems than those addressed by Xtensa. Automatic instruction set design systematically analyzes a benchmark program to derive an entirely new instruction set for a given microarchitecture. Our group (referred to here as “we”) focuses on how to generate a high-performance and low-power implementation of a given microarchitecture with application-specific extensions. In this respect, automatic instruction set design is a good complement to our work. Once the instruction set additions are derived automatically by analyzing the benchmark program, they can be given to the Xtensa processor generator to obtain a high-performance, low-power implementation.

Reconfigurable or retargetable processors couple a general-purpose computer engine with various amounts of hardware-programmable logic. In the extreme, the entire processor is implemented using hardware-programmable logic. This technique, however, is limited by the large difference in operating frequency between programmable and nonprogrammable logic. Processors implemented entirely using programmable logic operate an order of magnitude slower than nonconfigurable processors implemented in a comparable process technology.

Razdan and Smith present an interesting compromise [5]. Their approach couples a custom-designed high-performance processor with small amounts of hardware-programmable logic. Their system uses compiler-generated information to dynamically reconfigure a small amount of hardware-programmable logic to implement new application-specific functional units.
This technique also has limitations due to the disparity in operating frequency of programmable and nonprogrammable logic. Thus, the new functional units must be extremely simple or be deeply pipelined.

Our approach is similar to that taken by Razdan and Smith; however, we don’t attempt to dynamically reconfigure the system. The Tensilica processor generator adds the application-specific functionality at the time the hardware is designed. Thus, the extensions are implemented in the same logic family as the rest of the processor. This eliminates the disadvantages of using programmable logic for implementing the extensions, but precludes modification of the extensions for different applications.

Due to a lack of automated tools, designers incorporated application-specific functionality in CPUs by adding specialized coprocessors [6,7]. This approach produces communication overhead between the CPU and the coprocessor, making system design more arduous.

Recently, with the advent of synthesizeable processors, some groups have proposed manual modification of the register-transfer level (RTL) description of the processor and the software development tools [8]. This approach is tedious and error prone. Furthermore, the extensions are only applicable to one implementation. If users want to add similar extensions to a future implementation of the same processor, they must modify the RTL again.

Our research differs from previous studies because we use a high-level language to express processor extensions. This language, called Tensilica Instruction Extension (TIE), expresses the semantics and encoding of instructions. TIE can add new functionality to the RTL description and automatically extend the software tools. This lets the system developer code applications in a high-level language, such as C or C++. TIE imposes restrictions on the functions that designers can describe, which greatly simplifies verification of the processor and extensions.
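
As a rough illustration of what coding against an extended instruction set from C could look like, the sketch below wraps a saturating 16-bit add behind one function. The build flag `HAVE_ADDSAT16_EXT` and the intrinsic `addsat16_ext()` are hypothetical names invented for this example; they are not taken from the article or from any real Tensilica tool. The portable fallback shows the work that a single designer-defined instruction might replace.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical: HAVE_ADDSAT16_EXT and addsat16_ext() stand in for whatever
   flag and intrinsic an extended tool chain might export for a
   designer-defined instruction. Neither name comes from the article. */
#ifdef HAVE_ADDSAT16_EXT
int16_t addsat16_ext(int16_t a, int16_t b);  /* assumed one-instruction intrinsic */
#endif

/* Saturating 16-bit add: clamp to the int16_t range instead of wrapping. */
static int16_t add_sat16(int16_t a, int16_t b)
{
#ifdef HAVE_ADDSAT16_EXT
    return addsat16_ext(a, b);
#else
    int32_t sum = (int32_t)a + (int32_t)b;  /* widen so the sum cannot overflow */
    if (sum > INT16_MAX) return INT16_MAX;
    if (sum < INT16_MIN) return INT16_MIN;
    return (int16_t)sum;
#endif
}

/* Application kernel written in plain C: mixes two sample buffers. */
static void mix(const int16_t *x, const int16_t *y, int16_t *out, int n)
{
    assert(n >= 0);
    for (int i = 0; i < n; ++i)
        out[i] = add_sat16(x[i], y[i]);
}
```

The application loop in `mix()` is the same in both builds. Only the body of `add_sat16()` changes, which is the property the paragraph above attributes to extending the software tools along with the hardware.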
Because the extensions become an integral part of the processor, there is no communication overhead.

Synthesizeable processors

Traditionally, processors are custom designed. If designers employ logic synthesis, it is only to generate control modules they feel are not timing critical. Designers take great care in controlling the layout to avoid parasitics and capacitive coupling between nodes (allowing the use of dynamic circuits). Furthermore, custom design allows the use of sophisticated circuit structures including content addressable memories and specialized RAMs. These circuit structures can efficiently implement particular microarchitectural features, such as translation look-aside buffers, address lookup, and fast RAM access. Custom circuit design can result in very high operating frequencies, as evidenced by recent announcements from Intel, AMD, IBM, and others. Well-designed

