
Fast Cache and Bus Power Estimation for Parameterized System-on-a-Chip Design

Abstract

We present a technique for fast estimation of the power consumed by the cache and bus sub-system of a parameterized system-on-a-chip design for a given application. The technique uses a two-step approach of first collecting intermediate data about an application using simulation, and then using equations to rapidly predict the performance and power consumption for each of thousands of possible configurations of system parameters, such as cache size and associativity and bus size and encoding. The estimations display good absolute as well as relative accuracy for various examples, and are obtained in dramatically less time than other techniques, making possible the future use of powerful search heuristics.

Keywords

System-on-a-chip, low power, estimation, intellectual property, cache, on-chip bus.

1. Introduction

Silicon capacity continues to increase faster than the ability for designers to use that silicon, resulting in the well-known productivity gap [18]. Many researchers propose extensive reuse of pre-designed intellectual property cores to reduce this gap [8], where typical cores include microprocessors, microcontrollers, digital signal processors, encoders/decoders, bus interfaces, and numerous other common peripheral components. Two complementary core-based design approaches are emerging. One approach, based on a traditional capture-and-simulate [5] paradigm, assumes that a designer pieces together many cores obtained from various sources [24] (adding some custom logic), simulates extensively, and then generates new silicon implementing the system-on-a-chip. The other approach, which this paper addresses and which we refer to as configure-and-execute, assumes the designer starts with a pre-designed system-on-a-chip¹, and then configures that system (including adding and deleting some cores) before generating new silicon [16][20][21][22].
The configure-and-execute approach has an advantage of enabling software development on real silicon, reducing the need for lengthy hardware/software co-simulations. Several commercial products now support such an approach for various application domains [14][23], such as networks and communications.

¹ Such pre-designed silicon has been referred to as a reference design, fig chip (configurable chip), and silicon platform by various authors.

A key to the success of a configure-and-execute approach is that the pre-designed system’s architecture be heavily parameterized, so that design metrics like power, performance and size, can be optimized for a particular application’s design constraints, by selecting particular parameter values before generating new silicon. We focus in this paper on parameters of the system cache and its associated on-chip buses, the CPU to cache bus, and the cache to main memory bus, as cache and bus have been shown to contribute to a significant percentage of system power. The main contribution of this paper is the creation of a fast cache power/performance estimation method and its coupling with a fast bus estimation method, enabling future heuristics that could simultaneously explore the large design space defined by cache and bus parameters. Such simultaneous exploration was recently shown to be crucial to optimizing deep-submicron designs [7], in which bus power consumption begins to surpass that of cache, and in which the cache and bus parameters must therefore be carefully tuned to one another.

Section 2 highlights the basic idea of parameterized system design. Section 3 describes related work in cache and bus power estimation and optimization. Section 4 describes our overall estimation approach. Section 5 describes the cache model used. Section 6 shows how to couple the cache model with our previously developed bus model. Section 7 describes our experimental results showing the speed and excellent accuracy of our approach. Section 8 provides conclusions.

2. Parameterized system design

Our long-term goal is to develop an environment supporting a parameterized system design approach. Such an approach consists of three main steps, as illustrated in Figure 1.

Figure 1: Steps in parameterized system design. (Figure labels: reference design, application development, parameter optimization, new silicon generation, characterizing simulation, search heuristics, estimation equations, optimization criteria, parameter exploration, intermediate data.)

Tony D. Givargis, Frank Vahid
Department of Computer Science and Engineering
University of California, Riverside, CA 92521
{givargis, vahid}@cs.ucr.edu

Jörg Henkel
C&C Research Laboratories, NEC USA
4 Independence Way, Princeton, NJ
[email protected]

1. Application development begins with a commercially available "reference design," implemented on a configurable prototyping system-on-a-chip ("fig chip"). Figure 2 illustrates a typical reference design system-on-a-chip [24] consisting of a microprocessor core, cache, main memory, and direct-memory access (DMA) controller, all connected via a system bus. Also on that bus is a bridge to a set of peripheral cores, which differ depending on the class of intended applications (e.g., networking), and to reconfigurable logic or to add-on chips. The desired application is developed on this fig chip, which supports in-circuit emulation and hence at-speed application execution, overcoming the problem of prohibitively long simulation time for systems-on-a-chip. Some additional cores could be added (using the reconfigurable logic) and unneeded ones shut off. Numerous system-on-a-chip developers have begun to emphasize the importance of starting with such a reference design rather than composing cores from scratch [16][20][21][22].

2. Parameter optimization occurs once the application has been developed with the aid of the fig chip. The architecture’s parameters are optimized for that application and its accompanying power, performance and size optimization criteria.
Critical architectural parameters may include bus parameters like data size, address and data encoding techniques, multiplexing, etc.; cache parameters like cache size, associativity, write-back techniques, block size/line size, etc.; DMA parameters; and parameters relating to
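
The two-step flow from the abstract (characterize the application once, then evaluate cheap equations for every configuration) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration: the parameter values, the function names, and in particular the cost formula, which only counts memory-bus transfers caused by compulsory misses and is not the paper's estimation equations. Because of that simplification, cache size, associativity, and bus encoding do not affect the result in this sketch.

```python
from itertools import product
from math import ceil

# Hypothetical parameter values; the text names the parameters but not their ranges.
PARAMETERS = {
    "cache_size": [1024, 2048, 4096],                  # bytes
    "associativity": [1, 2, 4],
    "line_size": [8, 16, 32],                          # bytes
    "bus_width": [8, 16, 32],                          # bits
    "bus_encoding": ["binary", "gray", "bus-invert"],  # assumed example encodings
}


def enumerate_configurations(parameters=PARAMETERS):
    """Yield every combination of parameter values as a dict."""
    names = list(parameters)
    for values in product(*(parameters[name] for name in names)):
        yield dict(zip(names, values))


def characterize(trace, line_sizes):
    """Step 1 stand-in: scan the address trace once per line size.

    Returns, for each line size, the number of distinct cache lines touched,
    which equals the number of compulsory (first-reference) misses.
    """
    return {ls: len({addr // ls for addr in trace}) for ls in line_sizes}


def estimate_memory_bus_transfers(config, intermediate):
    """Step 2 stand-in: closed-form placeholder, NOT the paper's equations.

    Each compulsory miss fetches one line; a line of L bytes needs
    ceil(8 * L / W) transfers on a W-bit bus.
    """
    misses = intermediate[config["line_size"]]
    transfers_per_line = ceil(config["line_size"] * 8 / config["bus_width"])
    return misses * transfers_per_line


if __name__ == "__main__":
    trace = [0, 4, 8, 64, 68, 0]  # made-up byte addresses
    data = characterize(trace, PARAMETERS["line_size"])
    configs = list(enumerate_configurations())
    best = min(configs, key=lambda c: estimate_memory_bus_transfers(c, data))
    print(len(configs), "configurations; fewest transfers with:", best)
```

The point of the structure is that `characterize` is the only part that touches the application trace, so sweeping thousands of configurations costs only one cheap function call each.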

