UW-Madison ECE 734 - SUPERSCALAR DESIGN SPACE EXPLORATION AND OPTIMIZATION FRAMEWORK FOR DSP OPERATIONS

This preview shows page 1-2-3-4-5 out of 15 pages.


SUPERSCALAR DESIGN SPACE EXPLORATION AND OPTIMIZATION FRAMEWORK FOR DSP OPERATIONS

Rehan Ahmed
Department of Electrical and Computer Engineering, University of Wisconsin-Madison
Email: [email protected]

Abstract

A design methodology for superscalar architecture optimization for DSP operations is proposed. The optimization considers both performance and power consumption metrics. Two basic approaches are initially evaluated. The first approach follows a heuristic recursive algorithm based on the general working of superscalar architecture. The second approach uses simulated annealing to converge to an optimized configuration. A hybrid approach, which takes the initial configuration obtained from simulated annealing and improves it further using architecture heuristics, has also been developed. Over 200% improvement over the initial configuration has been achieved through the hybrid method. The results have been validated through various DSP benchmark applications in the MiBench Suite [1].

Index Terms: Superscalar, Search and optimization, Simplescalar, Wattch

1. INTRODUCTION

Power consumption has become a primary concern in modern processor design. With the advent of mobile processing platforms, it is becoming more and more critical to design efficient architectures that take power consumption into account at the initial design stages. This paper presents an approach for guided design space exploration of superscalar architecture. The methodology combines a random search and optimization algorithm with a heuristic approach based on the functioning of superscalar architecture.

Previous approaches in the area of superscalar optimization use simulated annealing [2], random search [3], genetic algorithms [4] and multidimensional optimization algorithms. However, in these approaches, only a limited set of architectural parameters has been considered.
This results in an exploration of a small portion of the actual design space. Furthermore, both power consumption and performance have not been concurrently considered in the existing approaches. Other methods used for superscalar optimization improve upon the individual functional blocks such as cache [6, 7], resulting in an overall reduction in the efficiency and performance of a given architecture. The methodology followed in this paper performs evaluations and optimizations at the global architectural level.

This paper extends our previous work in superscalar architecture design and optimization [8]. In our prior work, a methodology based on heuristics was proposed, based on which an automated design and optimization tool (SSOPT) was developed. The optimized architecture specifically targeted sample rate conversion operations for software radio applications. In this paper, architectural improvements have been conducted for generic DSP benchmark applications. Furthermore, the optimization approach has been extended by evaluating two new approaches. Simulated annealing has been used for conducting global architectural optimization and these optimization results have been compared with those of SSOPT [8]. Based on these results, a hybrid approach has been proposed which utilizes the strong points of both its constituent approaches. During all optimizations, performance, power consumption and complexity of architectures have been considered to validate the feasibility of optimized configurations.

In the remainder of the paper, Section 2 explains the simulation framework developed and used throughout this paper. Section 3 includes the working of SSOPT, the optimization tool based on the functioning of superscalar architecture. Section 4 covers the operation of the simulated annealing based optimization approach.
This is followed by a brief explanation of benchmark applications and the optimization results of both approaches in Section 5. The hybrid approach based on the simulation results of Section 5 is given in Section 6, followed by conclusion and references.

2. SIMULATION FRAMEWORK

2.1. Simulation Tools

All optimization methodologies evaluated in this paper consider power consumption, performance and the complexity of superscalar architectures for validating processor configurations. For getting power and performance measures, the Simplescalar architectural modeling tool [9] and its power extension Wattch [10] have been used. All power estimates have been given at an operating frequency of 1 GHz, and 100 nm CMOS technology is assumed. Conditional clock gating (cc3) has been assumed for all power estimates. This assumes full power consumption for active units and 10% power consumption for inactive architectural units. Although complexity and area estimates have not been considered in the optimization’s objective function, they have been estimated using the cost model given in [11]. An automated tool has been developed which implements the cost model given in [11]. Complexity estimates of optimized architectural configurations have been compared with commercial architectures.

2.2. Benchmark Applications

This paper targets optimizations for a range of complex DSP operations. For this reason, a subset of the MiBench [1] benchmark suite for communication operations has been used. The applications for which optimizations have been conducted are: (i) FFT, (ii) IFFT, (iii) CRC32, (iv) ADPCM Encode, (v) ADPCM Decode. FFT and IFFT perform the Fast Fourier Transform and its inverse respectively. CRC performs a Cyclic Redundancy Check on given data. The ADPCM encode and decode functions implement Adaptive Differential Pulse Code Modulation.

2.3. Gain Evaluation

As mentioned in Section 2.1, the optimization approaches proposed in this paper consider power consumption, performance and complexity estimates for validating configuration changes. All methodologies effect architectural changes iteratively and compare the changed configuration to the former configuration. For such a comparison, a parameter ‘Efficiency-Gain’ has been evaluated which is a combination of performance and efficiency measures. The basic expression for efficiency gain is given in equation 1. The parameter takes a weighted sum of IPC (Instruction per Cycle) gain
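
The iterative step described in this section, where each changed configuration is compared with the former one, can be sketched roughly as follows. This is only an illustration under stated assumptions, not the paper's tool: the names `iterative_search`, `toy_score` and `toy_gain`, the configuration keys, and all numbers are hypothetical. In particular, `toy_gain` is a made-up stand-in, since the actual Efficiency-Gain expression (equation 1) is not reproduced in this text.

```python
from typing import Callable, Dict, Iterable

# A configuration is modeled as a plain dict of named integer values.
# The keys used below are hypothetical, chosen only for illustration.
Config = Dict[str, int]


def iterative_search(
    initial: Config,
    candidates: Iterable[Config],
    gain: Callable[[Config, Config], float],
) -> Config:
    """Walk through candidate configurations one at a time and keep a
    candidate only if it shows a positive gain over the current one."""
    current = dict(initial)
    for candidate in candidates:
        # Compare the changed configuration to the former configuration.
        if gain(current, candidate) > 0.0:
            current = dict(candidate)
    return current


def toy_score(config: Config) -> int:
    # ASSUMPTION: a made-up figure of merit (higher is better).
    # It is NOT the paper's Efficiency-Gain expression.
    return config["ipc_x100"] - config["power_w"]


def toy_gain(old: Config, new: Config) -> float:
    # Positive when the new configuration scores better than the old one.
    return float(toy_score(new) - toy_score(old))


if __name__ == "__main__":
    # Made-up numbers, not results from the paper.
    start = {"ipc_x100": 100, "power_w": 40}
    trial_configs = [
        {"ipc_x100": 120, "power_w": 70},  # faster but much hungrier: rejected
        {"ipc_x100": 130, "power_w": 45},  # better on balance: accepted
        {"ipc_x100": 125, "power_w": 45},  # worse than the accepted one: rejected
    ]
    print(iterative_search(start, trial_configs, toy_gain))
```

Passing the gain function in as a parameter keeps the loop independent of how the gain is defined, so a different weighting of performance and power could be substituted without changing the search itself.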

