New version page

Instruction Scheduling

This preview shows page 1-2 out of 6 pages.

View Full Document
View Full Document

End of preview. Want to read all 6 pages?

Upload your study docs or become a GradeBuddy member to access this document.

View Full Document
Unformatted text preview:

Instruction Scheduling Using MAX − MIN Ant SystemOptimizationGang Wang Wenrui Gong Ryan KastnerDepartment of Electrical and Computer EngineeringUniversity of California at Santa BarbaraSanta Barbara, CA 93106-9560{wanggang, gong, kastner}@ece.ucsb.eduABSTRACTInstruction scheduling is a fundamental step for mapping an appli-cation to a computational device. It takes a behavioral applicationspecification and produces a schedule for the instructions onto acollection of processing units. The objective is to minimize thecompletion time of the given application while effectively utilizingthe computational resources. The instruction scheduling problemis NP-hard, thus effective heuristic methods are necessary to pro-vide a qualitative scheduling solution. In this paper, we present anovel instruction scheduling algorithm using MAX-MIN Ant Sys-tem Optimization approach. The algorithm utilizes a unique hy-brid approach by combining the ant system meta-heuristic withlist scheduling, where the local and global heuristics are dynami-cally adjusted to the input application in an iterative manner. Com-pared with force-directed scheduling and a number of different listscheduling heuristics, our algorithm generates better results over allthe tested benchmarks with better stability. Furthermore, by solv-ing the test samples optimally using ILP formulation, we show thatour algorithm consistently achieves a near optimal solution.Categories and Subject Descriptors: J.6 [Computer-Aided Engi-neering]: Computer-Aided Design (CAD)General Terms: Algorithms, Design, TheoryKeywords: Instruction Scheduling, Force-Directed Scheduling, ListScheduling, MAX-MIN Ant System1. INTRODUCTIONAs the fabrication technology advances and transistors becomemore plentiful, modern computing systems achieve better systemperformance by increasing the amount of computation units. It isestimated that we will be able to integrate more than a half billiontransistors on a 468 mm2chip by the year of 2009 [1]. This willyield tremendous potential for future computing systems, however,it imposes big challenges on how to effectively use and design suchcomplicated systems.As computing systems become more complex, so do the appli-cations that can run on them. In order to efficiently and effectivelymap applications onto these systems, designers will increasinglyrely on automated design tools. One fundamental process of thesetools is creating a mapping of a behavioral model of an applica-Permission to make digital or hard copies of all or part of this work forpersonal or classroom use is granted without fee provided that copies arenot made or distributed for profit or commercial advantage and that copiesbear this notice and the full citation on the first page. To copy otherwise, torepublish, to post on servers or to redistribute to lists, requires prior specificpermission and/or a fee.GLSVLSI’05, April 17–19, 2005, Chicago, Illinois, USA.Copyright 2005 ACM 1-59593-057-4/05/0004 ...$5.00.tion to the computing system. For example, the tool may take a Cfunction and create the code to program a microprocessor. This isviewed as software synthesis or compilation. Or the tool may take atransaction level behavior and create a register transfer level (RTL)circuit description. This is often called hardware or behavioral syn-thesis [11]. Both hardware and software synthesis flows are neededfor the use and design of future computing systems.Instruction scheduling (IS) is an important problem in the aboveprocess. An inappropriate scheduling of the instructions can fail toexploit the true potential of the system and can offset the gain frompossible parallelism. Instruction scheduling appears in a number ofdifferent important problems, e.g. compiler design for superscalarand VLIW microprocessors distributed clustering computation ar-chitectures [3] and behavioral synthesis of ASICs and FPGAs [11].Instruction scheduling is often done on a behavioral descrip-tion of the application. This description is typically decomposedinto several blocks (e.g. basic blocks), and each of the blocksis represented by a data flow graph (DFG). As an example, Fig-ure 2 gives the DFG for an Auto Regression (AR) filter. Based onthe constraint, instruction scheduling can be classified as resource-constrained or performance-constrained. Given a DFG, clock cy-cle, resource count and resource delays, a resource-constrained schedul-ing finds the minimum number of control time-steps for executingthe DFG. Performance-constrained scheduling tries to determinethe minimum number of resources needed for a given schedulingdeadline. Though performance constraints are imposed in manycases, resource-constrained scheduling is more frequent in prac-tice. This is because in most of the cases, the DFGs are constructedand scheduled almost independently. Furthermore, if we want tomaximize resource sharing, each block should use same or simi-lar resources, which is hardly ensured by performance constrainedschedulers. The performance constraint of each block is not easyto define since blocks are typically serialized and budgeting globalperformance constraint for each block is not trivial [10].The instruction scheduling problem is NP-hard [5]. Althoughit is possible to formulate and solve the problem using Integer Lin-ear Programming (ILP) , the feasible solution space quickly be-comes intractable for larger problem instances. In order to addressthis problem, a range of heuristic methods with polynomial run-time complexity have been proposed. These methods include listscheduling [17, 2] , forced-directed scheduling [14] , genetic al-gorithm [8], tabu search, simulated annealing, graph theoretic andcomputational geometry approaches[10, 3].Among them, list scheduling is most common due to its simplic-ity of implementation and capability of generating reasonably goodresults for small sized problems. The success of the list scheduleris highly dependent on the priority function and the structure ofthe input application (DFG) [11, 9]. One commonly used priorityfunction assigns the priority inversely proportional to the mobility.Many other priority functions have been proposed [2, 9, 8, 4]. Itis commonly agreed that there is no single good heuristic for pri-oritizing the DFG nodes across a range of applications. Our resultsin Section 4 confirm this. Force-directed scheduling proposed byPaulin and Knight [14] is another popularly used method. Essen-tially, it is an extension of list scheduling. In


Loading Unlocking...
Login

Join to view Instruction Scheduling and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Instruction Scheduling and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?