A Scheduling Algorithm for Optimization and Early Planning in High-Level Synthesis

SEDA OGRENCI MEMIK, Northwestern University
RYAN KASTNER, University of California, Santa Barbara
ELAHEH BOZORGZADEH, University of California, Irvine
and MAJID SARRAFZADEH, University of California, Los Angeles

Complexities of applications implemented on embedded and programmable systems grow with the advances in capacities and capabilities of these systems. Mapping applications onto them manually is becoming a very tedious task. This draws attention to using high-level synthesis within design flows. Meanwhile, it is essential to provide a flexible formulation of optimization objectives as well as to perform efficient planning for various design objectives early on in the design flow. In this work, we address these issues in the context of data flow graph (DFG) scheduling, which is an essential element within the high-level synthesis flow. We present an algorithm that schedules a chain of operations with data dependencies among consecutive operations at a single step. This local problem is repeated to generate the schedule for the whole DFG. The local problem is formulated as a maximum weight noncrossing bipartite matching. We use a technique from the computational geometry domain to solve the matching problem. This technique provides a theoretical guarantee on the solution quality for scheduling a single chain of operations. Although still being local, this provides a relatively wider perspective on the global scheduling objectives. In our experiments we compared the latencies obtained using our algorithm with the optimal latencies given by the exact solution to the integer linear programming (ILP) formulation of the problem. In 9 out of 14 DFGs tested, our algorithm found the optimal solution, while generating latencies comparable to the optimal solution in the remaining five benchmarks. The formulation of the objective function in our algorithm provides flexibility to incorporate different optimization goals.

At the time the research for this article was carried out, S. O. Memik was affiliated with the University of California, Los Angeles, Los Angeles, CA.
Authors' addresses: S. O. Memik, Department of Electrical and Computer Engineering, Northwestern University, 2145 Sheridan Road, Evanston, IL 60208; email: [email protected]; R. Kastner, Department of Electrical and Computer Engineering, University of California, Santa Barbara, Engineering I, Room 4123, Santa Barbara, CA 93106; email: [email protected]; E. Bozorgzadeh, Computer Sciences Department, University of California, Irvine, 408E Computer Science Building, Irvine, CA 92697-3425; email: [email protected]; M. Sarrafzadeh, Computer Science Department, University of California, Los Angeles, 3532C Boelter Hall, Los Angeles, CA 90095-1596; email: [email protected].
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or direct commercial advantage and that copies show this notice on the first page or initial screen of a display along with the full citation. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, to redistribute to lists, or to use any component of this work in other works requires prior specific permission and/or a fee. Permissions may be requested from Publications Dept., ACM, Inc., 1515 Broadway, New York, NY 10036 USA, fax: +1 (212) 869-0481, or [email protected].
© 2005 ACM 1084-4309/05/0100-0033 $5.00
ACM Transactions on Design Automation of Electronic Systems, Vol. 10, No. 1, January 2005, Pages 33–57.
We present examples of how to exploit the versatility of our algorithm with specific examples of objective functions and experimental results on the ability of our algorithm to capture these objectives efficiently in the final schedules.

Categories and Subject Descriptors: B.5.2 [Register-Transfer-Level Implementation]: Design Aids—Automatic synthesis; J.6 [Computer-Aided Engineering]: Computer-aided design

General Terms: Design, Algorithms

Additional Key Words and Phrases: Scheduling, high-level synthesis, data flow graph, bipartite matching

1. INTRODUCTION

Traditionally, translation of application descriptions into a synthesizable hardware description language (HDL) was a manual process. Then, designs specified with an HDL were mapped onto embedded or programmable systems by logic and physical synthesis tools. Due to the increasing complexity of the applications, mapping applications manually is becoming harder. Therefore, increasing the level of abstraction for designers and automating the mapping process is becoming more attractive. This alternative paradigm involves automatic compilation from high-level descriptions of applications, such as from a high-level programming language (e.g., C, C++). This approach offers designers a very convenient and familiar computational model. There are several proposed compilation flows from a high level of abstraction to hardware [Wazlowski et al. 1993; Hammes et al. 1999; Gokhale et al. 2000; So et al. 2002; Haldar et al. 2001; Schreiber et al. 2002].

Figure 1 depicts an example flow for automatic mapping of applications onto various hardware platforms. The application described in a high-level programming language is processed by the compiler stage. The compiler generates an intermediate representation (IR) and performs several optimizations such as constant propagation, loop unrolling, and function inlining on this IR.
While internal representations in different compilers take different forms and names, essentially they capture two basic pieces of information about an application: control flow and data dependency. A high-level synthesis stage follows the compiler stage; it takes the optimized IR as input and generates the register transfer level (RTL) description of the design. Back-end tools perform logic synthesis and physical synthesis on this RTL description and create the bit-stream data to program the target system. In the most general form of this flow, feedback paths between major steps may exist to incorporate physical-level information into high-level synthesis or hardware/synthesis-related information into the compiler stage (denoted by dotted arcs in Figure 1). Feedback is employed in such flows to improve the interaction between different stages.
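As a toy illustration of the kind of IR-level optimization the compiler stage performs, the sketch below applies constant propagation with simple constant folding to a tiny three-address IR. The IR encoding, instruction names, and example program are invented for illustration; they are not from the paper or any particular compiler.

```python
# Toy sketch (invented IR, not from the paper): constant propagation on a
# three-address IR. Each instruction is (dest, op, arg1, arg2); args may be
# variable names or integer literals.

def constant_propagation(ir):
    """Substitute variable uses whose values are known constants, and fold
    arithmetic on two constant operands into a single constant assignment."""
    consts = {}  # variables currently known to hold a constant value
    out = []
    for dest, op, a, b in ir:
        a = consts.get(a, a)  # replace known-constant variable uses
        b = consts.get(b, b)
        if op == "const":
            consts[dest] = a
            out.append((dest, "const", a, None))
        elif isinstance(a, int) and isinstance(b, int):
            val = {"add": a + b, "mul": a * b}[op]
            consts[dest] = val            # fold: both operands are constant
            out.append((dest, "const", val, None))
        else:
            consts.pop(dest, None)        # dest no longer a known constant
            out.append((dest, op, a, b))
    return out

# x = 4; y = x + 2; z = y * w   becomes   x = 4; y = 6; z = 6 * w
ir = [("x", "const", 4, None),
      ("y", "add", "x", 2),
      ("z", "mul", "y", "w")]
optimized = constant_propagation(ir)
```

A real compiler performs this on a control-flow-aware IR (invalidating facts across branches); the straight-line version above only shows the core substitute-and-fold idea.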
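The abstract formulates the local step as a maximum weight noncrossing bipartite matching between a chain of dependent operations and ordered control steps, solved with a computational geometry technique. That technique is not reproduced in this preview; as a rough sketch of the underlying problem, note that for a single chain "noncrossing" means the chosen steps must strictly increase along the chain, which a simple dynamic program can solve. All weights and names below are invented for illustration.

```python
# Hypothetical sketch (not the authors' algorithm): maximum-weight
# noncrossing matching of a chain of operations op_0..op_{n-1} to ordered
# control steps 0..T-1. Chosen steps must strictly increase along the
# chain, which also respects the dependencies between consecutive ops.

def schedule_chain(weights):
    """weights[i][t] = desirability of placing op i at step t, or None if
    that (op, step) pair is infeasible. Returns (total_weight, steps),
    or (None, None) if no feasible assignment exists."""
    n, T = len(weights), len(weights[0])
    NEG = float("-inf")
    # dp[i][t]: best total weight for ops 0..i with op i placed at step t
    dp = [[NEG] * T for _ in range(n)]
    choice = [[None] * T for _ in range(n)]
    for t in range(T):
        if weights[0][t] is not None:
            dp[0][t] = weights[0][t]
    for i in range(1, n):
        best, best_s = NEG, None       # running max of dp[i-1][s] for s < t
        for t in range(T):
            if t > 0 and dp[i - 1][t - 1] > best:
                best, best_s = dp[i - 1][t - 1], t - 1
            if weights[i][t] is not None and best > NEG:
                dp[i][t] = best + weights[i][t]
                choice[i][t] = best_s
    end = max(range(T), key=lambda t: dp[n - 1][t])
    if dp[n - 1][end] == NEG:
        return None, None
    steps = [end]                      # backtrack the chosen steps
    for i in range(n - 1, 0, -1):
        steps.append(choice[i][steps[-1]])
    steps.reverse()
    return dp[n - 1][end], steps

# Example: 3 chained ops, 4 control steps; higher weight = better slot.
w = [[5, 3, 1, None],
     [None, 4, 2, 1],
     [None, None, 3, 6]]
total, steps = schedule_chain(w)  # best chain is steps 0, 1, 3
```

This quadratic DP only conveys the flavor of the local problem; the paper's contribution is solving it with a geometric technique that carries a quality guarantee and a flexible weight formulation.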

