Chico CSCI 693 - Incremental Dynamic Code Generation with Trace Trees - D921859

Home> Schools> California State University, Chico> Computer Science (CSCI) > CSCI 693> Incremental Dynamic Code Generation with Trace Trees

Chico CSCI 693 - Incremental Dynamic Code Generation with Trace Trees

School name California State University, Chico

Course Csci 693- Research Methods in Computer Science

Pages 11

Download Save

Unformatted text preview:

Incremental Dynamic Code Generation with Trace TreesAndreas GalUniversity of California, [email protected] FranzUniversity of California, [email protected] unit of compilation for traditional just-in-time compilers is themethod. We have explored trace-based compilation, in which theunit of compilation is a loop, potentially spanning multiple methodsand even library code. Using a new intermediate representation thatis discovered and updated lazily on-demand while the program isbeing executed, our compiler generates code that is competitivewith traditional dynamic compilers, but that uses only a fractionof the compile time and memory footprint.Categories and Subject Descriptors D.3.4 [Programming Lan-guages]: Processors—Incremental compilers; optimizationGeneral Terms Design, Experimentation, PerformanceKeywords Java Virtual Machine, trace-based just-in-time compi-lation, compiler-internal representations, Trace Trees1. IntroductionJust-in-time compilers are often quite similar in structure to theirstatic counterparts. While they may be employing techniquesspecifically to reduce compilation time (for example, using linear-scan register allocation instead of a graph-coloring algorithm), theytypically start off by constructing a control-flow graph (CFG) forthe code to be compiled, then perform a series of optimization stepsbased on this graph, and as a final step traverse the CFG and emitnative code. In addition to a simple CFG, more ambitious opti-mizing compilers often use an intermediate representation basedon Static Single Assignment (SSA) form [8]. Generating SSA isan expensive operation, which is why in the dynamic compilationcontext this technique is used only for “workstation class” compil-ers.In this paper, we explore a different approach to building com-pilers in which no CFG is ever constructed. Instead, our compilerrecords and generates code from dynamically recorded code traces.Each code trace represents a loop in the program and may poten-tially span several basic blocks across several methods, even in-cluding library code.It has been shown [12] that generating SSA for a trace is muchsimpler than doing so for a general CFG, and that constructing acompiler based on trace-driven SSA generation has benefits. How-ever, this earlier work requires the program to have a dominant tracethat is taken on most loop iterations. If a program has several alter-[Copyright notice will appear here once ’preprint’ option is removed.]native execution paths that were all almost equally likely, then thisexisting technique is not competitive with CFG-based compilers.The work presented in this paper closes the gap that remainsbetween trace-based compilers and CFG-based ones. We describea new way of building compilers based on dynamically discoveredtraces, and our method is successful even when there are severalalternative traces of which none is dominant. Key to our approachis a new data structure for representing partially overlapping traces,such as a loop that contains an if-then-else condition. Ourmethod is able to dynamically and lazily discover both alternativesand jointly optimize both paths through the loop.We have built a prototype just-in-time compiler based on ournew compilation method. In this paper, we present benchmarksshowing that our compiler generates code that is almost as good asthat generated by Sun’s Java HotSpot compiler. The latter is a CFG-based compiler that is several orders of magnitude larger and slowerand is a mature product developed by a large team of programmersover several years. Our compiler is a research prototype developedby a single graduate student in under a year.The rest of this paper is organized as follows: In Section 2, weintroduce the control-flow graph model that underpins the inter-mediate representations used in virtually every existing compiler.Section 3 discusses our alternative Trace Tree representation, itsconstruction, and its on-demand extension. Section 4 explains howsuch Trace Trees are compiled. In Section 5, we discuss our proto-type implementation of a dynamic compiler based on Trace Trees.Related work is discussed in Section 6, and the paper ends with ourconclusions in Section 7.2. The Traditional Control Flow Graph ModelThe traditional control flow graph model represents a program asG = (B , E) where G is a directed graph, B is the set of basicblocks {b1, b2, . . . , bn} in G, and E is a set of directed edges{(bi, bj), (bk, bl), . . .}. Figure 1 shows the graphical representationof such a graph. Since methods can be understood as sub-programs,we can use the terms program and methods interchangeably in thiscontext.Each basic block b ∈ B is a linear sequence of instructions.A basic block is always entered at the top (first instruction), andalways continues until the last instruction is reached. After execut-ing all instructions in a basic block bi, execution continues with animmediate successor block of bi.The existence of a direct edge from bito such a successor blockbjis indicated through an ordered pair (bi, bj) of nodes. Note thatblocks can succeed themselves (a tight loop consisting of a singlebasic block), and thus the elements of said pair can be identical.The set of all immediate successor blocks of a block biis char-acterized through a successor function Γ1G(bi) = {bj|(bi, bj) ∈E}, and it can be empty only for the terminal node x ∈ B, whichterminates the program: Γ1G(x) = ∅.1 2006/12/231: code;2: do {if (condition) {3: code;} else {4: code;}5: } while (condition);6: code;6123 45Figure 1. A sample program and its corresponding control flowgraph. The starting point of each basic block is indicated through acorresponding label in the program code.The set of immediate predecessors of bj(the set of basic blocksthat can branch to bj) is characterized through the inverse of thesuccessor function: Γ−1G(bj) = {bi|(bi, bj) ∈ E}. It can be emptyonly for the entry node e ∈ B, which is the first block executedwhen running P : Γ−1G(e) = ∅.A path P along edges of a graph G can be expressed a se-quence of nodes (b1, b2, . . . , bn) of G. Each node in the sequenceis an immediate successor of the predecessor node in the sequence:bi+1∈ Γ1G(bi). A path does not have to consist of distinct blocksand can contain the same blocks (and implicitly the same edges)repeatedly.Figure 1 shows a sample program consisting of an if-then-elsecondition inside a do/while loop and the corresponding con-trol

View Full Document


School:
Email:
New Password:
Confirm Password:

Chico CSCI 693 - Incremental Dynamic Code Generation with Trace Trees

Sign up for free to view:

Please select your school