Chico CSCI 693 - Incremental Dynamic Code Generation with Trace Trees

Unformatted text preview:

Incremental Dynamic Code Generation with Trace TreesAndreas GalUniversity of California, [email protected] FranzUniversity of California, [email protected] unit of compilation for traditional just-in-time compilers is themethod. We have explored trace-based compilation, in which theunit of compilation is a loop, potentially spanning multiple methodsand even library code. Using a new intermediate representation thatis discovered and updated lazily on-demand while the program isbeing executed, our compiler generates code that is competitivewith traditional dynamic compilers, but that uses only a fractionof the compile time and memory footprint.Categories and Subject Descriptors D.3.4 [Programming Lan-guages]: Processors—Incremental compilers; optimizationGeneral Terms Design, Experimentation, PerformanceKeywords Java Virtual Machine, trace-based just-in-time compi-lation, compiler-internal representations, Trace Trees1. IntroductionJust-in-time compilers are often quite similar in structure to theirstatic counterparts. While they may be employing techniquesspecifically to reduce compilation time (for example, using linear-scan register allocation instead of a graph-coloring algorithm), theytypically start off by constructing a control-flow graph (CFG) forthe code to be compiled, then perform a series of optimization stepsbased on this graph, and as a final step traverse the CFG and emitnative code. In addition to a simple CFG, more ambitious opti-mizing compilers often use an intermediate representation basedon Static Single Assignment (SSA) form [8]. Generating SSA isan expensive operation, which is why in the dynamic compilationcontext this technique is used only for “workstation class” compil-ers.In this paper, we explore a different approach to building com-pilers in which no CFG is ever constructed. Instead, our compilerrecords and generates code from dynamically recorded code traces.Each code trace represents a loop in the program and may poten-tially span several basic blocks across several methods, even in-cluding library code.It has been shown [12] that generating SSA for a trace is muchsimpler than doing so for a general CFG, and that constructing acompiler based on trace-driven SSA generation has benefits. How-ever, this earlier work requires the program to have a dominant tracethat is taken on most loop iterations. If a program has several alter-[Copyright notice will appear here once ’preprint’ option is removed.]native execution paths that were all almost equally likely, then thisexisting technique is not competitive with CFG-based compilers.The work presented in this paper closes the gap that remainsbetween trace-based compilers and CFG-based ones. We describea new way of building compilers based on dynamically discoveredtraces, and our method is successful even when there are severalalternative traces of which none is dominant. Key to our approachis a new data structure for representing partially overlapping traces,such as a loop that contains an if-then-else condition. Ourmethod is able to dynamically and lazily discover both alternativesand jointly optimize both paths through the loop.We have built a prototype just-in-time compiler based on ournew compilation method. In this paper, we present benchmarksshowing that our compiler generates code that is almost as good asthat generated by Sun’s Java HotSpot compiler. The latter is a CFG-based compiler that is several orders of magnitude larger and slowerand is a mature product developed by a large team of programmersover several years. Our compiler is a research prototype developedby a single graduate student in under a year.The rest of this paper is organized as follows: In Section 2, weintroduce the control-flow graph model that underpins the inter-mediate representations used in virtually every existing compiler.Section 3 discusses our alternative Trace Tree representation, itsconstruction, and its on-demand extension. Section 4 explains howsuch Trace Trees are compiled. In Section 5, we discuss our proto-type implementation of a dynamic compiler based on Trace Trees.Related work is discussed in Section 6, and the paper ends with ourconclusions in Section 7.2. The Traditional Control Flow Graph ModelThe traditional control flow graph model represents a program asG = (B , E) where G is a directed graph, B is the set of basicblocks {b1, b2, . . . , bn} in G, and E is a set of directed edges{(bi, bj), (bk, bl), . . .}. Figure 1 shows the graphical representationof such a graph. Since methods can be understood as sub-programs,we can use the terms program and methods interchangeably in thiscontext.Each basic block b ∈ B is a linear sequence of instructions.A basic block is always entered at the top (first instruction), andalways continues until the last instruction is reached. After execut-ing all instructions in a basic block bi, execution continues with animmediate successor block of bi.The existence of a direct edge from bito such a successor blockbjis indicated through an ordered pair (bi, bj) of nodes. Note thatblocks can succeed themselves (a tight loop consisting of a singlebasic block), and thus the elements of said pair can be identical.The set of all immediate successor blocks of a block biis char-acterized through a successor function Γ1G(bi) = {bj|(bi, bj) ∈E}, and it can be empty only for the terminal node x ∈ B, whichterminates the program: Γ1G(x) = ∅.1 2006/12/231: code;2: do {if (condition) {3: code;} else {4: code;}5: } while (condition);6: code;6123 45Figure 1. A sample program and its corresponding control flowgraph. The starting point of each basic block is indicated through acorresponding label in the program code.The set of immediate predecessors of bj(the set of basic blocksthat can branch to bj) is characterized through the inverse of thesuccessor function: Γ−1G(bj) = {bi|(bi, bj) ∈ E}. It can be emptyonly for the entry node e ∈ B, which is the first block executedwhen running P : Γ−1G(e) = ∅.A path P along edges of a graph G can be expressed a se-quence of nodes (b1, b2, . . . , bn) of G. Each node in the sequenceis an immediate successor of the predecessor node in the sequence:bi+1∈ Γ1G(bi). A path does not have to consist of distinct blocksand can contain the same blocks (and implicitly the same edges)repeatedly.Figure 1 shows a sample program consisting of an if-then-elsecondition inside a do/while loop and the corresponding con-trol


View Full Document

Chico CSCI 693 - Incremental Dynamic Code Generation with Trace Trees

Download Incremental Dynamic Code Generation with Trace Trees
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Incremental Dynamic Code Generation with Trace Trees and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Incremental Dynamic Code Generation with Trace Trees 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?