Automatic Pool Allocation: Improving Performance by Controlling Data Structure Layout in the Heap

Chris Lattner    Vikram Adve
University of Illinois at Urbana-Champaign
{lattner,vadve}@cs.uiuc.edu

Abstract

This paper describes Automatic Pool Allocation, a transformation framework that segregates distinct instances of heap-based data structures into separate memory pools and allows heuristics to be used to partially control the internal layout of those data structures. The primary goal of this work is performance improvement, not automatic memory management, and the paper makes several new contributions. The key contribution is a new compiler algorithm for partitioning heap objects in imperative programs based on a context-sensitive pointer analysis, including a novel strategy for correct handling of indirect (and potentially unsafe) function calls. The transformation does not require type-safe programs and works for the full generality of C and C++. Second, the paper describes several optimizations that exploit data structure partitioning to further improve program performance. Third, the paper evaluates how memory hierarchy behavior and overall program performance are impacted by the new transformations. Using a number of benchmarks and a few applications, we find that compilation times are extremely low, and overall running times for heap-intensive programs speed up by 10-25% in many cases, about 2x in two cases, and more than 10x in two small benchmarks. Overall, we believe this work provides a new framework for optimizing pointer-intensive programs by segregating and controlling the layout of heap-based data structures.

Categories and Subject Descriptors  D.3.4 [Processors]: Compilers, Optimization, Memory management

General Terms  Algorithms, Performance

Keywords  Recursive data structure, data layout, cache, static analysis, pool allocation

1.
Introduction

One of the most important tasks for modern compilers and run-time systems is the management of memory usage in programs, including safety checking, optimization, and storage management. Unfortunately, compilers have proved much more effective at analyzing and controlling memory access patterns for dense arrays than for pointer-based data structures. A key difference between the two is that compilers have precise knowledge of the runtime layout of arrays in memory, whereas they have much less information about complex data structures allocated on the heap. In such (pointer-based) data structures, both the relative layout of distinct data structures in memory (which affects working set sizes) and the relative layout of nodes within a single data structure (which affects memory traversal patterns) are difficult to predict. One direct consequence is that irregular memory traversal patterns often have worse performance, both because of poor spatial locality and because techniques like hardware stride prefetching are not effective.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
PLDI'05, June 12–15, 2005, Chicago, Illinois, USA.
Copyright 2005 ACM 1-59593-056-6/05/0006...$5.00.
A potentially more far-reaching consequence of the lack of layout information is that many compiler techniques (e.g., software prefetching, data layout transformations, and safety analysis) are either less effective or not applicable to complex data structures. Despite the potential importance of data structure layouts, compiler transformations for pointer-intensive programs are performed primarily using pointer and dependence analysis, and not by controlling and using information about the layout of pointer-based data structures.

Several compiler techniques attempt to modify the layout of pointer-based data structures by giving hints or memory layout directives to a runtime library [12], memory allocator [9], or garbage collector [28, 29]. None of these techniques attempt to extract information about or to control the relative layouts of objects within a data structure or of distinct data structures, nor can they be directly extended to do so. Reliable data layout information and control are necessary for data layout properties to be used as a basis for further compiler transformations.

An alternative approach for segregating heap objects under compiler control is the work on automatic region inference for ML [45, 44, 24] and more recently for Java [14, 10]. These techniques partition objects into heap regions based on lifetimes, with the primary goal of providing automatic memory management with little or no garbage collection for type-safe languages. In contrast, our primary goal in this work is to improve program performance by segregating heap data structures and by enabling further layout-based optimizations on these data structures (in fact, we do not try to reclaim memory automatically, except in limited cases).
Because of their different goals, these techniques do not explore how to exploit data structure partitioning to optimize memory hierarchy performance and do not support non-type-safe languages like C and C++, which are important for many performance-sensitive applications. These previous approaches are compared with our work in more detail in Section 10.

This paper describes Automatic Pool Allocation, a transformation framework for arbitrary imperative programs that segregates distinct instances of pointer-based data structures in the heap into separate memory pools, and allows different heuristics to be used to partially control the internal layout of those data structures. For example, each distinct instance of a list, tree, or graph identified by the compiler would be allocated to a separate pool. The paper also describes several simple optimizations that exploit the partitioning of data structures on the heap to further improve program performance. These optimizations are possible because Automatic Pool Allocation is a rigorous transformation performed by the compiler.

In other work, we have used Automatic Pool Allocation and its underlying pointer analysis to develop other new compiler techniques operating at the "macroscopic" level, i.e., at the level of entire data structures (rather than individual pointers or objects). These