DOC PREVIEW
CMU CS 15745 - Cache-Conscious Structure Definition

This preview shows page 1-2-3-4 out of 12 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 12 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 12 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 12 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 12 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 12 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

Cache-Conscious Structure Definition Trishul M. Chilimbi Bob Davidson Computer Sciences Department Microsoft Corporation University of Wisconsin-Madison One Microsoft Way 1210 W. Dayton Street Redmond, WA 98052 Madison, WI 53706 [email protected] James R. Larus Microsoft Research One Microsoft Way Redmond, WA 98052 [email protected] [email protected] ABSTRACT A program’s cache performance can be improved by changing the organization and layout of its data-even complex, pointer-based data structures. Previous techniques improved the cache perfor- mance of these structures by arranging distinct instances to increase reference locality. These techniques produced significant perfor- mance improvements, but worked best for small structures that could be packed into a cache block. This paper extends that work by concentrating on the internal orga- nization of fields in a data structure. It describes two techniques- structure splitting and field reordering--that improve the cache behavior of structures larger than a cache block. For structures com- parable in size to a cache block, structure splitting can increase the number of hot fields that can be placed in a cache block. In five Java programs, structure splitting reduced cache miss rates 1 O-27% and improved performance 648% beyond the benefits of previ- ously described cache-conscious reorganization techniques. For large structures, which span many cache blocks, reordering fields, to place those with high temporal affinity in the same cache block can also improve cache utilization. This paper describes bbcache, a tool that recommends C structure field reorderings. Preliminary measurements indicate that reordering fields in 5 active structures improves the performance of Microsoft SQL Server 7.0 2-3%. Keywords cache-conscious definition, structure splitting, class splitting, field reorganization 1. INTRODUCTION An effective way to mitigate the continually increasing processor- memory performance gap is to allocate data structures in a manner that increases a program’s reference locality and improves its cache performance [ 1, 5, 61. Cache-conscious data layout, which clusters temporally related objects into the same cache block or into non- conflicting blocks, has been shown to produce significant perfor- mance gains. This paper continues the study of data placement optimizations along the orthogonal direction of reordering the internal layout of a structure or class’s fields. The paper describes two cache-conscious definition technique-truchrre splitting and field reordering- that can improve the cache behavior of programs. In other words, Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advan- tage and that copies beer this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. SIGPLAN ‘99 (PLDI) 5/99 Atlanta, GA, USA Q 1999 ACM l-58113-083-X/99/0004...55.00 previous techniques focused on the external arrangement of struc- ture instances, while this paper focuses on their internal organiza- tion. In particular, previous techniques (with the exception of techniques for reducing cache conflicts) worked best for structures smaller than half of a cache block. The techniques in this paper apply to larger structures as well. Figure 1 indicates different opportunities for improving cache per- formance. Caches have fmite capacity (CC main memory capacity) and transfer data in units called cache blocks that encompass multi- ple words. Caches also have finite associativity, and this restricts where a block can be placed in the cache. Placing contemporane- ously accessed structure elements in the same cache block improves cache block utilization and provides an implicit prefetch. Moreover, it makes more efficient use of cache space by reducing a structure’s cache block footprint. Finally, mapping concurrently accessed structure elements (which will not all fit in a single cache block) to non-conflicting cache blocks reduces cache misses. The techniques in this paper directly improve a structure’s cache block utilization and reduce its cache block working set. In addition, since decreas- ing a structure’s cache footprint also reduces the number of blocks that potentially conflict, the techniques may indirectly reduce con- flict misses. Figure 2 illustrates the relationship of cache-conscious definition technique to the size of structure instances. Instances significantly smaller than a cache block (case 1), are unlikely to benefit from additional manipulation at definition time. Previous technique+- such as Chilimbi and Lams cache-conscious object co-location, which uses a copying garbage collector to place objects referenced together near each other in memory [S+are effective. If the structure instance size is comparable to the size of a cache block (case 2), splitting structure elements into a hot and cold por- tion can produce hot structure pieces smaller than a cache block, which permits application of cache-conscious reorganization tech- niques to these portions. As this paper will show, many Java objects belong to this category. In addition, since Java is a type-safe lan- guage, class splitting can be automated. The first step in this pro- cess is to profile a Java program to determine member access frequency. These counts identify class member fields as hot (fre- quently accessed) or cold (rarely accessed). A compiler extracts cold fields from the class and places them in a new object, which is referenced indirectly from the original object. Accesses to cold fields require an extra indirection to the new class, while hot field accesses remain unchanged. At run time, Chilimbi and Larus’ cache-conscious garbage collector co-locates the modified object instances. For five medium-sized Java benchmarks, class splitting combined with Chilimbi and Lams cache-conscious object co- location reduced L2 cache miss rates by 29-43%, with class split- ting accounting for 2642% of this reduction, and improved perfor- mance by 18-28%, with class splitting contributing 22-66% of


View Full Document

CMU CS 15745 - Cache-Conscious Structure Definition

Documents in this Course
Lecture

Lecture

14 pages

Lecture

Lecture

19 pages

Lecture

Lecture

8 pages

Lecture

Lecture

5 pages

Lecture

Lecture

6 pages

lecture

lecture

17 pages

Lecture 3

Lecture 3

12 pages

Lecture

Lecture

17 pages

Lecture

Lecture

18 pages

lecture

lecture

14 pages

lecture

lecture

8 pages

lecture

lecture

5 pages

Lecture

Lecture

19 pages

lecture

lecture

10 pages

Lecture

Lecture

20 pages

Lecture

Lecture

8 pages

Lecture

Lecture

7 pages

lecture

lecture

59 pages

Lecture

Lecture

10 pages

Task 2

Task 2

2 pages

Handout

Handout

18 pages

Load more
Download Cache-Conscious Structure Definition
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Cache-Conscious Structure Definition and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Cache-Conscious Structure Definition 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?