
6.897: Advanced Data Structures (Spring 2003)
Lecture 17 — April 28, 2003
Prof. Erik Demaine    Scribes: Jonathan Brunsman, Glenn Eguchi

1 Overview

This lecture covers sorting in the cache-oblivious world. Sorting seems like an unusual topic for a data structures course, but as we will see, the results of our discussion of cache-oblivious sorting will lead to our development of cache-oblivious priority queues. We first review external-memory sorting before moving on to cache-oblivious sorting, priority queues, and funnel heaps.

1.1 Notation

This lecture uses capital letters in our analysis. We choose this notation because some papers use the notation n = N/B and m = M/B, where N is the number of elements, M is the size of the cache, and B is the block size. This notation is confusing, so we will avoid it.

2 External-Memory Sorting

First, recall from last class that we obtained a bound of O((N/B) log_{M/B} (N/B)) memory transfers for sorting in the comparison model. We obtained this bound by using an (M/B)-way merge sort. In this lecture we will refer to that bound as the sorting bound, because there is a matching lower bound for sorting in the comparison model [1]. The logarithm base M/B and the multiplicative factor N/B are significant because they are departures from classic merge sort, which is bounded by O(N lg N) comparisons.

Another observation to keep in mind throughout the discussion of sorting is that if we want a data structure capable of sorting N elements whose performance matches the sorting bound, we will need its performance to be O((1/B) log_{M/B} (N/B)) per element.

3 Cache-Oblivious Sorting

As we begin our discussion of cache-oblivious sorting, note that the algorithms and data structures we have already seen have not depended on the parameter M, the cache size. The analysis of our sorting algorithm, however, will depend on M, thereby increasing its complexity.

The cache-oblivious sorting algorithm presented here is a version of funnel sort [7, 4], which is similar to merge sort.
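As a concrete point of reference, the (M/B)-way merge sort behind the sorting bound can be sketched with a small in-memory stand-in. Here `k` plays the role of the M/B fan-out, and the function name is ours, not the lecture's; a real external-memory version would stream one block per run through the cache rather than hold the runs in lists:

```python
import heapq

def multiway_merge_sort(items, k):
    """Sketch of a k-way merge sort (k >= 2 stands in for M/B).

    Split the input into k runs, sort each recursively, then k-way merge.
    The k-way merge at each level is what yields the log base M/B in the
    external-memory sorting bound.
    """
    if len(items) <= 1:
        return list(items)
    chunk = -(-len(items) // k)  # ceiling division: length of each run
    runs = [multiway_merge_sort(items[i:i + chunk], k)
            for i in range(0, len(items), chunk)]
    return list(heapq.merge(*runs))  # k-way merge of sorted runs
```

With k = 2 this degenerates to ordinary merge sort; larger k reduces the number of merge levels, which is exactly the savings the external-memory analysis counts.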
There are two principal versions of this sort: an aggressive funnel sort, developed first [7], and a lazy funnel sort, developed more recently [4]. We give the latter, simpler description.

3.1 Tall-Cache Assumption

In our previous discussions of cache-oblivious data structures, we required that the quotient M/B be at least 1, so that we can store something useful in the cache. This assumption is rather weak, and in this lecture we replace it with a stronger assumption, called the tall-cache assumption, which will be useful in cache-oblivious sorting. In fact, Brodal and Fagerberg [5] proved that a tall-cache assumption is necessary to achieve the sorting bound with a cache-oblivious algorithm. This assumption states, roughly, that the cache is taller than it is wide. More precisely, the tall-cache assumption requires that M = Ω(B^{1+γ}) for some constant γ > 0.[1] A common assumption in practice is M = Ω(B^2) (i.e., γ = 1), and we will make this stronger assumption for simplicity in the algorithms and data structures below.

3.2 K-Funnel

The basic approach of funnel sort is similar to that of merge sort, and the overall goal is to develop a fast merger: if we can merge quickly in the cache-oblivious world, then we can sort quickly as well.

A funnel merges several sorted lists into one sorted list in an output buffer. Specifically, a K-funnel merges K sorted lists into one sorted list. Each of these lists may be large, subject to the requirement that their total size is at least K^3. The K-funnel merges these K lists using O((K^3/B) log_{M/B} (K^3/B) + K) memory transfers. Note that the additive term K is required because we must look at each list at least once. Note also that, while the merging process in a typical merge sort requires linear time, funnel merging requires the logarithmic term.
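To get a feel for this merge bound, a throwaway helper (hypothetical; constants and lower-order terms are dropped) can evaluate (K^3/B) · log_{M/B}(K^3/B) + K under the tall-cache assumption M = B^2:

```python
import math

def k_funnel_cost(K, B, M):
    """Evaluate the K-funnel merge bound (K^3/B) * log_{M/B}(K^3/B) + K.

    Purely illustrative: real costs hide constant factors, and the
    cache-oblivious algorithm itself never sees B or M.
    """
    return (K**3 / B) * math.log(K**3 / B, M / B) + K

# Example with a tall cache M = B^2: here K^3/B = M/B = 64, so the
# logarithm is exactly 1 and the bound is 64 + 16 transfers.
cost = k_funnel_cost(K=16, B=64, M=64**2)
```

For small K the additive K term can dominate, which is why it must appear explicitly in the bound.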
This difference stems from counting memory transfers, and will become apparent at the end of our analysis of funnel sort.

The K-funnel is a static data structure, defined recursively in terms of √K-funnels; see Figure 1. The K-funnel has an output buffer of size K^3 and K input sources of possibly varying size but total size at least K^3. Within the K-funnel are two levels of √K-funnels separated by middle buffers of size K^{3/2}. There are √K middle buffers (each storing the output of one lower √K-funnel), so the total buffer space separating the √K-funnels is K^2.

3.3 Lazy Merge

The lazy merge process fills the middle buffers at the subsequent level within the K-funnel. The algorithm for lazy merge is simple: if the two lower-level buffers both have data, binary-merge that data into the buffer above. This procedure is very compact, since we are only considering three specific buffers in memory: the two input buffers and the resulting merged buffer.

When one child buffer is empty, however, we must recurse to fill that buffer. As a base case, we define the inputs to the leaves to be the input lists to sort. If such an input is empty, we indicate this through the use of an EOF or other such marker.

The difficult part of the lazy merge process is the analysis. The recursive layout is similar to that used in van Emde Boas structures, but now we must include the middle buffers between funnels in our analysis. This requirement is not problematic as long as we maintain the funnels in consecutive memory.

[1] We use the letter γ because it is the third letter of the alphabet, standing for "cache".

[Figure 1: A K-funnel. An upper √K-funnel and √K lower √K-funnels are separated by √K middle buffers of size K^{3/2} each (total size K^2); the inputs have total size at least K^3, and the output buffer has size K^3.]

3.4 Analysis of Lazy Merge

We begin our analysis of lazy merge by examining its space requirement.
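The lazy fill procedure can be sketched with a tree of binary mergers standing in for the recursive √K-funnel structure. The class names, the fixed buffer capacity, and the list-based buffers below are all illustrative simplifications of ours, not the actual cache-oblivious memory layout:

```python
class Leaf:
    """An input list; exhausted plays the role of the EOF marker."""
    def __init__(self, items):
        self.buf = list(items)
        self.exhausted = False

    def fill(self):
        self.exhausted = not self.buf


class Merger:
    """A binary lazy merger with an output buffer of fixed capacity."""
    def __init__(self, left, right, capacity):
        self.left, self.right = left, right
        self.buf = []
        self.capacity = capacity
        self.exhausted = False

    def fill(self):
        # Binary-merge from the two children until our buffer is full or
        # both children are exhausted; recurse only when a child's buffer
        # is empty -- that is the "lazy" part of lazy merge.
        while len(self.buf) < self.capacity:
            for child in (self.left, self.right):
                if not child.buf and not child.exhausted:
                    child.fill()
            l, r = self.left.buf, self.right.buf
            if not l and not r:
                self.exhausted = True
                return
            if l and (not r or l[0] <= r[0]):
                self.buf.append(l.pop(0))
            else:
                self.buf.append(r.pop(0))


def funnel_merge(lists):
    """Merge sorted lists via a balanced tree of binary mergers (sketch)."""
    nodes = [Leaf(lst) for lst in lists]
    total = sum(len(lst) for lst in lists)
    while len(nodes) > 1:
        paired = [Merger(nodes[i], nodes[i + 1], capacity=4)
                  for i in range(0, len(nodes) - 1, 2)]
        if len(nodes) % 2:
            paired.append(nodes[-1])
        nodes = paired
    root = nodes[0]
    root.capacity = total  # let the root's buffer hold the entire output
    root.fill()
    return root.buf
```

Each fill touches only three buffers (two children and one output), which is the compactness the analysis exploits; in the real structure the buffer capacities grow with the funnel size rather than being a fixed constant.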
First, let's consider when the funnels do and do not fit inside the cache by examining the space required by a funnel:

    S(K) = (√K + 1) S(√K) + K^2

The three terms of the recurrence are easily derived. First, the number of recursive subfunnels is √K + 1: the √K lower funnels plus the one upper funnel. Second, each of these subfunnels costs S(√K) space. Finally, the total size of the middle buffers is K^2 (not K^3, since we do not count the final output buffer).

We solve this recurrence using recursion trees. Since there are O(K) leaves, each requiring constant space, the root cost of K^2 clearly dominates and the recurrence simplifies to O(K^2). More precisely, we shall say that S(K) = cK^2 for some constant c ≥ 1.

The analysis of the time requirement for lazy merge proceeds similarly to that of the van Emde Boas structures. We recurse to a specific

