Unformatted text preview:

6.897: Advanced Data Structures Spring 2003Lecture 1 — February 9, 2003Prof. Erik Demaine Scribe: Shantonu Sen1 Fixed-Universe Successor Problem1.1 MotivationFrequently, you need to store a dynamic set of n integers such that you can perform fast lookupsto determine whether an element is part of the set (i.e. a membership test). This problem can besolved in constant time per operation using hashing, as we’ll see in the next lecture. Two relatedpieces of information is the successor and predecessor of an integer (not necessarily in the set),that is, the two elements in the set that are immediately greater than and less than the integer,respectively. The successor and predecessor of an integer are particularly useful when the integeris not in the set, because they designate where that integer would “fit” if it were in the sorted set.The classic solution to this problem is to maintain a balanced binary tree of the integers in yourset. The membership test for a binary balanced tree can be performed in O(lg n) via binary search.The predecessor and successor functions can also be computed in O(lg n) time, by first attemptingto check membership of the search element, and then moving up the tree and “leftwards” (asformalized later) for the predecessor, and moving up the tree and “rightwards” for successor.One way to make the problem easier is to impose that the universe is not the set of all integers,but integers from 0 to u − 1. This assumption is called the fixed-universe assumption. Togetherwith the operations described above, the problem is called the fixed-universe successor problem orfixed-universe predecessor problem. This problem is also referred to as Interval Union-Split-Find[MNA88] and as priority queues [vEKZ77].1.2 Formal definitionWe would like to create a data structure with the follow properties:Goal: Maintain a dynamic subset S of size n from the universe U = {0, 1, 2, 3, . . . , u −1} of size uSupported operations:• Insert(x ∈ U /∈ S): add an element to S• Delete(x ∈ S): remove an element from S• Successor(x ∈ U): find the smallest element ∈ S that is > x• Predecessor(x ∈ U): find the largest element ∈ S that is < xDesired performance: Better than O(lg n) for all operations, with the time bound possibly de-pending on u. Traditional balanced binary tree operate in O(lg n) without the fixed-universeassumption. In particular, can we achieve O(lg lg u)?11.3 Known resultsThe standard solution to successor problem uses a balanced binary search trees, with a runningtime of O(lg n) per operation, using the comparison model on a pointer machine without the fixed-universe assumption. There are several other solutions, depending on the model of computationwe consider.1.3.1 ModelsWe will consider many different models of computation during this class. For starters, here are thethree that have been studied for the fixed-universe successor problem:Pointer-machine model. On a pointer machine, the data structure is described by a directedgraph, where each node stores a constant number of labeled outgoing pointers and a constantnumber of integers. In other words, you have a constant branching factor. For the fixed-universe successor problem, there is a pointer to each element in the universe U, and theinput to an operation is one of these pointers.Random Access Machine (RAM). In a RAM, memory is laid out as a finite array of slots. Ifyou know the index of an entry, you can jump to its location and do a memory access in O(1)time.For example, below is a representation of a memory store that holds the numbers 1, 3, and 6in the first 3 slots. Memory addresses can be loaded from and stored to, and the results ofloads can be combined with arithmetic or logical operators.The cost of an algorithm using a RAM is linear in the number of total instructions. That is,both memory accesses and arithmetic/logical operations cost O(1) time.0 1 2 ...1 3 6Cell-probe model. The cell-probe model is just like a RAM, except that arithmetic/logical com-putation is “free”. In this model, an algorithm’s cost is linear in the number of memoryaccesses. This model is rather unrealistic, but is commonly used for lower bounds; any lowerbound on the cell-probe model also applies to the RAM.1.3.2 ResultsHere are several upper and lower bounds for the successor and fixed-universe successor problem,and their models of computation.• Balanced binary search trees– Comparison model using a pointer machine (no fixed-universe assumption)2– O(lg n) time per operation– O(n) space• van Emde Boas [vEKZ77, vEB77] (this lecture)– Fixed-universe model on pointer machine (but we will describe the algorithm as workingon a RAM)– O(lg lg u) time per operation– O(u) space• Lower bound by Mehlhorn, N¨aher, and Alt [MNA88]– Pointer machine– Lower bound of Ω(lg lg u) time per operation• y-fast trees [Wil83]– Randomized algorithm on a RAM– O(lg lg u) time per operation– O(n) space• Lower bound by Beame and Fich [BF02]– Cell-probe model– Static case, no Insert/Delete– Lower bound of ΩÃmin(lg lg ulg lg lg u,slg nlg lg n)!time per operation, for any data struc-ture using only O(nO(1)) space• Exponential search trees [AT99]– RAM model– OÃmin(lg lg ulg lg lg ulg lg n,slg nlg lg n)!worst-case time per operation2 O(lg lg u) solution: van Emde Boas structureOur goal for this lecture is to achieve O(lg lg u) running time. Based on existing techniques foranalyzing asymptotic running time of algorithms, we have some intuition about how we might endup with this type of running time.One approach of traversing a data structure might involve doing binary search over O(lg u) things.Because a binary search runs in logarithmic time, this would yield performance of O(lg lg u).3Another possibility is creating a recurrence relation whose solution is O(lg lg u). For instance,consider the relation:T (u) = T (√u) + O(1)T0(lg u) = T0(lg√u) + O(1) — let T0(lg v) = T (v)T0(lg u) = T0(12lg u) + O(1) — pull out the square rootT0(x) = T0(12x) + O(1) — substitute x = lg uT0is O(lg x) — using Master MethodT0is O(lg lg u) — substitute for x = lg uThe van Emde Boas structure will employ the second technique to achieve O(lg lg u) running time.In a certain sense, it will also correspond to binary searching over the Θ(lg u) levels of a completebinary tree on u leaves.2.1 Starting point: precompute answersOne technique for solving the problem is to use an array to


View Full Document

MIT 6 897 - Lecture Notes

Download Lecture Notes
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Lecture Notes and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Lecture Notes 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?