Princeton COS 521 - Lecture 4 - D1378341




princeton univ. F’08 cos 521: Advanced Algorithm Design
Lecture 4: Competitive analysis of data structures
Lecturer: Sanjeev Arora    Scribe: Rong Ge

1 Introduction

1.1 Competitive Ratio

In the last lecture we defined the competitive ratio for data structures and showed that a simple Move-To-Front algorithm achieves competitive ratio 2 in the linked list model. Recall the definition:

Definition 1 (Competitive Ratio) An algorithm $A$ has a competitive ratio of $\alpha$ if for all sequences of operations $\sigma$ we have

$$\mathrm{COST}_A(\sigma) \le \alpha \cdot \mathrm{COST}_{\mathrm{OPT}}(\sigma) \qquad (1)$$

where $\mathrm{COST}_A(\sigma)$ is the cost of algorithm $A$ on the sequence $\sigma$, and $\mathrm{COST}_{\mathrm{OPT}}(\sigma)$ is the cost of the best algorithm for the same sequence.

1.2 BST model

Today we apply competitive analysis to binary search trees. In the binary search tree model (shown in Figure 1), each node $v$ has a key $k$; all nodes in the left subtree of $v$ have keys smaller than or equal to $k$, and all nodes in the right subtree of $v$ have keys larger than $k$. Following a pointer in the tree and performing a rotation each cost a constant.

[Figure 1: Binary Search Trees. Labels: "Key = K"; left subtree "Key ≤ K"; right subtree "Key > K".]

The operations that a binary search tree supports include INSERT, DELETE and FIND.

The Splay Tree [ST85] is conjectured to have competitive ratio $O(1)$, but the best bound known is only $O(\log n)$. Recently, [DHIP07] proposed a new data structure, the Tango Tree, that has competitive ratio $O(\log\log n)$.

1.3 Interleave Bound

One of the difficulties in proving upper bounds on the competitive ratio is that we do not know how the optimal algorithm works. To get around this, we first come up with a lower bound $\mathrm{IL}(\sigma)$ such that $\forall \sigma\ \mathrm{COST}_{\mathrm{OPT}}(\sigma) \ge \mathrm{IL}(\sigma)$, and then show that $\mathrm{COST}_{\mathrm{TANGO}}(\sigma) \le O(\log\log n) \cdot \mathrm{IL}(\sigma)$.

In this lecture, we assume for simplicity that the request sequence consists only of FINDs. It is easy to extend the result to sequences that also contain INSERT and DELETE.
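To make the proof strategy of Section 1.3 explicit, the two inequalities stated there can be chained as follows. This is only a sketch: the constant $c$ stands in for the one hidden by the $O(\cdot)$ notation and is introduced here purely for illustration.

```latex
\begin{align*}
\mathrm{COST}_{\mathrm{TANGO}}(\sigma)
  &\le c \cdot \log\log n \cdot \mathrm{IL}(\sigma)
     && \text{(upper bound to be shown for Tango Trees)} \\
  &\le c \cdot \log\log n \cdot \mathrm{COST}_{\mathrm{OPT}}(\sigma)
     && \text{(since } \mathrm{COST}_{\mathrm{OPT}}(\sigma) \ge \mathrm{IL}(\sigma)\text{)}
\end{align*}
```

Comparing the last line with Definition 1 gives a competitive ratio of $O(\log\log n)$, without ever needing to describe the optimal algorithm itself.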
Now we may as well assume, for notational ease, that the set of keys is $\{1, 2, \ldots, n\}$, because the BST's structure depends only on the results of comparisons between keys. Let $\sigma = (\sigma_1, \sigma_2, \sigma_3, \ldots)$ be a sequence of FIND operations.

We define $P$ as the static complete binary search tree on the set $\{1, 2, \ldots, n\}$. For computing IL, we maintain a bit $\mathrm{MR}[i]$ for each $i \in \{1, 2, \ldots, n\}$, where MR stands for “Most Recent”.

$\mathrm{MR}[i] = 1$ iff the most recent FIND operation that was routed through node $i$ in $P$ went to its right subtree.

Since $P$ is a complete binary search tree, each single FIND operation causes at most $\lceil \log n \rceil$ changes in MR values.

Definition 2 (Interleave Bound) $\mathrm{IL}(\sigma)$ = total number of changes in MR bits while performing the FIND operations in $\sigma$.

Wilber [Wil89] proved the following theorem, which states that IL is a lower bound for $\mathrm{COST}_{\mathrm{OPT}}(\sigma)$.

Theorem 1

$$\mathrm{COST}_{\mathrm{OPT}}(\sigma) = \Omega(\mathrm{IL}(\sigma)) \qquad (2)$$

We will show the proof of this theorem later.

2 Tango Tree Description

2.1 Tango Tree Structure

We now specify the structure of Tango Trees using the binary search tree $P$ and the MR bits. First we define the notion of a Preferred Path.

Definition 3 (Preferred Path) The Preferred Path (Figure 2) is the path descending from the root to a leaf following the MR bits (that is, going left when MR = 0 and right when MR = 1).

It is easy to see that by removing the edges of the Preferred Path from the tree $P$, we get a disjoint union of $\lceil \log n \rceil$ subtrees. Using these subtrees, we can define the PP-Decomposition $\mathrm{PPD}(T)$ of a tree $T$ recursively.

[Figure 2: Preferred Path. Labels: "Preferred Path"; "Key = n/2".]

Definition 4 (PP Decomposition)

$$\mathrm{PPD}(T) = \{\mathrm{PreferredPath}(T)\} \ \cup \bigcup_{T' \in\, T - \mathrm{PreferredPath}(T)} \mathrm{PPD}(T') \qquad (3)$$

where $T - \mathrm{PreferredPath}(T)$ denotes the set of subtrees left after removing the Preferred Path from $T$.

Recall that Red-Black Trees support INSERT, DELETE and FIND in $O(\log k)$ time, where $k$ is the number of nodes in the tree.
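Returning to the MR bits of Definitions 2 and 3, the following is a minimal Python sketch of how $\mathrm{IL}(\sigma)$ and the Preferred Path could be computed. It rests on several assumptions that the notes do not fix: the static tree $P$ is represented implicitly, with the midpoint of each key interval as the subtree root; all MR bits start at 0; and a FIND that ends at node $i$ leaves $\mathrm{MR}[i]$ unchanged. The function names are hypothetical.

```python
def simulate_finds(n, sigma):
    """Run the FINDs in sigma on the static tree P over keys 1..n.

    Returns (changes, mr), where changes is the number of MR-bit flips
    (the interleave bound) and mr[i] is the final MR bit of key i.
    Assumptions: P is the balanced BST whose subtree roots are interval
    midpoints, every MR bit starts at 0, and a search that stops at a
    node does not modify that node's bit.
    """
    mr = [0] * (n + 1)  # index 0 is unused
    changes = 0
    for x in sigma:
        lo, hi = 1, n
        while lo <= hi:
            mid = (lo + hi) // 2  # key stored at the current node of P
            if x == mid:
                break  # search ends here; MR[mid] is left as it was
            bit = 1 if x > mid else 0  # 1 = routed right, 0 = routed left
            if mr[mid] != bit:
                mr[mid] = bit
                changes += 1
            if bit:
                lo = mid + 1
            else:
                hi = mid - 1
    return changes, mr


def interleave_bound(n, sigma):
    """IL(sigma): total number of MR-bit changes (Definition 2)."""
    return simulate_finds(n, sigma)[0]


def preferred_path(n, mr):
    """Keys on the path from the root of P following the MR bits (Definition 3)."""
    path = []
    lo, hi = 1, n
    while lo <= hi:
        mid = (lo + hi) // 2
        path.append(mid)
        if mr[mid]:
            lo = mid + 1  # MR = 1: go right
        else:
            hi = mid - 1  # MR = 0: go left
    return path
```

Under these assumptions, with $n = 7$ the request sequence 1, 7, 1, 7 flips the root's bit on each of the last three requests and the bit of node 6 once, so `interleave_bound(7, [1, 7, 1, 7])` is 4, and the Preferred Path afterwards is 4, 6, 7.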
Using Red-Black Trees, we define the Tango Tree recursively.

Definition 5 (Tango Tree) A Tango Tree (Figure 3) is a binary search tree in which we store the nodes of the Preferred Path in a Red-Black Tree, and “hang” Tango Trees for the $\lceil \log n \rceil$ subtrees at the appropriate places in this Red-Black Tree.

[Figure 3: Tango Tree Structure. Labels: "log log n"; "RB-Tree for Preferred Path (log n nodes)"; "“Hanged” Tango Subtree".]

2.2 FIND operations for Tango Tree

When a FIND operation happens, the MR bits may change. The changes in MR bits cause the PP-Decomposition to change. How do we update the Tango Tree?

In fact, it is possible to do this in time $O(\log\log n) \cdot (\text{number of changes in MR bits})$. To do so, we need to store auxiliary information Dep, MinDep and MaxDep in the nodes of the Red-Black Tree. The Dep value of a node is its depth in the static tree $P$; the MinDep value of a node is the minimum value of Dep among the nodes in its subtree of the Red-Black Tree; the MaxDep value is the maximum value of Dep among those nodes. Maintaining these auxiliary values does not affect the complexity of Red-Black Tree operations (see [CLRS01, Chapter 14]). We also claim that Red-Black Trees support the following SPLIT and MERGE operations in $O(\log k)$ time, where $k$ is the number of nodes (see [CLRS01, Problem 13-2]).

Definition 6 (SPLIT and MERGE) SPLIT($T$, $x$), where $T$ is a Red-Black Tree and $x$ is a node in the tree, splits the tree into two Red-Black Trees: one contains all the nodes with key $< x$, the other contains all the nodes with key $> x$.
MERGE($T_1$, $T_2$, $x$), where $T_1$ is a Red-Black Tree whose nodes have key $< x$ and $T_2$ is a Red-Black Tree whose nodes have key $> x$, merges the two trees into a single Red-Black Tree that contains all nodes in $T_1$ and $T_2$ together with a node with key $= x$.

Observe that each preferred path in the PP-Decomposition covers a contiguous interval of depths (in fact, the depths in the interval $[t, \log n]$, where $t$ is the minimum depth on the path). Using the SPLIT and MERGE operations, we claim that given a depth $d$, we can cut out of a Red-Black Tree the nodes with Dep $> d$. We can also join two Red-Black Trees, where one contains only nodes with Dep $> d$ and the other has already had all of its nodes with Dep $> d$ removed by a cut.

The key observation is that in the Red-Black Tree of any path, the keys of the nodes with Dep $> d$ form an interval $[l, r]$ (because these nodes are the intersection of a subtree of $P$ with the path). We can find the nodes with keys $l$ and $r$ by following the information in MinDep and MaxDep. Then we find the predecessor $l'$ of $l$ and the successor $r'$ of $r$. All of these operations take $O(\log k)$ time in a Red-Black Tree.

To do a cut, we do a SPLIT at $l'$ and then a SPLIT at $r'$. This yields a tree whose nodes satisfy $l' < \text{key} < r'$, which therefore contains exactly the nodes with Dep $> d$. We mark this tree as “hanged” and
