# Princeton COS 521 - Lecture 4

## Lecture 4



Princeton Univ. F'08, COS 521: Advanced Algorithm Design

**Lecture 4: Competitive analysis of data structures**

Lecturer: Sanjeev Arora. Scribe: Rong Ge

### 1 Introduction

#### 1.1 Competitive Ratio

In the last lecture we gave the definition of competitive ratio for data structures and showed a simple Move-To-Front algorithm that achieves competitive ratio 2 in the linked list model. Recall that competitive ratio is defined as follows.

**Definition 1 (Competitive Ratio).** An algorithm $A$ has a competitive ratio of $\alpha$ if for all sequences $\sigma$ of operations we have

$$\mathrm{COST}_A(\sigma) \le \alpha \cdot \mathrm{COST}_{OPT}(\sigma) \tag{1}$$

where $\mathrm{COST}_A(\sigma)$ is the cost of algorithm $A$ on the sequence $\sigma$ and $\mathrm{COST}_{OPT}(\sigma)$ is the cost of the best algorithm for the same sequence.

#### 1.2 BST model

Today we are going to apply competitive analysis to binary search trees. In the binary search tree model (shown in Figure 1), each node $v$ has a key $k$; all nodes in the left subtree of $v$ have keys smaller than or equal to $k$, and all nodes in the right subtree of $v$ have keys larger than $k$. Following a pointer in the tree and performing a rotation each cost a constant.

Figure 1: Binary Search Trees (a node with key $K$, a left subtree with keys $\le K$, and a right subtree with keys $> K$).

The operations that a binary search tree supports include INSERT, DELETE and FIND.

The Splay Tree [ST85] is conjectured to have competitive ratio $O(1)$, but the best bound known is only $O(\log n)$. Recently, [DHIP07] proposed a new data structure, the Tango Tree, that has competitive ratio $O(\log \log n)$.

#### 1.3 Interleave Bound

One of the difficulties in proving upper bounds on the competitive ratio is that we do not know how the optimal algorithm works. To solve this problem, we first come up with a lower bound $IL(\sigma)$ such that $\mathrm{COST}_{OPT}(\sigma) = \Omega(IL(\sigma))$, and then show that $\mathrm{COST}_{TANGO}(\sigma) = O(\log \log n) \cdot IL(\sigma)$.

In this lecture we assume for simplicity that the request sequence consists only of FINDs. It is easy to extend the result to allow INSERT and DELETE. We may as well assume, for notational ease, that the set of keys is $\{1, 2, \ldots, n\}$, because the structure of the BST depends only on the results of comparisons between keys.

Let $\sigma = \sigma_1, \sigma_2, \sigma_3, \ldots$ be a sequence of FIND operations. We define $P$ as the static complete binary search tree on the set $\{1, 2, \ldots, n\}$. For computing $IL(\sigma)$ we maintain a bit $MR(i)$ for each $i \in \{1, 2, \ldots, n\}$, where MR stands for "Most Recent": $MR(i) = 1$ iff the most recent FIND operation that was routed through node $i$ in $P$ went to its right subtree. Since $P$ is a complete binary search tree, each single FIND operation causes at most $\lceil \log n \rceil$ changes in MR values.

**Definition 2 (Interleave Bound).** $IL(\sigma)$ = total number of changes in MR bits while performing the FIND operations in $\sigma$.

Wilber [Wil89] showed the following theorem, which states that $IL(\sigma)$ is a lower bound for $\mathrm{COST}_{OPT}(\sigma)$.

**Theorem 1.**

$$\mathrm{COST}_{OPT}(\sigma) = \Omega(IL(\sigma)) \tag{2}$$

We will show the proof of this theorem later.

### 2 Tango Tree Description

#### 2.1 Tango Tree Structure

Now we specify the structure of Tango Trees using the binary search tree $P$ and the MR bits. First we define the notion of a Preferred Path.

**Definition 3 (Preferred Path).** The Preferred Path (Figure 2) is the path descending from the root to a leaf following the MR bits, that is, going left when $MR = 0$ and right when $MR = 1$.

It is easy to see that by removing the Preferred Path from the tree $P$ we get a disjoint union of $\lceil \log n \rceil$ subtrees. Using these subtrees we can define the PP Decomposition $PPD(T)$ of a tree $T$ recursively.

Figure 2: Preferred Path (the root has key $n/2$).

**Definition 4 (PP Decomposition).**

$$PPD(T) = \{\mathrm{PreferredPath}(T)\} \cup \{\text{PP Decompositions of the trees in } T - \mathrm{PreferredPath}(T)\} \tag{3}$$

Recall that Red-Black Trees support INSERT, DELETE and FIND in $O(\log k)$ time, where $k$ is the number of nodes in the tree. Using Red-Black Trees we define the Tango Tree recursively.

**Definition 5 (Tango Tree).** A Tango Tree (Figure 3) is a binary search tree in which we store the nodes of the Preferred Path in a Red-Black Tree, and hang Tango Trees for the $\lceil \log n \rceil$ subtrees in the appropriate places in this Red-Black Tree.

Figure 3: Tango Tree Structure (an RB Tree of height $\log \log n$ for the Preferred Path, which has $\log n$ nodes, with hanged Tango subtrees below it).

#### 2.2 FIND operations for Tango Tree

When a FIND operation happens, the MR bits may change, and these changes cause the PP Decomposition to change. How do we update the Tango Tree? In fact it is possible to do it in time $O(\log \log n) \cdot (\text{number of changes in MR bits})$.

To do this we need to store auxiliary information Dep, MinDep and MaxDep in the nodes of each Red-Black Tree. The Dep value of a node is the depth of the node in the static tree $P$; the MinDep value of a node is the minimum value of Dep among the nodes in its subtree; the MaxDep value is the maximum value of Dep among the nodes in its subtree. Maintaining these auxiliary values does not affect the complexity of Red-Black Tree operations (see [CLRS01], Chapter 14).

We also claim that in Red-Black Trees we can perform the following SPLIT and MERGE operations in $O(\log k)$ time, where $k$ is the number of nodes (see [CLRS01], Problem 13-2).

**Definition 6 (SPLIT and MERGE).**

- SPLIT$(T, x)$, where $T$ is a Red-Black Tree and $x$ is a node in the tree, splits the tree into two Red-Black Trees: one contains all the nodes with key $< x$, the other contains all the nodes with key $> x$.
- MERGE$(T_1, T_2, x)$, where $T_1$ is an RB Tree whose nodes have key $< x$ and $T_2$ is an RB Tree whose nodes have key $> x$, merges the two trees into a single RB Tree that contains all nodes in $T_1$ and $T_2$ and a node with key $x$.

Observe that each preferred path in the PP Decomposition involves a contiguous interval of depths (actually, it involves depths in the interval $[t, \log n]$, where $t$ is the minimum depth). Using the SPLIT and MERGE operations, we claim that given a depth $d$ we can cut the nodes with Dep $> d$ out of a Red-Black Tree. We can also join two Red-Black Trees where one contains only nodes with Dep $> d$ and the other has been cut so that all of its nodes with Dep $> d$ are gone.

The key observation here is that in the Red-Black Tree of any preferred path, the keys of the nodes that have Dep $> d$ form an interval $[l, r]$, because they are the intersection of a subtree of $P$ and the path. We can find the nodes with keys $l$ …
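The interleave bound of Definition 2 can be computed by simulating the MR bits directly. The following is a minimal sketch, not taken from the lecture, under several assumptions: the function name `interleave_bound` is hypothetical; the static tree $P$ is represented implicitly, with the node for the key range `lo..hi` holding the midpoint key (this is the complete BST when $n = 2^k - 1$); a FIND that stops at a node sets that node's bit to 0, since it did not go right; and the first time a bit is set is not counted as a change.

```python
def interleave_bound(n, finds):
    """Count MR-bit changes for a FIND sequence on the implicit balanced BST over 1..n."""
    mr = {}        # key of a node in P -> 0 (went left or stopped here) or 1 (went right)
    changes = 0
    for x in finds:
        lo, hi = 1, n
        while lo <= hi:
            mid = (lo + hi) // 2          # node of P that covers the key range lo..hi
            bit = 1 if x > mid else 0
            # Assumption: setting a bit for the first time is not a change.
            if mid in mr and mr[mid] != bit:
                changes += 1
            mr[mid] = bit
            if x == mid:
                break                     # the search ends at this node
            if x < mid:
                hi = mid - 1
            else:
                lo = mid + 1
    return changes


if __name__ == "__main__":
    # Alternating between the smallest and largest key flips the root's bit each time.
    print(interleave_bound(7, [1, 7, 1, 7]))
```

Under these assumptions, repeating the same FIND never changes a bit, while a sequence that alternates between the two sides of a node flips that node's bit on every request.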
