Robert Sedgewick and Kevin Wayne • Copyright © 2005 • http://www.Princeton.EDU/~cos2264.4 Balanced TreesReference: Chapter 13, Algorithms in Java, 3rd Edition, Robert Sedgewick.2Symbol Table ReviewSymbol table: key-value pair abstraction.! Insert a value with specified key.! Search for value given key.! Delete value with given key.Randomized BST.! O(log N) time per op.! Store subtree count in each node.! Generate random numbers for each insert/delete op.This lecture.! Splay trees.! 2-3-4 trees.! Red-black trees.! B-trees.unless you get ridiculously unlucky32-3-4 Trees2-3-4 tree.! Scheme to keep tree balanced.! Generalize node to allow multiple keys.Allow 1, 2, or 3 keys per node.! 2-node: one key, two children.! 3-node: two keys, three children.! 4-node: three keys, four children.H I NF RUA C Ek ! F F ! k ! R R ! k42-3-4 Trees: Search and InsertSearch.! Compare search key against keys in node.! Find interval containing search key.! Follow associated link (recursively).Insert.! Search to bottom for key.! 2-node at bottom: convert to 3-node.! 3-node at bottom: convert to 4-node.! 4-node at bottom: ??H I NF RUA C Ek ! F F ! k ! R R ! k52-3-4 Trees: Splitting Four NodesTransform tree on the way down.! Ensures last node is not a 4-node.! Local transformation to split 4-nodes:Invariant: current node is not a 4-node.! One of two above transformations must apply at next node.! Insertion at bottom is easy since it's not a 4-node.62-3-4 Trees: Splitting a Four NodeSplitting a four node: move middle key up.A-CK Q WDE-J L-P R-V X-ZA-CKD QE-J L-P R-V X-ZW72-3-4 TreesTree grows up from the bottom.EAPEXML8Balance in 2-3-4 TreesProperty. All paths from top to bottom have exactly the same length.Tree height.! Worst case: lg N [all 2-nodes]! Best case: log4 N = 1/2 lg N [all 4-nodes]! Between 10 and 20 for a million nodes.! Between 15 and 30 for a billion nodes.Note. Comparison within nodes not accounted for.92-3-4 Trees: Implementation?Direct implementation complicated because of:! Maintaining multiple node types.! Implementation of getChild.! Large number of cases for split.private Node insert(Node h, Key key, Value val) { Node x = h; while (x != null) { x = x.getChild(key); if (x.is4Node()) x.split(); } if (x.is2Node()) x.make3Node(key, val); else if (x.is3Node()) x.make4Node(key, val);}fantasy code10Red-Black TreesRepresent 2-3-4 trees as binary trees.! Use "internal" edges for 3- and 4- nodes.! Correspondence between 2-3-4 trees and red-black trees.! Not 1-1 because 3-nodes swing either way.red11Splitting Nodes in Red-Black TreesTwo easy cases: switch colors.Two hard cases: use rotations.do single rotationdo double rotation12Red-Black Tree Node Split Exampleright rotate R "left rotate E "change colorsinserting G13Red-Black Tree ConstructionEAPEXML14Balance in Red-Black TreesProperty. Length of longest path is at most twice the length ofshortest path.Tree height. Worst case: 2 lg N.Note. Comparison within nodes are counted.15Symbol Table: Implementations Cost Summary* assumes hash map is random for all keys† N is the number of nodes ever inserted‡ probabilistic guarantee§ amortized guaranteeSorted arrayImplementationUnsorted listBSTRandomized BSTSplaylog NSearchNNlog N ‡log N §NInsert1Nlog N ‡log N §log NSearchNlog N †log Nlog N §NInsert1log N †log Nlog N §NDelete1log N †log Nlog N §Worst Case Average CaseRed-Black log N log N log N log N log NNDelete1Nlog N ‡log N §log NHashing N 1 1* 1* 1*N16Red-Black Trees in PracticeRed-black trees vs. splay trees.! Fewer rotations than splay trees.! One extra bit per node for color.Red-black trees vs. hashing.! Hashing code is simpler and usually faster:arithmetic to compute hash vs. comparison.! Hashing performance guarantee is weaker.! BSTs have more flexibility and can support wider range of ops.Red-black trees are widely used as system symbol tables.! Java: TreeMap, TreeSet.! C++ STL: map, multimap, multiset.possible to eliminateat most 2 per insertion17Red Black Tree: Java LibraryJava has built-in libraries for symbol tables.! TreeMap = red black tree implementation.Duplicate policy.! Java HashMap allows null values.! Our implementations forbid null values.import java.util.TreeMap;public class TreeMapDemo { public static void main(String[] args) { TreeMap<String, String> st = new TreeMap <String, String>(); st.put("www.cs.princeton.edu", "128.112.136.11"); st.put("www.princeton.edu", "128.112.128.15"); System.out.println(st.get("www.cs.princeton.edu")); }}18B-TreesB-Tree. Generalizes 2-3-4 trees by allowing up to M links per node.Main application: file systems.! Reading a page into memory from disk is expensive.! Accessing info on a page in memory is free.! Goal: minimize # page accesses.! Node size M = page size.Space-time tradeoff.! M large # only a few levels in tree.! M small # less wasted space.! Typical M = 1000, N < 1 trillion.Bottom line: number of page accesses is logMN per op.! 3 or 4 in practice!19B-Tree ExampleM = 5PageKey = Value = intinsert 27520B-Tree Example (cont)21Symbol Table: Implementations Cost SummaryB-Tree: Number of PAGE accesses is logMN per op.Sorted arrayImplementationUnsorted listBSTRandomized BSTSplaylog NSearchNNlog N ‡log N §NInsertNNlog N ‡log N §log NSearchNlog N †log Nlog N §N / 2InsertNlog N †log Nlog N §N / 2DeleteNlog N †log Nlog N §Worst Case Average CaseRed-Black log N log N log N log N log NNDeleteNNlog N ‡log N §log NB-Tree 1 1 1 1 11page accessesHashing N 1 1* 1* 1*N22B-Trees in the WildFile systems.! Windows: HPFS.! Mac: HFS, HFS+.! Linux: ReiserFS, XFS, Ext3FS, JFS.Databases.! Most common index type in modern databases.! ORACLE, DB2, INGRES, SQL, PostgreSQL, . . .Variants.! B trees: Bayer-McCreight (1972, Boeing)! B+ trees: all data in external nodes.! B* trees: keeps pages at least 2/3 full.! R-trees for spatial searching: GIS, VLSI.journaling23SummaryGoal: ST implementation with log N guarantee for all ops.! Probabilistic: randomized BST.! Amortized: splay tree.! Worst-case: red-black tree.! Algorithms are variations on a theme: rotations when inserting.Abstraction extends to give search algorithms for huge files.!
View Full Document