1Union-Find: A Data Structure for Disjoint Set Operations2Equivalence RelationsA equivalence relation R is defined on a set S, if for every pair of elements (a,b) in S, a R b is either false or truea R b is true iff:(Reflexive) a R a, for each element a in S(Symmetric) a R b if and only if b R a(Transitive) a R b and b R c implies a R cThe equivalence class of an element a (in S) is the subset of S that contains all elements related to a3Equivalence Relations: ExamplesElectrical cable connectivity networkCities belonging to the same country4The ProblemGiven a set S of n elements, [a1…an], compute all the equivalent class of all its elements5Properties of Equivalence ClassesOBSERVATION:Each element has to belong to exactly one equivalence classCOROLLARY:All equivalence classes are mutually disjointWhat we are after is the set of all “maximal”equivalence classes6Disjoint Set OperationsTo identify all equivalence classes1.Initially, put each each element in a set of its own2.Permit only two types of operations:–Find(x): Returns the equivalence class of x–Union(x, y): Merges the equivalence classes corresponding to elements x and y, if and only if x is “related” to y7Steps in the Union (x, y)1.EqClassx= Find (x)2.EqClassy= Find (y)3.EqClassxy= EqClass1U EqClass1Merge or union8A Simple Algorithm Using Union and Find operationsInitially, put each element in a set of its ownFOR EACH element pair (a,b):Check [a R b = true]IF a R b THENUnion(a,b)O(n2) iterations9InitializationInitially, each element is put in one set of its own Start with n sets10After Union(4,5)After Union(6,7)11After Union(5,6)12The Union-Find Data StructureTwo basic operationsFind (x)Union (x, y)The set can be input as an arrayUniquely identify each element by its array index(ie., value does not matter for find or union)13Union-Find Data Structure14• Entry s[i] points to ithparent •-1 means rootUnion performed arbitrarily16Analysis of the simple versionEach Union() takes only O(1) in the worst caseEach Find() also could take O(n) timeTherefore, m operations, where m>>n, would take O(mn) in the worst-case17Smarter Union AlgorithmsProblem with the arbitrary root attachment strategy in the simple approach is that:The tree, in the worst-case, could just grow along one long (O(n)) pathIdea: Prevent such long chains from happeningHow ? Enforce Union() to happen in a “balanced” way18Heuristic: Union-By-SizeAttach the root of the “smaller” tree to the root of the “larger” treeHow to Union(3,4) ?Size=4Size=119Union-By-Size: An arbitrary union:Good wayBad way20Alternative Heuristic: Union-By-HeightAttach the root of the “shallower” tree to the root of the “deeper” treeHow to Union(3,4) ?Height=2Height=021Union-By-Height: (aka “Union-By-Rank”)22How Good Are These Union Heuristics?Worst-case treeMaximum depth restricted to O(log n)23Code for Union-By-Rank24Analysis: Smart Union HeuristicsFind() now takes O(log n), while Union() still takes O(1) timeFor m operations: O(m log n) run-timeCan it be better?What is still causing the (log n) factor is the number of hops from each node to the rootIdea: Get the root as close as possible to each nodePath Compression!25Path Compression HeuristicDuring Find(x) operation:Update all the nodes along the path from x to the root point directly to the rootA two-pass algorithmrootxFind(x):It can be proven thatpath compression aloneensures that find(x) canbe achieved in O(log n) over m operations 1stPass2ndPass26Path Compression: Code27Heuristics & their GainsO(m Inv.Ackermann(m,n))= O(m log*n)Union-by-rank, Path compression FindO(m log n)Arbitrary Union, Path compression FindO(m log n)Union-by-rank,Simple FindO(m log n)Union-by-size, Simple FindO(m n)Arbitrary Union, Simple FindWorst-case run-time for m operationsExtremely slowGrowing function28What is Inverse Ackermann Function? A(1,j) = 2jfor j>=1A(i,1)=A(i-1,2) for i>=2A(i,j)= A(i-1,A(i,j-1)) for i,j>=2InvAck(m,n) = min{i | A(i,floor(m/n))>log N}InvAck(m,n) = O(log*n) A very slow functionEven Slower!(pronounced “log star n”)29How Slow is Inverse Ackermann Function? InvAck(m,n) = O(log*n) log*n = log log log log ……. nHow many times we have to repeatedly take log on n to make the value to 1?log*65536=4, but log*265536=5A very slow functionEven Slower!Union-by-Rank & Path-Compression: CodeInit()Find()Union()31An Application: Maze3233Example 2: Jigsaw PuzzlePicture Source: http://ssed.gsfc.nasa.gov/lepedu/jigsaw.htmlMerging Criterion: Visual & Geometric
View Full Document