March 29, 2005 Lecture 12: Closest Pair 1Closest PairPiotr IndykMarch 29, 2005 Lecture 12: Closest Pair 2Closest Pair• Find a closest pair among p1…pn∈Rd• Easy to do in O(dn2) time – For all pi≠pj, compute ||pi– pj|| and choose the minimum• We will aim for better time, as long as d is “small”• For now, focus on d=2March 29, 2005 Lecture 12: Closest Pair 3Divide and conquer• Divide: – Compute the median of x-coordinates– Split the points into PLand PR, each of size n/2• Conquer: compute the closest pairs for PL and PR• Combine the results (the hard part)March 29, 2005 Lecture 12: Closest Pair 4Combine• Let k=min(k1,k2)• Observe:– Need to check only pairs which cross the dividing line– Only interested in pairs within distance < k• Suffices to look at points in the 2k-width strip around the median linek1k22kMarch 29, 2005 Lecture 12: Closest Pair 5Scanning the strip• Sort all points in the strip by their y-coordinates, forming q1…qt, t ≤ n.• Let yibe the y-coordinate of qi• kmin= k• For i=1 to t– j=i-1– While yi-yj< k• If ||qi–qj||<kminthen kmin=||qi–qj||• j:=j-1• Report kmin(and the corresponding pair)kkkkkMarch 29, 2005 Lecture 12: Closest Pair 6Analysis• Correctness: easy• Running time is more involved• Can we have many qj’s that are within distance k from qi?•No• Proof by packingargumentkMarch 29, 2005 Lecture 12: Closest Pair 7Analysis, ctd.Theorem: there are at most 7 qj’s , j<i, such that yi-yj≤ k.Proof:• Each such qjmust lie either in the left or in the right k × ksquare• Within each square, all points have distance distance ≥ k from others• We can pack at most 4 such points into one square, so we have 8 points total (incl. qi) qiMarch 29, 2005 Lecture 12: Closest Pair 8At most 4 • Split the square into 4 sub-squares of size k/2 × k/2• Diameter of each square is k/21/2 < k → at most one point per sub-squareMarch 29, 2005 Lecture 12: Closest Pair 9Running time• Divide: O(n)• Combine: O(n log n) because we sort by y• However, we can:– Sort all points by y at the beginning– Divide preserves the y-order of pointsThen combine takes only O(n)• We get T(n)=2T(n/2)+O(n), so T(n)=O(n log n)March 29, 2005 Lecture 12: Closest Pair 10Higher dimensions (sketch)• Divide: split P into PLand PRusing the hyperplane x=t• Conquer: as before• Combine:– Need to take care of points with x in [t-k,t+k]– This is essentially the same problem, but in d-1dimensions– We get: • T(n,d)=2T(n/2, d)+T(n,d-1)• T(n,1)=Od(1) n– Solves to: T(n,d)=n logd-1nMarch 29, 2005 Lecture 12: Closest Pair 11Closest Pair with Help• Given: P={p1…pn} of points from Rd, such that the closest distance is in (t,c t]• Goal: find the closest pair• Will give an O((2c d1/2)dn) time algorithm• Note: by scaling we can assume t=1March 29, 2005 Lecture 12: Closest Pair 12Algorithm• Impose a cubic grid onto Rd, where each cell is a 1/d1/2×1/d1/2cube• Put each point into a bucket corresponding to the cell it belongs to• Diameter of each cell is ≤1, so at most one point per cell• For each p∈P, check all points in cells intersecting a ball B(p,c)• How many cells are there ?– All are contained in a d-dimensional box of side 2(c+1/d1/2) ≤ 2(c+1)– At most (2 d1/2(c+1) )dsuch cells• Total time: O((2c d1/2)dn)pMarch 29, 2005 Lecture 12: Closest Pair 13How to find good t ?• Repeat:– Choose a random point p in P– Let t=t(p)=D(p,P-{p})– Impose a grid with side t’< t/(1+d1/2), i.e., such that any pair of adjacent cells has diameter <t– Put the points into the grid cells– Remove all points whose all adjacent cells are empty• Until P is empty• Observation: the values t are decreasingpMarch 29, 2005 Lecture 12: Closest Pair 14Correctness• Consider t computed in the last iteration– There is a pair of points with distance t (it defines t)– There is no pair of points with distance t’ or less (otherwise they would have been placed in adjacent cells, and the algorithm would have continued)– We get c=t/t’~ 2 d1/2March 29, 2005 Lecture 12: Closest Pair 15Running time• Consider t(p1)…t(pm)• An iteration is lucky if t(pi) ≥ t for at last half of points pi• The probability of being lucky is ≥1/2• Expected #iterations till a lucky one is ≤2• After we are lucky, the number of points is ≤ m/2• Total expected time = 3dtimes
View Full Document