Introduction to Algorithms6.046J/18.401JOrder statisticsRandomized divide-and-conquer algorithmExampleIntuition for analysisAnalysis of expected timeAnalysis (continued)Calculating expectationCalculating expectationCalculating expectationCalculating expectationCalculating expectationHairy recurrenceSubstitution methodSubstitution methodSubstitution methodSubstitution methodSummary of randomized order-statistic selectionWorst-case linear-time order statisticsChoosing the pivotChoosing the pivotChoosing the pivotChoosing the pivotAnalysisAnalysisAnalysisMinor simplificationDeveloping the recurrenceSolving the recurrenceConclusionsSeptember 28, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L6.1Introduction to Algorithms6.046J/18.401JLECTURE 6Order Statistics• Randomized divide and conquer• Analysis of expected time• Worst-case linear-time order statistics• AnalysisProf. Erik DemaineSeptember 28, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L6.2Order statisticsSelect the ith smallest of n elements (the element with rank i).• i = 1: minimum;• i = n: maximum;• i = ⎣(n+1)/2⎦ or ⎡(n+1)/2⎤: median.Naive algorithm: Sort and index ith element.Worst-case running time = Θ(n lg n) + Θ(1)= Θ(n lg n),using merge sort or heapsort (not quicksort).September 28, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L6.3Randomized divide-and-conquer algorithmRAND-SELECT(A, p, q, i) ⊳ ith smallest of A[p ..q] if p = q then return A[p]r ← RAND-PARTITION(A, p, q)k ← r – p + 1⊳ k = rank(A[r])if i = k then return A[r]if i < k then return RAND-SELECT(A, p, r – 1, i )else return RAND-SELECT(A, r + 1, q, i – k )≤ A[r]≤ A[r]≥ A[r]≥ A[r]rpqkSeptember 28, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L6.4ExampleSelect the i = 7th smallest:i = 7pivot6610101313558833221111k = 4Select the 7 – 4 = 3rd smallest recursively.2255336688131310101111Partition:September 28, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L6.5Intuition for analysis(All our analyses today assume that all elements are distinct.)Lucky:101log9/10==nnCASE 3T(n)= T(9n/10) + Θ(n)= Θ(n)Unlucky:T(n)= T(n –1) + Θ(n)= Θ(n2)arithmetic seriesWorse than sorting!September 28, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L6.6Analysis of expected timeThe analysis follows that of randomized quicksort, but it’s a little different.Let T(n) = the random variable for the running time of RAND-SELECT on an input of size n, assuming random numbers are independent.For k = 0, 1, …, n–1, define the indicator random variableXk =1 if PARTITION generates a k : n–k–1 split,0 otherwise.September 28, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L6.7Analysis (continued)T(n) = T(max{0, n–1}) + Θ(n) if 0:n–1 split,T(max{1, n–2}) + Θ(n) if 1:n–2 split,MT(max{n–1, 0}) + Θ(n) if n–1 : 0 split,()∑−=Θ+−−=10)(})1,(max{nkknknkTX.To obtain an upper bound, assume that the ith element always falls in the larger side of the partition:September 28, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L6.8Calculating expectation()⎥⎦⎤⎢⎣⎡Θ+−−=∑−=10)(})1,(max{)]([nkknknkTXEnTETake expectations of both sides.September 28, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L6.9Calculating expectation()()[]∑∑−=−=Θ+−−=⎥⎦⎤⎢⎣⎡Θ+−−=1010)(})1,(max{)(})1,(max{)]([nkknkknknkTXEnknkTXEnTELinearity of expectation.September 28, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L6.10Calculating expectation()()[][][ ]∑∑∑−=−=−=Θ+−−⋅=Θ+−−=⎥⎦⎤⎢⎣⎡Θ+−−=101010)(})1,(max{)(})1,(max{)(})1,(max{)]([nkknkknkknknkTEXEnknkTXEnknkTXEnTEIndependence of Xkfrom other random choices.September 28, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L6.11Calculating expectation()()[][][ ][]∑∑∑∑∑−=−=−=−=−=Θ+−−=Θ+−−⋅=Θ+−−=⎥⎦⎤⎢⎣⎡Θ+−−=1010101010)(1})1,(max{1)(})1,(max{)(})1,(max{)(})1,(max{)]([nknknkknkknkknnknkTEnnknkTEXEnknkTXEnknkTXEnTELinearity of expectation; E[Xk] = 1/n .September 28, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L6.12Calculating expectation()()[][][ ][][]⎣⎦)()(2)(1})1,(max{1)(})1,(max{)(})1,(max{)(})1,(max{)]([12/1010101010nkTEnnnknkTEnnknkTEXEnknkTXEnknkTXEnTEnnknknknkknkknkkΘ+≤Θ+−−=Θ+−−⋅=Θ+−−=⎥⎦⎤⎢⎣⎡Θ+−−=∑∑∑∑∑∑−=−=−=−=−=−=Upper terms appear twice.September 28, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L6.13Hairy recurrence[]⎣⎦)()(2)]([12/nkTEnnTEnnkΘ+=∑−=(But not quite as hairy as the quicksort one.)Prove: E[T(n)] ≤ cn for constant c > 0 .• The constant c can be chosen large enough so that E[T(n)] ≤ cn for the base cases.Use fact:⎣⎦212/83nknnk∑−=≤(exercise).September 28, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L6.14Substitution method[]⎣⎦)(2)(12/ncknnTEnnkΘ+≤∑−=Substitute inductive hypothesis.September 28, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L6.15Substitution method[]⎣⎦)(832)(2)(212/nnncncknnTEnnkΘ+⎟⎠⎞⎜⎝⎛≤Θ+≤∑−=Use fact.September 28, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L6.16Substitution method[]⎣⎦⎟⎠⎞⎜⎝⎛Θ−−=Θ+⎟⎠⎞⎜⎝⎛≤Θ+≤∑−=)(4)(832)(2)(212/ncncnnnncncknnTEnnkExpress as desired – residual.September 28, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L6.17Substitution method[]⎣⎦cnncncnnnncncknnTEnnk≤⎟⎠⎞⎜⎝⎛Θ−−=Θ+⎟⎠⎞⎜⎝⎛≤Θ+≤∑−=)(4)(832)(2)(212/,if c is chosen large enough so that cn/4 dominates the Θ(n).September 28, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L6.18Summary of randomized order-statistic selection• Works fast: linear expected time.• Excellent algorithm in practice.• But, the worst case is very bad: Θ(n2).Q. Is there an algorithm that runs in linear time in the worst case?IDEA: Generate a good pivot recursively.A. Yes, due to Blum, Floyd, Pratt, Rivest, and Tarjan [1973].September 28, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L6.19Worst-case linear-time order statisticsif i = k then return xelseif i < k
View Full Document