c Balaji Raghavachari 54 Algorithm Design and Analysis✬✫✩✪Classification of Sorting based on location of data• External sorting: data is stored i n disk.– Data size is too large for computer’s m e mory.– Time is dominated by disk acce ss.– Merge sort is t he best choice.• Internal sorting: data is stored within memory.– Many algorithms are available: Merge sort,Quick sort, Heap sort, Insertion sort,Counting sort, Bucket sort, Radix sort.– Choice of algorithm depends on application.– Make use of knowledge of distribution of da ta.c Balaji Raghavachari 55 Algorithm Design and Analysis✬✫✩✪Comparison-based sorting algorithms• Operations used for sorting: comparison, swap.• Most general paradigm; applies to a l l sorting problems.• O(n log n) algorithms: Mer ge Sort, Heap sort.• O(n log n) expected tim e: Quick sort• Algorithm that runs in O(n2) tim e that should be usedonly when the data size is small or if the data i s almostsorted: Insertion sort.c Balaji Raghavachari 56 Algorithm Design and Analysis✬✫✩✪Lower bound for comparison-based sorting• Consider a comparison-based sorting algorithm X on nelements. Behavior of X on all inputs of size n can becaptured by a decision tree:– A binary tree in which eac h internal node representsa comparison of two elements (some aiand aj).– Edges to children are marked ≤ and > depending onthe result of the comparison.– Leaves are represented by permutation of inputsequence that corresponds to the sorted order.• Worst-case R.T. of X ≥ height of its decision tree.• Every permutation of the input has to be representedby a leaf. Therefore number of leaves ≥ n!.• Height of tree ≥ log(n!) =Pnk=1log k = Ω(n log n).c Balaji Raghavachari 57 Algorithm Design and Analysis✬✫✩✪Summary: Comparison-based sorting• Any algorithm that i s comparison-based (compare andswap) requires Ω(n log n) i n the worst case to sort anarray of n element s.• Hence Mer gesort and Heapsort are optimal algorithms.• Decision tree argument can be ext ended to prove thatcomparison-based sorting algorithms require at leastΩ(n log n) time to sort half of n! pe r mutations of{1, . . . , n}, i.e., most inputs require Ω(n log n) time.• Lower bound theorem can be extended to show that theexpected R.T. of randomized algorithms i s Ω(n log n).c Balaji Raghavachari 58 Algorithm Design and Analysis✬✫✩✪Counting sort• Used when elements are in the range 1..k for a smallinteger k. Count how many 1’s, how many 2’s, etc.there are in the given array.• Example: array A[1..n] are records that need to besorted by field named “key” (whose range is 1..k).– Initialize an array L of k linked lists to be empty.– For i = 1 to n, insert A[i] into L[q], where q = A[i].key.– For j = 1 to k, output elements of L[j].• Running time is O(n + k), which is O(n) if k = O(n).• Algorithm is stable if the order of equal elements is thesame as in the input array; easily achieved by insertingat the end of the lists.• For another version of Counting sort, see next slidec Balaji Raghavachari 59 Algorithm Design and Analysis✬✫✩✪Counting sort algorithmCounting-Sort(A,n) // Running time is O(n+k)// Sort A[1..n] with field key integer in 1..k// Count how many of each 1..k appear in Afor j := 1 to k do C[j] := 0for i := 1 to n do C[A[i].key]++for j := 2 to k do C[j] := C[j] + C[j-1]// C[j] elements of A[1..n] less than or equal to j// j should occupy B[C[j-1]+1..C[j]] in sorted outputfor i := n downto 1 do // reverse order for stable sortingj := A[i].keyB[C[j]] := A[i]C[j]--Output Bc Balaji Raghavachari 60 Algorithm Design and Analysis✬✫✩✪Radix sort• Given an array of n d- digit numbers, where each digittakes k possible values, sort them:for i := 1 to d do // least significant digit to the mostUse counting sort on the numbers, by looking atonly digit i (stable sorting is important)• Loop invariant to show correctness of Radix sort:After i passes, the array is sorted if we consider onlythe first i digits• Running time is O(d(n + k). If d = O(1) and k = O(n),then the running tim e of Radix sort is O(n)c Balaji Raghavachari 61 Algorithm Design and Analysis✬✫✩✪Bucke t sort• Algorithm used for sor ting statistical samples.• Divide range [B0, Bn] into n “buckets”:[B0, B1), [B1, B2), . . . , [Bn−1, Bn].• For i = 1 to n, insert A[i] into bucket k ifBk−1≤ A[i].key < Bk.• Sort the elements within each bucket.• For k = 1 to n, output the elements in bucket k.• Algorithm’s success depends on two aspects:1. Finding k for a given A[i] quickly.2. Selecting the ranges such that the expected numberof element s in each bucket i s O(1).• Expected R.T. is O(n) if both conditions are
View Full Document