Elementary Sorting AlgorithmsSorting – DefinitionsSorting – DefinitionsSorting TerminologySorting CategoriesElementary Sorting MethodsSelection SortAlgorithm AnalysisInsertion SortSlide 10Worst-case AnalysisAverage-case AnalysisSlide 13Analysis of Inversions in PermutationsTheoremShellsortSlide 17Shell SortSlide 19Slide 20Slide 21How to Choose kSlide 23Slide 24Slide 25Slide 26Elementary Sorting AlgorithmsElementary Sorting AlgorithmsMany of the slides are from Prof. Plaisted’s resources at University of North Carolina at Chapel HillSorting – DefinitionsInput: n records, R1 … Rn , from a file.Each record Ri has a key Ki possibly other (satellite) information The keys must have an ordering relation that satisfies the following properties: Trichotomy: For any two keys a and b, exactly one of a b, a = b, or a b is true.Transitivity: For any three keys a, b, and c, if a b and b c, then a c.The relation = is a total ordering (linear ordering) on keys.Sorting – Definitions Sorting: determine a permutation = (p1, … , pn) of n records that puts the keys in non-decreasing order Kp1 < … < Kpn. Permutation: a one-to-one function from {1, …, n} onto itself. There are n! distinct permutations of n items.Rank: Given a collection of n keys, the rank of a key is the number of keys that precede it. That is, rank(Kj) = |{Ki| Ki < Kj}|. If the keys are distinct, then the rank of a key gives its position in the output file.Sorting TerminologyInternal (the file is stored in main memory and can be randomly accessed) vs. External (the file is stored in secondary memory & can be accessed sequentially only)Comparison-based sort: uses only the relation among keys, not any special property of the representation of the keys themselves.Stable sort: records with equal keys retain their original relative order; i.e., i < j & Kpi = Kpj pi < pj Array-based (consecutive keys are stored in consecutive memory locations) vs. List-based sort (may be stored in nonconsecutive locations in a linked manner)In-place sort: needs only a constant amount of extra space in addition to that needed to store keys.Sorting CategoriesSorting by Insertion insertion sort, shellsort Sorting by Exchange bubble sort, quicksortSorting by Selection selection sort, heapsort Sorting by Merging merge sortSorting by Distribution radix sortElementary Sorting MethodsEasier to understand the basic mechanisms of sorting.Good for small files.Good for well-structured files that are relatively easy to sort, such as those almost sorted.Can be used to improve efficiency of more powerful methods.Selection SortSelection-Sort(A, n)1. for i = n downto 2 do2. max i3. for j = i – 1 downto 1 do4. if A[max] < A[j] then5. max j6. t A[max]7. A[max] A[i]8. A[i] tSelection-Sort(A, n)1. for i = n downto 2 do2. max i3. for j = i – 1 downto 1 do4. if A[max] < A[j] then5. max j6. t A[max]7. A[max] A[i]8. A[i] tAlgorithm AnalysisIs it in-place?Is it stable?The number of comparisons is (n2) in all cases.Can be improved by a some modifications, which leads to heapsort (see next lecture).Insertion SortInsertionSort(A, n)1. for j = 2 to n do2. key A[j]3. i j – 14. while i > 0 and key < A[i]5. A[i+1] A[i]6. i i – 17. A[i+1] keyInsertionSort(A, n)1. for j = 2 to n do2. key A[j]3. i j – 14. while i > 0 and key < A[i]5. A[i+1] A[i]6. i i – 17. A[i+1] keyAlgorithm AnalysisIs it in-place?Is it stable?No. of Comparisons:If A is sorted: (n) comparisonsIf A is reverse sorted: (n2) comparisonsIf A is randomly permuted: (n2) comparisonsWorst-case AnalysisThe maximum number of comparisons while inserting A[i] is (i-1). So, the number of comparisons isCwc(n) i = 2 to n (i -1) = j = 1 to n-1 j = n(n-1)/2 = (n2) For which input does insertion sort perform n(n-1)/2 comparisons?Average-case AnalysisWant to determine the average number of comparisons taken over all possible inputs.Determine the average no. of comparisons for a key A[j].A[j] can belong to any of the j locations, 1..j, with equal probability.The number of key comparisons for A[j] is j–k+1, if A[j] belongs to location k, 1 < k j and is j–1 if it belongs to location 1. jjjjjkjjjkjjkjk1211121111)1(111111Average no. of comparisons for inserting key A[j] is:Average-case Analysis)( )(ln)( 11434 121)(22222nnOninniinCniniavgSumming over the no. of comparisons for all keys,Therefore, Tavg(n) = (n2)Analysis of Inversions in PermutationsInversion: A pair (i, j) is called an inversion of a permutation if i < j and (i) > (j).Worst Case: n(n-1)/2 inversions. For what permutation?Average Case: Let T = {1, 2, …, n } be the transpose of .Consider the pair (i, j) with i < j, there are n(n-1)/2 pairs.(i, j) is an inversion of if and only if (n-j+1, n-i+1) is not an inversion of T. This implies that the pair (, T) together have n(n-1)/2 inversions. The average number of inversions is n(n-1)/4.Theorem Theorem: Any algorithm that sorts by comparison of keys and removes at most one inversion after each comparison must do at least n(n-1)/2 comparisons in the worst case and at least n(n-1)/4 comparisons on the average. Theorem: Any algorithm that sorts by comparison of keys and removes at most one inversion after each comparison must do at least n(n-1)/2 comparisons in the worst case and at least n(n-1)/4 comparisons on the average. So, if we want to do better than (n2) , we have to remove more than a constant number of inversions with each comparison.ShellsortSimple extension of insertion sort.Created by Donald Shell in 1959the first sub quadratic sorting algorithmShell sort is a good choice for moderate amounts of dataStart with sub arrays created by looking at data that is far apart and then reduce the gap sizeGains speed by allowing exchanges with elements that
View Full Document