CS61B Lecture #27Quicksort: Speed through ProbabilityExample of QuicksortPerformance of QuicksortQuick SelectionSelection ExampleSelection PerformanceBetter than N lg N?Beyond Comparison: Distribution CountingDistribution Counting ExampleRadix SortMSD Radix SortPerformance of Radix SortAnd Don't Forget Search TreesSummaryCS61B Lecture #27Today: Sorting, continued• Quicksort• Selection• Distribution counting• Radix sortsNext topic readings:Data Structures, Chapter 9.Public Service Announcement: Residential Computing is hiring. Be anRCC and get paid for your computer skills. Flexible Hours, Work Study:$11.97/hour Past RCC’s have gone on to places like Google, Apple, Mi-crosoft and eBay. For more information visit:http://rescomp.berkel ey.edu/rcchiring.Last modified: Wed Mar 22 17:16:35 2006 CS61B: Lecture #27 1Quicksort: Speed through ProbabilityIdea:•Partitiondata into pieces: everything > apivotvalue at the highend of the sequence to be sorted, and everythin g ≤ on the low end.• Repeat recursively on the high and low pieces.• For sp eed, stop when pieces are “small enough” and do insertion sorton the whole thing.• Reason: insertion sort has low constant factors. By des i g n, no itemwill move out of its will move out of its p i ece [why?], so when pie cesare small, #inversions is, too.• Have to choose pivot well. E.g.:medianof first, last and middleitems of sequence.Last modified: Wed Mar 22 17:16:35 2006 CS61B: Lecture #27 2Example of Quicksort• In this example, we continue until pieces are size ≤ 4.• Pivots for next step are starred. A rrange to move pivot to dividingline each time.• Last step is insertion sort.16 10 13 18 -4 -7 12 -5 19 15 0 222934-1*-4 -5 -7 -1 18 13 12 10 19 15 0 22293416*-4 -5 -7 -1 15 1312*10 0 16 19*222934 18-4 -5 -7 -1 10 0 12 15 13 16 18 19 2934 22• Now everything is “close to” right, so just do insertion sort:-7 -5 -4 -1 0 10 12 13 15 16 18 19 2229 34Last modified: Wed Mar 22 17:16:35 2006 CS61B: Lecture #27 3Performance of Quicksort• Probabalistic time:– If choice of pivots good, divide data in two each time: Θ(N lg N)with a good constant factor relative to merg e or h eap sort.– If choice of pivots bad, most items on one side each time: Θ(N2).– Ω(N lg N) in best case, so insertion sort better for nearly or-dered input sets.• Interesting point: randomly shuffling the data before sorting makesΩ(N2) timeveryunlikely!Last modified: Wed Mar 22 17:16:35 2006 CS61B: Lecture #27 4Quick SelectionThe Selection Problem: for given k, find kthsmallest element in data.• Obvious method: sort, select element #k, time Θ(N lg N).• If k ≤ some constant, can easily do in Θ(N) time:– Go through array, keep smallest k items.• GetprobablyΘ(N)timefor all k by adapting quicksort:– Partition around some pivot, p, as in quicksort, arrange that pivotends up at dividing line.– Suppose that in the result, pivot is at index m, all elements ≤pivot have indicies ≤ m.– If m = k, you’re done: p is answer.– If m > k, recursively select kthfrom left half of sequence.– If m < k, recursively select (k − m − 1)thfrom right half ofsequence.Last modified: Wed Mar 22 17:16:35 2006 CS61B: Lecture #27 5Selection ExampleProblem: Find just item #10 in the sorted version of arr a y:Initial contents:51 60 21 -4 37 4 49 1040*59 0 13 2 39 11 46 310Looking for #10 to left of pivot 40:13 31 21 -4 37 4* 11 10 39 2 0 40 59 51 4946600Looking for #6 to right of pivot 4:-4 0 2 4 37 13 11 10 39 2131* 40 59 51 4946604Looking for #1 to right of pivot 31:-4 0 2 4 21 13 11 10 31 3937 40 59 51 4946609Just two elements; just sort and return #1:-4 0 2 4 21 13 11 10 31 3739 40 59 51 4946609Result: 39Last modified: Wed Mar 22 17:16:35 2006 CS61B: Lecture #27 6Selection Performance• For this algorithm, if m roughly in middle each time, cost isC(N) =1, if N = 1,N + C(N/2), otherwise.= N + N/2 + . . . + 1= 2N − 1 ∈ Θ(N)• But in worst case, get Θ(N2), as for quicksort.• By another, non-obvious algorithm, can get Θ(N) worst-case timefor all k (take CS170).Last modified: Wed Mar 22 17:16:35 2006 CS61B: Lecture #27 7Better than N lg N?• Can prove thatif all you can do to keys is compare themthen sortingmust take Ω(N lg N).• Basic idea: there are N! possible ways the input data could bescrambled.• Therefore, your p rogram must be prepared to do N! different com-binations of move operations.• Therefore, there must be N! possible combinations of outcomes ofall the if tests in your program (we’re assuming that comparisons are2-way).• Since each if test goes two ways, number of possible different out-comes for k if tests is 2k.• Thus, need enough tests so that 2k> N!, which means k ∈ Ω(lg N!).• Using Stirling’s approximation,m! ∈√2πmmem1 + Θ1m,this tells us thatk ∈ Ω(N lg N).Last modified: Wed Mar 22 17:16:35 2006 CS61B: Lecture #27 8Beyond Comparison: Distribution Counting• But suppose can do more than compare keys?• For example, how can we sort a set of N integer keys whose valuesrange from 0 to kN, for some small constant k?• One technique:countthe number of items < 1, < 2, etc.• If Mp=#items with value < p, then in sorted order, the jthitemwith value p must be #Mp+ j.• Giveslinear-timealgorithm.Last modified: Wed Mar 22 17:16:35 2006 CS61B: Lecture #27 9Distribution Counting Example• Suppose all items are between 0 and 9 as in th i s example:7 0 4 0 9 1 9 1 9 5 3 7 3 1 6 7 4 2 030311223241516370839Counts0< 03< 16< 27< 39< 411< 512< 613< 716< 816< 9Running sum000 0 131 1 263 3 494 5116127137 7 9169 9• “Counts” line gives # occurrences of each key.• “Running sum” gives cumulative count of keys ≤ each value. . .• . . . which tells us where to put each key:• The first instance of key k goes into slot m, where m is the numberof key instances that are < k.Last modified: Wed Mar 22 17:16:35 2006 CS61B: Lecture #27 10Radix SortIdea: Sort keysone character at a time.• Can use distribution counting for each digit.• Can work eithe r right to left (LSD radix sort) or left to right (MSDradix sort)• LSD radix sort is venerable: used for punched cards.Initial: set, cat, cad, con, bat, can, be, let, betbe‘t’cad‘d’cancon‘n’betletbatcatset‘t’Pass 1(by char #2)be, cad, con, can, set, cat, bat, let, betbatcatcancad‘a’betletsetbe‘e’con‘o’Pass 2(by char #1)cad, can, cat, bat, be,
View Full Document