DOC PREVIEW
Princeton COS 226 - Advanced Topics in Sorting

This preview shows page 1-2-3 out of 9 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 9 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 9 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 9 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 9 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

Algorithms in Java, 4th Edition·Robert Sedgewick and Kevin Wayne·Copyright © 2008·February 20, 2008 12:17:14 AMAdvanced Topics in Sorting‣selection‣duplicate keys‣system sorts‣comparators2SelectionGoal. Find the kth largest element.Ex. Min (k = 0), max (k = N-1), median (k = N/2).Applications.•Order statistics.•Find the “top k.”Use theory as a guide.•Easy O(N log N) upper bound.•Easy O(N) upper bound for k = 1, 2, 3.•Easy Ω(N) lower bound.Which is true?•Ω(N log N) lower bound?•O(N) upper bound?is selection as hard as sorting?is there a linear-time algorithm for all k?Partition array so that:•Element a[i] is in place.•No larger element to the left of i.•No smaller element to the right of i.Repeat in one subarray, depending on i; finished when i equals k.3Quick-selectpublic static Comparable select(Comparable[] a, int k){ StdRandom.shuffle(a); int lo = 0, hi = a.length - 1; while (hi > lo) { int i = partition(a, lo, hi); if (i < k) lo = i + 1; else if (i > k) hi = i - 1; else return a[k]; } return a[k];}vvlo hilo hi v vibeforeafterif a[k] is here,set hi to i-1if a[k] is here,set lo to i+14Quick-select: mathematical analysisProposition. Quick-select takes linear time on average.Pf sketch.•Intuitively, each partitioning step roughly splits array in half:N + N/2 + N/4 + … + 1 ~ 2N compares.•Formal analysis similar to quicksort analysis yields:Ex. (2 + 2 ln 2) N compares to find the median.Remark. Quick-select might use ~ N2/2 compares, but as with quicksort,the random shuffle provides a probabilistic guarantee.CN = 2 N + k ln ( N / k) + (N - k) ln (N / (N - k))5Theoretical context for selectionChallenge. Design a selection algorithm whose running time is linear in the worst-case.Theorem. [Blum, Floyd, Pratt, Rivest, Tarjan, 1973] There exists a compare-based selection algorithm that takes linear time in the worst case.Remark. Algorithm is too complicated to be useful in practice.Use theory as a guide.•Still worthwhile to seek practical linear-time (worst-case) algorithm.•Until one is discovered, use quick-select if you don’t need a full sort.6Generic methodsIn our select() implementation, client needs a cast.The compiler is also unhappy.Q. How to fix? % javac Quick.java Note: Quick.java uses unchecked or unsafe operations. Note: Recompile with -Xlint:unchecked for details. Double[] a = new Double[N]; for (int i = 0; i < N; i++) a[i] = StdRandom.uniform(); Double median = (Double) Quick.select(a, N/2);hazardous castrequired7Generic methodsSafe version. Compiles cleanly, no cast needed in client.Remark. Obnoxious code needed in system sort; not in this course (for brevity).public class Quick{ public static <Key extends Comparable<Key>> Key select(Key[] a, int k) { /* as before */ } public static <Key extends Comparable<Key>> void sort(Key[] a) { /* as before */ } private static <Key extends Comparable<Key>> int partition(Key[] a, int lo, int hi) { /* as before */ } private static <Key extends Comparable<Key>> boolean less(Key v, Key w) { /* as before */ } private static <Key extends Comparable<Key>> void exch(Key[] a, int i, int j) { Key swap = a[i]; a[i] = a[j]; a[j] = swap; } } generic type variable(value inferred from argument a[])return type matches array typecan declare variables of generic type‣selection‣duplicate keys‣comparators‣applications89Duplicate keysOften, purpose of sort is to bring records with duplicate keys together.•Sort population by age.•Find collinear points.•Remove duplicates from mailing list.•Sort job applicants by college attended. Typical characteristics of such applications.•Huge file.•Small number of key values.see Assignment 3Chicago 09:00:00Phoenix 09:00:03Houston 09:00:13Chicago 09:00:59Houston 09:01:10Chicago 09:03:13Seattle 09:10:11Seattle 09:10:25Phoenix 09:14:25Chicago 09:19:32Chicago 09:19:46Chicago 09:21:05Seattle 09:22:43Seattle 09:22:54Chicago 09:25:52Chicago 09:35:21Seattle 09:36:14Phoenix 09:37:44Chicago 09:00:00Chicago 09:00:59Chicago 09:03:13Chicago 09:19:32Chicago 09:19:46Chicago 09:21:05Chicago 09:25:52Chicago 09:35:21Houston 09:00:13Houston 09:01:10Phoenix 09:00:03Phoenix 09:14:25Phoenix 09:37:44Seattle 09:10:11Seattle 09:10:25Seattle 09:22:43Seattle 09:22:54Seattle 09:36:14Chicago 09:25:52Chicago 09:03:13Chicago 09:21:05Chicago 09:19:46Chicago 09:19:32Chicago 09:00:00Chicago 09:35:21Chicago 09:00:59Houston 09:01:10Houston 09:00:13Phoenix 09:37:44Phoenix 09:00:03Phoenix 09:14:25Seattle 09:10:25Seattle 09:36:14Seattle 09:22:43Seattle 09:10:11Seattle 09:22:54Stability when sorting on a second keysortedsorted by time sorted by city (unstable) sorted by city (stable)NOTsortedkey10Duplicate keysMergesort with duplicate keys. Always ~ N lg N compares.Quicksort with duplicate keys.•Algorithm goes quadratic unless partitioning stops on equal keys!•1990s C user found this defect in qsort().several textbook and system implementationsalso have this defectDuplicate keys: the problemAssume all keys are equal. Recursive code guarantees this case predominates!Mistake. Put all keys equal to the partitioning element on one side.Consequence. ~ N2 / 2 compares when all keys equal.Recommended. Stop scans on keys equal to the partitioning element.Consequence. ~ N lg N compares when all keys equal.Desirable. Put all keys equal to the partitioning element in place.11B A A B A B B B C C C A A A A A A A A A A AB A A B A B C C B C B A A A A A A A A A A AA A A B B B B B C C C A A A A A A A A A A AGoal. Partition array into 3 parts so that:•Elements between lt and gt equal to partition element v.•No larger elements to left of lt.•No smaller elements to right of gt.Dutch national flag problem. [Edsger Dijkstra]•Convention wisdom until mid 1990s: not worth doing.•New approach discovered when fixing mistake in C library qsort().•Now incorporated into qsort() and Java system sort.123-way partitioningv>v<v=vlo hilt gtlo hi3-way partitioningbeforeafter133-way partitioning: Dijkstra's solution3-way partitioning.•Let v be partitioning element a[lo].•Scan i from left to right.-a[i] less than v : exchange a[lt] with a[i] and increment both lt and i-a[i] greater than v : exchange a[gt] with a[i] and decrement gt-a[i] equal to v : increment iAll the right properties.•In-place.•Not much code.•Small overhead if no equal


View Full Document

Princeton COS 226 - Advanced Topics in Sorting

Documents in this Course
QUICKSORT

QUICKSORT

14 pages

QUICKSORT

QUICKSORT

55 pages

STRINGS

STRINGS

69 pages

Lecture

Lecture

4 pages

STRINGS

STRINGS

18 pages

Hashing

Hashing

7 pages

MERGESORT

MERGESORT

14 pages

Quicksort

Quicksort

12 pages

Strings

Strings

10 pages

Overview

Overview

15 pages

Hashing

Hashing

13 pages

Mergesort

Mergesort

15 pages

Tries

Tries

13 pages

Final

Final

12 pages

Final

Final

12 pages

Mergesort

Mergesort

13 pages

Final

Final

10 pages

Load more
Download Advanced Topics in Sorting
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Advanced Topics in Sorting and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Advanced Topics in Sorting 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?