CompSci 100E26.1Data Compressionÿ Compression is a high-profile application .zip, .mp3, .jpg, .gif, .gz, … What property of MP3 was a significant factor in what made Napster work?ÿ Why do we care? Secondary storage capacity doubles every year Disk space fills up quickly on every computer system More data to compress than ever beforeCompSci 100E26.2More on Compressionÿ What’s the difference between compression techniques? .mp3 files and .zip files? .gif and .jpg? Lossless and lossyÿ Is it possible to compress (lossless) every file? Why?ÿ Lossy methods Good for pictures, video, and audio (JPEG, MPEG, etc.)ÿ Lossless methods Run-length encoding, Huffman, LZW, …11 3 5 3 2 6 2 6 5 3 5 3 5 3 10CompSci 100E26.3Priority Queueÿ Compression motivates the study of the ADT priority queue Supports two basic operationso insert -- an element into the priority queueo delete -- the minimal element from the priority queue Implementations may allow getmin separate from deleteo Analogous to top/pop, front/dequeue in stacks, queuesÿ See PQDemo.java code below sorts, complexity?Scanner s;PriortyQueue pq = new PriorityQueue();while (s.hasNext()) pq.add(s.next());while (pq.size() > 0) {System.out.println(pq.remove());}CompSci 100E26.4Priority Queue implementationsÿ Implementing priority queues: average and worst caseInsert worstHeapGetmin(delete)worstBalanced treeSearch treeSorted vectorUnsorted vectorGetmin(delete)averageInsert averagelog n log n O(n) O(n)O(1) log n log nlog nlog nlog n log n log nO(n)O(1) O(n)O(1)O(n) O(1) O(n) O(1) Heap has O(1) find-min (no delete) and O(n) build heapCompSci 100E26.5PriorityQueue.java (Java 5)ÿ What about objects inserted into pq? If deletemin is supported, what properties must inserted objects have, e.g., insert non-comparable? Change what minimal means? Implementation uses heapÿ If we use a Comparator for comparing entries we can make a min-heap act like a max-heap, see PQDemo Where is class Comparator declaration? How used? What's a static inner class? A non-static inner class?ÿ In Java 5 there is a Queue interface and PriorityQueue class The PriorityQueue class also uses a heapCompSci 100E26.6Sorting w/o Collections.sort(…)public static void sort(ArrayList a){PriorityQueue pq = new PriorityQueue();for(int k=0;k<a.size();k++)pq.add(a.get(k));for(int k=0;k<a.size();k++)a.set(k,pq.remove());}ÿ How does this work, regardless of pqueue implementation?ÿ What is the complexity of this method? add O(1), remove O(log n)? If add O(log n)? heapsort uses array as the priority queue rather than separate pqobject. From a big-Oh perspective no difference: O(n log n)o Is there a difference? What’s hidden with O notation?CompSci 100E26.7Priority Queue implementationÿ PriorityQueue uses heaps, fast and reasonably simple Why not use inheritance hierarchy as was used with Map? Trade-offs when using HashMap and TreeMap:o Time, spaceo Ordering properties, e.g., what does TreeMap support?ÿ Changing method of comparison when calculating priority? Create object to replace, or in lieu of compareToo Comparable interface compares this to passed object o Comparator interface compares two passed objects Both comparison methods: compareTo() and compare()o Compare two objects (parameters or self and parameter)o Returns –1, 0, +1 depending on <, ==, >CompSci 100E26.8Heap Definitionÿ Heap is an array-based implementation of a binary tree used for implementing priority queues, supports: insert, findmin, deletemin: complexities?ÿ Using array minimizes storage (no explicit pointers), faster too --- children are located by index/position in arrayÿ Heap is a binary tree with shape property, heap/value property shape: tree filled at all levels (except perhaps last) and filled left-to-right (complete binary tree) each node has value smaller than both
View Full Document