DOC PREVIEW
UE CS 215 - CS 215 ­ Fundamentals of Programming II

This preview shows page 1-2 out of 6 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 6 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 6 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 6 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

CS 215 - Fundamentals of Programming IISpring 2011 - Project 730 pointsOut: April 6, 2011Due: April 15, 2011 (Friday)This project is to build a Huffman coding tree and use it to encode and decode messages. Huffman Codes As you should know, characters are encoded in a computer in ASCII format. The ASCII format assigns 8 bits to each character. This makes it easy to determine where the bits for one character start and end in a text file. Although ASCII is convenient, it is also wasteful of space. If we look at a file of English text, we can see there are far fewer than 256 characters. (256 is the maximum number of characters 8 bits can encode.) To save space, it is possible to compress a text file by encoding each character with less than 8 bits. The assignment of such an encoding can be done in various different ways. One way is to design a Huffman code. A Huffman code is a variable-length encoding scheme (i.e., the code for each character is not necessarily the same length) and has the prefix property (no sequence of bits that represents a character is a prefix of a longer sequence for some other character). This allows the code to be represented using a binary tree (called a codetree) with the following properties: ● Every node is either a leaf or has exactly 2 children. ● Letters appear only at the leaves of the tree. ● Each letter appears at most once in the codetree; thus there is a unique root-to-leaf path encoding for each letter. For example, consider the following codetree: The code for each letter is a string formed by appending a '0' when taking a left branch and a '1' for a right branch when traversing the root-to-leaf path. In the codetree above, the code for 'u' is “010” (left, right, left), the code for 's' is “00”, and the code for 'n' is “10”. A message is encoded by appending the codes for letters in 04/05/2011 1 of 6 D. Hwangrootsuyn ithe message. For example, the code for the message “sun” is “0001010”, which is formed by appending the codes for 's', 'u', and 'n'. The development of a Huffman code is left as a research exercise for the reader. It is based on symbol frequency such that the shorter length codes are assigned to symbols with higher frequency, resulting in an optimal (i.e., fewest bits) encoding for texts exhibiting the same frequency. The construction algorithm builds the codetree from these frequencies. For this project, you will be given codefiles containing a Huffman code. Your assignment is to write a HuffmanTree class (specification given below) that can read a codefile and construct a codetree for use in encoding and decoding messages, and write a (main) program to encode and decode messages using a HuffmanTree object. BTNode<T> structThis project is to use the BTNode struct template and the Binary Tree Toolkit functions as covered in Lecture 33 (Monday, April 4) and defined in file bintree.h. The use of the toolkit functions is expected where applicable.Specifications for HuffmanTree Class Note: this is not a template class, though it does use the BTNode struct template. Attributes There is only one attribute of this class, a single BTNode<char> pointer variable that points to the root of the constructed codetree. Operations Explicit-value constructor - builds the codetree from an input stream connected to a codefile. ● Analysis Objects Type Movement Nameinput stream connected to codefileistream received & passed back codeFileThe codefile will consist of an integer n, followed by n lines of the form letter code where code is the Huffman encoding for letter. There may be two special “letters”, % and $, that represents the space character and the newline character, respectively, in the codefile (since we cannot distinguish a space or a newline as a message symbol versus a space or a newline as a delimiter between codefile items). The constructor is to read the codefile and build the codetree. For example, the codefile for the codetree above would be:5i 11n 10s 00u 010y 01104/05/2011 2 of 6 D. HwangThe basic algorithm is to read each letter (as a character) and its code (as a string) and use the code to traverse the tree (starting with the root) as one would do when decoding. If an interior node does not exist, create it and continue. When the end of the code is reached, you are at the leaf node that stores the letter. To aid in displaying the tree, the interior nodes should be constructed containing the character '*' for their values. The value for the node containing the space character should be ' ' (not %), and the value for the node containing the newline character should be '\n' (not $). Copy constructor - creates a copy of the source HuffmanTree● Analysis Objects Type Movement Nameoriginal tree HuffmanTree received sourceDestructor - deletes HuffmanTree nodes● Analysis - no objects operator= - assignment operator, make this HuffmanTree a copy as the source HuffmanTree, returns a reference to this object ● Analysis Objects Type Movement Nameoriginal tree HuffmanTree received sourcethis tree HuffmanTree & returned *thisWriteTree - writes out the codetree “sideways” to an output stream using WriteTreeHelper● Analysis Objects Type Movement Nameoutput stream ostream received & passed back outWriteTreeHelper - private recursive helper function to write out tree to an output stream using an RNL traversal (right, visit node, left). This function is the same as the toolkit tree_print function with the addition of an output stream to use rather than writing to cout directly. Each value should be written on a separate line with (8*depth) number of spaces before it. The space character should be written (displayed) as % (not as ' '), and the newline character should be written (displayed) as $ (not as '\n' or the actual newline character). ● Analysis 04/05/2011 3 of 6 D. HwangObjects Type Movement Nameoutput stream ostream received & passed back outroot of tree to write BTNode<char>* received rootdepth of node int received depthEncode - encodes message from an input stream, writing results to an output stream, using CharToBits ● Analysis Objects Type Movement Nameinput stream connected to message fileistream received & passed back messageFileoutput stream ostream received & passed back outThe basic algorithm for this function is to repeatedly read a character (including spaces) from the input stream, encode it using CharToBits, then write the encoding to the output stream.


View Full Document

UE CS 215 - CS 215 ­ Fundamentals of Programming II

Documents in this Course
Lecture 4

Lecture 4

14 pages

Lecture 5

Lecture 5

18 pages

Lecture 6

Lecture 6

17 pages

Lecture 7

Lecture 7

28 pages

Lecture 1

Lecture 1

16 pages

Lecture 5

Lecture 5

15 pages

Lecture 7

Lecture 7

28 pages

Load more
Download CS 215 ­ Fundamentals of Programming II
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view CS 215 ­ Fundamentals of Programming II and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view CS 215 ­ Fundamentals of Programming II 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?