DOC PREVIEW
MIT 6 375 - Study Guide

This preview shows page 1-2-17-18-19-36-37 out of 37 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 37 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 37 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 37 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 37 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 37 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 37 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 37 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 37 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

Behram Mistree & Dmitry Kashlev6.375 Final Project ReportGZIP Encoding and Decoding in HardwareGZIP IntroductionGZIP is a software application used for file compression. It is widely used by many UNIX systems. GZIP provides an effectively lossless data compression. GZIP is based on DEFLATE algorithm which is a combination of LZ77 and Huffman coding. The DEFLATE algorithm is specified in RFC1951, while GZIP file format is specified by RFC1952.Figure 1 below describes the general architecture of GZIP. GZIP contains an encoder and a decoder, each of which have Huffman and LZ77 modules. The text input is fed into LZ77 encoder, and output of LZ77 encoder is connected to the input of Huffman encoder. For the purpose of our final project we connected the output of Huffman encoder to input of Huffman decoder. This is where the data becomes compressed. The output of Huffman decoder, in turn, is connected to input of LZ77 decoder. The output of LZ77 decoder is the uncompressed text.1LZ77 Overview:Suppose that you wanted to encode a file that consisted of the sentence “I really, really want to encode a sentence,” repeated 1000 times. You could determine the optimal way of encoding the sentence and then repeat that optimal encoding 1000 times. LZ77 eschews this method and instead encodes the first appearance of the sentence and then replaces all subsequent repetitions of the original sentence with “pointers” to the first sentence. For instance:I really, really want to encode a sentenceI really, really want to encode a sentenceI really, really want to encode a sentenceI really, really want to encode a sentenceI really, really want to encode a sentenceI really, really want to encode a sentencewould be encoded as:I really, really want to encode a sentence<Pointer to sentence 1><Pointer to sentence 1><Pointer to sentence 1>2LZ77 EncoderHuffman EncoderInput FileCompressed FileHuffman DecoderLZ77 DecoderCompressed FileOriginal FileFigure 1: General architecture<Pointer to sentence 1><Pointer to sentence 1>A pointer has two parts to its structure: a distance and a length. The distance is a value ranging from 1 to 32,768 and corresponds to how many spaces prior to the current position the repeated snippet of text is located. The length is a value ranging from 3 to 258 bytes that represents how long the string to insert is. Revisiting our example, because the length of the “I really, really want to encode a sentence” is 42, we would get:I really, really want to encode a sentence<Distance: 42; Length: 42><Distance: 42; Length: 42><Distance: 42; Length: 42><Distance: 42; Length: 42><Distance: 42; Length: 42>The basic principle behind LZ77 encoding is that one can replace a phrase that appears multiple times in a text with a pointer to the previous occurrence of that same piece of text. The LZ77 specification requires that the encoder have access to either 2KB, 4KB, 8KB, or 32KB of data prior to the current character or word being encoded. Because the default option for most software implemented GZIP encoders requires access to 32KB of data, we built our hardware using a 32KB window of accessible values. (However, a single trivial parameter change in our code would allow for the 2KB, 4KB, and 8KB cases.)We need some way to keep track of the previous 32KB of data. To accomplish this goal, we built a separate RAM module that allows reads and writes. To assure a good model of real-world memory, this 3decoupled memory module does not allow combinational reads. We built a controller module for the RAM module called Memory Checker. Its primary purpose is to check whether valid locations in memory contain characters equal to those passed into it by calls to its methods. In addition, Memory Checker is responsible for ensuring that, during writes, characters are inserted into the correct positions in the RAM memory module.The interface with comments for the Memory Checker module is provided below. method Action write_next( Char data ); write_next takes in a character of data and puts it in the appropriate space in memory. method Action check_equals (Rindx rindx, Char val); method ActionValue#(Bool) get_equals(); check_equals takes in an memory index position (rindx) and a character. The corresponding call get_equals returns whether the character value passed in was equal to the element stored in memory at position rindx. method Action set_find_triplet( Char data1, Char data2, Char data3 ); method ActionValue#(Maybe#(TripletReturner)) get_find_triplet(); set_find_triplet takes in three characters. The corresponding call to get_find_triplet returns an invalid data type if all three characters were not found consecutively in the memory. If all three characters were found consecutively in the memory, get_find_triplet returns the index of the memory where the first of the three characters were found tagged valid. The set_find_triplet and get_find_triplet methods are the crux of the Memory Check module. Figure 2 presents a high level block diagram of the inner workings of Memory Check when set_find_triplet is called. Briefly: 1. set_find_triplet: A call to set_find_triplet initializes register search_ptr to the earliest memory index written to.42. memory request: The Memory Check module requests data from the locations in memory with indices search_ptr, search_ptr +1, and search_ptr +2.3. Memory Response: Memory Response returns values.a) If the memory response contains values equal all three of the values passed in to set_find_triplet, then we set get_find_triplet to return search_ptr tagged valid.b) Check search_ptr: If the memory response contains values that do not equal all three of the values passed in to set_find_triplet, then we increment search_ptr.i. If we already checked the character value stored in the RAM index search_ptr +2, then we set get_find_triplet to return invalid data.ii. If we have not already checked the character value stored in the RAM index search_ptr +2, then we go back to step 2, memory request.56Figure 2: High level block diagram of logic for set_find_triplet and get_find_triplet methods of Memory Checker.7Figure 3: Semi-hardware level diagram of Memory Checker module.We have created complete drawings by hand of the hardware generated by the LZ77 encoder (minus compiler optimizations). They provide a very detailed and specific understanding of how the hardware for the LZ77 encoder works. Such


View Full Document

MIT 6 375 - Study Guide

Documents in this Course
IP Lookup

IP Lookup

15 pages

Verilog 1

Verilog 1

19 pages

Verilog 2

Verilog 2

23 pages

Encoding

Encoding

21 pages

Quiz

Quiz

10 pages

IP Lookup

IP Lookup

30 pages

Load more
Download Study Guide
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Study Guide and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Study Guide 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?