DOC PREVIEW
MIT 6 050J - Codes

This preview shows page 1-2-20-21 out of 21 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 21 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 21 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 21 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 21 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 21 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

2 Codes2.1 Symbol Space Size2.2 Fixed-Length and Variable-Length Codes2.2.1 Fixed Length2.2.2 Variable Length2.3 Use of Spare Capacity2.3.1 Binary Coded Decimal (BCD)2.3.2 Genetic Code2.3.3 Telephone Area Codes2.3.4 IP Addresses2.3.5 ASCII2.4 Extension of Codes2.5 Detail: ASCII2.6 Detail: Integer Codes2.7 Detail: The Genetic Code*2.8 Detail: IP Addresses2.9 Detail: Morse CodeChapter 2CodesIn the previous chapter we examined the fundamental unit of information, the bit, and its various abstractrepresentations: the Boolean bit (with its associated Boolean algebra and realization in combinational logiccircuits), the control bit, the physical bit, the quantum bit, and the classical bit.A single bit is useful if exactly two answers to a question are possible. Examples include the result of acoin toss (heads or tails), the gender of a person (male or female), the verdict of a jury (guilty or not guilty),and the truth of an assertion (true or false). Most situations in life are more complicated. This chapter isconcerned with how complex objects can be represented not by a single bit, but by arrays of bits.It is convenient to focus on a very simple model of a system, shown in Figure 2.1, in which the input isone of a predetermined set of objects, or “symbols,” the identity of the particular symbol chosen is encodedin an array of bits, these bits are transmitted through space or time, and then are decoded at a later timeor in a different place to determine which symbol was originally chosen. In later chapters we will augmentthis model to deal with issues of robustness and efficiency.CoderChannelDecoder- - - -(Arrays of Bits) (Arrays of Bits)Input(Symbols)Output(Symbols)Figure 2.1: Simple model of a communication systemThe basic idea behind the design of codes is very simple: assign a different sequence of bits to eachsymbol being encoded. However, there are many ways to do this, some better than others. In this chapterwe will look at several aspects of code design, and show cases where the code was designed either well or notso well.Objects for which codes are include:• Letters of the alphabet: BCD, EBCDIC, ASCII, Unicode, Morse CodeAuthor: Paul Penfield, Jr.This document: http://www.mtl.mit.edu/Courses/6.050/2010/notes/chapter2.pdfVersion 1.6, February 2, 2010. Copyrightc 2010 Massachusetts Institute of TechnologyStart of notes · back · next | 6.050J/2.110J home page | Site map | Search | About this document | Comments and inquiries92.1 Symbol Space Size 10• Integers: Binary, Gray, 2’s complement• Numbers: Floating-Point• Proteins: Genetic Code• Telephones: NANP, International codes• Hosts: Ethernet, IP Addresses, Domain names• Images: TIFF, GIF, and JPEG• Audio: MP3• Video: MPEG2.1 Symbol Space SizeIn designing a code, the first question to ask is how many symbols need to be encoded. This is called thesymbol space size. We will consider symbol spaces of different sizes:• 1• 2• Integral power of 2• Finite• Infinite, Countable• Infinite, UncountableIf there is only 1 symbol, no bits are required to specify it (the symbol is already known). If there areexactly 2 symbols, the selection can be encoded in a single bit. If the number of possible symbols is 4, 8, 16,32, 64, or another integral power of 2, then the selection may be coded in the number of bits equal to thelogarithm, base 2, of the symbol space size. Thus 2 bits can designate the suit (clubs, diamonds, hearts, orspades) of a playing card, and 5 bits can encode the selection of one student in a class of 32. A dreidel isa four-sided toy marked with Hebrew letters, spun like a top in a children’s game, especially at Hanukkah.The result of each spin could be encoded in 2 bits.If the number of symbols is finite but not an integral power of 2, then the number of bits that would workfor the next higher integral power of 2 can be used to encode the selection, but there will be some unusedbit patterns. Examples include the 10 digits, the six faces of a cubic die, the 13 denominations of a playingcard, and the 26 letters of the English alphabet. In each case, there is spare capacity (6 unused patterns inthe 4-bit representation of digits, 2 unused patterns in the 3-bit representation of a die, etc.) What to dowith this spare capacity is an important design issue that will be discussed in the next section.If the number of symbols is infinite but countable (able to be put into a one-to-one relation with theintegers) then a bit string of a given length can only denote a finite number of items from this infinite set.Thus, a 4-bit code for non-negative integers might designate integers from 0 through 15, but would not beable to handle integers outside this range. If, as a result of some computation, it were necessary to representlarger numbers, then this “overflow” condition would have to be handled in some way.If the number of symbols is infinite and uncountable (such as the value of a physical quantity like voltageor acoustic pressure) then some technique of “discretization” must be used to replace a possible value by oneof a finite number of preselected values that is approximately the same. For example, if the numbers between0 and 1 were the symbols and if 2 bits were available for the coded representation, one approach might beto approximate all numbers between 0 and 0.25 by the number 0.125, all numbers between 0.25 and 0.5 by2.2 Fixed-Length and Variable-Length Codes 110.375, and so on. Whether such an approximation is adequate depends on how the decoded data is used.The approximation is not reversible, in that there is no decoder which will recover the original symbol givenjust the code for the approximate value. However, if the number of bits available is large enough, then formany purposes a decoder could provide a number that is close enough. Floating-point representation of realnumbers in computers is based on this philosophy.2.2 Fixed-Length and Variable-Length CodesA decision must be made very early in the design of a code whether to represent all symbols with codesof the same number of bits (fixed length) or to let some symbols use shorter codes than others (variablelength). Both schemes have their advantages.Fixed-length codes are usually easier to deal with because both the coder and decoder know in advancehow many bits are involved. With variable-length codes, the decoder needs to tell when the code for onesymbol ends and the next one begins. On the other hand,


View Full Document

MIT 6 050J - Codes

Download Codes
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Codes and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Codes 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?