MIT 6 050J - Codes - D2009288

Home> Schools> Massachusetts Institute of Technology> Electrical Engineering and Computer Science (6) > 6 050J> Codes

DOC PREVIEW

MIT 6 050J - Codes

School name Massachusetts Institute of Technology

Course 6 050j- Information and Entropy

Pages 20

This preview shows page 1-2-19-20 out of 20 pages.

Save

View full document

Premium Document

Do you want full access? Go Premium and unlock all 20 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 20 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 20 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 20 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

Premium Document

Do you want full access? Go Premium and unlock all 20 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

Unformatted text preview:

2 Codes2.1 Symbol Space Size2.2 Use of Spare Capacity2.2.1 Binary Coded Decimal (BCD)2.2.2 Genetic Code2.2.3 Telephone Area Codes2.2.4 IP Addresses2.2.5 ASCII2.3 Extension of Codes2.4 Fixed-Length and Variable-Length Codes2.4.1 Morse Code2.5 Detail: ASCII2.6 Detail: Integer Codes2.7 Detail: The Genetic Code*2.8 Detail: IP Addresses2.9 Detail: Morse CodeChapter 2CodesIn the previous chapter we examined the fundamental unit of information, the bit, and its various abstractrepresentations: the Boolean bit (with its asso ciated Boolean algebra and realization in combinational logiccircuits), the control bit, the quantum bit, and the classical bit.A single bit is useful if exactly two answers to a question are possible. Examples include the result of acoin toss (heads or tails), the gender of a person (male or female), the verdict of a jury (guilty or not guilty),and the truth of an assertion (true or false). Most situations in life are more complicated. This chapterconcerns ways in which complex objects can be represented not by a single bit, but by arrays of bits.It is convenient to focus on a very simple model of a system, shown in Figure 2.1, in which the input isone of a predetermined set of objects, or “symbols,” the identity of the particular symbol chosen is encodedin an array of bits, these bits are transmitted through space or time, and then are decoded at a later timeor in a different place to determine which symbol was originally chosen. In later chapters we will augmentthis model to deal with issues of robustness and efficiency.CoderChannelDecoder- - - -(Arrays of Bits) (Arrays of Bits)Input(Symb ols)Output(Symb ols)Figure 2.1: Simple model of a communication systemIn this chapter we will look into several aspects of the design of co des , and show some examples in whichthese aspects were either done well or not so well. Individual sections will describe codes that illustrate theimportant points. Some objects for which codes may be needed include:• Letters: BCD, EBCDIC, ASCII, Unicode, Morse Code• Integers: Binary, Gray, 2’s complementAuthor: Paul Penfield, Jr.This document: http://www.mtl.mit.edu/Courses/6.050/2008/notes/chapter2.pdfVersion 1.5, January 28, 2008 . Copyrightc 2008 Massachusetts Institute of Technolog yStart of notes · back · next | 6.050J/2.110J home page | Site map | Search | About this document | Comments and inquiries92.1 Symbol Space Size 10• Numbers: Floating-Point• Proteins: Genetic Code• Telephones: NANP, International codes• Hosts: Ethernet, IP Addresses, Domain names• Images: TIFF, GIF, and JPEG• Audio: MP3• Video: MPEG2.1 Symbol Space SizeThe first question to address is the number of symbols that need to be encoded. This is called the symbolspace size. We will consider symb ol spaces of different sizes:• 1• 2• Integral power of 2• Finite• Infinite, Countable• Infinite, UncountableIf the number of symbols is 2, then the selection can be encoded in a single bit. If the number of possiblesymbols is 4, 8, 16, 32, 64, or another integral power of 2, then the selection may be coded in the numberof bits equal to the logarithm, base 2, of the symbol space size. Thus 2 bits can designate the suit (clubs,diamonds, hearts, or spades) of a playing card, and 5 bits can encode the selection of one student in a class of32. As a special case, if there is only one symbol, no bits are required to specify it. A dreidel is a four-sidedtoy marked with Hebrew letters, and spun like a top in a children’s game, especially at Hanukkah. Theresult of each spin could be encoded in 2 bits.If the number of symbols is finite but not an integral power of 2, then the number of bits that would workfor the next higher integral power of 2 can be used to encode the selection, but there will be some unusedbit patterns. Examples include the 10 digits, the six faces of a cubic die, the 13 denominations of a playingcard, and the 26 letters of the English alphabet. In each case, there is spare capacity (6 unused patterns inthe 4-bit representation of digits, 2 unused patterns in the 3-bit representation of a die, etc.) What to dowith this spare capacity is an important design issue that will be discussed in the next section.If the number of symbols is infinite but countable (able to be put into a one-to-one relation with theintegers) then a bit string of a given le ngth can only denote a finite number of items from this infinite set.Thus, a 4-bit code for non-negative integers might designate integers from 0 through 15, but would not beable to handle integers outside this range. If, as a result of some computation, it were necessary to representlarger numbers, then this “overflow” condition would have to be handled in some way.If the number of symbols is infinite and uncountable (such as the value of a physical quantity like voltageor acoustic pressure) then some technique of “discretization” must be used to replace possible values by afinite number of selected values that are approximately the same. For example, if the numbers between 0and 1 were the symbols and if 2 bits were available for the coded representation, one approach might beto approximate all numbers between 0 and 0.25 by the number 0.125, all numbers betwee n 0.25 and 0.5 by0.375, and so on. Whether such an approximation is adequate depends on how the decoded data is used.2.2 Use of Spare Capacity 11The approximation is not reversible, in that there is no decoder which will recover the original symbol givenjust the code for the approximate value. However, if the number of bits available is large e nough, then formany purposes a decoder could provide a number that is close enough. Floating-point representation of realnumbe rs in computers is based on this philosophy.2.2 Use of Spare CapacityIn many situations there are some unused code patterns, be cause the number of symbols is not an integralpower of 2. There are many strategies to deal with this. Here are some:• Ignore• Map to other values• Reserve for future expansion• Use for control codes• Use for common abbreviationsThese approaches will be illustrated with examples of common codes.2.2.1 Binary Coded Decimal (BCD)A common way to represent the digits 0 - 9 is by the ten four-bit patterns shown in Table 2.1. There aresix bit patterns (for example 1010) that are not used, and the question is what to do with them. Here are afew ideas that come to mind.First, the unused bit patterns might simply be ignored.

View Full Document


School:
Email:
New Password:
Confirm Password:

This preview shows page 1-2-19-20 out of 20 pages.

MIT 6 050J - Codes

Sign up for free to view:

Please select your school