DOC PREVIEW
Berkeley COMPSCI 70 - n8

This preview shows page 1-2 out of 7 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 7 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 7 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 7 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

EECS 70 Discrete Mathematics and Probability TheorySpring 2014 Anant Sahai Note 8Error Correcting CodesWe will consider two situations in which we wish to convey information on an unreliable channel. The first isexemplified by the internet, where the information (say a file) is broken up into packets, and the unreliabilityis manifest in the fact that some of the packets are lost (or erased) during transmission. Moreover the packetsare labeled with headers so that the recipient knows exactly which packets were received and which weredropped. We will refer to such errors as erasure errors. See the figure below:In the second situation, some of the packets are corrupted during transmission due to channel noise. Nowthe recipient has no idea which packets were corrupted and which were received unmodified:In the above example, packets 1 and 4 are corrupted. These types of errors are called general errors. Wewill discuss methods of encoding messages, called error correcting codes, which are capable of correctingboth erasure and general errors. The principle is to embed a message into a codeword (a process calledencoding), transmit the codeword, and then recover the message from the damaged/corrupted symbols thatare received (a process called decoding).This area of study is a part of “Information Theory,” one of the core computer sciences1along with the theoryof computation, control theory, communication theory, and estimation/learning theory. At the NationalScience Foundation, these are (largely) currently in the Computer and Information Science and Engineering(CISE) directorate in the Division of Computing and Communication Foundations (CCF). In particular,there is a rich area of coding theory and here in this note, we will simply touch a little part of it. If you wantto learn more, this material is built upon in 121, 229b, 229a, and other courses.On the practical side, this material is intensely useful. Every time you use your cellphone, satellite TV, DSL,cable-modem, hard-disk drive, solid-state drive, CD-ROM, DVD, Blu-ray, etc., error correcting codes arecrucial. Related ideas are being deployed in the Internet for streaming and in modern data-centers to makecloud computing and distributed storage possible. Perhaps most surprisingly, the actual (polynomial-based)codes that you are studying in this lecture notes are called Reed-Solomon2codes and are essentially what areused in many applications. These are representative of what are called algebraic-geometry codes, because1This is also reflected in an interesting history. The IEEE was formed by the merger of two societies — the Institute of RadioEngineers and the American Institute of Electrical Engineers. The Radio Engineers were intimately involved with communication,cryptography, radar, etc. The IRE and AIEE merged to form the IEEE in 1963, but the IRE was already bigger than the AIEE by1957. This merger was natural because electronics was becoming an important implementation substrate for both sets of engineersand the modern theory of electrical systems used the same mathematics as radio and signal processing (as well as control). This iswhy the word “EE” is often used to describe these computer sciences. At Berkeley, of course, this is all just a historical footnotesince you simply experience EECS as a single entity, but the history remains in some of the course numbers.2Like RSA, these carry the names of their inventors Reed and Solomon. The modern use of these codes is largely related to whatare called BCH codes, named after Bose, Chaudhuri, and Hocquenghem. We cannot get into these differences but the WikipediaEECS 70, Spring 2014, Note 8 1of their connections to algebraic geometry — the branch of mathematics that studies the roots of polynomialequations.Returning to the problem at hand. Assume that the information consists of n packets3. We can assumewithout loss of generality that the contents of each packet is a number modulo q (denoted by GF(q)), whereq is a prime. For example, the contents of the packet might be a 32-bit string and can therefore be regardedas a number between 0 and 232− 1; then we could choose q to be any prime4larger than 232. The propertiesof polynomials over GF(q) (i.e., with coefficients and values reduced modulo q) are the backbone of botherror-correcting schemes. To see this, let us denote the message to be sent by m1, . . . , mnand make thefollowing crucial observations:1) There is a unique polynomial P(x) of degree n − 1 such that P(i) = mifor 1 ≤ i ≤ n (i.e., P(x) containsall of the information about the message, and evaluating P(i) gives the intended contents of the i-th packet).2) The message to be sent is now m1= P(1), . . . , mn= P(n). We can generate additional packets by evaluat-ing P(x) at additional points n + 1, n + 2, . . . , n + j (remember, our transmitted codeword must be redundant,i.e., it must contain more packets than the original message to account for the lost or corrupted packets).Thus the transmitted codeword is c1= P(1), c2= P(2), . . . , cn+ j= P(n + j). (Alternatively, we can use then numbers defining the message as the coefficients of a degree n − 1 polynomial directly. Either way, thetransmitted codeword will be evaluations of this polynomial.) Since we are working modulo q, we mustmake sure that n + j ≤ q, but this condition does not impose a serious constraint since q is presumed to bevery large.Erasure ErrorsHere we consider the setting of packets being sent over the internet. In this setting, the packets are la-beled and so the recipient knows exactly which packets were dropped during transmission. One additionalobservation will be useful:3) By Property 2 in Note 7, we can uniquely reconstruct P(x) from its values at any n distinct points, sinceit has degree n − 1. This means that P(x) can be reconstructed from any n of the transmitted packets.Evaluating this reconstructed polynomial P(x) at x = 1, . . . , n yields the original message m1, . . . , mn. (Oralternatively, if the message was encoded into the polynomial’s coefficients, we could just read the messageout by simplifying the polynomial.)Recall that in our scheme, the transmitted codeword is c1= P(1), c2= P(2), . . . , cn+ j= P(n + j). Thus, ifwe hope to be able to correct k errors, we simply need to set j = k. The encoded codeword will then consistof n + k packets.ExampleSuppose Alice wants to send Bob a message of n = 4 packets and she wants to guard against k = 2 lostpackets.


View Full Document

Berkeley COMPSCI 70 - n8

Documents in this Course
Notes 1

Notes 1

12 pages

Note 2

Note 2

6 pages

Notes

Notes

6 pages

Notes

Notes

7 pages

Note 10

Note 10

8 pages

n20

n20

10 pages

n19

n19

10 pages

n18

n18

10 pages

n17

n17

6 pages

n16

n16

9 pages

n15

n15

10 pages

n14

n14

9 pages

n13

n13

6 pages

n12

n12

7 pages

n11

n11

6 pages

n10

n10

8 pages

n9

n9

5 pages

n7

n7

8 pages

n6

n6

5 pages

n5

n5

8 pages

n4

n4

7 pages

n3

n3

10 pages

n2

n2

7 pages

n1

n1

5 pages

Load more
Download n8
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view n8 and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view n8 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?