MAS 160/510 Additional Notes: Error-Correcting Codes

The fundamental idea behind coders for noisy discrete channels is redundancy. We're going to make individual messages somehow "more unique" so that noise won't corrupt enough of the symbols in a message to destroy its uniqueness, and we're going to try to do it in a more organized way than just repeating the same message over and over again.

Another key ingredient in coders for noisy channels is often called noise averaging, and we can see what this means by a simple example. Suppose on average 1% of the received bits have been corrupted by the channel. Looking at a bit, can we tell if it's a "good" bit or a "bad" bit? No. If we look at a block of ten bits, can we identify the bad ones? No, but we can make some statistical observations about how likely it is that some number of bits is incorrect:

\[
\begin{aligned}
p(\text{10 good}) &= 0.99^{10} \approx 9 \times 10^{-1} \\
p(\text{1 bad})  &= 0.99^{9} \times 0.01 \times \Bigl(\text{number of combinations: } \tfrac{10!}{1!\,(10-1)!}\Bigr) \approx 9 \times 10^{-2} \\
p(\text{2 bad})  &= 0.99^{8} \times 0.01^{2} \times \tfrac{10!}{2!\,(10-2)!} \approx 4 \times 10^{-3} \\
p(\text{3 bad})  &= 0.99^{7} \times 0.01^{3} \times \tfrac{10!}{3!\,(10-3)!} \approx 1 \times 10^{-4}
\end{aligned}
\]

Summing these and subtracting from 1, we find that the probability that more than three bits in a block have been corrupted is about $2 \times 10^{-6}$. So on average only two blocks in a million will have more than three bad bits. If we had some way of correcting no more than three bad bits out of a block of ten, 999,998 times out of a million we'd be alright, which might be good enough for many applications. Were we to make the blocks still bigger than 10 bits, we'd find that the average number of bad bits per block would converge to the 1% figure.

The redundancy requirement clearly suggests that if we have n-bit blocks we can't allow all $2^n$ possible sequences to be valid codes, or else we wouldn't know when a received block is bad. Going further, it seems as if correcting all possible errors of t bits/block will require us to make every legitimate message block differ from every other legitimate message block in at least $2t + 1$ bit positions (the number of bit positions in which two code blocks differ is called the Hamming distance).

An error-correcting code, then, is an orderly way of mapping some number of symbols into a greater number so as to increase the Hamming distance between any two valid sequences. They can be divided into two basic types:

• Block codes map a k-symbol input sequence into an n-symbol output sequence, where each n-symbol output depends upon only the k symbols input and no others. Block codes are usually specified as (n, k) codes, where k is in the range of 3 to a few hundred, and k/n is between roughly 1/4 and 7/8. Block codes are also sometimes called group codes.

• Tree codes are more like state machines – each set of output symbols depends on the current set of inputs as well as some number of earlier inputs. These codes are said to have memory, and are also called convolutional codes, since the encoding operation can be considered as convolving a sequence of inputs with an "impulse response".

In this handout, we'll look at block codes, which can be modeled using binary linear algebra.

The simplest block code is a parity-check code. We can write the (4, 3) example as taking three bits $(a_0, a_1, a_2)$ and producing a 4-vector $\vec{a}$. As this is usually written as a column vector, for typographical convenience we'll show it transposed:

\[
\vec{a}^T = (a_0,\ a_1,\ a_2,\ a_0 + a_1 + a_2),
\]

where the addition is modulo-2, so 1 + 1 = 0 and subtraction is equivalent to addition.
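As a quick check of this (4, 3) parity-check code, here is a minimal sketch in Python (not part of the original handout; the names encode_4_3 and parity_ok are ours, chosen for clarity) showing that a single flipped bit is detected, though not located:

# Minimal sketch of the (4,3) parity-check code described above.
# All arithmetic is modulo 2; the names encode_4_3 and parity_ok are illustrative only.

def encode_4_3(a0, a1, a2):
    # (a0, a1, a2) -> (a0, a1, a2, a0 + a1 + a2 mod 2)
    return (a0, a1, a2, (a0 + a1 + a2) % 2)

def parity_ok(block):
    # A received 4-bit block is consistent iff its bits sum to 0 mod 2.
    return sum(block) % 2 == 0

codeword = encode_4_3(1, 0, 1)            # -> (1, 0, 1, 0)
print(parity_ok(codeword))                # True

corrupted = (1, 0, 1, 1)                  # flip the last bit
print(parity_ok(corrupted))               # False: the error is detected but cannot be located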
While this parity-check code won't correct errors (there is not enough distance between valid sequences to identify which bit is bad if the fourth entry isn't the modulo-2 sum of the other three), it will identify blocks with a single bit error. A better code might be the (6, 3) code given by:

\[
\vec{a}^T = (a_0,\ a_1,\ a_2,\ a_0 + a_1,\ a_1 + a_2,\ a_0 + a_1 + a_2).
\]

This can be written as a system of linear equations:

\[
\begin{aligned}
a_3 &= a_0 + a_1 \\
a_4 &= a_1 + a_2 \\
a_5 &= a_0 + a_1 + a_2
\end{aligned}
\]

Since in modulo-2 arithmetic $-a = a$, we can say

\[
\begin{aligned}
a_0 + a_1 + a_3 &= 0 \\
a_1 + a_2 + a_4 &= 0 \\
a_0 + a_1 + a_2 + a_5 &= 0
\end{aligned}
\]

and this can also be written as a "parity-check matrix":

\[
H = \begin{pmatrix}
1 & 1 & 0 & 1 & 0 & 0 \\
0 & 1 & 1 & 0 & 1 & 0 \\
1 & 1 & 1 & 0 & 0 & 1
\end{pmatrix}.
\]

Now we can say that a valid code word is one that satisfies the equation

\[
H\vec{a} = \vec{0}.
\]

To test this idea, we can encode a particular sequence, say (100), which comes out as $\vec{a}^T = (100101)$, and indeed we find that

\[
\begin{pmatrix}
1 & 1 & 0 & 1 & 0 & 0 \\
0 & 1 & 1 & 0 & 1 & 0 \\
1 & 1 & 1 & 0 & 0 & 1
\end{pmatrix}
\begin{pmatrix} 1 \\ 0 \\ 0 \\ 1 \\ 0 \\ 1 \end{pmatrix}
=
\begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}.
\]

There is a particular set of $(2^m - 1,\ 2^m - m - 1)$ codes called Hamming codes, in which the columns of the parity-check matrix are the nonzero binary numbers from 1 to $2^m - 1$. Thus the matrix for the (7, 4) Hamming code is

\[
H = \begin{pmatrix}
0 & 0 & 0 & 1 & 1 & 1 & 1 \\
0 & 1 & 1 & 0 & 0 & 1 & 1 \\
1 & 0 & 1 & 0 & 1 & 0 & 1
\end{pmatrix},
\]

and $\vec{a}$ is valid if

\[
\begin{aligned}
a_3 + a_4 + a_5 + a_6 &= 0 \\
a_1 + a_2 + a_5 + a_6 &= 0 \\
a_0 + a_2 + a_4 + a_6 &= 0.
\end{aligned}
\]

Since $a_0$, $a_1$, and $a_3$ appear only once here, it'll be mathematically easier if we let $a_2$, $a_4$, $a_5$, and $a_6$ be our 4-bit input block and compute the other three bits from the above equations. For example:

\[
(1010) \Rightarrow (a_0\, a_1\, 1\, a_3\, 0\, 1\, 0) = (1011010).
\]

Now, how do we correct an error? Consider that we have sent some good sequence $\vec{g}$, but an error pattern $\vec{e}$ has been added by the channel, so what is received is

\[
\vec{r} = \vec{g} + \vec{e}.
\]

Upon receiving $\vec{r}$, we multiply it by H and get a vector $\vec{s}$, called the "syndrome" (Webster: "A group of signs that occur together characterizing a particular abnormality."). Thus

\[
\vec{s} = H\vec{r} = H\vec{g} + H\vec{e}.
\]

But recall that by definition

\[
H\vec{g} = \vec{0},
\]

so

\[
\vec{s} = H\vec{e}.
\]

In other words, the syndrome depends upon only the error, no matter what the message is. Let's take our valid code (1011010) above, and add an error in the last position: (1011011). Multiplying it by H (try it yourself!) we get the answer

\[
\vec{s}^T = (111),
\]

and therefore

\[
\vec{e}^T = (0000001).
\]

In general, in Hamming codes, the syndrome resulting from an error in the n-th position is the n-th column of the H array. But the n-th column, read as a binary number, is also the number of the bit position with the error, making correction easy. More complicated methods exist for constructing codes that correct multiple-bit errors, and for shuffling data to optimize for particular kinds of errors (predominantly impulsive noise versus drop-outs of several bits in sequence, for instance). For more information see Peterson and Weldon, Error-Correcting Codes (Barker TK5101.P485), or Clark and Cain, Error-Correction Coding for Digital Communications (Barker
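To close, here is a minimal sketch in Python (again not part of the original notes; the names encode_7_4, syndrome, and correct are ours) of the whole (7, 4) Hamming procedure just described: encoding the handout's example block, corrupting one bit, reading the bad position out of the syndrome, and correcting it.

# Sketch of the (7,4) Hamming code from the handout, assuming the same bit ordering
# (a0 ... a6) and the 4-bit input block (a2, a4, a5, a6). Names are illustrative only.

H = [
    [0, 0, 0, 1, 1, 1, 1],   # a3 + a4 + a5 + a6 = 0
    [0, 1, 1, 0, 0, 1, 1],   # a1 + a2 + a5 + a6 = 0
    [1, 0, 1, 0, 1, 0, 1],   # a0 + a2 + a4 + a6 = 0
]

def encode_7_4(a2, a4, a5, a6):
    # Solve the three parity equations for the check bits a0, a1, a3 (mod 2).
    a3 = (a4 + a5 + a6) % 2
    a1 = (a2 + a5 + a6) % 2
    a0 = (a2 + a4 + a6) % 2
    return [a0, a1, a2, a3, a4, a5, a6]

def syndrome(r):
    # s = H r (mod 2): each entry is one parity equation evaluated on the received block.
    return [sum(h * b for h, b in zip(row, r)) % 2 for row in H]

def correct(r):
    # A nonzero syndrome, read as a binary number (top row = most significant bit),
    # is the 1-based position of the bad bit, because the columns of H are 1 ... 7.
    s = syndrome(r)
    pos = 4 * s[0] + 2 * s[1] + s[2]
    r = list(r)
    if pos:
        r[pos - 1] ^= 1           # flip the offending bit
    return r

g = encode_7_4(1, 0, 1, 0)        # handout example: (1010) -> [1, 0, 1, 1, 0, 1, 0]
r = g[:-1] + [g[-1] ^ 1]          # error in the last position
print(syndrome(g))                # [0, 0, 0] -- a valid codeword
print(syndrome(r))                # [1, 1, 1] -- the last column of H, i.e. position 7
print(correct(r) == g)            # True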

