Berkeley COMPSCI 152 - Lecture 16 – Error Correcting Codes


CS 152 Computer Architecture and Engineering
Lecture 16: Error Correcting Codes
2006-10-24
John Lazzaro (www.cs.berkeley.edu/~lazzaro)
TAs: Udam Saini and Jue Sun
www-inst.eecs.berkeley.edu/~cs152/

CS 152 L16 Error Correcting Codes UC Regents Fall 2006 UCB

F05 TA Notes: Final Project Checkoff

The SyncMeisters had everything working besides the test file "hammer". However, hammer is the hardest test, so in a sense their project was a bug or two away from working. Writing back to the regfile when a stall occurs on the cache, just before or after needing to write, seems to mess up their processor. The other tests worked fine on board.

When cache bugs make it to product

"Testing our financial trading system, we found a case where our software would get a bad calculation. Once a week or so. Eventually, the problem turned out to be a failure in a CPU cache line refresh. This was a hardware design fault in the PC. The test suite included running the code for two weeks at maximum update rate without error, so this bug was found." (Eric Ulevik)

Today: Computing in an imperfect world

Detecting and correcting RAM bit errors
Replacing lost network packets, recovering from disk drive failure
Detecting arbitrary bit errors in network packets

DRAM Challenge: Cosmic Rays

[Figure: DRAM cell cross-section. Labels: bit line, word line, Vdd, n, oxide, p, cosmic ray hit.]

Cell capacitor holds 25,000 electrons (or less). Cosmic rays that constantly bombard us can release the charge.

Can this happen in SRAM?

[Figure: SRAM cell. Labels: Gnd, Vdd, P1, P2, "Cosmic ray discharges C".]

A race: can P1 restore the middle node to Vdd before P2 flips the other node?

Practical effect of a cosmic ray

    ADDIU R1, R0, 7
    SW    R1, 100(R0)      Address 100: 0b00...0111
    (cosmic ray hit)
    LW    R1, 100(R0)      Address 100: 0b00...0011
After the LW, R1 holds 3, but it should hold 7. Bit flips on memory holding instructions are bad too.

To detect errors, add P, a parity bit

Extra parity bit for every word. Not seen by software. Hardware computes it on every write, so that the number of 1s in every 33-bit word is even (even parity).

    Address 100: 0b00...0111   P = 1
    (cosmic ray hit)
    Address 100: 0b00...0011   P = 1

Does this work if two bits flip? If three?

On a read, count the number of 1s. If odd, a bit flipped. So halt the program and reboot. The application may know if this bit matters, but there's no API to ask it.

Error Correction: Hamming Codes

Richard Hamming, computing pioneer. Famous quote: "Computers are not for numbers. Computers are for understanding."

Trick: Compute parity of subsets of bits

Consider 4-bit words: D3 D2 D1 D0 = 0 1 1 0. Add 3 parity bits: P2 P1 P0. Each parity bit is computed on a subset of bits:

    P2 = D3 xor D2 xor D1 = 0 xor 1 xor 1 = 0
    P1 = D3 xor D2 xor D0 = 0 xor 1 xor 0 = 1
    P0 = D3 xor D1 xor D0 = 0 xor 1 xor 0 = 1

Use this word bit arrangement:

    D3 D2 D1 P2 D0 P1 P0
    0  1  1  0  0  1  1

Just believe for now; we will justify later.

Case 1: No cosmic ray hits

    We write:       D3 D2 D1 P2 D0 P1 P0 = 0 1 1 0 0 1 1
    Later we read:  D3 D2 D1 P2 D0 P1 P0 = 0 1 1 0 0 1 1

No errors, but how do we know that? On readout we compute:

    C2 = P2 xor D3 xor D2 xor D1 = 0 xor 0 xor 1 xor 1 = 0
    C1 = P1 xor D3 xor D2 xor D0 = 1 xor 0 xor 1 xor 0 = 0
    C0 = P0 xor D3 xor D1 xor D0 = 1 xor 0 xor 1 xor 0 = 0

If C2 C1 C0 = 0, no errors. These equations come from how we computed P2 P1 P0:

    P2 = D3 xor D2 xor D1 = 0 xor 1 xor 1 = 0
    P1 = D3 xor D2 xor D0 = 0 xor 1 xor 0 = 1
    P0 = D3 xor D1 xor D0 = 0 xor 1 xor 0 = 1

Case 2: A cosmic ray hits

    We write:       D3 D2 D1 P2 D0 P1 P0 = 0 1 1 0 0 1 1
    Later we read:  D3 D2 D1 P2 D0 P1 P0 = 0 1 0 0 0 1 1

Cosmic ray hit D1. But how do we know that? On readout we compute:

    C2 = P2 xor D3 xor D2 xor D1 = 0 xor 0 xor 1 xor 0 = 1
    C1 = P1 xor D3 xor D2 xor D0 = 1 xor 0 xor 1 xor 0 = 0
    C0 = P0 xor D3 xor D1 xor D0 = 1 xor 0 xor 0 xor 0 = 1

    C2 C1 C0 = b101 = 5

What does 5 mean? The position of the flipped bit. To repair, just flip it back.

    Position:  7  6  5  4  3  2  1
    Bit:       D3 D2 D1 P2 D0 P1 P0
    Read:      0  1  0  0  0  1  1

Note: we number the least significant bit with 1, not 0. 0 is reserved for "no errors".

Why did we choose 3 parity bits?

Consider 4-bit words: D3 D2 D1 D0 = 0 1 1 0. Add 3 parity bits: P2 P1 P0. A Ci in C2 C1 C0 exists for each Pi.

Observation: the C2 C1 C0 bits need to encode the "no error" condition, plus a number for each bit (both data and parity bits). For p parity bits and d data bits:

    d + p + 1 <= 2^p

Why did we arrange bits as we did?

Consider 4-bit words D3 D2 D1 D0. Add 3 parity bits P2 P1 P0. How do we re-arrange bits? A Ci in C2 C1 C0 exists for each Pi.

Start by numbering, 1 to 7:

    Position:  7  6  5  4  3  2  1
    Bit:       D3 D2 D1 P2 D0 P1 P0

With this order, an odd parity means an error in 1, 3, 5, or 7. So P0 is the right parity bit to use.

An odd parity means a mistake must be in 2, 3, 6, or 7: the four numbers possible if C1 = 1.

Etc.: each bit …
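The encode, check, and repair steps from the two cases above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (min_parity_bits, encode, syndrome, correct) and the list-based word format are made up for this sketch and are not from the lecture. The bit order follows the slides' D3 D2 D1 P2 D0 P1 P0 arrangement, and the parity-count helper applies the d + p + 1 <= 2^p observation.

```python
# Sketch of the lecture's 7-bit code: 4 data bits plus 3 parity bits.
# All names and the list-based word format are illustrative assumptions.

def min_parity_bits(d):
    """Smallest p such that d + p + 1 <= 2**p."""
    p = 1
    while d + p + 1 > 2 ** p:
        p += 1
    return p


def encode(d3, d2, d1, d0):
    """Return the word as [D3, D2, D1, P2, D0, P1, P0] (positions 7 down to 1)."""
    p2 = d3 ^ d2 ^ d1  # guards positions 4, 5, 6, 7
    p1 = d3 ^ d2 ^ d0  # guards positions 2, 3, 6, 7
    p0 = d3 ^ d1 ^ d0  # guards positions 1, 3, 5, 7
    return [d3, d2, d1, p2, d0, p1, p0]


def syndrome(word):
    """Return C2 C1 C0 as an integer: 0 means no error, else the flipped position."""
    d3, d2, d1, p2, d0, p1, p0 = word
    c2 = p2 ^ d3 ^ d2 ^ d1
    c1 = p1 ^ d3 ^ d2 ^ d0
    c0 = p0 ^ d3 ^ d1 ^ d0
    return (c2 << 2) | (c1 << 1) | c0


def correct(word):
    """Return a copy of word with a single flipped bit repaired."""
    fixed = list(word)
    pos = syndrome(fixed)
    if pos:
        fixed[7 - pos] ^= 1  # position 7 is list index 0, position 1 is index 6
    return fixed


written = encode(0, 1, 1, 0)
print(written)                        # [0, 1, 1, 0, 0, 1, 1]

read_back = list(written)
read_back[2] ^= 1                     # cosmic ray flips D1 (position 5)
print(syndrome(written))              # 0
print(syndrome(read_back))            # 5
print(correct(read_back) == written)  # True
print(min_parity_bits(4))             # 3
```

Note that this sketch only repairs a single flipped bit. If two bits flip, the syndrome is still nonzero but names the wrong position, so correct() would silently produce a wrong word.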

