MIT 6.033 - Fault-tolerant Computing

This preview shows page 1-2-3-4 out of 12 pages.


Fault-tolerant Computing
Frans Kaashoek
6.033 Spring 2007
April 4, 2007

Where are we in 6.033?
• Modularity to control complexity
• Names are the glue to compose modules
• Strong form of modularity: client/server
• Limit propagation of errors
• Implementations of client/server:
    • In a single computer using virtualization
    • In a network using protocols
• Compose clients and services using names
• DNS

How to respond to failures?
• Failures are contained; they don't propagate
• Benevolent failures
• Can we do better?
    • Keep computing despite failures?
    • Defend against malicious failures (attacks)?
• Rest of semester: handle these "failures"
    • Fault-tolerant computing
    • Computer security

Fault-tolerant computing
• General introduction: today
• Replication/Redundancy
• The hard case: transactions
    • updating permanent data in the presence of concurrent actions and failures
• Replication revisited: consistency

Windows
A fatal exception 0E has occurred at 0028:C00068F8 in PPT.EXE<01> +000059F8. The current application will be terminated.
* Press any key to terminate the application.
* Press CTRL+ALT+DEL to restart your computer. You will lose any unsaved information in all applications.
Press any key to continue

Availability in practice
• Carrier airlines (2002 FAA fact book)
    • 41 accidents, 6.7M departures → 99.9993% availability
• 911 Phone service (1993 NRIC report)
    • 29 minutes per line per year → 99.994%
• Standard phone service (various sources)
    • 53+ minutes per line per year → 99.99+%
• End-to-end Internet Availability → 95% - 99.6%

Disk failure conditional probability distribution
[Figure: bathtub curve. Labels: infant mortality, burn out, expected operating lifetime, 1 / (reported MTTF)]

Human Mortality Rates (US, 1999)
[Figure. From: L. Gavrilov & N. Gavrilova, "Why We Fall Apart," IEEE Spectrum, Sep. 2004. Data from http://www.mortality.org]

Fail-fast disk
failfast_get (data, sn) {
    get (s, sn);
    if (checksum(s.data) = s.cksum) {
        data ← s.data;
        return OK;
    } else {
        return BAD;
    }
}

Careful disk
careful_get (data, sn) {
    tries ← 0;    // retry counter, kept separate from the status r
    while (tries < 10) {
        r ← failfast_get (data, sn);
        if (r = OK) return OK;
        tries++;
    }
    return BAD;
}

Durable disk (RAID 1)
durable_get (data, sn) {
    r ← disk1.careful_get (data, sn);
    if (r = OK) return OK;
    r ← disk2.careful_get (data, sn);
    signal(repair disk1);
    return r;
}
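
The three disk layers above can be tried out in a small Python sketch. It is only an illustration under stated assumptions: the in-memory `Disk` class, its `transient_failures` counter, the `repair_requested` flag (standing in for signal(repair disk1)), and the use of CRC-32 as the checksum are all hypothetical choices, not part of the slides. Because Python has no output parameters, each function returns a (status, data) pair instead of filling in `data`.

```python
import zlib
from dataclasses import dataclass

OK, BAD = "OK", "BAD"


@dataclass
class Sector:
    data: bytes
    cksum: int


def make_sector(data: bytes) -> Sector:
    """Build a sector whose checksum matches its data (CRC-32 is an assumption)."""
    return Sector(data, zlib.crc32(data))


class Disk:
    """Hypothetical in-memory disk; the first few reads can be corrupted."""

    def __init__(self, sectors, transient_failures=0):
        self.sectors = dict(sectors)
        self.transient_failures = transient_failures
        self.repair_requested = False

    def get(self, sn):
        s = self.sectors[sn]
        if self.transient_failures > 0:
            self.transient_failures -= 1
            # Simulate a transient read error by flipping the first byte.
            return Sector(bytes([s.data[0] ^ 0xFF]) + s.data[1:], s.cksum)
        return s


def failfast_get(disk, sn):
    """Fail-fast layer: detect a bad read and report it, without retrying."""
    s = disk.get(sn)
    if zlib.crc32(s.data) == s.cksum:
        return OK, s.data
    return BAD, None


def careful_get(disk, sn, max_tries=10):
    """Careful layer: retry to mask transient errors."""
    for _ in range(max_tries):
        status, data = failfast_get(disk, sn)
        if status == OK:
            return OK, data
    return BAD, None


def durable_get(disk1, disk2, sn):
    """Durable layer (RAID 1): fall back to the mirror and flag disk1 for repair."""
    status, data = careful_get(disk1, sn)
    if status == OK:
        return OK, data
    status, data = careful_get(disk2, sn)
    disk1.repair_requested = True  # stands in for signal(repair disk1)
    return status, data
```

In this sketch, a disk created with `transient_failures=3` still answers `careful_get` successfully, because the retries absorb the corrupted reads. A disk whose stored data no longer matches its checksum fails every retry, and `durable_get` then serves the sector from the mirror.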

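
The percentages on the "Availability in practice" slide can be reproduced from the raw figures it quotes. The sketch below makes two assumptions that the slides do not state: each airline accident counts as one failed departure, and a year has 365 days. The function names are hypothetical.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumes a 365-day year


def availability_from_events(failures, attempts):
    """Fraction of attempts that did not fail."""
    return 1.0 - failures / attempts


def availability_from_downtime(down_minutes, total_minutes=MINUTES_PER_YEAR):
    """Fraction of the period during which the service was up."""
    return 1.0 - down_minutes / total_minutes


airline = availability_from_events(41, 6.7e6)  # 41 accidents, 6.7M departures
e911 = availability_from_downtime(29)          # 29 minutes per line per year
phone = availability_from_downtime(53)         # 53 minutes per line per year

for name, value in [("airline", airline), ("911", e911), ("phone", phone)]:
    print(f"{name}: {value * 100:.4f}%")
```

The computed values agree with the slide's figures to within the rounding the slide uses.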
