UI CS 449 - Fail-Stop Processes
Course Cs 449-
Pages 5

Unformatted text preview:

© 2007 A.W. Krings CS449/549 Fault-Tolerant Systems Sequence 23Fail-Stop ProcessesDiscussion based on the paper: Byzantine Generals in Action: Implementing Fail-Stop Processors, Fred B. Schneider, ACM Transactions on Computer Systems, Vol. 2, No..2, pp. 145-154, May 1984FSP-Properties–Halt-on-Failure Property»It will halt before performing an erroneous state transition visible to other proc's.–Failure Status Property»Any non-faulty process can detect the halting of any other process.–Stable Storage Property»Part of the processes memory is “stable”, i.e. unaffected by failurereadable by other processors1 © 2007 A.W. Krings CS449/549 Fault-Tolerant Systems Sequence 23Fail-Stop ProcessesGiven FSPs, design a reliable system–Non-trivial problem! (e.g. Hypercube)»needs re-routing (optimal)»reconfiguration»reallocationHow does one implement a FSP?–Impossible with finite hardware–Build a k-FSP–Fails safe for2© 2007 A.W. Krings CS449/549 Fault-Tolerant Systems Sequence 23Fail-Stop Processes–Assume stable storage, then the behavior of a FSP is characterized by:IF k+1 requests AND requests are identical AND requests are from different processes AND NOT failedTHEN process operationELSE failed=TRUE–Stable storage assumption may be quite optimistic.–Special design-considerations are necessary.3 © 2007 A.W. Krings CS449/549 Fault-Tolerant Systems Sequence 23Fail-Stop Processes–K-FSP are based on two types of real processes4© 2007 A.W. Krings CS449/549 Fault-Tolerant Systems Sequence 23Fail-Stop Processes–Block Diagram5 © 2007 A.W. Krings CS449/549 Fault-Tolerant Systems Sequence 23Fail-Stop ProcessesAssumptions–Network Assumptions»Messages are delivered uncorrupted»Origin of messages can be authenticated by receiver–Operating Assumptions»Ps fail independently»Failure of P is detected by S-Processes when P-Processes try to write.»Disagreement on a write request is confirmed by the S-Processes.»Agreement on a request must be reached before executing the write.»Only M1 ,M2, ..., M2k +1 are visible to outside (of FSP).6© 2007 A.W. Krings CS449/549 Fault-Tolerant Systems Sequence 23Fail-Stop Processes–Redundant in all P-Processes:»P broadcasts write request to all S's»S's exchange values+vote (Byzantine safe). P is commander, S's are lieutenants.–Operation IF all S agree THEN write ELSE stop machine7 © 2007 A.W. Krings CS449/549 Fault-Tolerant Systems Sequence 23Fail-Stop ProcessesStable Storage–Majority of copies are correct and identical.–A non-faulty FSP can always write to its own stable storage.–Any non-faulty process can read any stable storage.–Value of a memory location is maj(M1, ... , M2k +1)–An S-proc can write:IF exactly 1 request is received from each PAND all proc's are identicalTHEN writeELSE set a “failed” flag in memory and stop8© 2007 A.W. Krings CS449/549 Fault-Tolerant Systems Sequence 23Fail-Stop ProcessesOn the Number of Processors–Assume the application needs N processors»If we want to tolerate k faults we need N + k FSPs » i.e. (N + k) k-FSPs–Naive implementation»to implement 1 FSPk + 1 P-Proc's and 2k + 1 S-Proc's = 3k + 2»then to implement the N+k FSPs(N + k)(3k + 2) that’s a lot of processors!9 © 2007 A.W. Krings CS449/549 Fault-Tolerant Systems Sequence 23Fail-Stop Processes–It could be considered wasteful to dedicate an entire processor to running an S-Process.–Therefore assume a single processor is able to run s


View Full Document

UI CS 449 - Fail-Stop Processes

Course: Cs 449-
Pages: 5
Download Fail-Stop Processes
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Fail-Stop Processes and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Fail-Stop Processes 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?