Unformatted text preview:

Page: 1 © 2003 A.W. Krings CS449/549 Fault-Tolerant Systems Sequence 29 SIFT – SIFT = Software Implemented Fault Tolerance » Ultra reliable computer for critical aircraft control applications.  (desired probability of failure < 10-10/h over 10h mission) » Developed by SRI International for NASA Langley Research Center » Major goal was the application of formal verification of the critical functions  achieve fault-tolerance as much as possible by programs » Byzantine General Problem discovery » Objective for operating system was to hide the fault-tolerant mechanism from the application » Proof-of-concept demonstrator => bad news: 70-80% overhead (due to executives)Page: 2 © 2003 A.W. Krings CS449/549 Fault-Tolerant Systems Sequence 29 SIFT  System Overview – structure of the SIFT system Wen78 fig.1Page: 3 © 2003 A.W. Krings CS449/549 Fault-Tolerant Systems Sequence 29 SIFT  System Overview (cont.) – loosely coupled multiprocessor system » implements message passing » no shared memory – up to 8 processors » Bendix BDX930 processors, 32kB RAM, λ approx. 10-3 – general processors and I/O processors » have private memory » I/O procs. have less RAM – complete interconnectionPage: 4 © 2003 A.W. Krings CS449/549 Fault-Tolerant Systems Sequence 29 SIFT  SIFT Operating System Functions – scheduling » OS must cause application tasks to be executed at proper time. – synchronization » OS must keep processors moderately synchronized (within 50µs) – consistency » OS must implement voting at input to process – communication » results of tasks must be transmitted to all processors in timely manner, i.e. shared resultsPage: 5 © 2003 A.W. Krings CS449/549 Fault-Tolerant Systems Sequence 29 SIFT – fault masking » erroneous data, generated by faulty processor, must not be allowed to propagate, and source of erroneous data should be identified » => NMR – reconfiguration » OS must be able to detect a faulty unit and reconfigure » => graceful degradation (hybrid NMR) – input/output » OS controls received data from sensors and transmitted data to actuators » separate busses => sensors possibly act as faulty generalPage: 6 © 2003 A.W. Krings CS449/549 Fault-Tolerant Systems Sequence 29 SIFT  Fault Isolation – provided at the boundaries between processors and buses – processor can read memory of other processors, but can write only to its own memory. – a non-faulty processor can obtain bad data due to bad bus – invalid control signals should not produce incorrect behavior of non-faulty processorsPage: 7 © 2003 A.W. Krings CS449/549 Fault-Tolerant Systems Sequence 29 SIFT  Fault Masking – faulty units can provide processors with bad data – tasks place their outputs into a pre-broadcast buffer – this data is then broadcast to other processors – values from all processors are collected in the pre-vote buffer – the voted majority is then placed into the post-vote buffer – buffer management » 8 x 128 word data buffers » 1 buffer per processor » any message received from Processor i goes to Buffer i » each message is tagged with ID (0,...127) => 128 shared variable namesPage: 8 © 2003 A.W. Krings CS449/549 Fault-Tolerant Systems Sequence 29 SIFT buffer management and votingPage: 9 © 2003 A.W. Krings CS449/549 Fault-Tolerant Systems Sequence 29 SIFT  Voter – Voting on exact match basis => bit-by-bit majority voting – If the voter cannot find a majority » a default value specified by the application is selected – After identifying faulty unit, system is reconfigured – The number of processors executing a task can vary with the task » dynamic configuration of task redundancy » takes advantage of the fact that in flight control particular tasks are more critical than othersPage: 10 © 2003 A.W. Krings CS449/549 Fault-Tolerant Systems Sequence 29 SIFT  Scheduler - outputs must be generated with specified frequency - delay between reading and outputting result must be bounded – table driven, making use of several independent schedules: » task schedule  list of tasks to execute, how often, and when. » Pre-vote schedule  list of buffers to be moved and data-file to the pre-vote buffer, how often they should be moved, an when. » voter schedule  list of buffers to be voted on, how often should be voted and when. » broadcast schedule  list of buffers to be broadcast, how often, and when. » copy schedule  list of buffers to be copied from one slot in the post-vote buffer to another.Page: 11 © 2003 A.W. Krings CS449/549 Fault-Tolerant Systems Sequence 29 SIFT – Frame based, periodic schedulingPage: 12 © 2003 A.W. Krings CS449/549 Fault-Tolerant Systems Sequence 29 SIFT – Scheduling strategies considered: » non-preemptive fixed priority scheduling  did not allow sufficient flexibility » priority scheduling  tasks with fastest iteration rate are given highest priority  works if utilization is below ln(2) » earliest deadline first  not useful, since task durations are very short » simply periodic scheduling  all periods are multiples of smaller periods – SIFT implements simple periodic scheduling with integer multiplicity of subframes.Page: 13 © 2003 A.W. Krings CS449/549 Fault-Tolerant Systems Sequence 29 SIFT – Frame base scheduling » each task is assigned to one of several priority levels » each priority level corresponds to an iteration rate » each iteration rate is an integer multiple of the next lower one » subframes are 1.6 to 3.2 ms long » frames are up to 50 subframes long (< 160ms) » each task must fit into one subframe  left time is used for background diagnostic task » results are available in next subframe » schedule is table driven and static » at subframe boundary  suspend task  broadcast all data from last frame  invoke voting task  start scheduled taskPage: 14 © 2003 A.W. Krings CS449/549 Fault-Tolerant Systems Sequence 29 SIFT  Clock Synchronization – observed scenario triggered


View Full Document

UI CS 449 - SIFT

Course: Cs 449-
Pages: 28
Download SIFT
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view SIFT and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view SIFT 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?