Berkeley ELENG C249A - Distributed Real-time Applications Fault Tolerant Scheduling

DRAFTS Distributed Real-time Applications Fault Tolerant Scheduling
Motivation
Slide 3
Problem Overview
Faults
Fault Model
Software Redundancy
N-copies Solution
Redundancy Management
Possible solutions
Automotive Domain
Shortcomings of OTS solutions
Synthesis-based Solution
Schedule Synthesis
Slide 15
Contributions
Programming Model
Static Data-flow Model
Pendulum Example
Model Extensions
Data Tokens: Epoch
Data Tokens: Valid
FTDataFlow modeling
Actor Classes
Slide 25
Simulation output
Summary on FTDF
Architecture Model
Fault Behavior
Synthesis Problem
Slide 31
Refined I/O
Full Replication
Slide 34
Schedule Synthesis Strategy
Generating Schedules
Slide 37
Merge into FTS
Heuristic 1: Limit CPU Load
Heuristic 2: Limit Bus Load
Total Orders
Schedule optimization
Active Replicas
Deallocation & Degradation
Aggressive Heuristics
(Off-line) Verification
Functional Verification
Functional Verification (example - continued)
F.Verification comments
Conclusions
Future Work
DBW example
Now…

Slide 1
DRAFTS
Distributed Real-time Applications Fault Tolerant Scheduling
Claudio Pinello ([email protected])

Slide 2: Motivation
• Drive-by-Wire applications

Slide 3: Motivation
• No rods → increased passive safety
• Interior design freedom
BMW, Daimler, Citroën, Chrysler, Bertone, SKF, etc…

Slide 4: Problem Overview
• Fault tolerance: redundancy is key
• Safety: system failure must be as unlikely as in traditional systems

Slide 5: Faults
• SW faults: bugs
  – can be reduced by disciplined coding
  – even better by code generation
• HW faults
  – harsh environment
  – many units (>50 microprocessors in a car; subsystems with 10-15 microprocessors)

Slide 6: Fault Model
• Silent Faults
  – faults result in omission errors
• Detectable Faults
  – faults result in detectably corrupted data (e.g. CRC-protected channels)
• Non-silent Faults
  – faults result in value errors
• Byzantine Faults
  – malicious attacks, non-silent faults, unbounded delays, etc…

Slide 7: Software Redundancy
• Space redundancy
  – execute replicas on different HW
  – send results on different/multiple channels

Slide 8: N-copies Solution
• Pros:
  – reduced cost
• Cons:
  – degradation, 1x speed
  – multiple designs
[Figure: data-flow diagrams; block labels: Abstract input, FineCTRL, CoarseCTRL, ArbiterBest, Iterator, Abstract Out, Plant]
• Pros:
  – design once
• Cons:
  – N-x costs, 1x speed

Slide 9: Redundancy Management
• Managing a distributed system with multiple results requires careful programming
  – keep N-copies synchronized
  – exchange and apply results
  – detect and isolate faults
  – recover

Slide 10: Possible solutions
Off-The-Shelf solutions
• TTP-based architectures
• FT-CORBA middleware
Synthesis
• Debugged and portable libraries
Development tools

Slide 11: Automotive Domain
• Production costs dominate NRE costs
  – multi-vendor supply-chain
  – interest in full utilization of architectures
• Validation and certification are critical
  – validate process
  – validate product

Slide 12: Shortcomings of OTS solutions
• TTP
  – proprietary communication network
  – network redundancy default is 2-way
  – active replication → potential underutilization of resources
• FT-CORBA
  – middleware with fairly large overhead

Slide 13: Synthesis-based Solution
• Synthesize only needed glue-code
  – at the extreme: get rid of OS
• Customizable replication mechanisms
  – use passive replicas
• Treat architecture as a distributed execution machine
  – exploit parallelism to speed up execution

Slide 14: Schedule Synthesis
[Figure: mapping of the data-flow graph onto CPUs; labels: Abstract input, FineCTRL, CoarseCTRL, ArbiterBest, Iterator, Abstract Out, Plant, Mapping, CPU, Sens, Input, Output, Act]

Slide 15: Synthesis-based Solution
• Enables fast architecture exploration

Slide 16: Contributions
• Programming Model
• Metropolis platform
• Schedule synthesis tool and optimization strategy
• Verification Tools

Slide 17: Programming Model
• Definition of a programming model that
  – Is amenable to specifying feedback controllers
  – Is convenient for analysis, simulation and synthesis
  – Supports degraded functionality/accuracy
  – Supports redundancy
  – Is deterministic

Slide 18: Static Data-flow Model
• Pros:
  – Deterministic behavior
    • Actors perform deterministic computation (no internal states)
    • Requires all inputs to fire an actor
  – Explicit parallelism
  – Good for periodic algorithms
• Shortcomings:
  – Requires all inputs to fire an actor, but source actors may fail!
[Figure: actors A, B, C]

Slide 19: Pendulum Example
[Figure: data-flow graph; labels: Abstract input, FineCTRL, CoarseCTRL, ArbiterBest, Iterator, Abstract Out, Plant, Bang-Bang, Linear]

Slide 20: Model Extensions
• Node Criticality
• Node Typing (sensor, input, arbiter, etc.)
• Some types (input and arbiter) can fire with missing inputs
• Tokens have “Epoch” and “Valid” fields
• Specialized single-place buffer links
  – manage redundant sources (and destinations)

Slide 21: Data Tokens: Epoch
• iteration index of the periodic algorithm
• Actors ask for “current” inputs
• Using >= we can account for missing results (self-synchronization)
[Figure: token fields Epoch, Data, Valid]

Slide 22: Data Tokens: Valid
• Valid models the effect of fault detection:
  – True: data was received/produced correctly
  – False: data was not received on time or was corrupted
• Firing rules (and actors) may use it to change their behavior
[Figure: token fields Epoch, Data, Valid]

Slide 23: FTDataFlow modeling
• Metropolis used as framework to develop the set of tools
• FTDF is a platform library in Metropolis
  – modeling, simulation, fault injection
  – supports semi-automatic replication
  – results visualization

Slide 24: Actor Classes
• DF_SENactor    sensor actor
• DF_INactor     input actor
• DF_AINactor    abstract input actor
• DF_FUNactor    data-flow actor
• DF_ARBactor    arbiter actor
• DF_AOUTactor   abstract output actor
• DF_OUTactor    output actor
• DF_ACTactor    actuator actor
• DF_MEM         state memory
• DF_Injector    fault injection

Slide 25: Pendulum Example
[Figure: data-flow graph; labels: Abstract input, FineCTRL, CoarseCTRL, ArbiterBest, Iterator, Abstract Out, Plant, Inject]

Slide 26: Simulation output
[Figure: simulation plot; label: Fault]

Slide 27: Summary on FTDF
• Extended SDF to deal with
  – missing/redundant inputs
  – different criticality
  – functionality types
• Developed Metropolis platform
  – modeling, simulation, fault-injection, visualization of results
  – support for adding redundancy

Slide 28: Architecture Model
• Architecture
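
The token fields and firing rules from the Model Extensions and Data Tokens slides can be illustrated with a small sketch. It is not the FTDF library from Metropolis: the names `Token`, `Link`, `fire_function` and `fire_arbiter` are hypothetical, and both the overwrite policy of the single-place buffer and the arbiter's preference order are assumptions made only for illustration. What it does take from the slides is the epoch comparison with >=, the Valid flag, and the distinction between actors that need all inputs and actors (such as arbiters) that can fire with some inputs missing.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Optional


@dataclass(frozen=True)
class Token:
    epoch: int   # iteration index of the periodic algorithm
    data: Any    # payload carried by the token
    valid: bool  # False if the data arrived late or was detectably corrupted


class Link:
    """Single-place buffer that several replicas may write (assumed policy)."""

    def __init__(self) -> None:
        self.slot: Optional[Token] = None

    def write(self, token: Token) -> None:
        old = self.slot
        newer = old is None or token.epoch > old.epoch
        # Assumption: within one epoch, a valid token may replace an invalid one.
        repairs = (old is not None and token.epoch == old.epoch
                   and token.valid and not old.valid)
        if newer or repairs:
            self.slot = token

    def current(self, epoch: int) -> Optional[Token]:
        """Return the stored token if it is valid and not older than `epoch`."""
        tok = self.slot
        if tok is not None and tok.valid and tok.epoch >= epoch:
            return tok
        return None


def fire_function(inputs: List[Link], epoch: int,
                  fn: Callable[..., Any]) -> Optional[Token]:
    """Plain data-flow actor: fires only when every input is current."""
    tokens = [link.current(epoch) for link in inputs]
    if any(tok is None for tok in tokens):
        return None  # the actor does not fire in this epoch
    return Token(epoch, fn(*(tok.data for tok in tokens)), True)


def fire_arbiter(inputs: List[Link], epoch: int) -> Token:
    """Arbiter actor: fires even with missing inputs, taking the first current one."""
    for link in inputs:  # assumed to be ordered from most to least preferred
        tok = link.current(epoch)
        if tok is not None:
            return Token(epoch, tok.data, True)
    return Token(epoch, None, False)  # nothing usable: emit an invalid token


# Usage: the fine controller misses epoch 1, so the arbiter falls back to the coarse one.
fine, coarse = Link(), Link()
fine.write(Token(0, "fine command", True))      # stale result from the previous epoch
coarse.write(Token(1, "coarse command", True))
result = fire_arbiter([fine, coarse], epoch=1)
print(result)
```

In this sketch a stale token is treated the same way as a missing one, which is one way to read the self-synchronization remark on the Epoch slide: a consumer only needs to check that the epoch of its input is at least the epoch it is about to compute.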

