University of Toronto, Department of Computer Science
Easterbrook, 2004

Lecture 10: Risk

  General ideas about Risk
  Risk Management
    - Identifying Risks
    - Assessing Risks
  Case Study: Mars Polar Lander


Risk Management

  About Risk
    - Risk is "the possibility of suffering loss"
    - Risk itself is not bad; it is essential to progress
    - The challenge is to manage the amount of risk

  Two Parts
    - Risk Assessment
    - Risk Control

  Useful concepts
    - For each risk: Risk Exposure
        RE = p(unsat. outcome) x loss(unsat. outcome)
    - For each mitigation action: Risk Reduction Leverage
        RRL = (RE_before - RE_after) / cost of intervention


Principles of Risk Management
  Source: Adapted from SEI Continuous Risk Management Guidebook

  Global Perspective
    - View software in context of a larger system
    - For any opportunity, identify both:
        potential value
        potential impact of adverse results
  Forward Looking View
    - Anticipate possible outcomes
    - Identify uncertainty
    - Manage resources accordingly
  Open Communications
    - Free-flowing information at all project levels
    - Value the individual voice (unique knowledge and insights)
  Integrated Management
    - Project management is risk management!
  Continuous Process
    - Continually identify and manage risks
    - Maintain constant vigilance
  Shared Product Vision
    - Everybody understands the mission (common purpose, collective responsibility, shared ownership)
    - Focus on results
  Teamwork
    - Work cooperatively to achieve the common goal
    - Pool talent, skills and knowledge


Continuous Risk Management
  Source: Adapted from SEI Continuous Risk Management Guidebook

  Identify
    - Search for and locate risks before they become problems
    - Systematic techniques to discover risks
  Analyse
    - Transform risk data into decision-making information
    - For each risk, evaluate: impact, probability, timeframe
    - Classify and prioritise risks
  Plan
    - Choose risk mitigation actions
  Track
    - Monitor risk indicators
    - Reassess risks
  Control
    - Correct for deviations from the risk mitigation plans
  Communicate
    - Share information on current and emerging risks


Fault Tree Analysis
  Source: Adapted from Leveson, "Safeware", p321

  [Figure: fault tree. Legend: event that results from a combination of causes; basic fault event requiring no further elaboration; Or-gate; And-gate.]

  Top event: Wrong or inadequate treatment administered
  Events shown in the tree:
    - Vital signs erroneously reported as exceeding limits
    - Vital signs exceed critical limits but not corrected in time
    - Frequency of measurement too low
    - Vital signs not reported
    - Computer does not read within required time limits
    - Human sets frequency too low
    - Computer fails to raise alarm
    - Nurse does not respond to alarm
    - Sensor failure
    - Nurse fails to input them or does so incorrectly
    - etc.


Risk Assessment

  Quantitative
    - Measure risk exposure using standard cost and probability measures
    - Note: probabilities are rarely independent
  Qualitative
    - Develop a risk classification matrix:

                              Likelihood of Occurrence
      Outcome                 Very likely    Possible       Unlikely
      (5) Loss of Life        Catastrophic   Catastrophic   Severe
      (4) Loss of Spacecraft  Catastrophic   Severe         Severe
      (3) Loss of Mission     Severe         Severe         High
      (2) Degraded Mission    High           Moderate       Low
      (1) Inconvenience       Moderate       Low            Low


Top 10 Development Risks (+ Countermeasures)
  Source: Adapted from Boehm, 1989

  Personnel shortfalls
    - use top talent, team building, training
  Unrealistic schedules/budgets
    - multisource estimation, designing to cost, requirements scrubbing
  Developing the wrong software functions
    - better requirements analysis, organizational/operational analysis
  Developing the wrong user interface
    - prototypes, scenarios, task analysis
  Gold plating
    - requirements scrubbing, cost-benefit analysis, designing to cost
  Continuing stream of requirements changes
    - high change threshold, information hiding, incremental development
  Shortfalls in externally furnished components
    - early benchmarking, inspections, compatibility analysis
  Shortfalls in externally performed tasks
    - pre-award audits, competitive designs
  Real-time performance shortfalls
    - targeted analysis, simulations, benchmarks, models
  Straining computer science capabilities
    - technical analysis, checking scientific literature


Case Study: Mars Polar Lander

  Launched
    - 3 Jan 1999
  Mission
    - Land near South Pole
    - Dig for water ice with a robotic arm
  Fate
    - Arrived 3 Dec 1999
    - No signal received after initial phase of descent
  Cause
    - Several candidate causes
    - Most likely is premature engine shutdown due to noise on leg sensors


What happened?

  Investigation hampered by lack of data
    - Spacecraft not designed to send telemetry during descent
    - This decision severely criticized by review boards
  Possible causes
    - Lander failed to separate from cruise stage (plausible but unlikely)
    - Landing site too steep (plausible)
    - Heatshield failed (plausible)
    - Loss of control due to dynamic effects (plausible)
    - Loss of control due to center-of-mass shift (plausible)
    - Premature shutdown of descent engines (most likely!)
    - Parachute drapes over lander (plausible)
    - Backshell hits lander (plausible but unlikely)


Premature Shutdown Scenario

  Cause of error
    - Magnetic sensor on each leg senses touchdown
    - Legs unfold at 1500m above surface
    - Transient signals on touchdown sensors during unfolding
    - Software accepts touchdown signals if they persist for 2 timeframes
    - Transient signals likely to be long enough on at least one leg
  Factors
    - System requirement to ignore the transient signals
    - But the software requirements did not describe the effect
    - S/w designers didn't understand the effect, so didn't implement the requirement
    - Engineers present at code inspection didn't understand the effect
    - Not caught in testing because:
        unit testing didn't include the transients
        sensors improperly wired during integration tests (no touchdown detected!)
        full test not repeated after re-wiring
  Result of error
    - Engines shut down before spacecraft has landed
    - When engine shutdown s/w enabled, flags indicated touchdown already occurred
    - Estimated at 40m above surface, travelling at 13 m/s
    - Estimated impact velocity 22 m/s (spacecraft would not survive this)
    - Nominal touchdown velocity 2.4 m/s
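
Worked example: Risk Exposure and Risk Reduction Leverage

The two "useful concepts" from the Risk Management slide can be tried out on a small example. The sketch below is only an illustration: the risk name, probabilities, loss and intervention cost are made-up numbers, and the class and function names are hypothetical, not part of the lecture.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    name: str
    probability: float  # p(unsatisfactory outcome), between 0 and 1
    loss: float         # loss(unsatisfactory outcome), in any consistent unit

    def exposure(self) -> float:
        """Risk Exposure: RE = p(unsat. outcome) x loss(unsat. outcome)."""
        return self.probability * self.loss


def risk_reduction_leverage(re_before: float, re_after: float, cost: float) -> float:
    """RRL = (RE_before - RE_after) / cost of intervention."""
    if cost <= 0:
        raise ValueError("cost of intervention must be positive")
    return (re_before - re_after) / cost


# Hypothetical numbers, chosen only to make the arithmetic easy to follow.
before = Risk("late delivery", probability=0.25, loss=400_000)
after = Risk("late delivery", probability=0.125, loss=400_000)  # after mitigation

rrl = risk_reduction_leverage(before.exposure(), after.exposure(), cost=20_000)
print(before.exposure(), after.exposure(), rrl)
```

An RRL above 1 means the mitigation removes more exposure than it costs, so competing mitigation actions can be ranked by their RRL.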
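
Worked example: combining probabilities at fault tree gates

The Fault Tree Analysis slide names And-gates and Or-gates, and the Risk Assessment slide warns that probabilities are rarely independent. The sketch below shows the usual gate arithmetic under an explicit independence assumption, so it should be read as a best case for how simple the calculation can be. The probabilities are invented, and which gate joins the two events in the original figure is not recoverable here, so treating them as inputs to one gate is an assumption.

```python
from math import prod


def and_gate(probabilities):
    """Probability that all input events occur (assumes independence)."""
    return prod(probabilities)


def or_gate(probabilities):
    """Probability that at least one input event occurs (assumes independence)."""
    return 1.0 - prod(1.0 - p for p in probabilities)


# Hypothetical probabilities for two basic events named on the slide.
p_computer_fails_to_raise_alarm = 0.01
p_nurse_does_not_respond = 0.02

inputs = [p_computer_fails_to_raise_alarm, p_nurse_does_not_respond]
p_either = or_gate(inputs)   # either failure is enough
p_both = and_gate(inputs)    # both failures are needed

print(f"OR: {p_either:.4f}  AND: {p_both:.4f}")
```

If the events share a common cause (for example, an understaffed ward), the independence assumption fails and these figures understate or overstate the real risk.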
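
Worked example: looking up a risk class in the classification matrix

The qualitative risk classification matrix on the Risk Assessment slide can be encoded as a lookup table. This is a minimal sketch: the cell values follow the matrix as reconstructed from the slide, while the dictionary layout and the function name `classify` are hypothetical choices.

```python
LIKELIHOODS = ("very likely", "possible", "unlikely")

# Rows: outcome severity. Columns: likelihood, in the order of LIKELIHOODS.
RISK_MATRIX = {
    "loss of life":       ("catastrophic", "catastrophic", "severe"),
    "loss of spacecraft": ("catastrophic", "severe",       "severe"),
    "loss of mission":    ("severe",       "severe",       "high"),
    "degraded mission":   ("high",         "moderate",     "low"),
    "inconvenience":      ("moderate",     "low",          "low"),
}


def classify(outcome: str, likelihood: str) -> str:
    """Return the risk class for an outcome and a likelihood of occurrence."""
    column = LIKELIHOODS.index(likelihood.lower())
    return RISK_MATRIX[outcome.lower()][column]


print(classify("Loss of Mission", "Unlikely"))
print(classify("Degraded Mission", "Possible"))
```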
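
Worked example: the premature shutdown scenario

The Premature Shutdown Scenario slide describes touchdown logic that accepts a sensor signal once it persists for 2 timeframes, and a transient during leg unfolding that satisfies this test long before landing. The sketch below illustrates that failure mode and one possible guard. It is not the flight software: the sample trace, the function names and the guard (counting only samples taken once shutdown logic is enabled) are assumptions made for illustration. The last part checks the slide's 22 m/s impact figure assuming free fall from 40m at 13 m/s, no drag, and Mars surface gravity of about 3.71 m/s^2.

```python
from math import sqrt


def latched_touchdown(samples, frames_required=2):
    """True if the signal is ever high for frames_required consecutive frames."""
    run = 0
    for high in samples:
        run = run + 1 if high else 0
        if run >= frames_required:
            return True
    return False


def guarded_touchdown(samples, enabled, frames_required=2):
    """Same test, but frames seen before shutdown logic is enabled are ignored."""
    run = 0
    for high, active in zip(samples, enabled):
        run = run + 1 if (high and active) else 0
        if run >= frames_required:
            return True
    return False


# Hypothetical trace: a transient while the legs unfold, then nothing.
leg_sensor = [False, True, True, True, False, False, False, False]
shutdown_enabled = [False] * 6 + [True, True]  # enabled near the surface

naive = latched_touchdown(leg_sensor)                      # transient is latched
guarded = guarded_touchdown(leg_sensor, shutdown_enabled)  # transient is ignored

MARS_GRAVITY = 3.71  # m/s^2, approximate surface value


def impact_speed(speed, height, g=MARS_GRAVITY):
    """Free-fall impact speed from v^2 = v0^2 + 2*g*h (drag ignored)."""
    return sqrt(speed ** 2 + 2 * g * height)


impact = impact_speed(13.0, 40.0)
print(naive, guarded, round(impact, 1))
```

With the naive test the flag is already set when shutdown logic is enabled, which matches the slide's "flags indicated touchdown already occurred". The free-fall estimate rounds to the slide's 22 m/s, roughly nine times the nominal 2.4 m/s touchdown velocity.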