CORNELL CS 501 - Lecture 19 Reliability 1

Slide titles:
CS 501: Software Engineering
Administration
Lectures on Reliability and Dependability
Dependable and Reliable Systems: The Royal Majesty
The Royal Majesty: Analysis
The Royal Majesty: Software Lessons
Reliability
User Perception of Reliability
Reliability Metrics
Reliability Metrics for Distributed Systems
Requirements Specification of System Reliability
Cost of Improved Reliability
Example: Central Computing System
Building Dependable Systems: Three Principles
Reliability: Modified Waterfall Model
Key Factors for Reliable Software
Building Dependable Systems: Organizational Culture
Building Dependable Systems: Complexity
Building Dependable Systems: Specifications for the Client
Building Dependable Systems: Quality Management Processes
Building Dependable Systems: Change
Reviews: Process (Plan)
Reviews: Design and Code
Benefits of Design and Code Reviews
Review Team (Full Version)
Example: Program Design
Review Process
Static and Dynamic Verification
Static Validation & Verification
Static Verification: Program Inspections
Inspection Checklist: Common Errors
Static Analysis Tools
Static Analysis Tools (continued)

CS 501: Software Engineering
Lecture 19
Reliability 1
CS 501 Spring 2005

Administration

Lectures on Reliability and Dependability
Lecture 19, Reliability 1: The development process
    Reviews
Lecture 20, Reliability 2: Different aspects of reliability
    Programming techniques
Lecture 21, Reliability 3: Testing and bug fixing
    Tools

Dependable and Reliable Systems: The Royal Majesty
From the report of the National Transportation Safety Board:

"On June 10, 1995, the Panamanian passenger ship Royal Majesty grounded on Rose and Crown Shoal about 10 miles east of Nantucket Island, Massachusetts, and about 17 miles from where the watch officers thought the vessel was. The vessel, with 1,509 persons on board, was en route from St. George’s, Bermuda, to Boston, Massachusetts."

"The Raytheon GPS unit installed on the Royal Majesty had been designed as a standalone navigation device in the mid- to late 1980s, ... The Royal Majesty’s GPS was configured by Majesty Cruise Line to automatically default to the Dead Reckoning mode when satellite data were not available."

The Royal Majesty: Analysis
• The ship was steered by an autopilot that relied on position information from the Global Positioning System (GPS).
• If the GPS could not obtain a position from satellites, it provided an estimated position based on Dead Reckoning (distance and direction traveled from a known point).
• The GPS failed one hour after leaving Bermuda.
• The crew failed to see the warning message on the display (or to check the instruments).
• 34 hours and 600 miles later, the Dead Reckoning error was 17 miles.

The Royal Majesty: Software Lessons
All the software worked as specified (no bugs), but ...
• Since the GPS software had been specified, the requirements had changed (stand alone system to part of integrated system).
• The manufacturers of the autopilot and GPS adopted different design philosophies about the communication of mode changes.
• The autopilot was not programmed to recognize valid/invalid status bits in message from the GPS (NMEA 0183).
• The warnings provided by the user interface were not sufficiently conspicuous to alert the crew.
• The officers had not been properly trained on this equipment.

Reliability
Reliability: Probability of a failure occurring in operational use.
Perceived reliability: Depends upon:
    user behavior
    set of inputs
    pain of failure

User Perception of Reliability
1. A personal computer that crashes frequently v. a machine that is out of service for two days.
2. A database system that crashes frequently but comes back quickly with no loss of data v. a system that fails once in three years but data has to be restored from backup.
3. A system that does not fail but has unpredictable periods when it runs very slowly.

Reliability Metrics
Traditional Measures
• Mean time between failures
• Availability (up time)
• Mean time to repair
Market Measures
• Complaints
• Customer retention
User Perception is Influenced by
• Distribution of failures
Hypothetical example: Cars are less safe than airplanes in accidents per hour, but safer in accidents per mile.

Reliability Metrics for Distributed Systems
Traditional metrics are hard to apply in multi-component systems:
• In a big network, at any given moment something will be giving trouble, but very few users will see it.
• A system that has excellent average reliability may give terrible service to certain users.
• There are so many components that system administrators rely on automatic reporting systems to identify problem areas.

Requirements Specification of System Reliability
Example: ATM card reader

Failure class              Example                                            Metric
Permanent non-corrupting   System fails to operate with any card -- reboot    1 per 1,000 days
Transient non-corrupting   System can not read an undamaged card              1 in 1,000 transactions
Corrupting                 A pattern of transactions corrupts database        Never

Cost of Improved Reliability
[Figure: cost ($) plotted against up time, from 99% to 100%]
Will you spend your money on new functionality or improved reliability?

Example: Central Computing System
A central computer serves the entire organization. Any failure is serious.

Step 1: Gather data on every failure
• 10 years of data in a simple data base
• Every failure analyzed:
    hardware
    software (default)
    environment (e.g., power, air conditioning)
    human (e.g., operator error)

Step 2: Analyze the data
• Weekly, monthly, and annual statistics
    Number of failures and interruptions
    Mean time to repair
• Graphs of trends by component, e.g.,
    Failure rates of disk drives
    Hardware failures after power failures
    Crashes caused by software bugs in each module

Step 3: Invest resources where benefit will be maximum, e.g.,
• Orderly shut down after power failure
• Priority order for software improvements
• Changed procedures for operators
• Replacement hardware

Building Dependable Systems: Three Principles
For a software system to be dependable:
• Each stage of development must be done well.
• Changes should be
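
Worked example (not part of the lecture): a minimal sketch, in Python, of how the traditional measures listed under Reliability Metrics could be computed from the kind of failure log described in Steps 1 and 2 of the Central Computing System example. The record layout, the function names, and the sample data are assumptions made for illustration. The sketch uses one common convention: mean time to repair is total downtime divided by the number of failures, mean time between failures is total up time divided by the number of failures, and availability is up time divided by total time.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Failure:
    """One row of a hypothetical failure log (layout is an assumption)."""
    start: datetime   # when service was lost
    end: datetime     # when service was restored
    category: str     # "hardware", "software", "environment", or "human"


def reliability_metrics(failures, period_start, period_end):
    """Return (mtbf, mttr, availability) for one observation period.

    mtbf and mttr are timedelta values (None if there were no failures);
    availability is a float between 0 and 1.
    """
    total = period_end - period_start
    downtime = sum((f.end - f.start for f in failures), timedelta())
    uptime = total - downtime
    n = len(failures)
    if n == 0:
        return None, None, 1.0
    return uptime / n, downtime / n, uptime / total


def failures_by_category(failures):
    """Count failures per category, as in the Step 1 classification."""
    return Counter(f.category for f in failures)


# Made-up sample data: a 10-day period (240 hours) with two outages.
log = [
    Failure(datetime(2005, 1, 3, 9), datetime(2005, 1, 3, 11), "hardware"),
    Failure(datetime(2005, 1, 7, 14), datetime(2005, 1, 7, 18), "software"),
]
mtbf, mttr, availability = reliability_metrics(
    log, datetime(2005, 1, 1), datetime(2005, 1, 11)
)
by_category = failures_by_category(log)

print("MTBF:", mtbf)
print("MTTR:", mttr)
print("Availability:", availability)
print("By category:", dict(by_category))
```

With this sample data the outages total 6 hours out of 240, so the mean time between failures is 117 hours, the mean time to repair is 3 hours, and availability is 97.5%. Other conventions exist (for example, measuring time between failures from one failure start to the next), so a real requirements document should state which definition it uses.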

