A Technique for Enabling and Supporting Debugging of Field Failures

James Clause and Alessandro Orso
College of Computing
Georgia Institute of Technology
{clause, orso}@cc.gatech.edu

Abstract

It is difficult to fully assess the quality of software in-house, outside the actual time and context in which it will execute after deployment. As a result, it is common for software to manifest field failures, failures that occur on user machines due to untested behavior. Field failures are typically difficult to recreate and investigate on developer platforms, and existing techniques based on crash reporting provide only limited support for this task. In this paper, we present a technique for recording, reproducing, and minimizing failing executions that enables and supports in-house debugging of field failures. We also present a tool that implements our technique and an empirical study that evaluates the technique on a widely used e-mail client.

1. Introduction

Quality-assurance activities, such as software testing and analysis, are notoriously difficult, expensive, and time-consuming. As a result, software products are often released with faults or missing functionality. In fact, real-world examples of field failures experienced by users because of untested behaviors (e.g., due to unforeseen usages) are countless. When field failures occur, it is important for developers to be able to recreate and investigate them in-house. This pressing need is demonstrated by the emergence of several crash-reporting systems, such as Microsoft's error reporting systems [13] and Apple's Crash Reporter [1]. Although these techniques represent a first important step in addressing the limitations of purely in-house approaches to quality assurance, they work on limited data (typically, a snapshot of the execution state) and can at best identify correlations between a crash report and data on other known failures.

In this paper, we present a novel technique for reproducing and investigating field failures that addresses the limitations of existing approaches. Our technique works in three phases, intuitively illustrated by the scenario in Figure 1. In the recording phase, while users run the software, the technique intercepts and logs the interactions between application and environment and records portions of the environment that are relevant to these interactions. If the execution terminates with a failure, the produced execution recording is stored for later investigation. In the minimization phase, using free cycles on the user machines, the technique replays the recorded failing executions with the goal of automatically eliminating parts of the executions that are not relevant to the failure. In the replay and debugging phase, developers can use the technique to replay the minimized failing executions and investigate the cause of the failures (e.g., within a debugger). Being able to replay and debug real field failures can give developers unprecedented insight into the behavior of their software after deployment and opportunities to improve the quality of their software in ways that were not possible before.
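To make the recording phase concrete, the following minimal sketch shows how interactions between an application and its environment might be intercepted and logged. It is purely illustrative, not the paper's implementation, and every name in it (LOG_PATH, recorded_open, recorded_input) is hypothetical. The sketch interposes on two kinds of environment interactions, file reads and user input, logging each event and snapshotting touched files so that an execution recording bundles an event log with the relevant environment data, as the technique prescribes:

import json
import os
import shutil

LOG_PATH = "execution.events"   # hypothetical event-log location
SNAPSHOT_DIR = "env_snapshots"  # hypothetical store for captured environment data

def log_event(kind, **details):
    # Append one application/environment interaction to the event log.
    with open(LOG_PATH, "a") as log:
        log.write(json.dumps({"kind": kind, **details}) + "\n")

def recorded_open(path, mode="r"):
    # Interpose on file access: snapshot the file the first time it is read,
    # so a later replay can recreate the slice of the environment that the
    # execution actually depended on.
    if "r" in mode:
        os.makedirs(SNAPSHOT_DIR, exist_ok=True)
        snapshot = os.path.join(SNAPSHOT_DIR, path.replace(os.sep, "_"))
        if not os.path.exists(snapshot):
            shutil.copy(path, snapshot)
        log_event("file_open", path=path, snapshot=snapshot)
    return open(path, mode)

def recorded_input(prompt=""):
    # Interpose on interactive input and log the value the user supplied.
    value = input(prompt)
    log_event("user_input", value=value)
    return value

Under these assumptions, each run produces exactly the two artifacts the technique stores when an execution fails: an event log and the environment data it references.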
To evaluate our technique, we implemented it in a prototype tool, called ADDA (Automated Debugging of Deployed Applications), and used the tool to perform an empirical study. The study was performed on PINE [19], a widely used e-mail client, and involved the investigation of failures caused by two real faults in PINE. The results of the study are promising. Our technique was able to (1) record all executions of PINE (and two other subjects) with a low time and space overhead, (2) completely replay all recorded executions, and (3) perform automated minimization of failing executions and obtain shorter executions that manifested the same failures as the original executions. Moreover, we were able to replay the minimized executions within a debugger, which shows that they could have actually been used to investigate the failures.

The contributions of this paper are:

• A novel technique for recording and later replaying executions of deployed programs.
• An approach for minimizing failing executions and generating shorter executions that fail for the same reasons.
• A prototype tool that implements our technique.
• An empirical study that shows the feasibility and effectiveness of the approach.

[Figure 1. An intuitive scenario of usage of our technique. The figure shows, in the field, the recording phase (on-line, via the ADDA Recording Tool at user sites) and the minimization phase (off-line, via the ADDA Minimization Tool), and, in-house, the replay and debugging phase (via the ADDA Replay Tool); execution recordings (environment data and event logs) flow from user sites through a remote repository to the developers' local repository, while software releases/updates flow back to users.]
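The preview does not show how the minimization phase decides which parts of a recorded execution are irrelevant. A standard way to realize this kind of replay-and-shrink loop is delta debugging in the style of Zeller and Hildebrandt's ddmin; the simplified greedy variant below illustrates that general idea over an abstract event list and is not ADDA's actual algorithm. The replay step is abstracted into a still_fails oracle, which in the real technique corresponds to replaying the candidate recording and checking that the same failure manifests:

def minimize_events(events, still_fails):
    # Greedy ddmin-style minimization: repeatedly try to drop chunks of the
    # recorded event log, keeping a cut only if replaying the shorter log
    # still reproduces the failure (as judged by the still_fails oracle).
    chunk = max(1, len(events) // 2)
    while chunk >= 1:
        i = 0
        while i < len(events):
            candidate = events[:i] + events[i + chunk:]
            if still_fails(candidate):
                events = candidate      # the dropped chunk was irrelevant
            else:
                i += chunk              # the chunk matters; move past it
        chunk //= 2
    return events

# Toy usage: pretend the failure is triggered only by events 3 and 7, so a
# ten-event recording should shrink to exactly those two events.
if __name__ == "__main__":
    recording = list(range(10))
    oracle = lambda evs: 3 in evs and 7 in evs
    print(minimize_events(recording, oracle))  # prints [3, 7]

Because minimization runs off-line on free cycles of the user machine, the cost of the repeated replays in the inner loop is paid in the field rather than by developers.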


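Replay, the dual of recording, can be sketched in the same illustrative style: the recorded events are read back and served to the application in place of the live environment, which is what makes it possible to re-execute a field failure deterministically, for example inside a debugger. Again, every name below is hypothetical, and the format matches the recording sketch above:

import builtins
import json

def load_events(log_path="execution.events"):
    # Read back the event log produced during recording.
    with open(log_path) as log:
        return [json.loads(line) for line in log]

def replay(events, run):
    # Serve recorded user inputs back to the application instead of
    # consulting the live environment, so the failing run re-executes
    # deterministically. (File accesses would analogously be served from
    # the environment snapshots; omitted here for brevity.)
    inputs = iter(e["value"] for e in events if e["kind"] == "user_input")
    original_input = builtins.input
    builtins.input = lambda prompt="": next(inputs)
    try:
        run()  # the application entry point, supplied by the caller
    finally:
        builtins.input = original_input
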
2. Related Work

This work encompasses several areas. We present the most related efforts, organized into categories.

Record and replay. The techniques most closely related to our approach are those that record and replay executions for testing or debugging. Some of these techniques perform deterministic replay debugging, that is, replay of executions that led to a crash (e.g., [3, 9, 14, 15, 21]). Various commercial and research tools record and replay user interactions with a software product for regression testing (e.g., [11, 22, 23]). Unlike our approach, most of these techniques are designed to be used during in-house testing or debugging. The overhead they impose (on space, time, or infrastructure required for recording) is reasonable for their intended use, but would make them impractical for use on deployed software. The few record and replay techniques that are either defined to operate in the field (e.g., BugNet [14]) or may be efficient enough to be used in the field (e.g., [9, 21]) require specialized operating-system or hardware support, which considerably limits their applicability in the short term.

More recently, several researchers presented techniques for recording executions of Java subsystems [5, 16, 17, 20]. These approaches are heavily based on Java's characteristics and target the recording of subsystems only. It would be difficult to adapt them to work in a more general context. Moreover, most of these approaches are defined to be used in-house, for regression testing, and impose an overhead that would prevent their use on deployed software.

Remote data collection.