UW-Madison CS 736 - Bugs as Deviant Behavior - A General Approach to Inferring Errors in Systems Code - D387755

Home> Schools> University of Wisconsin, Madison> (CS) > CS 736> Bugs as Deviant Behavior - A General Approach to Inferring Errors in Systems Code

DOC PREVIEW

UW-Madison CS 736 - Bugs as Deviant Behavior - A General Approach to Inferring Errors in Systems Code

School name University of Wisconsin, Madison

Course Cs 736- Advanced Operating Systems

Pages 16

This preview shows page 1-2-3-4-5 out of 16 pages.

Save

View full document

Premium Document

Do you want full access? Go Premium and unlock all 16 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 16 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 16 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 16 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 16 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

Premium Document

Do you want full access? Go Premium and unlock all 16 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

Unformatted text preview:

Bugs as Deviant Behavior: A General Approachto Inferring Errors in Systems CodeDawson Engler, David Yu Chen, Seth Hallem, Andy Chou, and Benjamin ChelfComputer Systems LaboratoryStanford UniversityStanford, CA 94305, U.S.A.AbstractA major obstacle to finding program errors in a real systemis knowing what correctness rules the system must obey.These rules are often undocumented or specified in an adhoc manner. This paper demonstrates techniques that auto-matically extract such checking information from the sourcecode itself, rather than the programmer, thereby avoidingthe need for a priori knowledge of system rules.The cornerstone of our approach is inferring programmer“beliefs” that we then cross-check for contradictions. Beliefsare facts implied by code: a dereference of a pointer, p, im-plies a belief that p is non-null, a call to “unlock(l)” impliesthat l was locked, etc. For beliefs we know the programmermust h old, such as the pointer dereference above, we im-mediately flag contradictions as errors. For beliefs that theprogrammer may hold, we can assume these beliefs hold anduse a statistical analysis to rank the resulting errors frommost to least likely. For example, a call to “spinlock”followed once by a call to “spinunlock” implies that theprogrammer may have paired these calls by coincidence. Ifthe pairing happens 999 out of 1000 times, though, then itis probably a valid belief and the sole deviation a probableerror. The key feature of this approach is t hat it requiresno a priori knowledge of t ruth: if two beliefs contradict, weknow that one is an error without knowing what the correctbelief is.Conceptually, our checkers extract beliefs by tailoringrule “templates” to a sy stem – for example, finding all func-tions that fit the rule temp late “<a> must be paired with<b>.” We have d eveloped six checkers that follow this con-ceptual framework. They find hundreds of b ugs in real sy s-tems such as Linux and OpenBSD. From our experience,they give a dramatic reduction in the manual effort neededto check a large system. Compared to our previous work [9],these template checkers find ten to one hundred times morerule instances and derive properties we found impractical tospecify manually.1 IntroductionWe want to find as many serious bugs as possible. In our ex-perience, the biggest obstacle to finding bugs is not the needfor sophisticated techniques nor the lack of either bugs orcorrectness constraints. Simple techniques find many bugsand systems are filled with both rules and errors. Instead,the biggest obstacle to finding many bugs is simply knowingwhat rules to check. Manually discovering any significantnumber of rules a system must obey is a dispiriting adven-ture, especially when it must be repeated for each new re-lease of the system. In a large open source project such asLinux, most rules evolve from the uncoordinated effort ofhundreds or thousands of developers. The end result is anad hoc collection of conventions encoded in millions of linesof code with almost n o documentation.Since manually finding rules is difficult, we instead fo-cus on techniques to automatically extract ru les from sourcecode without a priori kn owledge of the system. We want tofind what is incorrect without knowing what is correct. Thisproblem has two well-known solutions: contradictions andcommon behavior. How can we detect a lie? We can cross-check statements from many witnesses. If two contradict, weknow at least one is wrong without knowing the truth. Sim-ilarly, how can we divine accepted behav ior? We can lookat examples. If one person acts in a given way, it may becorrect behavior or it may be a coincidence. If thousands ofpeople all do the same action, we know the majority is prob-ably right, and any cont radictory action is probably wrongwithout knowing the correct behavior.Our approach collects sets of programmer beliefs, whichare then checked for contradictions. Beliefs are facts aboutthe system implied by the code. We examine two typ es ofbeliefs: MUST beliefs and MAY beliefs. MUST beliefs aredirectly implied by the code, and there is no doubt that theprogrammer has that belief. A pointer dereference impliesthat a programmer must believe the pointer is non-null (as-suming they want safe code). MAY beliefs are cases wherewe observe code features that suggest a belief but may in-stead be a coincidence. A call to “a” followed by a call to “b”implies the programmer may believe they must be paired,but it could be a coincidence.Once we have a set of beliefs, we do two things. For aset of MUST beliefs, we look for contradictions. Any con-tradiction implies the existence of an error in t he code. Fora set including MAY beliefs, we must separate valid beliefsfrom coincidences. We start by assuming all MAY beliefsare MUST beliefs and look for violations (errors) of thesebeliefs. We then use a statistical analysis to rank each er-ror by the probability of its beliefs. If a particular beliefis observed in 999 out of 1000 cases, then it is probably avalid belief. If the belief happens only once, it is probably acoincidence.We apply the above approach by combining it with ourprior work [9]. That work used system-specific static analy-ses to find errors with a fixed set of manually found and spec-ified rules (e.g.,“spinlock(l) must be paired with spin unlock(l)”).It leveraged the fact that abstract rules commonly map tofairly simple source code sequences. For example, one cancheck the rule ab ove by inspecting each path after a callto “spin lock(l)” to ensure that the path contains a callto “spinunlock(l).” While effective, this previous workwas limited by the n eed to find rules manually. This paperdescribes how to derive rule instances automatically: oursystem infers the pairing rule above directly from the sourcecode.Experience indicates that this approach is far better thanthe alternative of manual, text-based search to find relevantrule instances. The analyses in this paper automaticallyderive all the rule instances previously hand-specified in [9],as well as an additional factor of ten to one hundred more.Further, we now check properties that we formerly gave upon (see Section 7). We demonstrate that the approach workswell on complex, real cod e by using it to find hundreds oferrors in the Linux and OpenBSD operating systems. Manyof our bugs have resulted in kernel patches.Section 2 discusses related work. Sections 3–5 give anoverview of the

View Full Document