DOC PREVIEW
UW-Madison CS 736 - Bugs as Deviant Behavior - A General Approach to Inferring Errors in Systems Code

This preview shows page 1-2-3-4-5 out of 16 pages.

Save
View full document
Premium Document
Do you want full access? Go Premium and unlock all 16 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

Bugs as Deviant Behavior A General Approach to Inferring Errors in Systems Code Dawson Engler David Yu Chen Seth Hallem Andy Chou and Benjamin Chelf Computer Systems Laboratory Stanford University Stanford CA 94305 U S A Abstract A major obstacle to finding program errors in a real system is knowing what correctness rules the system must obey These rules are often undocumented or specified in an ad hoc manner This paper demonstrates techniques that automatically extract such checking information from the source code itself rather than the programmer thereby avoiding the need for a priori knowledge of system rules The cornerstone of our approach is inferring programmer beliefs that we then cross check for contradictions Beliefs are facts implied by code a dereference of a pointer p implies a belief that p is non null a call to unlock l implies that l was locked etc For beliefs we know the programmer must hold such as the pointer dereference above we immediately flag contradictions as errors For beliefs that the programmer may hold we can assume these beliefs hold and use a statistical analysis to rank the resulting errors from most to least likely For example a call to spin lock followed once by a call to spin unlock implies that the programmer may have paired these calls by coincidence If the pairing happens 999 out of 1000 times though then it is probably a valid belief and the sole deviation a probable error The key feature of this approach is that it requires no a priori knowledge of truth if two beliefs contradict we know that one is an error without knowing what the correct belief is Conceptually our checkers extract beliefs by tailoring rule templates to a system for example finding all functions that fit the rule template a must be paired with b We have developed six checkers that follow this conceptual framework They find hundreds of bugs in real systems such as Linux and OpenBSD From our experience they give a dramatic reduction in the manual effort needed to check a large system Compared to our previous work 9 these template checkers find ten to one hundred times more rule instances and derive properties we found impractical to specify manually 1 Introduction We want to find as many serious bugs as possible In our experience the biggest obstacle to finding bugs is not the need for sophisticated techniques nor the lack of either bugs or correctness constraints Simple techniques find many bugs and systems are filled with both rules and errors Instead the biggest obstacle to finding many bugs is simply knowing what rules to check Manually discovering any significant number of rules a system must obey is a dispiriting adventure especially when it must be repeated for each new release of the system In a large open source project such as Linux most rules evolve from the uncoordinated effort of hundreds or thousands of developers The end result is an ad hoc collection of conventions encoded in millions of lines of code with almost no documentation Since manually finding rules is difficult we instead focus on techniques to automatically extract rules from source code without a priori knowledge of the system We want to find what is incorrect without knowing what is correct This problem has two well known solutions contradictions and common behavior How can we detect a lie We can crosscheck statements from many witnesses If two contradict we know at least one is wrong without knowing the truth Similarly how can we divine accepted behavior We can look at examples If one person acts in a given way it may be correct behavior or it may be a coincidence If thousands of people all do the same action we know the majority is probably right and any contradictory action is probably wrong without knowing the correct behavior Our approach collects sets of programmer beliefs which are then checked for contradictions Beliefs are facts about the system implied by the code We examine two types of beliefs MUST beliefs and MAY beliefs MUST beliefs are directly implied by the code and there is no doubt that the programmer has that belief A pointer dereference implies that a programmer must believe the pointer is non null assuming they want safe code MAY beliefs are cases where we observe code features that suggest a belief but may instead be a coincidence A call to a followed by a call to b implies the programmer may believe they must be paired but it could be a coincidence Once we have a set of beliefs we do two things For a set of MUST beliefs we look for contradictions Any contradiction implies the existence of an error in the code For a set including MAY beliefs we must separate valid beliefs from coincidences We start by assuming all MAY beliefs are MUST beliefs and look for violations errors of these beliefs We then use a statistical analysis to rank each error by the probability of its beliefs If a particular belief is observed in 999 out of 1000 cases then it is probably a valid belief If the belief happens only once it is probably a coincidence We apply the above approach by combining it with our prior work 9 That work used system specific static analyses to find errors with a fixed set of manually found and specified rules e g spin lock l must be paired with spin unlock l It leveraged the fact that abstract rules commonly map to fairly simple source code sequences For example one can check the rule above by inspecting each path after a call to spin lock l to ensure that the path contains a call to spin unlock l While effective this previous work was limited by the need to find rules manually This paper describes how to derive rule instances automatically our system infers the pairing rule above directly from the source code Experience indicates that this approach is far better than the alternative of manual text based search to find relevant rule instances The analyses in this paper automatically derive all the rule instances previously hand specified in 9 as well as an additional factor of ten to one hundred more Further we now check properties that we formerly gave up on see Section 7 We demonstrate that the approach works well on complex real code by using it to find hundreds of errors in the Linux and OpenBSD operating systems Many of our bugs have resulted in kernel patches Section 2 discusses related work Sections 3 5 give an overview of the approach and Sections 6 9 apply it to find errors Section 10 concludes more limited set of properties Recent work has developed less


View Full Document

UW-Madison CS 736 - Bugs as Deviant Behavior - A General Approach to Inferring Errors in Systems Code

Documents in this Course
Load more
Loading Unlocking...
Login

Join to view Bugs as Deviant Behavior - A General Approach to Inferring Errors in Systems Code and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Bugs as Deviant Behavior - A General Approach to Inferring Errors in Systems Code and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?