CORNELL CS 611 - Abstract Interpretation

Unformatted text preview:

CS611 Lecture 35 Abstract Interpretation 1 December 2006Lecturer: Benyah Shaparenko1 Introduction to Abstract InterpretationAt this point in the course, we have looked at two aspects of programming languages: dynamic semanticsand static program analysis.1. Dynamic Semantics characterizes the dynamic execution of a program. Examples include operationaland denotational semantics.2. Static Program Analysis is a general term that refers to the collection of techniques and methodsthat allow us to reason statically at compile time about the program and extract information thatis guaranteed to hold during all executions. This information can then be used for optimization andcorrectness.There are two static analysis techniques which are most prominent:1. Type Systems: There are two aspects of type systems to keep in mind as we move on to abstractinterpretation:(a) In type systems, programmers typically annotate the program with type information. We canregard these type annotations as global invariants provided by the user.(b) Types are flow insensitive. A variable or expression has the same type regardless of where it ap-pears in the flow of the program. This is because the type annotations represent global properties.2. Abstract Interpretation: We can think of abstract interpretation being similar to dataflow analysis, butalso providing a framework that allows to formally prove the correctness of an analysis.The idea behind abstract interpretation (and dataflow analysis) is as follows. The exec ution of aprogram computes a piece of concrete information. The goal of abstract interpretation is to staticallycompute a piece of abstract information that characterizes the concrete information in all possibleexecutions of the program.A good example of abstract interpretation is Sign Analysis, whose goal is to statically compute thepossible signs of each variable at each program point. In this example, the concrete informationrepresents the values of variables during program execution, i.e, the state. The abstract informationmodels just the sign of variables.In contrast to type systems, abstract interpretation has the following features:(a) It is flow sensitive. It computes abstract information at each point in the program. The informa-tion at different points in the program may be different.(b) There are no program annotations. The analysis must compute the abstract info by itself, becauseit is unreasonable to ask the user to provide annotations at every point of every program. Inparticular, the compiler must find out the loop invariants; it is not easy to extract this informationautomatically.How does the compiler statically figure out the abstract information that we are interested in? It “executes”the program in the abstract domain, hence the name Abstract Interpretation. Two differences compared tothe concrete execution are the following:11. The analysis must follow all possible paths through program (dynamic execution only follows one path).2. The static analysis must terminate, even if the program doesn’t. We expect the compiling process,including static analysis, to terminate even if our program has an infinite loop.To summarize this introduction to abstract interpretation, we make the following comments. Type systemsare a lightweight form of static analysis, which give some form of correctness (no errors) without muchwork. Abstract interpretation, on the other hand, is a heavyweight form of static analysis, giving detailedinformation at each point in the program. As a result, it provides a stronger sense of correctness and alsoenables optimizations, but at a larger cost.2 LatticesWe will formalize both the concrete and the abstract domain using lattices. A complete lattice is a pair(L, v) such that:• v is a partial order• Any subset X ⊆ L has least upper bound (lub) tX and greatest lower bound (glb) uXA complete lattice is different than a cpo, in that a cpo only requires a lub for ωchains.2.1 Notation for LatticesRegarded as binary operators, the lub and the glb are also referred to as join and meet. In this case, we usethe infix notation:• Join of two elements: x t y• Meet of two elements: x u yFrom its definition, a complete lattice is guaranteed to have a top and a bottom element:• Top element: > = tL• Bottom elem ent: ⊥ = uLIn many cases in the literature, lattices are sometimes denoted as tuples to emphasize all these operatorsand values: (L, v, t, u, >, ⊥).The intuition behind the partial ordering in the abstract lattice is that elements higher in the lattice aremore precise. The most precise piece of abstract information is > and the least precise is ⊥. Note that somebooks put it the other way around.2.2 Properties about OperatorsThe operators u, t and the partial order v satisfy the following properties:2• x u y = x iff x v y• x t y = y iff x v y• u and t are idempotent, commutative and associative.2.3 Properties about LatticesA property that not all lattices have, but which will be important in providing a guarantee that the staticanalysis will terminate is the descending chain condition (DCC). A lattice satisfies DCC is any descendingchain stabilizes:{xi}ns.th. xi+1v xi⇒ ∃n0s.th. ∀n ≥ n0xn= xn0We can also define the height of a lattice as the maximum number of distinct elements in a chain. A finiteheight implies the descending chain condition (DCC).3 Formal FrameworkWe are going to study abstract interpretation using an imperative language, in fact IMP. We will extend itssyntax with labels to model program points:c ::= [skip]l| [x := a]l| c0; c1| if [b]lthen c0else c1| while [b]ldo c0where the labels l ∈ Labels model program points. There is a special label linit, representing the entry pointin the program.We formalize the result of abstract interpretation using a function Result which assigns two elements in theabstract lattice to each program point, b e fore and after the point:Result : Labels → (La× La)We denote by Result(•l) the result right before the program point represented by l; and by Result(l•) theresult right after l.We next want to determine how the abstract information changes when a command is executed; in otherwords, how to “execute” a command in the abstract domain. For this, we introduce a transfer function foreach command c , to map a piece of abstract information into another piece of abstract information:[[c]] : La→ LaWe now formulate the problem as a constraint problem. To compute the Result


View Full Document

CORNELL CS 611 - Abstract Interpretation

Download Abstract Interpretation
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Abstract Interpretation and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Abstract Interpretation 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?