1Analysis of programs with pointersSimple example• What are the dependences in this program?• Problem: just looking at variable names will not give you the correct information– After statement S2, program names “x” and “*ptr” are both expressions that refer to the same memory location.– We say that ptr points-to x after statement S2.• In a C-like language that has pointers, we must know the points-to relation to be able to determine dependences correctlyx := 5ptr := @x*ptr := 9y := xS1S2S3S4dependencesprogramProgram model• For now, only types are int and int*• No heap– All pointers point to only to stack variables• No procedure or function calls• Statements involving pointer variables:– address: x := &y– copy: x := y– load: x := *y– store: *x := y• Arbitrary computations involving intsPoints-to relation• Directed graph:– nodes are program variables– edge (a,b): variable a points-to variable b• Can use a special node to represent NULL• Points-to relation is different at different program points x ptry2• Out-degree of node may be more than one– if points-to graph has edges (a,b) and (a,c), it means that variable a may point to either b or c– depending on how we got to that point, one or the other will be true – path-sensitive analyses: track how you got to a program point (we will not do this)Points-to graphif (p) then x := &y else x := &z…..px := &yx := &zWhat does x point to here?Ordering on points-to relation• Subset ordering: for a given set of variables– Least element is graph with no edges– G1 <= G2 if G2 has all the edges G1 has and maybe some more• Given two points-to relations G1 and G2– G1 U G2: least graph that contains all the edges in G1 and in G2Overview• We will look at three different points-to analyses.• Flow-sensitive points-to analysis– Dataflow analysis– Computes a different points-to relation at each point in program• Flow-insensitive points-to analysis– Computes a single points-to graph for entire program– Andersen’s algorithm• Natural simplification of flow-sensitive algorithm– Steensgard’s algorithm• Nodes in tree are equivalence classes of variables– if x may point-to either y or z, put y and z in the same equivalence class• Points-to relation is a tree with edges from children to parents ratherthan a general graph• Less precise than Andersen’s algorithm but fasterExamplex := &z ptr := @xy := @wptr := @yptr x z ywptr x z y wptrx,yz,wFlow-sensitive algorithmAndersen’s algorithmSteensgard’s algorithm3Notation• Suppose S and S1 are set-valued variables.•S Å S1: strong update– set assignment•S UÅ S1: weak update– set union: this is like S Å S U S1Flow-sensitive algorithmDataflow equations• Forward flow, any path analysis• Confluence operator: G1 U G2• Statementsx := &yGG’ = G with pt’(x) Å {y}x := yGG’ = G with pt’(x) Å pt(y)x := *yGG’ = G with pt’(x) Å U pt(a)for all a in pt(y) *x := yGG’ = G with pt’(a) UÅ pt(y)for all a in pt(x)Dataflow equations (contd.)x := &yGG’ = G with pt’(x) Å {y}x := yGG’ = G with pt’(x) Å pt(y)x := *yGG’ = G with pt’(x) Å U pt(a)for all a in pt(y) *x := yGG’ = G with pt’(a) UÅ pt(y)for all a in pt(x)strong updatesweak update (why?)4Strong vs. weak updates• Strong update:– At assignment statement, you know precisely which variable is being written to– Example: x := ….– You can remove points-to information about x coming into the statement in the dataflow analysis.• Weak update:– You do not know precisely which variable is being updated; only that it is one among some set of variables.– Example: *x := …– Problem: at analysis time, you may not know which variable x points to (see slide on control-flow and out-degree of nodes)– Refinement: if out-degree of x in points-to graph is 1 and x is known not be nil, we can do a strong update even for *x := …Structures• Structure types– struct cell {int value; struct cell *left, *right;}– struct cell x,y;• Use a “field-sensitive” model– x and y are nodes– each node has three internal fields labeled value, left, right• This representation permits pointers into fields of structures– If this is not necessary, we can simply have a node for each structure and label outgoing edges with field nameExampleint main(void){ struct cell {int value;struct cell *next;};struct cell x,y,z,*p;int sum;x.value = 5;x.next = &y;y.value = 6;y.next = &z; z.value = 7;z.next = NULL;p = &x;sum = 0;while (p != NULL) {sum = sum + (*p).value;p = (*p).next;}return sum;}xyzpnextvaluenextvaluenextvaluexyzpnextvaluenextvaluenextvalueNULLNULLFlow-insensitive algorithms5Flow-insensitive analysis• Flow-sensitive analysis computes a different graph at each program point.• This can be quite expensive.• One alternative: flow-insensitive analysis– Intuition:compute a points-to relation which is the least upper bound of all the points-to relations computed by the flow-sensitive analysis• Approach:– Ignore control-flow– Consider all assignment statements together• replace strong updates in dataflow equations with weak updates– Compute a single points-to relation that holds regardless of the order in which assignment statements are actually executedAndersen’s algorithm• Statementsx := &yGG = G with pt(x) UÅ {y}x := yGG = G with pt(x) UÅ pt(y)x := *yGG = G with pt(x) UÅ pt(a)for all a in pt(y) *x := yGG = G with pt(a) UÅ pt(y)for all a in pt(x)weak updates onlyExampleint main(void){ struct cell {int value;struct cell *next;};struct cell x,y,z,*p;int sum;x.value = 5;x.next = &y;y.value = 6;y.next = &z; z.value = 7;z.next = NULL;p = &x;sum = 0;while (p != NULL) {sum = sum + (*p).value;p = (*p).next;}return sum;}x.next = &y;y.next = &z; z.next = NULL;p = &x;p = (*p).next;Assignments for flow-insensitive analysisG...Solution to flow-insensitive equationsxyzpnextvaluenextvaluenextvalueNULL- Compare with points-to graphs for flow-sensitive solution- Why does p point-to NULL in this graph?6Andersen’s algorithm formulated using set constraints• Statements)(xpty∈x := &yx := yx := *y*x := y)()( yptxpt ⊇var2var: →pt)()().( yptaptxpta ⊇∈∀)()().( aptxptypta ⊇∈∀Steensgard’s algorithm• Flow-insensitive• Computes a points-to graph in which there is no fan-out– In points-to graph produced by Andersen’s algorithm, if x points-to y and z, y
View Full Document