Context-sensitive AnalysisBeyond Syntax There is a level of correctness that is deeper than grammar fie(a,b,c,d) int a, b, c, d; { … } fee() { int f[3],g[0], h, i, j, k; char *p; fie(h,i,“ab”,j, k); k = f * i + j; h = g[17]; printf(“<%s,%s>.\n”, p,q); p = 10; } What is wrong with this program? (let me count the ways …)Beyond Syntax There is a level of correctness that is deeper than grammar To generate code, we need to understand its meaning ! fie(a,b,c,d) int a, b, c, d; { … } fee() { int f[3],g[0], h, i, j, k; char *p; fie(h,i,“ab”,j, k); k = f * i + j; h = g[17]; printf(“<%s,%s>.\n”, p,q); p = 10; } What is wrong with this program? (let me count the ways …) • declared g[0], used g[17] • wrong number of args to fie() • “ab” is not an int • wrong dimension on use of f • undeclared variable q • 10 is not a character string All of these are “deeper than syntax”Beyond Syntax To generate code, the compiler needs to answer many questions • Is “x” a scalar, an array, or a function? Is “x” declared? • Are there names that are not declared? Declared but not used? • Which declaration of “x” does each use reference? • Is the expression “x * y + z” type-consistent? • In “a[i,j,k]”, does a have three dimensions? • Where can “z” be stored? (register, local, global, heap, static) • How many arguments does “fie()” take? What about “printf ()” ? • Does “*p” reference the result of a “malloc()” ? • Do “p” & “q” refer to the same memory location? • Is “x” defined before it is used? These are beyond a CFGBeyond Syntax These questions are part of context-sensitive analysis • Questions & answers involve non-local information • Answers may involve computation How can we answer these questions? • Use formal methods → Attribute grammars? Also known as attributed CFG or syntax-directed definitions • Use ad-hoc techniques → Symbol tables → Ad-hoc code (action routines) In scanning & parsing, formalism won; different story here.Beyond Syntax Telling the story • The attribute grammar formalism is important → Succinctly makes many points clear → Sets the stage for actual, ad-hoc practice • The problems with attribute grammars motivate practice → Non-local computation → Need for centralized information • Some folks still argue for attribute grammars → In practice, ad-hoc techniques used We will cover attribute grammars, then move on to ad-hoc ideasAttribute Grammars What is an attribute grammar? • A context-free grammar augmented with a set of rules • Each symbol in the derivation has a set of values, or attributes → X.a denotes the value of a at a particular parse-tree node labeled X • The rules specify how to compute a value for each attribute Example grammar This grammar describes signed binary numbers: +101, -11, +10101, but not 101 We would like to augment it with rules that compute the decimal value of each valid input string Example: parse -101 and compute -5Examples We will use these two throughout the lecture Number → Sign List → – List → – Bit → – 1 Number List Bit 1 Sign – For “–1” Number → Sign List → Sign List Bit → Sign List 1 → Sign List Bit 1 → Sign List 1 1 → Sign Bit 0 1 → Sign 1 0 1 → – 101 Number List Sign – Bit 1 List Bit 0 List Bit 1 For “–101”Attribute Grammars Add rules to compute the decimal value of a signed binary number Symbol AttributesNumbervalSignnegListpos, valBitpos, valBack to the Examples Number List Bit 1 Sign – neg ← true Bit.pos ← 0 Bit.val ← 2Bit.pos ≡ 1 List.pos ← 0 List.val ← Bit.val ≡ 1 Number.val ← – List.val ≡ –1 For “–1” One possible evaluation order: 1 List.pos 2 Sign.neg 3 Bit.pos 4 Bit.val 5 List.val 6 Number.val Other orders are possible Knuth suggested a data-flow model for evaluation • Independent attributes first • Others in order as input values become available Rules + parse tree imply an attribute dependence graph Evaluation order must be consistent with the attribute dependence graphBack to the Examples This is the complete attribute dependence graph for “–101”. It shows the flow of all attribute values in the example. Some flow downward → inherited attributes Some flow upward → synthesized attributes A rule may use attributes in the parent, children, or siblings of a node Number Sign – List Bit 1 List Bit 0 List Bit 1 pos: 0 val: 1 pos: 2 val: 4 pos: 1 val: 0 pos: 2 val: 4 pos: 1 val: 4 pos: 0 val: 5 val: –5 neg: true For “–101”The Rules of the Game • Attributes associated with nodes in parse tree • Rules are value assignments associated with productions • Attribute is defined once, using local information • Label identical terms in production for uniqueness • Rules & parse tree define an attribute dependence graph → Graph must be non-circular This produces a high-level, functional specification Synthesized attribute → Depends on values from children Inherited attribute → Depends on values from siblings & parentUsing Attribute Grammars Attribute grammars can specify context-sensitive actions • Take values from syntax • Perform computations with values • Insert tests, logic, … We want to use both kinds of attribute Synthesized Attributes • Use values from children & from constants • S-attributed grammars • Evaluate in a single bottom-up pass Good match to LR parsing Inherited Attributes • Use values from parent, constants, & siblings • directly express context • can rewrite to avoid them • Thought to be more natural Not easily done at parse timeEvaluation Methods Dynamic, dependence-based methods • Build the parse tree • Build the dependence graph • Topological sort the dependence graph • Define attributes in topological order Rule-based methods (treewalk) • Analyze rules at compiler-generation time • Determine a fixed (static) ordering • Evaluate nodes in that order Oblivious methods (passes, dataflow) •
View Full Document