CMU CS 15312 - Lecture Notes on Abstract Syntax - D628352

Home> Schools> Carnegie Mellon University> Computer Science (CS) > CS 15312> Lecture Notes on Abstract Syntax

DOC PREVIEW

CMU CS 15312 - Lecture Notes on Abstract Syntax

School name Carnegie Mellon University

Course Cs 15312- Foundations of Programming Languages

Pages 7

This preview shows page 1-2 out of 7 pages.

Save

View full document

Premium Document

Do you want full access? Go Premium and unlock all 7 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 7 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

Premium Document

Do you want full access? Go Premium and unlock all 7 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

Unformatted text preview:

Lecture Notes on Abstract Syntax15-312: Foundations of Programming LanguagesFrank PfenningLecture 3September 7, 2004Grammars, as we have discussed them so far, define a formal languageas a set of strings. We refer to this as the concrete syntax of a language. Whilethis is necessary in the complete definition of a programming language, itis only the beginning. We further have to define at least the static semantics(via typing rules) and the dynamic semantics (via evaluation rules). Thenwe have to reason about their relationship to establish, for example, typesoundness. Giving such definitions and proofs on strings is extremely te-dious and inappropriate; instead we want to give it a more abstract form ofrepresentation. We refer to this layer of representation as the abstract syntaxof a language. An appropriate representation vehicle are terms [Ch. 1].Given this distinction, we can see that parsing is more than simply rec-ognizing if a given string lies within the language defined by a grammar.Instead, parsing in our context should translate a string, given in concretesyntax, into an abstract syntax term. The converse problem of printing(or unparsing) is to translate an abstract syntax term into a string repre-sentation. While the grammar formalism is somewhat unwieldy when itcomes to specifying the translation into abstract syntax, we see that themechanism of judgments is quite robust and can specify both parsing andunparsing quite cleanly.We begin by reviewing the arithmetic expression language in its con-crete [Ch. 3] and abstract [Ch. 4] forms. First, the grammar in its unam-bigous form.1We implement here the decision that addition and multipli-cation should be left-associative (so 1+2+3 is parsed as (1+2)+3) and that1We capitalize the non-terminals to avoid confusion when considering both concrete andabstract syntax in the same judgment. Also, the syntactic category of Terms (denoted by T )should not be confused with the terms we use to construct abstract syntax.LECTURE NOTES SEPTEMBER 7, 2004L3.2 Abstract Syntaxmultiplication has precedence over addition. Such choices are somewhatarbitrary and dictated by convention rather than any scientific criteria.2Digits D : : = 0 | · · · | 9Numbers N : : = D | N DExpressions E : : = T | E+TTerms T : : = F | T *FFactors F : : = N | (E)Written in the form of five judgments.0 D · · · 9 Ds Ds Ns1N s2Ds1s2Ns Ts Es1E s2Ts1+s2Es Fs Ts1T s2Fs1*s2Ts Ns Fs E(s) FThe abstract syntax of the language is much simpler. It can be speci-fied in the form of a grammar, where the universe we are working overare terms and not strings. While natural numbers can also be inductivelydefined in a variety of ways [Ch 1.3], we take them here as primitive math-ematical objects.nat : : = 0 | 1 | · · ·expr : : = num(nat) | plus(expr, expr) | times(expr, expr)Presented as two judgments, we have k nat for every natural number kand the following rule for expressions2The grammar given in [Ch. 3.2] is slightly different, since there addition and multipli-cation are assumed to be right associ ative.LECTURE NOTES SEPTEMBER 7, 2004Abstract Syntax L3.3k natnum(k) exprt1expr t2exprplus(t1, t2) exprt1expr t2exprtimes(t1, t2) exprNow we specify the proper relation between concrete and abstract syn-tax through several simultaneously inductive judgments. Perhaps the eas-iest way to generate these judgments is to add the corresponding abstractsyntax terms to each of the inference rules defining the concrete syntax.0 D ←→ 0 nat · · · 9 D ←→ 9 nats D ←→ k nats N ←→ k nats1N ←→ k1nat s2D ←→ k2nats1s2N ←→ 10k1+ k2nats T ←→ t exprs E ←→ t exprs1E ←→ t1expr s2T ←→ t2exprs1+s2E ←→ plus(t1, t2) exprs F ←→ t exprs T ←→ t exprs1T ←→ t1expr s2F ←→ t2exprs1*s2T ←→ times(t1, t2) exprs N ←→ k nats F ←→ num(k) exprs E ←→ t expr(s) F ←→ t exprWhen giving a specification of the form above, we should verify thatthe basic properties we expect, actually hold. In this case we would liketo check that related strings and terms belong to the correct (concrete orabstract, respectively) syntactic classes.Theorem 1(i) If s D ←→ k nat then s D and k nat.(ii) If s N ←→ k nat then s N and k nat.(iii) If s E ←→ t expr then s E and t expr.(iv) If s T ←→ t expr then s T and t expr.LECTURE NOTES SEPTEMBER 7, 2004L3.4 Abstract Syntax(v) If s F ←→ t expr then s F and t expr.Proof: Part (i) follows by cases (there are 10 cases, one for each digit).Part (ii) follows by rule induction on the given derivation, using (i) inboth cases.Parts (iii), (iv), and (v) follow by simultaneous rule induction on thegiven derivation, using part (ii) in one case. Overall, there are 6 cases. Ineach case we can immediately appeal to the induction hypothesis on allsubderivations and construct a derivation of the desired judgment fromthe results. When implementing such a specification, we generally make a commit-ment as to what is considered our input and what is our output. As mo-tivated above, parsing and unparsing (printing) are specifed by this judg-ment.Definition 2 (Parsing)Given a string s, find a term t such that s E ←→ t expr or fail, if no such texists.Obvious analogous definitions exist for the other syntactic categories.We can further relate parsing into abstract syntax to our definition of thesyntactic categories by ascertaining that if s E then there is an abstract syn-tax term representing it.Theorem 3(i) If s D then there is a k with s D ←→ k nat.(ii) If s N then there is a k with s N ←→ k nat.(iii) If s E then there is a t with s E ←→ t expr.(iv) If s T then there is a t with s T ←→ t expr.(v) If s F then there is a t with s F ←→ t expr.Proof: By cases or straightforward rule induction as in the proof of Theo-rem 1. Now we can refine our notion of ambiguity to take into account the ab-stract syntax that is constructed. This is slightly more relaxed that requiringthe uniqueness of derivations, because different derivations could still leadto the same abstract syntax term.LECTURE NOTES SEPTEMBER 7, 2004Abstract Syntax L3.5Definition 4 (Ambiguity of Parsing)A parsing problem is ambiguous if for a given string s there exist two dis-tinct terms t1and t2such that s E ←→ t1expr and s E ←→ t2expr.Unparsing is just the reverse of parsing: we are given a term t and haveto find a concrete syntax representation for it. Unparsing is usually total(every term can be unparsed) and

View Full Document