DOC PREVIEW
UA CSC 453 - Concrete and Abstract Grammars

This preview shows page 1-2-3-4-5-6 out of 17 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 17 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 17 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 17 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 17 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 17 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 17 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 17 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

CSc 453 — Compilers and Systems Software5 : Concrete and Abstract GrammarsChristian CollbergDepartment of Computer ScienceUniversity of [email protected]° 2009 Christian CollbergSeptember 9, 20091 Syntax• Thesyntax of a language (formal or natural) is the way the words in a sentence/program can bearranged.• eats dog bone the is not a legal arrangement of words in English.• = y x + 5 is not a legal arrangement of tokens in Java.• Somehow, we need to describe what constitutes legal and illegal sentences in a particular language.• We useproduction rules to describe the syntax of a language.2 Production Rules• Here’s a production rule:IfStat → if ( expr ) stat• This rule states that to construct an if-statement in C you have to type1. an if, then2. a (, then3. some sort of expression, then4. a ), then finally5. some sort of statement.3 A Grammar for English• Agrammar can be used for1. sentence generation (i.e. which sentences does this grammar generate?), or2. parsing (i.e. is sentence S generated by this grammar?).• Let’s look at a simple grammar for a fragment of English.14 Syntactic CategoriesS [Sentence] John likes Sarah’s black hairN [Noun] John, hairV [Verb] eating, satAdj [Adjective] black, longDet [Determiner] the, a, everyNP [Noun Phrase] Sarah’s long black hairVP [Verb Phrase]eating apples5 A Simple English GrammarS → NP VPVP → V NPVP → VNP → NNP → Det NN → JohnN → LisaN → houseV → diedV → kissedDet → theDet → a• S, NP, VP, N, Det, V arenon-terminal symbols.• John, Lisa, house, died,... areterminal symbols.• S is thestart symbol.6 Sentence Generation1. Start with the start sym-bol.2. Pick a non-terminal X onthe right hand side.3. Pick a grammar rule X →γ.4. Replace X with γ.5. Repeat until left with astring of words.SS→NP VP=⇒ NP VPNP→N=⇒ N VPN→John=⇒ John VPVP→V NP=⇒ John V NPV→kissed=⇒ John kissed NPNP→N=⇒ John kissed NN→Lisa=⇒ John kissed Lisa7 Terminology• A grammar is a 4-tuple(non-terminals, terminals, productions, start-symbol)2or(N, Σ, P, S)• A production is of the form α → β where α, β are taken from NSΣ.• Read α → β as “rewrite α with β”.• Read ⇒ as “directly derives”.• Readr⇒ as “directly derives using rule r”.• Read∗⇒ as “derives in one or more steps”.8 A Simple PL GrammarHere’s a grammar for a simple programming language:Program ::= BEGIN Stat ENDStat ::= ident := ExprExpr::= Expr + Expr |Expr * Expr |ident | number• We write terminal symbols like this.• We write non-terminal symbols likethis.• Sometimes we write ::= instead of →.• A → b | c is the same as A → b; A → c. Read | as “or”.9 A Simple PL Grammar. . .We know the sentenceBEGIN a := 5 + 4 * 3 ENDis in the language because we can derive it from the start symbol:Program ⇒ BEGIN Stat END⇒ BEGIN ident := Expr END⇒ BEGIN "a" := Expr END⇒ BEGIN "a" := Expr + Expr END⇒ BEGIN "a" := 5 + Expr END⇒ BEGIN "a" := 5 + Expr * Expr END⇒ BEGIN "a" := 5 + 4 * Expr END⇒ BEGIN "a" := 5 + 4 * 3 END310 Terminology. . .• Our English grammar is the 4-tuple({S,NP,V,. . . },{John,house,died,. . . },{S → NP VP, VP → V,. . . },S)• Our PL grammar is the 4-tuple({Program,Stat,. . . },{BEGIN,:=,*,. . . },{ Program ::= BEGIN Stat END,. . . },Program)11 Parse Trees• We often want to show how a particular sentence was derived. We can do this without listing all thesteps explicitly by drawing aparse tree.• A parse tree is a tree where1. The root is labeled by the start symbol.2. Each leaf is labeled by a terminal symbol.3. Each interior node is labeled by a non-terminal symbol.12 Parse Trees. . .• If one step of our derivation is· · · A · · · ⇒ · · · X Y Z · · ·(i.e, we used the rule A → XY Z) then we’ll get a parse (sub-)tree............AXYZ13 Parse Trees. . .SS→NP VP=⇒ NP VPNP→N=⇒ N VPN→John=⇒ John VPVP→V NP=⇒ John V NPV→kissed=⇒ John kissed NPNP→N=⇒ John kissed NN→Lisa=⇒ John kissed LisaLisaNPSVPNVNPNkissedJohn414 Parse Trees. . .Program ⇒ BEGIN Stat END⇒ BEGIN ident := Expr END⇒ BEGIN "a" := Expr END⇒ BEGIN "a" := Expr + Expr END⇒ BEGIN "a" := 5 + Expr END⇒ BEGIN "a" := 5 + Expr * Expr END⇒ BEGIN "a" := 5 + 4 * Expr END⇒ BEGIN "a" := 5 + 4 * 3 END+ident"a"Expr5Expr4Expr3Expr*ExprEND:=BEGIN StatProgram15Regular Grammars and Lexical Analysis16 Regular Grammars• A grammar isregular if all rules are of the formA → aBA → a• By convention, the symbols A, B, C, . . . are non-terminals, a, b, c, . . . are terminals, and α, β, γ, . . . arestrings of symbols.• Regular grammars are used to describe the lexical structure of programs, i.e. what tokens look like.17Parsing and the Definition of Syntax18 Context-Free Grammars• Programming language syntax is described by acontext free grammar (CFG).• In a CFG all rules are of the formA → γγ is any sequence of terminals or non-terminals. A is a single non-terminal.• Example: an if-statement consists of an if-token, expression, then-token, statement, and (maybe) anelse-token followed by a statement.519 EBNF• BNF is Backus-Naur Form, a way to write CFGs. EBNF (Extended BNF) is a more expressive wayto write CFGs.• Repetition and choice are common structures in a language (and hence, its grammar).• Repetition:int x,y,z,w,....;• Choice:class C { ... }class C extends D { ... }20 EBNF. . .• In BNF, our variable declarationint x,y,z,w,....;looks like this:vars ::= ident ident idlist ;idlist ::= , ident idlist | ǫ• In EBNF, it looks like this:vars ::= ident ident { , ident } ;• I.e. {e} means that e is repeated 0 or more times.21 EBNF. . .• In BNF, our class declarationclass C extends D { ... }looks like this:class ::= class ident extends { . . . }extends ::= extends ident | ǫ• In EBNF, it looks like this:class ::= class ident [extends ident] { . . . }• I.e. [e] means that e is optional.622 EBNF for Lucaprogram ::=PROGRAM ident ; decl list block .decl list ::={declaration ; }declaration ::=VAR ident : ident |TYPEident = RECORD [ [field list] ] |TYPE ident = ARRAY expression OF ident |CONST ident : ident = expression |PROCEDUREident ( [formal list] ) decl list block ;23 EBNF for Luca. . .field list ::= field decl {; field decl }field decl ::= ident : identformal list ::= formal param {; formal param }formal param ::= [VAR] ident : identactual list ::= expression {, expression }block ::= BEGIN


View Full Document

UA CSC 453 - Concrete and Abstract Grammars

Download Concrete and Abstract Grammars
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Concrete and Abstract Grammars and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Concrete and Abstract Grammars 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?