This preview shows page 1-2-3-4-5-6-7-8-53-54-55-56-57-58-59-108-109-110-111-112-113-114-115 out of 115 pages.
Review for the FinalCOMS W4115Prof. Stephen A. EdwardsSpring 2007Columbia UniversityDepartment of Computer ScienceThe Final70 minutes4–5 problemsClosed bookOne single-sided 8.5 × 11 sheet of notes of your owndevisingComprehensive: Anything discussed in class is fair gameLittle, if any, programming.You should understand ANTLR/C/Java/ML/PrologYou do not need to know details of syntaxTopics 1Structure of a CompilerScripting LanguagesScanning and ParsingRegular ExpressionsContext-Free GrammarsTop-down ParsingBottom-up ParsingASTsName, Scope, and BindingsControl-flow constructsTopics 2TypesStatic Semantic AnalysisCode GenerationFunctional Programming (ML, Lambda Calculus)Logic Programming (Prolog)Compiling a Simple Programint gcd(int a, int b){while (a != b) {if (a > b) a -= b;else b -= a;}return a;}What the Compiler Seesint gcd(int a, int b){while (a != b) {if (a > b) a -= b;else b -= a;}return a;}i n t sp g c d ( i n t sp a , sp in t sp b ) nl { nl sp sp w h i l e sp( a sp ! = sp b ) sp { nl sp sp sp sp if sp ( a sp > sp b ) sp a sp - = sp b; nl sp sp sp sp e l s e sp b sp - = spa ; nl sp sp } nl sp sp r e t u r n spa ; nl } nlText file is a sequence of charactersLexical Analysis Gives Tokensint gcd(int a, int b){while (a != b) {if (a > b) a -= b;else b -= a;}return a;}int gcd ( int a , int b ) – while ( a!= b ) – if ( a > b ) a -= b ;else b -= a ; ˝ return a ; ˝A stream of tokens. Whitespace, comments removed.Parsing Gives an ASTfuncintgcd argsargint aargint bseqwhile!=a bif>a b-=a b-=b areturnaint gcd(int a, int b){while (a != b) {if (a > b) a -= b;else b -= a;}return a;}Abstract syntax tree built from parsing rules.Semantic Analysis ResolvesSymbolsSymbolTable:int aint bfuncintgcd argsargint aargint bseqwhile!=a bif>a b-=a b-=b areturnaTypes checked; references to symbols resolvedTranslation into 3-Address CodeL0: sne $1, a, bseq $0, $1, 0btrue $0, L1 % while (a != b)sl $3, b, aseq $2, $3, 0btrue $2, L4% if (a < b)sub a, a, b % a -= bjmp L5L4: sub b, b, a% b -= aL5: jmp L0L1: ret aint gcd(int a, int b){while (a != b) {if (a > b) a -= b;else b -= a;}return a;}Idealized assembly language w/ infinite registersGeneration of 80386 Assemblygcd: pushl %ebp % Save frame pointermovl %esp,%ebpmovl 8(%ebp),%eax % Load a from stackmovl 12(%ebp),%edx % Load b from stack.L8: cmpl %edx,%eaxje .L3 % while (a != b)jle .L5 % if (a < b)subl %edx,%eax % a -= bjmp .L8.L5: subl %eax,%edx % b -= ajmp .L8.L3: leave% Restore SP, BPretScanning and AutomataDescribing TokensAlphabet: A finite set of symbolsExamples: { 0, 1 }, { A, B, C, ..., Z }, ASCII, UnicodeString: A finite sequence of symbols from an alphabetExamples: ǫ (the empty string), Stephen, αβγLanguage: A set of strings over an alphabetExamples: ∅ (the empty language), { 1, 11, 111, 1111 },all English words, strings that start with a letter followed byany sequence of letters and digitsOperations on LanguagesLet L = { ǫ, wo }, M = { man, men }Concatenation: Strings from one followed by the otherLM = { man, men, woman, women }Union: All strings from each languageL ∪ M = {ǫ, wo, man, men }Kleene Closure: Zero or more concatenationsM∗= {ǫ, M, MM, MMM, . . .} ={ǫ, man, men, manman, manmen, menman, menmen,manmanman, manmanmen, manmenman, ...}Regular Expressions over anAlphabet ΣA standard way to express languages for tokens.1. ǫ is a regular expression that denotes {ǫ}2. If a ∈ Σ, a is an RE that denotes {a}3. If r and s denote languages L(r) and L(s),•(r)|(s) denotes L(r) ∪ L(s)•(r)(s) denotes {tu : t ∈ L(r), u ∈ L(s)}•(r)∗denotes ∪∞i=0Li(L0= ∅ and Li= LLi−1)Nondeterministic Finite Automata“All stringscontaining aneven number of0’s and 1’s”A BCD00001111start1. Set of states S:nA , B , C , Do2. Set of input symbols Σ: {0, 1}3. Transition function σ : S × Σǫ→ 2Sstateǫ 0 1A – {B} {C}B– {A} {D}C– {D} {A}D– {C} {B}4. Start state s0:A5. Set of accepting states F :nAoThe Language induced by an NFAAn NFA accepts an input string x iff there is a path fromthe start state to an accepting state that “spells out” x.A BC D00001111startShow that the string“010010” is accepted.A B D C D B A010 010Translating REs into NFAsastartar1r2ifr1r2startr1|r2ifr1r2startǫǫǫǫ(r)∗ifrstartǫǫǫǫTranslating REs into NFAsExample: translate (a|b)∗abb into an NFA0123456 7 8 9 10ǫǫaǫbǫǫǫ ab bǫǫShow that the string “aabb” is accepted.01 23 6 7 8 9 10ǫǫaǫǫ ab bSimulating NFAsProblem: you must follow the “right” arcs to show that astring is accepted. How do you know which arc is right?Solution: follow them all and sort it out later.“Two-stack” NFA simulation algorithm:1. Initial states: the ǫ-closure of the start state2. For each character c,•New states: follow all transitions labeled c•Form the ǫ-closure of the current states3. Accept if any final state is acceptingSimulating an NFA: ·aabb, Start012345678 9 10ǫǫaǫbǫǫǫab bǫǫSimulating an NFA: ·aabb, ǫ-closure012345678 9 10ǫǫaǫbǫǫǫab bǫǫSimulating an NFA: a·abb012345678 9 10ǫǫaǫbǫǫǫab bǫǫSimulating an NFA: a·abb, ǫ-closure012345678 9 10ǫǫaǫbǫǫǫab bǫǫSimulating an NFA: aa·bb012345678 9 10ǫǫaǫbǫǫǫab bǫǫSimulating an NFA: aa·bb, ǫ-closure012345678 9 10ǫǫaǫbǫǫǫab bǫǫSimulating an NFA: aab·b012345678 9 10ǫǫaǫbǫǫǫab bǫǫSimulating an NFA: aab·b, ǫ-closure012345678 9 10ǫǫaǫbǫǫǫab bǫǫSimulating an NFA: aabb·012345678 9 10ǫǫaǫbǫǫǫab bǫǫSimulating an NFA: aabb·, Done012345678 9 10ǫǫaǫbǫǫǫab bǫǫDeterministic Finite AutomataRestricted form of NFAs:•No state has a transition on ǫ•For each state s and symbol a, there is at most oneedge labeled a leaving s.Differs subtly from the definition used in COMS W3261(Sipser, Introduction to the Theory of Computation)Very easy to check acceptance: simulate by maintainingcurrent state. Accept if you end up on an accepting state.Reject if you end on a non-accepting state or if there is notransition from the current state for the next symbol.Deterministic Finite AutomataELSE: "else" ;ELSEIF: "elseif" ;els eifDeterministic Finite AutomataIF: "if" ;ID: ’a’..’z’ (’a’..’z’ | ’0’..’9’)*;NUM: (’0’..’9’)+ ;ID IFID IDNUM NUMifa-z0-9a-eg-z0-9a-z90-9a-hj-za-z0-90-90-90-9Building a DFA from an NFASubset construction algorithmSimulate the NFA for all possible inputs and track thestates that appear.Each unique state during simulation becomes a state inthe DFA.Subset construction
View Full Document