Review for the Final COMS W4115 Prof Stephen A Edwards Fall 2007 Columbia University Department of Computer Science The Final 70 minutes 4 5 problems Closed book One single sided 8 5 11 sheet of notes of your own devising Comprehensive Anything discussed in class is fair game Little if any programming Ability to write ANTLR C Java Prolog ML syntax not required Broad knowledge of languages discussed Topics 1 Structure of a Compiler Scripting Languages Scanning and Parsing Regular Expressions Context Free Grammars Top down Parsing Bottom up Parsing ASTs Name Scope and Bindings Control flow constructs Topics 2 Types Static Semantic Analysis Code Generation Functional Programming ML Lambda Calculus Logic Programming Prolog Compiling a Simple Program int gcd int a int b while a b if a b a b else b a return a What the Compiler Sees int gcd int a int b while a b if a b a b else b a return a i n t sp g c d i n t sp a sp i n t sp b nl nl sp sp w h i l e sp a sp sp b sp nl sp sp sp sp i f sp a sp sp b sp a sp sp b nl sp sp sp sp e l s e sp b sp sp a nl sp sp nl sp sp r e t u r n sp a nl nl Text file is a sequence of characters Lexical Analysis Gives Tokens int gcd int a int b while a b if a b a b else b a return a int gcd b else b int a int b while if a b a a return a b A stream of tokens Whitespace comments removed a Parsing Gives an AST func int gcd args arg int seq arg a int int gcd int a int b while a b if a b a b else b a return a while b return if a b a a b a b Abstract syntax tree built from parsing rules b a Semantic Analysis Resolves Symbols func int gcd args arg int Symbol Table int a a seq arg int while b return if a b a a b a b int b Types checked references to symbols resolved b a Translation into 3 Address Code L0 sne seq btrue sl seq btrue sub jmp L4 sub L5 jmp L1 ret 1 0 0 3 2 2 a L5 b L0 a a 1 L1 b 3 L4 a b 0 while a b a 0 if a b b a b int gcd int a int b while a b if a b a b else b a return a b a b a Idealized assembly language w infinite registers Generation of 80386 Assembly gcd L8 L5 L3 pushl movl movl movl cmpl je jle subl jmp subl jmp leave ret ebp Save frame pointer esp ebp 8 ebp eax Load a from stack 12 ebp edx Load b from stack edx eax L3 while a b L5 if a b edx eax a b L8 eax edx b a L8 Restore SP BP Scanning and Automata Describing Tokens Alphabet A finite set of symbols Examples 0 1 A B C Z ASCII Unicode String A finite sequence of symbols from an alphabet Examples the empty string Stephen Language A set of strings over an alphabet Examples the empty language 1 11 111 1111 all English words strings that start with a letter followed by any sequence of letters and digits Operations on Languages Let L wo M man men Concatenation Strings from one followed by the other LM man men woman women Union All strings from each language L M wo man men Kleene Closure Zero or more concatenations M M M M M M M man men manman manmen menman menmen manmanman manmanmen manmenman Regular Expressions over an Alphabet A standard way to express languages for tokens 1 is a regular expression that denotes 2 If a a is an RE that denotes a 3 If r and s denote languages L r and L s r s denotes L r L s r s denotes tu t L r u L s i L0 and Li LLi 1 r denotes L i 0 Nondeterministic Finite Automata n All strings A B C D 1 Set of states S containing an 2 Set of input symbols 0 1 even number of 3 Transition function S 2S 0 s and 1 s state 0 1 A B C 0 start A B A D B 0 C D A 1 1 1 1 C B D 0 C D 4 Start state s0 A 0 5 Set of accepting states F n A o o The Language induced by an NFA An NFA accepts an input string x iff there is a path from the start state to an accepting state that spells out x 0 start A B 0 1 1 1 1 0 Show that the string C D 010010 is accepted 0 A 0 B 1 D 0 C 0 D 1 B 0 A Translating REs into NFAs a r1 r2 r1 r2 start i start start a r1 r2 r1 i f f r2 r start i r f Translating REs into NFAs Example translate a b abb into an NFA a 0 1 3 2 b 6 b a 7 8 b 9 10 5 4 Show that the string aabb is accepted 0 1 2 a 3 6 7 a 8 b 9 b 10 Simulating NFAs Problem you must follow the right arcs to show that a string is accepted How do you know which arc is right Solution follow them all and sort it out later Two stack NFA simulation algorithm 1 Initial states the closure of the start state 2 For each character c New states follow all transitions labeled c Form the closure of the current states 3 Accept if any final state is accepting Simulating an NFA aabb Start a 0 1 3 2 b 5 4 a 6 7 b 8 b 9 10 Simulating an NFA aabb closure a 0 1 3 2 b 5 4 a 6 7 b 8 b 9 10 Simulating an NFA a abb a 0 1 3 2 b 5 4 a 6 7 b 8 b 9 10 Simulating an NFA a abb closure a 0 1 3 2 b 5 4 a 6 7 b 8 b 9 10 Simulating an NFA aa bb a 0 1 3 2 b 5 4 a 6 7 b 8 b 9 10 Simulating an NFA aa bb closure a 0 1 3 2 b 5 4 a 6 7 b 8 b 9 10 Simulating an NFA aab b a 0 1 3 2 b 5 4 a 6 7 b 8 b 9 10 Simulating an NFA aab b closure a 0 1 3 2 b 5 4 a 6 7 b 8 b 9 10 Simulating an NFA aabb a 0 1 3 2 b 5 4 a 6 …
View Full Document
Unlocking...