DOC PREVIEW
Columbia COMS W4115 - Review for the Midterm

This preview shows page 1-2-3-4-29-30-31-32-59-60-61-62 out of 62 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 62 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 62 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 62 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 62 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 62 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 62 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 62 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 62 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 62 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 62 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 62 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 62 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 62 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

ParsingReview for the MidtermStephen A. EdwardsColumbia UniversityFall 2008The Midterm70 minutes4–5 problemsClosed bookOne sheet of notes of your own devisingComprehensive: Anything discussed in class is fair gameLittle, if any, programming.Details of O’Caml/C/C++/Java syntax not requiredBroad knowledge of languages discussedTopicsStructure of a CompilerScanning and ParsingRegular ExpressionsContext-Free GrammarsBottom-up ParsingASTsName, Scope, and BindingsPart IStructure of a CompilerCompiling a Simple Programint gcd(int a, int b){while (a != b) {if (a > b) a -= b;else b -= a;}return a;}What the Compiler Seesint gcd(int a, int b){while (a != b) {if (a > b) a -= b;else b -= a;}return a;}i n t sp g c d ( i n t sp a , sp in t sp b ) nl { nl sp sp w h i l e sp( a sp ! = sp b ) sp { nl sp sp sp sp if sp ( a sp > sp b ) sp a sp - = sp b; nl sp sp sp sp e l s e sp b sp - = spa ; nl sp sp } nl sp sp r e t u r n spa ; nl } nlText file is a sequence of charactersLexical Analysis Gives Tokensint gcd(int a, int b){while (a != b) {if (a > b) a -= b;else b -= a;}return a;}intgcd ( int a , int b ) { while ( a!= b ) { if ( a > b ) a -= b ; elseb -= a ; } return a ; }A stream of tokens. Whitespace, comments removed.Parsing Gives a n ASTfuncintgcd argsargint aargint bseqwhile!=a bif>a b-=a b-=b areturnaint gcd(int a, int b){while (a != b) {if (a > b) a -= b;else b -= a;}return a;}Abstract syntax tree built from parsing rules.Semantic Anal ysis Resolves SymbolsSymbolTable:int aint bfuncintgcd argsargint aargint bseqwhile!=a bif>a b-=a b-=b areturnaTypes checked; references to symbols resolvedTranslation into 3-Address CodeL0: sne $1, a, bseq $0, $1, 0btrue $0, L1 % while (a != b)sl $3, b, aseq $2, $3, 0btrue $2, L4% if (a < b)sub a, a, b % a -= bjmp L5L4: sub b, b, a % b -= aL5: jmp L0L1: ret aint gcd(int a, int b){while (a != b) {if (a > b) a -= b;else b -= a;}return a;}Idealized a ssembly language w/ infinite registersGene ration of 80386 Assemblygcd: pushl %ebp % Save FPmovl %esp,%ebpmovl 8(%ebp),%eax% Load a from stackmovl 12(%ebp),%edx % Load b from stack.L8: cmpl %edx,%eaxje .L3% while (a != b)jle .L5 % if (a < b)subl %edx,%eax % a -= bjmp .L8.L5: subl %eax,%edx % b -= ajmp .L8.L3: leave % Restore SP, BPretPart IIScanningDescribing TokensAlphabet: A finite set of symbolsExamples: { 0, 1 }, { A, B, C, .. . , Z }, ASCII, UnicodeString: A finite sequence of symbols f rom an alphabetExamples: ǫ (the empty string), Stephen, αβγLanguage: A set of strings over an alphabetExamples: ; (the empty language), { 1, 11, 111, 1111 }, all Englishwords, strings that start with a letter followed by any sequence ofletters and digitsOperations on LanguagesLet L = { ǫ, wo }, M = { man, men }Concatenation: Strings from one followed by the otherLM = { man, men, woman, women }Union: All strings f rom each languageL ∪ M = {ǫ, wo, man, men }Kleene Closure: Zero or more concatenationsM∗= {ǫ} ∪ M ∪ M M ∪ M M M · ·· ={ǫ, man, men, manman, ma nmen, menman, menmen,manmanman, manmanmen, manmenma n, . . . }Regular Expressions over an Alphabet ΣA standard way to express languages for tokens.1. ǫ is a regular expression that denotes {ǫ}2. If a ∈ Σ, a is an RE that denotes {a}3. If r and s denote languages L(r ) and L(s),Ï(r )|(s) denotes L(r ) ∪ L(s)Ï(r )( s) denotes {tu : t ∈ L(r ),u ∈ L(s)}Ï(r )∗denotes ∪∞i=0Li(L0= {ǫ} and Li= LLi−1)Nondeterministic Finite Automata“All strings containing aneven number of 0’s and 1’s”A BCD00001111start1. Set of states S:nA , B , C , Do2. Set of input symbols Σ: {0,1}3. Transition function σ : S × Σǫ→ 2Sstateǫ 0 1A – {B} {C}B– {A} {D}C– {D} {A}D– {C} {B}4. Start state s0: A5. Set of accepting states F :nAoThe Language induced by an NFAAn NFA accepts an input string x iff there is a path from the startstate to an accepting state that “spells out” x.A BCD00001111startShow that the string “010010” isaccepted.A B DCD B A010 010Translating REs into NFAsastartar1r2ifr1r2startr1|r2ifr1r2startǫǫǫǫ(r )∗ifrstartǫǫǫǫTranslating REs into NFAsExample: translate (a|b)∗abb into an NFA0123456 7 8 910ǫǫaǫbǫǫǫ abbǫǫShow that the string “aabb” is accepted.0 1 2 3 6 7 8 9 10ǫǫaǫ ǫ ab bSimulating NFAsProblem: you must follow the “right” arcs to show that a string isaccepted. How do you know which arc is right?Solution: follow them all an d sort it out later.“Two-stack” NFA simulation algorithm:1. Initial states: the ǫ-closure of the start state2. For each character c,ÏNew states: follow all transitions labeled cÏForm the ǫ-closure of the current states3. Accept if any final state is acceptingSimulating an NFA: ·aabb, Start0 12 34 56 7 8 910ǫǫaǫbǫǫǫ abbǫǫSimulating an NFA: ·aabb, ǫ-cl osure0 12 34 56 7 8 910ǫǫaǫbǫǫǫ abbǫǫSimulating an NFA: a·abb0 12 34 56 7 8 910ǫǫaǫbǫǫǫ abbǫǫSimulating an NFA: a·abb, ǫ-closure0 12 34 56 7 8 910ǫǫaǫbǫǫǫ abbǫǫSimulating an NFA: aa·bb0 12 34 56 7 8 910ǫǫaǫbǫǫǫ abbǫǫSimulating an NFA: aa·bb, ǫ-closure0 12 34 56 7 8 910ǫǫaǫbǫǫǫ abbǫǫSimulating an NFA: aab·b0 12 34 56 7 8 910ǫǫaǫbǫǫǫ abbǫǫSimulating an NFA: aab·b, ǫ-closure0 12 34 56 7 8 910ǫǫaǫbǫǫǫ abbǫǫSimulating an NFA: aabb·0 12 34 56 7 8 910ǫǫaǫbǫǫǫ abbǫǫSimulating an NFA: aabb·, Done0 12 34 56 7 8 910ǫǫaǫbǫǫǫ abbǫǫDeterm inist ic Finite AutomataRestricted form of NFAs:ÏNo state has a transition on ǫÏFor each state s and symbol a, there is at most one edgelabeled a leaving s.Differs subtly from the definition used in COMS W3261 (Sipser,Introduction to the Theory of Computation)Very easy to check acceptance: simulate by maintaining currentstate. Accept if you end up on an accepting state. Reject if you endon a non-accepting state or if there is no transition from the currentstate for the next symbol.Determ inist ic Finite Automata{type token = ELSE | ELSEIF}rule token =parse "else" { ELSE }| "elseif" { ELSEIF }els eifDeterm inist ic Finite Automata{ type token = IF | ID of string | NUM of string }rule token =parse "if" { IF }| [’a’-’z’] [’a’-’z’ ’0’-’9’]*as lit { ID(lit) }| [’0’-’9’]+ as num { NUM(num) }ID IFIDNUMifa-z0-9a-eg-z0-9a-z0-9a-hj-z0-90-9Building a DFA from an NFASubset construction algorithmSimulate the NFA for all possible inputs and track the states thatappear.Each unique state during simulation becomes a state in the DFA.Subset construction for (a|b)∗abb (1)abSubset


View Full Document

Columbia COMS W4115 - Review for the Midterm

Documents in this Course
YOLT

YOLT

13 pages

Lattakia

Lattakia

15 pages

EasyQL

EasyQL

14 pages

Photogram

Photogram

163 pages

Espresso

Espresso

27 pages

NumLang

NumLang

6 pages

EMPATH

EMPATH

14 pages

La Mesa

La Mesa

9 pages

JTemplate

JTemplate

238 pages

MATVEC

MATVEC

4 pages

TONEDEF

TONEDEF

14 pages

SASSi

SASSi

16 pages

JTemplate

JTemplate

39 pages

BATS

BATS

10 pages

Synapse

Synapse

11 pages

c.def

c.def

116 pages

TweaXML

TweaXML

108 pages

Load more
Download Review for the Midterm
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Review for the Midterm and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Review for the Midterm 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?