DOC PREVIEW
UA CSC 453 - Study Notes

This preview shows page 1-2-3-4-5 out of 14 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 14 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 14 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 14 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 14 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 14 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 14 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

CSc 453Compilers and Systems Software4 : Lexical Analysis IIDepartment of Computer ScienceUniversity of [email protected]° 2009 Christian CollbergImplementing AutomataNFAs and DFAscan be hard-codedusing this pattern:state := start statec := first charwhile (true) {case state of {1: case c of {char1: {c := nextChar();state :=new state; }2: case c of {char2: {c := nextChar();state :=new state; }char3: {return; /* accept */}}}Implementing Automata. . .We can also encode the transitions directly into a transitiontable:next statestatechar1char2other Accepting1 22 2 2 [3]3√States in brackets don’t consume their inputs. Acceptingstates are indicated by a√. Empty entries represent errorstates.Implementing Automata. . .Given the table, we can write an interpreter to perform lexicalanalysis ofany DFA:state := 1c :=first charwhile not ACCEPT[state] do {newstate := NEXTSTATE[state,c]if ADVANCE[state,c] thenc := nextChar()state := newstate}if ACCEPT[state] then accept;Table-driven C Comments*3401all charsexcept *all charsexcept *,/2*/*/state / * other Accepting0 11 22 2 3 23 4 3 24√Table-driven C Comments. . .class Comments {public static final int SLASH = 0;public static final int STAR = 1;public static final int OTHER = 2;public static final int END = 3;static int[][] NEXTSTATE = {// "/" "*" other{ 1, -1, -1},{-1, 2, -1},{ 2, 3, 2},{ 4, 3, 2},{-1, -1, -1}};Table-driven C Comments. . .static boolean[] ACCEPT ={false,false,false,false,true};static boolean[][] ADVANCE = {// "/" "*" other{true, true, true},{true, true, true},{true, true, true},{true, true, true},{true, true, true}};Table-driven C Comments. . .static String input;static int current = -1;static int nextChar() {int ch;current++;if (current >= input.length()) return END;switch (input.charAt(current)) {case ’/’ : { ch = SLASH; break;}case ’*’ : { ch = STAR; break;}default : { ch = OTHER; break;}}return ch;}Table-driven C Comments. . .public static boolean interpret () {int state = 0;int c = nextChar();while ((c != END) && (state>=0) && !ACCEPT[state])int newstate = NEXTSTATE[state][c];if (ADVANCE[state][c])c = nextChar();state = newstate;}return (state>=0) && ACCEPT[state];}public static void main (String[] args) {input = args[0];boolean result = interpret();}Hard-coded C Comments*3401all charsexcept *all charsexcept *,/2*/*/Let’s do the same thing again, but this time we will hard-codethe interpreter using switch-statements.nextChar and the constant declarations are the same as forthe previous program.Hard-coded C Comments. . .class Comments {// Declarations of SLASH,STAR,OTHER,END, and nextChar().public static boolean interpret() {int state = 0;int ch = nextChar();while(true) {switch (state) {case -1 :return false;case 0 :switch (ch) {case SLASH:ch=nextChar();state=1;break;default :return false;}break;case 1 :switch (ch) {case STAR: ch=nextChar(); state=2;break;default : return false;}break;case 2 :switch (ch) {case SLASH: ch=nextChar(); state=2;break;case STAR : ch=nextChar(); state=3;break;case OTHER: ch=nextChar(); state=2;break;default : return false;}break;Hard-coded C Comments. . .case 3 :switch (ch) {case SLASH: ch=nextChar(); state=4;break;case STAR : ch=nextChar(); state=3;break;case OTHER: ch=nextChar(); state=2;break;default : return false;}break;case 4 :return (ch == END);}}}From REs to NFAsFrom REs to NFAsWe will describe our tokens using REs, convert these to anNFA, convert this to a DFA, and finally code this into aprogram or a table to be interpreted:NFADFAREprogramtableinterpreterWe will next show how to construct an NFA from a regularexpression. This algorithm is called Thompson’s Construction(after Ken Thompson of Bell Labs).Thompson’s ConstructionEach piece of a regular expression is turned into a part of anNFA.Each part is glued together (using ǫ-transitions) into acomplete automaton.An RE matching the character a translates intoaAn RE matching ǫ translates intoǫThompson’s Construction – ConcatenationWe represent an RE component r by the figure:rfor rStart stateAccepting statefor rAn RE matching the regular expression r followed by theregular expression s (rs) translates intosrǫThompson’s Construction – AlternationThe regular expression r|s translates intosrǫǫǫǫThompson’s Construction – RepetitionThe regular expression r* translates intorǫǫǫǫThompson’s Construction – Example IThe regular expression ab|a translates intoa baǫǫǫǫǫThompson’s Construction – Example IIThe regular expression letter(letter|digit)* translatesintoletterletterdigitǫǫǫǫǫǫǫǫǫFrom NFA to DFAFrom NFA to DFAWe now know how to translate a regular expression into anNFA, and how to translate a DFA into code. The missingpiece is how to translate an NFA into a DFA.NFADFAREprogramtableinterpreterFrom NFA to DFA. . .Each state in the DFA corresponds to a set of states in theNFA.The DFA will be in state2,3,4if the NFA could have been in any of the states2°, 3°, 4°.After reading a1a2···anthe DFA is in a a state thatrepresents the states the NFA could be in after seeing theinput a1a2···an.From NFA to DFA. . .4312aa bǫǫǫbBAbbaCA°in the DFA represents the set of states {1°, 2°, 4°} in theNFA. These are the states the FAs could be in before anyinput is consumed (the start states).B°in the DFA represents the set of states {2°, 3°, 4°} in theNFA. These are the states we can get to on the symbol afrom A°.From NFA to DFA. . .We need three functions:1ǫ-closure(T ) is the set of NFA states reachable from someNFA state s in T on ǫ-transitions alone. This is essentially agraph exploration algorithm that finds the nodes in a graphreachable from a griven node.2move(T ,a) is the set of NFA states to which there is atransition on input symbol a from some NFA state s ∈ T .3SubsetConstruction(N) returns a DFAD=(Dstates,Dtrans) corresponding to NFA N.ǫ-closure(T )procedure ǫ-closure(T )push all states in T onto stackC := Twhile stack is not empty dot := pop(stack)for each edge tǫ→ u doif u is not in C thenC := C ∪ upush(stack, u)return Cǫ-closure(T ) – Example4312aa bǫǫǫǫ-closure( 1°) = {1°, 2°, 4°}ǫ-closure( 2°) = {2°}ǫ-closure( 4°) = {2°, 4°}ǫ-closure({ 3°, 4°}) = {2°, 3°, 4°}move(T ,a) – Example4312aa bǫǫǫmove({ 1°}, a) = { 2°, 3°}move({ 2°, 3°}, b) = { 4°}SubsetConstruction(N)procedure SubsetConstruction(NFA N)Dstates := {ǫ-closure(s0)}Dtrans := {}repeatT := an unexplored state in Dstatesfor each input symbol a


View Full Document

UA CSC 453 - Study Notes

Download Study Notes
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Study Notes and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Study Notes 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?