DOC PREVIEW
Berkeley COMPSCI 164 - Lecture Notes

This preview shows page 1-2-3-4-27-28-29-30-56-57-58-59 out of 59 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 59 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 59 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 59 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 59 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 59 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 59 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 59 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 59 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 59 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 59 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 59 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 59 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 59 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

More Finite Automata/ Lexical Analysis /Introduction to ParsingProgramming a lexer in Lisp “by hand”Set up some data and predicatesTokenize function… Next-token / two versionsExampleIntroduction to ParsingOutlineLanguages and AutomataLimitations of Regular LanguagesContext Free Grammars are more powerfulThe Functionality of the ParserExampleExampleExampleComparison with Lexical AnalysisThe Role of the ParserA test framework for trivial MJ line of codeContext-Free Grammars: Why Notational ConventionsExamples of CFGsExamples of CFGsExamples of CFGs (cont.)The Language of a CFGKey IdeaThe Language of a CFG (Cont.)The Language of a CFG (Cont.)The Language of a CFGTerminalsExamplesTo be more formal..Let’s produce some sentential forms of a MJgrammarMJ Example (Cont.)Arithmetic ExampleNotesMore NotesSimple grammar (3.1 in text)Derivations and Parse TreesBuilding a Parse TreeAnother Derivation ExampleDerivation Example (Cont.)Left-Most Derivation in Detail (1)Derivation in Detail (2)Derivation in Detail (3)Derivation in Detail (4)Derivation in Detail (5)Derivation in Detail (6)Notes on DerivationsWhat is a Right-most Derivation?Right-most Derivation in Detail (1)Right-most Derivation in Detail (2)Right-most Derivation in Detail (3)Right-most Derivation in Detail (4)Right-most Derivation in Detail (5)Right-most Derivation in Detail (6)Derivations and Parse TreesSummary: Objectives of ParsingQuestion from 9/21: grammar for /* */Prof. Fateman CS 164 Lecture 7 1More Finite Automata/ Lexical Analysis /Introduction to ParsingLecture 7Prof. Fateman CS 164 Lecture 7 2Programming a lexer in Lisp “by hand”• (actually picked out of comp.lang.lisp when I was teaching CS164 3 years ago, an example by Kent Pitman).• Given a string like "foo+34-bar*g(zz)" we could separate it into a lisp list of strings:("foo" "+" "34" …) or we could try for a list of Lisp symbols like (foo + 34 – bar * g |(| zz |)| ).Huh? What is |(| ? It is the way lisp prints the symbol with printname "(" so as to not confuse the Lisp read program, and humans too.Prof. Fateman CS 164 Lecture 7 3Set up some data and predicates(defvar *whitespace* '(#\Space #\Tab #\Return #\Linefeed))(defun whitespace? (x) (member x *whitespace*))(defvar *single-char-ops* '(#\+ #\- #\* #\/ #\( #\) #\. #\, #\=))(defun single-char-op? (x) (member x *single-char-ops*))Prof. Fateman CS 164 Lecture 7 4Tokenize function…(defun tokenize (text) ;; text is a string "ab+cd(x)"(let ((chars '()) (result '()))(declare (special chars result)) ;;explain scope(dotimes (i (length text))(let ((ch (char text i))) ;;pick out ith character of string(cond ((whitespace? ch) (next-token))((single-char-op? ch) (next-token)(push ch chars)(next-token))(t(push ch chars)))))(next-token)(nreverse result)))Prof. Fateman CS 164 Lecture 7 5Next-token / two versions(defun next-token () ;;simple version(declare (special chars result))(when chars (push (coerce (nreverse chars) 'string) result)(setf chars '())))(defun next-token () ;; this one “parses” integers magically(declare (special chars result))(when chars (let((st (coerce (reverse chars) 'string))) ;keep chars around to test(push (if (every #'digit-char-p chars)(read-from-string st)(intern st))result))(setf chars '())))Prof. Fateman CS 164 Lecture 7 6Example• (tokenize "foo(-)+34") Î (foo |(| - |)| + 34)• (Much) more info in file: pitmantoken.cl• Missing: line/column numbers, 2-char tokens, keyword vs. identifier distinction. Efficiency here is low (but see file for how to use hash tables for character types!)• Also note that Lisp has a programmable read-table so that its own idea of what delimits a token can be changed, as well as meanings of every character.Prof. Fateman CS 164 Lecture 7 7Introduction to ParsingProf. Fateman CS 164 Lecture 7 8Outline• Regular languages revisited• Parser overview• Context-free grammars (CFG’s)• DerivationsProf. Fateman CS 164 Lecture 7 9Languages and Automata• Formal languages are very important in CS– Especially in programming languages• Regular languages– The weakest class of formal languages widely used– Many applications• We will also study context-free languagesProf. Fateman CS 164 Lecture 7 10Limitations of Regular Languages• Intuition: A finite automaton with N states that runs N+1 steps must revisit a state.• Finite automaton can’t remember # of times it has visited a particular state. No way of telling how it got here.• Finite automaton can only use finite memory.– Only enough to store in which state it is – Cannot count, except up to a finite limit• E.g., language of balanced parentheses is not regular: {(i)i| i > 0}Prof. Fateman CS 164 Lecture 7 11Context Free Grammars are more powerful• Easy to parse balanced parentheses and similar nested structures• A good fit for the vast majority of syntactic structures in programming languages like arithmetic expressions.• Eventually we will find constructions that are not CFG, or are more easily dealt with outside the parser.Prof. Fateman CS 164 Lecture 7 12The Functionality of the Parser• Input: sequence of tokens from lexer• Output: parse tree of the programProf. Fateman CS 164 Lecture 7 13Example•Program Sourceif (x < y) a=1; else a=2;Lex output = parser input (simplified)IF lpar ID < ID rpar ID = ICONST ; ID= ICONST ICONST•Parser output (simplified)IF-THEN-ELSE<IDIDASSIGNASSIGNProf. Fateman CS 164 Lecture 7 14Example• MJSourceif (x<y) a=1; else a=2;• Actual lex output (from lisp…)(fstring " if (x<y) a=1; else a=2;") Æ(if if (1 . 10)) (#\( #\( (1 . 12)) (id x (1 . 13)) (#\< #\< (1 . 14)) (id y (1 . 15)) (#\) #\) (1 . 16)) (id a (1 . 18)) (#\= #\= (1 . 19)) (iconst 1 (1 . 20)) (#\; #\; (1 . 21)) (else else (1 . 26)) …Prof. Fateman CS 164 Lecture 7 15Example•MJSourceif (x < y) a=1; else a=2;• Actual Parser output ; lc = line&column(If (LessThan (IdentifierExp x) (IdentifierExp y))(Assign (id a lc) (IntegerLiteral 1))(Assign (id a lc) (IntegerLiteral 2))))– Or cleaned up by taking out “extra” stuff …(If (< x y) (assign a 1)(assign a 2))Prof. Fateman CS 164 Lecture 7 16Comparison with Lexical AnalysisPhase Input OutputLexer Sequence of charactersSequence of tokensParser Sequence of tokensParse treeProf. Fateman CS 164 Lecture 7 17The Role of the Parser• Not all sequences of tokens are programs . . .• . . . Parser must distinguish between valid and invalid sequences of tokens• Some


View Full Document

Berkeley COMPSCI 164 - Lecture Notes

Documents in this Course
Lecture 8

Lecture 8

40 pages

Load more
Download Lecture Notes
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Lecture Notes and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Lecture Notes 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?