Unformatted text preview:

MIT 6.035 Specifying Languages with Regular Expressions and Context-Free Grammars pMartin Rinard Laboratory for Computer Science Massachusetts Institute of Technology•t t t•s•g p g ( pLanguage Definition Problem • How to precisely define language L d f l d fi i i • Layered structure of language definition • Start with a set of letters in language Lexical tructure identifies “words” in language Lexical structure -identifies words in language (each word is a sequence of letters) • Syntactic structure -identifies “sentences” inSyntactic structure identifies sentences in language (each sentence is a sequence of words) • Semantics - meaning of program (specifies what result should be for each input) • Today’s topic: lexical and syntactic structures( l i )c oneSpecifying Formal Languages • Huge Triumph of Computer Science • Beautiful Theoretical Results • Practical Techniques and Applications • Two Dual Notions • Generative approach (grammar or regular expression) • Recognition approach (automaton) Lots of theorems about onverting approach• Lots of theorems about converting one approach automatically to another•n e e romap a e•( )Specifying Lexical Structure Using Regular ExpressionsRegular Expressions • Have some alphabet ∑ = set of letters R l i b ilt f • Regular expressions are built from: • ε -empty string A yl tt rf l h b t ∑Any letter from alphabet ∑ •r1r2 – regular expression r1 followed by r2 (sequence)(sequence) •r1| r2 – either regular expression r1 or r2 (choice) • r* - iterated sequence and choice ε | r | rr | … • Parentheses to indicate grouping/precedencea(|)(|)(|)Concept of Regular Expression Generating a StringGenerating a String Rewrite regular expression until have only a sequence of letters (string) leftsequence of letters (string) left ExampleGener l Rules p (0 | 1)*.(0|1)* (0 | 1)(0 | 1)*.(0|1)* General Rules 1) r1| r2 → r1 1(0|1)*.(0|1)* 1.(0|1)* 2) r1| r2 → r2 3) r* →rr* 1.(0|1)(0|1)* 1.(0|1) ) 4) r* →ε 1.01 (0|1)(0|1)*(|)(|) (|)Nondeterminism in Generation • Rewriting is similar to equational reasoning • But different rule applications may yield different final results Example 1 Example 2 p (0|1)*.(0|1)* (0|1)(0|1)*.(0|1)* Example 2 (0|1)*.(0|1)* (0|1)(0|1)*.(0|1)* 1(0|1)*.(0|1)* 1.(0|1)* 0(0|1)*.(0|1)* 0.(0|1)* 1.(0|1)(0|1)* 1.(0|1) 10 0.(0|1)(0|1)* 0.(0|1) 01 1.0 0.1•Concept of Language Generated by Regular ExpressionsRegular Expressions • Set of all strings generated by a regular expression is language of regular expressionexpression is language of regular expression • In general, language may be (countably) infinite String in language is often called a tokenString in language is often called a token •t•w even o•Examples of Languages and Regular ExpressionsExpressions • ∑ = { 0, 1, . } (0|1)* (0|1)* Bi fl ti i b• (0|1)*.(0|1)* -Binary floating point numbers • (00)* - even-length all-zero strings 1*(01*01*)* strings ith number f1*(01*01*)* -strings with even number of zeros • ∑ = {a b c 0 1 2} ∑ { a,b,c, 0, 1, 2 } • (a|b|c)(a|b|c|0|1|2)* - alphanumeric identifiers • (0|1|2)* - trinary numberst t t t t tAlternate Abstraction Finite-State AutomataFiniteState Automata •Alphabet ∑ S f ith i iti l d• Set of states with initial and accept states • Transitions between states, labeled with letters (0|1)*.(0|1)* 1 Start state1. Accept state0 0h l b l lIf d i t t t t t t t iAutomaton Accepting String Conceptually, run string through automaton • Have current state and current letter in string • Start with start state and first letter in string • At each step, match current letter against a transition whose label is same as letter • Continue until reach end of string or match fails • If end in accept state, automaton accepts string • Language of automaton is set of strings it acceptsExample Current state p 1 1. Start state 0 0 . Accept state 11.0 Current letterCurrent letterExamplepCurrent state 1 1. Start state 0 0 . Accept state 11.0 Current letterCurrent letterExamplepCurrent state 1 1. Start state 0 0 . Accept state 11.0 Current letterCurrent letterExamplepCurrent state 1 1. Start state 0 0 . Accept state 11.0 Current letterCurrent letterExamplepCurrent state 1 1. Start state 0 0 . Accept state 11.0 Current letterCurrent letterExamplepCurrent state 1 1. Start state 0 0 . Accept state 11.0 Current letter String is accepted! Current letter•hl h ll dffaGenerative Versus Recognition • Regular expressions give you a way to generate all strings in language • Automata give you a way to recognize if a specific string is in language •Philosophically very different • Theoretically equivalent (for regular expressions nd automata) expressions and automata) • Standard approach Use regular expressions when define languageUse regular expressions when define language • Translated automatically into automata for implementationimplementation ••s••o c c o•From Regular Expressions to AutomataAutomata • Construction by structural induction Gi bit l i• Given an arbitrary regular expression r • Assume we can convert r to an automaton with One tart state One start state • One accept state Show how Show how an automaton with • One start stateOne start state • One accept state to convert all constructors to deliversBasic Constructs Accept tate Start state ε ε Accept state ε a a∈ΣSequenceAccept stateStart stateAccept stater1r2r1r21 212SequenceAccept stateStart stateOld accept stateOld start stateAccept stateOld accept stater1r2r1r21 212SequenceAccept stateStart stateOld accept stateOld start stateAccept stateOld accept stater1r2r1r2ε1 212SequenceAccept stateStart stateOld accept stateOld start stateAccept stateOld accept stater1r2εr1r2ε1 212SequenceAccept stateStart stateOld accept stateOld start stateAccept stateOld accept stater1r2εr1r2εε1 212ChoiceAccept stateStart stateAccept stater |rr1r1|r2r2ChoiceOld accept stateOld start stateAccept stateStart stateOld accept state Accept stater |rr1r1|r2r2ChoiceOld accept stateOld start stateAccept stateStart stateOld accept state Accept stater |rr1εr1|r2r2εChoiceOld accept stateOld start stateAccept stateStart stateOld accept state Accept stater |rr1εεr1|r2εr2εa ssKleene Star Old ccept tate Old start state Accept tate Start state Old accept state Accept state r* r r* ra ssKleene Star Old ccept


View Full Document

MIT 6 035 - Specifying Languages with Regular Expressions

Download Specifying Languages with Regular Expressions
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Specifying Languages with Regular Expressions and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Specifying Languages with Regular Expressions 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?