MIT 6 035 - Specifying Languages with Regular Expressions and Context-Free Grammars

Unformatted text preview:

Language Definition ProblemSpecifying Formal LanguagesSpecifying Lexical Structure Using Regular ExpressionsConcept of Regular Expression Generating a StringNondeterminism in GenerationConcept of Language Generated by Regular ExpressionsExamples of Languages and Regular ExpressionsAlternate Abstraction Finite-State AutomataAutomaton Accepting StringGenerative Versus RecognitionFrom Regular Expressions to AutomataBasic ConstructsSequenceNFA vs. DFAConversionsNFA to DFA ConstructionNFA to DFA Example for (a|b)*.(a|b)*Lexical Structure in LanguagesLexical Categories ExampleProgramming Language SyntaxSolution – Context-Free GrammarProduction GameSample DerivationParse TreeParse Tree for <2-1>+1Ambiguity in GrammarAmbiguity ExampleEliminating AmbiguityPrecedence ViolationsHacking Around PrecedenceParse Tree ChangesGeneral IdeaParserSolutionAbstract Syntax TreesExampleSummaryHandling If Then ElseParse TreesAlternative ReadingsHacked GrammarGrammar VocabularyDefining a LanguageRegular LanguagesRegular LanguagesGrammar and Automata CorrespondenceContext-Free GrammarsPush-Down AutomataCFG Versus PDAContext-Sensitive Grammars and Turing MachinesMIT 6.035Specifying Languages with Regular Expressions and Context-Free GrammarsMartin RinardLaboratory for Computer ScienceMassachusetts Institute of TechnologyLanguage Definition Problem• How to precisely define language• Layered structure of language definition• Start with a set of letters in language• Lexical structure - identifies “words” in language (each word is a sequence of letters)• Syntactic structure - identifies “sentences” in language (each sentence is a sequence of words) • Semantics - meaning of program (specifies what result should be for each input)• Today’s topic: lexical and syntactic structuresSpecifying Formal Languages• Huge Triumph of Computer Science• Beautiful Theoretical Results• Practical Techniques and Applications• Two Dual Notions• Generative approach (grammar or regular expression)• Recognition approach (automaton)• Lots of theorems about converting one approach automatically to anotherSpecifying Lexical Structure Using Regular Expressions• Have some alphabet ∑ = set of letters• Regular expressions are built from:• ε -empty string• Any letter from alphabet ∑•r1r2 – regular expression r1followed by r2 (sequence)•r1| r2 – either regular expression r1or r2 (choice)• r* - iterated sequence and choice ε | r | rr | …• Parentheses to indicate grouping/precedenceConcept of Regular Expression Generating a StringRewrite regular expression until have only a sequence of letters (string) leftExample(0 | 1)*.(0|1)*(0 | 1)(0 | 1)*.(0|1)*1(0|1)*.(0|1)*1.(0|1)*1.(0|1)(0|1)*1.(0|1)1.0General Rules1) r1| r2 → r12) r1| r2 → r23) r* →rr*4) r* →εNondeterminism in Generation• Rewriting is similar to equational reasoning• But different rule applications may yield different final resultsExample 1(0|1)*.(0|1)*(0|1)(0|1)*.(0|1)*1(0|1)*.(0|1)*1.(0|1)*1.(0|1)(0|1)*1.(0|1)1.0Example 2(0|1)*.(0|1)*(0|1)(0|1)*.(0|1)*0(0|1)*.(0|1)*0.(0|1)*0.(0|1)(0|1)*0.(0|1)0.1Concept of Language Generated by Regular Expressions• Set of all strings generated by a regular expression is language of regular expression• In general, language may be (countably) infinite• String in language is often called a tokenExamples of Languages and Regular Expressions• ∑ = { 0, 1, . }• (0|1)*.(0|1)* - Binary floating point numbers• (00)* - even-length all-zero strings• 1*(01*01*)* - strings with even number of zeros• ∑ = { a,b,c, 0, 1, 2 }• (a|b|c)(a|b|c|0|1|2)* - alphanumeric identifiers• (0|1|2)* - trinary numbersAlternate Abstraction Finite-State Automata•Alphabet ∑• Set of states with initial and accept states• Transitions between states, labeled with letters1Start state1.(0|1)*.(0|1)*0 0Accept stateAutomaton Accepting StringConceptually, run string through automaton• Have current state and current letter in string• Start with start state and first letter in string• At each step, match current letter against a transition whose label is same as letter• Continue until reach end of string or match fails• If end in accept state, automaton accepts string• Language of automaton is set of strings it acceptsCurrent stateExample10.Start state10Accept state11.0Current letterExampleCurrent state10.Start state10Accept state11.0Current letterExampleCurrent state10.Start state10Accept state11.0Current letterExampleCurrent state10.Start state10Accept state11.0Current letterExampleCurrent state10.Start state10Accept state11.0Current letterExampleCurrent state10.Start state10Accept state11.0String is accepted!Current letterGenerative Versus Recognition• Regular expressions give you a way to generate all strings in language• Automata give you a way to recognize if a specific string is in language• Philosophically very different• Theoretically equivalent (for regular expressions and automata)• Standard approach• Use regular expressions when define language• Translated automatically into automata for implementationFrom Regular Expressions to Automata• Construction by structural induction• Given an arbitrary regular expression r• Assume we can convert r to an automaton with• One start state• One accept state• Show how to convert all constructors to deliver an automaton with• One start state• One accept stateBasic ConstructsStart stateAccept stateεεaa∈ΣSequenceStart stateAccept stater1r2r1r2SequenceOld start state Start stateOld accept state Accept stater1r2r1r2SequenceOld start state Start stateOld accept state Accept stater1r2εr1r2SequenceOld start state Start stateOld accept state Accept stateεr1r2εr1r2SequenceOld start state Start stateOld accept state Accept stater1r2εr1r2εεChoiceStart stateAccept stater1r1|r2r2ChoiceOld start state Start stateOld accept state Accept stater1r1|r2r2ChoiceOld start state Start stateOld accept state Accept stater1r2εεr1|r2ChoiceOld start state Start stateOld accept state Accept stateεr1r2εεεr1|r2Kleene StarOld start state Start stateOld accept state Accept staterr*Kleene StarOld start state Start stateOld accept state Accept staterr*Kleene StarOld start state Start stateOld accept state Accept staterε εr*Kleene StarOld start state Start stateOld accept state Accept stateεrεεr*Kleene StarOld start state Start stateOld accept state Accept stateεrεεr*εNFA vs. DFA•DFA•No ε


View Full Document

MIT 6 035 - Specifying Languages with Regular Expressions and Context-Free Grammars

Download Specifying Languages with Regular Expressions and Context-Free Grammars
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Specifying Languages with Regular Expressions and Context-Free Grammars and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Specifying Languages with Regular Expressions and Context-Free Grammars 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?