Unformatted text preview:

CSE 5317Lecture'2:'Language'implementa1on21'Jan'2010Nate'NystromUniversity'of'Texas'at'ArlingtonFriday, February 5, 2010Language definitionSyntax'defines'the'structure'of'a'program◾set'of'rules'defining'which'symbols'are'a'legall y'structured'programSeman)cs'defines'the'“meaning”'of'a'program◾without'seman1cs,'programs'are'just'sequences'of'charactersNext'week:'syntaxMost'of'the'rest'of'the'course:'seman1cs2Friday, February 5, 2010SyntaxSyntax'defined'by'a'conte xt-free'grammarExample:◾Exp'::='Exp'+'Term'|'Term◾Term'::='Term'*'Factor'|'Factor◾Factor'::='INTEGER◾Expressions'with'+'and'*'operators'over'integers◾*"has"higher"precedence"than"+◾1+2*3"parses"as"1+(2*3)"not"as"(1+2)*33Friday, February 5, 2010SemanticsDefines'the'meaning'of'programsOWen'described'informally◾in'a'book'(e.g.,'the'Java'Language'Specifica1on)◾or,'by'the'implementa1on'(e.g.,'Python)Be]er'is'to'describe'the'seman1cs'formally◾Unambiguous◾Allows'you'to'check'that'language'is'implemented'correctly◾Can'generate'some'(or'all!)'of'the'implementa1on'automa1cally◾Can'reason'formally'about'programs4Friday, February 5, 2010Formal semanticsSeveral'different'ways:◾opera1onal'seman1cs'a'rewrite'rules◾will"discuss"briefly"later"in"the"semester◾denota1onal'seman1cs'a'program'as'mathema1cal'formula◾requires"deep"mathema>cs"(domain"theory)–not"going"to"discuss◾axioma1c'seman1cs'a'program'as'logical'statements◾less"useful–not"going"to"discuss5Friday, February 5, 2010Operational semanticsSeman1cs'given'by'a'set'of'syntac1c'rewrite'rulese.g.,''''if'true'then's1'else's2'➝'s1e.g.,''''if'false'then's1'else's2'➝'s2e.g.,''''if'e'then's1'else's2'➝'if'e’'then's1'else's2''''''''''''''''''''''when'e'➝'e’e.g,.''''while'e'do's'➝,if'e'then'{'s;'while'e'do's'}Trivial'to'translate'opera1onal'seman1cs'into'an'implementa1on6Friday, February 5, 2010Language implementation◾Compiler◾a'program'that'translates'an'executable'program'in'one'language'into'an'executable'program'in'anoth er'language◾usually'higheralevel'to'loweralevel◾e.g.,"C"to"x86,"Java"to"JVM"bytecode,"C++"to"C◾“tran slator”'a'target'code'is'in'a' simil ar'lan guage◾“decompiler”'a'target'code'is'in'a'higheralevel'language◾“assembler”'a'source'is'assembly'code,'target'is'machine'code◾Interpreter◾a'program'that'reads'an'executable'pro gram'and'produces'the'results'of'running'that'program7Friday, February 5, 2010Compilers◾recognize'legal'(and'ill egal)'programs◾generate'correct'code◾manage'storage'of'all'variables'and'code◾agree'on'format'for'target'(object,'assembly)'code8compilersource"codetarget"codeerrorsFriday, February 5, 2010Interpreters◾recognize'legal'(and'ill egal)'programs◾eval uate'program9interpretersource"codeoutputerrorsFriday, February 5, 2010Traditional two-pass compiler◾intermediate'representa1on'(IR)◾front'end'maps'legal'code'to'IR◾back'end'maps'IR'onto'target'machine◾simplify'retarge1ng◾allow'mul1ple'front'ends10front"endsource"codetarget"codeerrorsback"enderrorsIRFriday, February 5, 2010Traditional interpreter◾intermediate'representa1on'(IR)◾front'end'maps'legal'code'to'IR◾eval uator'interprets'program,'producing'output11front"endsource"codeoutputerrorsevaluatorerrorsIRFriday, February 5, 2010Front end◾Responsibili1es:◾recognize'legal'code◾report'errors◾produce'IR◾preliminary'storage'map◾shape'the'code'for'the'back'end◾Much'of'the'front'end'construc1on'can'be'automated12scannersource"codeerrorsparsererrorstokensseman>c"analysiserrorsIR IRFriday, February 5, 2010Scanner◾maps'characters'into'tokens◾x'='x'+'y◾=>◾ID(“x”)'='ID(“x”) '+'ID(“ y”)◾character'string'value'for'a'token'is'lexeme◾typical'tokens:'NUMBER,'ID,'+,'a,'*,'/,'if,'while◾eliminates'whitespace◾key'issue'is'speed◾oWen'use'specialized'scanner'vs.'generated'one'(via'lex)13Friday, February 5, 2010ScanningPreaprocess'the'input'before'parsingBreak'stream'of'bytes'into'a'stream'of'tokens◾IF,"WHILE,"INT,"STRING,"INTScanner"eliminates"whitespace,"commentsSimplifies"the"parser◾deal"with"~100"tokens"vs."65536"characters◾smaller"state"machineUses"less"memory,"faster14Friday, February 5, 2010Parser◾recognize'context afree'syntax◾guide'contextasensi1ve'analysis◾construct'IR◾produce'meaningful'error'messages◾a]empt'error'correc1on◾parser'generators'mechanize'much'of'this'work15Friday, February 5, 2010Parsing16Turns'stream'of'bytes'into'an'inamemory'representa1on'of'the'programMany'approaches:◾parser4generators'for'different'classes'of'grammars◾top\down:"LL(1),"LL(k),"LL(*),"PEG◾bo_om\up:"LR(1),"LALR(1),"LR(k),"GLR◾implement'by'hand◾LL(k)"grammars"can"be"implemented"by"a"recursive\descent"parserFriday, February 5, 2010Context-free syntax◾Specify'with'a'grammar◾SheepNoise'::='baa'|'baa'SheepNoise◾Format'is'called'BackusaNaur'Form'(BNF)◾Formally,'a'grammar'G'='(S,N,T,P)◾S'is'the'start'symbol'(in'N)◾N'is'the'set'of'nonterminals'(e.g.,'SheepNoise)◾T'is'the'set'of'terminals'(e.g.,'baa,'NUMBER,'+)◾P'is'the'set'of'prod uc1on s'or'rewrite'rules'(in'Na>'(N'U'T))17Friday, February 5, 2010Backus-Naur FormSet'of'terminals,'a1,'...,'am◾characters'or'tokensSet'of'nonterminals,'A1,'... ,'AnDesignate'one'nonterminal'as'the'start4symbolSet'of'rules'of'the'form,'A'::='X1'...'Xk◾X'is'a'symbol,'either'a'terminal'or'nonterminal18Friday, February 5, 2010A more useful grammar◾Goal'::='Expr◾Expr'::='Expr'Op'Term◾Expr'::='Term◾Term'::='NUMBER◾Term'::='ID◾Op'::='+◾Op'::='a◾S'='Goal◾N'='{Goal,'Expr,'Term,'Op}◾T'='{NUMBER,'ID,'+,'a}◾P'='{Goal'::='Expr,'...,'Op'::='a}19Friday, February 5, 2010Derivations◾Given'a'grammar,'valid4sentences'can'be'derived'by'repeated'subs1tu1on◾Goal◾Expr◾Expr'Op'Term◾Expr'Op'y◾Expr'a'y◾Expr'Op'Term'a'y◾Expr'Op'2'a'y◾Expr'+'2'a'y◾Term'+'2'a'y◾x'+'2'a'y◾To'recognize'a'valid'sentence'in'some'CFG,'we'reverse'the'process'and'build'up'a'parse20Friday, February 5, 2010Front endA parse can be represented by a tree called a pars e or syntax tree2><num:<id:x><id:>ygoaloptermopexprexpr termexprterm-+Obviously, this contains a lot of unnecessary information19Parse trees◾Can'represent'a'parse'with'a'tree◾Contains'a'lot'of'unnecessary'informa1on21Friday, February 5, 2010Abstract syntax trees◾Much'more'concise'than'p arse'tree◾Abstract'syntax'trees'(ASTs)'are'oWen'used'as'an'IR'between'front'end'and'back'end22Front endSo, compilers often use an abstract syntax


View Full Document

UT Arlington CSE 5317 - Language Implementation

Download Language Implementation
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Language Implementation and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Language Implementation 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?