Fall 2005 Lecture 15: Putting it all together From parsing to code generation How to make the computer understand? •Write a program using a programming language • Microprocessors talk in assembly language Translation written Compiler Assembly Language Program in a Programming Languages Saman Amarasinghe 2 6.035 ©MIT Fall 1998 Example (input program) int expr(int n){ int d; d = 4 * n * n * (n + 1) * (n + 1); return d; } Saman Amarasinghe 3 6.035 ©MIT Fall 1998 Example (Output assembly code) Unoptimized Code Optimized Code expr: .LFB2: pushq %rbp .LCFI0: movq %rsp, %rbp expr:.LFB2: movl %edi, %eax imull %edi, %eax incl %edi .LCFI1: imull %edi, %eax movl %edi, -4(%rbp) imull %edi, %eax movl -4(%rbp), %eax sall $2, %eax ret movl %eax, %edx imull -4(%rbp), %edx movl -4(%rbp), %eax incl %eax imull %eax, %edx movl -4(%rbp), %eax incl %eax imull %edx, %eax sall $2, %eax movl %eax, -8(%rbp) movl -8(%rbp), %eax leave ret Saman Amarasinghe 4 6.035 ©MIT Fall 1998 5 6.035 ) Saman Amarasinghe ©MIT Fall 1998 Anatomy of a Computer Code Optimizer Code Generator Optimized Intermediate Representation Assembly code Intermediate Representation Semantic Analyzer Lexical Analyzer (Scanner) Syntax Analyzer (Parser) Token Stream Parse Tree Program (character streamLexical Analysis • Lexical analyzer create tokens out of a text stream • Tokens are defined using regular expressions Saman Amarasinghe 6 6.035 ©MIT Fall 1998 1Examples of Regular Expressions Regular Expression a Strings matched “a” a · b “ab” a | b “a” “b” ε “” a* “” “a” “aa” “aaa” … (a | ε) · b “ab” “b” num = 0|1|2|3|4|5|6|7|8|9 “0” “1” “2” “3” … posint = num · num* “8” “6035” … int = (ε | -) · posint “-42” “1024” … real = int ·(ε | (. · posint)) “-12.56” “12” “1.414”... Saman Amarasinghe 7 6.035 ©MIT Fall 1998 Lexical Analysis • Lexical analyzer create tokens out of a text stream • Tokens are defined using regular expressions • Regular expressions can be mapped to Nondeterministic Finite Automatons (NFA) – by simple construction • NFA is transformed to a DFA – Transformation algorithm – Executing a DFA is straightforward Saman Amarasinghe 8 6.035 ©MIT Fall 1998 Syntax Analysis (parsing) • Defining a language using context-free grammars Saman Amarasinghe 9 6.035 ©MIT Fall 1998 Example: A CFG for expressions <expr> → <expr> <op> <expr> <expr> → ( <expr> ) <expr> → - <expr> <expr> → num <op> → + <op> → * Saman Amarasinghe 10 6.035 ©MIT Fall 1998 Parse Tree Example num ‘*’ ‘(‘ num ‘+’ num ‘)’ <op> <expr> <expr> <expr> <expr> <expr> num * ( <expr> ) <op> Syntax Analysis (parsing) • Defining a language using context-free grammars • Classification of Grammars LR(0) SLR(1) (1) LR(1) LR(k) regular LALRunambiguous Context free num + num Saman Amarasinghe 11 6.035 ©MIT Fall 1998 Saman Amarasinghe 12 6.035 ©MIT Fall 1998 213 6.035 LR(k) Parser Engine ( ) $ X (2) (2 ) (2) (3) (3 ) (3) 14 6.035 → • → • Y → • ( → • ( ) → • s0 Y → <X> •$ s5 ( Y s4 → ( • )→ ( • ) s3 ( ) $ X Y s0 shi ) ) s1 i (5) ()s2 shi ) ) s3 is4 ) ) s5 s6 (2) → ( • → ( • ) → • ( ) → • s1 ( → ( • ) → • ( ) → • s2( → • s6X Y 31 Saman Amarasinghe ©MIT Fall 1998 Current Symbol Parser Action Symbol Stack State Stack ACTION Goto State s0 shift to s2 error error goto s1 s1 error error accept s2 shift to s 2 shift to s5 error goto s3 s3 error shift to s4 error s4 reduce reduce reduces5 reduce reduce reduceSaman Amarasinghe ©MIT Fall 1998 <S> <X> $ <X> <X> <Y> <Y> <Y> <S> <Y> <Y> ) <Y> <Y> ACTION Goto State ft to s1 reduce (5 reduce (5 goto s5 goto s6 sh ft to s2 reduce reduce 3 or 5 goto s3 ft to s2 reduce (5 reduce (5 goto s3 error sh ft to s4 error error reduce (4 reduce (4error error accept error error reduce<X> <Y> <Y> <Y> <Y> <Y> <Y> <Y> <Y> <Y> <Y> <X> <Y> Semantic Analysis • Building a symbol table • Static Checking – Flow-of-control checks – Uniqueness checks – Type checking • Dynamic Checking – Array bounds check – Null pointer dereference check Saman Amarasinghe 15 6.035 ©MIT Fall 1998 Translation to Intermediate Format • Goal: Remain Largely Machine Independent But Move Closer to Standard Machine Model – From high-level IR to a low-level IR • Eliminate Structured Flow of Control • Convert to a Flat Address Space Saman Amarasinghe 16 6.035 ©MIT Fall 1998 Code Optimizations • Generate code as good as hand-crafted by a good assembly programmer • Have stable, robust performance • Abstract the architecture away from the programmer – Exploit architectural strengths – Hide architectural weaknesses Saman Amarasinghe 17 6.035 ©MIT Fall 1998 Code Optimizations • Algebraic simplification • Common subexpression elimination • Copy propagation • Constant propagation • Dead-code elimination • Register allocation • Instruction Scheduling Saman Amarasinghe 18 6.035 ©MIT Fall 1998 3 Figure by MIT OCW.Compiler Project! • You guys build a full-blown compiler from the ground up!!! • From decaf to working code Saman Amarasinghe 19 6.035 ©MIT Fall 1998 How will you use 6.035 knowledge? • As an informed programmer • As a designer of simple languages to aid other programming tasks • As an engineer dealing with new computer architectures • As a true compiler hacker Saman Amarasinghe 21 6.035 ©MIT Fall 1998 1. Informed Programmer • What did you learned in 6.035? – How optimizations work or why they did not work – How to read and understand optimized code Saman Amarasinghe 23 6.035 ©MIT Fall 1998 Compiler Derby • Who has the fastest compiler in the east??? • Will give you the program 12 hours in advance – Test and make all the optimizations work – DO NOT ADD PROGRAM SPECIFIC HACKS! • Wednesday, December 14th at 11:00AM location TBA – refreshments provided Saman Amarasinghe 20 6.035 ©MIT Fall 1998 1. Informed Programmer • Now you know what the compiler is doing – don’t treat it as a black box – don’t trust it to do the right thing! • Implications – performance – debugging – correctness Saman Amarasinghe 22 6.035 ©MIT Fall 1998 25 2.
View Full Document