Programming Languages and Translators COMS W4115 Pieter Bruegel The Tower of Babel 1563 Prof Stephen A Edwards Spring 2007 Columbia University Department of Computer Science Instructor Prof Stephen A Edwards sedwards cs columbia edu http www1 cs columbia edu sedwards 462 Computer Science Building Office Hours 3 4 PM Tuesday 4 5 PM Wednesday Schedule Mondays and Wednesdays 1 10 PM to 2 25 PM 627 Mudd Lectures January 17 to April 30 Midterm March 7 Final April 30 in class Final project report May 7 Holidays March 12 16 Spring Break Objectives Theory of language design Finer points of languages Different languages and paradigms Practice of Compiler Construction Overall structure of a compiler Automated tools and their use Lexical analysis to assembly generation Required Text Alfred V Aho Monica S Lam Ravi Sethi and Jeffrey D Ullman Compilers Principles Techniques and Tools Addison Wesley 2006 Second Edition Assignments and Grading 40 Programming Project 20 Midterm near middle of term 30 Final at end of term 10 Individual homework Project is most important but most students do well on it Grades for tests often vary more Prerequisite Java Fluency You and your group will write perhaps 5000 lines of Java you will not have time to learn it We will be using a tool that generates fairly complicated Java and it will be necessary to understand the output Prerequisite COMS W3157 Advanced Programming Teams will build a large software system Makefiles version control test suites Testing will be as important as development Prerequisite COMS W3261 Computability and Models of Computation You need to understand grammars We will be working with regular and context free languages Class Website Off my home page http www1 cs columbia edu sedwards Contains syllabus lecture notes and assignments Schedule will be continually updated during the semester Collaboration Collaborate with your team on the project Do your homework by yourself Tests Will be closed book with a one page cheat sheet of your own devising Don t cheat on assignments If you re dumb enough to cheat I m smart enough to catch you The Project The Project Design and implement your own little language Five deliverables 1 A proposal describing and motivating your language 2 A language reference manual defining it formally 3 A compiler or interpreter for your language running on some sample programs 4 A final project report 5 A final project presentation Teams Immediately start forming four person teams to work on this project Each team will develop its own langauge Suggested division of labor Front end back end testing documentation All members of the team should be familiar with the whole project First Three Tasks 1 Decide who you will work with You ll be stuck with them for the term choose wisely 2 Elect a team leader Languages come out better from dictatorships not democracies Besides you ll have someone to blame 3 Select a weekly meeting time Harder than you might think Might want to discuss with a TA you d like to have so it is convenient for him her as well Project Proposal Describe the language that you plan to implement Explain what problem your language can solve and how it should be used Describe an interesting representative program in your language Give some examples of its syntax and an explanation of what it does 2 4 pages Language Reference Manual A careful definition of the syntax and semantics of your language Follow the style of the C language reference manual Appendix A of Kernighan and Ritchie The C Programming Langauge see the class website Final Report Sections 1 Introduction the proposal 2 Language Tutorial 3 Language Reference Manual 4 Project Plan 5 Architectural Design 6 Test Plan 7 Lessons Learned 8 Complete listing Due Dates Proposal February 7 soon Reference Manual March 5 Final Report May 7 Design a language A small domain specific language Think of awk or php not Java or C Examples from earlier terms Quantum computing language Geometric figure drawing language Projectile motion simulation langauge Matlab like array manipulation language Screenplay animation language Other language ideas Simple animation language Model train simulation language Escher like pattern generator Music manipulation language harmony Web surfing language Mathematical function manipulator Simple scripting language a la Tcl Petri net simulation language What s in a Language Components of a language Syntax How characters combine to form words sentences paragraphs The quick brown fox jumps over the lazy dog is syntactically correct English but isn t a Java program class Foo public int j public int foo int k return j k Is syntactically correct Java but isn t C Specifying Syntax Usually done with a context free grammar Typical syntax for algebraic expressions expr expr expr expr expr expr expr expr expr digit expr Components of a language Semantics What a well formed program means The semantics of C says this computes the nth Fibonacci number int fib int n int a 0 b 1 int i for i 1 i n i int c a b a b b c return b Semantics Something may be syntactically correct but semantically nonsensical The rock jumped through the hairy planet Or ambiguous The chickens are ready for eating Semantics Nonsensical in Java class Foo int bar int x return Foo Ambiguous in Java class Bar public float foo return 0 public int foo return 0 Specifying Semantics Doing it formally beyond the scope of this class but basically two ways Operational semantics Define a virtual machine and how executing the program evolves the state of the virtual machine Denotational semantics Shows how to build the function representing the behavior of the program i e a transformation of inputs to outputs from statements in the language Most language definitions use an informal operational semantics written in English Great Moments in Programming Language Evolution Assembly Before numbers 55 89E5 8B4508 8B550C 39D0 740D 39D0 7E08 29D0 39D0 75F6 C9 C3 29C2 EBF6 After Symbols gcd pushl movl movl movl cmpl je L7 cmpl jle subl L2 cmpl jne L9 leave ret L5 subl jmp ebp esp ebp 8 ebp eax 12 ebp edx edx eax L9 edx eax L5 edx eax edx eax L7 eax edx L2 FORTRAN Before gcd pushl movl movl movl cmpl je L7 cmpl jle subl L2 cmpl jne L9 leave ret L5 subl jmp ebp esp ebp 8 ebp eax 12 ebp edx edx eax L9 edx eax L5 edx eax edx eax L7 eax edx L2 After Expressions control flow 10 20 if a EQ if a LT a a else b b endif goto 10 end b goto 20 b then b a COBOL From cafepress com Added type declarations record
View Full Document
Unlocking...