Language Design
COMS W4115
[Image: Katsushika Hokusai, In the Hollow of a Wave off the Coast at Kanagawa, 1827]
Prof. Stephen A. Edwards
Fall 2003
Columbia University
Department of Computer Science

Language Design Issues
Syntax: how programs look
  Names and reserved words
  Instruction formats
  Grouping
Semantics: what programs mean
  Model of computation: sequential, concurrent
  Control and data flow
  Types and data representation

C History
Developed between 1969 and 1973 along with Unix
Due mostly to Dennis Ritchie
Designed for systems programming
  Operating systems
  Utility programs
  Compilers
  Filters
Evolved from B, which evolved from BCPL

BCPL
Martin Richards, Cambridge, 1967
Typeless
  Everything a machine word (n-bit integer)
  Pointers (addresses) and integers identical
Memory: undifferentiated array of words
Natural model for word-addressed machines
Local variables depend on frame-pointer-relative addressing: no dynamically sized automatic objects
Strings awkward: routines expand and pack bytes to/from word arrays

C History
Original machine (DEC PDP-11) was very small: 24K bytes of memory, 12K used for operating system
Written when computers were big, capital equipment
Group would get one, develop new language, OS

C History
Many language features designed to reduce memory
  Forward declarations required for everything
  Designed to work in one pass: must know everything
  No function nesting
PDP-11 was byte-addressed
  Now standard
  Meant BCPL's word-based model was insufficient

Euclid's Algorithm in C

    int gcd(int m, int n)
    {
      int r;
      while ((r = m % n) != 0) {
        m = n;
        n = r;
      }
      return n;
    }

"New-style" function declaration lists number and type of arguments
Originally only listed return type. Generated code did not care how many arguments were actually passed, and everything was a word.
Arguments are call-by-value

Euclid's Algorithm in C
int r is an automatic variable: allocated on stack when function entered, released on return
Parameters and automatic variables accessed via frame pointer
Other temporaries also stacked
[Figure: stack frame, with labels Ignored, n, m, FP, PC, r, SP]

Euclid on the PDP-11
GPRs: r0-r7; r7 = PC, r6 = SP, r5 = FP

    .globl _gcd
    .text
    _gcd:
        jsr  r5, rsave          / Save SP in FP
    L2: mov  4(r5), r1          / r1 = m
        sxt  r0                 / sign extend
        div  6(r5), r0          / r0, r1 = m / n (quotient, remainder)
        mov  r1, -10(r5)        / r = r1 (m % n)
        jeq  L3                 / if r == 0 goto L3
        mov  6(r5), 4(r5)       / m = n
        mov  -10(r5), 6(r5)     / n = r
        jbr  L2
    L3: mov  6(r5), r0          / r0 = n
        jbr  L1                 / non-optimizing compiler
    L1: jmp  rretrn             / return r0 (n)

Very natural mapping from C into PDP-11 instructions
Complex addressing modes make frame-pointer-relative accesses easy
Another idiosyncrasy: registers were memory-mapped, so taking the address of a variable in a register is straightforward

The Design of C
Taken from Dennis Ritchie's C Reference Manual (Appendix A of Kernighan & Ritchie)

Lexical Conventions
Identifiers (words, e.g., foo, printf)
  Sequence of letters, digits, and underscores, starting with a letter or underscore
Keywords (special words, e.g., if, return)
  C has fairly few: only 23 keywords. Deliberate: leaves more room for users' names
Comments (between /* and */)
  Most fall into two basic styles: start/end sequences as in C, or until end of line as in Java's //

Lexical Conventions
C is a free-form language where whitespace mostly serves to separate tokens
Which of these are the same?

    1+2            return this
    1 + 2          returnthis

    foo bar
    foobar

Space is significant in some languages. Python uses indentation for grouping, thus these are different:

    if x < 3:          if x < 3:
        y = 2              y = 2
        z = 3          z = 3

Constants/Literals
Integers (e.g., 10)
  Should a leading - be part of an integer or not?
Characters (e.g., 'a')
  How do you represent non-printable or ' characters?
Floating-point numbers (e.g., 3.5e-10)
  Usually fairly complex syntax, easy to get wrong
Strings (e.g., "Hello")
  How do you include a " in a string?

What's in a Name?
In C, each name has a storage class (where it is) and a type (what it is).

    Storage classes    Fundamental types    Derived types
    1. automatic       1. char              1. arrays
    2. static          2. int               2. functions
    3. external        3. float             3. pointers
    4. register        4. double            4. structures

Objects and lvalues
Object: area of memory
lvalue: refers to an object
An lvalue may appear on the left side of an assignment

    a = 3;   /* OK: a is an lvalue */
    3 = a;   /* 3 is not an lvalue */

Conversions
C defines certain automatic conversions:
  A char can be used as an int
  Floating-point arithmetic is always done with doubles; floats are automatically promoted
  int and char may be converted to float or double and back. Result is undefined if it could overflow.
  Adding an integer to a pointer gives a pointer
  Subtracting two pointers to objects of the same type produces an integer

Expressions
Expressions are built from identifiers (foo), constants (3), parentheses, and unary and binary operators.
Each operator has a precedence and an associativity
Precedence tells us 1 * 2 + 3 * 4 means (1 * 2) + (3 * 4)
Associativity tells us 1 + 2 + 3 + 4 means ((1 + 2) + 3) + 4

C's Operators in Precedence Order
[Table: C's operators listed in precedence order; the operator symbols did not survive extraction]

Declarators
Declaration: string of specifiers followed by a declarator

    static unsigned int (*f[10])(int, char*)[10];

    specifiers: static unsigned int (basic type: unsigned int)
    declarator: (*f[10])(int, char*)[10]

Declarator's notation matches that of an expression: use it to return the basic type.
Largely regarded as the worst syntactic aspect of C: both prefix (pointers) and postfix operators (arrays, functions).

Storage-Class Specifiers
auto      Automatic (stacked), default
static    Statically allocated
extern    Look for a declaration elsewhere
register  Kept in a register, not memory
C trivia: Originally, a function could only have at most three register variables, may only be int or char, can't use address-of operator &.
Today, register simply ignored. Compilers try to put most automatic variables in registers.

Type Specifiers
int
char
float
double
struct { declarations }
struct identifier { declarations }
struct identifier

Declarators
identifier
( declarator )                       Grouping
declarator ()                        Function
declarator [ optional-constant ]     Array
* declarator                         Pointer
C trivia: Originally, number and type of arguments to a function wasn't part of its type, thus declarator just contained ().
Today, ANSI C allows function and argument types, making an even bigger mess of declarators.

Declarator syntax
Is int *f() a pointer to a function returning an int, or a function that …