G22.2110 Programming LanguagesModels of ComputationGenealogyCommon IdeasPredictable performance vs. ease of writingLanguage as a tool for thought (Iverson)The programming environment may be larger than the languageLanguage DefinitionSyntax and SemanticsGrammarsThe Chomsky hierarchyLexical IssuesBNF: standard notation for context-free grammarsParse treesAmbiguityThe dangling else problemG22.2110G22.2110 Programming Languages Programming LanguagesThe main themes of programming language design and use:◦Model of computation◦Expressivenesstypes and their operationscontrol structuresabstraction mechanismstools for programming in the large◦Ease of use: Writeability / Readability / MaintainabilityModels of ComputationModels of ComputationImperative: programs have mutable storage (state) modified by assignments◦by far the most common and familiarFunctional (applicative): programs are pure functions◦much use in AI, formal semantics, language researchDeclarative: programs are unordered sets of assertions and rules◦Prolog, data base applicationsGenealogyGenealogyFORTRAN (1957) => Fortran90, HPFCOBOL (1956) => COBOL2000◦still a large chunk of installed softwareAlgol60 => Algol68 => Pascal => Ada => … Ada2012Algol60 => BCPL => CSimula => Smalltalk, C++LISP => Scheme, ML, Haskell◦with plenty of cross-influences: Java inherits from C++, Smalltalk, LISP, Ada, etc.Common IdeasCommon IdeasModern imperative languages (Ada, C++, Java) have similar characteristics:◦large number of features (grammar with several hundred productions, 500 page reference manuals…)◦a rich type system◦procedural mechanisms◦object-oriented facilities◦abstraction mechanisms, with information hiding◦several storage-allocation mechanisms◦facilities for concurrent programming (not C++)◦facilities for generic programming (not older versions of Java)Predictable performance vs. ease of writingPredictable performance vs. ease of writingLow-level languages mirror the physical machine:◦Assembly, C, FortranHigh-level languages model an abstract machine with useful capabilities:◦ML, Setl, Prolog, PythonWide-spectrum languages try to do both, more or less well:◦Ada, C++, JavaHigh-level languages are often interpreted, have garbage collection, are dynamically typed, and cannot be used for real-time programming. Cost of operations is not directly visible.◦Java is a hybridMajor distinction: manual / automatic storage managementLanguage as a tool for Language as a tool for thought (Iverson)thought (Iverson)Drawing a histogram in APL:◦ “* “ [ V ° V]◦Is it natural ? ( only if you happen to think that way)◦Role of language as a communication vehicle among programmers is more important than ease of writing◦APL is an extreme case (write-only language)◦All languages have the same expressive power; arguments of the form “you can’t do this in X” are meaningless.◦But.. Idioms in language A may be useful inspiration when writing in language B.The programming environment may be larger The programming environment may be larger than the languagethan the languageThe predefined libraries are indispensable to the proper use of the language.The libraries are defined in the language itself, but they have to be internalized by a good programmer. the C++ standard template libraryThe Java Swing classesThe Ada I/O packagesThere is a law of conservation of junk: whether it goes into the language or in the library, the total amount of miscellaneous information is roughly the same from language to language.Language DefinitionLanguage DefinitionDifferent users have different needs:◦programmerstutorials, reference manuals, programming guides (idioms)◦implementorsprecise operational semantics◦verifiersrigorous axiomatic or natural semantics◦language designers and language lawyersall of the aboveDifferent levels of detail and precision◦But none of them can be sloppy!Syntax and SemanticsSyntax and SemanticsSyntax refers to external representationSemantics denotes meaningDistinction is convenient but arbitrary: can describe fully a programming language by syntactic means (Algol68 and W-grammars), but resulting grammar is hard to use.In practice, describe context-free aspects with a grammar, the rest with prose or a different formal notation (equations, pre/post conditions, assertions).GrammarsGrammarsA set of non-terminal symbolsA distinguished non-terminal: the root symbolA set of terminal symbolsA series of rewrite rules (productions) of the form: ABC.. ::= XYZ..where A, B, C, D, X, Y, Z terminals and non terminals.The language is the set of sentences containing only terminalsymbols, that can be generated by applying the rewriting rulesstarting from the root symbolThe Chomsky hierarchyThe Chomsky hierarchyRegular grammars:◦all productions have the form: N ::= TN◦(one non-terminal on each side)Context-free grammars:◦all productions have the form: N ::= XYZ◦(one non-terminal on the left-hand side)Context-sensitive languages:◦The number of symbols on the left is no greater than on the right◦(no production shrinks the size of the sentential form)Type-0 languages◦no restrictionsLexical IssuesLexical IssuesLexical structure of programming languages is simple.Described mostly by regular grammarTerminals are characters:◦need to specify character set: ASCII, Latin-1, ISO646, Unicode…◦need to specify if case is significant◦need to specify external source representation for portabilityIdentifiers:◦Id => letter ◦Id => letter IdRegular grammars can’t count: previous rule does not specify size limitEarly FORTRAN oddities: DO 10 J = 1, 10 vs. DO 10 J = 1.10BNF: standard notation for BNF: standard notation for context-free grammarscontext-free grammarsA series of conventional abbreviations:◦alternation: symb ::= Letter | Digit◦option: if_stat ::= IF condition THEN statement [else_part] END IF ◦repetition: else_part ::= {elseif_part} [ELSE statement] elsif_part ::= ELSIF condition THEN statement◦also noted with Kleene star: id ::= Letter symb*◦Does not add to expressive power of grammar, could be done with recursion (tastes on readability differ)◦need convention for metasymbols: what if ‘|’ is in
View Full Document