STRUCTURE OF PROGRAMMING LANGUAGES Dr Yasser Fouad


STRUCTURE OF PROGRAMMING

LANGUAGES

Dr Yasser Fouad

ISBN 0-321-33025-0

Book

Quote of the Day

“A language that doesn't affect the way you think about programming, is not worth knowing.” - Alan Perlis

Then you decide to get a PhD

You get tired of the PowerPoint and its animations. You embed a domain-specific language (DSL) into Ruby.


Reasons for Studying Concepts of Programming Languages

• Increased ability to express ideas
• Improved background for choosing appropriate languages
• Increased ability to learn new languages
• Better understanding of significance of implementation
• Overall advancement of computing

How is this class different?

It’s about:

a) foundations of programming languages

b) but also how to design your own languages

c) how to implement them

d) and about PL tools, such as analyzers

e) also learn about some classical C.S. algorithms.


Why a developer needs PL

New languages will keep coming
– Understand them, choose the right one.

Write code that writes code
– Be the wizard, not the typist.

Develop your own language.
– Are you kidding? No.

Learn about compilers and interpreters.
– Programmer’s main tools.


Overview

• how many languages does one need?

• how many languages did you use? Let’s list them here:


Develop your own language

Are you kidding? No. Guess who developed:
– PHP
– Ruby
– JavaScript
– Perl

Done by smart hackers like you
– in a garage
– not in an academic ivory tower

Our goal: learn good academic lessons
– so that your future languages avoid known mistakes


Programming Domains

• Scientific applications
  – Large number of floating point computations
  – Fortran
• Business applications
  – Produce reports, use decimal numbers and characters
  – COBOL
• Artificial intelligence
  – Symbols rather than numbers manipulated
  – LISP
• Systems programming
  – Need efficiency because of continuous use
  – C
• Web Software
  – Eclectic collection of languages: markup (e.g., XHTML), scripting (e.g., PHP), general-purpose (e.g., Java)

Figure by Brian Hayes (who credits, in part, Éric Lévénez and Pascal Rigaux): Brian Hayes, “The Semicolon Wars.” American Scientist, July-August 2006, pp. 299-303


Genealogy of Common Languages

© O. NierstraszPS — Introduction 1.14

Programming Paradigms

A programming language is a problem-solving tool.

Imperative style: program = algorithms + data; good for decomposition

Functional style: program = functions ∘ functions; good for reasoning

Logic programming style: program = facts + rules; good for searching

Object-oriented style: program = objects + messages; good for modeling(!)

Other styles and paradigms: blackboard, pipes and filters, constraints, lists, ...
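
To make the contrast between the first two styles concrete, here is a minimal sketch that computes the same result twice, once as "algorithms + data" and once as a composition of functions. The task (sum of the squares of the even numbers) and both function names are assumptions chosen for illustration; they are not from the slides.

```python
from functools import reduce

def sum_even_squares_imperative(numbers):
    # Imperative style: a mutable accumulator updated step by step.
    total = 0
    for n in numbers:
        if n % 2 == 0:
            total += n * n
    return total

def sum_even_squares_functional(numbers):
    # Functional style: compose filter, map and reduce; no mutable state.
    evens = filter(lambda n: n % 2 == 0, numbers)
    squares = map(lambda n: n * n, evens)
    return reduce(lambda acc, n: acc + n, squares, 0)

data = [1, 2, 3, 4, 5, 6]
print(sum_even_squares_imperative(data))  # 56
print(sum_even_squares_functional(data))  # 56
```

The functional version can be read as a pipeline of three independent functions, which is one reason the style is described as good for reasoning.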


Language Categories

• Imperative
  – Central features are variables, assignment statements, and iteration
  – Examples: C, Pascal
• Functional
  – Main means of making computations is by applying functions to given parameters
  – Examples: LISP, Scheme
• Logic
  – Rule-based (rules are specified in no particular order)
  – Example: Prolog
• Object-oriented
  – Data abstraction, inheritance, late binding
  – Examples: Java, C++
• Markup
  – New; not a programming language per se, but used to specify the layout of information in Web documents
  – Examples: XHTML, XML


Program

• A program is a machine-compatible representation of an algorithm
• If no algorithm exists for performing a task, then the task cannot be performed by a machine
• Programs and the algorithms they represent are collectively referred to as software

What is a Programming Language?


• A formal language for describing computation?
• A “user interface” to a computer?
• Syntax + semantics?
• Compiler, or interpreter, or translator?
• A tool to support a programming paradigm?

A programming language is a notational system for describing computation in a machine-readable and human-readable form.

— Louden


Programming Methodologies Influences

• 1950s and early 1960s: Simple applications; worry about machine efficiency
• Late 1960s: People efficiency became important; readability, better control structures
  – structured programming
  – top-down design and step-wise refinement
• Late 1970s: Process-oriented to data-oriented
  – data abstraction
• Middle 1980s: Object-oriented programming
  – Data abstraction + inheritance + polymorphism

Favorite programming language June 2012

• Python (3,054)
• Ruby (1,723)
• JavaScript (1,415)
• C (970)
• C# (829)
• PHP (666)
• Java (551)
• C++ (529)
• Haskell (519)
• Clojure (459)
• CoffeeScript (362)
• Objective C (326)
• Lisp (322)
• Perl (311)
• Scala (233)
• Scheme (190)
• Other (188)
• Erlang (162)
• Lua (145)
• SQL (101)

Job listings collected from Dice.com

• Java 17,599 (+8.96%)
• XML 10,780 (+11.70%)
• JavaScript (+11.64%)
• HTML 9,587 (-1.53%)
• C# 9,293 (+17.04%)
• C++ 6,439 (+7.55%)
• AJAX 5,142 (+15.81%)
• Perl 5,107 (+3.21%)
• PHP 3,717 (+23%)
• Python 3,456 (+32.87%)
• Ruby 2,141 (+39.03%)
• HTML5 (+276.85%)
• Flash 1,261 (+95.2%)
• Silverlight 865 (-11.91%)
• COBOL 656 (-10.75%)
• Assembler 209 (-1.42%)
• PowerBuilder (-18.71%)
• FORTRAN 45 (-33.82%)

Languages in Common Use


A Brief Chronology

Early 1950s  “order codes” (primitive assemblers)
1957  FORTRAN: the first high-level programming language
1958  ALGOL: the first modern, imperative language
1960  LISP, COBOL: interactive programming; business programming
1962  APL, SIMULA: the birth of OOP (SIMULA)
1964  BASIC, PL/I
1966  ISWIM: first modern functional language (a proposal)
1970  Prolog: logic programming is born
1972  C: the systems programming language
1975  Pascal, Scheme: two teaching languages
1978  CSP: concurrency matures
1978  FP: Backus’ proposal
1983  Smalltalk-80, Ada: OOP is reinvented
1984  Standard ML: FP becomes mainstream (?)
1986  C++, Eiffel: OOP is reinvented (again)
1988  CLOS, Oberon, Mathematica
1990  Haskell: FP is reinvented
1990s  Perl, Python, Ruby, JavaScript: scripting languages become mainstream
1995  Java: OOP is reinvented for the internet
2000  C#


ENIAC (1946, University of Pennsylvania)

ENIAC program for external ballistic equations:


Programming the ENIAC


ENIAC (1946, University of Pennsylvania)

programming done by
– rewiring the interconnections
– to set up desired formulas, etc.

Problem (what’s the tedious part?)
– programming = rewiring
– slow, error-prone

solution:
– store the program in memory!
– birth of von Neumann paradigm


Assembly – the language (UNIVAC 1, 1950)

Idea: mnemonic (assembly) code
– Then translate it to machine code by hand (no compiler yet)
– write programs with mnemonic codes (add, sub), with symbolic labels,
– then assign addresses by hand

Example of symbolic assembler
  clear-and-add a
  add b
  store c

translate it by hand to something like this (understood by CPU)
  B100 A200 C300
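
The hand translation above is mechanical, which is exactly why it was later automated. A minimal sketch of that automation follows. The opcode letters and addresses are taken from the "something like this" example on the slide; the table names `OPCODES` and `ADDRESSES` and the function `assemble` are hypothetical, not a real UNIVAC instruction set.

```python
# Assumed tables, read off the slide's example: mnemonic -> opcode letter,
# symbolic label -> hand-assigned address.
OPCODES = {"clear-and-add": "B", "add": "A", "store": "C"}
ADDRESSES = {"a": "100", "b": "200", "c": "300"}

def assemble(source):
    """Translate one 'mnemonic operand' pair per line into a machine word."""
    words = []
    for line in source.strip().splitlines():
        mnemonic, operand = line.split()
        words.append(OPCODES[mnemonic] + ADDRESSES[operand])
    return " ".join(words)

program = """
clear-and-add a
add b
store c
"""
print(assemble(program))  # B100 A200 C300
```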

Assembly Language

• Use symbols instead of binary digits to describe fields of instructions.

• Every aspect of machine visible in program:
  – One statement per machine instruction.
  – Register allocation, call stack, etc. must be managed explicitly.

• No structure: everything looks the same.

10101100100000100000000000010101
ADDI R4 R2 21

ADDI R4,R2,21


Assembler – the compiler (Manchester, 1952)

• a loop example, in MIPS, a modern-day assembly code:

loop: addi $t3, $t0, -8
      addi $t4, $t0, -4
      lw   $t1, theArray($t3)   # Gets the last
      lw   $t2, theArray($t4)   # two elements
      add  $t5, $t1, $t2        # Adds them together...
      sw   $t5, theArray($t0)   # ...and stores the result
      addi $t0, $t0, 4          # Moves to next "element"
                                # of theArray
      blt  $t0, 160, loop       # If not past the end of
                                # theArray, repeat
      jr   $ra

High-level Language

• Provides notation to describe problem solving strategies rather than organize data and instructions at machine-level.

• Improves programmer productivity by supporting features to abstract/reuse code, and to improve reliability/robustness of programs.

• Requires a compiler or an interpreter.


FORTRAN I (1954-57)

Language, and the first compiler
– Produced code almost as good as hand-written
– Huge impact on computer science
– Modern compilers preserve its outlines

By 1958, >50% of all software is in FORTRAN


FORTRAN I

Example: nested loops in FORTRAN
– a big improvement over assembler,
– but annoying artifacts of assembly remain:
  • labels and rather explicit jumps (CONTINUE)
  • lexical columns: the statement must start in column 7
– The MIPS loop from previous slide, in FORTRAN:

      DO 10 I = 2, 40
      A(I) = A(I-1) + A(I-2)
10    CONTINUE


“Hello World” in FORTRAN

All examples from the ACM "Hello World" project: www2.latech.edu/~acm/HelloWorld.shtml

      PROGRAM HELLO
      DO 10, I=1,10
      PRINT *,'Hello World'
10    CONTINUE
      STOP
      END


Side note: designing a good language is hard

A good language protects against bugs, but the lessons take a while to learn.
An example often cited as having caused a failure of a NASA planetary probe:

buggy line:
      DO 15 I = 1.100

what was intended (a dot had replaced the comma):
      DO 15 I = 1,100

because Fortran ignores spaces, compiler read this as:
      DO15I = 1.100

which is an assignment into a variable DO15I, not a loop.

This mistake is harder to make (if at all possible) with the modern lexical rules (white space not ignored) and loop syntax

for (i=1; i <= 100; i++) { … }
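
The effect of ignoring spaces can be reproduced with a toy statement classifier. This is only a sketch under stated assumptions: the two regular expressions and the function `classify` are hypothetical simplifications, nowhere near a full Fortran lexer, and they cover just the two statement shapes discussed above.

```python
import re

# Patterns are applied AFTER all blanks have been removed, mimicking
# fixed-form Fortran, where blanks are not significant.
DO_LOOP = re.compile(r"^DO(\d+)([A-Z][A-Z0-9]*)=([^,]+),([^,]+)$")
ASSIGNMENT = re.compile(r"^([A-Z][A-Z0-9]*)=(.+)$")

def classify(statement):
    squeezed = statement.replace(" ", "").upper()
    match = DO_LOOP.match(squeezed)
    if match:
        label, var, start, stop = match.groups()
        return f"loop: label {label}, variable {var}, from {start} to {stop}"
    match = ASSIGNMENT.match(squeezed)
    if match:
        var, value = match.groups()
        return f"assignment: {var} = {value}"
    return "unrecognized"

print(classify("DO 15 I = 1,100"))  # loop: label 15, variable I, from 1 to 100
print(classify("DO 15 I = 1.100"))  # assignment: DO15I = 1.100
```

Only the comma distinguishes the two readings; once blanks are discarded, nothing else marks `DO` as a keyword.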


“Hello World” in COBOL

000100 IDENTIFICATION DIVISION.
000200 PROGRAM-ID. HELLOWORLD.
000300 DATE-WRITTEN. 02/05/96 21:04.
000400* AUTHOR BRIAN COLLINS
000500 ENVIRONMENT DIVISION.
000600 CONFIGURATION SECTION.
000700 SOURCE-COMPUTER. RM-COBOL.
000800 OBJECT-COMPUTER. RM-COBOL.
001000 DATA DIVISION.
001100 FILE SECTION.
100000 PROCEDURE DIVISION.
100200 MAIN-LOGIC SECTION.
100300 BEGIN.
100400 DISPLAY " " LINE 1 POSITION 1 ERASE EOS.
100500 DISPLAY "HELLO, WORLD." LINE 15 POSITION 10.
100600 STOP RUN.
100700 MAIN-LOGIC-EXIT.
100800 EXIT.


ALGOL 60

History
• Committee of PL experts formed in 1955 to design universal, machine-independent, algorithmic language
• First version (ALGOL 58) never implemented; criticisms led to ALGOL 60

Innovations
• BNF (Backus-Naur Form) introduced to define syntax (led to syntax-directed compilers)
• First block-structured language; variables with local scope
• Structured control statements
• Recursive procedures
• Variable size arrays

Successes
• Highly influenced design of other PLs but never displaced FORTRAN


“Hello World” in BEALGOL

BEGIN
  FILE F (KIND=REMOTE);
  EBCDIC ARRAY E [0:11];
  REPLACE E BY "HELLO WORLD!";
  WHILE TRUE DO
    BEGIN
      WRITE (F, *, E);
    END;
END.


“Hello World” in PL/1

HELLO: PROCEDURE OPTIONS (MAIN);

  /* A PROGRAM TO OUTPUT HELLO WORLD */
  FLAG = 0;

LOOP: DO WHILE (FLAG = 0);
    PUT SKIP DATA('HELLO WORLD!');
  END LOOP;

END HELLO;


“Hello World” in Functional Languages

SML
print("hello world!\n");

Haskell
hello() = print "Hello World"


Goto considered harmful

L1: statement
    if expression goto L1
    statement

Dijkstra says: gotos are harmful
– use structured programming
– lose some performance, gain a lot of readability

how do you rewrite the above code into structured form?
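
One possible answer, as a sketch: the goto pattern runs the first statement at least once and repeats it while the expression holds, which is a post-test ("do ... while") loop. Python has no such construct, so the sketch uses `while True` with a `break`. The counter, the `limit` parameter and the function name `run_structured` are placeholders invented for the illustration.

```python
def run_structured(limit):
    trace = []
    count = 0
    while True:                      # L1:
        count += 1                   #   statement (always runs at least once)
        trace.append(f"body {count}")
        if not (count < limit):      #   "if expression goto L1", inverted
            break
    trace.append("after loop")       # the statement following the goto
    return trace

print(run_structured(3))  # ['body 1', 'body 2', 'body 3', 'after loop']
```

In a language with a native post-test loop the same shape is written as `do { statement } while (expression);` followed by the final statement.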


Special-Purpose Languages

SNOBOL
• First successful string manipulation language
• Influenced design of text editors more than other PLs
• String operations: pattern-matching and substitution
• Arrays and associative arrays (tables)
• Variable-length strings

  ...
  OUTPUT = 'Hello World!'
END


Object-Oriented Languages

History
• Simula was developed by Nygaard and Dahl (early 1960s) in Oslo as a language for simulation programming, by adding classes and inheritance to ALGOL 60

Begin
  while 1 = 1 do begin
    outtext ("Hello World!");
    outimage;
  end;
End;

• Smalltalk was developed by Xerox PARC (early 1970s) to drive graphic workstations

Transcript show:'Hello World';cr


4GLs

“Problem-oriented” languages
• PLs for “non-programmers”
• Very High Level (VHL) languages for specific problem domains

Classes of 4GLs (no clear boundaries)
• Report Program Generator (RPG)
• Application generators
• Query languages
• Decision-support languages

Successes
• Highly popular, but generally ad hoc


“Hello World” in SQL

CREATE TABLE HELLO (HELLO CHAR(12));
-- INSERT a row: an UPDATE on the still-empty table would change nothing
INSERT INTO HELLO VALUES ('HELLO WORLD!');
SELECT * FROM HELLO;


Scripting Languages

History
Countless “shell languages” and “command languages” for operating systems and configurable applications

> Unix shell (ca. 1971) developed as user shell and scripting tool

echo "Hello, World!"

> HyperTalk (1987) was developed at Apple to script HyperCard stacks

on OpenStack
  show message box
  put "Hello World!" into message box
end OpenStack

> TCL (1990) developed as embedding language and scripting language for X windows applications (via Tk)

puts "Hello World "

> Perl (~1990) became de facto web scripting language

print "Hello, World!\n";


How do Programming Languages Differ?

Common Constructs:
• basic data types (numbers, etc.); variables; expressions; statements; keywords; control constructs; procedures; comments; errors ...

Uncommon Constructs:
• type declarations; special types (strings, arrays, matrices, ...); sequential execution; concurrency constructs; packages/modules; objects; general functions; generics; modifiable state; ...

Improved background for choosing appropriate languages
• C vs. Modula-3 vs. C++ for systems programming
• Fortran vs. APL vs. Ada for numerical computations
• Ada vs. Modula-2 for embedded systems
• Common Lisp vs. Scheme vs. Haskell for symbolic data manipulation
• Java vs. C/CORBA for networked PC programs

Evolution of Programming Languages

• ALGOL - 60 (ALGOrithmic Language)
  Goals: Communicating Algorithms
  Features:
  – Block Structure (Top-down design)
  – Recursion (Problem-solving strategy)
  – BNF - Specification

• LISP (LISt Processing)
  Goals: Manipulating symbolic information
  Features:
  – List Primitives
  – Interpreters / Environment

Evolution of Programming Languages

• Pascal
  Goal: Structured Programming, Type checking, Compiler writing.
  Features:
  • Rich set of data types for efficient algorithm design
    • E.g., Records, sets, ...
  • Variety of “readable” single-entry single-exit control structures
    • E.g., for-loop, while-loop, ...
  • Efficient Implementation
    • Recursive descent parsing

Other Languages

• Functional
  • LISP, Scheme
  • ML, Haskell
• Logic
  • Prolog
• Object-oriented
  • Smalltalk, SIMULA, Modula-3, Oberon
  • C++, Java, C#, Eiffel, Ada-95
• Hybrid
  • Python, Ruby, Scala
• Application specific languages and tools

Programming Languages

•C

Bell Labs, Dennis Ritchie, 1973

•C++

Bjarne Stroustrup, 1980

Hybrid OOP

•Java

Sun Microsystems (formally announced in May 1995)

Pure OOP

Web programming

Current Trend

• Multiparadigm languages
  – Functional constructs for programming in the small
    • Focus on conciseness and correctness
  – Object-Oriented constructs for programming in the large
    • Focus on programmer productivity and code evolution
• Example languages
  – Older: Python, Ruby
  – Recent: Scala, F#, etc.

Scheme (dialect of LISP)

• Recursive definitions
• Symbolic computation: List Processing
• Higher-order functions
• Dynamic type checking
• Functional + Imperative features
• Automatic storage management
  – Provides a uniform executable platform for studying, specifying, and comparing languages.

Java vs Scala

//Java - what we're used to seeing

public String buildEpochKey(String... keys) {
    // Start from the fixed prefix, then append each non-null key after a dot.
    StringBuilder s = new StringBuilder("elem");
    for (String key : keys) {
        if (key != null) {
            s.append(".");
            s.append(key);
        }
    }
    return s.toString().toLowerCase();
}

Java vs Scala

//Scala

def buildEpochKey(keys: String*): String = {
  // Prepend the prefix, drop nulls, join with dots, lower-case the result.
  ("elem" +: keys).filter(_ != null).mkString(".").toLowerCase
}


Implementation Methods

• Compilation
  – Programs are translated into machine language
• Pure Interpretation
  – Programs are interpreted by another program known as an interpreter
• Hybrid Implementation Systems
  – A compromise between compilers and pure interpreters


Compilation

• Translate high-level program (source language) into machine code (machine language)
• Slow translation, fast execution
• Compilation process has several phases:
  – lexical analysis: converts characters in the source program into lexical units
  – syntax analysis: transforms lexical units into parse trees which represent the syntactic structure of program
  – semantic analysis: generates intermediate code
  – code generation: machine code is generated
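
The phases listed above can be sketched for a toy language of integer expressions with `+`, `*` and parentheses. Everything here is an assumption made for illustration: the grammar, the tuple-based parse tree, and the `PUSH`/`ADD`/`MUL` stack-machine instructions that stand in for machine code. Semantic analysis is omitted because this toy language has nothing to check.

```python
import re

def lex(source):
    """Lexical analysis: characters -> lexical units (tokens)."""
    tokens = []
    for text in re.findall(r"\d+|\S", source):
        if text.isdigit():
            tokens.append(("NUM", int(text)))
        elif text in "+*()":
            tokens.append((text, None))
        else:
            raise SyntaxError(f"unexpected character {text!r}")
    return tokens

def parse(tokens):
    """Syntax analysis: tokens -> parse tree (nested tuples)."""
    pos = 0

    def peek():
        return tokens[pos][0] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        return token

    def expr():                      # expr := term ('+' term)*
        node = term()
        while peek() == "+":
            advance()
            node = ("+", node, term())
        return node

    def term():                      # term := factor ('*' factor)*
        node = factor()
        while peek() == "*":
            advance()
            node = ("*", node, factor())
        return node

    def factor():                    # factor := NUM | '(' expr ')'
        if peek() == "NUM":
            return ("num", advance()[1])
        if peek() == "(":
            advance()
            node = expr()
            if peek() != ")":
                raise SyntaxError("expected ')'")
            advance()
            return node
        raise SyntaxError("expected a number or '('")

    tree = expr()
    if pos != len(tokens):
        raise SyntaxError("unexpected trailing input")
    return tree

def generate(tree):
    """Code generation: parse tree -> stack-machine instructions."""
    if tree[0] == "num":
        return [("PUSH", tree[1])]
    op, left, right = tree
    return generate(left) + generate(right) + [("ADD" if op == "+" else "MUL", None)]

def run(code):
    """Execute the generated instructions, to check the translation."""
    stack = []
    for op, arg in code:
        if op == "PUSH":
            stack.append(arg)
        else:
            right, left = stack.pop(), stack.pop()
            stack.append(left + right if op == "ADD" else left * right)
    return stack[0]

code = generate(parse(lex("2 + 3 * (4 + 1)")))
print(code)
print(run(code))  # 17
```

The parser uses recursive descent, the technique listed under Pascal's efficient implementation earlier in the slides.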


The Compilation Process

Additional Compilation Terminologies

• Load module (executable image): the user and system code together

• Linking and loading: the process of collecting system program units and linking them to a user program


Pure Interpretation

• No translation
• Easier implementation of programs (run-time errors can be displayed easily and immediately)
• Slower execution (10 to 100 times slower than compiled programs)
• Often requires more space
• Becoming rare for high-level languages
• Significant comeback with some Web scripting languages (e.g., JavaScript)


Hybrid Implementation Systems

• A compromise between compilers and pure interpreters

• A high-level language program is translated to an intermediate language that allows easy interpretation

• Faster than pure interpretation
• Examples
  – Perl programs are partially compiled to detect errors before interpretation
  – Initial implementations of Java were hybrid; the intermediate form, byte code, provides portability to any machine that has a byte code interpreter and a run-time system (together, these are called the Java Virtual Machine)
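
The two-step scheme can be observed directly in CPython, which is used here as a stand-in (the slide's own examples are Perl and Java). The built-in `compile` translates source text into a byte code object, and `eval` hands that object to the interpreter. The expression and the variable `x` are arbitrary choices for the sketch.

```python
import dis

source = "x * 2 + 1"

# Step 1: translate the source text into an intermediate form (byte code).
code = compile(source, "<example>", "eval")
print(type(code).__name__)      # code

# Step 2: the byte code interpreter executes the intermediate form.
print(eval(code, {"x": 20}))    # 41

# Show the intermediate instructions; the exact listing varies by Python version.
dis.dis(code)
```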


Just-in-Time Implementation Systems

• Initially translate programs to an intermediate language

• Then compile intermediate language into machine code

• Machine code version is kept for subsequent calls

• JIT systems are widely used for Java programs

• .NET languages are implemented with a JIT system


Programming Environments

• The collection of tools used in software development

• UNIX
  – An older operating system and tool collection
  – Nowadays often used through a GUI (e.g., CDE, KDE, or GNOME) that runs on top of UNIX
• Borland JBuilder
  – An integrated development environment for Java
• Microsoft Visual Studio.NET
  – A large, complex visual environment
  – Used to program in C#, Visual BASIC.NET, Jscript, J#, or C++

What Does This C Statement Mean?

*p++ = *q++

increments p, increments q, modifies *p

Does this mean…

*p = *q; ++p; ++q;

… or

*p = *q; ++q; ++p;

… or

tp = p; ++p; tq = q; ++q; *tp = *tq;

Languages

• Simula
• Smalltalk
• Algol
• Cobol
• F#
• Prolog
• Pascal
• Modula-2
• ADA
• PL/I
• CORBA
• PERL
• BASIC
• JAVASCRIPT
• LISP
• MIRANDA
• ML
• SCHEME
• SNOBOL
• APL
• DELPHI
• MAYA


Can you answer these questions?

Why are there so many programming languages?

Why are FORTRAN and COBOL still important programming languages?

Which language should you use to implement a spelling checker? A filter to translate upper-to-lower case? A theorem prover? An address database? An expert system? A game server for initiating chess games on the internet? A user interface for a network chess client?