Unit-1 Introduction PREPARED BY: PROF. HARISH I RATHOD COMPUTER ENGINEERING DEPARTMENT GUJARAT POWER ENGINEERING & RESEARCH INSTITUTE COMPILER DESIGN (170701)


Unit-1 Introduction

PREPARED BY:

PROF. HARISH I RATHOD

COMPUTER ENGINEERING DEPARTMENT

GUJARAT POWER ENGINEERING & RESEARCH INSTITUTE

COMPILER DESIGN (170701)


Introduction

• Programming languages are notations for describing computations to people and to machines.
• The world depends on programming languages because all the software running on all computers is written in some programming language.
• Before a program can be run, it must first be translated into a form in which it can be executed by a computer.
• The software systems that do this translation are called compilers.


Language Processors

• Compiler: a program that can read a program in one language (the source language) and translate it into an equivalent program in another language (the target language).
• An important role of the compiler is to report any errors in the source program that it detects during the translation process.


Language Processors

• Compiler:

• If the target program is an executable machine language program, it can then be called by the user to process inputs and produce outputs.

Fig 1: A compiler (source program → Compiler → target program)

Fig 2: Running the target program (input → Target Program → output)


Language Processors

• Interpreter: instead of producing a target program as a translation, an interpreter appears to directly execute the operations specified in the source program on inputs supplied by the user.

Fig 3: An interpreter (source program and input → Interpreter → output)


Language Processors

• Difference between compiler and interpreter:
  • The machine-language target program produced by a compiler is usually much faster than an interpreter at mapping inputs to outputs.
  • An interpreter can usually give better error diagnostics than a compiler, because it executes the source program statement by statement.
  • With a compiler, several other programs may be required to create an executable target program.
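The difference can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tuple-based expression tree and the names `interpret` and `compile_tree` are hypothetical, and a Python closure stands in for the "target program".

```python
import operator

OPS = {"+": operator.add, "*": operator.mul}

def interpret(node, env):
    """Interpreter: walk the source tree and evaluate it on every run."""
    if isinstance(node, str):
        return env[node]                  # variable lookup
    if isinstance(node, (int, float)):
        return node                       # constant
    op, left, right = node
    return OPS[op](interpret(left, env), interpret(right, env))

def compile_tree(node):
    """Compiler: translate the tree once into a callable 'target program'."""
    if isinstance(node, str):
        return lambda env: env[node]
    if isinstance(node, (int, float)):
        return lambda env: node
    op, left, right = node
    fn, run_left, run_right = OPS[op], compile_tree(left), compile_tree(right)
    return lambda env: fn(run_left(env), run_right(env))

tree = ("+", "initial", ("*", "rate", 60))   # initial + rate * 60
env = {"initial": 10.0, "rate": 2.0}

target = compile_tree(tree)    # translation happens once
print(interpret(tree, env))    # the tree is re-analysed on each call
print(target(env))             # only the translated program runs
```

Both calls compute the same value; the compiled form simply avoids re-inspecting the tree on every run.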


Language Processors

Fig: A language-processing system

source program → Preprocessor → modified source program → Compiler → target assembly program → Assembler → relocatable machine code → Linker/Loader (also takes library files and relocatable object files) → target machine code


Language Processors

• A source program may be divided into modules stored in separate files.
• The task of collecting the source program is sometimes entrusted to a separate program, called a preprocessor.
• The modified source program is then fed to a compiler.
• The compiler may produce an assembly-language program as its output.


Language Processors

• The assembly-language program is then processed by a program called an assembler.
• The assembler produces relocatable machine code as its output.
• Large programs are often compiled in pieces, so the relocatable machine code may have to be linked together with other relocatable object files and library files into the code that actually runs on the machine.


Language Processors

• The linker resolves external memory addresses, where the code in one file may refer to a location in another file.

• The loader then puts together all of the executable object files into memory for execution.
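As a rough illustration of what "resolving external addresses" means, the sketch below lays out toy object files one after another and maps each referenced symbol to an absolute address. The object-file format (a dict with `size`, `defines`, and `refs`), the file names, and the function `link` are all hypothetical; real object formats are far richer.

```python
def link(object_files, load_address=0):
    """Assign each object file a base address and resolve external symbols."""
    symbol_addresses = {}
    bases = {}
    base = load_address
    # Pass 1: place the files in order and record where each symbol lands.
    for name, obj in object_files.items():
        bases[name] = base
        for symbol, offset in obj["defines"].items():
            if symbol in symbol_addresses:
                raise ValueError(f"duplicate symbol: {symbol}")
            symbol_addresses[symbol] = base + offset
        base += obj["size"]
    # Pass 2: every reference must now point at a known address.
    resolved = {}
    for name, obj in object_files.items():
        for symbol in obj["refs"]:
            if symbol not in symbol_addresses:
                raise ValueError(f"undefined symbol: {symbol}")
        resolved[name] = {s: symbol_addresses[s] for s in obj["refs"]}
    return bases, resolved

objects = {
    "main.o": {"size": 40, "defines": {"main": 0}, "refs": ["square"]},
    "math.o": {"size": 16, "defines": {"square": 4}, "refs": []},
}
bases, resolved = link(objects, load_address=1000)
print(bases)      # base address chosen for each file
print(resolved)   # absolute address of each external reference
```

Here `main.o` refers to `square`, which is defined in `math.o`; the linker can only fill in that address once it knows where `math.o` will be placed.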


Structure of Compiler (Front end and Back end)

• Up to this point we have treated a compiler as a single box that maps a source program into a semantically equivalent target program.
• If we open this box, there are two parts to this mapping:
  • Analysis
  • Synthesis


Structure of Compiler (Front end and Back end)

• Analysis part:
  • Breaks up the source program into constituent pieces and imposes a grammatical structure on them.
  • Then uses this structure to create an intermediate representation of the source program.
  • If this part detects that the source program is either syntactically ill-formed or semantically unsound, it must provide informative messages so that the user can take corrective action.


Structure of Compiler (Front end and Back end)

• Analysis part:
  • This part also collects information about the source program and stores it in a data structure called a symbol table.
  • Analysis determines the operations implied by the source program, which are recorded in a tree structure.
  • The analysis part is often called the front end of the compiler.


Structure of Compiler (Front end and Back end)

• Synthesis part:
  • Synthesis takes the tree structure and translates the operations therein into the target program.
  • Put another way, it constructs the target program from the intermediate representation and the information in the symbol table.
  • The synthesis part is the back end of the compiler.


Analysis of the source program

• Lexical Analysis (Linear Analysis):
  • The source program is read from left to right and grouped into tokens, e.g. constants, variable names, keywords, etc. (checks for a valid token set).


Analysis of the source program

• Hierarchical Analysis (Syntax Analysis or Parsing):
  • Tokens are grouped into grammatical phrases and a parse tree is constructed (checks for valid syntax).
• Semantic Analysis:
  • Certain checks are performed to ensure that the components of a program fit together meaningfully.
  • That is, its task is to determine the meaning of the source program (checks for semantic errors).


Phases of compiler

Fig: Phases of a compiler

character stream → Lexical Analyzer → token stream → Syntax Analyzer → syntax tree → Semantic Analyzer → syntax tree → Intermediate Code Generator → intermediate representation → Machine-Independent Code Optimizer → intermediate representation → Code Generator → target machine code → Machine-Dependent Code Optimizer → target machine code

The Symbol Table is shared by all phases.


Lexical Analysis

• First phase of a compiler.
• Also called scanning.
• The lexical analyzer reads the stream of characters of the source program and groups the characters into meaningful sequences called lexemes.
• For each lexeme, the lexical analyzer produces a token as output.
• The form of a token is:

(token-name, attribute-value)


Lexical Analysis

• The token is passed to the next phase, syntax analysis.
• In a token:
  • The first component, token-name, is an abstract symbol that is used during syntax analysis.
  • The second component, attribute-value, points to an entry in the symbol table for this token.


Lexical Analysis

• Example: a source program contains the assignment statement

position = initial + rate * 60

• Its characters could be grouped into the following lexemes and mapped into the following tokens.


Lexical Analysis

• position is a lexeme, mapped into the token (id,1), where:
  • id (identifier) is an abstract symbol, and
  • 1 points to the symbol table entry for position.
• The assignment symbol = is a lexeme, mapped into the token (=). This token needs no attribute value, so the second component is omitted.
• initial is a lexeme, mapped into the token (id,2), where 2 points to the symbol table entry for initial.


Lexical Analysis

• + is a lexeme, mapped into the token (+).
• rate is a lexeme, mapped into the token (id,3), where 3 points to the symbol table entry for rate.
• * is a lexeme, mapped into the token (*).
• 60 is a lexeme, mapped into the token (60).

(id,1) (=) (id,2) (+) (id,3) (*) (60)
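The lexeme-to-token mapping for this statement can be sketched with a small regular-expression scanner. This is a minimal sketch, not a full lexer: the name `tokenize`, the pattern, and the list used as a symbol table are assumptions, and only identifiers, integer literals, and the operators `=`, `+`, `*` are recognised.

```python
import re

# One alternative per token class; leading whitespace is skipped.
TOKEN_PATTERN = re.compile(
    r"\s*(?:(?P<id>[A-Za-z_]\w*)|(?P<num>\d+)|(?P<op>[=+*]))"
)

def tokenize(source):
    """Return (tokens, symbol_table) for a tiny assignment language."""
    symbol_table = []    # entry i-1 holds the lexeme for token (id, i)
    tokens = []
    source = source.rstrip()
    pos = 0
    while pos < len(source):
        match = TOKEN_PATTERN.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at position {pos}")
        pos = match.end()
        if match.group("id"):
            lexeme = match.group("id")
            if lexeme not in symbol_table:
                symbol_table.append(lexeme)       # first sighting: new entry
            tokens.append(("id", symbol_table.index(lexeme) + 1))
        elif match.group("num"):
            tokens.append((match.group("num"),))  # no attribute value
        else:
            tokens.append((match.group("op"),))   # no attribute value
    return tokens, symbol_table

tokens, symbols = tokenize("position = initial + rate * 60")
print(tokens)
print(symbols)
```

The identifiers receive entry numbers in order of first appearance, which is why position, initial, and rate become (id,1), (id,2), and (id,3).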


Syntax Analysis (parsing)

• The second phase of a compiler.
• It uses the first components of the tokens produced by the lexical analyzer to create a tree-like intermediate representation, known as a syntax tree, in which:
  • each interior node represents an operation, and
  • the children of the node represent the arguments of the operation.
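A syntax tree for the token stream (id,1) (=) (id,2) (+) (id,3) (*) (60) can be built with a small recursive-descent parser. The sketch below is one possible approach, not the method prescribed by the slides; the function name `parse_assignment`, the nested-tuple tree format, and the grammar (assignment, `+`, and `*`, with `*` binding tighter) are assumptions.

```python
def parse_assignment(tokens):
    """Parse 'id = expr' into nested tuples: (operator, left, right)."""
    pos = 0

    def peek():
        return tokens[pos][0] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        return token

    def factor():                      # factor -> id | number
        if peek() is None:
            raise SyntaxError("unexpected end of input")
        token = advance()
        if token[0] == "id" or token[0].isdigit():
            return token               # leaf node
        raise SyntaxError(f"unexpected token {token}")

    def term():                        # term -> factor ('*' factor)*
        node = factor()
        while peek() == "*":
            advance()
            node = ("*", node, factor())
        return node

    def expr():                        # expr -> term ('+' term)*
        node = term()
        while peek() == "+":
            advance()
            node = ("+", node, term())
        return node

    if peek() != "id":
        raise SyntaxError("expected: id = expression")
    target = advance()
    if peek() != "=":
        raise SyntaxError("expected: id = expression")
    advance()
    tree = ("=", target, expr())
    if pos != len(tokens):
        raise SyntaxError("trailing tokens")
    return tree

tokens = [("id", 1), ("=",), ("id", 2), ("+",), ("id", 3), ("*",), ("60",)]
tree = parse_assignment(tokens)
print(tree)
```

Each interior tuple starts with its operator, and the multiplication ends up below the addition because `term` is parsed before `expr` combines its results.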


Semantic Analysis

• Uses the syntax tree and the information in the symbol table to check the source program for semantic consistency.
• It also gathers type information and saves it in either the syntax tree or the symbol table for use by the next phase.
• An important task is type checking, where the compiler checks that each operator has matching operands.


Semantic Analysis

• For example, many programming languages require an array index to be an integer; the compiler must report an error if a floating-point number is used to index an array.
• The language may also permit some type conversions.
• For example, a binary arithmetic operator may be applied to either a pair of integers or a pair of floating-point numbers. If it is applied to a floating-point number and an integer, the compiler may convert the integer into a floating-point number.
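Type checking with an implicit conversion can be sketched as a recursive walk over the expression tree. This is a minimal sketch under assumptions: the node shapes `("id", n)` and `("num", text)`, the function name `check_types`, and the declaration of all three identifiers as floats are hypothetical choices made for the example.

```python
def check_types(node, symbol_types):
    """Return (typed_node, type), wrapping an int operand in inttofloat
    when the other operand of a binary operator is a float."""
    kind = node[0]
    if kind == "id":
        return node, symbol_types[node[1]]
    if kind == "num":
        return node, ("float" if "." in node[1] else "int")
    op, left, right = node
    left, left_type = check_types(left, symbol_types)
    right, right_type = check_types(right, symbol_types)
    if left_type == "int" and right_type == "float":
        left, left_type = ("inttofloat", left), "float"
    elif left_type == "float" and right_type == "int":
        right, right_type = ("inttofloat", right), "float"
    if left_type != right_type:
        raise TypeError(f"operator {op} applied to {left_type} and {right_type}")
    return (op, left, right), left_type

# Assumption: position, initial, rate (entries 1..3) are declared as floats.
symbol_types = {1: "float", 2: "float", 3: "float"}
tree = ("+", ("id", 2), ("*", ("id", 3), ("num", "60")))   # initial + rate * 60

typed_tree, result_type = check_types(tree, symbol_types)
print(typed_tree)
print(result_type)
```

The integer literal 60 meets the float `rate`, so the checker wraps it in an explicit `inttofloat` node rather than reporting an error.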


Intermediate Code Generation

• During translation, a compiler may construct one or more intermediate representations (for example, syntax trees).
• Syntax trees are commonly used during syntax and semantic analysis.
• After syntax and semantic analysis of the source program, many compilers generate an explicit low-level or machine-like intermediate representation.


Intermediate Code Generation

• This intermediate representation should have two important properties:
  • It should be easy to produce.
  • It should be easy to translate into the target machine language.
• We consider an intermediate form called three-address code, which consists of a sequence of assembly-like instructions with up to three operands per instruction.


Intermediate Code Generation

• Each operand can act like a register.

t1 = inttofloat(60)

t2 = id3 * t1

t3 = id2 + t2

id1 = t3
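Three-address code for position = initial + rate * 60 can be produced by a post-order walk of the typed syntax tree, creating one temporary per operation. The sketch below is illustrative only; the tree format and the names `generate_tac` and `emit` are assumptions.

```python
def generate_tac(tree):
    """Flatten ('=', target, expression) into three-address instructions."""
    code = []
    counter = 0

    def new_temp():
        nonlocal counter
        counter += 1
        return f"t{counter}"

    def emit(node):
        kind = node[0]
        if kind == "id":
            return f"id{node[1]}"
        if kind == "num":
            return node[1]
        if kind == "inttofloat":
            operand = emit(node[1])
            temp = new_temp()
            code.append(f"{temp} = inttofloat({operand})")
            return temp
        op, left, right = node
        left_name = emit(left)           # operands first (post-order)
        right_name = emit(right)
        temp = new_temp()
        code.append(f"{temp} = {left_name} {op} {right_name}")
        return temp

    _, target, expression = tree
    value = emit(expression)
    code.append(f"{emit(target)} = {value}")
    return code

tree = ("=", ("id", 1),
        ("+", ("id", 2),
              ("*", ("id", 3), ("inttofloat", ("num", "60")))))
tac = generate_tac(tree)
print("\n".join(tac))
```

Because operands are emitted before their operator, the conversion and the multiplication appear before the addition that uses their results.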


Code optimization

• Attempts to improve the intermediate code so that better target code will result.

t1 = id3 * 60.0

id1 = id2 + t1
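Two simple improvements are enough to shorten the four-instruction sequence to two instructions: converting the constant 60 at compile time, and writing the sum directly into id1 instead of copying it from a temporary. The sketch below is a toy pass under assumptions: instructions are hypothetical `(dest, op, arg1, arg2)` tuples, the name `optimize` is invented, and each temporary is assumed to be used exactly once. Temporaries are not renumbered, so the surviving one is called t2 rather than t1.

```python
def optimize(code):
    """Fold inttofloat of integer literals and remove a trailing copy."""
    constants = {}    # temporary -> constant that replaces it
    result = []
    for dest, op, a, b in code:
        a = constants.get(a, a)
        b = constants.get(b, b)
        if op == "inttofloat" and a.isdigit():
            constants[dest] = f"{a}.0"    # do the conversion at compile time
            continue
        # Assumption: temporary `a` has no later uses, so the previous
        # instruction can write straight into `dest`.
        if op == "copy" and result and result[-1][0] == a and a.startswith("t"):
            previous = result.pop()
            result.append((dest,) + previous[1:])
            continue
        result.append((dest, op, a, b))
    return result

tac = [
    ("t1", "inttofloat", "60", None),
    ("t2", "*", "id3", "t1"),
    ("t3", "+", "id2", "t2"),
    ("id1", "copy", "t3", None),
]
optimized = optimize(tac)
for dest, op, a, b in optimized:
    print(f"{dest} = {a} {op} {b}")
```

A production optimizer would verify the single-use assumption with data-flow information before rewriting anything.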


Code Generation

• Final phase of the compiler, which generates the target code.
• Memory locations are selected for each variable used by the program.
• Intermediate instructions are translated into sequences of machine instructions that perform the same task.
• For example, using registers R1 and R2:


Code Generation

LDF R2, id3

MULF R2, R2, #60.0

LDF R1, id2

ADDF R1, R1, R2

STF id1, R1
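The translation from the optimized intermediate code to these five instructions can be sketched as follows. This is a toy generator under stated assumptions: the tuple format `(dest, op, left, right)`, the name `generate_code`, and the two-register pool are hypothetical, the left operand is assumed never to be a constant, and registers are never released. The pool hands out R2 first only so that the register numbering matches the listing above.

```python
OPCODES = {"+": "ADDF", "*": "MULF"}

def generate_code(tac):
    """Translate (dest, op, left, right) tuples into toy assembly."""
    free_registers = ["R1", "R2"]   # pop() takes from the end: R2, then R1
    location = {}                   # temporary -> register holding its value
    asm = []
    for dest, op, left, right in tac:
        if left in location:
            reg = location[left]
        else:
            reg = free_registers.pop()
            asm.append(f"LDF {reg}, {left}")         # load from memory
        if right in location:
            operand = location[right]
        elif right.startswith("id"):
            operand = free_registers.pop()
            asm.append(f"LDF {operand}, {right}")
        else:
            operand = f"#{right}"                    # immediate constant
        asm.append(f"{OPCODES[op]} {reg}, {reg}, {operand}")
        if dest.startswith("id"):
            asm.append(f"STF {dest}, {reg}")         # store back to memory
        else:
            location[dest] = reg                     # keep temporary in register
    return asm

tac = [("t1", "*", "id3", "60.0"), ("id1", "+", "id2", "t1")]
assembly = generate_code(tac)
print("\n".join(assembly))
```

The temporary t1 never touches memory: it lives in R2 from the multiplication until the addition consumes it.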


Symbol Table Management

• It is a data structure that contains a record for each identifier, with fields for the attributes of the identifier.
• As soon as an identifier is recognized by the scanner (lexical analyzer), it is entered into the symbol table.
• An essential function of a compiler is to record the identifiers together with their attributes (type, scope, storage location, etc.).
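A symbol table of this kind can be sketched as a small class that keeps one record per identifier and lets later phases add attributes. The class name `SymbolTable`, its methods, and the attribute values used below are assumptions made for illustration.

```python
class SymbolTable:
    """One record per identifier; entry numbers start at 1."""

    def __init__(self):
        self._index = {}      # name -> entry number
        self._records = []    # attribute dictionaries, in insertion order

    def insert(self, name, **attributes):
        """Enter the name if it is new, merge attributes, return its entry."""
        if name not in self._index:
            self._records.append({"name": name})
            self._index[name] = len(self._records)
        self._records[self._index[name] - 1].update(attributes)
        return self._index[name]

    def lookup(self, entry):
        """Return the record stored under an entry number."""
        return self._records[entry - 1]

table = SymbolTable()
for lexeme in ("position", "initial", "rate"):
    table.insert(lexeme)                 # done by the scanner

# A later phase fills in attributes (values here are hypothetical).
table.insert("rate", type="float", scope="global")
print(table.lookup(3))
```

Inserting a name a second time returns the existing entry number, so the scanner and later phases always refer to the same record.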


The grouping of phases

• Compiler front and back ends:

• Front end: analysis:
  • It consists of those phases, or parts of phases, that depend primarily on the source language and are largely independent of the target machine.


The grouping of phases

• Compiler front and back ends:

• Back end: synthesis (machine dependent):
  • It includes those portions of the compiler that depend on the target machine; generally, these portions do not depend on the source language.


The grouping of phases

• Advantage of the analysis-synthesis concept:
  • One can take the front end of a compiler and redo its associated back end to produce a compiler for the same source language on a different machine.
  • If the back end is designed carefully, it may not even be necessary to redesign too much of the back end.
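The retargeting idea can be illustrated by feeding the same intermediate code to two different back ends. Both target machines below are hypothetical (a three-operand machine and a stack machine), as are the function names and the tuple format for the intermediate code.

```python
# Intermediate code produced once by the front end for
# position = initial + rate * 60
tac = [("t1", "*", "id3", "60.0"), ("id1", "+", "id2", "t1")]

def three_operand_backend(code):
    """Back end for an imagined machine with three-operand instructions."""
    opcode = {"+": "ADD", "*": "MUL"}
    return [f"{opcode[op]} {dest}, {left}, {right}"
            for dest, op, left, right in code]

def stack_backend(code):
    """Back end for an imagined stack machine."""
    opcode = {"+": "add", "*": "mul"}
    asm = []
    for dest, op, left, right in code:
        asm += [f"push {left}", f"push {right}", opcode[op], f"pop {dest}"]
    return asm

print("\n".join(three_operand_backend(tac)))
print("\n".join(stack_backend(tac)))
```

Only the back-end function changes between targets; the front end and the intermediate code stay the same.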


Compiler Construction Tools

• Software development tools are available to implement one or more compiler phases:
  • Scanner generators
  • Parser generators
  • Syntax-directed translation engines
  • Automatic code generators
  • Data-flow engines
