CH1.1

CSE244

Chapter 1: Introduction to Compiling

Aggelos Kiayias
Computer Science & Engineering Department
The University of Connecticut
371 Fairfield Road, Box U-2155
Storrs, CT 06269-2155
[email protected]
http://www.cse.uconn.edu/~akiayias

Additional Notes Credits:
Steven A. Demurjian, CSE, UCONN
Robert LaBarre, United Technologies Research Center



CH1.2

Introduction to Compilers

As a Discipline, Involves Multiple CS&E Areas:
  • Programming Languages and Algorithms
  • Theory of Computing & Software Engineering
  • Computer Architecture & Operating Systems

Has Deceptively Simple Intent:

Source program ---> Compiler ---> Target Program
                       |
                       v
                 Error messages

Diverse & Varied

CH1.3

Classifications of Compilers

Compilers Viewed from Many Perspectives:
  • Construction: Single Pass, Multiple Pass, Load & Go
  • Functional: Debugging, Optimizing

However, all utilize the same basic tasks to accomplish their actions.

CH1.4

The Model

The TWO Fundamental Parts:
  • Analysis: Decompose Source into an intermediate representation
  • Synthesis: Target program generation from representation

We Will Discuss Both in This Class, and FOCUS on analysis.

CH1.5

Important Notes

Today: There are many Software Tools for helping with the Analysis Part. This Wasn’t the Case in Early Days.

(Some) analysis is also important in:
  • Structure / Syntax directed editors: Force “syntactically” correct code to be entered
  • Pretty Printers: Standardized version for program structure (i.e., blank space, indenting, etc.)
  • Static Checkers: A “quick” compilation to detect rudimentary errors
  • Interpreters: “real” time execution of code a “line-at-a-time”

CH1.6

Important Notes

Compilation Is Not Limited to Programming Language Applications:
  • Text Formatters: LATEX & TROFF Are Languages Whose Commands Format Text
  • Silicon Compilers: Textual / Graphical: Take Input and Generate Circuit Design
  • Database Query Processors: Database Query Languages Are Also a Programming Language. Input Is Compiled Into a Set of Operations for Accessing the Database

CH1.7

The Many Phases of a Compiler

Source Program
  1. Lexical Analyzer
  2. Syntax Analyzer
  3. Semantic Analyzer
  4. Intermediate Code Generator
  5. Code Optimizer
  6. Code Generator
Target Program

Alongside all six phases: Symbol-table Manager, Error Handler

1, 2, 3 : Analysis - Our Focus
4, 5, 6 : Synthesis

CH1.8

Language-Processing System

Source Program
      |
      v
1. Pre-Processor
      |
      v
2. Compiler
      |
      v
3. Assembler
      |
      v
4. Relocatable Machine Code
      |
      v
5. Loader / Link-Editor  <--- Library, relocatable object files
      |
      v
Executable

CH1.9

The Analysis Task For Compilation

Three Phases:
  • Linear / Lexical Analysis: Left-to-right scan to identify tokens (token: sequence of chars having a collective meaning)
  • Hierarchical Analysis: Grouping of tokens into meaningful collections
  • Semantic Analysis: Checking to ensure correctness of components

CH1.10

Phase 1. Lexical Analysis

Easiest Analysis - Identify tokens, which are the basic building blocks

For Example:

Position := initial + rate * 60 ;
________ __ _______ _ ____ _ __ _

All are tokens

Blanks, Line breaks, etc. are scanned out
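The scan described above can be sketched in a few lines of Python. This is only an illustration, not the course's scanner: the token class names (ID, NUM, ASSIGN, OP, SEMI) and the function name tokenize are assumptions made for this example.

```python
import re

# Token classes for this sketch; the class names are illustrative assumptions.
TOKEN_SPEC = [
    ("NUM",    r"\d+"),
    ("ID",     r"[A-Za-z_]\w*"),
    ("ASSIGN", r":="),
    ("OP",     r"[+*]"),
    ("SEMI",   r";"),
    ("SKIP",   r"\s+"),   # blanks and line breaks are scanned out
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))

def tokenize(source):
    """Left-to-right scan that returns a list of (kind, lexeme) pairs."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at offset {pos}")
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens

for kind, lexeme in tokenize("Position := initial + rate * 60 ;"):
    print(kind, lexeme)
```

The loop never looks back or nests, which is why lexical analysis is called linear: each token is recognized from the current position alone. The example statement yields eight tokens.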

CH1.11

Phase 2. Hierarchical Analysis, aka Parsing or Syntax Analysis

For previous example, we would have

Parse Tree:

assignment statement
  identifier: position
  :=
  expression
    expression
      identifier: initial
    +
    expression
      expression
        identifier: rate
      *
      expression
        number: 60

Nodes of tree are constructed using a grammar for the language

CH1.12

What is a Grammar?

Grammar is a Set of Rules Which Govern the Interdependencies & Structure Among the Tokens

statement is an assignment statement, or while statement, or if statement, or ...

assignment statement is an identifier := expression ;

expression is an (expression), or expression + expression, or expression * expression, or number, or identifier, or ...
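To make the rules concrete, here is a minimal recursive-descent sketch in Python for the assignment-statement and expression rules. The rules as written do not say whether + or * binds tighter, so this sketch assumes the usual convention (* before +, both left-associative). The function name parse_assignment and the tuple-shaped tree are assumptions for illustration, and tokens are obtained by simply splitting on blanks.

```python
def parse_assignment(tokens):
    """assignment statement is an identifier := expression ;"""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def expect(text):
        nonlocal pos
        if peek() != text:
            raise SyntaxError(f"expected {text!r}, found {peek()!r}")
        pos += 1

    def primary():
        # (expression), or number, or identifier
        nonlocal pos
        tok = peek()
        if tok == "(":
            expect("(")
            node = expression()
            expect(")")
            return node
        if tok is not None and tok.isdigit():
            pos += 1
            return ("number", tok)
        if tok is not None and tok.isidentifier():
            pos += 1
            return ("identifier", tok)
        raise SyntaxError(f"unexpected token {tok!r}")

    def term():
        # expression * expression (assumed to bind tighter than +)
        node = primary()
        while peek() == "*":
            expect("*")
            node = ("*", node, primary())
        return node

    def expression():
        # expression + expression
        node = term()
        while peek() == "+":
            expect("+")
            node = ("+", node, term())
        return node

    target = peek()
    if target is None or not target.isidentifier():
        raise SyntaxError("assignment must start with an identifier")
    pos += 1
    expect(":=")
    value = expression()
    expect(";")
    if peek() is not None:
        raise SyntaxError("trailing tokens after ';'")
    return (":=", ("identifier", target), value)

tree = parse_assignment("position := initial + rate * 60 ;".split())
print(tree)
```

The functions call each other recursively (primary calls expression for a parenthesized sub-expression), which is exactly the recursion that a linear scan cannot express.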

CH1.13

Why Have We Divided Analysis in This Manner?

Lexical Analysis - Scans Input, Its Linear Actions Are Not Recursive
  • Identify Only Individual “words” that are the Tokens of the Language

Recursion Is Required to Identify Structure of an Expression, As Indicated in Parse Tree
  • Verify that the “words” are Correctly Assembled into “sentences”

What is Third Phase?
  • Determine Whether the Sentences have One and Only One Unambiguous Interpretation
  • … and do something about it!
  • e.g. “John Took Picture of Mary Out on the Patio”

CH1.14

Phase 3. Semantic Analysis

Find More Complicated Semantic Errors and Support Code Generation

Parse Tree Is Augmented With Semantic Actions

Compressed Tree:

:=
  position
  +
    initial
    *
      rate
      60

Conversion Action:

:=
  position
  +
    initial
    *
      rate
      inttoreal
        60

CH1.15

Phase 3. Semantic Analysis

Most Important Activity in This Phase:

Type Checking - Legality of Operands

Many Different Situations:

Real := int + char ;

A[int] := A[real] + int ;

while char <> int do

…. Etc.
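A minimal Python sketch of type checking on an expression tree, including the inttoreal conversion action, might look as follows. The tuple tree shape, the SYMBOL_TYPES table, and the rule that only int and real operands are legal for + and * are assumptions made for this illustration; a real language definition fixes its own rules.

```python
# Declared types for the running example; these declarations are assumed.
SYMBOL_TYPES = {"position": "real", "initial": "real", "rate": "real"}

def check(node):
    """Return (typed_node, type), inserting inttoreal where an int meets a real."""
    kind = node[0]
    if kind == "number":
        return node, "int"
    if kind == "identifier":
        name = node[1]
        if name not in SYMBOL_TYPES:
            raise TypeError(f"undeclared identifier {name!r}")
        return node, SYMBOL_TYPES[name]
    if kind in ("+", "*"):
        left, left_type = check(node[1])
        right, right_type = check(node[2])
        legal = ("int", "real")
        if left_type not in legal or right_type not in legal:
            raise TypeError(f"illegal operands for {kind}: {left_type}, {right_type}")
        if left_type == right_type:
            return (kind, left, right), left_type
        # Mixed int/real operands: widen the int side to real.
        if left_type == "int":
            left = ("inttoreal", left)
        else:
            right = ("inttoreal", right)
        return (kind, left, right), "real"
    raise TypeError(f"unknown node kind {kind!r}")

tree = ("+", ("identifier", "initial"),
             ("*", ("identifier", "rate"), ("number", "60")))
typed_tree, result_type = check(tree)
print(typed_tree)
print(result_type)
```

Under these assumed rules, an operand of type char in an arithmetic expression (as in Real := int + char) is rejected with a TypeError, while the integer 60 is wrapped in an inttoreal node.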

CH1.16

Supporting Phases / Activities for Analysis

Symbol Table Creation / Maintenance
  • Contains Info (storage, type, scope, args) on Each “Meaningful” Token, Typically Identifiers
  • Data Structure Created / Initialized During Lexical Analysis
  • Utilized / Updated During Later Analysis & Synthesis

Error Handling
  • Detection of Different Errors Which Correspond to All Phases
  • What Kinds of Errors Are Found During the Analysis Phase?
  • What Happens When an Error Is Found?
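One possible shape for such a symbol table, sketched in Python. The class and field names (Entry, SymbolTable, enter_scope, and so on) are hypothetical; the point is only that an entry is created early (for example when the scanner first sees an identifier) and its type and storage are filled in by later phases.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Entry:
    name: str
    type: Optional[str] = None     # filled in during semantic analysis
    scope: int = 0                 # nesting depth where the name was entered
    storage: Optional[int] = None  # e.g. an offset assigned during synthesis

class SymbolTable:
    def __init__(self):
        self.scopes = [{}]         # innermost scope is last

    def enter_scope(self):
        self.scopes.append({})

    def exit_scope(self):
        self.scopes.pop()

    def insert(self, name):
        """Create the entry on first sight; return the existing one otherwise."""
        scope = self.scopes[-1]
        if name not in scope:
            scope[name] = Entry(name, scope=len(self.scopes) - 1)
        return scope[name]

    def lookup(self, name):
        """Search from the innermost scope outward; None if undeclared."""
        for scope in reversed(self.scopes):
            if name in scope:
                return scope[name]
        return None

table = SymbolTable()
for name in ("position", "initial", "rate"):
    table.insert(name)             # created during lexical analysis
table.lookup("rate").type = "real" # updated by a later phase
print(table.lookup("rate"))
```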

CH1.17

The Many Phases of a Compiler

Source Program
  1. Lexical Analyzer
  2. Syntax Analyzer
  3. Semantic Analyzer
  4. Intermediate Code Generator
  5. Code Optimizer
  6. Code Generator
Target Program

Alongside all six phases: Symbol-table Manager, Error Handler

1, 2, 3 : Analysis - Our Focus
4, 5, 6 : Synthesis

CH1.18

The Synthesis Task For Compilation

Intermediate Code Generation
  • Abstract Machine Version of Code - Independent of Architecture
  • Easy to Produce and Do Final, Machine Dependent Code Generation

Code Optimization
  • Find More Efficient Ways to Execute Code
  • Replace Code With More Optimal Statements
  • 2-approaches: High-level Language & “Peephole” Optimization

Final Code Generation
  • Generate Relocatable Machine Dependent Code
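As a small illustration of the optimization step, the Python sketch below applies two simple rewrites to a list of intermediate instructions: it converts an integer constant to real once at compile time, and it removes a temporary that is only copied into a variable. The instruction format (dest, op, arg1, arg2) and the names temp1, id1, and so on are assumptions for this example, and the copy rewrite assumes the temporary is not used again later.

```python
def optimize(code):
    """Apply two simple rewrites to (dest, op, arg1, arg2) instructions."""
    # Rewrite 1: fold inttoreal applied to an integer literal at compile time.
    constants = {}
    folded = []
    for dest, op, a, b in code:
        a = constants.get(a, a)
        b = constants.get(b, b)
        if op == "inttoreal" and a.isdigit():
            constants[dest] = str(float(a))   # "60" becomes "60.0"
            continue
        folded.append((dest, op, a, b))

    # Rewrite 2 (peephole over adjacent pairs):
    #   t := x op y ; v := t   becomes   v := x op y
    # This assumes the temporary t is not used again afterwards.
    result = []
    for instr in folded:
        dest, op, a, b = instr
        if op == "copy" and result and result[-1][0] == a and a.startswith("temp"):
            previous = result.pop()
            result.append((dest,) + previous[1:])
        else:
            result.append(instr)
    return result

code = [
    ("temp1", "inttoreal", "60", None),
    ("temp2", "*", "id3", "temp1"),
    ("temp3", "+", "id2", "temp2"),
    ("id1", "copy", "temp3", None),
]
for instr in optimize(code):
    print(instr)
```

Four instructions shrink to two. The surviving temporary keeps its original name (temp2) in this sketch; an optimizer could renumber it.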

CH1.19

Reviewing the Entire Process

position := initial + rate * 60
        |
        v
  lexical analyzer
        |
        v
id1 := id2 + id3 * 60
        |
        v
  syntax analyzer
        |
        v
:=
  id1
  +
    id2
    *
      id3
      60
        |
        v
  semantic analyzer
        |
        v
:=
  id1
  +
    id2
    *
      id3
      inttoreal
        60
        |
        v
  intermediate code generator

Symbol Table:
  position ....
  initial ….
  rate….

Errors

CH1.20

Reviewing the Entire Process

  intermediate code generator
        |
        v
temp1 := inttoreal(60)
temp2 := id3 * temp1
temp3 := id2 + temp2
id1 := temp3
(3 address code)
        |
        v
  code optimizer
        |
        v
temp1 := id3 * 60.0
id1 := id2 + temp1
        |
        v
  final code generator
        |
        v
MOVF id3, R2
MULF #60.0, R2
MOVF id2, R1
ADDF R2, R1
MOVF R1, id1

Symbol Table:
  position ....
  initial ….
  rate….

Errors
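The intermediate code generation step can be sketched as a post-order walk over the tree that allocates a fresh temporary for every interior node. This Python sketch assumes a tuple-shaped tree and the helper name generate, neither of which comes from the slides.

```python
import itertools

def generate(tree):
    """Return three-address instructions for an assignment tree."""
    code = []
    counter = itertools.count(1)

    def emit(node):
        kind = node[0]
        if kind in ("identifier", "number"):
            return node[1]                      # leaves need no instruction
        if kind == "inttoreal":
            operand = emit(node[1])
            temp = f"temp{next(counter)}"
            code.append(f"{temp} := inttoreal({operand})")
            return temp
        left = emit(node[1])                    # binary operator: + or *
        right = emit(node[2])
        temp = f"temp{next(counter)}"
        code.append(f"{temp} := {left} {kind} {right}")
        return temp

    _, target, value = tree
    result = emit(value)
    code.append(f"{target[1]} := {result}")
    return code

tree = (":=", ("identifier", "id1"),
        ("+", ("identifier", "id2"),
              ("*", ("identifier", "id3"),
                    ("inttoreal", ("number", "60")))))
for line in generate(tree):
    print(line)
# temp1 := inttoreal(60)
# temp2 := id3 * temp1
# temp3 := id2 + temp2
# id1 := temp3
```

Each instruction has at most one operator and three addresses, which is why the form is called 3 address code.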

CH1.21

Assemblers

Assembly code: names are used for instructions, and names are used for memory addresses.

Two-pass Assembly:
  • First Pass: all identifiers are assigned to memory addresses (0-offset), e.g. substitute 0 for a, and 4 for b
  • Second Pass: produce relocatable machine code:

MOV a, R1       0001 01 00 00000000 *
ADD #2, R1      0011 01 10 00000010
MOV R1, b       0010 01 00 00000100 *

* = relocation bit
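The two passes can be sketched in Python as follows. The bit layout is inferred from the example above and is an assumption of this sketch: a 4-bit opcode (0001 load, 0010 store, 0011 add), a 2-bit register number, a 2-bit tag (00 for a memory address, 10 for an immediate value), and an 8-bit operand. The function name assemble and the 4-byte word size are also assumptions, and only the three instruction forms shown are handled.

```python
def assemble(lines, word_size=4):
    """Two-pass assembly for the three instruction forms in the example."""
    parsed = []
    for line in lines:
        mnemonic, operands = line.split(None, 1)
        src, dst = [part.strip() for part in operands.split(",")]
        parsed.append((mnemonic, src, dst))

    def is_register(operand):
        return operand.startswith("R") and operand[1:].isdigit()

    def is_name(operand):
        return not is_register(operand) and not operand.startswith("#")

    # First pass: assign 0-offset addresses to identifiers in order of appearance.
    addresses = {}
    for _, src, dst in parsed:
        for operand in (src, dst):
            if is_name(operand) and operand not in addresses:
                addresses[operand] = len(addresses) * word_size

    # Second pass: emit relocatable machine code, "*" marks a relocatable operand.
    output = []
    for mnemonic, src, dst in parsed:
        if mnemonic == "MOV" and is_name(src) and is_register(dst):    # load
            opcode, reg, tag, value, reloc = "0001", dst, "00", addresses[src], True
        elif mnemonic == "MOV" and is_register(src) and is_name(dst):  # store
            opcode, reg, tag, value, reloc = "0010", src, "00", addresses[dst], True
        elif mnemonic == "ADD" and src.startswith("#") and is_register(dst):
            opcode, reg, tag, value, reloc = "0011", dst, "10", int(src[1:]), False
        else:
            raise ValueError(f"unsupported instruction: {mnemonic} {src}, {dst}")
        word = f"{opcode} {int(reg[1:]):02b} {tag} {value:08b}"
        output.append(word + (" *" if reloc else ""))
    return addresses, output

addresses, words = assemble(["MOV a, R1", "ADD #2, R1", "MOV R1, b"])
print(addresses)
for word in words:
    print(word)
```

The first pass has to finish before any code is emitted because an instruction may refer to a name whose address is only fixed later in the program.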

CH1.22

Loaders and Link-Editors

Loader: takes relocatable machine code, alters the addresses, and places the altered instructions into memory.

Link-editor: takes many (relocatable) machine code programs (with cross-references) and produces a single file.
  • Need to keep track of correspondence between variable names and corresponding addresses in each piece of code.
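The loader's address alteration can be sketched in Python. The sketch assumes relocatable words written as bit-string fields with a trailing "*" on every word whose 8-bit operand is an address; that word format and the function name load are assumptions for this example. Loading at address L simply adds L to each marked operand.

```python
def load(words, load_address):
    """Add load_address to every operand marked relocatable with '*'."""
    memory = []
    for word in words:
        relocatable = word.endswith("*")
        opcode, reg, tag, operand = word.rstrip(" *").split()
        if relocatable:
            operand = f"{int(operand, 2) + load_address:08b}"
        memory.append(f"{opcode} {reg} {tag} {operand}")
    return memory

relocatable_code = [
    "0001 01 00 00000000 *",   # operand is an address: relocate
    "0011 01 10 00000010",     # operand is an immediate value: leave alone
    "0010 01 00 00000100 *",
]
for word in load(relocatable_code, load_address=15):
    print(word)
# 0001 01 00 00001111
# 0011 01 10 00000010
# 0010 01 00 00010011
```

With a load address of 15, the two relocatable operands 0 and 4 become 15 and 19, while the immediate operand 2 is untouched.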

CH1.23

Compiler Cousins: Preprocessors Provide Input to Compilers

1. Macro Processing

#define in C: does text substitution before compiling

#define X 3

#define Y A*B+C

#define Z getchar()
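The text substitution can be imitated in Python to show what the compiler actually receives. This is a deliberately simplified sketch, not how a C preprocessor is implemented: it replaces whole identifiers once, does not rescan the result, and does not skip string literals. The function name expand is an assumption.

```python
import re

def expand(macros, line):
    """Replace each whole-word macro name with its body (single pass, no rescan)."""
    def substitute(match):
        word = match.group(0)
        return macros.get(word, word)
    return re.sub(r"[A-Za-z_]\w*", substitute, line)

macros = {"X": "3", "Y": "A*B+C", "Z": "getchar()"}
print(expand(macros, "n = 2 * Y + X;"))
# n = 2 * A*B+C + 3;
```

Because the substitution is purely textual, 2 * Y becomes 2 * A*B+C, which the compiler then parses as (2*A*B)+C rather than 2*(A*B+C). This is why macro bodies are usually parenthesized.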

CH1.24

2. File Inclusion

#include in C - bring in another file before compiling

defs.h:
  //////
  //////
  //////

main.c:
  #include "defs.h"
  …---…---…---…---…---…---…---…---…---

main.c after inclusion:
  //////
  //////
  //////
  …---…---…---…---…---…---…---…---…---

CH1.25

3. Rational Preprocessors

Augment “Old” Languages With Modern Constructs

Add Macros for If - Then, While, Etc.

#define Can Make C Code More Pascal-like

#define begin {

#define end }

#define then

CH1.26

4. Language Extensions for a Database System

EQUEL - Database query language embedded in a programming language (C)

## Retrieve (DN=Department.Dnum) where

## Department.Dname = ‘Research’

is Preprocessed into:

ingres_system(“Retr…..Research’”,____,____);

a procedure call in a programming language.

CH1.27

The Grouping of Phases

Front End : Analysis + Intermediate Code Generation
    vs.
Back End : Code Generation + Optimization

Number of Passes:

A pass: requires r/w intermediate files

Fewer passes: more efficiency.

However: fewer passes require more sophisticated memory management and compiler phase interaction.

Tradeoffs ……..

CH1.28

Compiler Construction Tools

Parser Generators : Produce Syntax Analyzers <= Yacc (Bison)

Scanner Generators : Produce Lexical Analyzers <= Lex (Flex)

Syntax-directed Translation Engines : Generate Intermediate Code

Automatic Code Generators : Generate Actual Code

Data-Flow Engines : Support Optimization