Language Processing Systems Prof. Mohamed Hamada Software Engineering Lab. The University of Aizu Japan


Language Processing Systems

Prof. Mohamed Hamada

Software Engineering Lab. The University of Aizu

Japan


Review


Compiler Architecture

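[Figure: compiler pipeline]

Source language
  → Scanner (lexical analysis) → tokens
  → Parser (syntax analysis) → parse tree
  → Semantic Analysis → AST
  → IC generator → intermediate language
  → Code Optimizer → OIL
  → Code Generator → target language

The Error Handler and the Symbol Table sit alongside the pipeline and are used by its phases.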


Compiler Architecture

[Figure: the compiler pipeline (Scanner → Parser → Semantic Analysis → IC generator → Code Optimizer → Code Generator, with the Error Handler and Symbol Table alongside), with its phases grouped into a Front End and a Back End.]


Front-end and Back-end

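•  Source program in Language-1 → Language-1 Front End → Non-optimized Intermediate Code
•  Source program in Language-2 → Language-2 Front End → Non-optimized Intermediate Code
•  Non-optimized Intermediate Code → Intermediate-code Optimizer → Optimized Intermediate Code
•  Optimized Intermediate Code → Target-1 Code Generator → Target-1 machine code
•  Optimized Intermediate Code → Target-2 Code Generator → Target-2 machine code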


Front-end and Back-end

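•  Suppose you want to write compilers from 3 source languages to 4 computer platforms:

–  Source languages: C++, Java, FORTRAN
–  Target platforms: MIPS, SPARC, Pentium, PowerPC

•  With one compiler per (language, platform) pair, we need to write 3 × 4 = 12 programs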


Front-end and Back-end

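•  But we can do better by routing every language through a shared intermediate representation:

–  IR: Intermediate Representation
–  FE: Front-End
–  BE: Back-End

•  Each source language (C++, Java, FORTRAN) gets one FE that translates it to the IR
•  Each target platform (MIPS, SPARC, Pentium, PowerPC) gets one BE that translates the IR to machine code

We need to write only 3 + 4 = 7 programs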
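The 12-versus-7 count can be checked with a few lines of Python. This is only an illustrative sketch of the arithmetic: the variable names are hypothetical and the strings stand in for whole compilers, front ends, and back ends.

```python
from itertools import product

languages = ["C++", "Java", "FORTRAN"]
platforms = ["MIPS", "SPARC", "Pentium", "PowerPC"]

# Direct approach: one complete compiler per (language, platform) pair.
direct = list(product(languages, platforms))

# Shared-IR approach: one front end per language, one back end per platform.
front_ends = [f"{lang} -> IR" for lang in languages]
back_ends = [f"IR -> {plat}" for plat in platforms]

direct_count = len(direct)                            # m * n
shared_ir_count = len(front_ends) + len(back_ends)    # m + n
print(direct_count, shared_ir_count)
```

In general, m languages and n platforms need m × n direct compilers but only m + n components with a shared IR, so the saving grows as either list gets longer.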


Scanner

[Figure: the compiler architecture diagram (Front End and Back End), with the Scanner (lexical analysis) phase called out.]

How does it work?

–  Use regular expressions to define tokens
–  Use finite automata to recognize tokens

How do we write it?

–  Use the Unix tool Lex
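As a rough illustration of defining tokens with regular expressions, here is a minimal hand-written scanner in Python. It is a sketch, not Lex output: the token names and patterns in `TOKEN_SPEC` are assumptions chosen for a small assignment language, and Python's `re` engine stands in for the finite automaton that a generated scanner would use.

```python
import re

# Hypothetical token set for illustration; the slides do not fix one.
TOKEN_SPEC = [
    ("NUMBER", r"\d+(?:\.\d+)?"),
    ("ID",     r"[A-Za-z_]\w*"),
    ("ASSIGN", r":="),
    ("OP",     r"[+\-*/()]"),
    ("SKIP",   r"\s+"),
]

# One master pattern: each token class becomes a named alternative.
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))


def scan(source):
    """Return a list of (kind, text) tokens, skipping whitespace."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(
                f"unexpected character {source[pos]!r} at position {pos}"
            )
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


print(scan("position := initial + rate * 60"))
```

The order of entries in `TOKEN_SPEC` matters, because alternatives are tried left to right at each position.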


Parser

[Figure: the compiler architecture diagram (Front End and Back End), with the Parser (syntax analysis) phase called out.]

How does it work?

–  Use top-down (LL(k)) or bottom-up (LR(k)) parsing to build the parse tree

How do we write it?

–  Use the Unix tool Yacc
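To make the top-down option concrete, the following is a small recursive-descent (predictive, one token of lookahead) parser in Python. It is a sketch under assumptions: the grammar is a hypothetical one for assignments with `+`, `*`, and parentheses, tokens are `(kind, text)` tuples, and the tree is built from nested tuples. Yacc itself generates bottom-up (LALR) parsers, which work differently.

```python
# Assumed grammar (not from the slides):
#   stmt   -> ID ":=" expr
#   expr   -> term   { "+" term }
#   term   -> factor { "*" factor }
#   factor -> ID | NUMBER | "(" expr ")"

def parse_assignment(tokens):
    """Parse a list of (kind, text) tuples into a nested-tuple tree."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else ("EOF", "")

    def expect(kind, text=None):
        nonlocal pos
        tok = peek()
        if tok[0] != kind or (text is not None and tok[1] != text):
            raise SyntaxError(f"expected {text or kind}, got {tok[1]!r}")
        pos += 1
        return tok

    def factor():
        kind, text = peek()
        if kind in ("ID", "NUMBER"):
            expect(kind)
            return text
        expect("OP", "(")
        node = expr()
        expect("OP", ")")
        return node

    def term():
        node = factor()
        while peek() == ("OP", "*"):
            expect("OP", "*")
            node = ("*", node, factor())
        return node

    def expr():
        node = term()
        while peek() == ("OP", "+"):
            expect("OP", "+")
            node = ("+", node, term())
        return node

    target = expect("ID")[1]
    expect("ASSIGN")
    tree = (":=", target, expr())
    expect("EOF")  # reject trailing tokens
    return tree


tokens = [("ID", "position"), ("ASSIGN", ":="), ("ID", "initial"),
          ("OP", "+"), ("ID", "rate"), ("OP", "*"), ("NUMBER", "60")]
print(parse_assignment(tokens))
```

Each nonterminal becomes one function, and the loops in `expr` and `term` replace left-recursive rules, which a top-down parser cannot handle directly.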


Parsing

•  Top Down Parsing
–  Predictive Parsing
–  LL(k) Parsing
–  Left Recursion
–  Left Factoring

•  Bottom Up Parsing
–  Shift-reduce Parsing
–  LR(k) Parsing


Scanner and Parser

•  The Parser asks the Scanner to "get next token"; the Scanner returns the next token
•  The Scanner asks the Source Program to "get next char"; it receives the next char
•  The symbol table contains a record for each identifier

Scanner

1.  Uses Regular Expressions to define tokens
2.  Uses Finite Automata to recognize tokens

Parser

•  Uses Top-down parsing or Bottom-up parsing to construct a Parse tree


Semantic analysis

[Figure: the compiler architecture diagram (Front End and Back End), with the Semantic Analysis phase called out.]

Semantic Analysis

•  Abstract Syntax Tree
•  Scope
•  Symbol Table
•  Type Checker
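One job of the type checker can be sketched as follows: walk the tree, look up identifier types in the symbol table, and insert a conversion where an int meets a real. Everything here is an assumption for illustration: the tree is made of nested tuples, the only types are `int` and `real`, only binary arithmetic is handled, and the symbol table is a plain dict.

```python
def check(node, symbols):
    """Return (typed_node, type) for an expression tree of nested tuples.

    Leaves are Python ints/floats (literals) or strings (identifiers).
    Inner nodes are (op, left, right). Only "int" and "real" exist here.
    """
    if isinstance(node, int):
        return node, "int"
    if isinstance(node, float):
        return node, "real"
    if isinstance(node, str):
        if node not in symbols:
            raise NameError(f"undeclared identifier {node!r}")
        return node, symbols[node]

    op, left, right = node
    left, ltype = check(left, symbols)
    right, rtype = check(right, symbols)
    if ltype != rtype:
        # Mixed int/real operands: widen the int side to real.
        if ltype == "int":
            left = ("int-to-real", left)
        else:
            right = ("int-to-real", right)
        return (op, left, right), "real"
    return (op, left, right), ltype


# Hypothetical symbol table: both identifiers declared as real.
symbols = {"initial": "real", "rate": "real"}
typed, result_type = check(("+", "initial", ("*", "rate", 60)), symbols)
print(typed, result_type)
```

A fuller checker would also handle assignment targets, scopes, and type errors that no conversion can fix; this sketch only shows the coercion step.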


Intermediate Code (IC) Generator

[Figure: the compiler architecture diagram (Front End and Back End), with the IC generator phase called out.]

IC generator

•  Three-address code
•  Directed Acyclic Graph (DAG)
•  Control Flow Graph (CFG)
•  Stack based (postfix)
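Of the forms listed, the stack-based (postfix) one is the easiest to sketch: a post-order walk of the expression tree emits operands first and the operator last. The instruction names (`push`, `add`, `mul`), the tuple encoding, and the tiny interpreter below are all hypothetical, chosen only to show the idea.

```python
def to_postfix(node):
    """Emit stack-machine instructions for a nested-tuple expression tree."""
    if not isinstance(node, tuple):
        return [("push", node)]
    op, left, right = node
    opcode = {"+": "add", "*": "mul"}[op]
    # Post-order: left operand, right operand, then the operator.
    return to_postfix(left) + to_postfix(right) + [(opcode,)]


def run(code, env):
    """Evaluate the instructions; identifiers are looked up in env."""
    stack = []
    for instr in code:
        if instr[0] == "push":
            operand = instr[1]
            stack.append(env[operand] if isinstance(operand, str) else operand)
        else:
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right if instr[0] == "add" else left * right)
    return stack.pop()


code = to_postfix(("+", "id2", ("*", "id3", 60)))
print(code)
print(run(code, {"id2": 10.0, "id3": 2.0}))
```

No temporaries or registers are named in this form; the operand stack holds all intermediate values, which is what distinguishes it from three-address code.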


Code Generator

[Figure: the compiler architecture diagram (Front End and Back End), with the Code Generator phase called out.]

Code Generator

•  Data Dependency Graph
•  Instruction Selection
•  Register Allocation
•  Target Machine
•  Memory Management


Example

Source program

    position := initial + rate * 60

Scanner

    id1 := id2 + id3 * 60

Parser

    :=
      id1
      +
        id2
        *
          id3
          60

Semantic Analyzer

    :=
      id1
      +
        id2
        *
          id3
          int-to-real
            60

Intermediate Code Generator

    temp1 := int-to-real (60)
    temp2 := id3 * temp1
    temp3 := id2 + temp2
    id1 := temp3

Code Optimizer

    temp1 := id3 * 60.0
    id1 := id2 + temp1

Code Generator

    MOV id3, R2
    MUL #60.0, R2
    MOV id2, R1
    ADD R2, R1
    MOV R1, id1
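The Intermediate Code Generator step of this example can be reproduced with a short tree walk. The sketch below assumes the semantic analyzer's output is encoded as nested tuples, with `("int-to-real", x)` as a unary node; the function name `gen_tac` and this encoding are illustrative assumptions, not part of the slides.

```python
from itertools import count


def gen_tac(stmt):
    """Generate three-address code for a (":=", target, expr) tree."""
    code = []
    temps = count(1)

    def walk(node):
        # Leaves (identifiers, literals) need no instruction.
        if not isinstance(node, tuple):
            return str(node)
        if node[0] == "int-to-real":
            arg = walk(node[1])
            temp = f"temp{next(temps)}"
            code.append(f"{temp} := int-to-real ({arg})")
            return temp
        op, left, right = node
        left_name = walk(left)
        right_name = walk(right)
        temp = f"temp{next(temps)}"
        code.append(f"{temp} := {left_name} {op} {right_name}")
        return temp

    _, target, expr = stmt
    code.append(f"{target} := {walk(expr)}")
    return code


tree = (":=", "id1", ("+", "id2", ("*", "id3", ("int-to-real", 60))))
for line in gen_tac(tree):
    print(line)
```

Each inner node gets one fresh temporary, which is why the unoptimized code uses three temporaries where the optimizer later needs only one.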