6.1 6. Phase 3 : Code Generation Part I Overview of compilation. The unit directory. genProg.cxx. What you must do. Compiling, assembling, downloading and running C--. The Gnu assembler and the VxWorks dynamic linker. Monkey see, monkey do. gener.cxx. Structure of an M68K assembly file. Declarations.


6.1

6. Phase 3 : Code Generation Part I

• Overview of compilation.

• The unit directory.

• genProg.cxx.

• What you must do.

• Compiling, assembling, downloading and running C--.

• The Gnu assembler and the VxWorks dynamic linker.

• Monkey see, monkey do.

• gener.cxx.

• Structure of an M68K assembly file.

• Declarations.

• Statements.

6.2

Overview

• A compiler is lexer + syner + gener.

– You have written the lexer and syner. Now you write the gener.

– Easiest of the three once you know what M68K code to generate. And I tell you that bit.

[Diagram : C-- code on stdin -> compiler -> errors or M68K code on stdout]

6.3

The Unit Directory

• The unit directory for this phase is

/usr/users/staff/aosc/cm049icp/phase3

• Among other things it contains the following :

– genprog.cxx : the test bed program for phase 3.

– gener.template : a template file for your phase 3 programs.

– gener.h : the header file for phase 3.

    • trueval, falseval : constants to represent true and false in M68K code.

    • INT_MAX_16_BIT, INT_MIN_16_BIT : maximum and minimum values for 16 bit 2s complement integers.

    • RPolish : struct for holding the Reverse Polish representation of C-- expressions.

– makefile : the makefile for phase 3.

6.4

The Unit Directory II

– gener : an executable for my phase 3 program.

– tests/test*.c-- : testing programs for the demo.

– rpolish.cxx : Reverse Polish conversion programs.

    RPolish *append(RPolish *rp1, RPolish *rp2)
    RPolish *toRPF(Factor *fact)
    RPolish *toRPT(Term *term)
    RPolish *toRPBE(BasicExp *bexp)
    RPolish *toRPE(Expression *expr)

– partialExp.cxx : partial code generator for expressions.

    void genExpression(SymTab *st, Expression *expr, int &label, int &finalLabel)

    • Only handles literal constants.

    • Do a full implementation of genExpression after you’ve got the rest of it to work.

6.5

genProg.cxx

• The test bed program is as follows :

    #include ".../phase2/syner.h"
    #include ".../phase3/gener.h"

    void main()
    {
        SymTab *st = NULL ;
        AST *ast = NULL ;
        int label = 0 ;

        synAnal(st, ast, label) ;
        generate(st, ast, label) ;
    }

• First calls synAnal to parse the C--, then calls generate to produce M68K code.

• Input/Output is from/to stdin/stdout.

• A ‘real’ compiler would use argc/argv and command line arguments so that files can be used for Input/Output.

(Callout on generate : you must write this subprogram.)
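The argc/argv remark can be sketched as follows. This is only an illustration under assumptions: IoPaths and parseIoPaths are hypothetical names that appear nowhere in the unit directory, and the argument order (input file, then output file) is a guess at a sensible convention.

```cpp
#include <string>

// Hypothetical holder for the two file names. An empty name means
// "use the standard stream", which is what genProg.cxx does today.
struct IoPaths {
    std::string input ;
    std::string output ;
} ;

// Assumed usage : prog [infile [outfile]]. Extra arguments are ignored.
IoPaths parseIoPaths(int argc, const char *const argv[])
{
    IoPaths paths ;
    if (argc > 1) paths.input = argv[1] ;    // first argument : C-- source file
    if (argc > 2) paths.output = argv[2] ;   // second argument : assembly output file
    return paths ;
}
```

A main built on this would open an ifstream/ofstream for each non-empty name and fall back to cin/cout otherwise.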

6.6

What You Must Do

• Your implementation of generate must be in a file called gener.cxx in your directory.

• Take a copy of makefile and gener.template.

• Print out a copy of gener.h.

• Print out a copy of rpolish.cxx.

• Print out a copy of partialExp.cxx.

• Useful commands : testphase3, demophase3.

– They work as usual.

– Your program’s output must be exactly the same as mine to get the marks.

6.7

Compiling, Assembling, Downloading & Running C--

• C-- program in prog.c-- :

    const string s = "Hello\n" ;
    { cout << s ; }

• Make and run gener :

    jaguar> make gener
    jaguar> gener < prog.c-- > a.s
    jaguar> assem a
    jaguar>

• Connect to VxWorks box and download and run :

    rlogin moloch        (UNIX command.)
    -> ld < a
    -> run
    Hello
    ->

6.8

M68K Assembler & VxWorks Dynamic Linker

• assem is a shell script (in /usr/users/staff/aosc/bin) which calls the Gnu Motorola 68000 assembler.

• Gnu assembler is a high level Macro-Assembler.

– Supports medium level memory management.

– Makes variables/constants very easy.

• VxWorks has a dynamic linker.

– Similar to NT except that it works.

– M68K programs contain calls to library subroutines.

    • e.g. scanf, printf.

    • Run-time addresses of these subroutines are not known to the compiler.

– When programs are downloaded the required addresses are automatically linked in.

6.9

Monkey See, Monkey Do

• My code generator is in a file called gener in the unit directory for this phase.

– /usr/users/staff/aosc/cm049icp/phase3

– To work out what assembly code you need to generate run gener on C-- source code and inspect the output that is produced.

– More or less the approach I adopted using C source and the GNU C compiler, cc68k.

– Took longer than I expected because GNU assembler uses non-standard M68K assembly code mnemonics and assembler directives. Bloody idiots.

• Rest of this lecture is just a few ‘handy hints’.

6.10

Monkey See, Monkey Do II

• Monkey see, monkey do is standard in the industry.

• Usually have to tweak the instruction set of one chip into the instruction set of another.

• Tend to stick to a small set of instructions which are common to all chips.

– e.g. MOVE, ADD, JMP etc.

– Usually about 20% of a CISC chip’s instruction set.

• Main reason for RISC chips.

– Why provide lots of instructions that no-one uses?

– RISC chips have a lot fewer instructions than CISC chips.

– Fewer instructions means less tweaking means less work.

6.11

Top Level Structure For gener.cxx

• Contents of gener.template :

    #include <iostream.h>
    #include <fstream.h>
    #include <iomanip.h>
    #include <ctype.h>
    #include <stddef.h>
    #include <stdlib.h>

    #include ".../lib/cstring.h"
    #include ".../phase2/syner.h"

    #include ".../phase3/rpolish.cxx"

    void genHeader()
    { cout << "genHeader\n" ; }

    void genFooter(int finalLabel)
    { cout << "genFooter\n" ; }

6.12

Top Level Structure For gener.cxx II

    void genDec(SymTab *st)
    { cout << "genDec\n" ; }

    void genDeclarations(SymTab *st)
    { cout << "genDeclarations\n" ; }

    #include ".../phase3/partialExp.cxx"

    // Forward Declaration.
    void genStatements(SymTab *st, AST *ast, int &label, int &finalLabel) ;

    void genIfSt(SymTab *st, AST *ast, int &label, int &finalLabel)
    { cout << "genIfSt\n" ; }

6.13

Top Level Structure For gener.cxx III

    void genWhileSt(SymTab *st, AST *ast, int &label, int &finalLabel)
    { cout << "genWhileSt\n" ; }

    void genCinSt(SymTab *st, AST *ast, int &label, int &finalLabel)
    { cout << "genCinSt\n" ; }

    void genCoutSt(SymTab *st, AST *ast, int &label, int &finalLabel)
    { cout << "genCoutSt\n" ; }

6.14

Top Level Structure For gener.cxx IV

    void genAssignSt(SymTab *st, AST *ast, int &label, int &finalLabel)
    { cout << "genAssignSt\n" ; }

    void genStatements(SymTab *st, AST *ast, int &label, int &finalLabel)
    { cout << "genStatements\n" ; }

6.15

Top Level Structure For gener.cxx V

    void generate(SymTab *st, AST *ast, int label)
    {
        int finalLabel = label++ ;

        genHeader() ;
        genDeclarations(st) ;
        genStatements(st, ast, label, finalLabel) ;
        genFooter(finalLabel) ;
    } // generate

• finalLabel used to label the error code for integer overflow.

– Avoids using a 2-pass generator.
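Why reserving finalLabel first avoids a second pass can be sketched like this. It is a toy, not the real generator : generateSketch and emitOverflowCheck are hypothetical names, and the text is returned as a string rather than written to cout so that it can be inspected.

```cpp
#include <sstream>
#include <string>

// Stand-in for a statement generator that needs to branch to the
// overflow handler before the handler itself has been emitted.
void emitOverflowCheck(std::ostream &out, int finalLabel)
{
    out << "\tBGT L" << finalLabel << "\n" ;
}

std::string generateSketch(int label)
{
    std::ostringstream out ;
    int finalLabel = label++ ;             // reserve the handler's number first
    emitOverflowCheck(out, finalLabel) ;   // forward reference, resolved by the assembler
    out << "L" << label++ << ":\n" ;       // later labels take the next free numbers
    out << "\tRTS\n" ;
    out << "L" << finalLabel << ":\n" ;    // handler label is emitted last
    return out.str() ;
}
```

Because the number is known before any statement is generated, every branch to the handler can be written in one pass and the assembler fixes up the address.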

6.16

Structure Of A M68K Assembler File

• Assembler code file is made up of 3 parts :

– Standard header part.

– Specific assembly code generated from C-- source.

– Standard footer part.

• Standard header :

    #NO_APP
    _IOinteger:
        .asciz "%d"
    _Eintegeroverflow:
        .asciz "\n\nInteger Overflow!\n"
    |
    | Declarations go here.
    |
    .even
    .globl _run
    _run:

To find out what this means RTFM.
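A minimal sketch of the text genHeader might produce, assuming the .asciz lines are tab indented and that the declarations, .even, .globl and the _run label are emitted afterwards by genDeclarations. headerText is a hypothetical helper that returns the text instead of writing to cout. Check the exact layout against the output of gener.

```cpp
#include <string>

// Returns the assumed fixed header. The backslashes are doubled so that
// the assembly file contains the two characters \ and n, not a newline.
std::string headerText()
{
    return
        "#NO_APP\n"
        "_IOinteger:\n"
        "\t.asciz \"%d\"\n"
        "_Eintegeroverflow:\n"
        "\t.asciz \"\\n\\nInteger Overflow!\\n\"\n" ;
}
```

genHeader itself would then be a single line : cout << headerText() ;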

6.17

Structure Of A M68K Assembler File II

• Standard footer :

        RTS
    LfinalLabel:
        LINK A6,#0
        PEA _Eintegeroverflow
        JBSR __printf
        ADDQ.W #4,SP
        UNLK A6
        RTS

• Code after LfinalLabel label is integer overflow handling code.

– Obviously, use value of finalLabel not its name.

• RTS : VxWorks calls the assembly code as a subprogram.

• ‘\t’ at start of all indented lines throughout assembly code.

• No ‘\t’ anywhere else (except in strings).
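The footer and the ‘\t’ rule can be sketched together. footerText is a hypothetical helper returning the text for a given finalLabel value. The instruction sequence, including the __printf symbol, is copied from the footer slide and should be compared with the output of gener.

```cpp
#include <sstream>
#include <string>

// Builds the footer : indented lines start with a tab, the label line does not.
std::string footerText(int finalLabel)
{
    std::ostringstream out ;
    out << "\tRTS\n"                       // normal return to VxWorks
        << "L" << finalLabel << ":\n"      // value of finalLabel, not its name
        << "\tLINK A6,#0\n"
        << "\tPEA _Eintegeroverflow\n"
        << "\tJBSR __printf\n"
        << "\tADDQ.W #4,SP\n"
        << "\tUNLK A6\n"
        << "\tRTS\n" ;                     // return after reporting the overflow
    return out.str() ;
}
```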

6.18

Variable And Constant Declarations

• genDec handles a single declaration.

• C-- :

    int i1 = 0 ;
    int i2 ;
    const string str = "Hello\n" ;
    bool b1 = false ;
    bool b2 ;

• M68K :

    .comm i1,4
    .comm i2,4
    Lstr: .asciz "Hello\n" ;
    .comm b1,4
    .comm b2,4

• Note that strings are initialised on declaration.
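A sketch of the per-declaration mapping, using a simplified stand-in for a symbol table entry. Decl and its text field are hypothetical (the real SymTab is defined in the earlier phases). The trailing ‘ ;’ shown after the .asciz line above is left out here on the assumption that it is a slip, and the .comm lines are assumed to carry no leading tab. Both points need checking against gener.

```cpp
#include <string>

enum DataType { INTDATA, BOOLDATA, STRINGDATA } ;

// Hypothetical, cut-down symbol table entry.
struct Decl {
    std::string ident ;   // name as written in the C-- source
    DataType type ;
    std::string text ;    // string literal body, escapes kept as written
} ;

std::string decText(const Decl &d)
{
    if (d.type == STRINGDATA)
        // Strings are initialised on declaration and labelled L<ident>.
        return "L" + d.ident + ": .asciz \"" + d.text + "\"\n" ;
    // ints and bools just reserve 4 bytes. They are initialised at run time.
    return ".comm " + d.ident + ",4\n" ;
}
```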

6.19

Code For genDeclarations

    void genDeclarations(SymTab *st)
    {
        SymTab *stsave = NULL ;

        stsave = st ;
        while (st != NULL)
        {
            genDec(st) ;
            st = st->next ;
        }

        cout << ".even\n" ;
        cout << ".globl _run\n" ;
        cout << "_run:\n" ;

        st = stsave ;

        while (st != NULL)
        {
            // Initialise ints and bools.
            st = st->next ;
        }
    }

6.20

Initialising ints and bools

• int and bool constants and variables must be initialised when the program runs.

– i.e. by M68K MOVE instructions.

• In genDeclarations :

    if ((st->initialise != NULL) && (st->type != STRINGDATA))
    {
        cout << "\tMOVE.W " ;
        if (st->type == INTDATA)
            cout << '#' << st->initialise->litInt ;
        else if (st->type == BOOLDATA)
        {
            if (st->initialise->litBool == "true")
                cout << '#' << trueval ;
            else if (st->initialise->litBool == "false")
                cout << '#' << falseval ;
        }
        cout << ',' << st->ident << endl ;
    }

6.21

genStatements

• genStatements simply steps through the AST calling other subprograms to generate the code for individual statements :

    void genStatements(...)
    {
        while (ast != NULL)
        {
            if (ast->tag == IFST)
                genIfSt(st, ast, label, finalLabel) ;
            else if (ast->tag == WHILEST)
                genWhileSt(st, ast, label, finalLabel) ;
            else if (ast->tag == CINST)
                genCinSt(st, ast, label, finalLabel) ;
            else if (ast->tag == COUTST)
                genCoutSt(st, ast, label, finalLabel) ;
            else if (ast->tag == ASSIGNST)
                genAssignSt(st, ast, label, finalLabel) ;
            ast = ast->next ;    // Must be inside the loop.
        }
    } // genStatements

6.22

cin Statements

• C-- :

    cin >> invar ;

• M68K :

        LINK A6,#-4
        LEA A6@(-4),A0
        MOVE.L A0,SP@-
        PEA _IOinteger
        JBSR _scanf
        ADDQ.W #8,SP
        MOVE.L A6@(-4),invar
        UNLK A6
        MOVE.L invar,D0
        CMP.L #INT_MAX_16_BIT,D0
        BGT LfinalLabel
        CMP.L #INT_MIN_16_BIT,D0
        BLT LfinalLabel
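The cin sequence can be sketched as a function of the variable name and the value of finalLabel. cinText is a hypothetical helper. The two constants are local stand-ins for those in gener.h and use the standard 16 bit 2s complement limits.

```cpp
#include <sstream>
#include <string>

const int INT_MAX_16_BIT = 32767 ;    // stand-in for the gener.h constant
const int INT_MIN_16_BIT = -32768 ;   // stand-in for the gener.h constant

std::string cinText(const std::string &var, int finalLabel)
{
    std::ostringstream out ;
    out << "\tLINK A6,#-4\n"                     // 4 bytes of stack for scanf's result
        << "\tLEA A6@(-4),A0\n"
        << "\tMOVE.L A0,SP@-\n"                  // push address of the temporary
        << "\tPEA _IOinteger\n"                  // push address of "%d"
        << "\tJBSR _scanf\n"
        << "\tADDQ.W #8,SP\n"                    // pop both arguments
        << "\tMOVE.L A6@(-4)," << var << "\n"
        << "\tUNLK A6\n"
        << "\tMOVE.L " << var << ",D0\n"         // range check on the value read
        << "\tCMP.L #" << INT_MAX_16_BIT << ",D0\n"
        << "\tBGT L" << finalLabel << "\n"
        << "\tCMP.L #" << INT_MIN_16_BIT << ",D0\n"
        << "\tBLT L" << finalLabel << "\n" ;
    return out.str() ;
}
```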

6.23

cout Statements

• C-- :

    cout << outvar ;

• M68K for strings :

        LINK A6,#-0
        PEA Loutvar
        JBSR _printf
        ADDQ.W #4,SP
        UNLK A6

• M68K for ints :

        LINK A6,#-4
        MOVE.L outvar,SP@-
        PEA _IOinteger
        JBSR _printf
        ADDQ.W #4,SP
        UNLK A6
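A sketch of choosing between the two cout sequences. coutText and its isString flag are hypothetical. In gener.cxx the choice would come from the type recorded in the symbol table. The subroutine call is spelt JBSR, as in the footer and cin code.

```cpp
#include <sstream>
#include <string>

std::string coutText(const std::string &var, bool isString)
{
    std::ostringstream out ;
    if (isString)
    {
        out << "\tLINK A6,#-0\n"
            << "\tPEA L" << var << "\n" ;       // strings are labelled L<ident>
    }
    else
    {
        out << "\tLINK A6,#-4\n"
            << "\tMOVE.L " << var << ",SP@-\n"  // push the integer value
            << "\tPEA _IOinteger\n" ;           // push address of "%d"
    }
    out << "\tJBSR _printf\n"
        << "\tADDQ.W #4,SP\n"
        << "\tUNLK A6\n" ;
    return out.str() ;
}
```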

6.24

Assignment Statements

• C-- :

    var = expression ;

• M68K :

        | Code to evaluate expression.
        MOVE.L D0,var

• Code for the expression is generated by genExpression.

• Convention : result of the expression will be left in D0.

• Next lecture on how to write genExpression. For now just use the partial implementation from partialExp.cxx.

– Can only use literal constants.
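The D0 convention can be sketched as below. literalExprText is a hypothetical stand-in for the partial genExpression. The instruction it emits for an integer literal is an assumption, so look at partialExp.cxx for what is really generated.

```cpp
#include <string>

// Assumed code for an integer literal : load it straight into D0.
std::string literalExprText(int value)
{
    return "\tMOVE.L #" + std::to_string(value) + ",D0\n" ;
}

// Assignment : evaluate the expression, then store D0 in the variable.
std::string assignText(const std::string &var, int literal)
{
    return literalExprText(literal) + "\tMOVE.L D0," + var + "\n" ;
}
```

Whatever the expression code looks like, genAssignSt only relies on the result being in D0 when it finishes.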

6.25

while Statements

• C-- :

    while (condition)
    { statements } ;

• M68K :

    Lstartlabel:
        | Code to evaluate condition.
        CMP.L trueval,D0
        BNE Lendlabel
        | Code to execute statements.
        JMP Lstartlabel
    Lendlabel:

6.26

while Statements II

• Obviously, use the integer values of ast->whilest->startlabel and ast->whilest->endlabel rather than their names after the Ls.

• Code to evaluate condition expression is generated by genExpression.

– Initially can only use boolean literal constants.

• Code to execute statements is generated by genStatements.

– Must be forward declared as it is mutually recursive with genWhileSt and genIfSt.
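The layout of a while loop can be sketched with the condition and body code passed in as ready-made text. whileText is a hypothetical helper. The operand compared with D0 is supplied by the caller because how trueval is written in the output is not assumed here.

```cpp
#include <sstream>
#include <string>

std::string whileText(int startLabel, int endLabel,
                      const std::string &condCode,     // from genExpression
                      const std::string &bodyCode,     // from genStatements
                      const std::string &trueOperand)  // operand text for trueval
{
    std::ostringstream out ;
    out << "L" << startLabel << ":\n"
        << condCode                                  // leaves its result in D0
        << "\tCMP.L " << trueOperand << ",D0\n"
        << "\tBNE L" << endLabel << "\n"             // condition false : leave the loop
        << bodyCode
        << "\tJMP L" << startLabel << "\n"           // go round again
        << "L" << endLabel << ":\n" ;
    return out.str() ;
}
```

In gener.cxx the body text would come from a recursive call to genStatements, which is why the forward declaration is needed.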

6.27

if Statements

• C-- :

    if (condition)
    { statements } ;

• M68K :

        | Code to evaluate condition.
        CMP.L trueval,D0
        BNE Lendlabel
        | Code to execute statements.
    Lendlabel:

• Obviously, use the integer value of ast->ifst->endlabel rather than its name after the Ls.

• Code to evaluate condition expression is generated by genExpression.

• Code to execute statements is generated by genStatements.

6.28

if Statements II

• C-- :

    if (condition)
    { thenstatements } ;
    else
    { elsestatements } ;

• M68K :

        | Code to evaluate condition.
        CMP.L trueval,D0
        BNE Lelselabel
        | Code to execute thenstatements.
        JMP Lendlabel
    Lelselabel:
        | Code to execute elsestatements.
    Lendlabel:

• Obviously, use the integer values of ast->ifst->elselabel and ast->ifst->endlabel rather than their names after the Ls.
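Both forms of the if statement can be sketched in one function. ifText and its hasElse flag are hypothetical. How the AST marks a missing else part is not covered here, and the trueval operand is again passed in as text.

```cpp
#include <sstream>
#include <string>

std::string ifText(const std::string &condCode,
                   const std::string &thenCode,
                   const std::string &elseCode,
                   bool hasElse, int elseLabel, int endLabel,
                   const std::string &trueOperand)
{
    std::ostringstream out ;
    out << condCode                                     // leaves its result in D0
        << "\tCMP.L " << trueOperand << ",D0\n"
        << "\tBNE L" << (hasElse ? elseLabel : endLabel) << "\n"
        << thenCode ;
    if (hasElse)
    {
        out << "\tJMP L" << endLabel << "\n"            // skip over the else part
            << "L" << elseLabel << ":\n"
            << elseCode ;
    }
    out << "L" << endLabel << ":\n" ;
    return out.str() ;
}
```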

6.29

Summary

• Copy gener.template, makefile and gener (renamed dhgener) into your directory.

• Print out gener.h, rpolish.cxx and partialExp.cxx.

• Rename gener.template to gener.cxx.

• Complete the stubs in gener.cxx in the following order :

– genHeader, genFooter, genDeclarations, genDec, genCinSt, genCoutSt, genAssignSt, genIfSt, genWhileSt.

• For now, assume all expressions are simply literal constants.

– Use the genExpression in partialExp.cxx, which is #included into gener.template.