6.1
6. Phase 3 : Code Generation Part I
• Overview of compilation.
• The unit directory.
• genProg.cxx.
• What you must do.
• Compiling, assembling, downloading and running C--.
• The Gnu assembler and the VxWorks dynamic linker.
• Monkey see, monkey do.
• gener.cxx.
• Structure of an M68K assembly file.
• Declarations.
• Statements.
6.2
Overview
• A compiler is lexer + syner + gener.
– Written lexer and syner. Now you write the gener.
– Easiest of the three once you know what M68K code to generate. And I tell you that bit.
[Diagram : C-- code → stdin → compiler → stdout → errors || M68K code]
6.3
The Unit Directory
• The unit directory for this phase is
/usr/users/staff/aosc/cm049icp/phase3
• Among other things it contains the following :
– genprog.cxx : the test bed program for phase 3.
– gener.template : a template file for your phase 3 programs.
– gener.h : the header file for phase 3. It declares :
    trueval, falseval : constants to represent true and false in M68K code.
    INT_MAX_16_BIT, INT_MIN_16_BIT : maximum and minimum values for 16 bit 2s complement integers.
    RPolish : struct for holding the Reverse Polish representation of C-- expressions.
– makefile : the makefile for phase 3.
6.4
The Unit Directory II
– gener : an executable for my phase 3 program.
– tests/test*.c-- : testing programs for the demo.
– rpolish.cxx : Reverse Polish conversion programs.
    RPolish *append(RPolish *rp1, RPolish *rp2)
    RPolish *toRPF(Factor *fact)
    RPolish *toRPT(Term *term)
    RPolish *toRPBE(BasicExp *bexp)
    RPolish *toRPE(Expression *expr)
– partialExp.cxx : partial code generator for expressions.
    void genExpression(SymTab *st, Expression *expr, int &label, int &finalLabel)
    Only handles literal constants. Do a full implementation of genExpression after you’ve got the rest of it to work.
6.5
genProg.cxx
• The test bed program is as follows :
    #include ".../phase2/syner.h"
    #include ".../phase3/gener.h"

    int main()
    { SymTab *st = NULL ;
      AST *ast = NULL ;
      int label = 0 ;

      synAnal(st, ast, label) ;
      generate(st, ast, label) ;
      return 0 ;
    }
• First calls synAnal to parse the C--, then calls generate to produce M68K code.
• Input/Output is from/to stdin/stdout.
• A ‘real’ compiler would use argc/argv and command line arguments so that Input/Output could use files.
• You must write the generate subprogram.
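The argc/argv handling of a ‘real’ compiler could be sketched roughly as below. This is only an illustration : IoPaths and parseArgs are hypothetical names, and the convention that a missing argument means stdin or stdout is an assumption, not something genProg.cxx does.

```cpp
#include <string>

// Hypothetical holder for the two file names. An empty string means
// "fall back to stdin" (input) or "fall back to stdout" (output).
struct IoPaths {
    std::string input;
    std::string output;
};

// Assumed command line convention: prog [input-file [output-file]].
IoPaths parseArgs(int argc, const char *const argv[])
{
    IoPaths paths;
    if (argc > 1) paths.input = argv[1];
    if (argc > 2) paths.output = argv[2];
    return paths;
}
```

main would then open the named files, if any, and hand the resulting streams to the phases instead of using stdin/stdout directly.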
6.6
What You Must Do
• Your implementation of generate must be in a file called gener.cxx in your directory.
• Take a copy of makefile and gener.template.
• Print out a copy of gener.h.
• Print out a copy of rpolish.cxx.
• Print out a copy of partialExp.cxx.
• Useful commands : testphase3, demophase3.
– They work as usual.
– Your program’s output must be exactly the same as mine to get the marks.
6.7
Compiling, Assembling, Downloading & Running C--
• C-- program in prog.c-- :
    const string s = "Hello\n" ;
    { cout << s ; }
• Make and run gener :
    jaguar> make gener
    jaguar> gener < prog.c-- > a.s
    jaguar> assem a
    jaguar>
• Connect to VxWorks box and download and run :
    rlogin moloch        (UNIX command.)
    -> ld < a
    -> run
    Hello
    ->
6.8
M68K Assembler & VxWorks Dynamic Linker
• assem is a shell script (in /usr/users/staff/aosc/bin) which calls the Gnu Motorola 68000 assembler.
• Gnu assembler is a high level Macro-Assembler.
– Supports medium level memory management.
– Makes variables/constants very easy.
• VxWorks has a dynamic linker.
– Similar to NT except that it works.
– M68K programs contain calls to library subroutines, e.g. scanf, printf. Run-time addresses of these subroutines are not known to the compiler.
– When programs are downloaded the required addresses are automatically linked in.
6.9
Monkey See, Monkey Do
• My code generator is in a file called gener in the unit directory for this phase.
– /usr/users/staff/aosc/cm049icp/phase3
– To work out what assembly code you need to generate run gener on C-- source code and inspect the output that is produced.
– More or less the approach I adopted using C source and the GNU C compiler, cc68k.
– Took longer than I expected because GNU assembler uses non-standard M68K assembly code mnemonics and assembler directives. Bloody idiots.
• Rest of this lecture is just a few ‘handy hints’.
6.10
Monkey See, Monkey Do II
• Monkey see, monkey do is standard in the industry.
• Usually have to tweak the instruction set of one chip into the instruction set of another.
• Tend to stick to a small set of instructions which are common to all chips.
– e.g. MOVE, ADD, JMP etc.
– Usually about 20% of a CISC chip’s instruction set.
• Main reason for RISC chips.
– Why provide lots of instructions that no-one uses?
– RISC chips have a lot fewer instructions than CISC chips.
– Fewer instructions means less tweaking means less work.
6.11
Top Level Structure For gener.cxx
• Contents of gener.template :
    #include <iostream.h>
    #include <fstream.h>
    #include <iomanip.h>
    #include <ctype.h>
    #include <stddef.h>
    #include <stdlib.h>

    #include ".../lib/cstring.h"
    #include ".../phase2/syner.h"

    #include ".../phase3/rpolish.cxx"

    void genHeader()
    { cout << "genHeader\n" ; }

    void genFooter(int finalLabel)
    { cout << "genFooter\n" ; }
6.12
Top Level Structure For gener.cxx II
    void genDec(SymTab *st)
    { cout << "genDec\n" ; }

    void genDeclarations(SymTab *st)
    { cout << "genDeclarations\n" ; }

    #include ".../phase3/partialExp.cxx"

    // Forward Declaration.
    void genStatements(SymTab *st, AST *ast, int &label, int &finalLabel) ;

    void genIfSt(SymTab *st, AST *ast, int &label, int &finalLabel)
    { cout << "genIfSt\n" ; }
6.13
Top Level Structure For gener.cxx III
    void genWhileSt(SymTab *st, AST *ast, int &label, int &finalLabel)
    { cout << "genWhileSt\n" ; }

    void genCinSt(SymTab *st, AST *ast, int &label, int &finalLabel)
    { cout << "genCinSt\n" ; }

    void genCoutSt(SymTab *st, AST *ast, int &label, int &finalLabel)
    { cout << "genCoutSt\n" ; }
6.14
Top Level Structure For gener.cxx IV
    void genAssignSt(SymTab *st, AST *ast, int &label, int &finalLabel)
    { cout << "genAssignSt\n" ; }

    void genStatements(SymTab *st, AST *ast, int &label, int &finalLabel)
    { cout << "genStatements\n" ; }
6.15
Top Level Structure For gener.cxx V
    void generate(SymTab *st, AST *ast, int label)
    { int finalLabel = label++ ;

      genHeader() ;
      genDeclarations(st) ;
      genStatements(st, ast, label, finalLabel) ;
      genFooter(finalLabel) ;
    } // generate
• finalLabel used to label the error code for integer overflow.
– Avoids using a 2-pass generator.
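The label bookkeeping can be sketched as follows. Labels, reserveFinalLabel and labelName are hypothetical helpers invented for this illustration. The point is only that the overflow label's number is fixed before any statement code is generated, so forward branches to it can be written in a single pass.

```cpp
#include <string>

// Hypothetical pair: the label reserved for the integer overflow code
// and the next free label number for everything generated afterwards.
struct Labels {
    int finalLabel;
    int next;
};

// Mirrors "int finalLabel = label++ ;" in generate.
Labels reserveFinalLabel(int label)
{
    Labels l;
    l.finalLabel = label++;
    l.next = label;
    return l;
}

// A label in the assembly code is an 'L' followed by the integer value.
std::string labelName(int n)
{
    return "L" + std::to_string(n);
}
```

For example, if the counter is 7 on entry to generate, the overflow code is labelled L7 and later labels start at 8.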
6.16
Structure Of An M68K Assembler File
• Assembler code file is made up of 3 parts :
– Standard header part.
– Specific assembly code generated from C-- source.
– Standard footer part.
• Standard header :
    #NO_APP
    _IOinteger:
        .asciz "%d"
    _Eintegeroverflow:
        .asciz "\n\nInteger Overflow!\n"
    |
    | Declarations go here.
    |
    .even
    .globl _run
    _run:
• To find out what this means RTFM.
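A minimal sketch of a genHeader that emits the fixed part of the header. Two assumptions are made here : the stub in gener.template takes no parameters and writes to cout, whereas this version takes a stream so it can be checked, and the .even / .globl _run / _run: lines are left to genDeclarations. headerText is a hypothetical helper. Compare against the output of gener before relying on it.

```cpp
#include <sstream>
#include <string>

// Writes the standard header up to the point where declarations go.
// Assumption: the '|' comment lines on the slide are not emitted.
void genHeader(std::ostream &out)
{
    out << "#NO_APP\n";
    out << "_IOinteger:\n";
    out << "\t.asciz \"%d\"\n";
    out << "_Eintegeroverflow:\n";
    out << "\t.asciz \"\\n\\nInteger Overflow!\\n\"\n";
}

// Hypothetical helper: collects the header into a string.
std::string headerText()
{
    std::ostringstream out;
    genHeader(out);
    return out.str();
}
```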
6.17
Structure Of An M68K Assembler File II
• Standard footer :
        RTS
    LfinalLabel:
        LINK A6,#0
        PEA _Eintegeroverflow
        JBSR __printf
        ADDQ.W #4,SP
        UNLK A6
        RTS
• Code after LfinalLabel label is integer overflow handling code.
– Obviously, use value of finalLabel not its name.
• RTS : VxWorks calls the assembly code as a subprogram.
• ‘\t’ at start of all indented lines throughout assembly code.
• No ‘\t’ anywhere else (except in strings).
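The footer rules above can be sketched as a function that builds the text for a given label number. footerText is a hypothetical name. The real genFooter writes to cout, and the instruction text is copied from the slide as it stands.

```cpp
#include <sstream>
#include <string>

// Builds the standard footer. Indented lines start with '\t'; the
// label line uses the integer value of finalLabel, not its name.
std::string footerText(int finalLabel)
{
    std::ostringstream out;
    out << "\tRTS\n";
    out << "L" << finalLabel << ":\n";
    out << "\tLINK A6,#0\n";
    out << "\tPEA _Eintegeroverflow\n";
    out << "\tJBSR __printf\n";
    out << "\tADDQ.W #4,SP\n";
    out << "\tUNLK A6\n";
    out << "\tRTS\n";
    return out.str();
}
```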
6.18
Variable And Constant Declarations
• genDec handles a single declaration.
• C-- :
    int i1 = 0 ;
    int i2 ;
    const string str = "Hello\n" ;
    bool b1 = false ;
    bool b2 ;
• M68K :
    .comm i1,4
    .comm i2,4
    Lstr: .asciz "Hello\n" ;
    .comm b1,4
    .comm b2,4
• Note that strings are initialised on declaration.
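A sketch of the per-declaration output, using a hypothetical stand-in for a symbol table entry. The real SymTab is defined in syner.h and will differ : Symbol, its fields and decText are invented for this illustration. The output text, including the trailing " ;" on the string line, is copied from the slide, so check it against what gener actually produces.

```cpp
#include <sstream>
#include <string>

// Hypothetical, simplified stand-in for one symbol table entry.
enum DataType { INTDATA, BOOLDATA, STRINGDATA };

struct Symbol {
    DataType type;
    std::string ident;
    std::string text;  // string body as written in the C-- source
};

// ints and bools get 4 bytes of common storage; strings get an
// L-prefixed label and are initialised on declaration.
std::string decText(const Symbol &sym)
{
    std::ostringstream out;
    if (sym.type == STRINGDATA)
        out << "L" << sym.ident << ": .asciz \"" << sym.text << "\" ;\n";
    else
        out << ".comm " << sym.ident << ",4\n";
    return out.str();
}
```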
6.19
Code For genDeclarations
    void genDeclarations(SymTab *st)
    { SymTab *stsave = NULL ;

      stsave = st ;
      while (st != NULL)
      { genDec(st) ;
        st = st->next ;
      }

      cout << ".even\n" ;
      cout << ".globl _run\n" ;
      cout << "_run:\n" ;

      st = stsave ;

      while (st != NULL)
      { // Initialise ints and bools.
        st = st->next ;
      }
    }
6.20
Initialising ints and bools
• int and bool constants and variables must be initialised when the program runs.
– i.e. by M68K MOVE instructions.
• In genDeclarations :
    if ((st->initialise != NULL) && (st->type != STRINGDATA))
    { cout << "\tMOVE.W " ;
      if (st->type == INTDATA)
        cout << '#' << st->initialise->litInt ;
      else if (st->type == BOOLDATA)
      { if (st->initialise->litBool == "true")
          cout << '#' << trueval ;
        else if (st->initialise->litBool == "false")
          cout << '#' << falseval ;
      }
      cout << ',' << st->ident << endl ;
    }
6.21
genStatements
• genStatements simply steps through the AST calling other subprograms to generate the code for individual statements :
    void genStatements(...)
    { while (ast != NULL)
      { if (ast->tag == IFST)
          genIfSt(st, ast, label, finalLabel) ;
        else if (ast->tag == WHILEST)
          genWhileSt(st, ast, label, finalLabel) ;
        else if (ast->tag == CINST)
          genCinSt(st, ast, label, finalLabel) ;
        else if (ast->tag == COUTST)
          genCoutSt(st, ast, label, finalLabel) ;
        else if (ast->tag == ASSIGNST)
          genAssignSt(st, ast, label, finalLabel) ;
        ast = ast->next ;   // Must be inside the loop.
      }
    } // genStatements
6.22
cin Statements
• C-- :
cin >> invar ;
• M68K :
        LINK A6,#-4
        LEA A6@(-4),A0
        MOVE.L A0,SP@-
        PEA _IOinteger
        JBSR _scanf
        ADDQ.W #8,SP
        MOVE.L A6@(-4),invar
        UNLK A6
        MOVE.L invar,D0
        CMP.L #INT_MAX_16_BIT,D0
        BGT LfinalLabel
        CMP.L #INT_MIN_16_BIT,D0
        BLT LfinalLabel
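The cin sequence above, parameterised on the variable name and the overflow label, might be generated along these lines. cinText is a hypothetical helper. The two constants are stand-ins for those in gener.h, assumed here to be the usual 16 bit 2s complement limits.

```cpp
#include <sstream>
#include <string>

// Stand-ins for the constants declared in gener.h (assumed values).
const int INT_MAX_16_BIT = 32767;
const int INT_MIN_16_BIT = -32768;

// Builds the code for "cin >> var ;": read into a stack temporary,
// copy to the variable, then range check and branch to the overflow
// label if the value does not fit in 16 bits.
std::string cinText(const std::string &var, int finalLabel)
{
    std::ostringstream out;
    out << "\tLINK A6,#-4\n"
        << "\tLEA A6@(-4),A0\n"
        << "\tMOVE.L A0,SP@-\n"
        << "\tPEA _IOinteger\n"
        << "\tJBSR _scanf\n"
        << "\tADDQ.W #8,SP\n"
        << "\tMOVE.L A6@(-4)," << var << "\n"
        << "\tUNLK A6\n"
        << "\tMOVE.L " << var << ",D0\n"
        << "\tCMP.L #" << INT_MAX_16_BIT << ",D0\n"
        << "\tBGT L" << finalLabel << "\n"
        << "\tCMP.L #" << INT_MIN_16_BIT << ",D0\n"
        << "\tBLT L" << finalLabel << "\n";
    return out.str();
}
```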
6.23
cout Statements
• C-- :
    cout << outvar ;
• M68K for strings :
        LINK A6,#-0
        PEA Loutvar
        JBSR _printf
        ADDQ.W #4,SP
        UNLK A6
• M68K for ints :
        LINK A6,#-4
        MOVE.L outvar,SP@-
        PEA _IOinteger
        JBSR _printf
        ADDQ.W #4,SP
        UNLK A6
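Both cout forms can be sketched in one hypothetical helper, coutText. The isString flag stands in for a look-up of the variable's type in the symbol table, which is not shown. The subroutine call is written JBSR, matching the cin and footer code.

```cpp
#include <sstream>
#include <string>

// Builds the code for "cout << var ;". Strings are printed via their
// L-prefixed label; ints are pushed and printed with the "%d" format.
std::string coutText(const std::string &var, bool isString)
{
    std::ostringstream out;
    if (isString)
        out << "\tLINK A6,#-0\n"
            << "\tPEA L" << var << "\n";
    else
        out << "\tLINK A6,#-4\n"
            << "\tMOVE.L " << var << ",SP@-\n"
            << "\tPEA _IOinteger\n";
    out << "\tJBSR _printf\n"
        << "\tADDQ.W #4,SP\n"
        << "\tUNLK A6\n";
    return out.str();
}
```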
6.24
Assignment Statements
• C-- :
var = expression ;
• M68K :
        | Code to evaluate expression.
        MOVE.L D0,var
• Code for the expression is generated by genExpression.
• Convention : result of the expression will be left in D0.
• Next lecture on how to write genExpression. For now just use the partial implementation from partialExp.cxx.
– Can only use literal constants.
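The D0 convention makes the assignment code a simple concatenation, sketched below. assignText is a hypothetical helper, and exprCode stands for whatever genExpression produced. The format of that code is not shown in this lecture, so it is passed in as opaque text.

```cpp
#include <string>

// Appends the store of D0 into var after the expression code.
// Assumes exprCode already ends with a newline and leaves its
// result in D0, as the convention requires.
std::string assignText(const std::string &exprCode, const std::string &var)
{
    return exprCode + "\tMOVE.L D0," + var + "\n";
}
```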
6.25
while Statements
• C-- :
while (condition){ statements } ;
• M68K :
    Lstartlabel:
        | Code to evaluate condition.
        CMP.L trueval,D0
        BNE Lendlabel
        | Code to execute statements.
        JMP Lstartlabel
    Lendlabel:
6.26
while Statements II
• Obviously, use the integer values of ast->whilest->startlabel and ast->whilest->endlabel rather than their names after the Ls.
• Code to evaluate condition expression is generated by genExpression.
– Initially can only use boolean literal constants.
• Code to execute statements is generated by genStatements.
– Must be forward declared as it is mutually recursive with genWhileSt and genIfSt.
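The while layout can be sketched with the condition and body code passed in as text. whileText is a hypothetical helper. In gener.cxx the two pieces would come from genExpression and the recursive call to genStatements. Writing trueval as an immediate (#value) is an assumption made by analogy with the initialisation code. The value itself comes from gener.h.

```cpp
#include <sstream>
#include <string>

// Builds a while loop: test at the top, exit branch to the end label,
// jump back to the start label after the body.
std::string whileText(int startLabel, int endLabel, int trueval,
                      const std::string &condCode,
                      const std::string &bodyCode)
{
    std::ostringstream out;
    out << "L" << startLabel << ":\n"
        << condCode
        << "\tCMP.L #" << trueval << ",D0\n"
        << "\tBNE L" << endLabel << "\n"
        << bodyCode
        << "\tJMP L" << startLabel << "\n"
        << "L" << endLabel << ":\n";
    return out.str();
}
```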
6.27
if Statements
• C-- :
if (condition){ statements } ;
• M68K :
        | Code to evaluate condition.
        CMP.L trueval,D0
        BNE Lendlabel
        | Code to execute statements.
    Lendlabel:
• Obviously, use the integer value of ast->ifst->endlabel rather than its name after the Ls.
• Code to evaluate condition expression is generated by genExpression.
• Code to execute statements is generated by genStatements.
6.28
if Statements II
• C-- :
    if (condition){ thenstatements } ;
    else { elsestatements } ;
• M68K :
        | Code to evaluate condition.
        CMP.L trueval,D0
        BNE Lelselabel
        | Code to execute thenstatements.
        JMP Lendlabel
    Lelselabel:
        | Code to execute elsestatements.
    Lendlabel:
• Obviously, use the integer values of ast->ifst->elselabel and ast->ifst->endlabel rather than their names after the Ls.
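Both if forms can be sketched in one hypothetical helper, ifText. The hasElse flag stands in for however the AST records the presence of an else part, which is not shown here, and the code fragments are passed in as text. As with the while sketch, emitting trueval as an immediate is an assumption.

```cpp
#include <sstream>
#include <string>

// Builds an if statement. Without an else part a false condition
// branches straight to the end label; with one it branches to the
// else label and the then part jumps over the else part.
std::string ifText(bool hasElse, int elseLabel, int endLabel, int trueval,
                   const std::string &condCode,
                   const std::string &thenCode,
                   const std::string &elseCode)
{
    std::ostringstream out;
    out << condCode
        << "\tCMP.L #" << trueval << ",D0\n";
    if (!hasElse) {
        out << "\tBNE L" << endLabel << "\n"
            << thenCode;
    } else {
        out << "\tBNE L" << elseLabel << "\n"
            << thenCode
            << "\tJMP L" << endLabel << "\n"
            << "L" << elseLabel << ":\n"
            << elseCode;
    }
    out << "L" << endLabel << ":\n";
    return out.str();
}
```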
6.29
Summary
• Copy gener.template, makefile and gener (renamed dhgener) into your directory.
• Print out gener.h, rpolish.cxx and partialExp.cxx.
• Rename gener.template to gener.cxx.
• Complete the stubs in gener.cxx in the following order :
– genHeader, genFooter, genDeclarations, genDec, genCinSt, genCoutSt, genAssignSt, genIfSt, genWhileSt.
• For now, assume all expressions are simply literal constants.
– Use the genExpression in partialExp.cxx, which is #included into gener.template.