Copyright 1998 Morgan Kaufmann Publishers, Inc. All rights reserved.
MIPS Assembly language
In computer programs we have:
• Data
• Instructions (arithmetic & control)
We need a memory & a CPU:
What should the CPU do?
• We need to transfer data from registers to the memory and vice versa
• We need to perform arithmetic operations on the data residing in registers
• We need to have the ability of controlling the flow of the program (Ifs and Jumps)
The machine understands only bits!
Humans prefer a language “closer” to English
Labels will improve the situation
High-level language is even better
The process of Compilation
Preface
We’ll learn:
• How a computer is built
• Some performance analysis
• Advanced subjects (pipelining, caches) that are used in all modern computers
The course book:
Computer Organization & Design: The Hardware/Software Interface, David A. Patterson and John L. Hennessy.
Second Edition, 1998
Machine language, Assembly and C
High-level language program (in C):
swap(int v[], int k)
{
    int temp;
    temp = v[k];
    v[k] = v[k+1];
    v[k+1] = temp;
}
(C compiler)
Assembly language program (for MIPS):
swap: muli $2, $5,4
      add $2, $4,$2
      lw $15, 0($2)
      lw $16, 4($2)
      sw $16, 0($2)
      sw $15, 4($2)
      jr $31
(Assembler)
Binary machine language program (for MIPS):
00000000101000010000000000011000
00000000100011100001100000100001
10001100011000100000000000000000
10001100111100100000000000000100
10101100111100100000000000000000
10101100011000100000000000000100
00000011111000000000000000001000
• The CPU understands machine Language only
• Assembly language is easier to understand:
- Abstraction
- A unique translation
• C language (a high level language):
- The translation is not unique!
- It depends on the compiler and the optimization requested
- It is portable (fits every CPU)
When do we use Assembly?
C is closer to regular English (the high-level language can be adjusted to the desired processing). High productivity: in a single statement we define many machine operations. And the most important: it does not depend on the CPU!
Assembly looks like regular English compared with machine language, but it depends on the specific CPU. Every CPU has a different set of Assembly instructions, since it has different sets of registers and machine operations.
Nowadays we use Assembly only when we have to:
* When processing time is critical and we need to adjust the code “manually”
* When all resources of the CPU (e.g., overflow, index registers, etc.) are needed but are not supported by the high-level language
* When memory is critical and an optimization of its management is required (e.g., in DSPs)
Instruction Set Architecture
• There is similarity between Assembly languages of different CPUs
• We’ll learn the MIPS Assembly language. It was developed in the 80’s and was used in workstations from companies such as Silicon Graphics, NEC and Sony.
• RISC v. CISC
– RISC (Reduced Instruction Set Computer): MIPS
– CISC (Complex Instruction Set Computer): 8086
Our motto is: “Less is more”
By that we mean that a smaller set of simpler instructions results in better performance. The h/w required is simpler, so it is faster and takes less silicon area, which makes it less expensive.
1st design rule: Simplicity favors Regularity
Arithmetic operations:
• MIPS
addi a,b,100 # a=b+100
add a,b,c # a=b+c
• 8086
ADD EAX,B # EAX= EAX+B
We prefer a simple mechanism with minimalistic instructions, such as R3 = R1 op R2, over a CPU that might be easier to program because it has instructions like R5 = (R1 op1 R2) op2 (R3 op3 R4) but is much harder to design and implement.
2nd design rule: Smaller is faster
• Arithmetic operations are allowed only on registers
• The operands could be registers or a single constant
• There are only 32 registers (spilling is applied when needed)
• A register contains a word = 32 bits = 4 bytes
• We use conventions:
$1,$2 …
$s0,$s1 ... - C variables
$t1,$t2 … - temporary vars.
An example:
f=(g+h)-(k+j) # $s0=f, $s1=g, $s2=h, $s3=k, $s4=j
add $t0,$s1,$s2
add $t1,$s3,$s4
sub $s0, $t0, $t1
The registers are denoted $0 - $31 or by names related to their jobs. There is a convention for these jobs.
The example describes how the statement f=(g+h)-(k+j) is translated into Assembly language instructions. The registers are the heart of the CPU. Accessing registers is faster than accessing the memory. We access 3 registers simultaneously: read from two and write to the 3rd.
Page 110
Page 115
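The three-instruction translation above can be traced with a small sketch. It is written in Python, and the dictionary `regs`, the helper functions and the starting values of g, h, k and j are all made up for illustration; only the three instructions come from the slide.

```python
# Hypothetical register file: register name -> 32-bit value.
# Starting values are invented: g=5, h=7, k=2, j=3.
regs = {"$s0": 0, "$s1": 5, "$s2": 7, "$s3": 2, "$s4": 3, "$t0": 0, "$t1": 0}

def add(rd, rs, rt):
    """Read two registers, write the third (result kept to 32 bits)."""
    regs[rd] = (regs[rs] + regs[rt]) & 0xFFFFFFFF

def sub(rd, rs, rt):
    regs[rd] = (regs[rs] - regs[rt]) & 0xFFFFFFFF

add("$t0", "$s1", "$s2")   # $t0 = g + h
add("$t1", "$s3", "$s4")   # $t1 = k + j
sub("$s0", "$t0", "$t1")   # f = $t0 - $t1
print(regs["$s0"])
```

Each call reads two registers and writes one, which mirrors the "read from two and write to the 3rd" access pattern.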
Policy of Use Conventions
Name       Register number   Usage
$zero      0                 the constant value 0
$v0-$v1    2-3               values for results and expression evaluation
$a0-$a3    4-7               arguments
$t0-$t7    8-15              temporaries
$s0-$s7    16-23             saved
$t8-$t9    24-25             more temporaries
$gp        28                global pointer
$sp        29                stack pointer
$fp        30                frame pointer
$ra        31                return address
Other registers are: $at = $1, which is reserved for the Assembler, and $k0, $k1 = $26, $27, which are reserved for the Operating System
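The name-to-number convention can be written down as a lookup table, which is roughly what an assembler needs when it turns `$s1` into the number 17. This is a Python sketch; `build_register_table` and `REG` are illustrative names, not part of any real assembler.

```python
def build_register_table():
    """Map MIPS register names to numbers, following the convention table."""
    table = {"$zero": 0, "$at": 1, "$gp": 28, "$sp": 29, "$fp": 30, "$ra": 31}
    # (prefix, first index in the name, how many registers, first register number)
    ranges = [("$v", 0, 2, 2), ("$a", 0, 4, 4), ("$t", 0, 8, 8),
              ("$s", 0, 8, 16), ("$t", 8, 2, 24), ("$k", 0, 2, 26)]
    for prefix, first_index, count, first_number in ranges:
        for i in range(count):
            table[f"{prefix}{first_index + i}"] = first_number + i
    return table

REG = build_register_table()
print(REG["$s1"], REG["$t9"], REG["$ra"])
```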
The memory
• The memory is a long array
• The address is the index to the array
• Byte addressing - the index is in bytes
• The maximal size of memory available is 2^30 words = 2^32 bytes
[Figure: the memory drawn as an array with indices 0, 1, 2, 3, 4, 5, 6, ...; each entry holds 8 bits of data]
Accessing the memory
• Load and Store instructions only
• In lw we load a word, but the address we access is given in bytes
lw $s1,100($s2) # $s1=Memory[$s2+100]
sw $s1,100($s2) # Memory[$s2+100]=$s1
offset = the location in the array
base register = pointer to the array
Page 112
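A minimal sketch of base + offset addressing over a byte-addressed memory, in Python. The 64-byte `memory`, the base address 16 and the stored value are hypothetical; the big-endian byte order follows the deck's statement that MIPS is big endian.

```python
memory = bytearray(64)   # tiny byte-addressed memory (size chosen for illustration)

def sw(value, offset, base):
    """Memory[base + offset] = value, stored as one 4-byte word."""
    addr = base + offset
    memory[addr:addr + 4] = (value & 0xFFFFFFFF).to_bytes(4, "big")

def lw(offset, base):
    """Return the 4-byte word at Memory[base + offset]."""
    addr = base + offset
    return int.from_bytes(memory[addr:addr + 4], "big")

base_of_A = 16            # plays the role of the base register
sw(42, 8, base_of_A)      # word index 2 is byte offset 2 * 4 = 8
print(lw(8, base_of_A))
```

The word with index 2 sits at byte offset 8, because consecutive words are 4 bytes apart.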
Accessing bytes
• There are also lb (load byte) and sb (store byte) instructions
• These are useful for handling “char”s since ASCII characters are stored in bytes
ASCII: American Standard Code for Information Interchange
• Another code, Unicode, stores characters in two bytes each
ASCII
Accessing memory
• An example:
A is an array of words
The address of A is in $s3
h is in $s2
C code: A[2] = h + A[2];
MIPS code:
lw $t0, 8($s3) # $t0=Memory[$s3+8]
add $t0, $s2,$t0 # $t0=$s2+$t0
sw $t0, 8($s3) # Memory[$s3+8]=$t0
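[Figure: the array A in memory, 32 bits of data per word, at byte addresses 0, 4, 8, 12, 16; A[0], A[1] and A[2] occupy consecutive words. The bytes within a word are numbered 0 1 2 3, in opposite directions on the two machines: INTEL = Little endian, MIPS = Big endian]
Page 114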
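The difference between the two byte orders can be seen by laying out the same 32-bit word both ways. This Python sketch uses an arbitrary example value; `int.to_bytes` is the standard library call.

```python
word = 0x0A0B0C0D                      # arbitrary example value

big = word.to_bytes(4, "big")          # most significant byte at the lowest address
little = word.to_bytes(4, "little")    # least significant byte at the lowest address

print(big.hex(), little.hex())
```

The same four bytes are stored in both cases; only the order of the addresses they land on differs.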
Machine language
• All instructions have the same size, 32 bits
• In Intel’s 8086 the size changes from 1 to 17 bytes
• An R-type instruction:
add $s1,$s2,$s3 # $s1=$17, $s2=$18, $s3=$19
Format of R-type instruction (bit 31 on the left, bit 0 on the right):
Field     op       rs      rt      rd      shamt   funct
Bits      6        5       5       5       5       6
Decimal   0        18      19      17      0       32
Hex       0H       12H     13H     11H     0H      20H
Binary    000000   10010   10011   10001   00000   100000
op - opcode
rs - register source
rt - register source no. 2
rd - register destination
shamt - shift amount
funct - function
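Packing the six R-type fields into one 32-bit word is a matter of shifting each field to its position. This Python sketch uses the field values of the add example; the function name `encode_r` is illustrative.

```python
def encode_r(op, rs, rt, rd, shamt, funct):
    """Pack the R-type fields (6, 5, 5, 5, 5, 6 bits) into a 32-bit word."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct

# add $s1,$s2,$s3  ->  op=0, rs=$s2=18, rt=$s3=19, rd=$s1=17, shamt=0, funct=32
word = encode_r(0, 18, 19, 17, 0, 32)
print(f"{word:032b}")
```

Note that the destination register comes first in the Assembly text but third among the register fields of the machine word.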
3rd design rule: A good design requires some compromises
• Minimize the types. Use similar instructions.
• Example: lw $s1, 32($s2) # $s1=$17, $s2=$18
Field     op   rs   rt   16 bit number
Decimal   35   18   17   32
The three instruction formats (field widths in bits):
R   op (6)   rs (5)   rt (5)   rd (5)   shamt (5)   funct (6)
I   op (6)   rs (5)   rt (5)   16 bit address
J   op (6)   26 bit address
The compromise here is using different formats for different types of instructions. We could have increased the number of bits per instruction so that there are enough bits for all fields of all types of instructions. However, many of them would have been unused most of the time. So we use 3 types. (A different size per instruction would conflict with simplicity.)
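The I and J formats reuse the same 6-bit op field and differ only in how the remaining 26 bits are split. A Python sketch, with illustrative function names, using the lw example values (op=35, rs=18, rt=17, constant 32):

```python
def encode_i(op, rs, rt, imm):
    """I format: op (6), rs (5), rt (5), 16-bit number."""
    return (op << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)

def encode_j(op, addr):
    """J format: op (6), 26-bit address."""
    return (op << 26) | (addr & 0x3FFFFFF)

lw_word = encode_i(35, 18, 17, 32)     # lw $s1, 32($s2)
print(f"{lw_word:08x}")
```

All three formats produce a word of exactly 32 bits, which is the point of the compromise.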
The program in memory
• The program (code) is stored in memory just like data
[Figure: Processor and Memory; the memory holds data, programs, compilers, editors, etc.]
Performing the program
• The Program Counter (PC) - a special register keeping the address of the next instruction
• We read the code word from the memory (using the PC as the pointer)
• We then increment the PC
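The read-then-increment cycle can be sketched as a loop. This Python sketch ignores branches and jumps, and the dictionary `code_memory` with its addresses and instruction names is a made-up stand-in for real memory. The increment is 4 because every instruction is 4 bytes and addresses are in bytes.

```python
def fetch_trace(code_memory, start_pc, steps):
    """Return the (address, instruction) pairs fetched in straight-line order."""
    pc = start_pc
    trace = []
    for _ in range(steps):
        instruction = code_memory[pc]   # read the code word, using the PC as the pointer
        trace.append((pc, instruction))
        pc += 4                         # then increment the PC to the next instruction
    return trace

# Hypothetical code memory: byte address -> instruction name
code_memory = {80000: "lw", 80004: "bne", 80008: "add", 80012: "j"}
print(fetch_trace(code_memory, 80000, 3))
```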
Branch vs Jump
• Jump - an absolute jump without any conditions
j label
• Branch - a relative, conditional jump
bne $1,$2,label # if $1!=$2 go to label
• An example ($s3=h, $s4=i, $s5=j):
C code:
if (i!=j)
    h=i+j;
else
    h=i-j;
MIPS code:
      beq $s4, $s5, Lab1
      add $s3, $s4, $s5
      j Lab2
Lab1: sub $s3, $s4, $s5
Lab2: ...
Addresses in Branches and Jumps
• Instructions:
bne $t4,$t5,Label # Next instruction is at Label if $t4 != $t5
beq $t4,$t5,Label # Next instruction is at Label if $t4 = $t5
j Label # Next instruction is at Label
• Formats:
I   op   rs   rt   16 bit address
J   op   26 bit address
We see that a branch is a relative jump in a range of 2^16 words.
We assume that most of the jumps are local = branches
• beq $s1,$s2,25 # if ($s1 == $s2) go to PC + 4 + 25*4
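The rule "go to PC + 4 + 25*4" can be written as a function of the 16-bit field. Two details in this Python sketch are standard MIPS behaviour that the slides do not spell out and are therefore stated here as assumptions: the branch field is treated as a signed number (so branches can go backwards), and a jump takes its upper 4 address bits from PC + 4. The function names are illustrative.

```python
def sign_extend_16(value):
    """Interpret a 16-bit field as a signed number (assumed MIPS behaviour)."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value

def branch_target(pc, offset_field):
    """Branch: relative to PC + 4, with the offset counted in words."""
    return pc + 4 + 4 * sign_extend_16(offset_field)

def jump_target(pc, addr_field):
    """Jump: 26-bit word address, upper 4 bits taken from PC + 4 (assumed)."""
    return ((pc + 4) & 0xF0000000) | ((addr_field & 0x3FFFFFF) << 2)

print(branch_target(1000, 25))   # the starting PC of 1000 is a made-up value
```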
Actual coding example
Loop: lw $8, save($19) # $8=save[i]
bne $8, $21,Exit #Goto Exit if save[i]<> k
add $19,$19,$20 # i:=i+j
j Loop # Goto Loop
Exit:
save = 1000
Address   Instruction fields
80,000    35   19   8    1000
80,004    5    8    21   2
80,008    0    19   20   19   0   32
80,012    2    20,000
80,016    ...
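The numbers 2 and 20,000 in the table are what an assembler would compute from the labels Exit (80,016) and Loop (80,000). A Python sketch of that computation, with illustrative function names:

```python
def branch_field(branch_address, target_address):
    """Word offset stored in a branch, measured from the instruction after it."""
    return (target_address - (branch_address + 4)) // 4

def jump_field(target_address):
    """Word address stored in a jump (byte address divided by 4)."""
    return target_address // 4

print(branch_field(80004, 80016))   # bne at 80,004 going to Exit at 80,016
print(jump_field(80000))            # j going to Loop at 80,000
```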
An important rule:
In MIPS Assembly instructions we write
code addresses in words
data addresses in bytes
When a MIPS CPU accesses memory, it always gives the address in bytes
About long jumps:
We cannot do:
bne $8,$21,far_adrs
since there are only 16 bits of offset in branch instructions. But we can do:
beq $8,$21,nxt
j far_adrs
nxt:
This is a service given by the Assembler
How do we code a Branch-if-less-than?
slt $t0, $s1, $s2 means: if $s1 < $s2 then $t0 = 1 else $t0 = 0
• We can use slt to “build” a blt instruction:
blt $s0,$s1,Less
is translated by the Assembler into
slt $at,$s0,$s1 # $at gets 1 if $s0<$s1
bne $at,$zero,Less # go to Less if $at != 0
• blt is a pseudoinstruction
• The Assembler uses $at (= $1) for pseudoinstructions
Also: move $t1,$s4 stands for: add $t1,$s4,$zero
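The two expansions can be sketched as a text rewrite, roughly what an assembler does before encoding. This is a Python sketch; `expand` is an illustrative name and it handles only blt and move.

```python
def expand(instruction):
    """Rewrite a pseudoinstruction into real instructions; pass others through."""
    op, *args = instruction.replace(",", " ").split()
    if op == "blt":
        rs, rt, label = args
        return [f"slt $at,{rs},{rt}", f"bne $at,$zero,{label}"]
    if op == "move":
        rd, rs = args
        return [f"add {rd},{rs},$zero"]
    return [instruction]

print(expand("blt $s0,$s1,Less"))
print(expand("move $t1,$s4"))
```

Because the expansion writes to $at, a program that kept its own data in $at would have it overwritten, which is why that register is reserved.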
Design rule 4: Make the common case fast
• Most arithmetic operations use small constants:
addi $29, $29, 4
So we have many instructions with 16 bit constants: addi, slti, andi, ori, xori
• Large constants:
lui $t0,1010101010101010 # $t0 = 1010101010101010 0000000000000000
ori $t0,$t0,1010101010101010 # $t0 = 1010101010101010 1010101010101010
These take longer to load, but only one more instruction is added
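The lui/ori pair builds a 32-bit constant from two 16-bit halves. A Python sketch that models the two instructions as plain functions on integers (the function names mirror the instructions; the register itself is not modelled):

```python
def lui(imm16):
    """Place a 16-bit constant in the upper half; the lower half becomes zero."""
    return (imm16 & 0xFFFF) << 16

def ori(value, imm16):
    """OR a 16-bit constant into the lower half."""
    return value | (imm16 & 0xFFFF)

t0 = ori(lui(0b1010101010101010), 0b1010101010101010)
print(f"{t0:032b}")
```

For any 32-bit constant c, the two halves are c >> 16 and c & 0xFFFF.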
MIPS Assembly language summary
MIPS operands
Name: 32 registers
Example: $s0-$s7, $t0-$t9, $zero, $a0-$a3, $v0-$v1, $gp, $fp, $sp, $ra, $at
Comments: Fast locations for data. In MIPS, data must be in registers to perform arithmetic. MIPS register $zero always equals 0. Register $at is reserved for the assembler to handle large constants.
Name: 2^30 memory words
Example: Memory[0], Memory[4], ..., Memory[4294967292]
Comments: Accessed only by data transfer instructions. MIPS uses byte addresses, so sequential words differ by 4. Memory holds data structures, such as arrays, and spilled registers, such as those saved on procedure calls.
MIPS assembly language
Category             Instruction               Example               Meaning                                  Comments
Arithmetic           add                       add $s1, $s2, $s3     $s1 = $s2 + $s3                          Three operands; data in registers
                     subtract                  sub $s1, $s2, $s3     $s1 = $s2 - $s3                          Three operands; data in registers
                     add immediate             addi $s1, $s2, 100    $s1 = $s2 + 100                          Used to add constants
Data transfer        load word                 lw $s1, 100($s2)      $s1 = Memory[$s2 + 100]                  Word from memory to register
                     store word                sw $s1, 100($s2)      Memory[$s2 + 100] = $s1                  Word from register to memory
                     load byte                 lb $s1, 100($s2)      $s1 = Memory[$s2 + 100]                  Byte from memory to register
                     store byte                sb $s1, 100($s2)      Memory[$s2 + 100] = $s1                  Byte from register to memory
                     load upper immediate      lui $s1, 100          $s1 = 100 * 2^16                         Loads constant in upper 16 bits
Conditional branch   branch on equal           beq $s1, $s2, 25      if ($s1 == $s2) go to PC + 4 + 100       Equal test; PC-relative branch
                     branch on not equal       bne $s1, $s2, 25      if ($s1 != $s2) go to PC + 4 + 100       Not equal test; PC-relative
                     set on less than          slt $s1, $s2, $s3     if ($s2 < $s3) $s1 = 1; else $s1 = 0     Compare less than; for beq, bne
                     set less than immediate   slti $s1, $s2, 100    if ($s2 < 100) $s1 = 1; else $s1 = 0     Compare less than constant
Unconditional jump   jump                      j 2500                go to 10000                              Jump to target address
                     jump register             jr $ra                go to $ra                                For switch, procedure return
                     jump and link             jal 2500              $ra = PC + 4; go to 10000                For procedure call
RISC v. CISC
I - number of instructions in program
T - time of the clock cycle
CPI - number of clock cycles per instruction
RISC:
I T CPI
CISC:
I T CPI
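The three quantities combine into the execution time of a program: time = I × CPI × T. A Python sketch of that product follows; the numbers are invented purely to show the arithmetic and are not measurements of any real RISC or CISC machine.

```python
def execution_time(instruction_count, cpi, cycle_time_ns):
    """Execution time in nanoseconds: I * CPI * T."""
    return instruction_count * cpi * cycle_time_ns

# Hypothetical values, chosen only for illustration:
many_simple = execution_time(1_200_000, 1, 2)   # more instructions, fewer cycles each
few_complex = execution_time(1_000_000, 4, 2)   # fewer instructions, more cycles each
print(many_simple, few_complex)
```

A design can afford a larger I if it reduces CPI or T by a larger factor.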
An exercise
A C code and two MIPS Assembly translations are given below:
while (save[i]==k)
    i=i+j;
save: array [0..100] of word
k is kept in $21. $19 has i and $20 has j.
• First translation:
Loop: muli $9,$19,4 # Temporary reg $9:=i*4
lw $8,save($9) # Temporary reg $8:=save[i]
bne $8,$21,Exit # Goto Exit if save[i]<>k
add $19,$19,$20 # i:=i+j
j Loop # Goto Loop
Exit:
exercise (cont.)
• 2nd translation
muli $9,$19,4 # Temporary reg $9:=i*4
lw $8,save($9) # Temporary reg $8:=save[i]
bne $8,$21,Exit # Goto Exit if save[i]<>k
Loop: add $19,$19,$20 # i:=i+j
muli $9,$19,4 # Temporary reg $9:=i*4
lw $8,save($9) # Temporary reg $8:=save[i]
beq $8,$21,Loop # Goto Loop if save[i]=k
Exit:
• Assuming the loop is performed 10 times, what is the number of instructions performed in the two translations?
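One way to check an answer is to count the instructions on each path. This Python sketch assumes that "performed 10 times" means the statement i=i+j executes 10 times, after which the test fails and control reaches Exit; under a different reading the totals change.

```python
def first_translation(iterations):
    # Each pass that stays in the loop runs: muli, lw, bne, add, j  (5 instructions)
    # The final pass that leaves the loop runs: muli, lw, bne       (3 instructions)
    return 5 * iterations + 3

def second_translation(iterations):
    # Before the loop: muli, lw, bne                                (3 instructions)
    # Each pass through the loop: add, muli, lw, beq                (4 instructions)
    return 3 + 4 * iterations

print(first_translation(10), second_translation(10))
```

The second translation saves one instruction per pass because the conditional branch at the bottom replaces the unconditional jump.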
Compiler (Assembler) and Linker
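[Figure: A.asm and B.asm are each translated by the compiler into A.obj and B.obj; the linker combines A.obj, B.obj and the library C.lib (c.obj) into P.exe; the loader places P.exe in Memory]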
Page 161
Linker operation
A.asm:
s: .word 3,4
j k
lw $1,s ($2)
k: add $1,$2,$3
B.asm:
m: .word 2
sw 7, m($3)
The compiler translates each file separately.
A.obj:
3 4
j 2
lw $1,0($2)
add $1,$2,$3
B.obj:
2
sw 7,0($3)
The linker combines the two object files into P.exe:
2
3
4
sw 7,0($3)
j 3
lw $1,4($2)
add $1,$2,$3
Pages 156-160
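The address changes in the figure (j 2 becomes j 3, and the lw offset 0 becomes 4) can be reproduced with a toy relocation pass. This Python sketch is a strong simplification: the module layout, the tuple format and the rule "patch every j, lw and sw" are assumptions made for illustration, whereas a real linker is driven by the relocation information and symbol table in each object file.

```python
def link(modules):
    """Concatenate modules, shifting each module's addresses by its load position.

    Each module is a dict with 'data' (list of words) and 'code'
    (list of (op, operand) pairs whose operands are module-relative).
    """
    data, code = [], []
    for module in modules:
        data_base = 4 * len(data)    # data addresses are in bytes
        code_base = len(code)        # code addresses are in words
        for op, operand in module["code"]:
            if op == "j":
                operand += code_base
            elif op in ("lw", "sw"):
                operand += data_base
            code.append((op, operand))
        data.extend(module["data"])
    return data, code

A_OBJ = {"data": [3, 4], "code": [("j", 2), ("lw", 0), ("add", None)]}
B_OBJ = {"data": [2], "code": [("sw", 0)]}
print(link([B_OBJ, A_OBJ]))   # B is placed first, as in the figure
```

The jump target moves by one word and the data offset by four bytes, matching the rule that code addresses are written in words and data addresses in bytes.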
The structure of an object file
• Object file header
• text segment
• data segment
• relocation information
• symbol table
• debugging information
Page 161