CPE555A: Real-Time Embedded Systems. Lecture 4 Ali Zaringhalam Stevens Institute of Technology. Outline. Procedure Calls I/O Exception Handling Multitasking. Memory Models. Global variable. Allocated at compile time. Local/automatic variables. Allocated on the stack at run time. - PowerPoint PPT Presentation
Spring 2016, arz 1
CPE555A: Real-Time Embedded Systems
Lecture 4
Ali Zaringhalam
Stevens Institute of Technology
CS555A – Real-Time Embedded Systems Stevens Institute of Technology
Outline
• Procedure Calls
• I/O
• Exception Handling
Memory Use
Programs use memory to:
• Store executable code
• Store data used by programs during execution
Memory for data must be available:
• Allocated statically at compile time
• Allocated dynamically at run time
  • On the program stack for procedure calls
  • In the heap as needed by the program
Memory Usage
Global variable. Allocated at compile time.
Local/automatic variables. Allocated on the stack at run time.
Dynamic variables. Allocated in the heap area at run time.
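The three allocation classes can be seen side by side in a short C sketch. The names `call_count`, `sum_to` and `scratch` are illustrative only and do not come from the lecture.

```c
#include <assert.h>
#include <stdlib.h>

int call_count = 0;                 /* global: storage is fixed before the program runs */

/* Returns 1 + 2 + ... + n, using a heap-allocated scratch array for illustration. */
int sum_to(int n)
{
    int total = 0;                  /* local/automatic: lives in this call's stack frame */
    int *scratch;                   /* the pointer is local; the array it points to is not */

    assert(n >= 0);
    call_count++;
    if (n == 0)
        return 0;                   /* nothing to allocate for an empty range */
    scratch = malloc((size_t)n * sizeof *scratch);  /* dynamic: allocated in the heap */
    if (scratch == NULL)
        return -1;                  /* allocation failed */
    for (int i = 0; i < n; i++)
        scratch[i] = i + 1;
    for (int i = 0; i < n; i++)
        total += scratch[i];
    free(scratch);                  /* heap storage must be released explicitly */
    return total;
}
```

The stack storage for `total` disappears automatically when `sum_to` returns, whereas the heap block survives until `free` is called.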
Procedure Call: Return
After the procedure finishes execution it must return to the instruction at the return address. The address of this instruction is stored in $ra (register 31). Thus, to return to this address the procedure executes:
    jr $ra
so that no new instruction is required for the return. Similar to the jump instruction j, the jal instruction is encoded as a J-type instruction with opcode=3, as shown below.
J-Type Format
    Field:   opcode=3   immediate
    Bits:    6          26
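As a rough illustration of the J-type layout, the sketch below packs and unpacks a jump word in C. The slide gives only the field widths and opcode=3 for jal; the opcode 2 for j, the word-address immediate, and the reuse of the upper four bits of PC+4 follow the standard MIPS definition rather than this lecture. The function names are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

#define OPCODE_J   2u   /* j   (standard MIPS value, not stated on the slide) */
#define OPCODE_JAL 3u   /* jal (opcode=3, as on the slide) */

/* Build a J-type word: 6-bit opcode in bits 31..26, 26-bit immediate below it.
   The immediate holds the target's word address (byte address divided by 4). */
uint32_t encode_jtype(uint32_t opcode, uint32_t target_byte_addr)
{
    assert((target_byte_addr & 0x3u) == 0);          /* instructions are word aligned */
    uint32_t immediate = (target_byte_addr >> 2) & 0x03FFFFFFu;
    return (opcode << 26) | immediate;
}

/* Recover the jump target: the upper 4 bits come from the address of the
   instruction that follows the jump (PC + 4). */
uint32_t decode_jtype_target(uint32_t word, uint32_t pc_plus_4)
{
    return (pc_plus_4 & 0xF0000000u) | ((word & 0x03FFFFFFu) << 2);
}
```

For instance, a jal to byte address 0x00400020 encodes as 0x0C100008 under these assumptions.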
Example
Let’s compile the following procedure:
    int leaf_example(int g, int h, int i, int j)
    {
        int f;
        f = (g+h) - (i+j);
        return f;
    }
Now let’s try to compile this program using what we’ve learned so far. To compile, the compiler must make a decision about how to pass the arguments g, h, i and j to the procedure and where the procedure must return its results to the caller.
Argument Passing & Return Value
The MIPS ISA does not specify how arguments should be passed and values returned. This is done using a compiler/assembler convention for MIPS:
• Registers 4-7 are used for argument passing. By convention MIPS refers to these registers as $a0-$a3.
• Registers 2-3 are used to return values from procedures. By convention MIPS refers to these registers as $v0-$v1.
Let’s assume in our example that the arguments g, h, i and j will be passed to the procedure in $a0-$a3 and the result is returned in $v0. The compiler chooses reg16 for the local variable f. In addition the compiler uses reg8 and reg9 for temporary storage of (g+h) and (i+j).
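Register assignments: arguments g, h, i, j in $a0, $a1, $a2, $a3 (R4-R7); local variable f in R16; result in $v0 (R2); temporaries in R8 and R9.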
What Could Go Wrong?
What if the caller also happens to be using the same registers reg8, reg9 and reg16? In this case the values used by the caller will be overwritten by the callee, and the caller will not be able to execute the program correctly after the callee returns. This problem cannot be addressed by simply using a different set of registers in the caller and the callee. For one thing, we don’t know how deep the nested procedure calls are. If n procedure calls are nested, we will need O(n) registers whereas we only have 32. For another, this scheme clearly will not work if the procedures are compiled separately and later linked (as in a library). The solution is to spill registers into main memory. We associate a frame (or activation record) with a procedure call of the function f(x_1, x_2, ..., x_n). The frame will contain the registers that the callee plans to use during execution.
Call Stack
• The natural data structure for spilling registers into memory is a call stack (a last-in first-out structure)
• Register values are pushed and saved on the stack when the procedure is called and popped from the stack into the original registers at return
• Historically, call stacks “grow” from high addresses to low addresses
• A stack pointer addresses the top of the stack (the most recently pushed word)
• MIPS software uses register 29 for the stack pointer and refers to it as $sp
• Other machines (e.g., 80x86) may use a special-purpose stack pointer
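[Figure: call stack holding frames for main, Proc1, Proc2, Proc3 and Proc4, drawn between Low Address and High Address, with $sp marking the top of the stack]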
Pushing a Register on the Stack
Suppose the called procedure wants to use reg16. It must push reg16 to save it:
    subi $sp, $sp, 4      # makes room for a 4-byte word on the stack
    sw reg16, 0($sp)      # stores reg16 into stack memory
Now the called procedure can use reg16.
[Figure: stack with its “bottom” at the high addresses and its “top” at the stack pointer $sp; the stack grows down toward lower addresses, and the push moves $sp by -4]
Popping a Register From the Stack
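Before the procedure returns, it must restore reg16 to its original value by popping the stack into reg16:
    lw reg16, 0($sp)      # loads reg16 from stack memory
    addi $sp, $sp, 4      # pops the stack
Now the caller can use reg16 as before.
[Figure: stack with its “bottom” at the high addresses and its “top” at the stack pointer $sp; the pop moves $sp by +4]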
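The push and pop sequences can be mimicked in C with a small simulated stack. This is only a sketch: `sim_stack_t`, `push_word` and `pop_word` are hypothetical names, and the array stands in for stack memory with `sp` playing the role of $sp.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS 64

typedef struct {
    uint32_t mem[STACK_WORDS];  /* simulated stack memory, one entry per 4-byte word */
    uint32_t sp;                /* byte offset of the top of the stack, like $sp */
} sim_stack_t;

/* An empty stack has its pointer just past the highest address. */
void stack_init(sim_stack_t *s)
{
    s->sp = STACK_WORDS * 4;
}

void push_word(sim_stack_t *s, uint32_t reg)
{
    assert(s->sp >= 4);                 /* room for one more word */
    s->sp -= 4;                         /* subi $sp, $sp, 4 */
    s->mem[s->sp / 4] = reg;            /* sw   reg, 0($sp) */
}

uint32_t pop_word(sim_stack_t *s)
{
    assert(s->sp < STACK_WORDS * 4);    /* stack must not be empty */
    uint32_t reg = s->mem[s->sp / 4];   /* lw   reg, 0($sp) */
    s->sp += 4;                         /* addi $sp, $sp, 4 */
    return reg;
}
```

Values come back in the reverse of the order in which they were pushed, which is why a callee restores registers in the mirror image of how it saved them.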
Example - Continued
leaf_example:                   # this is the address in memory where the procedure is stored
    # The compiler decides to use reg8, reg9 and reg16. Because the caller may
    # also be using these same registers, they must be saved on the stack.
    # Save current values on the stack.
    subi $sp, $sp, 12           # make room on the stack for three registers
    sw   reg8, 8($sp)           # push reg8
    sw   reg9, 4($sp)           # push reg9
    sw   reg16, 0($sp)          # push reg16
    # Now you can use/overwrite the registers for the procedure’s computations.
    add  reg8, $a0, $a1         # compute g+h
    add  reg9, $a2, $a3         # compute i+j
    sub  reg16, reg8, reg9      # compute (g+h)-(i+j)
    add  $v0, reg16, $zero      # return result in $v0
    # We are done. But before we return we must restore the registers that we decided to use.
    # Restore old values to the registers.
    lw   reg8, 8($sp)           # restore reg8
    lw   reg9, 4($sp)           # restore reg9
    lw   reg16, 0($sp)          # restore reg16
    addi $sp, $sp, 12           # pop the stack
    jr   $ra                    # now return
Compiler assignments: g = $a0 (R4), h = $a1 (R5), i = $a2 (R6), j = $a3 (R7)
$sn and $tm
In the example the procedure saved and restored every register it intended to use without knowing whether they were used by the caller. When too many registers are spilled, performance suffers.
The alternative is to define and adhere to a protocol where all procedures assume that certain registers need not be saved and restored across a procedure call. MIPS assembler conventions are:
• 10 registers (8-15 and 24-25) are designated as temporary registers that need not be preserved by the callee. They are referred to as $t0-$t9.
  • If the caller uses $t0-$t9 it must save them before the call and restore them on return (caller saved).
• 8 registers (16-23) are designated as saved registers that must be preserved by the callee. They are referred to as $s0-$s7.
  • If the callee uses $s0-$s7 it must save them when the procedure is entered and restore them on return (callee saved). It doesn’t bother with $t0-$t9.
• The caller will not save $s0-$s7, and the callee will not save $t0-$t9.
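The convention boils down to a lookup on the register number. The following C sketch encodes the ranges listed above; the type and function names are invented for illustration.

```c
#include <assert.h>

typedef enum { REG_OTHER, REG_CALLER_SAVED, REG_CALLEE_SAVED } reg_class_t;

/* Classify a MIPS register number (0..31) under the $t/$s convention. */
reg_class_t classify_register(int reg)
{
    assert(reg >= 0 && reg <= 31);
    if ((reg >= 8 && reg <= 15) || reg == 24 || reg == 25)
        return REG_CALLER_SAVED;        /* $t0-$t9: callee may overwrite freely */
    if (reg >= 16 && reg <= 23)
        return REG_CALLEE_SAVED;        /* $s0-$s7: callee must save and restore */
    return REG_OTHER;                   /* not covered by this rule */
}

/* Count how many of the registers a callee plans to use it must spill itself. */
int callee_spill_count(const int *regs, int count)
{
    int spills = 0;
    for (int i = 0; i < count; i++)
        if (classify_register(regs[i]) == REG_CALLEE_SAVED)
            spills++;
    return spills;
}
```

Applied to leaf_example, which uses registers 8, 9 and 16, only one register has to be spilled by the callee.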
Recompilation Using Register Spilling Rules
In the example we do not need to save reg8 and reg9 but we do have to save reg16.
leaf_example:                   # this is the address in memory where the procedure is stored
    # The procedure plans to use reg8, reg9 and reg16. Following MIPS assembler conventions,
    # the callee must only save reg16. This reduces register spilling, improving code size
    # and performance.
    subi $sp, $sp, 4            # make room on the stack for ONE register
    sw   $s0, 0($sp)            # push $s0
    add  $t0, $a0, $a1          # compute g+h; we don’t need to save $t0 (R8)
    add  $t1, $a2, $a3          # compute i+j; we don’t need to save $t1 (R9)
    sub  $s0, $t0, $t1          # compute (g+h)-(i+j)
    add  $v0, $s0, $zero        # return result in $v0
    # We are done. But before we return we must restore the registers that we decided to use.
    lw   $s0, 0($sp)            # restore $s0
    addi $sp, $sp, 4            # pop the stack
    jr   $ra                    # now return
Compiler assignments: f = $s0 (R16), g = $a0 (R4), h = $a1 (R5), i = $a2 (R6), j = $a3 (R7)
But There is More to Spill!
All is well if the called procedure is a leaf procedure. But when there is a sequence of nested calls or a recursion we need to spill more registers. Consider compiling the following function, which computes n! recursively:
    // fact(4) = 4*fact(3) = 4*3*fact(2) = 4*3*2*fact(1)
    int fact(int n)
    {
        if (n < 1)
            return 1;
        else
            return n * fact(n-1);
    }
    fact(3);
Compiled Code
So let’s assume that the compiler decides to use $a0 to pass the argument and $v0 for the returned value.
fact:                           # no need to push saved registers on the stack; none is used
    slti $t0, $a0, 1            # is n<1?
    beq  $t0, $zero, L1         # if n>=1 go to L1
    addi $v0, $zero, 1          # return 1
    jr   $ra
L1: subi $a0, $a0, 1            # n=n-1
    jal  fact                   # compute (n-1)!; set $ra to return address
    mul  $v0, $v0, $a0          # compute n!; pretend there is a multiply instruction
    jr   $ra                    # return to where? The initial point of entry is lost,
                                # and with what value of $a0?
Now consider calling this from main() for n=3:
    addi $a0, $zero, 3
    jal  fact
The Problem & Solution
This is obviously not going to work because we are changing the registers $a0 and $ra across recursive calls. In addition to overwriting argument registers which may later be needed, we overwrite $ra, thus losing track of the address to which the calling procedure must eventually return. The solution again is to spill these registers by saving them on the call stack:
• The caller saves all argument registers that it needs by pushing them on the stack before a procedure call (caller saved). It restores them after the called procedure returns.
• The callee saves the return address ($ra) when the procedure is entered by pushing it on the stack (callee saved). It restores it before returning to the calling procedure.
Compiled Recursive Procedure
fact:                           # this is the address in memory where the procedure is stored
    subi $sp, $sp, 8            # make room on the stack for two registers
    sw   $ra, 4($sp)            # push return address $ra
    sw   $a0, 0($sp)            # push argument $a0
    slti $t0, $a0, 1            # is n<1?
    beq  $t0, $zero, L1         # if n>=1 then go to L1
    addi $v0, $zero, 1          # n<1 so return 1
    addi $sp, $sp, 8            # pop the stack; don’t have to restore $ra and $a0
    jr   $ra                    # now return
L1: subi $a0, $a0, 1            # n>=1: decrement n and make recursive call with (n-1)
    jal  fact
    lw   $a0, 0($sp)            # restore original n
    mul  $v0, $v0, $a0          # compute n*(n-1)!
    lw   $ra, 4($sp)            # before we return we must restore the return address register
    addi $sp, $sp, 8            # pop the stack
    jr   $ra                    # now return
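To see what the saved argument slots accomplish, here is a C sketch that computes n! twice: once with ordinary recursion, and once with the "call stack" made explicit as an array. The array `saved_n` stands in for the stacked copies of $a0; the return address has no counterpart in this model, and the limit of 12 is an assumption chosen so that the result fits in 32 bits.

```c
#include <assert.h>

#define MAX_N 12    /* 12! is the largest factorial that fits in 32 bits */

/* Direct recursion: the compiler generates the saves and restores itself. */
unsigned fact_recursive(unsigned n)
{
    return (n < 1) ? 1u : n * fact_recursive(n - 1);
}

/* Same computation with one explicit stack slot per pending call. */
unsigned fact_explicit_stack(unsigned n)
{
    unsigned saved_n[MAX_N];        /* plays the role of the saved $a0 slots */
    int top = 0;                    /* number of frames currently on the stack */
    unsigned result;

    assert(n <= MAX_N);
    while (n >= 1) {                /* descend: one frame per pending call */
        saved_n[top++] = n;         /* sw $a0, 0($sp) */
        n = n - 1;                  /* subi $a0, $a0, 1 ; jal fact */
    }
    result = 1;                     /* base case: addi $v0, $zero, 1 */
    while (top > 0)                 /* unwind, most recent frame first */
        result *= saved_n[--top];   /* lw $a0, 0($sp) ; mul $v0, $v0, $a0 */
    return result;
}
```

Without the saved copies, the multiply on the way back would see the decremented argument, which is exactly the failure in the first compiled version.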
Stack Frame Pattern
What Else?
In our examples so far we assumed that the stack pointer $sp does not change during execution of the procedure. The stack pointer is adjusted to save registers on procedure entry and readjusted to the original value on the procedure’s return. But what if the procedure has local variables? In particular, what if the local variables can be declared anywhere in the body of the procedure? All of these are also created and maintained on the call stack. However, as new storage is allocated on the stack the stack pointer keeps getting readjusted. If all local variables were to be referred to in terms of the stack pointer, the reference would have to be readjusted each time new storage is allocated on the stack. This makes code generation cumbersome and, more importantly, the compiled code is difficult to understand (the offsets keep changing).
The Frame Pointer
The solution is to use a frame pointer ($fp). The frame pointer points to the location of the first variable saved on the stack (so that this variable has a zero offset with respect to $fp) on procedure entry. The frame pointer remains fixed during the procedure’s execution, and all local variables are referred to in terms of $fp. Here’s the protocol:
• On entry, the procedure saves the current value of $fp on the stack (callee saved),
• On entry, the procedure sets the value of $fp to the value of the stack pointer $sp,
• Before the procedure returns, the stack pointer is adjusted back to $sp = $fp. The procedure also restores the original value of the frame pointer $fp from the value saved on the stack.
Frame & Stack Pointers
[Figure: three panels showing $fp and $sp before the call, during the call and after the call]
MIPS assembler uses register 30 for the frame pointer $fp.
Stack Frames
• Contents
  • Local variables
  • Return information
  • Temporary space
• Management
  • Space allocated when the procedure is entered (“set-up” code)
  • Deallocated on return (“finish” code)
[Figure: stack showing the previous frame and the frame for proc, marked by the frame pointer $fp and the stack pointer $sp at the stack “top”]
Example
    yoo(…) { … who(); … }
    who(…) { … amI(); … amI(); … }
    amI(…) { … amI(); … }
[Figure sequence: the call tree is yoo → who → amI → amI → amI, followed by a second call from who to amI. $fp and $sp are shown against the stack at each step. Frames on the stack, oldest first:]
 1. yoo
 2. yoo, who
 3. yoo, who, amI
 4. yoo, who, amI, amI
 5. yoo, who, amI, amI, amI
 6. yoo, who, amI, amI
 7. yoo, who, amI
 8. yoo, who
 9. yoo, who, amI
10. yoo, who
11. yoo
Typical Microcontroller Board
Stellaris® LM3S8962 evaluation board
Interface Types
• Parallel: multiple data lines for data
  • Speed
  • Short distance
  • Examples: PCI (Peripheral Component Interconnect), ATA (Advanced Technology Attachment)
• Serial: single data line for data
  • Longer range than parallel
  • Examples: USB, RS232, I2C, SPI, PCI-Express
• Synchronous: there is a clock signal between transmitter and receiver
  • Examples: USB, I2C, SPI
• Asynchronous: no clock between transmitter and receiver
  • Uses START/STOP bits
  • Examples: RS232, UART
General-Purpose I/O (GPIO)
Open collector circuits are used for GPIO pins
The same pin can be used for input and output
Multiple controllers can be connected to the same bus
When the processor writes 1 to the register, the transistor is turned on and the GPIO pin is pulled low
When the processor writes 0 to the register, the transistor is turned off and the GPIO pin is pulled high
Wired NOR
RS-232 Standard
RS-232 is a common interface and supports asynchronous serial connections
RS-232 is being replaced by USB
[Figure: voltage levels for an ASCII "K" character (0x4B) with 1 start bit, 8 data bits and 1 stop bit.] The data bits are transmitted least significant bit first, so 0x4B = 0100 1011 appears on the line as 1 1 0 1 0 0 1 0 between the start and stop bits.
[Figure: DB9 connector]
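The framing of a character can be sketched in a few lines of C. The function below produces logic levels in transmission order for 1 start bit, 8 data bits and 1 stop bit, assuming no parity; it deliberately ignores RS-232 voltage levels. The name `uart_frame_8n1` is made up for the example.

```c
#include <assert.h>
#include <stdint.h>

#define FRAME_BITS 10   /* 1 start + 8 data + 1 stop */

/* Fill bits[0..9] with the logic levels of one frame, in transmission order. */
void uart_frame_8n1(uint8_t data, uint8_t bits[FRAME_BITS])
{
    bits[0] = 0;                            /* start bit is logic 0 */
    for (int i = 0; i < 8; i++)
        bits[1 + i] = (data >> i) & 1u;     /* data bits, least significant first */
    bits[9] = 1;                            /* stop bit is logic 1 */
}
```

For "K" (0x4B) this gives the sequence 0 1 1 0 1 0 0 1 0 1.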
Universal Asynchronous Receiver Transmitter (UART)
• Converts 8-bit parallel data to serial data & vice versa
• The UART provides hardware support for
  • Parallel-to-serial and serial-to-parallel conversion
  • Start and stop bit framing
  • Parity generation
  • Baud-rate generation (2400 bps to 115.2 kbps)
• UART supports interrupts
  • Transmit Complete
  • Transmit Data Register Empty
  • Receive Complete
• Serial interface specification (RS232C)
  • Start bit
  • 6, 7, 8 or 9 data bits
  • Parity bit (optional)
  • Stop bit
UART Register Interface
CPU uses UART registers to interact with the UART:
• UDR (UART Data Register)
  • CPU writes byte to transmit
  • CPU reads byte received
• USR (UART Status Register)
  • Rx/Tx complete signal bits
• UCR (UART Control Register)
  • Interrupt enable bits
  • Rx/Tx enable bits
  • Data format control bits (e.g., optional parity bit)
• UBRR (UART Baud Rate Register)
  • Baud rate generator division ratio
UART Transmission
• Send a byte by writing to the UDR register
• UART sets the TXC bit in USR when the final bit has finished transmitting
• UART triggers the Tx Complete interrupt if enabled in the UCR
• CPU must wait for the current byte to finish transmitting before sending the next one
UART Receive
• How does the CPU know a byte has arrived? Two methods are available:
  • Polling: poll the RXC bit in USR, or
  • Interrupt: enable the Rx Complete interrupt and write an ISR to handle it
• Read received bytes from the UDR register
UART Baud Rate
• Set by UBRR (Baud Rate Register)
• UBRR ranges from 0 to 255
• BAUD = fCK / (16 * (UBRR + 1))
• fCK is the crystal clock frequency
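The formula can be turned around to pick a divisor for a desired rate. The sketch below uses integer arithmetic and rounds to the nearest divisor; the 8 MHz crystal used in the note after the code is an arbitrary example value, not one from the slides, and the function names are invented.

```c
#include <assert.h>
#include <stdint.h>

/* BAUD = fCK / (16 * (UBRR + 1)), truncated to an integer. */
uint32_t baud_from_ubrr(uint32_t fck_hz, uint32_t ubrr)
{
    assert(ubrr <= 255);
    return fck_hz / (16u * (ubrr + 1u));
}

/* Nearest UBRR setting for a requested baud rate. */
uint32_t ubrr_for_baud(uint32_t fck_hz, uint32_t baud)
{
    assert(baud > 0);
    uint32_t divisor = (fck_hz + 8u * baud) / (16u * baud);  /* rounded fCK/(16*baud) */
    assert(divisor >= 1 && divisor <= 256);                  /* UBRR must fit in 0..255 */
    return divisor - 1u;
}
```

With an 8 MHz clock and a 9600 bps target, the nearest setting is UBRR = 51, which actually yields about 9615 bps; the small mismatch is typical because the divisor must be an integer.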
Interfacing I/O Devices
• How does the CPU interface with I/O devices?
• How is data transferred to/from memory?
• Role of the operating system (OS)
  • Provides system calls for accessing devices (e.g., read, write, seek)
  • Protects one user’s data from another
  • Handles interrupts from devices
I/O Instructions
• Addressing: the CPU must be able to address each device’s registers
• Instructions: the CPU must use instructions to send commands to I/O devices
• Two techniques for I/O instructions
  • Isolated I/O: special instructions for I/O (e.g., IN, OUT), different from the instructions used to access data memory
  • Memory-mapped I/O: same instruction set for I/O access as for data memory access (e.g., LOAD, STORE)
Isolated I/O
• Separate instructions for memory and I/O references, e.g., IN R1, device_Address
• Separate memory & I/O address space
  • Either a physically separate bus (shown in the diagram), or
  • The same physical bus with a signal to indicate memory or I/O
• Used in Intel 80x86
[Figure: CPU connected to memory over the processor-memory bus and, through interfaces, to I/O peripherals over an independent I/O bus]
Memory-Mapped I/O
• Common memory & I/O bus
• Same instruction set for memory access & I/O, e.g., LOAD R1, 0(R5): R5 maps to an external I/O register
• Same address space for memory & I/O
• More prevalent than isolated I/O: used in RISC processors
[Figure: CPU, memory and I/O peripherals (through interfaces) on a common memory & I/O bus; a single address space is divided among ROM, RAM and I/O]
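In C, memory-mapped registers are usually reached through pointers to volatile data, so that an ordinary load or store instruction does the I/O. The sketch below is a host-runnable stand-in: `fake_device` is an ordinary array pretending to be a device's register block, and `REG32`, `mmio_read` and `mmio_write` are hypothetical helpers, not part of any particular vendor library.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for a device register block so the sketch can run on a host.
   On real hardware the base address would come from the chip's memory map. */
uint32_t fake_device[4];

/* Treat an address as a 32-bit register; volatile forces a real access. */
#define REG32(addr) (*(volatile uint32_t *)(addr))

uint32_t mmio_read(uintptr_t addr)
{
    return REG32(addr);         /* becomes an ordinary LOAD */
}

void mmio_write(uintptr_t addr, uint32_t value)
{
    REG32(addr) = value;        /* becomes an ordinary STORE */
}
```

A caller computes a register address as base plus offset, for example `mmio_write((uintptr_t)fake_device + 4, 0xA5)` to reach the second register.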
Polling
• Main loop uses each I/O device periodically
• If output is to be produced, produce it
• If input is ready, read it
• Example: USR (UART Status Register) Rx/Tx complete signal bits
Send on a Polled UART I/O
Loop until TX buffer is empty (6th bit of Status register is set to 1)
Write Data register with your data.
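The two steps might look roughly like this in C. This is a sketch under assumptions: the register block is modeled as a struct so the code can run against a fake device, and the "6th bit" is taken to mean bit 5 when counting from 1; the actual bit position depends on the UART in use.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    volatile uint8_t udr;   /* UART Data Register */
    volatile uint8_t usr;   /* UART Status Register */
} uart_regs_t;

#define USR_TX_EMPTY (1u << 5)   /* assumed position of the "6th bit" */

/* Busy-wait until the transmit buffer is empty, then hand over one byte. */
void uart_send_polled(uart_regs_t *uart, uint8_t byte)
{
    assert(uart != 0);
    while ((uart->usr & USR_TX_EMPTY) == 0) {
        /* spin: the CPU does no useful work here */
    }
    uart->udr = byte;
}
```

The `volatile` qualifiers matter: without them the compiler could read the status register once and spin forever on the stale value.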
Send a Byte Sequence
The lower the I/O speed, the more CPU cycles are wasted. As the CPU clock rate increases, the polling penalty grows.
Example: sending 8 bits at 57600 bps with an 18 MHz CPU clock takes (8/57600)*(18000000) = 2500 cycles.
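The arithmetic generalizes to any combination of bit count, line rate and clock rate. A minimal C helper, with an invented name, is sketched below; multiplying before dividing keeps the integer result exact for the slide's numbers.

```c
#include <assert.h>
#include <stdint.h>

/* CPU cycles that elapse while `bits` bits are shifted out at `baud` bits/s. */
uint64_t cycles_per_transfer(uint32_t bits, uint32_t baud, uint32_t cpu_hz)
{
    assert(baud > 0);
    return ((uint64_t)bits * cpu_hz) / baud;
}
```

Doubling the clock to 36 MHz doubles the count to 5000 cycles for the same byte, which is the growing penalty the slide refers to.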
Receive With Polling
Loop until RX buffer is full (8th bit of Status register is set to 1)
Why?
Interrupt-Driven I/O
• External hardware alerts the processor that input is ready
• Processor suspends what it is doing
• Processor invokes an interrupt service routine (ISR)
• ISR interacts with the application concurrently
Control Flow in Absence of Interrupts
Processors do only one thing:
• From startup to shutdown, a CPU simply reads and executes (interprets) a sequence of instructions, one at a time
• This sequence is the CPU’s control flow
[Figure: physical control flow over time: <startup>, inst1, inst2, inst3, …, instn, <shutdown>]
Altering the Control Flow
Up to now: two mechanisms for changing control flow:
• Jumps and branches
• Call and return
Both react to changes in state within the program.
Insufficient for a useful system: a useful system must also react to changes in system state (CPU + peripherals):
• Data arrives from a disk or a network adapter
• User hits Ctrl-C at the keyboard
• System timer expires
System needs mechanisms for “exceptional control flow”
Exceptional Control Flow
Exceptional events exist at all levels of a computer system
• Low level mechanisms
  • Change in control flow in response to a system event (e.g., divide by zero)
  • Hardware interrupts
• Higher level mechanisms
  • Process context switch
  • OS system calls
• Exception categories
  • Asynchronous
  • Synchronous
Asynchronous Exceptions (Interrupts)
• Caused by events external to the processor
  • Indicated by asserting the processor’s interrupt pin, which is examined by the processor after executing each instruction
  • Processor completes execution of the “current” instruction
  • Interrupt handler returns to the “next” instruction in the original code
• Examples:
  • I/O interrupts: arrival of a packet from the network, arrival of data from a disk
  • Hard reset interrupt: hitting the reset button
External Exception Interface
[Figure: interrupt request lines Level 1 through Level 7 feed a priority encoder, whose three outputs I0, I1 and I2 connect to the processor. Code 000 means no interrupt; 010 means a level 2 interrupt.]
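The encoder's behavior can be modeled in C as below. This is a sketch under the assumption that a higher level number means higher priority, which the figure residue does not state; `priority_encode` is a made-up name.

```c
#include <assert.h>
#include <stdint.h>

/* requests: bit (n-1) set means the Level n line is asserted, n = 1..7.
   Returns the 3-bit code I2 I1 I0: 0 for no interrupt, otherwise the level. */
unsigned priority_encode(uint8_t requests)
{
    for (unsigned level = 7; level >= 1; level--) {
        if (requests & (1u << (level - 1)))
            return level;       /* highest asserted level wins (assumption) */
    }
    return 0;                   /* 000: no interrupt */
}
```

The three output lines let seven request lines share a small number of processor pins, with the all-zero code reserved for "nothing pending".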
Synchronous Exceptions
Caused by events that occur as a result of executing an instruction:
• Traps
  • Intentional and planted by design
  • Examples: system calls, breakpoint traps, special instructions
  • After handling, control must return to the “next” instruction
• Faults
  • Unintentional but possibly recoverable
  • Examples: page faults (recoverable), protection faults (unrecoverable), floating point exceptions
  • Either re-executes the faulting (“current”) instruction or aborts
• Aborts
  • Unintentional and unrecoverable
  • Examples: memory parity error
  • Aborts the current program
Exception Handling
• Triggers
  • A level change on an interrupt request pin
  • Software writing to an interrupt control register (“software interrupt”), causing a level change on an interrupt pin
  • Executing a special “SysCall” instruction
• Responses
  • Disable interrupts
  • Save the program counter in the special Exception Program Counter (EPC)
  • Execute Interrupt Service Routine (ISR) instructions beginning at a designated address in memory
• Design of interrupt service routine
  • Save and restore any registers it uses
  • Re-enable interrupts before returning from the ISR
Saving the Processor State
    Address   Instruction
    0x1230    add R0, R1, R1
    0x1234    div R3, R2, R0
    0x1238    add R1, R1, R2
    0x123C    addi R2, R2, 1
PC: 0x1234
Exception PC (EPC): 0x1234
Cause: Value
• All actions performed by the processor before entering the exception service routine
• Interrupts disabled
Invoking Exception Service Routine
[Figure: instruction memory with the exception handler located at 0x80000080]
    Address   Instruction
    0x1230    add R0, R1, R1
    0x1234    div R3, R2, R0
    0x1238    add R1, R1, R2
    0x123C    addi R2, R2, 1
EPC: 0x1234
PC: 0x80000080
Cause: Value
Exception Service Routine
    ExceptionHandler()
    {
        if (cause == ArithmeticOverflow)
            ArithmeticOverflowHandler();
        else if (cause == DivideByZero)
            DivideByZeroHandler();
        else if (cause == IllegalInstruction)
            IllegalInstructionHandler();
        else if (cause == ExternalInterrupt)
            InterruptHandler();
        ...
    }
[Figure: Cause register holding the cause field]
Example Cause Values
    Cause               Value
    Overflow            12
    Breakpoint          9
    Store addr. error   5
    Load addr. error    4
    Ext. interrupt      0
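A compilable version of the dispatch might look like the C sketch below. The cause values are the ones in the table; extracting the code from bits 6..2 of the Cause register follows the MIPS32 layout and is an assumption relative to this slide. The counters are placeholders standing in for real handlers.

```c
enum cause_code {
    CAUSE_EXT_INTERRUPT    = 0,
    CAUSE_LOAD_ADDR_ERROR  = 4,
    CAUSE_STORE_ADDR_ERROR = 5,
    CAUSE_BREAKPOINT       = 9,
    CAUSE_OVERFLOW         = 12
};

/* Placeholder "handlers": each simply counts how often it was selected. */
int interrupts_handled, address_errors_handled, breakpoints_handled,
    overflows_handled, unknown_causes;

/* Assumed layout: 5-bit exception code in bits 6..2 of the Cause register. */
unsigned cause_code_of(unsigned cause_register)
{
    return (cause_register >> 2) & 0x1Fu;
}

void exception_handler(unsigned cause_register)
{
    switch (cause_code_of(cause_register)) {
    case CAUSE_EXT_INTERRUPT:    interrupts_handled++;     break;
    case CAUSE_LOAD_ADDR_ERROR:
    case CAUSE_STORE_ADDR_ERROR: address_errors_handled++; break;
    case CAUSE_BREAKPOINT:       breakpoints_handled++;    break;
    case CAUSE_OVERFLOW:         overflows_handled++;      break;
    default:                     unknown_causes++;         break;
    }
}
```

A `switch` on the code is equivalent to the if/else chain on the slide; compilers often turn it into a jump table, which keeps dispatch time independent of the cause.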
Trap Example: Opening File
• User calls: int open(filename, options)
• Function open executes a system call instruction via __libc_open
• OS must find the file and get it ready for reading or writing
• Returns an integer file descriptor as a handle to the user
[Figure: user process and OS timelines; the int instruction raises the exception, the OS opens the file and returns to the following pop instruction]
Fault Example: Page Fault
• User attempts a write to a memory location
• That portion (page) of the user’s memory is currently on disk
• Page handler must load the page into physical memory
• Returns to the faulting instruction
• Successful on second try
    int a[1000];
    int main(void)
    {
        a[500] = 13;
        return 0;
    }
[Figure: user process and OS timelines; the store instruction raises a page fault exception, the OS creates the page, loads it into memory and returns to the store]
Abort Example: Invalid Memory Reference
• Page handler detects an invalid address
• Sends SIGSEGV signal to the user process
• User process exits with “segmentation fault”
    int a[1000];
    int main(void)
    {
        a[5000] = 13;
        return 0;
    }
Suppose the address of a[5000] has not been mapped.
[Figure: user process and OS timelines; the store instruction raises a page fault exception, the OS detects the invalid address and signals the process]
External Timers
• Programmable Interval Timer (PIT)
• Counts down from some value to zero and then triggers an interrupt
• The initial timer value is set by writing to a memory-mapped register
• It can be configured to trigger repeatedly by hardware, without a software ISR restarting it
Volatile Keyword Use
• An optimizing compiler decides that nothing in the body of the program changes foo.
• So it transforms the program into an infinite loop.
• But foo may be a memory-mapped I/O register or may be changed by an interrupt routine. It may change external to the program.
The volatile keyword tells the compiler not to optimize away accesses to foo. The compiler leaves the code unchanged.
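The slide's code listing is not available here, so the following is only a guess at the kind of loop being discussed. To keep it runnable on a host, a C signal handler stands in for the interrupt routine; `on_signal` and `wait_for_foo` are hypothetical names.

```c
#include <signal.h>

/* volatile: every test of foo in the loop below must re-read memory. */
volatile sig_atomic_t foo = 0;

/* Stand-in for an interrupt service routine that changes foo. */
void on_signal(int signum)
{
    (void)signum;
    foo = 1;
}

/* Spin until foo becomes nonzero; returns how many times the loop body ran. */
unsigned long wait_for_foo(void)
{
    unsigned long spins = 0;
    while (foo == 0)
        spins++;
    return spins;
}
```

If `foo` were declared without `volatile`, an optimizer would be entitled to read it once before the loop and, finding it zero, never leave the loop even after the handler runs.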
Decrementing unsigned int
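    #include <stdio.h>

    int main(void)
    {
        unsigned int x = 0;
        x--;                                /* wraps around to the largest unsigned value */
        printf("Hello world: %u!\n", x);
        return 0;
    }
Output (with a 32-bit unsigned int):
    Hello world: 4294967295!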
User Mode vs. System (aka Privileged or Kernel) Mode
• Operating system kernel executes in the privileged mode
  • Has unrestricted access to all system resources
  • Protects user programs from each other (e.g., memory protection)
  • Protects the system against malicious use (all user access to system resources is via system calls)
• User programs run in user mode with controlled access to system resources via system calls
• Exception handling is done in system mode because unrestricted access is typically required
The Path Of I/O Transfer
• In both polled I/O and interrupt-driven I/O, the path for data transfer is through the processor registers
• For high-performance systems and high-bandwidth I/O peripherals both techniques are inefficient
• Alternative: Direct Memory Access (DMA)
  • Removes the processor from the data transfer path
  • A limited form of multiprocessing (DMA is a specialized processor)
[Figure: processor registers, ROM, RAM and memory-mapped I/O on a common memory & I/O bus; data moves between I/O and memory through the registers via LOAD and STORE]
I/O Using DMA
• CPU sends device name, address, length and transfer direction to the DMA controller (via memory-mapped I/O)
• CPU issues start command to the DMA controller
• DMA controller provides handshake signals to the I/O device and memory, including addresses
• DMA controller interrupts the processor when the transfer is complete
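The sequence can be modeled in software as a request descriptor plus a start call. Everything in this C sketch is hypothetical: a real controller is programmed through its own memory-mapped registers, and here `memcpy` merely stands in for the bus transfers and the `done` flag for the completion interrupt.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef enum { DMA_DEVICE_TO_MEMORY, DMA_MEMORY_TO_DEVICE } dma_direction_t;

/* What the CPU hands to the controller: addresses, length and direction. */
typedef struct {
    uint8_t *memory_addr;       /* buffer in main memory */
    uint8_t *device_buf;        /* stand-in for the I/O device's data port */
    size_t length;              /* number of bytes to move */
    dma_direction_t direction;
    volatile int done;          /* set where the completion interrupt would fire */
} dma_request_t;

/* "Start" command: the simulated controller moves the data without the CPU
   copying it through its registers, then signals completion. */
void dma_start(dma_request_t *req)
{
    assert(req != NULL && req->memory_addr != NULL && req->device_buf != NULL);
    req->done = 0;
    if (req->direction == DMA_DEVICE_TO_MEMORY)
        memcpy(req->memory_addr, req->device_buf, req->length);
    else
        memcpy(req->device_buf, req->memory_addr, req->length);
    req->done = 1;
}
```

On real hardware the CPU would continue with other work after issuing the start command and learn of completion from the interrupt rather than from a synchronous return.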
[Figure: CPU, memory (ROM, RAM), I/O peripherals with their interfaces, and a DMA controller on a shared bus; the CPU controls the DMA controller via memory-mapped I/O, and the DMA data transfer runs directly between the I/O device and memory]