Topic IV – Embedded Systems – MicroBlaze and ARM ZYNQ
Roberto Gutiérrez Mazón
- Introduction
- MicroBlaze and ARM ZYNQ Features
- Random Access Memory (RAM):
  - SRAM, DRAM, SDRAM, etc.
  - Programmable ROM (PROM, EPROM, EEPROM, FLASH)
  - Content-Addressable Memory (CAM)
- Buses – Architectures
  - PCI, PCI Express
  - Processor Local Bus (PLB) and On-chip Peripheral Bus (OPB)
  - Advanced eXtensible Interface (AXI)
- Hardware/Software co-design. Address Management
- Embedded Development Kit (EDK)
- OS vs. RTOS (Real-Time OS)
- An embedded system is nearly any computing system (other than a general-purpose computer) with the following characteristics:
  - Single function
    - Typically designed to perform a predefined function
  - Tightly constrained
    - Tuned for low cost
    - Built from a single component or a few components
    - Performs functions fast enough
    - Consumes minimum power
  - Reactive and real-time
    - Must continually monitor the desired environment and react to changes
  - Hardware and software coexistence
- Embedded design in an FPGA consists of the following:
  - Develop the processor system in the FPGA
    - MicroBlaze processor (soft core) or ARM processor (hard core)
    - Peripherals
    - PLBv46 (XPS)
    - AXI interconnect
    - Reset, clocking, debug ports
  - Use an Operating System (OS) or Real-Time Operating System (RTOS) (optional)
  - Generate drivers and libraries
  - Create the software application
    - Software routines
    - Interrupt service routines
[Figure: block diagram of an embedded system: power supply, clocks, custom interface logic, SDRAM and SRAM, memory controller, UART, LC, display controller, interrupt controller, timer, audio codec, CPU (uP / DSP) with co-processor, GP I/O, address decode unit, Ethernet MAC]
[Figure: the same block diagram with an FPGA: clocks, custom interface logic, SDRAM and SRAM, memory controller, UART, display controller, timer, power supply, LC, audio codec, CPU (uP / DSP) with co-processor, GP I/O, address decode unit, Ethernet MAC, interrupt controller]
[Figure: power supply, SDRAM, SRAM, LC, audio codec, EPROM]
- Example:
  - Hummingbird processor from Samsung
    - Used in Galaxy phones and tablets, and the basis of Apple's A4 processor for the iPad and iPhone 4
    - An ARM Cortex-A8 processor core with a PowerVR SGX 535 graphics chip
  - NVIDIA's Tegra 2 is similar
    - Pairs two ARM Cortex processor cores with an NVIDIA GPU
MicroBlaze Architecture
[Figure: MicroBlaze block diagram, with the following callouts]
- Optional MMU for Linux 2.6 and MPU block for ease of software use
- PLB-based system
- Enhanced FSL for CPU to hw/sw accelerator
- Scalable 32-bit core
  - Single-issue pipeline
    - Supports either a 3-stage (resource-focused) or a 5-stage (performance-focused) pipeline
  - Configurable instruction and data caches
    - Direct mapped (1-way associative)
  - Optional Memory Management Unit (MMU) or Memory Protection Unit (MPU)
    - Required for Linux OS (Linux 2.6 is currently supported)
  - Floating-point unit (FPU)
    - Based on the IEEE 754 format
  - Barrel shifter
  - Hardware multiplier
    - 32x32 multiplication generating a 64-bit result
  - Hardware divider
  - Fast Simplex Link (FSL) FIFO channels for easy, direct access to fabric and hardware acceleration
  - Hardware debug and trace module
- New features and improvements
  - High-performance AXI4 interface and AXI4 peripherals
  - Memory Management Unit (MMU) implements virtual memory management
    - Compatible with the PPC405 processor MMU
    - Virtual memory management provides greater control over memory protection, which is especially useful with applications that can use an RTOS
  - Processing improvements
    - New float-integer conversion and float square-root instructions, which speed up:
      - FP → Int conversion
      - Int → FP conversion
      - FP square root
  - Enhanced XMD support
  - AXI4 streaming interface
- All instructions take one clock cycle, except the following:
  - Load and store (two clock cycles)
  - Multiply (two clock cycles)
  - Branches (three clock cycles, can be one clock cycle)
- Operating frequency (fast speed grade, 5-stage pipeline)
  - 307 MHz on the Virtex-6 (-3) FPGA
  - 245 MHz on the Virtex-5 (-3) FPGA
  - 154 MHz on the Spartan-6 (-3) FPGA
  - 119 MHz on the Spartan-3 (-5) FPGA
- Performance of 1.15 DMIPS/MHz
- Fabric utilization in LUTs (size optimized / speed optimized)
  - 779 / 1,134 LUTs in the Virtex-6 FPGA
  - 240 / 330 LUTs in the Virtex-5 FPGA
  - 770 / 1,154 LUTs in the Spartan-6 FPGA
  - 1,258 / 1,821 LUTs in the Spartan-3 FPGA
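The DMIPS/MHz figure and the operating frequencies quoted for MicroBlaze can be combined into a rough peak-throughput estimate. The sketch below only multiplies the two slide figures; the function name `peak_dmips` is an illustrative assumption, and real throughput depends on cache configuration and memory latency.

```python
# Figures quoted on the slide: 1.15 DMIPS/MHz and the 5-stage-pipeline clock rates.
DMIPS_PER_MHZ = 1.15

FMAX_MHZ = {
    "Virtex-6 (-3)": 307,
    "Virtex-5 (-3)": 245,
    "Spartan-6 (-3)": 154,
    "Spartan-3 (-5)": 119,
}


def peak_dmips(fmax_mhz, dmips_per_mhz=DMIPS_PER_MHZ):
    """Upper-bound Dhrystone throughput at a given clock frequency (hypothetical helper)."""
    return fmax_mhz * dmips_per_mhz


if __name__ == "__main__":
    for family, fmax in FMAX_MHZ.items():
        print(f"{family}: about {peak_dmips(fmax):.0f} DMIPS")
```

The same multiplication applies to any processor quoted in DMIPS/MHz, which makes it a quick way to compare soft cores and hard cores at their respective clock rates.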
ZYNQ Features (ARM)
- Complete ARM®-based processing system
  - Application Processor Unit (APU)
    - Dual ARM Cortex™-A9 processors
    - Caches and support blocks
  - Fully integrated memory controllers
  - I/O peripherals
- Tightly integrated programmable logic
  - Used to extend the processing system
  - Scalable density and performance
- Flexible array of I/O
  - Wide range of external multi-standard I/O
  - High-performance integrated serial transceivers
  - Analog-to-digital converter inputs
- Application processing unit (APU)
- I/O peripherals (IOP)
  - Multiplexed I/O (MIO), extended multiplexed I/O (EMIO)
- Memory interfaces
- PS interconnect
- DMA
- Timers
  - Public and private
- General interrupt controller (GIC)
- On-chip memory (OCM): RAM
- Debug controller: CoreSight
- Legacy ARM processors
  - ARM7, ARM9 (not the Cortex-A9 processor), ARM11
- Cortex family of processors
  - Cortex-A#: "A" for application
    - These products support a memory management unit (MMU)
    - Excellent for operating systems
  - Cortex-R#: "R" for real time
    - These products support a memory protection unit (MPU)
    - Better determinism than an MMU
  - Cortex-M#: "M" for embedded microcontroller
- Some products are implemented differently but use the same ARM architecture
  - Cortex-A8 and Cortex-A9 processors
- The ARM Cortex-A9 processor implements the ARMv7-A architecture
  - ARMv7 is the ARM Instruction Set Architecture (ISA)
    - Thumb instructions: 16 bits; Thumb-2 instructions: a mix of 16 and 32 bits
    - NEON: ARM's Single Instruction Multiple Data (SIMD) instructions
  - ARMv7-A: application set that includes support for a Memory Management Unit (MMU)
  - ARMv7-R: real-time set that includes support for a Memory Protection Unit (MPU)
  - ARMv7-M: microcontroller set, which is the smallest set
- ARM Advanced Microcontroller Bus Architecture (AMBA®) protocol
  - AXI3: third-generation ARM interface
  - AXI4: adds to the existing AXI definition (extended bursts, subsets)
- Cortex is the new family of processors
  - The ARM family is the older generation; Cortex is current; MMUs in Cortex processors and MPUs in ARM
Application Processing Unit (APU)
- Heart of the PS
- Tightly coupled processors and sub-components for maximum performance
- Tied to other PS components and the PL via the PS interconnect
- Dual ARM® Cortex™-A9 MPCore with NEON extensions
  - Up to 800 MHz operation
  - 2.5 DMIPS/MHz per core
  - Separate 32 KB instruction and data caches
- Snoop control unit
  - L1 cache snoop control
    - Accelerator coherency port
- Level 2 cache and controller
  - Shared 512 KB cache with parity
- Introduction to NEON
  - NEON is the ARM codename for the vector processing unit
    - Provides multimedia and signal processing support
  - FPU is the floating-point unit extension to NEON
    - Both NEON and FPU share a single set of registers
  - NEON technology is a wide single instruction, multiple data (SIMD) parallel and co-processing architecture
    - 32 registers, 64 bits wide (dual view as 16 registers, 128 bits wide)
    - Data types can be: signed/unsigned 8-bit, 16-bit, 32-bit, 64-bit, or 32-bit float
- L1 cache features
  - Separate instruction and data caches for each processor
  - Caches are four-way set associative and write-back
  - Non-lockable
  - Cache line length of eight words
  - On a cache miss, the cache is filled critical word first, followed by the next word in sequence
- L2 cache features
  - 512 KB of RAM built into the SCU
    - Latency of 25 CPU cycles
    - Unified instruction and data cache
  - Fixed, 256-bit (32-byte) cache line size
  - Support for per-master way lockdown between multiple CPUs
  - Eight-way set associative
  - Two AXI interfaces
    - One to the DDR controller
    - One to the programmable logic master (to peripherals)
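The cache sizes, associativities and line lengths listed above fix how an address is divided into tag, set index and line offset. The following sketch works this out for both cache levels, assuming 32-bit physical addresses, 4-byte words (so an eight-word line is 32 bytes) and a 256-bit L2 line of 32 bytes; `cache_geometry` is an illustrative helper, not part of any Xilinx or ARM API.

```python
def cache_geometry(size_bytes, ways, line_bytes, addr_bits=32):
    """Return the number of sets and the address field widths of a set-associative cache."""
    sets = size_bytes // (ways * line_bytes)
    offset_bits = line_bytes.bit_length() - 1   # log2(line size), line size is a power of two
    index_bits = sets.bit_length() - 1          # log2(number of sets)
    tag_bits = addr_bits - index_bits - offset_bits
    return {"sets": sets, "offset_bits": offset_bits,
            "index_bits": index_bits, "tag_bits": tag_bits}


# L1: 32 KB, four-way, eight-word (32-byte) lines.
L1 = cache_geometry(32 * 1024, ways=4, line_bytes=32)
# L2: 512 KB, eight-way, 256-bit (32-byte) lines.
L2 = cache_geometry(512 * 1024, ways=8, line_bytes=32)

if __name__ == "__main__":
    print("L1:", L1)
    print("L2:", L2)
```

Under these assumptions the L1 caches have 256 sets and the L2 cache has 2048 sets, so the L2 needs three more index bits and three fewer tag bits than an L1 cache.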
- APU sub-components
  - General interrupt controller (GIC)
  - On-chip memory (OCM): RAM and boot ROM
  - Central DMA (eight channels)
  - Device configuration (DEVCFG)
  - Private watchdog timer and timer for each CPU
  - System watchdog and triple timer counters shared between CPUs
  - ARM CoreSight debug technology
- Snoop Control Unit (SCU)
  - Shares and arbitrates functions between the two processor cores
    - Data cache coherency between the processors
    - Initiates L2 AXI memory access
    - Arbitrates between the processors requesting L2 accesses
    - Manages ACP accesses
    - A second master port with programmable address filtering between OCM and L2 memory support
Introduction - RAM
[Figure: memory hierarchy. On-chip components: control, datapath, register file, instruction and data caches, ITLB and DTLB; then second-level cache (SRAM), eDRAM, main memory (DRAM) and secondary memory (disk)]
From the fastest level to the slowest:
  Speed (ns):    0.1's   1's   10's    100's   1,000's
  Size (bytes):  100's   K's   10K's   M's     T's
  Cost:          highest at the fast end, lowest at the slow end
- By taking advantage of the principle of locality:
  - Present the user with as much memory as is available in the cheapest technology.
  - Provide access at the speed offered by the fastest technology.
[Figure: classification of memories]
- Read-write memory
  - Random access: SRAM, DRAM
  - Non-random access: FIFO, LIFO, shift register, CAM
- Non-volatile read-write memory: EPROM, E2PROM, FLASH
- Read-only memory: mask-programmed, programmable (PROM)
[Figure: growth in DRAM chip capacity]
- Random access:
  - "Random" is good: access time is the same for all locations
  - DRAM: Dynamic Random Access Memory
    - High density, low power, cheap, slow
    - Dynamic: needs to be "refreshed" regularly
  - SRAM: Static Random Access Memory
    - Low density, high power, expensive, fast
    - Static: content lasts "forever" (until power is lost)
- "Not-so-random" access technology:
  - Access time varies from location to location and from time to time
  - Examples: disk, CD-ROM
- Sequential access technology: access time is linear in location (e.g., tape)
- Performance of main memory:
  - Latency: cache miss penalty
    - Access time: time between a request and the arrival of the word
    - Cycle time: time between requests
  - Bandwidth: I/O and large-block miss penalty (L2)
- Main memory is DRAM (Dynamic Random Access Memory)
  - Dynamic, since it needs to be refreshed periodically (8 ms)
  - Addresses are divided into two halves (memory as a 2D matrix):
    - RAS, or Row Address Strobe
    - CAS, or Column Address Strobe
- Cache uses SRAM (Static Random Access Memory)
  - No refresh (6 transistors/bit vs. 1 transistor)
  - Size: DRAM/SRAM ratio of 4 to 8
  - Cost and cycle time: SRAM/DRAM ratio of 8 to 16
1D Memory Architecture
[Figure: n words of m bits each (Word 0 to Word n-1), built from storage cells, with select signals S0 to Sn-1 and a shared input/output]
- n words → n select signals
[Figure: the same array with a decoder that drives S0 to Sn-1 from the address bits A0 to Ak-1]
- A decoder reduces the number of inputs: k = log2 n
2D Memory Architecture
[Figure: array of storage (RAM) cells on word lines and bit lines, 2^(k-j) rows by m·2^j columns; the address bits A0 to Ak-1 are split at bit j into a row address for the row decoder and a column address for the column decoder; sense amplifiers and read/write circuits; input/output of m bits]
- Sense amplifiers: amplify the bit-line swing
- Column decoder: selects the appropriate word from the memory row
3D Memory Architecture
[Figure: memory divided into blocks, addressed by row address, column address and block address; input/output of m bits]
Random Access Memory (RAM). SRAM cell.
- Basic building block: SRAM cell
  - Holds one bit of information, like a latch
  - The cross-coupled inverters are often referred to as a latch
  - The circuit uses positive feedback
- 6T SRAM cell
  - Used in most commercial chips
  - Data stored in cross-coupled inverters
- Read:
  - Precharge bit, bit_b
  - Raise wordline
- Write:
  - Drive data onto bit, bit_b
  - Raise wordline
- 12-transistor (12T) SRAM cell
  - Uses a simple latch connected to the bitline
[Figure: 12T cell with bit, write, write_b, read and read_b signals]
[Figure: 6T cell with bit, bit_b and word lines]
SRAM Read
- Precharge both bitlines high
- Then turn on the wordline
- One of the two bitlines will be pulled down by the cell
- Example: A = 0, A_b = 1
  - bit discharges, bit_b stays high
  - But A bumps up slightly
- Read stability
  - A must not flip
  - N1 >> N2
[Figure: 6T cell schematic (transistors N1 to N4, P1, P2; nodes A and A_b; lines bit, bit_b and word) and read waveforms of word, bit, bit_b, A and A_b, from 0.0 to 1.5 on the vertical axis over 0 to 600 ps]
SRAM Write
- Drive one bitline high, the other low
- Then turn on the wordline
- Bitlines overpower the cell with the new value
- Example: A = 0, A_b = 1, bit = 1, bit_b = 0
  - Force A_b low, then A rises high
- Writability
  - Must overpower the feedback inverter
  - N2 >> P1
[Figure: 6T cell schematic (transistors N1 to N4, P1, P2; nodes A and A_b; lines bit, bit_b and word) and write waveforms of word, A, A_b and bit_b, from 0.0 to 1.5 on the vertical axis over 0 to 700 ps]
Decoders
- An n:2^n decoder consists of 2^n n-input AND gates
  - One needed for each row of memory
  - Build AND from NAND or NOR gates
[Figure: 2:4 decoder (A1 A0 to word0 through word3) and 4:16 decoder (A3 A2 A1 A0 to word0 through word15)]
Pre-decoding
[Figure: 4:16 decoder built from predecoders; the predecoders drive 1-of-4 hot predecoded lines that are combined to generate word0 through word15]
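The decoder and pre-decoding ideas can be checked with a small behavioural model. The sketch below is an illustration only: `decode` models an n:2^n decoder as a one-hot list, and `predecoded_decode` assumes the 4:16 case is split into two 2:4 predecoders whose 1-of-4 hot outputs are combined by sixteen 2-input AND gates.

```python
def decode(addr, n):
    """n:2^n decoder: returns a one-hot list of 2**n word lines."""
    return [int(i == addr) for i in range(2 ** n)]


def predecoded_decode(addr):
    """4:16 decoder built from two 2:4 predecoders (assumed split: A1 A0 and A3 A2)."""
    low = decode(addr & 0b11, 2)           # 1-of-4 hot lines from A1 A0
    high = decode((addr >> 2) & 0b11, 2)   # 1-of-4 hot lines from A3 A2
    # Each word line is a 2-input AND of one line from each predecoder.
    return [high[i >> 2] & low[i & 0b11] for i in range(16)]


if __name__ == "__main__":
    for a in range(16):
        assert predecoded_decode(a) == decode(a, 4)
    print(predecoded_decode(5))
```

The benefit in hardware is that the wide n-input gates are replaced by small gates shared across many word lines; the model only confirms that both structures select the same row.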
- Column circuitry
  - Bitline conditioning
  - Sense amplifiers
  - Column multiplexing
- Precharge bitlines high before reads
- Equalize bitlines to minimize the voltage difference when using sense amplifiers
[Figure: bitline conditioning circuits on bit and bit_b, clocked by φ]
[Figure: sense amplifier with sense and sense_b nodes, sense_clk, isolation transistors and regenerative feedback]
[Figure: column multiplexing: address bits A1 A0 select one of B0 to B3 onto output Y]
- Example: UltraSparc 512 KB cache
  - 4 subarrays of 128 KB
  - Each has 16 banks of 8 KB
  - 256 rows x 256 columns per bank
  - 60% subarray area efficiency
  - Also space for tags and control
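The UltraSparc figures are mutually consistent, which a few lines of arithmetic confirm. This is only a check of the numbers on the slide, assuming one bit per row/column intersection; the variable names are illustrative.

```python
ROWS, COLS = 256, 256        # cells per bank, one bit per cell (assumption)
BANKS_PER_SUBARRAY = 16
SUBARRAYS = 4

bank_bytes = ROWS * COLS // 8                     # 65,536 bits = 8 KB
subarray_bytes = bank_bytes * BANKS_PER_SUBARRAY  # 128 KB
cache_bytes = subarray_bytes * SUBARRAYS          # 512 KB

if __name__ == "__main__":
    print(bank_bytes // 1024, "KB per bank")
    print(subarray_bytes // 1024, "KB per subarray")
    print(cache_bytes // 1024, "KB in total")
```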
Random Access Memory (RAM). DRAM cell.
- A 1T DRAM cell uses a capacitor (Cc) to store data temporarily. The data must be refreshed periodically to prevent information loss, and in most DRAMs the stored data is lost during the read cycle.
- Because of the leakage currents of MA, the data would eventually be corrupted, which is why it needs to be refreshed.
[Figure: 1T DRAM cell storing a '0' and storing a '1']
- DRAM subarray (256 words x 512 bits)
[Figure: sense amplifier, 1T DRAM cell read operation, bitline conditioning and column circuitry]
[Figure: SRAM timing; DRAM timing with multiplexed addressing; detailed DRAM timing with multiplexed addressing]
- DRAM read timing. Every DRAM access begins with:
  - The assertion of RAS_L
  - Two ways to read: early or late relative to CAS
    - Early read cycle: OE_L asserted before CAS_L
    - Late read cycle: OE_L asserted after CAS_L
[Figure: 256K x 8 DRAM with 9 address pins (A), 8 data pins (D) and control inputs RAS_L, CAS_L, WE_L and OE_L]
[Figure: read timing diagram of RAS_L, CAS_L, A, WE_L, OE_L and D. A carries the row address and then the column address, with junk in between; D goes from high-Z to data out; annotated with read access time, output enable delay and DRAM read cycle time]
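The 256K x 8 DRAM on this slide has only 9 address pins because the 18-bit address is sent in two halves, first the row (latched by RAS_L) and then the column (latched by CAS_L). The sketch below illustrates that split. Which half of the address is used as the row is an assumption here (the upper 9 bits), and `split_address` and `read_sequence` are hypothetical names.

```python
ADDR_PINS = 9                 # multiplexed address pins on the 256K x 8 part
DEPTH = 256 * 1024            # 2**18 locations, so 18 address bits in total


def split_address(addr):
    """Split an 18-bit address into (row, column) halves of 9 bits each."""
    if not 0 <= addr < DEPTH:
        raise ValueError("address out of range for a 256K device")
    row = addr >> ADDR_PINS                 # assumed: upper half is the row
    col = addr & ((1 << ADDR_PINS) - 1)     # lower half is the column
    return row, col


def read_sequence(addr):
    """Order in which the two halves appear on the address pins."""
    row, col = split_address(addr)
    return [("RAS_L", row), ("CAS_L", col)]


if __name__ == "__main__":
    print(read_sequence(0x00201))
```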
- DRAM write timing. Every DRAM access begins with:
  - The assertion of RAS_L
  - Two ways to write: early or late relative to CAS
    - Early write cycle: WE_L asserted before CAS_L
    - Late write cycle: WE_L asserted after CAS_L
[Figure: 256K x 8 DRAM with 9 address pins (A), 8 data pins (D) and control inputs RAS_L, CAS_L, WE_L and OE_L]
[Figure: write timing diagram of RAS_L, CAS_L, A, OE_L, WE_L and D. A carries the row address and then the column address, with junk in between; D carries data in, with junk around it; annotated with write access time and DRAM write cycle time]
DOUBLE DATA RATE (DDR)
[Figure: functional block diagram of an 8M x 16-bit SDRAM]
[Figure: prefetch and burst length]
- Double-Data Rate (DDR) DRAM transfers data on both the rising and the falling edge of the clock
- Command frequency does not change
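Because DDR transfers data on both clock edges while the command clock stays the same, the peak data rate doubles for a given clock. A minimal sketch of that arithmetic follows; the 100 MHz clock is a made-up example value, and the 16-bit width simply echoes the 8M x 16-bit device in the block diagram.

```python
def peak_bandwidth_mb_s(clock_mhz, bus_bits, double_data_rate):
    """Peak transfer rate in MB/s (10**6 bytes per second)."""
    transfers_per_clock = 2 if double_data_rate else 1
    return clock_mhz * transfers_per_clock * bus_bits / 8


if __name__ == "__main__":
    # Hypothetical 100 MHz clock on a 16-bit data bus.
    print("SDR:", peak_bandwidth_mb_s(100, 16, double_data_rate=False), "MB/s")
    print("DDR:", peak_bandwidth_mb_s(100, 16, double_data_rate=True), "MB/s")
```

These are peak figures only; refresh, row activation and precharge reduce the sustained rate.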
Read Only Memory (ROM).
- A ROM is a nonvolatile structure: its state is retained indefinitely, even without power.
- A mask-programmed ROM is configured by the presence ('1') or absence ('0') of a transistor or contact.
ROM Array
[Figure: ROM array with a 2:4 decoder driven by A1 A0, six output columns labelled Y0 to Y5, and weak pseudo-nMOS pullups]
  Word 0: 010101
  Word 1: 011001
  Word 2: 100101
  Word 3: 101010
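Functionally, the ROM array is a fixed lookup table selected by a 2:4 decoder. The sketch below models only that behaviour, using the four words listed on the slide; the bit strings are kept exactly as written there, since the slide does not make the column order explicit, and the function names are illustrative.

```python
# Contents copied from the slide, one string per word line.
ROM_WORDS = ["010101", "011001", "100101", "101010"]


def decode_2to4(a1, a0):
    """2:4 decoder: one-hot word-line select from address bits A1 A0."""
    index = (a1 << 1) | a0
    return [int(i == index) for i in range(4)]


def read_rom(a1, a0):
    """Return the word whose word line is selected by the decoder."""
    wordlines = decode_2to4(a1, a0)
    for line, word in zip(wordlines, ROM_WORDS):
        if line:
            return word


if __name__ == "__main__":
    for a1 in (0, 1):
        for a0 in (0, 1):
            print(f"A1A0={a1}{a0} -> {read_rom(a1, a0)}")
```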
Read Only Memory. Programmable ROM.
- Programmable ROMs
  - A PROM uses fuses to store the information. It is a one-time programmable memory.
  - The user typically configures the ROM in a specialized PROM programmer before putting it in the system.
- Erasable Programmable ROMs (EPROM)
  - Use a floating gate between the control gate and the channel.
  - EPROM, EEPROM, Flash
[Figure: EPROM floating-gate transistor: source and drain n+ regions in a p substrate, floating gate and gate separated by oxide layers of thickness tox; schematic symbol with G, S and D]
[Figure: EEPROM floating-gate transistor: source, drain, gate and floating gate on a p substrate, with oxide thicknesses of 20 – 30 nm and 10 nm]
[Figure: floating-gate programming in three panels]
- Avalanche injection (labels: 20 V, 10 V → 5 V, 20 V)
- Removing the programming voltage leaves charge trapped (labels: 0 V, -5 V, 0 V)
- Programming results in a higher Vt (labels: 5 V, -2.5 V, 5 V)
[Figure: EEPROM 2T cell with word line WL, bit line BL and VDD]
[Figure: ETOX 1T cell (Flash)]
NAND FLASH
- 64K cells/page, 64 cells/line
  - 256 pages/block
- 4 bits/cell (multilevel Vt)
- 2K blocks/plane, 2 planes
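Content Addressable Memory (CAM)
- Extension of ordinary memory (e.g. SRAM)
  - Read and write memory as usual.
  - Also match to see which words contain a key.
[Figure: CAM block with inputs adr, data/key, read and write, and a match output]
[Figure: CAM array: row decoder driven by the address, column circuitry for data and read/write, CAM cells on match lines match0 to match3, with clk, a weak pullup and a miss output]
[Figure: CAM cell: bit, bit_b and word lines, internal nodes cell and cell_b, and a match line]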
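The difference between a CAM and an ordinary memory is the extra match operation: every word is compared with the key at once and one match line per word reports the result. The class below is a behavioural sketch of that interface only (adr, data/key, read, write, match); it says nothing about the circuit, and the class and method names are illustrative.

```python
class CAM:
    """Behavioural model of a content-addressable memory."""

    def __init__(self, words):
        self.cells = [None] * words      # None marks a word never written

    def write(self, adr, data):
        """Ordinary write, as in an SRAM."""
        self.cells[adr] = data

    def read(self, adr):
        """Ordinary read, as in an SRAM."""
        return self.cells[adr]

    def match(self, key):
        """Return one match bit per word: 1 where the stored word equals the key."""
        return [int(cell is not None and cell == key) for cell in self.cells]


if __name__ == "__main__":
    cam = CAM(4)
    cam.write(0, 0x1A)
    cam.write(2, 0x3C)
    cam.write(3, 0x1A)
    print(cam.match(0x1A))   # match lines for words 0 to 3
```

In a cache, the match lines of a tag CAM select the data row directly, which is the arrangement shown in the cache figure of this section.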
- CAM in a memory cache
[Figure: cache built from a CAM array and an SRAM array: address decoder, input drivers, hit logic, and sense amps / input drivers; signals: address, tag, hit, data, R/W]
Introduction to Buses
- What is a bus?
  - It is a simplified way for many devices to communicate with each other.
  - It looks like a "highway" for information.
  - Actually, it is more like a "basket" that they all share.
[Figure: CPU, keyboard and display attached to a shared bus]
- Suppose the CPU needs to check whether the user typed anything.
- The CPU puts "Keyboard, did the user type anything?" (represented in some way) on the bus.
- Each device (except the CPU) is a state machine that constantly checks what is on the bus.
- The keyboard notices that its name is on the bus and reads the information. Other devices ignore it.
- At some point, the CPU reads the bus and gets the keyboard's response: "CPU: Yes, user typed 'a'."
Buses 101
- A bus is a multiwire path on which related information is delivered
  - Address, data, and control buses
- Processor and peripherals communicate through buses
- Peripherals may be classified as:
  - Arbiter, master, slave, or master/slave (bridge)
[Figure: masters, slaves, a master/slave bridge and arbiters connected by buses]
- Address bus:
  - The CPU reads and writes data in memory by addressing a unique location. It outputs the location of the data (the address) on the address bus, and memory uses this address to access the proper data.
  - Each I/O device (monitor, keypad, etc.) also has a unique address or a range of addresses. When accessing an I/O device, the CPU places its address on the address bus; each device detects whether the address is its own and acts accordingly.
  - Devices always receive the address from the CPU; the CPU never reads the address bus (it is never addressed).
- Data bus:
  - When the CPU fetches data from memory, it first outputs the address on the address bus; the memory then outputs the data onto the data bus, and the CPU reads the data from the data bus.
  - When writing data to memory, the CPU first outputs the address on the address bus and then outputs the data onto the data bus; memory then reads the data and stores it at the proper location.
- Control bus:
  - The address and data buses each consist of n lines that combine to transmit one n-bit value; the control bus is a collection of individual control signals, most of them unidirectional.
  - These signals indicate whether data is to be read into or written out of the CPU, whether the CPU is accessing memory or an I/O device, and whether the I/O device or memory is ready for the data transfer.
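The address-bus behaviour described above, where every device watches the address and only the owner responds, can be sketched as follows. The device names, base addresses and sizes are invented for illustration, and the data and control buses are reduced to plain function calls.

```python
class Device:
    """A bus slave that owns a contiguous address range (hypothetical model)."""

    def __init__(self, name, base, size):
        self.name, self.base, self.size = name, base, size
        self.storage = {}

    def selected(self, address):
        """Address decode: is this address inside my range?"""
        return self.base <= address < self.base + self.size

    def write(self, address, data):
        self.storage[address - self.base] = data

    def read(self, address):
        return self.storage.get(address - self.base, 0)


def bus_write(devices, address, data):
    """CPU drives address and data; only the selected device stores the value."""
    for dev in devices:
        if dev.selected(address):
            dev.write(address, data)
            return dev.name
    raise LookupError(f"no device at address {address:#x}")


def bus_read(devices, address):
    """CPU drives the address; the selected device drives the data bus."""
    for dev in devices:
        if dev.selected(address):
            return dev.read(address)
    raise LookupError(f"no device at address {address:#x}")


if __name__ == "__main__":
    # Invented address map for the example.
    system = [Device("memory", 0x0000, 0x8000), Device("keypad", 0x9000, 0x10)]
    bus_write(system, 0x0040, 0xAB)
    print(hex(bus_read(system, 0x0040)))
```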
- Bus masters have the ability to initiate a bus transaction
- Bus slaves can only respond to a request
- Bus arbitration is a three-step process:
  - A device requesting to become a bus master asserts a bus request signal
  - The arbiter continuously monitors the requests and outputs an individual grant signal to each master according to the priority scheme and the state of the other master requests at that time
  - The requesting device samples its grant signal until it is granted access. When the current bus master releases the bus, the new master initiates a data transfer between itself and a slave
- Arbitration mechanisms
  - Fixed priority, round-robin, hybrid
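Two of the arbitration mechanisms named above can be sketched in a few lines. This is a behavioural illustration, not the PLB or OPB arbiter: requests are a list of booleans, one per master, index 0 is assumed to be the highest fixed priority, and the names are hypothetical.

```python
def fixed_priority(requests):
    """Grant the requesting master with the lowest index (assumed highest priority)."""
    for master, requesting in enumerate(requests):
        if requesting:
            return master
    return None                       # bus stays idle


class RoundRobinArbiter:
    """Grant masters in rotation, starting after the last master that was granted."""

    def __init__(self, masters):
        self.masters = masters
        self.last = masters - 1       # so that master 0 is considered first

    def grant(self, requests):
        for step in range(1, self.masters + 1):
            master = (self.last + step) % self.masters
            if requests[master]:
                self.last = master
                return master
        return None


if __name__ == "__main__":
    rr = RoundRobinArbiter(3)
    everyone = [True, True, True]
    print([fixed_priority(everyone) for _ in range(4)])   # always master 0
    print([rr.grant(everyone) for _ in range(4)])         # rotates through the masters
```

With every master requesting all the time, fixed priority starves the lower-priority masters while round-robin shares the bus evenly; a hybrid scheme mixes the two.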
- The IBM CoreConnect bus architecture standard provides three buses for interconnecting cores, library macros, and custom logic:
  - Processor Local Bus (PLB)
  - On-Chip Peripheral Bus (OPB)
  - Device Control Register (DCR) bus
- IBM offers a no-fee, royalty-free CoreConnect bus architecture license
  - Licensees receive the PLB arbiter, OPB arbiter, and PLB/OPB bridge designs along with bus-model toolkits and bus-functional compilers for the PLB, OPB, and DCR buses
  - Required only if you create your own CoreConnect bus architecture peripheral or you are using the Bus Functional Model (BFM)
- The MicroBlaze processor core is organized as a Harvard architecture
  - Separate buses for data and instructions
[Figure: MicroBlaze bus interfaces. LMB buses (ILMB, DLMB) to local memory (BRAM); PLB (IPLB, DPLB) with a PLB arbiter and peripherals: IIC, GPIO, UART, Ethernet, Timer/PWM, interrupt controller; CacheLinks (IXCL, DXCL) to a multi-port memory controller; FSL to a co-processor]
PCI Bus
- Processor independence
- Low power consumption
- Burst use for all read and write transfers
- Bus speed up to 66 MHz
- 64-bit bus width
- Low pin count (PCI target: 47 pins, PCI initiator: 49 pins)
- Concurrent bus operation
- Bus master support
- Hidden bus arbitration
- Auto configuration
Key Terms
- Initiator
  - Or master
  - Owns the bus and initiates the data transfer
  - Every initiator must also be a target
- Target
  - Or slave
  - Target of the data transfer (read or write)
- Agent
  - Any initiator/target or target on the PCI bus
Intro. to PCI Bus Operation.
PCI Bus Clock
- All actions are synchronized to the PCI clock
- The clock may be anywhere from 0 MHz to 33 MHz, and all PCI devices must support this range
- Revision 2.1 of the specification defines speeds up to 66 MHz
Address Phase
- The initiator identifies the target device and the type of transaction at the same time
- The initiator asserts the FRAME# signal
- Every PCI target device latches the address and decodes it
Data Phase
- The number of data bytes to be transferred is determined by the number of Command/Byte Enable signals asserted by the initiator
- Both the initiator and the target must be ready to complete a data phase
- IRDY# and TRDY# are used for this
Transaction Duration
- Indicated by asserting FRAME# at the start of the address phase and keeping it asserted until the final data phase
Transaction completion and return of the bus to the idle state
- The final data phase is indicated by deasserting FRAME# while asserting IRDY#
- When the last data transfer has completed, the initiator returns the PCI bus to the idle state by deasserting IRDY#
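The rule that a data phase completes only when both sides are ready can be expressed as a one-line check per clock. The sketch below is a simplified model: IRDY# and TRDY# are active-low on the real bus, but here each sample is a boolean meaning "asserted", and the waveforms are invented for the example.

```python
def completed_data_phases(irdy_asserted, trdy_asserted):
    """Return the clock indices on which a data word is actually transferred.

    A transfer happens on a clock only if the initiator (IRDY#) and the
    target (TRDY#) are both asserted; otherwise that clock is a wait state.
    """
    return [clk
            for clk, (irdy, trdy) in enumerate(zip(irdy_asserted, trdy_asserted))
            if irdy and trdy]


if __name__ == "__main__":
    # Invented example: the target is slow on clock 0, the initiator on clock 2.
    irdy = [True, True, False, True]
    trdy = [False, True, True, True]
    print(completed_data_phases(irdy, trdy))
```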
[Figure: PCI bus transfer modes]
- PCI bus lines (required)
  - System lines
    - Including clock and reset
  - Address and data
    - 32 time-multiplexed lines for address/data
    - Interrupt and validate lines
  - Interface control
  - Arbitration
    - Not shared
    - Direct connection to the PCI bus arbiter
  - Error lines
- PCI bus lines (optional)
  - Interrupt lines
    - Not shared
  - Cache support
  - 64-bit bus extension
    - Additional 32 lines
    - Time multiplexed
    - 2 lines to enable devices to agree to use 64-bit transfer
  - JTAG/boundary scan
    - For testing procedures
- PCI commands
  - Transaction between initiator (master) and target
  - Master claims the bus
  - Determine the type of transaction
    - e.g. I/O read/write
  - Address phase
  - One or more data phases
[Figure: PCI bus transaction start. Timing diagram over CLK cycles 1 to 9 showing GNT#, FRAME#, AD (address), C/BE# (command) and IRDY#, starting from bus idle]
[Figure: PCI bus read. Timing diagram of CLK, FRAME#, AD (ADDRESS, DATA-1, DATA-2, DATA-3), C/BE# (bus command, then byte enables), IRDY#, TRDY# and DEVSEL#]
[Figure: PCI bus target read transaction]
[Figure: PCI bus write]
[Figure: PCI bus target write transaction]
- PCI bus arbitration
[Figure: PCI bus arbitration timing]
PCI Address Space
- A PCI target can implement up to three different types of address spaces
  - Configuration space
    - Stores basic information about the device
    - Allows the central resource or O/S to program a device with operational settings
  - I/O space
    - Used mainly with PC peripherals and not much else
  - Memory space
    - Used for just about everything else
- Configuration space
  - Contains basic device information, e.g., vendor or class of device
  - Also permits Plug-N-Play
    - Base address registers allow an agent to be mapped dynamically into memory or I/O space
    - A programmable interrupt-line setting allows a software driver to program a PC card with an IRQ upon power-up (without jumpers!)
- I/O space
  - This space is where basic PC peripherals (keyboard, serial port, etc.) are mapped
  - The PCI spec allows an agent to request 4 bytes to 2 GB of I/O space
- Memory space
  - This space is used by most everything else; it is the general-purpose address space
    - The PCI spec recommends that a device use memory space, even if it is a peripheral
  - An agent can request between 16 bytes and 2 GB of memory space
    - The PCI spec recommends that an agent use at least 4 kB of memory space, to reduce the width of the agent's address decoder
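The recommendation to request at least 4 kB follows from how many address bits the agent has to compare. The sketch below assumes 32-bit addresses and a naturally aligned, power-of-two region, in which case only the bits above the region size need decoding; `decoder_width` is an illustrative helper.

```python
def decoder_width(region_bytes, addr_bits=32):
    """Number of upper address bits an agent must compare to detect its region."""
    if region_bytes <= 0 or region_bytes & (region_bytes - 1):
        raise ValueError("region size must be a power of two")
    offset_bits = region_bytes.bit_length() - 1   # log2(region size)
    return addr_bits - offset_bits


if __name__ == "__main__":
    for size in (16, 4 * 1024, 2 * 1024 ** 3):
        print(f"{size:>10} bytes -> compare {decoder_width(size)} address bits")
```

Under these assumptions a 16-byte region needs a 28-bit comparator, while a 4 kB region needs only 20 bits.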
PCI Express Bus
- PCI Express introduction
  - The PCI Express architecture is a high-performance I/O interconnect for peripherals in computing and communication platforms
  - Evolved from the PCI and PCI-X architectures
  - PCI Express is a serial point-to-point interconnect between two devices. Performance scales with the number of signal lanes implemented on the PCI Express interconnect
  - Implements a packet-based protocol for information transfer
- PCI Express features
  - Point-to-point connection
  - Serial bus means fewer pins
  - Scalable: x1, x2, x4, x8, x12, x16, x32 (2.5 Gb/s per lane)
  - Dual simplex connection
  - 2.5 GT/s in each direction
  - Packet-based transaction protocol
[Figure: x1 lane between Device A and Device B. Each direction has its own clock and carries packets made of frame, sequence number, packet request (data), CRC and frame]
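The lane counts and the 2.5 GT/s signalling rate translate into usable bandwidth once line coding is taken into account. The sketch below assumes the 8b/10b encoding used by first-generation PCI Express (10 bits on the wire per data byte), a detail not stated on the slide, and ignores packet overhead such as framing, sequence numbers and CRC.

```python
TRANSFERS_PER_S = 2_500_000_000   # 2.5 GT/s per lane, per direction
ENCODED_BITS_PER_BYTE = 10        # 8b/10b line coding (assumption: Gen1 link)


def link_bytes_per_s(lanes):
    """Raw data bandwidth of a link in one direction, in bytes per second."""
    return lanes * TRANSFERS_PER_S // ENCODED_BITS_PER_BYTE


if __name__ == "__main__":
    for lanes in (1, 2, 4, 8, 12, 16, 32):
        print(f"x{lanes}: {link_bytes_per_s(lanes) // 1_000_000} MB/s per direction")
```

Because the connection is dual simplex, the same figure is available in each direction at the same time.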
Transaction Types, Address Spaces
- Requests are translated to one of four transaction types by the Transaction Layer:
  - Memory Read or Memory Write. Used to transfer data from or to a memory-mapped location
  - I/O Read or I/O Write. Used to transfer data from or to an I/O location
  - Configuration Read/Write. Used to discover device capabilities, program features, and check status in the 4 KB PCI Express configuration space
  - Messages. Handled like posted writes. Used for event signaling and general-purpose messaging
[Figure: programmed I/O transaction]
[Figure: DMA transaction]
[Figure: peer-to-peer transaction]
[Figure: PCI Express device layers]
PLB Bus
- Connection infrastructure for high-bandwidth master and slave devices
- Fully synchronous to one clock
- Centralized bus arbitration: PLB arbiter
- 32- or 64-bit address (the upper 32 bits are connected to GND)
- 32-, 64-, or 128-bit data bus
- Selectable shared bus or point-to-point interconnect topology
  - Point-to-point optimization available for a 1 master, 1 slave configuration
  - Point-to-point topology supports 0-cycle latency via arbitration removal
- Selectable address pipelining support (2-level only)
- Dynamic master request priority-based arbitration
- Vectored resets and address/qualifier registers
PLB Bus - Interconnect
- One to 16 PLB masters, each connecting all of its signals to the PLB arbiter
- The PLB arbiter multiplexes signals from the masters onto a shared bus to which all the inputs of the slaves are connected
- One to n PLB slaves OR their outputs together to drive a shared bus back to the PLB arbiter
- The PLB arbiter handles bus arbitration and the movement of data and control signals between masters and slaves
PLB Bus - Bridge
- A PLB-to-PLB bridge is required when two PLB segments are connected
  - Different bus speed
  - Different bus width
- The bridge translates PLB transactions on one side into the PLB transactions of the other side
- The bridge functions as a slave on one PLB side and a master on the other PLB side
- For a typical system with two PLB segments, one bridge is necessary for transactions originating from the processor
  - A second bridge is required if a peripheral on the other side is master capable and wants to address a peripheral on the processor side
OPB Bus
- The OPB bus decouples lower-bandwidth devices from the PLB
- It is a less complex protocol than PLB
  - No split transaction or address pipelining capability
- Centralized bus arbitration: OPB arbiter
- Connection infrastructure for the master and slave peripheral devices
- The OPB bus is designed to alleviate system performance bottlenecks by reducing capacitive loading on the PLB
  - Fully synchronous to one clock
  - Shared 32-bit address bus, shared 32-bit data bus
  - Supports single-cycle data transfers between the master and the slaves
  - Supports multiple masters, determined by arbitration implementation
  - The bridge function can be the master on the PLB or OPB
- Supports 16 masters and an unlimited number of slaves (limited by the expected performance)
- The OPB arbiter receives bus requests from the OPB masters and grants the bus to one of them
  - Fixed and dynamic (LRU) priorities
- Bus logic is implemented with AND-OR logic. Inactive devices drive zeros
- Read and write data buses can be separated to reduce loading on the OPB_DBus signal
AXI is Part of AMBA: Advanced Microcontroller Bus Architecture
[Figure: AMBA family. AMBA 3.0 (2003): APB, AHB, AXI, ATB. AMBA 4.0 (just announced): AXI-4 Memory Map, AXI-4 Stream, AXI-4 Lite. Labels: "Same Spec", "Enhancements for FPGAs"]

Interface          | Features                                                          | Similar to
Memory Map / Full  | Traditional address/data burst (single address, multiple data)    | PLBv46, PCI
Streaming          | Data-only, burst                                                  | Local Link / DSP interfaces / FIFO / FSL
Lite               | Traditional address/data, no burst (single address, single data)  | PLBv46-single, OPB
ARM AXI
- AXI is an interface specification, not a bus specification
- AXI Interconnect IP
  - Implementation is not described in the spec
  - Several companies build and sell "AXI interconnect IP"
  - Xilinx is building its own
[Figure: a PLB46 "shared access" bus, with processor and peripherals attached to PLB through an arbiter, next to an AXI system in which AXI masters and AXI slaves attach to an interconnect through AXI interfaces. Arrows indicate the master/slave relationship, not the direction of dataflow]
Basic AXI Transactions
¨ Read address channel
¨ Read data channel
¨ Write address channel
¨ Write data channel
¨ Write response channel
¤ Non-posted write model: there will always be a “write response”
ARM AXI – AXI4
¨ Also called full AXI, AXI Memory Mapped
¨ Single address multiple data
¨ Burst up to 256 data beats
¨ Targeted Xilinx support
[Figures: AXI4 Read; AXI4 Write]
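To make "single address, multiple data" concrete, the sketch below splits a long transfer into legal incrementing bursts. It is an illustration under stated assumptions: the 256-beat limit comes from the slide, while the burst-length field holding "beats minus one" and the rule that a burst must not cross a 4 KB boundary are taken from the AXI specification. The names `axi_burst_t` and `axi4_split` are hypothetical, and the start address is assumed to be aligned to the beat size.

```c
#include <stdint.h>
#include <stddef.h>

#define AXI4_MAX_BEATS 256u   /* from the slide: burst up to 256 data beats */
#define AXI_BOUNDARY   4096u  /* AXI rule: a burst must not cross a 4 KB boundary */

typedef struct {
    uint32_t addr;   /* single start address issued for the burst */
    uint8_t  axlen;  /* burst length field: number of beats minus one */
} axi_burst_t;

/* Splits total_beats beats of beat_bytes each, starting at addr, into bursts.
   Assumes addr is aligned to beat_bytes and beat_bytes is a power of two.
   Returns the number of bursts written to out (at most max_out). */
size_t axi4_split(uint32_t addr, uint32_t total_beats, uint32_t beat_bytes,
                  axi_burst_t *out, size_t max_out)
{
    size_t n = 0;
    while (total_beats > 0 && n < max_out) {
        uint32_t to_boundary = (AXI_BOUNDARY - (addr % AXI_BOUNDARY)) / beat_bytes;
        uint32_t beats = total_beats;
        if (beats > AXI4_MAX_BEATS)
            beats = AXI4_MAX_BEATS;
        if (beats > to_boundary)
            beats = to_boundary;
        out[n].addr = addr;
        out[n].axlen = (uint8_t)(beats - 1u);
        n++;
        addr += beats * beat_bytes;
        total_beats -= beats;
    }
    return n;
}
```

For example, 300 beats of 4 bytes starting at 0x0F00 become a 64-beat burst up to the 4 KB boundary at 0x1000, followed by a 236-beat burst.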
ARM AXI – AXI4-Lite
¨ No burst
¨ Data width 32 or 64 only
¤ Xilinx IP will only support 32 bits
¨ Simple “logic shim” to connect AXI4 master to AXI4-Lite slave
¤ Reflect master’s transaction ID
[Figures: AXI4-Lite Read; AXI4-Lite Write]
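From the software side, a no-burst, single-address, single-data interface looks like plain register access: each load or store to a peripheral register becomes one transaction. The following is a minimal sketch, assuming a 32-bit register file (the width the slide says Xilinx IP supports). The helper names `reg_write32` and `reg_read32` and the offsets used in the usage note are hypothetical, not taken from a Xilinx driver.

```c
#include <stdint.h>

/* Writes one 32-bit register: one address, one data beat.
   offset is a byte offset and is assumed to be a multiple of 4. */
static inline void reg_write32(volatile uint32_t *base, uint32_t offset,
                               uint32_t value)
{
    base[offset / 4u] = value;
}

/* Reads one 32-bit register at the given byte offset. */
static inline uint32_t reg_read32(volatile uint32_t *base, uint32_t offset)
{
    return base[offset / 4u];
}
```

On hardware, `base` would point at the peripheral's base address from the system address map; on a PC the same helpers can be exercised against an ordinary array, for example `reg_write32(fake_regs, 0x04, 0xCAFE)`. The `volatile` qualifier keeps the compiler from merging or dropping accesses that have side effects in hardware.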
ARM AXI – AXI4-Stream
¨ No address channel
¨ Not read and write, always just master to slave
¨ Unlimited burst length
[Figure: AXI4-Stream transfer]
Address Management
¨ Embedded processor design requires you to manage the following:
¤ Address map for the peripherals
¤ Location of the application code in the memory space
n Block RAM
n External memory
¨ Memory requirements for your programs are based on the following:
¤ The amount of memory required for storing the instructions
¤ The amount of memory required for storing the data associated with the program
Address Management (MicroBlaze)
¨ Memory and peripherals
¤ The MicroBlaze processor uses 32-bit addresses
¨ Special addresses
¤ MicroBlaze processors must have writable memory from 0x00000000 through 0x0000004F so that the vectors can be updated by the boot sequence
¤ Each vector consists of two instructions, an IMM followed by a BRAI, so that the branch can address the full memory range
[Figure: MicroBlaze memory map, from 0x0000_0000 to 0xFFFF_FFFF]
0x0000_0000 | Reset address
0x0000_0008 | Exception address
0x0000_0010 | Interrupt address
0x0000_0018 | Break
0x0000_0020 | Hardware exception
0x0000_0028 to 0x0000_004F | Reserved
Above the vectors, up to 0xFFFF_FFFF | LMB memory, PLB memory, peripherals
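The reason each vector needs two instructions is that a single MicroBlaze instruction carries only a 16-bit immediate: IMM supplies the upper 16 bits of the target and BRAI the lower 16 bits. The sketch below builds such a pair for a handler address. The vector offsets come from the memory map above; the two opcode constants are an assumption based on commonly published MicroBlaze encodings and should be checked against the MicroBlaze Processor Reference Guide. The names `mb_vector` and `mb_encode_vector` are hypothetical.

```c
#include <stdint.h>

/* Vector offsets as listed on the slide. */
enum mb_vector {
    MB_VEC_RESET        = 0x00,
    MB_VEC_EXCEPTION    = 0x08,
    MB_VEC_INTERRUPT    = 0x10,
    MB_VEC_BREAK        = 0x18,
    MB_VEC_HW_EXCEPTION = 0x20
};

#define MB_OP_IMM  0xB0000000u  /* assumed encoding of "imm  imm16" */
#define MB_OP_BRAI 0xB8080000u  /* assumed encoding of "brai imm16" */

/* Fills insn[0..1] with the instruction pair that branches to handler. */
void mb_encode_vector(uint32_t handler, uint32_t insn[2])
{
    insn[0] = MB_OP_IMM  | (handler >> 16);      /* upper 16 bits of the target */
    insn[1] = MB_OP_BRAI | (handler & 0xFFFFu);  /* lower 16 bits of the target */
}
```

A boot sequence would write the two words to the vector's offset (for instance `MB_VEC_INTERRUPT`), which is why that region has to be writable.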
Address Management (ARM)
¨ Address Management
¨ All registers for both CPUs are grouped into two contiguous 4 KB pages
¤ Accessed through a dedicated internal bus
¨ Fixed at 0xF8F0_0000 with a register block size of 8 KB
¤ Each CPU uses an offset into this base address
Offset range | Registers
0x0000-0x00FC | SCU registers
0x0100-0x01FF | Interrupt controller interface
0x0200-0x02FF | Global timer
0x0600-0x06FF | Private timers and watchdog timers
0x1000-0x1FFF | Interrupt distributor
0xFFFC_0000
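The base-plus-offset scheme above maps naturally onto a small C header. This is a sketch that only restates the numbers from the slide; the macro and function names are hypothetical and are not the identifiers used in the Xilinx BSP.

```c
#include <stdint.h>

#define CPU_PRIV_BASE        0xF8F00000u  /* fixed base address from the slide */
#define CPU_PRIV_SIZE        0x2000u      /* 8 KB register block (two 4 KB pages) */

/* Offsets of the register groups inside the block (from the slide). */
#define OFF_SCU              0x0000u
#define OFF_INT_CTRL_IFACE   0x0100u
#define OFF_GLOBAL_TIMER     0x0200u
#define OFF_PRIVATE_TIMERS   0x0600u
#define OFF_INT_DISTRIBUTOR  0x1000u

/* Absolute address of a register at a given offset into the block. */
static inline uint32_t cpu_priv_addr(uint32_t offset)
{
    return CPU_PRIV_BASE + offset;
}

/* Returns 1 if addr lies inside the private register block, 0 otherwise. */
static inline int cpu_priv_contains(uint32_t addr)
{
    return addr >= CPU_PRIV_BASE && addr < CPU_PRIV_BASE + CPU_PRIV_SIZE;
}
```

The interrupt distributor at offset 0x1000 occupies the second 4 KB page, so its first register sits at 0xF8F0_1000.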
Address Management
¨ The compiler includes pre-compiled startup and end files when forming the executable
¨ Startup files set up the language and platform environment before your application code executes
¤ Set up vectors as required (reset, interrupt, exception, etc.)
¤ Set up registers (stack pointer, small data anchors, etc.)
¤ Clear the .bss memory region to zero
¤ Invoke language initialization functions, such as C++ constructors
¤ Initialize the hardware sub-system (e.g. initialize profiling timers)
¤ Set up arguments for the main procedure and invoke it
¨ End files include code that must execute after the program ends
¤ Invoke language cleanup functions, such as C++ destructors
¤ De-initialize the hardware sub-system (e.g. clean up the profiling sub-system)
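The order of these steps can be mimicked on a PC. The sketch below is not the real startup code, which is written in assembly and uses linker-provided symbols: here the .bss region, the constructor and destructor tables and the application entry point are all hypothetical stand-ins (`fake_bss`, `ctors`, `dtors`, `app_main`) so that the sequence can be followed and run anywhere.

```c
#include <stddef.h>
#include <string.h>

typedef void (*init_fn)(void);

/* Hypothetical stand-ins for what the linker and toolchain would provide. */
static unsigned char fake_bss[64];
static int ctor_ran;
static int dtor_ran;

static void my_ctor(void) { ctor_ran = 1; }
static void my_dtor(void) { dtor_ran = 1; }

static init_fn ctors[] = { my_ctor };
static init_fn dtors[] = { my_dtor };

/* Stand-in for the application's main(): succeeds only if the
   environment was prepared before it was called. */
static int app_main(int argc, char **argv)
{
    (void)argc;
    (void)argv;
    return (ctor_ran && fake_bss[0] == 0) ? 0 : 1;
}

/* Simulated startup and end sequence, in the order given on the slide. */
int fake_start(void)
{
    size_t i;
    int rc;

    memset(fake_bss, 0, sizeof fake_bss);            /* clear .bss to zero */
    for (i = 0; i < sizeof ctors / sizeof ctors[0]; i++)
        ctors[i]();                                  /* language initialization */
    rc = app_main(0, NULL);                          /* set up arguments, call main */
    for (i = 0; i < sizeof dtors / sizeof dtors[0]; i++)
        dtors[i]();                                  /* end files: language cleanup */
    return rc;
}
```

Vector and stack-pointer setup are left out because they cannot be expressed portably in C.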
Address Management
¨ The crt0.o initialization file is used when the executable runs in standalone mode (no debug)
¨ The C runtime file crt0.o is linked with the user program
¤ Starts at address location 0x0, immediately followed by the user program
¤ Populates reset, interrupt, exception and hardware exception vectors
[Figure: memory layout starting at 0x00000000, with crt0.o followed by the main program]
Address Management
¨ Object File Sections
¨ What is an object file?
¤ An object file is an assembled piece of code
n Machine language: li r31,0 = 0x3BE0 0000
¤ Constant data
¤ There may be references to external objects that are defined elsewhere
¤ This file may contain debugging information
Address Management
Sectional Layout of an Object or an Executable file
.text | Text section
.rodata | Read-only data section
.sdata2 | Small read-only data section (less than eight bytes)
.sbss2 | Small read-only uninitialized data section
.data | Read-write data section
.sdata | Small read-write data section
.sbss | Small uninitialized data section
.bss | Uninitialized data section
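A short C file shows how ordinary declarations end up in these sections. The placement in the comments is what a typical GCC-based embedded toolchain does and is an assumption here: the exact section (in particular the small-data variants) depends on the compiler, its options and the object size. All identifiers are hypothetical.

```c
/* Initialized, read-only data: typically .rodata */
const int lookup_table[4] = { 1, 2, 3, 4 };

/* Initialized, read-write data: typically .data
   (or .sdata when the object is below the small-data threshold) */
int start_value = 5;

/* Uninitialized data: typically .bss (or .sbss when small);
   the startup code clears it to zero before main runs */
int sample_buffer[256];

/* Executable code: .text */
int sum_lookup_table(void)
{
    int sum = 0;
    for (int i = 0; i < 4; i++)
        sum += lookup_table[i];
    return sum;
}
```

Only the initialized objects take space in the stored image; `sample_buffer` needs RAM at run time but no initial contents, which is why .bss is tracked separately when sizing memory.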
Address Management
¨ Linker scripts control the linking process
¤ Map the code and data to a specified memory space
¤ Set the entry point to the executable
¤ Reserve space for the stack
¨ Required if the design contains a discontinuous memory space
Embedded Development Kit (EDK)
¨ What is Embedded Development Kit (EDK)?
¤ The Embedded Development Kit is the Xilinx software suite for designing complete embedded programmable systems
¤ The kit includes all the tools, documentation, and IP that you require for designing systems with embedded hard processor cores and/or Xilinx MicroBlaze™ soft processor cores
¤ It enables the integration of both hardware and software components of an embedded system
Embedded Development Kit (EDK)
[Figure: the EDK flow, which joins the standard FPGA hardware flow and the standard embedded software flow]
¨ Standard FPGA HW development flow (VHDL or Verilog)
¤ HDL entry, simulation/synthesis, implementation, download bitstream into FPGA, ChipScope
¨ Standard embedded SW development flow (C code)
¤ Code entry, C/C++ cross compiler, linker, load software into FLASH, debugger
¨ Embedded Development Kit (system netlist; RTOS, Board Support Package)
¤ Instantiate the ‘System Netlist’ and implement the FPGA (compiled BIT)
¤ Include the BSP and compile the software image (compiled ELF)
¤ Data2MEM combines the compiled ELF and the compiled BIT
¤ Download the combined image to the FPGA
Embedded Development Kit (EDK)
A. Develop the embedded hardware in XPS
¤ Quickly create a system targeting a board using Base System Builder Wizard
¤ Extend the hardware system, if necessary, by adding peripherals from the IP Catalog
¤ Generate HDL netlists using PlatGen
B. Develop the embedded software in SDK
¤ Generate libraries and drivers with LibGen
¤ Create and debug the software application using Software Development Kit (SDK)
¤ Optionally, debug the application using Xilinx Microprocessor Debug (XMD) and the GNU debugger (gdb)
C. Operate in hardware
¤ Generate the bitstream and configure the FPGA using iMPACT
D. Deploy
¤ Initialize external flash memory using the Flash Writer utility or boot from an external compact flash configuration file generated using the System ACE File generator (GenACE) script
Embedded Development Kit (EDK). Debugging Tools
¨ Debugging is an integral part of embedded systems development
¨ The debugging process is defined as testing, stabilizing, localizing, and correcting errors
¨ Two methods of debugging:
¤ Hardware debugging via a logic probe, logic analyzer, in-circuit emulator, or background debugger
¤ Software debugging via a debugging instrument
n A software debugging instrument is source code that is added to the program for the purpose of debugging
¨ Debugging types:
¤ Functional debugging
¤ Performance debugging
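A software debugging instrument in the sense used above can be as small as a trace macro that is compiled into the program. The sketch below is one common pattern rather than anything specific to EDK: `DBG_TRACE`, `DEBUG_ENABLED`, `dbg_hits` and `checked_divide` are hypothetical names, and output goes to `stderr`, which on a target board would usually be redirected to a UART.

```c
#include <stdio.h>

#ifndef DEBUG_ENABLED
#define DEBUG_ENABLED 1   /* set to 0 to switch the instrument off */
#endif

static unsigned dbg_hits;  /* counts how often an instrument fired */

/* Prints file, line and a formatted message when debugging is enabled.
   At least one argument must follow the format string. */
#define DBG_TRACE(fmt, ...)                                         \
    do {                                                            \
        if (DEBUG_ENABLED) {                                        \
            dbg_hits++;                                             \
            fprintf(stderr, "[%s:%d] " fmt "\n",                    \
                    __FILE__, __LINE__, __VA_ARGS__);               \
        }                                                           \
    } while (0)

/* Example of instrumented application code. */
int checked_divide(int a, int b)
{
    if (b == 0) {
        DBG_TRACE("divide by zero, a=%d", a);
        return 0;
    }
    return a / b;
}
```

Instruments like this serve functional debugging; for performance debugging the same macro could record a timer value instead of printing, since printing itself changes the timing being measured.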
Embedded Development Kit (EDK). Debugging Tools
¨ EDK supports hardware debugging via:
¤ ChipScope™ Pro tool cores are available to a Xilinx Platform Studio design
n PLB IBA (Integrated Bus Analyzer)
n ILA (Integrated Logic Analyzer)
n VIO (Virtual I/O)
¤ Enables co-debug of software with GNU gdb and hardware with ChipScope Analyzer
Embedded Development Kit (EDK). Debugging Tools
¨ EDK supports software debugging via:
¤ GNU Debugger (GDB)
n Software debugger that runs on PC
¤ Microprocessor Debug Module (MDM)
n Debug interface in MicroBlaze system
¤ Xilinx Microprocessor Debugger (XMD)
n Facilitates an interface between the GNU tools and the MicroBlaze MDM
¨ Introduction ¨ MicroBlaze and ARM ZYNQ Features ¨ Hardware/Software co-design. Address Management ¨ Embedded Development Kit (EDK). ¨ Os vs RTOS (Real Time OS) ¨ Interrupts, Exceptions, Watch-Dog, …
OS vs RTOS (Real-Time OS)
¨ What's an Operating System?
¤ Provides environment for executing programs
¤ Process abstraction for multitasking/concurrency
n Scheduling
¤ Hardware abstraction layer (device drivers)
¤ File systems
¤ Communication
¤ We will focus on concurrent, real-time issues
OS vs RTOS (Real-Time OS)
¨ Real Time System
¤ A system is said to be Real Time if it is required to complete its work and deliver its services on time
¤ Example: Flight Control System
n All tasks in that system must execute on time
¤ Non-example: PC system
¤ Hard Real Time System
n Failure to meet deadlines is fatal
n Example: Flight Control System
¤ Soft Real Time System
n Late completion of jobs is undesirable but not fatal
n System performance degrades as more and more jobs miss deadlines
n Example: online databases
OS vs RTOS (Real-Time OS)
¨ Typical RTOS Task Model
¤ Each task is a triplet: (execution time, period, deadline)
¤ Usually, deadline = period
¤ Can be initiated any time during the period
[Figure: timeline of one task showing initiation, execution time, deadline and period along the time axis]
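The task triplet translates directly into a data structure, and the first sanity check on a task set is its processor utilization, the sum of execution time divided by period. On a single processor a total above 1 can never be scheduled, whatever the algorithm. This is a minimal sketch; `rt_task_t`, `rt_task_valid` and `rt_total_utilization` are hypothetical names and times are assumed to be in microseconds.

```c
#include <stddef.h>

typedef struct {
    unsigned exec_us;      /* worst-case execution time */
    unsigned period_us;    /* time between successive releases */
    unsigned deadline_us;  /* relative deadline, usually equal to the period */
} rt_task_t;

/* A task is well formed if it fits inside its deadline and
   its deadline does not exceed its period. */
int rt_task_valid(const rt_task_t *t)
{
    return t->period_us > 0 &&
           t->exec_us <= t->deadline_us &&
           t->deadline_us <= t->period_us;
}

/* Sum of exec/period over all tasks (periods are assumed non-zero). */
double rt_total_utilization(const rt_task_t *tasks, size_t n)
{
    double u = 0.0;
    for (size_t i = 0; i < n; i++)
        u += (double)tasks[i].exec_us / (double)tasks[i].period_us;
    return u;
}
```

Two tasks with 1000 us of work every 4000 us and 500 us every 2000 us each use a quarter of the processor, for a total utilization of 0.5.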
OS vs RTOS (Real-Time OS)
¨ Hard real-time system with multirate behavior: fly-by-wire avionics
[Figure: data flow from sensors through signal conditioning, control laws and actuating to the actuators, each block running at its own rate]
¤ Sensors: gyros and accelerometers, GPS, sensor, stick
¤ Signal conditioning: INU 1 kHz, GPS 20 Hz, air data 1 kHz, joystick 500 Hz
¤ Control laws: pitch control 500 Hz, lateral control 250 Hz, throttle control 250 Hz
¤ Actuating: aileron 1 1 kHz, aileron 2 1 kHz, elevator 1 kHz, rudder 1 kHz
¤ Actuators: aileron, aileron, elevator, rudder
OS vs RTOS (Real-Time OS)
¨ Features of RTOSs
¤ Scheduling
¤ Resource allocation
n The issues with scheduling are applicable here
n Resources can be allocated in
n Weighted round robin
n Priority based
¤ Interrupt handling
n Interrupt latency should be very small
n Kernel has to respond to real-time events
n Interrupts should be disabled for the minimum possible time
¤ Other issues like kernel size
n For embedded applications the kernel should be small and should fit in ROM
n No virtual memory, no protection
OS vs RTOS (Real-Time OS)
¨ Scheduling Algorithms in RTOS:
¤ Clock Driven Scheduling
n All parameters about jobs (release time/execution time/deadline) known in advance. Minimal runtime overhead.
¤ Weighted Round Robin Scheduling
n Jobs scheduled in FIFO manner. Time quantum given to a job is proportional to its weight
¤ Priority Scheduling (Greedy / List / Event Driven)
n Processor never left idle when there are ready tasks. Processor allocated to processes according to priorities, which can be static (assigned at design time) or dynamic (assigned at runtime).
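Weighted round robin is easy to simulate: jobs are visited in FIFO order and each receives a time slice proportional to its weight. The sketch below is an illustrative model, not code from any RTOS; `wrr_job_t` and `wrr_run` are hypothetical names, time is counted in abstract units, and a weight of zero is treated as one so that every job makes progress.

```c
#include <stddef.h>

typedef struct {
    unsigned weight;     /* relative share of each round */
    unsigned remaining;  /* work left, in time units */
} wrr_job_t;

/* Runs the jobs round after round until all have finished.
   Stores job indices in finish_order in order of completion and
   returns the total elapsed time. Returns 0 if base_quantum is 0. */
unsigned wrr_run(wrr_job_t *jobs, size_t n, unsigned base_quantum,
                 size_t *finish_order)
{
    unsigned now = 0;
    size_t done = 0;

    if (base_quantum == 0)
        return 0;
    for (size_t i = 0; i < n; i++)          /* jobs with no work are done already */
        if (jobs[i].remaining == 0)
            finish_order[done++] = i;

    while (done < n) {
        for (size_t i = 0; i < n; i++) {
            unsigned w, slice;
            if (jobs[i].remaining == 0)
                continue;
            w = jobs[i].weight ? jobs[i].weight : 1u;
            slice = w * base_quantum;       /* quantum proportional to weight */
            if (slice > jobs[i].remaining)
                slice = jobs[i].remaining;
            jobs[i].remaining -= slice;
            now += slice;
            if (jobs[i].remaining == 0)
                finish_order[done++] = i;
        }
    }
    return now;
}
```

With a job of weight 1 needing 4 units and a job of weight 3 needing 6 units, the heavier job finishes first at time 8 even though it has more work, and the lighter one finishes at time 10.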
OS vs RTOS (Real-Time OS)
¨ Priority-based Preemptive Scheduling
¤ Always run the highest-priority runnable process
[Figure: processes labeled 1, 2 and 3]
¨ Multiple processes at the same priority level?
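The selection rule, together with one possible answer to the equal-priority question, can be sketched as follows. The policy of rotating among equal-priority processes (round robin) is an assumption chosen for illustration; other systems use FIFO order instead. `proc_t` and `pick_next` are hypothetical names, and a larger number is assumed to mean a higher priority.

```c
typedef struct {
    int priority;  /* larger value = higher priority (assumption) */
    int runnable;  /* non-zero if the process is ready to run */
} proc_t;

/* Returns the index of the process to run next, or -1 if none is runnable.
   The scan starts just after `last` (the process that ran most recently,
   or -1 initially), so equal-priority processes take turns. */
int pick_next(const proc_t *p, int n, int last)
{
    int best = -1;
    for (int k = 1; k <= n; k++) {
        int i = (last + k) % n;
        if (!p[i].runnable)
            continue;
        if (best < 0 || p[i].priority > p[best].priority)
            best = i;   /* strictly higher wins; ties keep the earlier one */
    }
    return best;
}
```

A preemptive kernel would call such a function whenever a process becomes runnable or blocks, and switch context if the result differs from the running process.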
OS vs RTOS (Real-Time OS)
¨ Linux for Real Time Applications
¤ Scheduling
n Priority-driven approach
n Optimizes average-case response time
n Interactive processes given highest priority
n Aims to reduce response times of processes
n Real-time processes
n Processes with high priority
n No notion of deadlines
¤ Resource allocation
n No support for handling priority inversion
OS vs RTOS (Real-Time OS)
¨ Interrupt Handling in Linux
¤ Interrupts are disabled in ISRs and critical sections of the kernel
¤ No worst-case bound on interrupt latency available
n e.g. disk drivers may disable interrupts for a few hundred milliseconds
¤ Not suitable for Real Time Applications
n Interrupts may be missed
¤ Processes are non-preemptible in kernel mode
n System calls like fork take a lot of time
n A high-priority thread might wait for a low-priority thread to complete its system call
¤ Processes are heavyweight
n Context switch takes several hundred microseconds
OS vs RTOS (Real-Time OS)
¨ RTLinux
¤ Real Time Kernel at the lowest level
¤ Linux Kernel is a low-priority thread
n Executed only when there are no real-time tasks
¤ Interrupts are trapped by the Real Time Kernel and passed on to the Linux Kernel
n Software emulation of hardware interrupts
n Interrupts are queued by RTLinux
n Software emulation of disable_interrupt()
¤ Real Time Tasks
n Statically allocated memory. No address space protection
¤ Non Real Time Tasks are developed in Linux
¤ Communication
n Queues, shared memory
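The interrupt emulation described above comes down to a software flag and a pending queue: when Linux asks to disable interrupts, only the flag changes, the real-time kernel keeps receiving hardware interrupts and holds them until Linux enables interrupts again. The sketch below models that idea only and is not RTLinux source code. All names are hypothetical and the queue is a 32-bit mask with one bit per interrupt line.

```c
#include <stdint.h>

static int      soft_irq_enabled = 1;  /* Linux's view of "interrupts enabled" */
static uint32_t pending;               /* one bit per queued interrupt line */
static unsigned delivered_count;       /* interrupts handed to Linux so far */

/* Stand-in for the Linux interrupt handler. */
static void linux_handler(int irq)
{
    (void)irq;
    delivered_count++;
}

/* Hands every queued interrupt to Linux, lowest line first. */
static void deliver_pending(void)
{
    for (int irq = 0; irq < 32; irq++) {
        if (pending & (1u << irq)) {
            pending &= ~(1u << irq);
            linux_handler(irq);
        }
    }
}

/* Emulated disable: hardware interrupts stay enabled for real-time tasks. */
void soft_disable_interrupts(void)
{
    soft_irq_enabled = 0;
}

/* Emulated enable: replay whatever arrived in the meantime. */
void soft_enable_interrupts(void)
{
    soft_irq_enabled = 1;
    deliver_pending();
}

/* Called by the real-time kernel for an interrupt meant for Linux (irq 0..31). */
void rt_kernel_on_irq(int irq)
{
    pending |= (1u << irq);
    if (soft_irq_enabled)
        deliver_pending();
}
```

Because Linux can no longer block hardware interrupts, the latency seen by real-time tasks stays bounded regardless of what the Linux kernel is doing.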
OS vs RTOS (Real-Time OS)
¨ RTLinux Framework
OS vs RTOS (Real-Time OS)
¨ LynxOS
¤ Microkernel architecture
n Kernel provides scheduling/interrupt handling
¤ Additional features through Kernel Plug-Ins (KPIs)
n TCP/IP stack, filesystem. KPIs are multithreaded
¤ Memory protection/demand paging optional
¤ Development and deployment on the same host
n OS support for compilers/debuggers
¨ VxWorks
¤ Monolithic architecture. RT POSIX compliant. Cross-development system
¨ pSOS - Object Oriented OS