© K. H. Ecker, T.U. Clausthal. Sept. 2000 4. Hardware Platform: Real-Time Requirements 4-1
4. Hardware Platform: Real-Time Requirements
Contents:
4.1 Evolution of Microprocessor Architecture
4.2 Performance-Increasing Concepts
4.3 Influences on System Architecture
4.4 A Real-Time Hardware Architecture
© 2003 K. H. Ecker, T.U. Clausthal. 4. Hardware Platform ... 4-2
Processors have to fulfill the time constraints of the technical process.
Microprocessors are often integrated into technical processes ("embedded systems").
→ r-t systems are the most innovative application area for today's microprocessors
Questions:
− architectural concepts of microprocessors
− relevance for r-t systems?
Silicon technology: driving force in processor development
− increase of integration density
− price reduction
4.1 Evolution of Microprocessor Architecture
• CISC versus RISC
RISC characteristics:
− almost all instructions need one clock cycle
− arithmetic and logical operations are confined to registers; load/store operations access memory
− large number of general-purpose registers on chip
− reduced instruction set: about 40 or fewer instructions, compared to over 120 in CISC
− instruction format has fixed length
• Register organization
performance of procedure calls
stack, code and data caches
register windows (for enhancing program branches)
• Interrupt system
Interrupts are very important in r-t systems because of the time-related interaction with the environment
conventional interrupt systems of microprocessors: software-programmed interrupt controllers
more recent developments: interrupt controllers are integrated in the processor
• Hardware support for multitasking and multi-user environments
- virtual memory management
mapping of virtual addresses to physical addresses:
= segmentation (Motorola 68010, Intel 80286)
= paging (Motorola 68030, Intel 80386)
= segmentation and paging (Motorola 88000, Intel 80386, AMD 29000)
- memory protection
reasons for memory protection:
= support for simplified debugging and testing by run-time checks and exception handling
= protection of the operating system from erroneous application programs
= protection of the operating system and application programs against illegal access
- realization
= access to main memory pages is checked by the MMU
= privilege levels with different access rights protect different system components against each other
Trend: integration of memory protection into processor chips; two privilege levels: user mode/supervisor mode
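The address mapping and the MMU access check described above can be sketched as follows. This is a minimal illustration only: it assumes a single-level page table, a 4 KiB page size and two privilege levels, and the names `PageEntry`, `translate` and `ProtectionFault` are hypothetical and do not correspond to any of the processors listed.

```python
from dataclasses import dataclass

PAGE_SIZE = 4096  # assumed page size in bytes (illustrative)

class ProtectionFault(Exception):
    """Raised when the simulated MMU rejects an access."""

@dataclass
class PageEntry:
    frame: int             # physical frame number
    writable: bool         # False: read-only page
    supervisor_only: bool  # True: accessible in supervisor mode only

def translate(page_table, vaddr, write=False, supervisor=False):
    """Map a virtual address to a physical one, checking access rights."""
    page, offset = divmod(vaddr, PAGE_SIZE)
    entry = page_table.get(page)
    if entry is None:
        raise ProtectionFault(f"page {page} not mapped")
    if entry.supervisor_only and not supervisor:
        raise ProtectionFault("user-mode access to supervisor page")
    if write and not entry.writable:
        raise ProtectionFault("write to read-only page")
    return entry.frame * PAGE_SIZE + offset

# Hypothetical table: page 0 holds OS data, page 1 holds user data.
table = {0: PageEntry(frame=7, writable=True, supervisor_only=True),
         1: PageEntry(frame=3, writable=True, supervisor_only=False)}
```

A user-mode call such as `translate(table, 16)` raises `ProtectionFault`, while the same access with `supervisor=True` succeeds; this is the sense in which an erroneous application program cannot touch operating system pages.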
• Multitasking support
Operating systems are significantly determined by their multitasking capabilities
New microprocessors offer more and more multitasking support integrated in VLSI technology
Today, we find two levels of on-chip support:
- some processors offer the notion of tasks and explicit context switches implemented in hardware (Intel 80386, Weitek 32100)
- other processors have even implemented complete scheduling algorithms in hardware (Inmos T800)
In typical r-t systems: large number of concurrently running tasks; high rate of context switches
• Debugging and monitoring
Tools for software development:
- single stepping, breakpointing
- event trigger logic
- breakpointing and tracing based on different types of events
- logical and time-related event combination
- and others
In typical r-t systems: given the increasing complexity of software, the integration of these features into the processor chip is very important
4.2 Performance-Increasing Concepts
Measured in MIPS, the performance of microprocessors has increased by a factor of 2.25 each year in the past
through improvements in silicon technology and in architectural concepts; integration of caches and on-chip pipelining
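As a worked example of what a factor of 2.25 per year implies, the sketch below compounds it over several years. It assumes the yearly factor stays constant, which the slide does not claim for the future; the function names are illustrative.

```python
import math

def performance_gain(years, yearly_factor=2.25):
    """Cumulative speed-up after `years`, assuming a constant yearly factor."""
    return yearly_factor ** years

def doubling_time_months(yearly_factor=2.25):
    """Months needed for performance to double at the given yearly factor."""
    return 12 * math.log(2) / math.log(yearly_factor)

for n in (1, 2, 5):
    print(f"after {n} year(s): x{performance_gain(n):.1f}")
print(f"doubling time: {doubling_time_months():.1f} months")
```

Under this assumption performance grows roughly 58-fold in five years and doubles in a little over ten months.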
4.2.1 Pipelining
Multiple phases of different instructions are executed in parallel
Jumps, branches, procedure calls/returns, interrupts and context switches cause a pipeline "flush"
Pipeline flushes can be reduced by
- code reorganization (delayed execution of jumps, delayed branches)
- doubling of the first pipeline
- prediction of probability of branches based on heuristics (branch prediction)
- adequate register allocation to avoid pipeline flushes
New trends in processor design: more sophisticated concepts such as superpipelined, superscalar, long instruction word (LIW) and very long instruction word (VLIW) architectures
In typical r-t systems: high interrupt rates and frequent context switches decrease the performance of pipelines substantially
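The size of this effect can be estimated with a simple model, given here as a sketch rather than a description of any particular processor. It assumes an ideal pipeline that completes one instruction per cycle and a fixed refill penalty per flush; `flush_rate` and `flush_penalty` are illustrative parameters.

```python
def effective_cpi(flush_rate, flush_penalty, base_cpi=1.0):
    """Average cycles per instruction when a fraction `flush_rate` of
    instructions causes a flush costing `flush_penalty` extra cycles."""
    return base_cpi + flush_rate * flush_penalty

def relative_throughput(flush_rate, flush_penalty):
    """Throughput relative to a pipeline that is never flushed."""
    return 1.0 / effective_cpi(flush_rate, flush_penalty)

# Assumed figures: 4-cycle refill; flushes on 5% vs. 25% of instructions.
for rate in (0.05, 0.25):
    print(f"flush rate {rate:.0%}: throughput x{relative_throughput(rate, 4):.2f}")
```

With these assumed numbers, raising the flush rate from 5% to 25% reduces throughput from about 0.83 to 0.50 of the ideal value.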
4.2.2 Cache memories
... used to avoid memory-access latency
In addition: cache memories are a precondition for enhancing branch prediction
Cache efficiency is measured in hit rates:
Code cache:
- 512-byte cache: approx. 80% hit rate
- 2-Kbyte cache: approx. 95% hit rate
Data cache:
- 4-Kbyte cache: approx. 80% hit rate
In typical r-t systems:
- high interrupt and context switch rates cause useless entries in the cache and consequently a low hit rate
+ very important routines of real-time programs may fit completely into on-chip caches: execution without misses ("freezing the cache")
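How much a hit rate matters can be illustrated with the usual average-access-time formula. The sketch below uses the code-cache hit rates quoted above; the latencies (1 cycle for a hit, 10 additional cycles for a miss) are assumptions chosen only for illustration.

```python
def average_access_time(hit_rate, hit_time, miss_penalty):
    """Average memory access time: hit time plus the expected miss cost."""
    return hit_time + (1.0 - hit_rate) * miss_penalty

# Hit rates from the text; the latencies (in cycles) are assumptions.
HIT_TIME, MISS_PENALTY = 1, 10
for label, hit_rate in (("512-byte code cache", 0.80),
                        ("2-Kbyte code cache", 0.95),
                        ("frozen cache, routine fully resident", 1.00)):
    t = average_access_time(hit_rate, HIT_TIME, MISS_PENALTY)
    print(f"{label}: {t:.1f} cycles per access")
```

The last case corresponds to a routine that fits completely into a frozen cache: every access hits, so its timing is both fast and predictable.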
4.2.3 Modular design
... to reduce and manage the complexity of new microprocessors
Modularity of new designs: structure the processor design into well-defined reusable modules
- submodules are independent and can be developed separately
- new processors may be combined easily from already existing modules
- processor can be distributed among several chips
- modules can be tuned independently of each other with respect to different requirements
- application-specific or special-purpose modules can be added easily
Adequate definitions of interfaces between modules
r-t systems:
- future "real-time" processors are modifications of "standard" microprocessor families
- integration of microprocessor cores in application-specific integrated circuits (ASICs)
4.2.4 Testability
Higher integration and complexity of microprocessors require new ways of chip testing
- complete testing is not possible due to the huge number of required test patterns
- pin limitations: increasingly complicated to access internal structures from outside
- testing at the board level becomes more difficult because of higher board integration and surface-mounted device technology
On-chip test circuits: tests are initiated from the outside via additional pins
4.3 Influences on System Architecture
Relationship between processor and major components:
• Processor bus
interface between processor and memory: bottleneck
- separate instruction and data buses
- burst mode operation; pipelined transmissions
- 64 or 128 bit buses
• Multiprocessor architectures
Three classes of available multiprocessor architectures:
- homogeneous parallel computers based on identical microprocessors in each processor node and non-bus-based interconnection networks, e.g. hypercube, tree, array
- tightly coupled processors with specific buses and global shared memory
- loosely coupled processors based on standard bus systems: Ebus; Multibus I, II; Futurebus
• Coprocessors
... for special purposes
Coprocessors enhance system performance by off-loading the CPU
Types of coprocessors:
- network coprocessors
- direct memory access controllers
- I/O processors
- graphics coprocessors
- floating point arithmetic coprocessors
Interface between processor and coprocessor: tightly coupled (CPU specific) or loosely coupled (CPU independent)
• Reliability and fault tolerance
Reliability and availability of microprocessor systems decrease with an increasing number of components
Solution: integration of fault-tolerant concepts
Example: integration of master/checker circuits in CPUs
r-t systems: very important in case of security-sensitive applications
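Both points can be illustrated with a short sketch. The first function assumes that the system fails as soon as any single component fails and that failures are independent, which is a simplification; the second mimics the master/checker idea in software by running the same operation twice and comparing the results. All names are illustrative.

```python
def series_reliability(component_reliabilities):
    """Reliability of a system that fails as soon as any component fails
    (independent failures assumed)."""
    r = 1.0
    for c in component_reliabilities:
        r *= c
    return r

def checked_result(master, checker, *args):
    """Run the same operation on two units; flag a fault if results differ."""
    m, c = master(*args), checker(*args)
    if m != c:
        raise RuntimeError("master/checker mismatch: fault detected")
    return m

for n in (10, 100):
    print(f"{n} components at 0.99 each: {series_reliability([0.99] * n):.2f}")
```

Under these assumptions, ten components with a reliability of 0.99 each give a system reliability of about 0.90, and one hundred give about 0.37.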
4.4 A Real-Time Hardware Architecture
4.4.1 A Basic Architectural Concept
Early days of real-time processing:
- conventional von Neumann computers
- adaptation to the real-time application by including process peripherals and externally available interrupt lines
- all other real-time requirements were met by software: operating system, careful programming
In this chapter: development of a concept for a hardware platform on which predictably behaving real-time systems may be based
General hardware requirements:
- known time for each machine instruction
- hardware must not introduce unpredictably long delays
- fail-safe hardware: support of fault detection; predictable graceful degradation; recovery within predictable intervals
Consequences:
- simple, well-defined architecture
- comprehensive instruction set
- no features like pipelining, caches, virtual memory or general DMA
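The first requirement is what makes execution times computable in advance: if every instruction has a known duration, the time of a code sequence is a plain sum. The sketch below illustrates this with a hypothetical cycle table and a loop with a known iteration bound; neither the mnemonics nor the cycle counts come from a real processor.

```python
# Hypothetical cycle counts; a real table would come from the processor manual.
CYCLES = {"LOAD": 2, "STORE": 2, "ADD": 1, "CMP": 1, "BRANCH": 2}

def straight_line_cycles(instructions):
    """Exact cycle count of a branch-free instruction sequence."""
    return sum(CYCLES[op] for op in instructions)

def loop_wcet_cycles(body, max_iterations):
    """Worst-case cycles of a loop with a known iteration bound."""
    return max_iterations * straight_line_cycles(body)

body = ["LOAD", "ADD", "STORE", "CMP", "BRANCH"]
print(loop_wcet_cycles(body, max_iterations=10))  # prints 80
```

Features such as caches or pipelines would make the per-instruction figures depend on execution history, which is why the slide excludes them.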
Proposed solution:
• task-oriented hierarchical storage administration scheme
• DMA without cycle stealing
• explicit timing of I/O facilities
• two-processor architecture:
− general task processor for user tasks and for OS tasks that interfere with user tasks: mainly supervisor shell services, e.g. data exchange with peripherals and file management, as initiated by the user tasks
− co-processor for the OS kernel (firmware): responsible for system functions such as event, time and task management, communication and synchronization
4.4.2 Layered Structure of R-T Operating Systems
Basic architectural concepts:
• processors for task execution
• separate co-processor for the OS kernel and firmware
interrupt handling, task management, communication, synchronization, time administration, I/O routines, operator interface
Distinction between
− the operating system nucleus: OS processes of the first kind, activated by interrupts
− the operating system shell: OS processes of the second kind, handled as user tasks
Advantages:
• normal program flow is only interrupted when required by the scheduling algorithm: no unnecessary context switches
• event-driven tasks are executed in a way that disturbs other active tasks as little as possible
• operating system overhead becomes predictable
• no task preemption due to occurring events
event: immediate reaction required; the co-processor provides an independently working event recognition mechanism
Three-layer structure of the co-processor
1. Hardware layer
− accurate r-t management based on a high-resolution clock
− exact timing of operations
− separate programmable interrupt generator for software simulation
− event representation by storage element and latch for time of occurrence
− synchronizer representation
− shared variable representation
2. Primary reaction layer
− recognition of events (interrupts, signals, time events, status transfer of synchronizers, value changes of shared variables)
− initiation of secondary reactions
− recording of events for error tracking
− management of time schedules and critical instants
3. Secondary reaction layer
− deadline-driven processor scheduling with overload handling
− task-oriented hierarchical storage management
− execution of (secondary) event reactions (tasks)
− synchronizer management
− shared variable management
− acceptance of requests
− initiation of processor activities
Realization of communication between general processor and co-processor
Communication operations require:
− first-in-first-out buffers
− shared variables, system internal data: e.g. task control blocks
− common memory area directly accessible by all system components
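A first-in-first-out buffer of the kind listed above can be sketched as follows. This is a software illustration only, assuming a fixed capacity and non-blocking operations; the class name `BoundedFifo` and the request tuples are hypothetical, and a hardware FIFO between the two processors would of course not be written in Python.

```python
from collections import deque

class BoundedFifo:
    """Fixed-capacity first-in-first-out buffer; capacity bounds memory use."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = deque()

    def put(self, item):
        """Enqueue `item`; return False instead of blocking when full."""
        if len(self._items) >= self.capacity:
            return False
        self._items.append(item)
        return True

    def get(self):
        """Dequeue the oldest item, or return None when empty."""
        return self._items.popleft() if self._items else None

# Hypothetical use: the task processor posts requests, the co-processor reads them.
requests = BoundedFifo(capacity=2)
requests.put(("activate", "task_a"))
requests.put(("suspend", "task_b"))
```

Neither operation ever waits, so the time spent in `put` and `get` stays bounded; a full buffer is reported to the caller instead of delaying it.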
4.4.3 Predictable Storage Management
Situation: task software ... usually rather small pieces of software ("paging elements")
For fast and predictable reaction times: the entire task segment should be loaded into main memory each time it is needed
Task-oriented hierarchical storage administration: main storage is divided into
− an area for the supervisor (no paging)
− shared data structures (no paging)
− K ≥ 2 page frames (paging)
Frame size: should hold the code of a single task
Influence of the number of page frames:
$(T_1, \dots, T_{n_t})$ ... list of ready tasks at time $t$, in EDF order
loading into main memory: subset $B = \{T_i \mid i = 1, \dots, \min\{K, n_t\}\}$ ($K \ge 2$)
each task of $B$ is assigned a free frame
$T_1 \in B$ is running
If a task with an earlier deadline than some task in $B$ arrives: task replacement
During execution of $T_1$: the next task ($T_2$) can be "paged in"
the code of $T_2$ should be available at the latest when $T_1$ terminates
Actual choice of $K$ depends on:
− number of I/O channels available for paging
− transfer time
− average execution time of tasks
− frequency of suspension of the running task due to I/O and synchronization, when other tasks are processed while the task is suspended
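The selection of the resident task set can be sketched as below. The sketch assumes that each ready task is given as a (name, absolute deadline) pair and that frames are simply reassigned whenever the set changes; the function name `resident_set` and the example tasks are hypothetical.

```python
def resident_set(ready_tasks, k):
    """Return the tasks to keep in the K page frames: the min(K, n_t)
    ready tasks with the earliest deadlines, in EDF order."""
    if k < 2:
        raise ValueError("the scheme assumes K >= 2 page frames")
    edf_order = sorted(ready_tasks, key=lambda task: task[1])
    return [name for name, _deadline in edf_order[:k]]

# Hypothetical ready list: (task name, absolute deadline)
ready = [("log", 90), ("valve", 20), ("alarm", 5), ("display", 60)]
b = resident_set(ready, k=2)
running, prefetched = b[0], b[1:]
```

With K = 2 the set is `["alarm", "valve"]`: `alarm` runs while `valve` is paged in. If a task with deadline 10 arrives, recomputing the set replaces `valve`, which corresponds to the task replacement step above.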
4.4.4 Direct Memory Access
Cycle stealing: slows down other activities in the system indirectly and unpredictably
On the other hand: DMA speeds up I/O transfers of large blocks of data, but reaction times in control applications cannot be guaranteed in the presence of DMA
DMA without cycle stealing:
− main storage is organized in several independent modules: the processor operates on one module, DMA on another module
− dynamic bus subdivision
temporary (and programmed) separations of bus sections
bus accesses in isolated sections only
− refreshing of dynamic RAMs can be combined with DMA
4.4.5 Precise Timing
Requirement: formulation of the following conditions:
− data transmission from the r-t system to the environment at specified times
− data entering the r-t system: the time instant must be recorded to ascertain reaction times
Low-level statements for timing:
TAKE variable FROM source AT clock_expression
SEND expression TO sink AT clock_expression
... for precisely timed program-initiated I/O
Externally triggered input operation with time stamp:
TAKE variable FROM source RECEIVED clock_variable
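The intent of these statements can be imitated in software as sketched below. This is only an approximation under stated assumptions: the slides call for hardware-supported timing, whereas the sketch relies on a monotonic clock and `sleep`, whose accuracy depends on the host system. The function names and the injectable `clock` and `sleep` parameters are hypothetical.

```python
import time

def take_at(source, at, clock=time.monotonic, sleep=time.sleep):
    """TAKE variable FROM source AT clock_expression: read at time `at`."""
    delay = at - clock()
    if delay > 0:
        sleep(delay)
    return source()

def send_at(value, sink, at, clock=time.monotonic, sleep=time.sleep):
    """SEND expression TO sink AT clock_expression: write at time `at`."""
    delay = at - clock()
    if delay > 0:
        sleep(delay)
    sink(value)

def take_received(source, clock=time.monotonic):
    """TAKE variable FROM source RECEIVED clock_variable:
    return the input together with a time stamp taken right after the read."""
    value = source()
    return value, clock()
```

Passing a simulated clock and sleep function makes the behavior reproducible in tests; on real hardware the time stamp would be latched when the input arrives rather than read afterwards by software.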
Summary of Chapter 4
Performance-increasing concepts for microprocessors: RISC architecture, on-chip interrupt controllers, operating system support, multitasking support, pipelining, cache memories, testing
Not all these concepts are necessarily well suited for real-time systems
On the level of system architecture, enhancement concepts concern the processor bus, co-processors, multiprocessors and the integration of fault-tolerant solutions