Heterogeneous MPSoC

A heterogeneous MPSoC includes different kinds of processors (DSP, microcontroller, ASIP, etc.) and different communication schemes. Modern cell phones may have four to eight processors, including one or more RISC processors for user interfaces, protocol stack processing, and other control functions; a DSP for video encoding and decoding and the radio interface; an audio processor for music playback; a picture processor for camera options; and even a video processor for new video-on-phone capabilities.

The software design for an MPSoC is more complex than a simple software compilation. It is the process of producing executable software, in the form of binary code, for a specific architecture.

The MPSoC hardware architecture is made of several interconnected hardware and software subsystems. The CPU (central processing unit), also known as processor core, processing element, or simply processor, executes programs stored in the memory by fetching their instructions, examining them, and then executing them one after another.

A multicore software subsystem (SW-SS) can integrate several processor cores in the same subsystem, usually of the same type. The subsystems are connected through a scalable global interconnection network, such as a bus or a network on chip (NoC).

Homogeneous MPSoC architectures are made of identical software subsystems incorporating the same type of processors. In heterogeneous MPSoC architectures, different types of processors are integrated on the same chip, resulting in different types of software subsystems.

On a single processor core node, multithreading generally occurs by time slicing, wherein a single processor switches between different threads. In this case, the processing is not literally simultaneous, as the single processor is doing only one thing at a time. On a multi-core processor subsystem, threading can be achieved via multiprocessing, wherein different threads can run literally simultaneously on different processors inside the software node.
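The difference between time slicing and true multiprocessing can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the round-robin policy, the `quantum` parameter, and the function names are hypothetical, and context-switch overhead is ignored.

```python
from collections import deque

def time_slice(threads, quantum):
    """Interleave threads on ONE core with round-robin time slicing.

    threads: list of (name, work_units) pairs (hypothetical workload model).
    Returns the execution trace as a list of (name, units_run) slices.
    """
    ready = deque(threads)
    trace = []
    while ready:
        name, remaining = ready.popleft()
        run = min(quantum, remaining)
        trace.append((name, run))
        if remaining > run:
            # Not finished: go to the back of the ready queue.
            ready.append((name, remaining - run))
    return trace

def single_core_finish_time(threads):
    # One core does one thing at a time, so the work simply adds up.
    return sum(work for _, work in threads)

def multi_core_finish_time(threads):
    # Assumption: one core per thread, so threads really run simultaneously.
    return max(work for _, work in threads)

threads = [("A", 3), ("B", 2)]
print(time_slice(threads, quantum=2))
print(single_core_finish_time(threads), multi_core_finish_time(threads))
```

With time slicing the threads only appear to run together; the trace shows that each slice belongs to exactly one thread.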

Functions of the hardware dependent software

It also provides services for resource management and sharing, such as scheduling the application tasks on top of the available processing elements, inter-task communication, external communication, and all other kinds of resource management and control, such as hardware drivers or boot strategy.

Functions of the OS

When a task is ready for execution and is selected by the OS scheduler according to the scheduling algorithm, the OS is also responsible for performing the context switch between the currently running task and the new task. The context switch is the process of storing and loading the state of the CPU in order to share the available hardware resources between different tasks.
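A context switch can be pictured as "store the running task's registers, then load the next task's registers". The sketch below models that in Python; the register names (`pc`, `sp`, `r0`) and the `Task` and `Cpu` classes are assumptions made for illustration, not the layout of any real processor or OS.

```python
from dataclasses import dataclass, field

@dataclass
class Task:
    name: str
    # Saved CPU state of the task while it is not running (hypothetical registers).
    saved: dict = field(default_factory=lambda: {"pc": 0, "sp": 0, "r0": 0})

class Cpu:
    def __init__(self):
        self.regs = {"pc": 0, "sp": 0, "r0": 0}
        self.current = None

    def context_switch(self, next_task):
        if self.current is not None:
            # Store the state of the task that is being suspended.
            self.current.saved = dict(self.regs)
        # Load the state of the task selected by the scheduler.
        self.regs = dict(next_task.saved)
        self.current = next_task

cpu = Cpu()
task_a = Task("a")
task_b = Task("b", {"pc": 100, "sp": 200, "r0": 7})
cpu.context_switch(task_a)
cpu.regs["pc"] = 42          # task a makes some progress
cpu.context_switch(task_b)   # a's state is stored, b's state is loaded
cpu.context_switch(task_a)   # a resumes exactly where it stopped
```

Because each task keeps its own saved copy of the registers, a task that is switched back in continues from the program counter it had when it was suspended.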

The interrupt handler is another OS service, used for interrupt management. Interrupts are a way to avoid wasting the processor’s execution time in polling loops waiting for external events. Polling means that the processor waits and monitors a device until the device is ready for an I/O operation.
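The cost of polling compared with interrupts can be shown with a toy cycle count. This is a simplified model, not real driver code: the device is assumed to become ready at tick `ready_at`, and the function names are hypothetical.

```python
def run_polling(ready_at):
    """Busy-wait on the device: every tick before it is ready is wasted."""
    wasted = 0
    tick = 0
    while tick < ready_at:   # polling loop: check the device, do nothing else
        wasted += 1
        tick += 1
    return {"useful": 0, "wasted": wasted}

def run_interrupt(ready_at):
    """Do useful work until the device raises an interrupt at tick ready_at."""
    useful = 0
    handled_at = []

    def handler(tick):
        # Interrupt handler: invoked only when the device signals readiness.
        handled_at.append(tick)

    for tick in range(ready_at + 1):
        if tick == ready_at:
            handler(tick)
        else:
            useful += 1      # the processor runs application code meanwhile
    return {"useful": useful, "handled_at": handled_at}

print(run_polling(5))
print(run_interrupt(5))
```

In the polling case all waiting ticks are lost, while in the interrupt case the same ticks are spent on application work.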

A microprocessor executes a collection of machine instructions that tell the processor what to do. Based on the instructions, a microprocessor does three basic activities:

– Using its ALU (arithmetic/logic unit), a microprocessor can perform mathematical operations like addition, subtraction, multiplication, and division. Modern microprocessors contain complete floating-point processors that can perform extremely sophisticated operations on large floating-point numbers.
– A microprocessor can move data from one memory location to another.
– A microprocessor can make decisions and jump to a new set of instructions based on those decisions.
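The three activities listed above (arithmetic, data movement, and conditional jumps) are enough to build a tiny interpreter. The instruction format below, tuples such as `("ADD", dst, a, b)` operating on memory addresses, is invented for this sketch and does not correspond to a real instruction set.

```python
def run(program, memory):
    """Execute a toy program on a list-based memory and return the memory."""
    pc = 0  # program counter
    while pc < len(program):
        op, *args = program[pc]
        if op == "ADD":            # ALU: memory[dst] = memory[a] + memory[b]
            dst, a, b = args
            memory[dst] = memory[a] + memory[b]
        elif op == "SUB":          # ALU: memory[dst] = memory[a] - memory[b]
            dst, a, b = args
            memory[dst] = memory[a] - memory[b]
        elif op == "MOV":          # data movement: copy memory[src] to memory[dst]
            dst, src = args
            memory[dst] = memory[src]
        elif op == "JNZ":          # decision: jump if memory[addr] is not zero
            addr, target = args
            if memory[addr] != 0:
                pc = target
                continue
        else:
            raise ValueError(f"unknown instruction: {op}")
        pc += 1
    return memory

# Sum 3 + 2 + 1: memory[0] = counter, memory[1] = accumulator,
# memory[2] = constant 1, memory[3] = result.
PROGRAM = [
    ("ADD", 1, 1, 0),   # accumulator += counter
    ("SUB", 0, 0, 2),   # counter -= 1
    ("JNZ", 0, 0),      # loop while counter != 0
    ("MOV", 3, 1),      # copy the accumulator to the result cell
]
print(run(PROGRAM, [3, 0, 1, 0]))  # → [0, 6, 1, 6]
```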

The processor can perform a large set of instructions. The collection of instructions is implemented as bit patterns, each of which has a different meaning when loaded into the instruction register. A set of short words is defined to represent the different bit patterns. This collection of words is called the assembly language of the processor. An assembler can translate the words into their bit patterns very easily, and the output of the assembler is then placed in memory for the microprocessor to execute.
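The translation from words to bit patterns can be illustrated with a toy assembler. The encoding is an assumption made for this sketch: an 8-bit instruction word with a 4-bit opcode and a 4-bit operand, and opcode values chosen arbitrarily. It is not the encoding of x86 or any other real processor.

```python
# Hypothetical opcode table: mnemonic -> 4-bit pattern.
OPCODES = {"MOV": 0b0001, "ADD": 0b0010, "SUB": 0b0011, "JMP": 0b0100}

def assemble(source):
    """Translate 'MNEMONIC operand' lines into 8-bit instruction words."""
    words = []
    for line in source.splitlines():
        line = line.split(";")[0].strip()   # drop comments and whitespace
        if not line:
            continue
        mnemonic, operand = line.split()
        value = int(operand)
        if not 0 <= value < 16:
            raise ValueError(f"operand out of 4-bit range: {value}")
        # Opcode in the upper nibble, operand in the lower nibble.
        words.append((OPCODES[mnemonic.upper()] << 4) | value)
    return words

SOURCE = """
MOV 3
ADD 5   ; add five
JMP 0
"""
print([f"{word:08b}" for word in assemble(SOURCE)])
# → ['00010011', '00100101', '01000000']
```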

Examples of assembly language instructions for the Intel x86 processors are as follows [85]: ADC (add operation with carry), ADD (add operation), AND (logical AND operation), CLI (clear interrupt flag), CMP (compare operands), DEC (decrement by 1), DIV (unsigned divide operation), IN (input from data port), INC (increment by 1), INT (call interrupt), JMP (jump), LEA (load effective address operation), MOV (move), MUL (unsigned multiplication operation), NOT (logical NOT operation), OR (logical OR operation), PUSH (push data into stack), RET (return from procedure), SHL (shift left operation), SUB (subtraction operation), and XOR (exclusive OR logical operation). The list of instructions that can be executed by a processor is called the instruction set architecture (ISA).

These RISC “reduced instructions” require fewer transistors than the complex instructions, leaving more room for general-purpose registers. Because all of the instructions execute in a uniform amount of time (i.e., one clock), pipelining is also possible. The primary goal of the CISC architecture, by contrast, is to complete a task in as few lines of assembly as possible. The complex instructions are built directly into the hardware.

An application-specific instruction set processor (ASIP) is a stored-memory CPU whose architecture is tailored for a particular set of applications. This specialization of the processor core provides a trade-off between the flexibility of a general-purpose CPU and the performance of a DSP. The programmability of the ASIP allows changes to the implementation, use in several different chips, and high data path utilization. The application-specific architecture provides smaller silicon area and higher computation speed.

Another frequently used type of memory is the cache. The cache plays a key role in reducing the average memory access time of a processor. It also decreases the bandwidth requirement each processor places on the shared interconnect and memory hierarchy.

The most popular SoC approach has been the ARM processor strategy [6]. ARM provides a complete family of RISC processors together with the AMBA on-chip bus (OCB). AMBA (Advanced Microcontroller Bus Architecture) is currently one of the most widely used system bus architectures for SoC applications (even for processors other than ARM). The on-chip bus connects a central processor and standard components like memory, peripherals, and interrupt units, plus some application-specific components. The advantages of this approach include power savings, higher integration density, lower system costs, and easier procurement.

The system architecture design consists of partitioning the application into several parallel tasks and mapping the application tasks onto the target architecture.
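As a rough illustration of the mapping step, the sketch below assigns already-partitioned tasks to processor subsystems. The text does not prescribe a mapping algorithm; the greedy "largest task first, least loaded subsystem" heuristic, the task names, and the cost figures are all assumptions chosen to keep the example small.

```python
def map_tasks(task_costs, subsystems):
    """Greedily map tasks onto subsystems, balancing the estimated load.

    task_costs: dict of task name -> estimated computation cost (hypothetical units).
    subsystems: list of subsystem names; ties go to the earlier entry.
    Returns (mapping, load) dictionaries.
    """
    load = {ss: 0 for ss in subsystems}
    mapping = {}
    # Place the most expensive tasks first; break cost ties by task name.
    for task, cost in sorted(task_costs.items(), key=lambda kv: (-kv[1], kv[0])):
        target = min(subsystems, key=lambda ss: load[ss])
        mapping[task] = target
        load[target] += cost
    return mapping, load

mapping, load = map_tasks({"T1": 4, "T2": 3, "T3": 2}, ["ARM7-SS", "XTENSA-SS"])
print(mapping)  # → {'T1': 'ARM7-SS', 'T2': 'XTENSA-SS', 'T3': 'XTENSA-SS'}
print(load)     # → {'ARM7-SS': 4, 'XTENSA-SS': 5}
```

A real design flow would also weigh communication cost and processor type, which this sketch deliberately ignores.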

The basic components of the system architecture model are the computation and communication components. The computation components consist of the application functions, while the communication makes use of generic I/Os, such as Simulink I/Os or SystemC signals. The system architecture may be described in Simulink or SystemC. This chapter will present the modeling style in the case of using the Simulink environment.

The software at the system architecture level consists of a set of application functions grouped into tasks. Figure 3.9 shows three examples of application functions grouped into tasks. The hardware at the system architecture level consists of a set of abstract hardware and software subsystems that encapsulate the tasks to be executed on those subsystems, and the different communication units introduced between the subsystems to specify the communication protocol. The hardware in the token ring system architecture model is represented by the processor subsystems XTENSA-SS and ARM7-SS and the inter-subsystem communication units COMM1 and COMM2 that connect the two processor subsystems, as illustrated in the corresponding figure.

The Simulink model is used as a reference model for debugging the application’s algorithm. The system architecture model is used to validate the application’s algorithm through functional simulation. The application’s algorithm does not depend on the final operating system that will be running on the target processors but influences the performance after the application’s parallelization.

The MPSoC design process relies on several decisions and constraints related to hardware and software architecture, which can influence the overall performance. Examples of hardware architecture decisions are as follows: number and type of processors, memory size, type of memories (local, global), type of communication network (point to point, bus, network on chip), communication latency, etc. Examples of software architecture decisions are: type of scheduling algorithm used by the operating system for task activation/deactivation, type of communication primitives (blocking or non-blocking semantics), real-time execution requirements, binary code size, and synchronization mechanisms between the tasks running on the same processor.
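The choice between blocking and non-blocking communication primitives can be illustrated with Python's standard `queue` module standing in for an inter-task channel. The function names `try_receive` and `receive` are hypothetical, and the timeout on the blocking call is only a safety net for the example.

```python
import queue
import threading

def try_receive(channel):
    """Non-blocking receive: return a message, or None if nothing is queued."""
    try:
        return channel.get_nowait()
    except queue.Empty:
        return None

def receive(channel, timeout=1.0):
    """Blocking receive: suspend the caller until a message arrives.

    queue.Empty is raised if the illustrative timeout expires first.
    """
    return channel.get(timeout=timeout)

def demo():
    channel = queue.Queue(maxsize=1)
    first = try_receive(channel)   # nothing has been sent yet
    sender = threading.Thread(target=channel.put, args=("token",))
    sender.start()
    second = receive(channel)      # waits until the sender delivers the token
    sender.join()
    return first, second

print(demo())  # → (None, 'token')
```

With non-blocking semantics the task must decide what to do when no data is available; with blocking semantics the task is simply suspended until the data arrives.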

The goal of performance evaluation at the system architecture level is to allow profiling of the communication and computation in an early phase of the design process.

Programming complex MPSoC architectures and providing suitable software support (compiler and operating system) is a key issue, because either system designers or compilers will have to make the application code explicitly parallel to run on these architectures.

A first difficulty in MPSoC design is how the applications running on these multi-processor architectures are decomposed into several processes/tasks and how these parallel tasks can share the resources provided by the architectures. In particular, allocation of the computation resources (processing units) and storage resources (memories) is critical, as it dictates both performance and power consumption.

The architecture specification has to include information related to the hardware resources of the target architecture, such as the number and types of the available processors, the size of the local and external memories, and the possible communication paths/protocols between the different processors. The communication paths can be captured as a graph whose nodes are the hardware resources of the architecture that may be crossed during a data exchange between the processors.

Examples of nodes are the CPUs, co-processors, DMA engines, memories, local buses, or the global interconnect components (AMBA, NoC).
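A communication-path graph of this kind can be written as an adjacency list, and a breadth-first search then yields the resources crossed between two processors. The topology and node names below are invented for illustration; they do not describe a specific platform.

```python
from collections import deque

# Hypothetical architecture graph: node -> directly connected hardware resources.
ARCH = {
    "ARM7": ["LOCAL_BUS_A"],
    "MEM_A": ["LOCAL_BUS_A"],
    "LOCAL_BUS_A": ["ARM7", "MEM_A", "NOC"],
    "NOC": ["LOCAL_BUS_A", "LOCAL_BUS_B"],
    "LOCAL_BUS_B": ["NOC", "XTENSA", "MEM_B"],
    "MEM_B": ["LOCAL_BUS_B"],
    "XTENSA": ["LOCAL_BUS_B"],
}

def find_path(graph, src, dst):
    """Return a shortest path (fewest hops) from src to dst, or None."""
    parents = {src: None}
    pending = deque([src])
    while pending:
        node = pending.popleft()
        if node == dst:
            # Walk back through the parents to rebuild the path.
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbor in graph.get(node, ()):
            if neighbor not in parents:
                parents[neighbor] = node
                pending.append(neighbor)
    return None

print(find_path(ARCH, "ARM7", "XTENSA"))
# → ['ARM7', 'LOCAL_BUS_A', 'NOC', 'LOCAL_BUS_B', 'XTENSA']
```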

Virtual Architecture Design

The virtual architecture design consists of transforming the application functions into the C code of the final application tasks and mapping the communication onto the hardware resources available in the target architecture.

The key contribution of this chapter is the definition, organization, and design of the virtual architecture using SystemC. The system architecture tasks made of application functions are transformed into the final application tasks SystemC code. The task code, written in C, is adapted to the communication mechanism through the use of adequate HdS communication primitives. The virtual architecture model is described using the SystemC language and is generated according to the parameters specified in the initial Simulink model.