Workshop - November 2011 - Toulouse A.BERJAOUI (AKKA IS for Astrium) A.LEFEVRE & C. LE LANN (Astrium) SystemC/TLM virtual platforms Use of SystemC/TLM virtual platforms for the exploration, the specification and the validation of critical embedded SoC



Overview

Context
Separation of time & functionality presentation
Timed TLM models vs. CABA models
Design Space Exploration with SystemC/TLM 2.0
HW in the loop – Use of CHIPit®
Future prospects
Open questions


Context

Define a proper method to use SystemC/TLM for SoC modelling

Use SystemC/TLM for DSE (performance estimation, bottleneck identification…)

Use SystemC/TLM models for HW specification

Evaluate the selected methodology


Programmer’s View (PV) or functional simulation

Time is not represented; only functionality is modelled.
Functional synchronization is necessary. It is done at System Synchronization Points (SSP): configuration register accesses, interrupts and all state-altering accesses.

The need for time

Performance measurements
Design Space Exploration
…how?
  Precision?
  Modelling granularity?
  Simulation performance?

The obvious solution: mixing time and functionality

It works!
…but…
  Functional modifications cannot be verified without having to verify all timed aspects as well
  Modelling granularity is hard to modify once it has been set
  Modules cannot be easily reused for other platforms

Separation of time & functionality

[Figure: ISS, router and memory connected through initiator and target ports. Labels: ISS PV, ISS T, ISS PVT, PV router, Detailed bus model, Memory PV, Memory T, Memory PVT.]

[Figure: the same platform (ISS PV, ISS T, ISS PVT, PV router, Detailed bus model, Memory PV, Memory T, Memory PVT, initiator and target ports) shown in a functional simulation phase and a timed simulation phase, along a time axis from T = 0 ns to T = 12 ns.]

Advantages and limitations

PV & T mixed
  Modelling is “natural”. Platforms are simple.
  Interrupts can be modelled easily
  Granularity is fixed
  Mixed debugging and no control over simulation performance
  Reuse problem

PV & T separated
  Parallel development and debug of reusable PV and T models
  Granularity can be controlled easily (by changing T model)
  Modelling is more abstract. Platforms are complex
  Interrupts are harder to model
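The separated approach can be illustrated with a minimal sketch in plain Python rather than SystemC. All class names and latency values are hypothetical. The functional (PV) memory has no notion of time, and a swappable timing (T) model is consulted alongside it, so granularity changes by replacing only the T model.

```python
class MemoryPV:
    """Functional model only: stores and returns data, no notion of time."""
    def __init__(self):
        self.cells = {}

    def write(self, addr, value):
        self.cells[addr] = value

    def read(self, addr):
        return self.cells.get(addr, 0)


class FixedLatencyT:
    """Coarse timing model: every access costs the same (hypothetical) delay."""
    def __init__(self, ns_per_access):
        self.ns_per_access = ns_per_access

    def latency_ns(self, kind, addr):
        return self.ns_per_access


class ReadWriteLatencyT:
    """Finer timing model: reads and writes have different (hypothetical) delays."""
    def __init__(self, read_ns, write_ns):
        self.read_ns, self.write_ns = read_ns, write_ns

    def latency_ns(self, kind, addr):
        return self.read_ns if kind == "read" else self.write_ns


class MemoryPVT:
    """PVT wrapper: asks T for the time cost, then PV for the function."""
    def __init__(self, pv, t):
        self.pv, self.t = pv, t
        self.now_ns = 0  # accumulated simulated time

    def write(self, addr, value):
        self.now_ns += self.t.latency_ns("write", addr)
        self.pv.write(addr, value)

    def read(self, addr):
        self.now_ns += self.t.latency_ns("read", addr)
        return self.pv.read(addr)


def run(mem):
    """Same functional scenario, whatever timing model is plugged in."""
    mem.write(0x10, 42)
    return mem.read(0x10), mem.now_ns


coarse = run(MemoryPVT(MemoryPV(), FixedLatencyT(5)))
fine = run(MemoryPVT(MemoryPV(), ReadWriteLatencyT(read_ns=3, write_ns=4)))
```

Both runs return the same data; only the elapsed time differs, which is the property that lets PV and T be developed and debugged in parallel.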

TTP in the industry

In its current form, TTP cannot be used on an industrial scale:
  Modelling is too complex to be used by architects
  Modules are not re-used enough to justify such a modelling effort
  Traffic generators are enough for DSE. Detailed functionality does not need to be specified for performance estimation.
  HW specification is easier using cycle “approximate”/bit accurate models

Timed TLM vs CABA models

Different time modelling granularities:
  CABA in HDL => available, but slow simulations
  CABA in SystemC => not interesting (not available and slow simulations)
  Timed TLM (SystemC AT) => preferred
A timed TLM model of an existing RTL IP has been built to evaluate the methodology and assess the necessary effort
RTL IP chosen = SDRAM memory controller, because:
  this is a central module in SoC architecture explorations
  its timing behaviour is harder to determine than other modules’ (AHB buses for example)

SDRAM Memory Controller

The Memory Controller is the interface between the SoC bus and the external (on-board) memories
The latency of one access depends on:
  the access parameters
  the controller internal state
Objective for the timed model:
  the model should be pessimistic (longer than the RTL)
  +0 to +20 % timing accuracy

[Figure: memory controller (MCTL) between the AHB bus and the external memories: SDRAM, SRAM, EEPROM.]
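To illustrate why the latency of one access depends on both its parameters and the controller state, here is a minimal Python sketch. The "open row" state and every cycle count are assumptions chosen for illustration; they are not taken from the actual controller.

```python
class SdramTimingSketch:
    # Hypothetical cycle costs, for illustration only.
    ROW_HIT = 2
    ROW_MISS = 7
    WRITE_EXTRA = 1

    def __init__(self):
        self.open_row = None  # internal state: the row left open by the last access

    def latency_cycles(self, row, is_write, burst_len):
        """Latency depends on the access parameters and on the internal state."""
        base = self.ROW_HIT if row == self.open_row else self.ROW_MISS
        self.open_row = row
        return base + (self.WRITE_EXTRA if is_write else 0) + burst_len


ctrl = SdramTimingSketch()
first = ctrl.latency_cycles(row=3, is_write=False, burst_len=4)   # row not yet open
second = ctrl.latency_cycles(row=3, is_write=False, burst_len=4)  # same row, now open
```

The two calls use identical parameters but return different latencies, because the first call changed the internal state.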

Time analysis methodology

[Figure: state machine of the RTL controller, with states IDLE, ACTV, WRITE, WRITE_SCRUB, READ_RMW, READ, READ_SCRUB, ALL_PRE, EARLYPRE, SEARLYPRE, ALL_PRE (latepre), RMW_RSENCODE, PWDOWN.]

RTL analysis
  RTL is composed of intricate cycle-based State Machines
  Requires manual extraction of timing rules
  May need to duplicate the RTL FSM in the TLM model
  => Not interesting

Macroscopic analysis
  Using RTL simulations to produce timing information
  Either guided statistics choice
  Or semi-automated using scripts
  => Selected method

Macroscopic time analysis

Guided time analysis
  Timing data is extracted from RTL simulations (traces of all the timings + relevant parameters)
  Rules are guessed by manually analyzing the traces…
  …and then automatically tested against a calibration test set
  This process iterates until the timing accuracy is satisfactory

Results of the time analysis iterations
  The parameters of the previous access also have a major impact (in addition to the parameters of the current access)
  Some features interfere (refresh and automatic scrubbing)
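The guess-then-test loop can be sketched in Python as follows. The trace records, the two candidate rules and the acceptance window (pessimistic, at most +20 %) are illustrative assumptions; the real traces and rules are not given in the slides.

```python
def check_rule(rule, calibration_set, max_over_pct=20):
    """Fraction of traced accesses for which the rule's estimate is pessimistic
    (>= RTL latency) and at most max_over_pct percent above it."""
    ok = 0
    for prev, cur, rtl_cycles in calibration_set:
        est = rule(prev, cur)
        # Integer arithmetic avoids rounding surprises at the window edge.
        if rtl_cycles <= est and est * 100 <= rtl_cycles * (100 + max_over_pct):
            ok += 1
    return ok / len(calibration_set)


# Hypothetical trace records: (previous access, current access, RTL latency in cycles).
traces = [
    ("read", "read", 10),
    ("write", "read", 14),
    ("read", "write", 11),
    ("write", "write", 10),
]


def rule_v1(prev, cur):
    # First guess: only the current access matters.
    return 12 if cur == "read" else 11


def rule_v2(prev, cur):
    # Refined guess: the previous access also matters.
    table = {("read", "read"): 11, ("write", "read"): 15,
             ("read", "write"): 12, ("write", "write"): 11}
    return table[(prev, cur)]


score_v1 = check_rule(rule_v1, traces)
score_v2 = check_rule(rule_v2, traces)
```

In this made-up data set the first rule is optimistic for a read that follows a write, so the rule is refined and tested again, mirroring the iteration described above.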

Timed Model Validation

This timing model has been checked against RTL on an extensive test set
  more than 86000 transactions
  comes from the RTL validation test suite

Frequency   Mistimed transactions   Latency error
32 MHz      18%                     12%
48 MHz      14%                     17%
64 MHz      14%                     18%
96 MHz      17%                     17%

Validation results
  The model is pessimistic (longer than the RTL)
  Latency error between 12% and 18%

The model is too simple to be 100% exact
  But the goal is to keep a high level of abstraction
  Possibility to increase the accuracy if necessary
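The slides do not define how the two table columns are computed. The Python sketch below shows one plausible reading, stated as an assumption: a transaction is "mistimed" when the model latency differs from the RTL latency, and the latency error is the relative excess of the total model latency over the total RTL latency. The sample numbers are made up.

```python
def validation_metrics(pairs):
    """pairs: list of (rtl_cycles, model_cycles), one entry per transaction."""
    mistimed = sum(1 for rtl, model in pairs if model != rtl)
    total_rtl = sum(rtl for rtl, _ in pairs)
    total_model = sum(model for _, model in pairs)
    return {
        # Assumed definition: share of transactions whose latency is not exact.
        "mistimed_fraction": mistimed / len(pairs),
        # Assumed definition: relative excess of cumulated model latency.
        "latency_error": (total_model - total_rtl) / total_rtl,
        # Pessimistic means the model is never faster than the RTL.
        "pessimistic": all(model >= rtl for rtl, model in pairs),
    }


sample = [(10, 10), (10, 12), (20, 20), (10, 14)]  # made-up numbers
metrics = validation_metrics(sample)
```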


Design Space Exploration with SystemC/TLM 2.0

A simple image processing platform has been designed to assess the use of SystemC/TLM for design space exploration

Algorithm

Image spectral-compression platform
Performs “subsampling” on incoming data packets
Subsampled packets are then transferred to an auxiliary processing unit which performs a 2D-FFT (using a co-processor) and data encoding

[Figure: processing chain with blocks Input, Subsampling, 2D-FFT, Encoding and Output; the links carry the data-size labels 10N, 5N, 5N and N.]

Processing platform

[Figure: block diagram of the platform, with modules Mem_a, DMA_a, Leon_a, Mem_b, Leon_b, DMA_b, FFT and IO.]

Processing platform (cont’d)

IO module generates an interrupt causing DMA_a to transfer the input packet of size 10N to Mem_a
At the end of the transfer, Leon_a subsamples the data and writes the result to Mem_a
Leon_a configures DMA_b to transfer the result to Mem_b
At the end of the transfer, Leon_b configures the FFT module to perform a 2D-FFT
Leon_b encodes the result and programs DMA_b to send the result to the IO module

SystemC implementation

TLM-2 compliant (time & functionality are mixed)
Data exchange is AMBA bus-accurate (single/burst transactions, split)
Data sizes are respected and packets are identified by a packet ID.
The Leon processor modules act as “smart” traffic generators: they generate transactions in the correct order towards the appropriate targets.
OS tasks are simulated using SC_THREADs

SystemC implementation (cont’d)

No actual processing is performed. Processing time is simulated
Bus occupation and processing loads for all processing units were measured accurately
A system synchronization bug was identified => a “lock” register has been added to lock DMA_b during its configuration
It was possible to observe the impact of the modification of HW parameters and the input data rate. DMA_a was identified as a bottleneck.
ABV could also be implemented using ISIS
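As a rough illustration of how recorded busy times lead to a bottleneck, the Python sketch below computes a utilisation per unit for a given input packet period. The unit names come from the platform; the busy times are invented for the example and are not measurements from the study.

```python
def utilisation(busy_ns_per_packet, packet_period_ns):
    """Fraction of time each unit is busy when one packet arrives every period."""
    return {unit: busy / packet_period_ns
            for unit, busy in busy_ns_per_packet.items()}


def bottleneck(util):
    """Unit with the highest utilisation: it saturates first as the input rate rises."""
    return max(util, key=util.get)


# Invented per-packet busy times in ns (not figures from the study).
busy = {"DMA_a": 800, "Leon_a": 500, "DMA_b": 400, "Leon_b": 300, "FFT": 600}

util = utilisation(busy, packet_period_ns=1000)
worst = bottleneck(util)
```

Changing `packet_period_ns` or one of the busy times corresponds to modifying the input data rate or a HW parameter and observing the effect on the loads.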


Example


HW in the loop – use of CHIPit

CHIPit
  Virtex-based development platform
  Custom extension boards (SDRAM, Flash, IO, …)
  UMRBus = practical & fast PC-CHIPit ready-made interface

HW in the loop – use of CHIPit

CHIPit can be used for:
  Incremental validation flow
    SC/TLM testbench composed of multiple sub-blocks
    Some sub-blocks may run on hardware (FPGA)
    The others still run as software SC functional models
    Soft-hard inter-block transactions via UMRBus + extra SystemC/VHDL
  Improved simulation speed
    1000+ times faster is possible
    fewer soft-hard transactions = better improvement

[Figure: a testbench of sub-blocks around the CHIPit, most labelled “soft” and some labelled “hard”.]
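Why fewer soft-hard transactions give a better improvement can be seen with a simple cost model, sketched here in Python. The model (a fixed cost per transaction added to the remaining software and hardware run times) and all numbers are assumptions for illustration, not measurements on CHIPit.

```python
def cosim_speedup(soft_only_s, remaining_soft_s, hw_s, n_transactions, per_transaction_s):
    """Speed-up of co-simulation over a full-software run, under a simple
    additive cost model (an assumption, not a CHIPit specification)."""
    cosim_s = remaining_soft_s + hw_s + n_transactions * per_transaction_s
    return soft_only_s / cosim_s


# Invented durations in seconds.
few = cosim_speedup(100.0, 0.05, 0.01, n_transactions=10, per_transaction_s=0.001)
many = cosim_speedup(100.0, 0.05, 0.01, n_transactions=10000, per_transaction_s=0.001)
```

With these made-up figures the run with few transactions exceeds a factor of 1000, while the run with many transactions is dominated by the transaction overhead.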

HW in the loop – use of CHIPit

What happens on a transaction?
  Uncontrolled clock mode
    HW clock keeps working during a transaction
    SW clock and HW clock are not synchronised
    Easy to implement
  Controlled clock mode
    HW clock is stopped upon each transaction, waiting for the software side
    SW clock and HW clock are synchronised on transaction bounds
    Needed if inputs/outputs must observe precise relative timings
    Harder to implement, more timing issues
    Not possible for all designs: complex designs require extra care
      SDRAM controller needs constant auto-refresh
      Inputs from extension boards may need immediate treatment
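The difference between the two modes can be sketched with a toy Python model that counts the cycles seen by the HW clock. The function name and the cycle figures are hypothetical; the sketch only shows the counting principle, not the CHIPit or UMRBus API.

```python
def hw_cycles_elapsed(work_cycles, transactions, sw_service_cycles, controlled):
    """Cycles counted by the HW clock over a run with `transactions` soft-hard
    exchanges, each keeping the software side busy for `sw_service_cycles`
    HW-clock periods of wall time."""
    stalled = transactions * sw_service_cycles
    # Controlled mode: the HW clock is stopped while the software is serviced,
    # so only useful work advances the HW cycle counter.
    # Uncontrolled mode: the HW clock keeps ticking during each transaction.
    return work_cycles if controlled else work_cycles + stalled


controlled = hw_cycles_elapsed(1000, 10, 50, controlled=True)
uncontrolled = hw_cycles_elapsed(1000, 10, 50, controlled=False)
```

The extra cycles in uncontrolled mode are why relative timings between inputs and outputs are not preserved there. Conversely, a block that needs a permanently running clock, such as an SDRAM controller issuing auto-refresh, is hard to fit into controlled mode.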

HW in the loop – use of CHIPit

Uncontrolled clock example: whole system overview
  Electronic board with inputs/outputs to other electronic systems
  SDRAM for internal data storage
  ASIC/FPGA for data processing

[Figure: electronic board containing the ASIC and its SDRAM memory. Labels: Periph 1, Periph 2, Input 1, Input 2, Output 1, Output 2, Instrument 1, Instrument 2, Storage, RF comm, OBC.]

HW in the loop – use of CHIPit

Uncontrolled clock example: ASIC internal view
  Data processing composed of several sub-blocks
  Sub-blocks perform independent tasks
  Sequenced together with very few signals (e.g. req/ack)

[Figure: ASIC internal view. Labels: SDRAM memory, Memory controller, Core, Processing 1, Processing 2, Processing 4, Sequencer (req/ack links), FIFOs, RX, TX, Input 1, Input 2, Output 1, Output 2, OBC.]

HW in the loop – use of CHIPit

Uncontrolled clock example: ASIC re-modelling for HW
  Sequencer control signals re-modelled as APB transactions
  Inter-block FIFOs split (FIFO->SDRAM and SDRAM->FIFO)
  FIFOs mapped on AHB buses at fixed addresses
  Added DMAs to handle pipeline inputs and outputs from/to memory
  DMA channels can perform any AHB transfer (e.g. SDRAM<->FIFO)

[Figure: re-modelled ASIC. Labels: SDRAM memory, Memory controller, Core, Processing 1, Processing 2, Processing 4, Sequencer, APB, FIFOs, DMAs, AHBs, RX, TX, Input 1, Input 2, Output 1, Output 2.]

HW in the loop – use of CHIPit

Uncontrolled clock example: ASIC re-modelling for SC
  Use of TLM2 transactions between blocks
  SDRAM+controller merged into a memory abstraction model
  SDRAM access ports re-modelled as AHB buses

[Figure: ASIC SystemC model. Labels: Memory model, Core, Processing 1, Processing 2, Processing 4, Sequencer, RX, TX, FIFOs, DMAs, AHB bus(es) model.]

HW in the loop – use of CHIPit

Benefits
  Same C file used for both Gaut VHDL generation and SystemC full-soft emulation
    ► intrinsic algorithm consistency between model and hardware
  Few steps necessary from Gaut regeneration to FPGA synthesis and SC model compilation, scriptable for process automation
    ► handy for fast algorithm exploration
  Outcome: SystemC model executable, allowing choice at runtime between full-soft functional model and soft+hard co-simulation

$> scmodel SIMU input.bin output_simu.bin > log_simu.txt

$> scmodel CHIPit input.bin output_hard.bin > log_hard.txt

$> diff output_simu.bin output_hard.bin

$>

HW in the loop – use of CHIPit

Limitations
  Still have to develop SystemC+VHDL for each new transactor
    Limits whole process automation
    Encourages the use of common transactor types (AMBA, etc)
  Controlled clock mode much more complex to implement
    Encourages the design of independent blocks, inter-connected via a few FIFOs or via a common memory
    Blocks with strong timing requirements on IO hardly compatible with uncontrolled clock mode (better design with intelligent IO behaviour: req+ack, handshake, etc)
  Implementation limited to actual CHIPit resources
    SDRAM bus width is static (cannot test larger bus than available)
    Custom extension boards required as early as algorithm exploration

HW in the loop – use of CHIPit

SceMi: the wanna-be standard for co-simulation
  Formerly proposed by Cadence, now transferred to Accellera
  Defines a C++ API for HW-SW co-simulation
    Controlled clock / uncontrolled clock modes
    Function-based interface
    Pipe-based interface (C++ stream = hardware FIFO)
    Multi-threaded operation on software side
  CHIPit SceMi library available
    Needs a supplementary licence
    Just a wrapper over UMRBus libraries to provide clock control
    All transactors still need to be coded by hand (SystemC+VHDL)
  ► still a lot of work to do before getting co-simulation working



Space industry applicability

SystemC/TLM is suitable for DSE with the use of HLS

Specification flow needs to be sorted out

Future prospects

Important need in development infrastructure:
  Abstraction layer (architects are not TLM2 experts)
  Interrupts and streaming modelling (TLM is currently a memory mapped platform oriented protocol)
  Build and assembly tools are needed
  Well defined modelling guidelines should be established

Thank you

Any questions?

Open questions

Who does the modelling? System, HW or SW architect?
SW validation uses paper specs => Towards validation using HW based models in SystemC/TLM?
Towards a TLM3 standard? With embedded systems industrial partners such as Airbus and Astrium? (Business model?)