RALF 3 - Software for Embedded High Performance Architectures Ivica Crnkovic Chalmers University of Technology & Mälardalen University Software for Competitiveness ‐ Big Data and Other Frontiers Stockholm, Nov 14 2017

Page 1

RALF 3 - Software for Embedded High Performance Architectures

Ivica Crnkovic, Chalmers University of Technology & Mälardalen University

Software for Competitiveness ‐ Big Data and Other Frontiers, Stockholm, Nov 14 2017

Page 2

Challenge: Processing large amounts of data in real time

RALF 3 and Society

Page 3

[Figure: sensors (sonar, camera) and processing units (FPGA, multicore CPU, GPU)]

Performance = f(FPGA, MCPU, GPU)
Response time = g(FPGA, MCPU, GPU)
Energy consumption = j(FPGA, MCPU, GPU)

and…

Performance = f(FPGA, MCPU, GPU, SA)
Response time = g(FPGA, MCPU, GPU, SA)
Energy consumption = j(FPGA, MCPU, GPU, SA)

SA = Software architecture

Improved performance on dedicated HW platforms
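As a rough illustration of why SA appears as an argument of g and j, the sketch below evaluates response time and energy for two software architectures on the same platform. Everything in it is an assumption made for illustration: the task names, the timing and power numbers, the sequential-pipeline model, and the function names `response_time_ms` and `energy_mj` are hypothetical and do not come from the project.

```python
# Hypothetical per-task execution times in ms on each unit kind (made-up numbers).
EXEC_MS = {
    "filter": {"FPGA": 2.0, "MCPU": 9.0, "GPU": 4.0},
    "detect": {"FPGA": 6.0, "MCPU": 20.0, "GPU": 5.0},
    "plan": {"FPGA": 15.0, "MCPU": 3.0, "GPU": 8.0},
}
# Hypothetical average power draw in watts per unit kind.
POWER_W = {"FPGA": 2.0, "MCPU": 10.0, "GPU": 25.0}


def _check(platform, sa):
    """Reject architectures that use a unit the platform does not have."""
    for task, unit in sa.items():
        if unit not in platform:
            raise ValueError(f"{task} is mapped to {unit}, which is not in the platform")


def response_time_ms(platform, sa):
    """g(FPGA, MCPU, GPU, SA), assuming the tasks run as a sequential pipeline."""
    _check(platform, sa)
    return sum(EXEC_MS[task][unit] for task, unit in sa.items())


def energy_mj(platform, sa):
    """j(FPGA, MCPU, GPU, SA): sum of time * power per task (ms * W = mJ)."""
    _check(platform, sa)
    return sum(EXEC_MS[task][unit] * POWER_W[unit] for task, unit in sa.items())


# Same platform, two software architectures (task-to-unit mappings).
PLATFORM = {"FPGA", "MCPU", "GPU"}
SA_CPU_ONLY = {"filter": "MCPU", "detect": "MCPU", "plan": "MCPU"}
SA_MIXED = {"filter": "FPGA", "detect": "GPU", "plan": "MCPU"}

if __name__ == "__main__":
    for name, sa in (("CPU only", SA_CPU_ONLY), ("mixed", SA_MIXED)):
        print(name, response_time_ms(PLATFORM, sa), "ms", energy_mj(PLATFORM, sa), "mJ")
```

With these made-up numbers the hardware is identical in both cases; only the mapping of tasks to units changes the results.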

Page 4

[Figure: sensors (sonar, camera) and processing units (FPGA, multicore CPU, GPU)]

Goal

Improve the (software) system performance by utilizing computing capabilities of the underlying HW platform 

Page 5

Components and software deployment

[Figure: system overview. Labels: sensors (3D-sensor, vision, sonar, ...), visualization and actuators, N x CPU, M x GPU, FPGA, system, software components, code, code synthesis, allocation mapping]

Page 6

[Figure: system overview with models. Labels: sensors (3D-sensor, vision, sonar), visualization and actuators, n x CPU, m x GPU, FPGA, system, code, software components with EFPs (time, memory, energy), HW model, system EFPs (performance, timing), models, code synthesis, allocation mapping]

1) Component specifications in heterogeneous systems

• Metamodels for SW and HW, covering hardware/software partitioning and component allocation.

• Model‐level analysis methods for timing properties and resource usage information.
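To make the idea of a SW/HW metamodel with allocations more concrete, here is a minimal sketch, assuming a very small model: processing units with a kind and a memory budget, components with a memory need and per-kind execution times, and an allocation as a name-to-name mapping. The class names, fields and the `check_allocation` helper are hypothetical and are not the project's metamodel; the check stands in for a simple model-level resource analysis.

```python
from dataclasses import dataclass


@dataclass
class ProcessingUnit:
    name: str
    kind: str        # "CPU", "GPU" or "FPGA"
    memory_mb: int   # memory budget of this unit


@dataclass
class Component:
    name: str
    memory_mb: int   # memory the component needs
    exec_ms: dict    # unit kind -> execution time; a missing kind means no implementation


def memory_usage(components, allocation):
    """Sum the memory demand per processing unit for a given allocation."""
    usage = {}
    for comp in components:
        unit_name = allocation[comp.name]
        usage[unit_name] = usage.get(unit_name, 0) + comp.memory_mb
    return usage


def check_allocation(components, units, allocation):
    """Return a list of violations; an empty list means the allocation is feasible."""
    by_name = {u.name: u for u in units}
    problems = []
    for comp in components:
        unit = by_name[allocation[comp.name]]
        if unit.kind not in comp.exec_ms:
            problems.append(f"{comp.name} has no implementation for {unit.kind}")
    for unit_name, used in memory_usage(components, allocation).items():
        budget = by_name[unit_name].memory_mb
        if used > budget:
            problems.append(f"{unit_name} over memory budget: {used} > {budget} MB")
    return problems


# Made-up example model.
UNITS = [ProcessingUnit("cpu0", "CPU", 512), ProcessingUnit("gpu0", "GPU", 256)]
COMPONENTS = [
    Component("vision", 200, {"CPU": 40.0, "GPU": 8.0}),
    Component("sonar", 100, {"CPU": 12.0}),
]

if __name__ == "__main__":
    print(check_allocation(COMPONENTS, UNITS, {"vision": "gpu0", "sonar": "cpu0"}))
    print(check_allocation(COMPONENTS, UNITS, {"vision": "gpu0", "sonar": "gpu0"}))
```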

Page 7

2) Semi-automated allocation of components to hardware

• Allocation optimization methods, targeting different aspects of the problem and using different optimization techniques.
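As one very simple instance of allocation optimization, the sketch below enumerates every component-to-unit mapping, discards infeasible ones, and keeps the mapping with the lowest weighted cost. This is plain exhaustive search, chosen only because it is easy to follow; it is not claimed to be one of the project's methods. The unit and component data, the weights `w_time` and `w_energy`, and the function names are all hypothetical.

```python
from itertools import product

# Made-up platform: kind, memory budget (MB) and average power (W) per unit.
UNITS = {
    "cpu0": {"kind": "CPU", "mem": 512, "watts": 10.0},
    "gpu0": {"kind": "GPU", "mem": 256, "watts": 60.0},
    "fpga0": {"kind": "FPGA", "mem": 64, "watts": 2.0},
}
# Made-up components: memory need (MB) and execution time (ms) per supported kind.
COMPONENTS = {
    "vision": {"mem": 200, "ms": {"CPU": 40.0, "GPU": 8.0}},
    "sonar": {"mem": 50, "ms": {"CPU": 12.0, "FPGA": 3.0}},
    "control": {"mem": 20, "ms": {"CPU": 2.0}},
}


def cost(allocation, w_time=1.0, w_energy=0.0):
    """Weighted cost of an allocation, or None if it is infeasible."""
    time_ms = 0.0
    energy_mj = 0.0
    mem_used = {}
    for comp, unit_name in allocation.items():
        spec, unit = COMPONENTS[comp], UNITS[unit_name]
        if unit["kind"] not in spec["ms"]:
            return None  # no implementation for this kind of unit
        mem_used[unit_name] = mem_used.get(unit_name, 0) + spec["mem"]
        if mem_used[unit_name] > unit["mem"]:
            return None  # memory budget exceeded
        t = spec["ms"][unit["kind"]]
        time_ms += t
        energy_mj += t * unit["watts"]  # ms * W = mJ
    return w_time * time_ms + w_energy * energy_mj


def best_allocation(w_time=1.0, w_energy=0.0):
    """Try every mapping and return (cost, allocation) for the cheapest feasible one."""
    names = list(COMPONENTS)
    best = None
    for choice in product(UNITS, repeat=len(names)):
        allocation = dict(zip(names, choice))
        c = cost(allocation, w_time, w_energy)
        if c is not None and (best is None or c < best[0]):
            best = (c, allocation)
    return best


if __name__ == "__main__":
    print("time-optimal:  ", best_allocation(w_time=1.0, w_energy=0.0))
    print("energy-optimal:", best_allocation(w_time=0.0, w_energy=1.0))
```

The search space grows as (number of units) to the power of (number of components), so a sketch like this only works for small systems; larger ones need heuristics or a solver. With the made-up numbers above, the time-optimal and energy-optimal allocations differ in where `vision` is placed.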

Page 8

3) Adaptive data structures and algorithms for massive computations on heterogeneous systems

• Optimized synthesis adjusted to a specific computation platform

Page 9

4) Modeling and analysis of extra-functional properties in heterogeneous systems

• An algorithm for estimating the Worst‐Case Execution Time (WCET) for thread‐parallel programs with shared memory and locks
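The slide does not describe the algorithm itself, so the sketch below is not that algorithm. It is only a coarse, hand-rolled upper bound that shows why locks matter for WCET. The assumptions are: all threads start together, each runs on its own core without preemption, locks are never nested, and hardware interference on shared memory is ignored. Under these assumptions a thread can wait on a lock only while another thread is inside a critical section on that same lock, so its finish time is at most its own WCET plus the total time other threads can hold the locks it uses. The data layout and function names are hypothetical.

```python
def thread_bound(i, threads):
    """Upper bound on the finish time of thread i under the stated assumptions.

    threads[i]["wcet"]     : WCET of the thread in isolation, critical sections included
    threads[i]["critical"] : lock name -> total time this thread can hold that lock
    """
    own = threads[i]["wcet"]
    my_locks = set(threads[i]["critical"])
    # Worst-case blocking: every critical section of every other thread on a lock
    # that thread i also uses may delay thread i once, for its full duration.
    blocking = sum(
        held
        for j, other in enumerate(threads)
        if j != i
        for lock, held in other["critical"].items()
        if lock in my_locks
    )
    return own + blocking


def program_bound(threads):
    """The parallel section ends when its slowest thread ends."""
    return max(thread_bound(i, threads) for i in range(len(threads)))


# Made-up example: three threads, two locks "m" and "n" (times in microseconds).
THREADS = [
    {"wcet": 100, "critical": {"m": 10}},
    {"wcet": 80, "critical": {"m": 30, "n": 5}},
    {"wcet": 50, "critical": {"n": 20}},
]

if __name__ == "__main__":
    for i in range(len(THREADS)):
        print("thread", i, "bound:", thread_bound(i, THREADS))
    print("program bound:", program_bound(THREADS))
```

The bound is safe under the listed assumptions but pessimistic, since it charges every conflicting critical section to every thread; a real analysis would try to exclude interleavings that cannot occur.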

Page 10

5) The demonstrator

Page 11

Demonstrators

Demonstrator I: Underwater robot with a visual system and heterogeneous platforms

Demonstrator II: Microwave mammography with massive parallel computation

Page 12

The research results

[Figure: system overview diagram annotated with the research results listed below]

• Multicore CPU/GPU HPC foundations; HW/SW MCDA
• WCET for parallel execution (GPU)
• Visualization using CPU/GPU; code parallelization
• MDE: code generation CPU/GPU
• FPGA: object recognition programming
• GPU: scattering computation

Page 13

Demonstrators

Demonstrator I: Underwater robot with a visual system and heterogeneous platforms

Demonstrator II: Microwave mammography with massive parallel computation

Page 14

Demonstrator I – Platform development

[Block diagram labels: Zynq 7020 board]

• Zynq 7020: PS (APU, 2x ARM Cortex-A9), PL (FPGA fabric)
• Memory and storage: MEM0 2Gb DDR3, MEM1 1Gb DDR3, QSPI flash, microSD
• USB: USB0, USB1, USB2, USB PHY, 2x USB-Serial
• Ethernet: 2x GE PHY, GE0 (RJ45), GE1 (RJ45), 2x FE PHY, FE0 (RJ45), FE1 (RJ45), FE-Switch
• Card edge connector (PCIe x16)
• Power supplies

[Block diagram labels: UNIBAP GIMME3+]

• CPU: quad (4) core x86_64, 64 bit, 2.0 GHz
• Up to 16 GB DDR3 ECC
• >256 MB DDR2 ECC
• GPU: 500 MHz, 2 Gpixel/s, 160 GFLOPS
• FPGA: 12 Mgate
• Encryption
• DSP

Gimme I, Gimme 2, Gimme 3