1

CP in Electronic Design Automation (EDA)

(Java Constraint Programming) JaCoP solver

Radoslaw (Radek) Szymanek

2

Outline

• Introduction

• Embedded Systems (CP applications)

• JaCoP

• “free” cake

6

CP in EDA

Constraint-Driven Design Space Exploration for Memory-Dominated Embedded Systems

Radoslaw Szymanek

7

Embedded Systems

• Processor based system

• Integral part of larger system

• Specific functionality

• Heterogeneous architecture

• Heterogeneous requirements

8

Application

• Data dominated application

• Application model at high abstraction level

• Annotated task graph

• Heterogeneous constraints

9

Processing Architecture

[Figure: processing architecture. Processors P1, P2, and P3, each with ROM and RAM; unit A1 with RAM; components B1, L1, and L2.]

• Heterogeneous units

• Resource view

• Trade-offs

• Architecture Selection

10

Execution Scenario

[Figure: execution scenario. Task labels: ErrCor, Cancel, Scrambling, Encoding, Decoding, Descrambling; resource labels: ASIC, DSP, UI, P; data labels: D1 to D6; other labels: C3, C4.]

11

Design Flow

[Figure: design flow with three stages: Specification, Design, Execution.]

Specification: Application (C/C++, SystemC)

Design: Application Task Graph Model

• Architecture Selection

• Task Assignment

• Task Scheduling

• Pareto Diagram Composition

• Data Assignment

• Data Access Scheduling

Execution: Application Execution Scenario(s)

12

Motivation

• Memory contributes most to cost, power consumption, and application execution time

• Exploring different resource trade-offs finds efficient execution scenarios

• Constraints are abundant during system synthesis (specification, exploration, refinement)

• The synthesis problem changes often, so we need an optimization suite that is easy to extend, understand, and employ

14

Schedule length versus cost (I)

• Architecture selection determines the schedule makespan

• We choose an architecture and optimize schedule makespan for it

• Heterogeneous application, architecture, and constraints

[Figure: two schedules of Task1, Task2, and Task3 over time, one on PU 1 alone and one on PU 1 and PU 2.]
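The link between architecture selection and makespan can be illustrated with a small sketch. It assumes independent tasks and a greedy list scheduler; the function name `makespan` and the task durations are hypothetical and not taken from the slides.

```python
def makespan(durations, num_units):
    """Greedy list scheduling: each task goes to the unit that becomes free first.

    Assumes independent tasks (no precedence or communication), which is a
    simplification of the annotated task graph used in the talk.
    """
    finish = [0] * num_units
    for d in durations:
        unit = finish.index(min(finish))  # earliest available processing unit
        finish[unit] += d
    return max(finish)


tasks = [4, 3, 2]              # hypothetical durations of Task1, Task2, Task3
one_pu = makespan(tasks, 1)    # cheaper architecture, longer schedule
two_pu = makespan(tasks, 2)    # costlier architecture, shorter schedule
print(one_pu, two_pu)
```

Adding a second processing unit shortens the schedule at a higher architecture cost, which is the trade-off this part of the talk explores.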

15

Schedule length versus cost (I)

• Uses meta-search heuristics to explore only part of the design space (partial search)

[Figure: exploration steps (architecture selection, assignment, scheduling) leading to the 1st best solution.]

16

Schedule length versus cost (I)

• Divide and Conquer based on consecutive refinement

[Figure: exploration steps (architecture selection, assignment, scheduling) leading to the 2nd best solution.]

17

Schedule length versus cost (I)

• Each exploration step uses results from previous steps

[Figure: exploration steps (architecture selection, assignment, scheduling) leading to the 3rd best solution.]

19

Memory versus Execution Time (II)

• Faster execution usually requires more data memory

[Figure: memory address versus time for DataT1, DataT2, and DataT3 under parallel execution and under sequential execution.]
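A rough way to see why faster execution tends to need more data memory is to compute the peak memory of two schedules. The sketch below assumes each data block occupies memory for a half-open lifetime [start, end); the block sizes and times are hypothetical.

```python
def peak_memory(blocks):
    """blocks: list of (start, end, size); each block is live during [start, end)."""
    events = []
    for start, end, size in blocks:
        events.append((start, size))   # allocation
        events.append((end, -size))    # release
    # Tuples sort by time first; at equal times releases (negative) come first.
    events.sort()
    current = peak = 0
    for _, delta in events:
        current += delta
        peak = max(peak, current)
    return peak


# Hypothetical lifetimes of the data of three tasks.
parallel = [(0, 2, 4), (0, 2, 4), (2, 4, 4)]    # two tasks overlap, finishes at t=4
sequential = [(0, 2, 4), (2, 4, 4), (4, 6, 4)]  # no overlap, finishes at t=6
print(peak_memory(parallel), peak_memory(sequential))
```

In this toy case the parallel schedule finishes earlier but needs twice the peak memory of the sequential one.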

20

Memory versus Execution Time (II)

• Scheduling is combined with data memory placement, so that memory fragmentation is taken into account

[Figure: memory address versus time for Data1, Data2, and Data3.]
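Placing data in memory while scheduling can be viewed as packing rectangles in a time by address plane. The sketch below uses a simple first-fit placement to show how fragmentation can make a block unplaceable even when enough free memory exists in total. The function names, block data, and the first-fit strategy are illustrative assumptions, not the method used in the talk.

```python
def overlaps(a_start, a_end, b_start, b_end):
    """True if half-open intervals [a_start, a_end) and [b_start, b_end) intersect."""
    return a_start < b_end and b_start < a_end


def place_first_fit(blocks, memory_size):
    """blocks: list of (name, start, end, size). Returns {name: address} or None.

    Each block is placed at the lowest candidate address (0 or the top of a
    block that is live at the same time) where it fits without overlap.
    """
    placed = []   # (start, end, address, size)
    result = {}
    for name, start, end, size in blocks:
        live = [p for p in placed if overlaps(start, end, p[0], p[1])]
        candidates = sorted({0} | {p[2] + p[3] for p in live})
        for addr in candidates:
            fits = addr + size <= memory_size and all(
                not overlaps(addr, addr + size, p[2], p[2] + p[3]) for p in live
            )
            if fits:
                placed.append((start, end, addr, size))
                result[name] = addr
                break
        else:
            return None   # no contiguous gap is large enough
    return result


# Hypothetical data blocks: (name, start time, end time, size).
fragmenting = [("A", 0, 10, 3), ("B", 0, 2, 2), ("C", 0, 10, 3), ("D", 2, 10, 3)]
reordered = [("A", 0, 10, 3), ("C", 0, 10, 3), ("B", 0, 2, 2), ("D", 2, 10, 3)]
print(place_first_fit(fragmenting, 10))
print(place_first_fit(reordered, 10))
```

In the first order, D finds 4 free units split into two gaps of 2 and cannot be placed; in the second order the same blocks fit. A constraint solver can explore such placements together with the schedule instead of fixing an order up front.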

22

Memory versus Execution Time (II)

• Adaptive, estimate-guided heuristic (criterion: memory consumption or execution time)

• Look-ahead and backtracking capabilities

[Figure: decision flow. Memory bottleneck? No: reduce execution time. Yes: reduce memory usage (backtrack, consume data).]

23

Memory versus Execution Time (II)

• Algorithmic pipelining to improve throughput

[Figure: memory versus time for Data1, Data2, and Data3 under sequential execution and under pipelining.]

24

Partial Assignment Technique (III)

• Reduce the problem size to simplify task assignment and task scheduling

• Clustering – simplifies the model

[Figure: task graph with tasks T1 to T8 and its clustered version with nodes T1, T2&T4, T3&T5, T7, and T6&T8.]

25

Partial Assignment Technique (III)

• Clustering can cause deadlock, so not all groups of tasks are allowed

• Linear groups of tasks are no longer optimal when resources such as memories are present

[Figure: tasks T1, T2, and T3, and a clustering into T1&T3 and T2.]

26

Partial Assignment Technique (III)

• Problem simplification through added constraints, not model simplification

• No deadlock problem, finer-grained simplifications

• Better than clustering with linear clusters

[Figure: tasks T1, T2, and T3 before and after adding the constraint PT1 = PT3.]
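The effect of adding an equality such as PT1 = PT3 can be sketched by counting the assignments that remain. The tasks stay separate in the model; only the set of allowed assignments shrinks. The helper name `assignments` and the two-processor setup are hypothetical, and a CP solver would propagate the equality rather than enumerate and filter as done here.

```python
from itertools import product


def assignments(tasks, processors, same_processor=()):
    """Yield task-to-processor mappings that satisfy the added equalities.

    same_processor: pairs (a, b) meaning task a and task b must share a processor.
    """
    for choice in product(processors, repeat=len(tasks)):
        mapping = dict(zip(tasks, choice))
        if all(mapping[a] == mapping[b] for a, b in same_processor):
            yield mapping


tasks = ["T1", "T2", "T3"]
processors = ["P1", "P2"]   # hypothetical architecture

full = list(assignments(tasks, processors))
reduced = list(assignments(tasks, processors, same_processor=[("T1", "T3")]))
print(len(full), len(reduced))
```

Because T1 and T3 remain distinct tasks, T2 can still be scheduled between them, which a merged T1&T3 cluster would not allow.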

27

Memory Bandwidth

• A major bottleneck in many data-dominated applications

• Processor often waits for data – latency or bandwidth

• Actual bandwidth depends on access patterns and data assignment

• Higher bandwidth? More memories, better utilization

28

Memory Architecture (IV-V)

• Most significant resource

• Bandwidth bottleneck

• Energy and timing considerations

• Complex memories such as SDRAM

• Multiple memories

[Figure: SDRAM with one port and two banks (Bank 1, Bank 2), each holding rows 1 to n, with page 1 and page 2.]

29

Memory model for SDRAM (IV-V)

• Considered as a resource since it has
  – Fixed maximal size
  – Fixed number of page buffers
  – Fixed maximal bandwidth

[Figure: labels B1 to B4, S1 to S4, P1, and P2, with tasks T1 and T2 inside a time window along the time axis.]

30

Energy vs. Execution Time (IV-V)

[Figure: data items D1 to D7.]

31

Energy vs. Execution Time (IV-V)

[Figure: data items D1 to D7 with SDRAM 1 and SDRAM 2.]

32

Energy vs. Execution Time (IV-V)

[Figure: exploration of data items D1 to D7 over SDRAM 1 and SDRAM 2, producing the Application Pareto Diagram (energy versus time).]

33

Task Pareto Diagram (IV-V)

[Figure: task Pareto diagram, energy versus time, for SDRAM 1, SDRAM 2, and SDRAM 3.]
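A Pareto diagram keeps only the execution options that are not beaten on both time and energy by another option. A minimal sketch of that filter follows; the option values are hypothetical.

```python
def pareto_front(points):
    """Keep (time, energy) points not dominated by any other point (lower is better)."""
    front = []
    # Sorted by time, then energy: a point is non-dominated exactly when its
    # energy is strictly lower than that of every point kept so far.
    for p in sorted(set(points)):
        if not front or p[1] < front[-1][1]:
            front.append(p)
    return front


# Hypothetical (time, energy) execution options of one task.
options = [(10, 50), (12, 40), (12, 45), (15, 40), (20, 20), (25, 30)]
print(pareto_front(options))
```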

36

Conflict Graph (IV-V)

• Specifies assignment constraints for different tasks’ execution options

• Memory/Page conflict edge

• Memory/Bank compatibility edge

[Figure: conflict graph for SDRAM X and SDRAM Y with memory compatibility, page conflict, and memory conflict edges.]
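One way to read the conflict graph is as a set of forbidden pairs of execution options. The sketch below checks a selection of one option per task against such pairs. The node encoding as (task, option) and the example edge are assumptions made for illustration; the slides do not define the data structure.

```python
def is_valid(selection, conflict_edges):
    """selection: {task: option}.

    conflict_edges: set of frozensets, each holding two (task, option) nodes
    that must not be chosen together.
    """
    chosen = set(selection.items())
    return not any(edge <= chosen for edge in conflict_edges)


# Hypothetical conflict: T1 and T2 must not both use SDRAM X.
conflicts = {frozenset({("T1", "SDRAM X"), ("T2", "SDRAM X")})}

print(is_valid({"T1": "SDRAM X", "T2": "SDRAM Y"}, conflicts))
print(is_valid({"T1": "SDRAM X", "T2": "SDRAM X"}, conflicts))
```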

37

Composition (IV)

[Figure: exploration of data items D1 to D7 over SDRAM 1 and SDRAM 2, producing the Application Pareto Diagram (energy versus time).]
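Composing task-level Pareto diagrams into an application-level one can be sketched as follows. The sketch assumes the tasks run one after another, so that times and energies simply add, and it ignores conflict-graph constraints between options. Both are simplifications, and all numbers are hypothetical.

```python
from itertools import product


def non_dominated(points):
    """Keep (time, energy) points that no other point beats on both axes."""
    best = []
    for p in sorted(set(points)):
        if not best or p[1] < best[-1][1]:
            best.append(p)
    return best


def compose(task_diagrams):
    """Combine one option per task; assumes sequential tasks, so costs add up."""
    totals = [
        (sum(t for t, _ in combo), sum(e for _, e in combo))
        for combo in product(*task_diagrams)
    ]
    return non_dominated(totals)


# Hypothetical (time, energy) Pareto points of two tasks.
task_a = [(2, 9), (4, 5)]
task_b = [(3, 8), (5, 6), (6, 3)]
print(compose([task_a, task_b]))
```

Of the six combinations, one is dominated and dropped; the remaining points form the application diagram under the stated assumptions.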

39

Energy vs. Execution Time (IV)

• Heuristic for trading bandwidth and assignment constraints between tasks to achieve efficient application execution

• Scheduling estimates, data assignment feasibility check

• Memory oriented application model (e.g. SDRAM)

[Figure: three energy versus time diagrams.]

40

Iterative Optimization (V)

[Figure: iterative optimization loop. Labels: Memory Structure, Application Task Graph, Pareto Composition, Scheduling Optimization, Assignment Optimization, weights adjustment, non-valid CG composition removal, suboptimal points removal, parallelization constraints, Application Pareto Diagram.]

41

Summary of CP applications in EDA

• We considered data-dominated applications and memory issues

• We showed that a CP framework can be used efficiently as an optimization framework for modeling and solving embedded system synthesis problems

• We proposed and evaluated different techniques, heuristics, and models for system level synthesis (e.g., PAT)

• We addressed resource and optimization trade-offs

42

Radek Szymanek

(Java Constraint Programming) JaCoP solver

43

Outline

• Basics

• Features

• Marketing stuff

• Typical misconceptions

• Applications

• Licensing

44

Basics

• written by only two people (Krzysztof Kuchcinski and Radoslaw Szymanek)

• entirely based on Java

• the process of developing JaCoP began in 2001

• it is under continuous development

• it has slightly more than 20 thousand lines of code

• there is no GUI (just core engine)

45

Basics

• it can easily be plugged into any other Java-based application

• it has reasonable performance

• its development was influenced by scheduling problems from the electronic design automation industry

• it has global constraints (scheduling related)

• it has already been used in several research facilities

46

Features

• global constraints (alldifferent, cumulative, diff2, element, and circuit), often available in different flavors (gcc is planned ;))

• it is rather simple to extend

• an Application Programming Interface (API) reference generated with JavaDoc is available

• a small JaCoP guide is also available
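To make two of the global constraints named above concrete, here is what alldifferent and cumulative require of a complete assignment. This is a plain Python sketch of the constraint semantics only. It is not the JaCoP API and performs no propagation, and the function names and data are hypothetical.

```python
def all_different(values):
    """alldifferent: no two variables take the same value."""
    return len(set(values)) == len(values)


def cumulative(starts, durations, demands, capacity):
    """cumulative: at every time point, the summed demand of the running tasks
    stays within the resource capacity.

    Resource usage can only increase at a task start, so checking the start
    times is enough (durations are assumed positive).
    """
    for t in set(starts):
        load = sum(
            r for s, d, r in zip(starts, durations, demands) if s <= t < s + d
        )
        if load > capacity:
            return False
    return True


print(all_different([1, 2, 3]), all_different([1, 2, 1]))
print(cumulative([0, 1, 3], [2, 2, 2], [2, 2, 3], capacity=4))
print(cumulative([0, 1, 1], [2, 2, 2], [2, 2, 1], capacity=4))
```

In a solver these constraints prune variable domains during search instead of checking finished assignments, which is where efficient consistency methods matter.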

47

Marketing stuff ;)

• it has a simple and convenient API

• it has already been tested in different situations (quite robust)

• small footprint (around 200k in jar file)

• it keeps getting better ;)

• great vehicle for research

• complex data structures/reuse of already computed information

48

Typical misconceptions

• it is Java-based so it must be really slow (garbage collector is a sweet thing)

• it must be hard to extend (extensions are easier than with other industrial solvers and can be implemented more efficiently)

• it must be hard to learn (experience with any other solver suffices; it is used for teaching in Sweden and Poland)

49

Applications

• JaCoP authors' own research published at EDA conferences and in journals (scheduling problems)

• Los Alamos National Laboratory (synthesis of FPGA-based designs)

• Kungliga Tekniska Högskolan (KTH), research in the field of Network on Chip (NoC)

• First industrial application on the way ;)

50

Licensing

• It is free for research and it will always be ;)

• commercial applications require a fee on a per-contract basis (at least 25 euro AND at least 1% of the contract value), paid when the contract is realized

• any further distribution requires notification of the JaCoP licensing terms

• any extension which does not require reverse engineering (the code is obfuscated) and keeps JaCoP in its original form is allowed

51

Licensing (Special terms for 4C)

• the source code is available, right to modify and share source code within 4C

• No possibility to distribute your own version of JaCoP outside 4C (to keep from forking – like Java)

• authors are eager to incorporate any improvements you suggest/make so you can distribute your own application with standard JaCoP library on normal terms

52

Research Ideas (cooperation)

• Global constraints (how to efficiently compute consistency methods, iterative computation)

• Visualization and explanations within global constraints, extending constraint functionality

• SALSA type of search framework for developing search methods (coarse grain search methods)

• Any of these topics is of interest to me, and a few more ideas are piled on the stack (playing with your own solver gives you plenty of ideas)

53

Thanks!

Questions?