29
S2CBench : Synthesizable SystemC Benchmark Suite for High-Level Synthesis Benjamin Carrion Schafer 1 , Ansuhree Mahapatra 2 The Hong Kong Polytechnic University Department of Electronic and Information Engineering [email protected] 1 , [email protected] 2 @DARC_lab

S2CBench : Synthesizable SystemC Benchmark Suite for · PDF fileS2CBench : Synthesizable SystemC Benchmark Suite for High-Level ... –TB sends stimuli data stored in files (test vectors)

  • Upload
    lamphuc

  • View
    246

  • Download
    1

Embed Size (px)

Citation preview

Page 1: S2CBench : Synthesizable SystemC Benchmark Suite for · PDF fileS2CBench : Synthesizable SystemC Benchmark Suite for High-Level ... –TB sends stimuli data stored in files (test vectors)

S2CBench : Synthesizable SystemC Benchmark Suite for High-Level

SynthesisBenjamin Carrion Schafer1, Ansuhree Mahapatra2

The Hong Kong Polytechnic UniversityDepartment of Electronic and Information Engineering

[email protected], [email protected]

@DARC_lab

Page 2: S2CBench : Synthesizable SystemC Benchmark Suite for · PDF fileS2CBench : Synthesizable SystemC Benchmark Suite for High-Level ... –TB sends stimuli data stored in files (test vectors)

2

Outline

• Motivation for a Synthesizable SystemC Benchmark Suite• Requirements for S2CBench• A review of past HLS benchmarks• S2CBench overview

– 12 synthesizable design– 1 non synthesizable (tests floating point and trigonometric functions)

• Benchmark composition overview• Detail benchmark characteristics

– Size– Complexity– Arithmetic operations

• How to download• Conclusions

Page 3: S2CBench : Synthesizable SystemC Benchmark Suite for · PDF fileS2CBench : Synthesizable SystemC Benchmark Suite for High-Level ... –TB sends stimuli data stored in files (test vectors)

Motivation for S2CBench

3

Vendor Tool Name Supported Languages

Cadence (Forte) Cynthesizer SystemC

Cadence C-to-Silicon C, C++, SystemC

Calypto CatapultC C++, SystemC

NEC CyberWorkBench C, SystemC

Xilinx Vivado HLS C, C++, SystemC

• Not evaluation standards

• Not enough HLS models for RTL designs to compare QoR

HLS tool evaluation

times

• Different HLS tools support different input languages

• One common language supported by all HLS tools: SystemC

HLS tool input languages

Page 4: S2CBench : Synthesizable SystemC Benchmark Suite for · PDF fileS2CBench : Synthesizable SystemC Benchmark Suite for High-Level ... –TB sends stimuli data stored in files (test vectors)

Requirements for a HLS Benchmark

Must cover different domains of applications

Test specific optimization techniques of HLS tools

Avoid re-writing descriptions for tool evaluations

4

S2CBench enables the direct comparison of commercial HLS tools

Page 5: S2CBench : Synthesizable SystemC Benchmark Suite for · PDF fileS2CBench : Synthesizable SystemC Benchmark Suite for High-Level ... –TB sends stimuli data stored in files (test vectors)

HLS Benchmarks: A Review

5

CHStone

HLS benchmark based on ANSI-C

ANSI-C not supported by all

HLS tools

C does not support fixed-

point operations

HLS92,95

Written in behavioral VHDL

Targets various optimization

options

Currently not supported by any HLS tool

MiBench, MediaBench

Written in C

Do not support fixed point operations

Do not test specific HLS

features

Page 6: S2CBench : Synthesizable SystemC Benchmark Suite for · PDF fileS2CBench : Synthesizable SystemC Benchmark Suite for High-Level ... –TB sends stimuli data stored in files (test vectors)

S2CBench: An Overview

• Covers different domains:

Automotive

Security

Telecommunication

Consumer

• Contains Control dominant and Data dominant designs

6

12+1 SystemC Benchmarks which comply with latest SystemC synthesizable subset draft

12 Synthesizable1 non

synthesizable

• Contains trigonometric & floating-point operations

• Helps check the ability of tools to support them

Page 7: S2CBench : Synthesizable SystemC Benchmark Suite for · PDF fileS2CBench : Synthesizable SystemC Benchmark Suite for High-Level ... –TB sends stimuli data stored in files (test vectors)

S2CBench: Features

7

Synthesis optimizations(e.g. loop unrolling, pipelining, function inlining, array synthesis)

Tool language support(e.g. templates, structures, fixed point data types)

Tool performance(e.g. synthesis run-time, accuracy of synthesis report)

Every application features a testbench along with associated test vectors

HLS optimizations

Synthesizer language support

Synthesizer performance

Page 8: S2CBench : Synthesizable SystemC Benchmark Suite for · PDF fileS2CBench : Synthesizable SystemC Benchmark Suite for High-Level ... –TB sends stimuli data stored in files (test vectors)

S2CBench model

• Testbench functions:– TB sends stimuli data stored in files

(test vectors) to UUT

– TB receives the data and compares it against golden output (stored in file)

– TB reports if results match or not

• Option to dump VCD file

• Test vectors are modifiable

8

Page 9: S2CBench : Synthesizable SystemC Benchmark Suite for · PDF fileS2CBench : Synthesizable SystemC Benchmark Suite for High-Level ... –TB sends stimuli data stored in files (test vectors)

S2CBench : 12+1 designs

9

Design Type Domain Optimizations Tested

qsort dd Auto/Ind Loops, arrays, functions pointers

sobel dd Auto/Ind Loops, functions, IO array expansion, multi-dimensional arrays expansion, fixed arrays (ROM, logic)

aes cipher dd Security IO array expansion, multi-dimensional arrays expansion , large fixed arrays

kasumi dd Security Multi-processes, delay report accuracy

md5c dd Security #define macros, delay report accuracy

snow3G dd Security Templates, delay report accuracy, function synthesis

adpcm cd Telecom Structure synthesis

FFT dd Telecom Floating point, trigonometric functions

FIR dd Consumer IO array expansion, arrays, loops, functions, sum of products

Decimation dd Consumer Resource sharing across loops, fixed point data types

Interp dd Consumer Polynomial decomposition, fixed point data types, sum or products

IDCT dd Consumer #include statement to initialize arrays, loops, functions,

Disparity cd/dd Consumer Hierarchical design, multi-dimensional array expansion, synthesis running time

Page 10: S2CBench : Synthesizable SystemC Benchmark Suite for · PDF fileS2CBench : Synthesizable SystemC Benchmark Suite for High-Level ... –TB sends stimuli data stored in files (test vectors)

Detail Benchmark Contents for each Design

• Top module including UUT and TB(Main.cpp)

• Design description( <benchmark>.cpp/.h)

• Testbench module(tb_<benchmark>.cpp/.h)

SystemC files

• Test vector( input.txt)

• Golden output(outputs_golden.txt)

• Image files(some applications)Stimuli data

• Allows different make options•Make: generates executable binary(default)

•Make ‘wave’: binary having VCD file

•Make ‘debug’: binary with debug version

Makefile

10

Page 11: S2CBench : Synthesizable SystemC Benchmark Suite for · PDF fileS2CBench : Synthesizable SystemC Benchmark Suite for High-Level ... –TB sends stimuli data stored in files (test vectors)

Quick – Quick Sort

• Description

sort design sorts data in ascending order using the well-known quick sort algorithm

• Optimization options to be tested

loop unrolling

array synthesis (register or memory)

function synthesis with pointer argument support

11

Page 12: S2CBench : Synthesizable SystemC Benchmark Suite for · PDF fileS2CBench : Synthesizable SystemC Benchmark Suite for High-Level ... –TB sends stimuli data stored in files (test vectors)

Sobel

• Descriptionedge-detection algorithm that takes a bitmap image

directly as the input and returns a new bitmap image solely consisting of the edges of the original image.

• Optimization options to be tested nested loop unrolling and pipelining optimizations I/O ports expansion (expand inputs specified as arrays

to individual ports)multi-dimensional arrays expansionfixed arrays synthesized as logic or ROMspointer arguments to functions

12

Page 13: S2CBench : Synthesizable SystemC Benchmark Suite for · PDF fileS2CBench : Synthesizable SystemC Benchmark Suite for High-Level ... –TB sends stimuli data stored in files (test vectors)

AES - Advanced Encryption Standard Cipher

• DescriptionAdvanced Encryption Standard Cipher encryption

algorithm performs AES encryption

• Optimization options to be tested Contains a large number of small for loops having

inter-loop data dependencies.

Input port expansion

Array synthesis (memory or registers)

Function synthesis (inline, goto)

Large fixed arrays synthesized as logic or ROMs.

13

Page 14: S2CBench : Synthesizable SystemC Benchmark Suite for · PDF fileS2CBench : Synthesizable SystemC Benchmark Suite for High-Level ... –TB sends stimuli data stored in files (test vectors)

Kasumi

• Descriptionblock cipher algorithm used in mobile

communication systems

Composed of two sc_threads and multiple functions

• Optimization options to be tested Ability to synthesize large logic operations

Accuracy of estimating the critical path

Multi-process systems verification

14

Page 15: S2CBench : Synthesizable SystemC Benchmark Suite for · PDF fileS2CBench : Synthesizable SystemC Benchmark Suite for High-Level ... –TB sends stimuli data stored in files (test vectors)

MD5C - Message Digest Algorithm

• Description

generates hash functions and check data integrity.

• Optimization options to be tested

functions synthesis

arrays of different bit widths

different levels of loop nesting

extensive use of define macros (language support)

15

Page 16: S2CBench : Synthesizable SystemC Benchmark Suite for · PDF fileS2CBench : Synthesizable SystemC Benchmark Suite for High-Level ... –TB sends stimuli data stored in files (test vectors)

Snow3G

• Description

stream cipher that produces a key stream that consists of 32-bit blocks using a 128-bit key

Contains a variable length multiplication operation

• Optimization options to be tested

Support of templates.

Loops, functions and array synthesis

16

Page 17: S2CBench : Synthesizable SystemC Benchmark Suite for · PDF fileS2CBench : Synthesizable SystemC Benchmark Suite for High-Level ... –TB sends stimuli data stored in files (test vectors)

ADPCM -Adaptive Differential Pulse-Code modulation (encoder part only)

• Description

accepts 16-bit Pulse Code Modulation (PCM) samples as input and converts them into 4-bit samples

• Optimization options to be tested

loop unrolling, function synthesis, array synthesis

support for structures synthesis

17

Page 18: S2CBench : Synthesizable SystemC Benchmark Suite for · PDF fileS2CBench : Synthesizable SystemC Benchmark Suite for High-Level ... –TB sends stimuli data stored in files (test vectors)

FFT – Fast Fourier Transform

• Description

– Converts time/space to frequency and vice versa

– Not synthesizable as per latest synthesis draft

• Optimization options to be tested

– Support for floating point operations and trigonometric functions

18

Why is it included

Most HLS vendors do support floating-point, trigonometric operations

Helps evaluation engineers analyze the extent of support of these operations

Page 19: S2CBench : Synthesizable SystemC Benchmark Suite for · PDF fileS2CBench : Synthesizable SystemC Benchmark Suite for High-Level ... –TB sends stimuli data stored in files (test vectors)

FIR – Finite Impulse Response Filter

• Description

– 10- tap FIR filter algorithm designed for 8- bit integer operations.

• Optimization options to be tested

– loop unrolling and pipelining

– automatic array expansion of the I/O ports

– pointers to functions

19

Page 20: S2CBench : Synthesizable SystemC Benchmark Suite for · PDF fileS2CBench : Synthesizable SystemC Benchmark Suite for High-Level ... –TB sends stimuli data stored in files (test vectors)

Decimation Filter

• Description– 5-stage decimation filter. Consists of 5 FIR filters

cascaded together where the output of one stage is the input to the next stage.

• Optimization options to be tested – resource sharing of the Multiply Accumulate

(MAC) operations across loops

– fixed-point data types and its different rounding and saturation modes.

– Ability to preserve the SoP constructs

20

Page 21: S2CBench : Synthesizable SystemC Benchmark Suite for · PDF fileS2CBench : Synthesizable SystemC Benchmark Suite for High-Level ... –TB sends stimuli data stored in files (test vectors)

Interpolation Filter

• Description

– 4-stage interpolation filter

• Optimization options to be tested

– automatic polynomial decompositions(Mathematical optimization of HLS tool)

– fixed- point data types and its different rounding and saturation modes

21

Page 22: S2CBench : Synthesizable SystemC Benchmark Suite for · PDF fileS2CBench : Synthesizable SystemC Benchmark Suite for High-Level ... –TB sends stimuli data stored in files (test vectors)

IDCT - Inverse Discrete Cosine Transform

• Description

– Describes a finite sequence of data points as a sum of cosine functions of different frequencies

• Optimization options to be tested

– initialization of an array using #include statement

– loops, functions, array synthesis

22

Page 23: S2CBench : Synthesizable SystemC Benchmark Suite for · PDF fileS2CBench : Synthesizable SystemC Benchmark Suite for High-Level ... –TB sends stimuli data stored in files (test vectors)

Disparity – Stereoscopic Disparity Estimator

• Description

– estimates the disparity in a stereoscopic image.

– It is the largest of all the designs and consists of 4 processes executed in parallel

• Optimization options to be tested

– Synthesis running time of the HLS tool

– Verification of Multi-process (threads) systems

23

Page 24: S2CBench : Synthesizable SystemC Benchmark Suite for · PDF fileS2CBench : Synthesizable SystemC Benchmark Suite for High-Level ... –TB sends stimuli data stored in files (test vectors)

Detailed Benchmark Characteristics

Variety of operations

24

Page 25: S2CBench : Synthesizable SystemC Benchmark Suite for · PDF fileS2CBench : Synthesizable SystemC Benchmark Suite for High-Level ... –TB sends stimuli data stored in files (test vectors)

Distribution of Number of Program Lines

25Note: Only lines of code of synthesizable description (not including testbench)

Page 26: S2CBench : Synthesizable SystemC Benchmark Suite for · PDF fileS2CBench : Synthesizable SystemC Benchmark Suite for High-Level ... –TB sends stimuli data stored in files (test vectors)

Complexity distribution of designs

26

0

5

10

15

20

25

qsort sobel aescipher

kasumi md5c snow3G adpcm fft fir decim interp idct disparity

NU

MB

ER

APPLICATIONS

Processes

Function

No. of arrays

if statement

for loop

while loop

Page 27: S2CBench : Synthesizable SystemC Benchmark Suite for · PDF fileS2CBench : Synthesizable SystemC Benchmark Suite for High-Level ... –TB sends stimuli data stored in files (test vectors)

Publicly Available for download

• www.s2cbench.org• http://sourceforge.net/projects/s2cbench/

27

Page 28: S2CBench : Synthesizable SystemC Benchmark Suite for · PDF fileS2CBench : Synthesizable SystemC Benchmark Suite for High-Level ... –TB sends stimuli data stored in files (test vectors)

More Information

B. Carrion Schafer and A.Mahapatra, "S2CBench:Synthesizable SystemC Benchmark Suite for High-Level Synthesis “IEEE Embedded Systems Letters, Accepted for publication

YouTube Channel: DARClabify

28

Page 29: S2CBench : Synthesizable SystemC Benchmark Suite for · PDF fileS2CBench : Synthesizable SystemC Benchmark Suite for High-Level ... –TB sends stimuli data stored in files (test vectors)

Summary and Conclusions

• A benchmark suite in a common language supported by all major HLS vendors

• Each benchmark tests unique HLS features1. Tool language support

2. Synthesis optimizations

3. Tool performance

• Benchmarks include testbench with inputs, golden outputs and option to generate VCD file

• Publicly available at www.s2cbench.org or sourceforge.net

29