Supercomputing Systems AG Phone +41 43 456 16 00
Technopark 1 Fax +41 43 456 16 10
8005 Zürich www.scs.ch
Vision meets reality.
Smart Automotive Sensors
with acceleration on
FPGA and CPU
Workshop at University of Applied Sciences Ulm 26.6.2018
Felix Eberli, Department Head Embedded & Automotive
2 Zürich 06.06.2018 © by Supercomputing Systems AG PUBLIC
Will robots drive our cars soon?
Already in series production as many driver assistance systems
• ACC (Distronic)
• Blind spot detection
• Brake assist
• Pedestrian detection
• Park pilot
• Stop & Go Pilot
• Highway Pilot (steering assist)
• …
• Let's see
Deep learning results
Why do we need accelerators for image processing?
• Low processing time
• Low Latency
• Low Power consumption
Software vs. Hardware
• An FPGA implementation allows massively parallel and pipelined processing
Control path vs data path
• Data path can be implemented in HW to reach acceleration
• Keep more complex control code on processor
Partitioning SW on processor vs FPGA-Fabric
Xilinx Zynq Ultrascale+
SCS Example Projects
• Acceleration on FPGA or ARM
Stereo accelerator for SGM in FPGA
Timestamping and PCIe with FPGA:
Measurement System recording 8 Cameras and 10GigE
at ~3 GByte/s
[Block diagram labels: Cam (12x Coax), NVIDIA Drive PX2 platform, SCS measurement PC + storage, 1G/10G Ethernet switch, 2x 10GBase-T, other sensors]
SCS JPEG DECODER in FPGA Fabric
• Processing rate of up to 140 MSamples/sec on Spartan6 FPGA
• 12Bit / 8Bit version available
• Four Huffman tables (fixed or extracted from header)
• Up to 8 quantization tables
• Support to decode several interleaved image stripes
• 3 color components
• Supports single-scan configuration and YUV 4:2:0 (other formats on request)
• Supports any image size up to 64k x 64k
• Supports DNL and restart markers
[Block diagram of the decoder pipeline:
JPEG image
-> Header processor (jpgd_header)
-> de-stuffed Huffman-encoded data
-> Huffman and run-length decoder (jpgd_dehuff)
-> quantized DCT coefficients
-> Dequantizer (jpgd_dequant)
-> dequantized DCT coefficients
-> De-zigzag and 2D IDCT (jpgd_idct_2d)
-> YUV image data
Side signals: quantization tables, Huffman tables]
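The dequantizer and de-zigzag stages named in the pipeline can be illustrated in software. The following is a minimal C sketch of those two steps for one 8x8 block, not the SCS FPGA implementation: the function name dequant_dezigzag and the argument layout are assumptions made for illustration. The scan table is the standard JPEG zigzag order, and the quantization table is assumed to be stored in zigzag order, as it is in a JPEG DQT segment.

```c
#include <assert.h>
#include <stdint.h>

/* Standard JPEG zigzag scan: ZIGZAG[k] is the row-major position (0..63)
 * of the k-th coefficient in scan order. */
static const uint8_t ZIGZAG[64] = {
     0,  1,  8, 16,  9,  2,  3, 10,
    17, 24, 32, 25, 18, 11,  4,  5,
    12, 19, 26, 33, 40, 48, 41, 34,
    27, 20, 13,  6,  7, 14, 21, 28,
    35, 42, 49, 56, 57, 50, 43, 36,
    29, 22, 15, 23, 30, 37, 44, 51,
    58, 59, 52, 45, 38, 31, 39, 46,
    53, 60, 61, 54, 47, 55, 62, 63
};

/* Hypothetical helper: dequantize one block and reorder it from zigzag
 * order to row-major order, ready for a 2D IDCT.
 * Both input arrays are assumed to be in zigzag order. */
static void dequant_dezigzag(const int16_t coeff_zz[64],
                             const uint16_t quant_zz[64],
                             int32_t block[64])
{
    assert(coeff_zz && quant_zz && block);
    for (int k = 0; k < 64; k++) {
        /* One multiplication per coefficient, written to its final place. */
        block[ZIGZAG[k]] = (int32_t)coeff_zz[k] * (int32_t)quant_zz[k];
    }
}
```

In hardware the same mapping is typically a small address lookup in front of a block buffer, so both steps fit naturally into one pipeline stage.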
SwissFEL - The PSI Future Project
Impressions
Automotive FPGA Solutions
Pedestrian Detection
Stereo movie processing time reduced from 2 s/frame to 25 frames/s
Stixel Detection
Photogrammetry Solution
• Significant speed-up in image post-processing time
• Significant power reduction
• Virtex 6, 8 GByte DDR3, 4x GigE, PCIe (500 MB/s)
• 5 kW blade server, 1 core: 95 s per frame
• SCS Virtex6 board (100 W), 1 board: 1.6 s per frame
Code Optimization on ARM
Preparation and Important Notes
• Prepare testbenches and carefully select your testcases
• Update the documentation of your code
• Always keep unoptimized code (compile-time switches)
• Use Git or another version control system
• Check for high-level algorithm changes before analyzing the innermost loop
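One way to keep the unoptimized code behind a compile-time switch is sketched below in C. The sum-of-absolute-differences kernel, the function names and the USE_REFERENCE_IMPL macro are assumptions chosen for illustration; the point is only that the reference version stays in the code base and can be compared against the optimized one in a testbench.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static inline uint32_t abs_diff_u8(uint8_t x, uint8_t y)
{
    return x > y ? (uint32_t)(x - y) : (uint32_t)(y - x);
}

/* Reference version: simple and unchanged, used to validate the fast one. */
static uint32_t sum_abs_diff_ref(const uint8_t *a, const uint8_t *b, size_t n)
{
    assert(a && b);
    uint32_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += abs_diff_u8(a[i], b[i]);
    return sum;
}

/* Optimized version: four independent accumulators per iteration,
 * remaining elements handled in a tail loop. */
static uint32_t sum_abs_diff_opt(const uint8_t *a, const uint8_t *b, size_t n)
{
    assert(a && b);
    uint32_t s0 = 0, s1 = 0, s2 = 0, s3 = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        s0 += abs_diff_u8(a[i],     b[i]);
        s1 += abs_diff_u8(a[i + 1], b[i + 1]);
        s2 += abs_diff_u8(a[i + 2], b[i + 2]);
        s3 += abs_diff_u8(a[i + 3], b[i + 3]);
    }
    for (; i < n; i++)
        s0 += abs_diff_u8(a[i], b[i]);
    return s0 + s1 + s2 + s3;
}

/* Compile-time switch (hypothetical macro name): build with
 * -DUSE_REFERENCE_IMPL to fall back to the unoptimized code. */
#ifdef USE_REFERENCE_IMPL
#define sum_abs_diff sum_abs_diff_ref
#else
#define sum_abs_diff sum_abs_diff_opt
#endif
```

A testbench can call both variants on the same testcases and assert that the results match before the optimized path is trusted.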
Optimization – How To
Start Profiling on the PC
• Get test setup running (with good testcases)
• Profiling on PC with valgrind and Linux perf
• Both tools can attribute execution cost to individual source lines
• This is also a good starting point to understand the code
-> Find bottlenecks independent of the target architecture
• At the moment, more profiling tools are available on the PC
• Instrument the code with timestamps to get a time profile for every algorithm
-> Good benchmark to see the progress in the project
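The timestamp instrumentation can be sketched as follows. This is a minimal C example under stated assumptions: it uses the POSIX clock_gettime call with CLOCK_MONOTONIC, which is available on Linux PCs and on ARM Linux targets, and the names stage_timer, now_ns, stage_record and stage_report are hypothetical helpers, not part of any existing library.

```c
#define _POSIX_C_SOURCE 199309L /* exposes clock_gettime in strict -std modes */
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>

#define MAX_STAGES 16

/* Accumulated runtime of one algorithm stage. */
typedef struct {
    const char *name;
    int64_t total_ns;
    uint32_t calls;
} stage_timer;

static stage_timer g_stages[MAX_STAGES];

/* Current monotonic time in nanoseconds. */
static int64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Add the time elapsed since start_ns to stage number id. */
static void stage_record(int id, const char *name, int64_t start_ns)
{
    assert(id >= 0 && id < MAX_STAGES);
    g_stages[id].name = name;
    g_stages[id].total_ns += now_ns() - start_ns;
    g_stages[id].calls++;
}

/* Print one line per stage that was executed at least once. */
static void stage_report(void)
{
    for (int i = 0; i < MAX_STAGES; i++) {
        if (g_stages[i].calls == 0)
            continue;
        printf("%-16s %10.3f ms over %u calls\n", g_stages[i].name,
               (double)g_stages[i].total_ns / 1e6,
               (unsigned)g_stages[i].calls);
    }
}
```

Usage: take `int64_t t0 = now_ns();` before a stage, call `stage_record(0, "stereo", t0);` after it, and call `stage_report()` at the end of the run. Logging the report for every optimization step gives the progress benchmark mentioned above.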
Optimization – How To
Techniques for Optimization on ARM
• More aggressive selection of the regions of interest. During development,
code is written for maximum flexibility. Check if this flexibility/coverage is still needed
• Check for repeated computation of same results
-> Balance computation and memory consumption / bandwidth
• In nested loops: fragmented code is difficult to optimize,
because it is hard to avoid computing the same values several times
-> Combine / inline function calls within nested loops into one loop
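The effect of combining fragmented code into one loop can be sketched with a small C example. The task (counting pixels inside a circle) and both function names are assumptions for illustration only. In the fragmented version a helper recomputes the row term for every pixel; in the combined version the row term and the squared radius are computed once per row and once per call.

```c
#include <assert.h>
#include <stdint.h>

/* Fragmented: a generic helper that knows nothing about the loop around it. */
static int32_t dist2(int x, int y, int cx, int cy)
{
    int32_t dx = x - cx, dy = y - cy;
    return dx * dx + dy * dy;
}

static uint32_t count_inside_fragmented(int w, int h, int cx, int cy, int r)
{
    uint32_t count = 0;
    for (int y = 0; y < h; y++)
        for (int x = 0; x < w; x++)
            if (dist2(x, y, cx, cy) <= r * r) /* dy*dy and r*r redone per pixel */
                count++;
    return count;
}

/* Combined: helper inlined by hand, repeated terms hoisted out of the
 * inner loop, rows that cannot contain a hit skipped entirely. */
static uint32_t count_inside_combined(int w, int h, int cx, int cy, int r)
{
    assert(r >= 0);
    const int32_t r2 = r * r;
    uint32_t count = 0;
    for (int y = 0; y < h; y++) {
        const int32_t dy = y - cy;
        const int32_t dy2 = dy * dy;
        if (dy2 > r2)
            continue;                 /* whole row lies outside the circle */
        const int32_t limit = r2 - dy2;
        for (int x = 0; x < w; x++) {
            const int32_t dx = x - cx;
            if (dx * dx <= limit)
                count++;
        }
    }
    return count;
}
```

A compiler may perform part of this hoisting on its own for such a small case; the manual version matters more when the helper sits in another translation unit or the repeated term is expensive.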
Optimization – How To
Techniques for Optimization - Algebra
• Replace trigonometric functions with table lookups
• Algebra simplification: simple transformations or change of computation order
e.g. transform the data into a polar coordinate system
-> a rotation becomes a simple addition instead of a rotation matrix multiplication
• Length comparisons: don't compute the square roots, directly compare the squares
• Multiplications are much faster than divisions
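The algebra techniques above can be sketched in a few lines of C. All names, the 16-step angle resolution and the Q7 fixed-point scaling are assumptions for illustration; a real table would be sized to the accuracy the algorithm needs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Table lookup instead of sin(): sin(k * 22.5 deg) * 127, rounded.
 * Sixteen entries cover the full circle. */
static const int8_t SIN_Q7[16] = {
    0, 49, 90, 117, 127, 117, 90, 49, 0, -49, -90, -117, -127, -117, -90, -49
};

static int8_t sin_q7(unsigned step) { return SIN_Q7[step & 15u]; }
static int8_t cos_q7(unsigned step) { return SIN_Q7[(step + 4u) & 15u]; }

/* Polar representation: rotating a point is an addition on its angle. */
static unsigned rotate_step(unsigned step, unsigned delta)
{
    return (step + delta) & 15u;
}

/* Length comparison without square roots: |a| < |b| exactly when
 * |a|^2 < |b|^2, because both lengths are non-negative. */
static bool shorter_than(int32_t ax, int32_t ay, int32_t bx, int32_t by)
{
    const int64_t a2 = (int64_t)ax * ax + (int64_t)ay * ay;
    const int64_t b2 = (int64_t)bx * bx + (int64_t)by * by;
    return a2 < b2;
}

/* Division by a loop-invariant value: divide once, multiply in the loop.
 * The result can differ from a true division in the last bit. */
static void scale_all(float *v, size_t n, float divisor)
{
    assert(v != NULL && divisor != 0.0f);
    const float inv = 1.0f / divisor;
    for (size_t i = 0; i < n; i++)
        v[i] *= inv;
}
```

The cosine reuses the sine table with a quarter-turn offset, so one table serves both functions.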
Optimization – How To
Techniques for Optimization
• Complete rework of smaller parts of the algorithms (clean-up)
• Data structures: cache performance suffers if data is fragmented in memory
• Replace generic code by specific code once functionality is frozen
• Don’t forget the compiler flags…
• Identify code that can be accelerated in FPGA fabric or hard-IP accelerators
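The data-structure point can be illustrated with a short C sketch. The linked list, the flatten helper and all names are assumptions for illustration: the list nodes may be scattered across memory, while the flattened array is contiguous and is read sequentially, which suits caches and prefetchers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fragmented layout: every node may live in a different cache line. */
typedef struct node {
    int32_t value;
    struct node *next;
} node;

static int64_t sum_list(const node *head)
{
    int64_t sum = 0;
    for (const node *p = head; p != NULL; p = p->next) /* pointer chasing */
        sum += p->value;
    return sum;
}

/* Copy the list into a contiguous buffer once, outside the hot loop.
 * Returns the number of values written (at most cap). */
static size_t flatten(const node *head, int32_t *out, size_t cap)
{
    assert(out != NULL);
    size_t n = 0;
    for (const node *p = head; p != NULL && n < cap; p = p->next)
        out[n++] = p->value;
    return n;
}

/* Contiguous layout: sequential access over a plain array. */
static int64_t sum_array(const int32_t *values, size_t n)
{
    int64_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += values[i];
    return sum;
}
```

Whether the one-time copy pays off depends on how often the data is traversed; profiling both variants on the target answers that.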
Example Timing of an Algorithm Before Optimization
[Bar chart: runtime in ms, axis from 0 to 120]
Example Timing of Same Algorithm After Optimization
4x faster
[Bar chart: runtime in ms, axis from 0 to 120]
Vision meets reality.
Supercomputing Systems AG
[email protected] +41 43 456 16 19