Upload
others
View
2
Download
0
Embed Size (px)
Citation preview
1
http://adas.cvc.uab.es/elektra/
Autonomous driving visual perception on the DRIVE PX2
Dr. Antonio M. López
www.cvc.uab.es/~antonio
http://adas.cvc.uab.es/elektra/
Dr. Antonio Espinosa
http://grupsderecerca.uab.cat/hpca4se/en/content/gpu
2
http://adas.cvc.uab.es/elektra/
Our background: camera-based ADAS
3
http://adas.cvc.uab.es/elektra/
Our current research:
4
http://adas.cvc.uab.es/elektra/
Stereo Vision for Depth Computation
Disparity: distance between same point in left & right
images higher disparity = Objects are closer
10 meters
5
http://adas.cvc.uab.es/elektra/
Optimize Stereo Matching: Semi-Global Matching (SGM)
Image Resolution Max. Disparity: 256
640 x 480 0.63 B ops.
1280 x 480 1.26 B ops.
1280 x 960 2.51 B ops.
Total Computation Work ( Height × Width × MaxDisp ×
Path Directions )
DEVICE (GPU)HOST
(CPU)
HOST
(CPU)
…
Input:
Left and
Right
ImagesOutput:
Disparity
Image
Matching
CostSmoothed
Cost
6
http://adas.cvc.uab.es/elektra/
GPU Implementation: Proposal
…
x
……
x
…
Cy
d
L
First level parallelism (big granularity)
Second level parallelism (medium granularity)
SGM
stencil
SGM
stencil
Third level parallelism (fine-grained)
Dependency: Serialized
Collaborative work
…
7
http://adas.cvc.uab.es/elektra/
GPU Implementation: Performance & Energy Efficiency
• Tegra X1 achieves real-time ( 20 FPS ) with > 2X efficiency than Titan X
• Newer GPUs, with higher memory bandwidth, will achieve faster solutions
Energy Efficiency
0
50
100TEGRA X1: Performance (Frames/Second, fps)
640x480
1280x480
1280x960
D= 128
# path directions
2 4 8
real-time
8
http://adas.cvc.uab.es/elektra/
GPU Implementation: Results
9
http://adas.cvc.uab.es/elektra/
• Stereo of 1280 x 960 = 1,228,800 pixels => Too much data to process
• Medium-level representation with only relevant information
• Fixed width stixels, variable number of stixels per column
• Stixel = Stick + Pixel
Stereo + Horizon Line + Road Slope Stixels
slopeHorizon
Stereo Images
Stixel World: Compact representation of the world
Obj.
Sky
Obj.
Obj.
Grnd
10
http://adas.cvc.uab.es/elektra/
Image Resolution Computation
640 x 480 147 M ops.
1280 x 480 294 M ops.
1280 x 960 1179 M ops.
Total Computation Work
( Width × Height 2 )Find best configuration
Sky: Far pixels, near 0 disparity
Object: constant disparity
Ground: close to expectedmodel
Stixel World: a Mid-level Image Representation
Computed independently for each column
Enforces constraints: no sky below horizon, no neighbors objects at the same distance…
Combinatorial explosion (of possible segments): dynamic programming technique
11
http://adas.cvc.uab.es/elektra/
Stereo Disparity
First level parallelism (big granularity): Second level parallelism (fine-grained):
• Independent / Task level: Typical CPU parallelization• Each image column is processed in parallel by a CTA• CTA = Cooperating Threads Array
CTA
···
…
CTA
h
h
GPU Implementation: Proposal
12
http://adas.cvc.uab.es/elektra/
GPU Implementation: Second Level Parallelism
CTA
step 1
C
i
C
i
C
i
C
i
step 2 step 3 …..
• Extra parallelism level needed for efficient GPU use• Sequentially perform h (image height) steps• CTA threads collaborate sharing info each step• Decreasing Parallelism: Each step uses one thread less
Second level parallelism (fine-grained):
h
Computational AnalysisThread Parallelism h×w
Compute Work per thread h
Total Global Data Reads h2×w
Total Global Data Stores h×w
step h
…..
13
http://adas.cvc.uab.es/elektra/ 13
GPU Performance (Frames/Second, fps) Energy Efficiency
• Real-time performance for energy efficient GPU: NVIDIA DRIVE PX
• NVIDIA Drive PX has better energetic efficiency than high-end GPUs
8.68
4.004.57
2.322.23 1.490
1
2
3
4
5
6
7
8
9
10
Tegra X1 Titan X
fram
es
pe
r se
con
d /
Wat
t
1280 x 240
640 x 480
1280 x 480
86.8
1000
45.7
581.0
22.3
373.0
1.0
10.0
100.0
1000.0
Tegra X1 Titan X
fram
es
pe
r se
con
d
1280 x 240
640 x 480
1280 x 480
GPU Implementation: Performance & Energy Efficiency
14
http://adas.cvc.uab.es/elektra/
GPU Implementation: Result (using SYNTHIA)
15
http://adas.cvc.uab.es/elektra/
Image generator to acquire thousands of precise data with several kinds of ground truth. • RGB & Per pixel: depth, semantic class, instance ID, optical flow• Covering popular Cityscapes classes (see www.synthia-dataset.net)
What is this for? teaching cars to perceive the environment using machine learning techniques (e.g. deep learning, reinforcement learning).
16
http://adas.cvc.uab.es/elektra/