16
1 http://adas.cvc.uab.es/elektra/ Autonomous driving visual perception on the DRIVE PX2 Dr. Antonio M. López www.cvc.uab.es/~ antonio http://adas.cvc.uab.es/elektra/ Dr. Antonio Espinosa http://grupsderecerca.uab.cat/hpca4se/en/content/gpu

Autonomous driving visual perception on the DRIVE PX2on-demand.gputechconf.com/gtc/2017/presentation/s7848-antonio-e… · • Fixed width stixels, variable number of stixels per

  • Upload
    others

  • View
    2

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Autonomous driving visual perception on the DRIVE PX2on-demand.gputechconf.com/gtc/2017/presentation/s7848-antonio-e… · • Fixed width stixels, variable number of stixels per

1

http://adas.cvc.uab.es/elektra/

Autonomous driving visual perception on the DRIVE PX2

Dr. Antonio M. López

www.cvc.uab.es/~antonio

http://adas.cvc.uab.es/elektra/

Dr. Antonio Espinosa

http://grupsderecerca.uab.cat/hpca4se/en/content/gpu

Page 2: Autonomous driving visual perception on the DRIVE PX2on-demand.gputechconf.com/gtc/2017/presentation/s7848-antonio-e… · • Fixed width stixels, variable number of stixels per

2

http://adas.cvc.uab.es/elektra/

Our background: camera-based ADAS

Page 3: Autonomous driving visual perception on the DRIVE PX2on-demand.gputechconf.com/gtc/2017/presentation/s7848-antonio-e… · • Fixed width stixels, variable number of stixels per

3

http://adas.cvc.uab.es/elektra/

Our current research:

Page 4: Autonomous driving visual perception on the DRIVE PX2on-demand.gputechconf.com/gtc/2017/presentation/s7848-antonio-e… · • Fixed width stixels, variable number of stixels per

4

http://adas.cvc.uab.es/elektra/

Stereo Vision for Depth Computation

Disparity: distance between same point in left & right

images higher disparity = Objects are closer

10 meters

Page 5: Autonomous driving visual perception on the DRIVE PX2on-demand.gputechconf.com/gtc/2017/presentation/s7848-antonio-e… · • Fixed width stixels, variable number of stixels per

5

http://adas.cvc.uab.es/elektra/

Optimize Stereo Matching: Semi-Global Matching (SGM)

Image Resolution Max. Disparity: 256

640 x 480 0.63 B ops.

1280 x 480 1.26 B ops.

1280 x 960 2.51 B ops.

Total Computation Work ( Height × Width × MaxDisp ×

Path Directions )

DEVICE (GPU)HOST

(CPU)

HOST

(CPU)

Input:

Left and

Right

ImagesOutput:

Disparity

Image

Matching

CostSmoothed

Cost

Page 6: Autonomous driving visual perception on the DRIVE PX2on-demand.gputechconf.com/gtc/2017/presentation/s7848-antonio-e… · • Fixed width stixels, variable number of stixels per

6

http://adas.cvc.uab.es/elektra/

GPU Implementation: Proposal

x

……

x

Cy

d

L

First level parallelism (big granularity)

Second level parallelism (medium granularity)

SGM

stencil

SGM

stencil

Third level parallelism (fine-grained)

Dependency: Serialized

Collaborative work

Page 7: Autonomous driving visual perception on the DRIVE PX2on-demand.gputechconf.com/gtc/2017/presentation/s7848-antonio-e… · • Fixed width stixels, variable number of stixels per

7

http://adas.cvc.uab.es/elektra/

GPU Implementation: Performance & Energy Efficiency

• Tegra X1 achieves real-time ( 20 FPS ) with > 2X efficiency than Titan X

• Newer GPUs, with higher memory bandwidth, will achieve faster solutions

Energy Efficiency

0

50

100TEGRA X1: Performance (Frames/Second, fps)

640x480

1280x480

1280x960

D= 128

# path directions

2 4 8

real-time

Page 8: Autonomous driving visual perception on the DRIVE PX2on-demand.gputechconf.com/gtc/2017/presentation/s7848-antonio-e… · • Fixed width stixels, variable number of stixels per

8

http://adas.cvc.uab.es/elektra/

GPU Implementation: Results

Page 9: Autonomous driving visual perception on the DRIVE PX2on-demand.gputechconf.com/gtc/2017/presentation/s7848-antonio-e… · • Fixed width stixels, variable number of stixels per

9

http://adas.cvc.uab.es/elektra/

• Stereo of 1280 x 960 = 1,228,800 pixels => Too much data to process

• Medium-level representation with only relevant information

• Fixed width stixels, variable number of stixels per column

• Stixel = Stick + Pixel

Stereo + Horizon Line + Road Slope Stixels

slopeHorizon

Stereo Images

Stixel World: Compact representation of the world

Obj.

Sky

Obj.

Obj.

Grnd

Page 10: Autonomous driving visual perception on the DRIVE PX2on-demand.gputechconf.com/gtc/2017/presentation/s7848-antonio-e… · • Fixed width stixels, variable number of stixels per

10

http://adas.cvc.uab.es/elektra/

Image Resolution Computation

640 x 480 147 M ops.

1280 x 480 294 M ops.

1280 x 960 1179 M ops.

Total Computation Work

( Width × Height 2 )Find best configuration

Sky: Far pixels, near 0 disparity

Object: constant disparity

Ground: close to expectedmodel

Stixel World: a Mid-level Image Representation

Computed independently for each column

Enforces constraints: no sky below horizon, no neighbors objects at the same distance…

Combinatorial explosion (of possible segments): dynamic programming technique

Page 11: Autonomous driving visual perception on the DRIVE PX2on-demand.gputechconf.com/gtc/2017/presentation/s7848-antonio-e… · • Fixed width stixels, variable number of stixels per

11

http://adas.cvc.uab.es/elektra/

Stereo Disparity

First level parallelism (big granularity): Second level parallelism (fine-grained):

• Independent / Task level: Typical CPU parallelization• Each image column is processed in parallel by a CTA• CTA = Cooperating Threads Array

CTA

···

CTA

h

h

GPU Implementation: Proposal

Page 12: Autonomous driving visual perception on the DRIVE PX2on-demand.gputechconf.com/gtc/2017/presentation/s7848-antonio-e… · • Fixed width stixels, variable number of stixels per

12

http://adas.cvc.uab.es/elektra/

GPU Implementation: Second Level Parallelism

CTA

step 1

C

i

C

i

C

i

C

i

step 2 step 3 …..

• Extra parallelism level needed for efficient GPU use• Sequentially perform h (image height) steps• CTA threads collaborate sharing info each step• Decreasing Parallelism: Each step uses one thread less

Second level parallelism (fine-grained):

h

Computational AnalysisThread Parallelism h×w

Compute Work per thread h

Total Global Data Reads h2×w

Total Global Data Stores h×w

step h

…..

Page 13: Autonomous driving visual perception on the DRIVE PX2on-demand.gputechconf.com/gtc/2017/presentation/s7848-antonio-e… · • Fixed width stixels, variable number of stixels per

13

http://adas.cvc.uab.es/elektra/ 13

GPU Performance (Frames/Second, fps) Energy Efficiency

• Real-time performance for energy efficient GPU: NVIDIA DRIVE PX

• NVIDIA Drive PX has better energetic efficiency than high-end GPUs

8.68

4.004.57

2.322.23 1.490

1

2

3

4

5

6

7

8

9

10

Tegra X1 Titan X

fram

es

pe

r se

con

d /

Wat

t

1280 x 240

640 x 480

1280 x 480

86.8

1000

45.7

581.0

22.3

373.0

1.0

10.0

100.0

1000.0

Tegra X1 Titan X

fram

es

pe

r se

con

d

1280 x 240

640 x 480

1280 x 480

GPU Implementation: Performance & Energy Efficiency

Page 14: Autonomous driving visual perception on the DRIVE PX2on-demand.gputechconf.com/gtc/2017/presentation/s7848-antonio-e… · • Fixed width stixels, variable number of stixels per

14

http://adas.cvc.uab.es/elektra/

GPU Implementation: Result (using SYNTHIA)

Page 15: Autonomous driving visual perception on the DRIVE PX2on-demand.gputechconf.com/gtc/2017/presentation/s7848-antonio-e… · • Fixed width stixels, variable number of stixels per

15

http://adas.cvc.uab.es/elektra/

Image generator to acquire thousands of precise data with several kinds of ground truth. • RGB & Per pixel: depth, semantic class, instance ID, optical flow• Covering popular Cityscapes classes (see www.synthia-dataset.net)

What is this for? teaching cars to perceive the environment using machine learning techniques (e.g. deep learning, reinforcement learning).

Page 16: Autonomous driving visual perception on the DRIVE PX2on-demand.gputechconf.com/gtc/2017/presentation/s7848-antonio-e… · • Fixed width stixels, variable number of stixels per

16

http://adas.cvc.uab.es/elektra/