2018-11-22

GPU’s Integral Role in AI and Autonomous Driving

Zhang Ning, Director of Engineering & Site Lead of China HQ

Pony.ai at a glance

Our mission: Autonomous driving everywhere, by building the safest and most reliable self-driving technology

Tech: Full-stack autonomous driving (L4-L5) technology and full system optimization

Support: Close partnerships with major OEMs, manufacturers, and local governments, e.g. Guangzhou, Beijing, Shanghai

Confidential

Global presence with fleet operations in China and the U.S.

BEIJING

GUANGZHOU

• Opened office in Mar 2017

• Awarded T3 permit for AV public road tests in Beijing, June 2018

• Soft-launched consumer AV ride-sharing fleet in Nansha, Feb 2018

• Strategic partnership with Chinese OEM GAC Group

• Established Pony.ai AI Research Institute

FREMONT

• Established in Dec 2016

• R&D hub with public road testing since Jun 2017

Led by a technical and experienced executive team

James Peng

Co-founder & CEO

• Previous Chief Architect at Baidu, leading AV, ads, infrastructure, big data, etc.

• Google Software Engineer for 7 years; awarded Google Founders’ Award

• PhD from Stanford, B.S. from Tsinghua University

Tiancheng Lou

Co-founder & CTO

• TopCoder champion & medalist for 10 consecutive years; 2x Google Code Jam champion

• 3+ years at Google X Self-Driving Car Project

• PhD and B.S. from Tsinghua University

Andrew Yao

Chief Advisor

• Recipient of the Turing Award, the “Nobel Prize of Computing”

• Dean of the Institute for Interdisciplinary Information Sciences (IIIS) at Tsinghua

• Member of the US National Academy of Sciences and the Chinese Academy of Sciences

• PhDs from Harvard and UIUC

Harry Hu

COO

• Former TMT lead of ICBC International Investments

• Previous CFO and VP at QKM Robotics

• Mentor of MIT Innovation Initiative in Hong Kong

• M.S. from MIT Sloan

Best-in-class technology driven by system engineering approach

Pony.ai System Engineering

• Sensors & Hardware: LiDAR, High-Res Camera, Radar, GNSS/IMU, On-vehicle compute

• Autonomous driving software modules: Sensor fusion, Perception, Prediction, Path planning, Control

• HD Mapping & Localization

• Vehicle actuation

• In-house SW infrastructure

Algorithms and infrastructure fully built in-house

• Real-time system and algorithms

• Simulation system

• PonyBrain: proprietary operating platform

Full-stack solution ranging from perception to HD mapping to data management

Customized computing architecture for optimized performance

Deep and adaptive sensor fusion technology

Computational geometry; combinatorial optimization; data platform

Continuous product development – three generations to date

1st GENERATION (Q2 2017)

SENSORS

• Early industrial lidar, front-facing cameras

PERFORMANCE

• Semi-urban, public road environment

• Basic public road scenarios mastered

2nd GENERATION (Q1 2018)

SENSORS

• Lidar upgrade, multiple cameras & radars

PERFORMANCE

• Introduction of LiDAR / camera sensor fusion

• Mastery of more weather conditions, e.g. rain

3rd GENERATION: PonyAlpha (Q3 2018)

SENSORS

• Multiple lidars (high & lower-res), cameras, and radars

• Customized computing

PERFORMANCE

• Deep and flexible sensor fusion based on environment

• Handling of more complex and crowded road scenarios

• Blind spot elimination

PonyAlpha: our most advanced AV system to date

Sensor suite

• 3 LiDARs, range 100m

• 4 wide angle cameras, range 80m

• Front mid-range camera, range 150m

• Front long range camera, range 200m

• Front radar, range 200m

• 2 rear radars, range 200m

• Rear radar: effective and safe lane changing

• Adding up to 8 radars around vehicle

Sensor fusion

• Improved time synchronization for full sensor suite

• Flexible fusion foundation combines most reliable data from sensors

360°, 200m sensor coverage

• Greater camera coverage for greater object detection, classification

• Long range radar enables faster driving speed

• Coverage of near-vehicle blind spot enables varied environments: industrial, residential, commercial

PonyAlpha is the product of extensive cross-site testing to determine the best overall solution

PonyAlpha fleet launched during 2018 WAIC

Raw data-level sensor fusion

June, 2017

From early mastery of complex road scenarios in US...

July, 2017

...to complex scenarios typical in China

Dec, 2017

Even extreme scenarios like this…

Or this…

Powered by deep learning data and model workflow

• Data collection from log

• Auto + manual data selection and labeling

• Model training, evaluation and deployment

• On-vehicle test

Real-time inference for onboard perception module

• Deploy Caffe and TensorFlow models using TensorRT

• Real-time object detection on single GPU (~70ms perception latency on 1 GPU)

• >99.9% obstacle recall rate within 60m range

• Traffic light accuracy > 99.99% in all mapped areas
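A range-limited recall figure such as the one quoted above could be computed along the following lines. This is a minimal sketch, not Pony.ai's evaluation code: the function name `recall_within_range`, the input layout, and the sample frame are all assumptions made for illustration.

```python
import math

def recall_within_range(ground_truth, detected_ids, max_range_m=60.0):
    """Fraction of labeled obstacles within max_range_m that were detected.

    ground_truth: list of (obstacle_id, x_m, y_m) in the vehicle frame (assumed layout).
    detected_ids: set of obstacle ids that the perception output matched.
    """
    in_range = [oid for oid, x, y in ground_truth
                if math.hypot(x, y) <= max_range_m]
    if not in_range:
        return None  # recall is undefined when nothing is in range
    hits = sum(1 for oid in in_range if oid in detected_ids)
    return hits / len(in_range)

# Hypothetical labeled frame: four obstacles, one of them beyond 60 m.
labels = [("a", 10.0, 0.0), ("b", 30.0, 40.0), ("c", -5.0, 12.0), ("d", 70.0, 0.0)]
detections = {"a", "b", "d"}
recall = recall_within_range(labels, detections)
print(recall)
```

Obstacle "d" is detected but lies outside the 60 m radius, so it does not count toward the metric; only "a", "b", and "c" form the denominator.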

[Diagram: onboard perception pipeline with blocks Camera Driver, LiDAR Driver, Data Fusion, Semantic Segmentation, Instance Segmentation, Obstacle Classification, Traffic Light Classification, and Tracking, with output To Planner]

Deep learning adapted to perception use

[Diagram with two panels, "Loss from deep learning" and "Fitting deep learning to perception"; labels: object classification, regression, classification, pixel labeling, object detection]

HD mapping on GPUs

• Tile-based 3D occupancy grids

• Point cloud SLAM with GPU accelerated point set registration

• GPU laser tracing in occupancy grids
  • Scatter: distribute rays to tiles
  • Gather: per-cell occupancy estimation for each tile

• ~15x overall speedup (16-core CPU vs. 6 GPUs)

• Performance limited by disk I/O after GPU acceleration
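The scatter and gather steps named above can be sketched on a small 2D grid in plain Python. This is an illustrative CPU sketch under stated assumptions, not the GPU implementation from the talk: the cell size, tile size, sampling-based ray walk, and the hit/pass occupancy ratio are all hypothetical choices. The point it shows is that, once rays are bucketed by tile, each tile can be processed independently.

```python
from collections import defaultdict

CELL_M = 1.0      # cell edge length in meters (hypothetical)
TILE_CELLS = 4    # cells per tile edge (hypothetical)

def cells_on_ray(origin, end, step=0.25):
    """Ordered cells visited by sampling the segment origin -> end."""
    (ox, oy), (ex, ey) = origin, end
    length = ((ex - ox) ** 2 + (ey - oy) ** 2) ** 0.5
    n = max(1, int(length / (step * CELL_M)))
    cells = []
    for i in range(n + 1):
        t = i / n
        c = (int((ox + t * (ex - ox)) // CELL_M),
             int((oy + t * (ey - oy)) // CELL_M))
        if not cells or cells[-1] != c:
            cells.append(c)
    return cells

def tile_of(cell):
    return (cell[0] // TILE_CELLS, cell[1] // TILE_CELLS)

def scatter(rays):
    """Scatter: bucket each ray index under every tile the ray crosses."""
    buckets = defaultdict(list)
    for idx, (origin, end) in enumerate(rays):
        for tile in {tile_of(c) for c in cells_on_ray(origin, end)}:
            buckets[tile].append(idx)
    return buckets

def gather(tile, ray_ids, rays):
    """Gather: per-cell occupancy for one tile, independent of other tiles."""
    counts = defaultdict(lambda: [0, 0])  # cell -> [hits, passes]
    for idx in ray_ids:
        cells = cells_on_ray(*rays[idx])
        for c in cells[:-1]:              # cells the ray passed through
            if tile_of(c) == tile:
                counts[c][1] += 1
        if tile_of(cells[-1]) == tile:    # cell where the ray ended (a hit)
            counts[cells[-1]][0] += 1
    return {c: h / (h + p) for c, (h, p) in counts.items()}

# Three hypothetical rays from the same origin along the x axis.
rays = [((0.5, 0.5), (6.5, 0.5)),
        ((0.5, 0.5), (6.5, 0.5)),
        ((0.5, 0.5), (2.5, 0.5))]
buckets = scatter(rays)
grid = {}
for tile, ray_ids in buckets.items():     # each iteration could run in parallel
    grid.update(gather(tile, ray_ids, rays))
print(sorted(grid.items()))
```

On a GPU, the loop over tiles would map to independent thread blocks, which is one plausible reason for organizing the map into tiles in the first place.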

GPU-based localization

• Online localization with GPU scanmatch:
  • Parallelize point cloud arithmetic calculation
  • Parallelize memory access
  • Asynchronous map loading

• Results also combine other sensor input (e.g. GPS, IMU, etc.)

Latency            CPU     GPU     Speedup
Mean               60ms    4.5ms   13x
99th percentile    97ms    13ms    7.5x
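To make the scan-matching idea concrete, here is a minimal 2D sketch in plain Python. It is an assumption-laden illustration rather than the method used in the talk: the exhaustive x/y search, the hit-count score, the 0.5 m cell size, and the names `score_pose` and `scan_match` are all hypothetical. What it does show is why the problem parallelizes well: every scan point and every candidate pose can be evaluated independently.

```python
import math

def score_pose(scan, occupied, pose, cell_m=0.5):
    """Count scan points that land on occupied map cells after applying pose."""
    x0, y0, yaw = pose
    c, s = math.cos(yaw), math.sin(yaw)
    hits = 0
    for px, py in scan:  # each point is independent of the others
        wx = x0 + c * px - s * py
        wy = y0 + s * px + c * py
        if (math.floor(wx / cell_m), math.floor(wy / cell_m)) in occupied:
            hits += 1
    return hits

def scan_match(scan, occupied, prior, search_m=1.0, step_m=0.5):
    """Exhaustive search over x/y offsets around a prior pose (yaw held fixed)."""
    n = int(round(search_m / step_m))
    candidates = [(prior[0] + i * step_m, prior[1] + j * step_m, prior[2])
                  for i in range(-n, n + 1) for j in range(-n, n + 1)]
    return max(candidates, key=lambda p: score_pose(scan, occupied, p))

# Hypothetical map: an L-shaped wall, stored as occupied cell indices.
CELL = 0.5
wall = [(8, j) for j in range(6)] + [(i, 6) for i in range(3, 9)]
occupied = set(wall)

# Simulate a scan taken at a known pose, then recover it from a rough prior.
true_pose = (2.0, 1.0, 0.0)
world_pts = [((i + 0.5) * CELL, (j + 0.5) * CELL) for i, j in wall]
scan = [(wx - true_pose[0], wy - true_pose[1]) for wx, wy in world_pts]
best = scan_match(scan, occupied, prior=(1.5, 1.5, 0.0))
print(best)
```

In practice the prior would come from the other sensors mentioned above (GPS, IMU), which keeps the search window small.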

Road test: challenging localization scenarios

Road tests: night scenarios

Onboard occupancy grid estimation on GPU

• Online LiDAR laser ray tracing

• LiDAR laser points are grouped by sectors

• One GPU thread per occupancy cell

• ~13x speedup

• Results are consumed by the perception module
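The "points grouped by sectors, one thread per cell" layout can be sketched as follows. This is a simplified 2D illustration in plain Python, not the onboard CUDA code: keeping only the nearest return per sector, the half-cell tolerance, and the three labels are assumptions. The function `classify_cell` stands in for the work a single GPU thread would do.

```python
import math

def sector_of(x, y, n_sectors):
    """Azimuth sector index of a point in the vehicle frame."""
    width = 2 * math.pi / n_sectors
    return int((math.atan2(y, x) % (2 * math.pi)) / width) % n_sectors

def nearest_return_per_sector(points, n_sectors):
    """Group LiDAR returns by sector, keeping the closest range in each."""
    nearest = [math.inf] * n_sectors
    for x, y in points:
        s = sector_of(x, y, n_sectors)
        nearest[s] = min(nearest[s], math.hypot(x, y))
    return nearest

def classify_cell(ix, iy, nearest, cell_m, n_sectors):
    """Work of one 'thread': label one cell from its sector's nearest return."""
    cx, cy = (ix + 0.5) * cell_m, (iy + 0.5) * cell_m
    hit = nearest[sector_of(cx, cy, n_sectors)]
    r = math.hypot(cx, cy)
    if math.isinf(hit):
        return "unknown"      # no return observed in this sector
    if r < hit - cell_m / 2:
        return "free"         # the beam passed through before hitting anything
    if abs(r - hit) <= cell_m / 2:
        return "occupied"     # the cell sits at the return range
    return "unknown"          # hidden behind the first return

# One hypothetical return, located at the center of cell (5, 0).
N_SECTORS, CELL_M = 8, 1.0
nearest = nearest_return_per_sector([(5.5, 0.5)], N_SECTORS)
grid = {(ix, iy): classify_cell(ix, iy, nearest, CELL_M, N_SECTORS)
        for ix in range(8) for iy in range(4)}
print(grid[(4, 0)], grid[(5, 0)], grid[(7, 0)])
```

Because each cell reads only its own sector's summary and writes only its own label, no synchronization between cells is needed, which is what makes a thread-per-cell mapping attractive.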

GPU image codec for camera data recording

• 6 HD cameras with data rate ~360MB/s

• ~35% CPU cost reduction

• Consumes ~4% of overall GPU time in our system
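For a sense of scale, the stated aggregate rate works out as follows. This is simple arithmetic on the two figures from the slide; decimal units (1 TB = 1,000,000 MB) and an even split across cameras are assumptions.

```python
TOTAL_MB_PER_S = 360   # from the slide: ~360 MB/s across all cameras
NUM_CAMERAS = 6        # from the slide

# Assumes the load is spread evenly across cameras.
per_camera_mb_s = TOTAL_MB_PER_S / NUM_CAMERAS

# Volume produced in one hour at the stated rate, in decimal terabytes.
tb_per_hour = TOTAL_MB_PER_S * 3600 / 1_000_000

print(f"{per_camera_mb_s:.0f} MB/s per camera")        # 60 MB/s per camera
print(f"{tb_per_hour:.3f} TB per hour at this rate")   # 1.296 TB per hour at this rate
```

At more than a terabyte per hour of driving, encoding before recording is hard to avoid, and doing it on the GPU frees the CPU for other work.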

Real-time onboard task scheduling

• Asynchronous CPU-GPU task scheduling

• Optimized for end-to-end latency from sensor trigger to control signal

• Overall GPU utilization > 80% (1 GPU on vehicle)

• Maximum reaction time < 100ms

CPU timeline: Planning & Control | Camera/Lidar/Fusion Perception | Planning & Control

GPU timeline: Localization | Camera/Lidar/Fusion Perception | Localization
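The benefit of overlapping CPU and GPU work, as in the timeline above, can be illustrated with a small scheduling model. This is a sketch only: the stage durations are invented for the example and are not measurements from the talk, and `makespan` is a hypothetical helper that places each task at its earliest possible start on its resource.

```python
def makespan(tasks):
    """Earliest-start schedule.

    tasks: list of (name, resource, duration_ms, deps), listed in an order
    where every dependency appears before the task that needs it.
    Returns (total_ms, finish_time_by_task).
    """
    finish, free_at = {}, {}
    for name, resource, duration, deps in tasks:
        start = max([free_at.get(resource, 0.0)] + [finish[d] for d in deps])
        finish[name] = start + duration
        free_at[resource] = finish[name]
    return max(finish.values()), finish

# Two consecutive frames; all durations are hypothetical.
frames = [
    ("loc1",  "gpu", 5,  []),
    ("perc1", "gpu", 40, ["loc1"]),
    ("plan1", "cpu", 30, ["perc1"]),
    ("loc2",  "gpu", 5,  []),
    ("perc2", "gpu", 40, ["loc2"]),
    ("plan2", "cpu", 30, ["perc2"]),
]

# Asynchronous: CPU and GPU are separate resources and may overlap.
async_ms, async_finish = makespan(frames)

# Blocking: model CPU and GPU as one shared resource, so nothing overlaps.
blocking = [(n, "shared", d, deps) for n, _, d, deps in frames]
blocking_ms, _ = makespan(blocking)

print(async_ms, blocking_ms)
```

In this toy model the GPU starts localizing the second frame while the CPU is still planning for the first, so the two-frame total shrinks even though no individual stage became faster.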

New challenges and research opportunities

• Peta-scale sensor data collection, selection, labeling and training

• City-scale HD mapping at daily frequency

• Software-hardware co-design for minimal end-to-end system latency

• Real-time sensor, system and dynamics simulation for rapid iteration

THANK YOU!
