2018-11-22
GPU’s Integral Role in AI and Autonomous Driving
Zhang Ning, Director of Engineering & Site Lead of China HQ
Pony.ai at a glance
Our mission: Autonomous Driving Everywhere, by building the safest and most reliable self-driving technology
Tech: Full-stack autonomous driving (L4-L5) technology and full system optimization
Support: Close partnerships with major OEMs, manufacturers, and local governments, e.g. Guangzhou, Beijing, Shanghai
Confidential
Global presence with fleet operations in China and the U.S.
BEIJING
GUANGZHOU
• Opened office in Mar 2017
• Awarded T3 permit for AV public road tests in Beijing, June 2018
• Soft-launched consumer AV ride-sharing fleet in Nansha, Feb 2018
• Strategic partnership with Chinese OEM GAC Group
• Established Pony.ai AI Research Institute
FREMONT
• Established in Dec 2016
• R&D hub with public road testing since Jun 2017
Led by a technical and experienced executive team
James Peng, Co-founder & CEO
• Previous Chief Architect at Baidu, leading AV, ads, infrastructure, big data, etc.
• Google Software Engineer for 7 years; awarded Google Founders’ Award
• PhD from Stanford, B.S. from Tsinghua University
Tiancheng Lou, Co-founder & CTO
• TopCoder champion & medalist for 10 consecutive years; 2x Google Code Jam champion
• 3+ years at Google X Self-Driving Car Project
• PhD and B.S. from Tsinghua University
Andrew Yao, Chief Advisor
• Recipient of the Turing Award, the “Nobel Prize of Computing”
• Dean of the Institute for Interdisciplinary Information Sciences (IIIS) at Tsinghua
• Member of the US National Academy of Sciences and the Chinese Academy of Sciences
• PhDs from Harvard and UIUC
Harry Hu, COO
• Former TMT lead of ICBC International Investments
• Previous CFO and VP at QKM Robotics
• Mentor of MIT Innovation Initiative in Hong Kong
• M.S. from MIT Sloan
Best-in-class technology driven by system engineering approach
Pony.ai System Engineering
• Autonomous driving software modules: Sensor fusion, Perception, Prediction, Path planning, Control
• HD Mapping & Localization
• Sensors & Hardware: LiDAR, High-Res Camera, Radar, GNSS/IMU, On-vehicle compute, Vehicle actuation
• In-house SW infrastructure
Algorithms and infrastructure fully built in-house
• Real-time system and algorithms
• Simulation system
• PonyBrain - proprietary operating platform
Full-stack solution ranging from perception to HD mapping to data management
Customized computing architecture for optimized performance
Deep and adaptive sensor fusion technology
Computational geometry; combinatorial optimization; data platform
Continuous product development – three generations to date
1st GENERATION (Q2 2017)
SENSORS
• Early industrial lidar, front-facing cameras
PERFORMANCE
• Semi-urban, public road environment
• Basic public road scenarios mastered
2nd GENERATION (Q1 2018)
SENSORS
• Lidar upgrade, multiple cameras & radars
PERFORMANCE
• Introduction of LiDAR / camera sensor fusion
• Mastery of more weather conditions, e.g. rain
3rd GENERATION: PonyAlpha (Q3 2018)
SENSORS
• Multiple lidars (high & lower-res), cameras, and radars
• Customized computing
PERFORMANCE
• Deep and flexible sensor fusion based on environment
• Handling of more complex and crowded road scenarios
• Blind spot elimination
PonyAlpha: our most advanced AV system to date
Sensors
• 3 LiDARs, range 100m
• Front long range camera, range 200m
• Front mid-range camera, range 150m
• 4 wide angle cameras, range 80m
• Front radar, range 200m
• 2 rear radars, range 200m (rear radar: effective and safe lane changing)
• Adding up to 8 radars around vehicle
Sensor fusion
• Improved time synchronization for full sensor suite
• Flexible fusion foundation combines most reliable data from sensors
360°, 200m sensor coverage
• Greater camera coverage for greater object detection, classification
• Long range radar enables faster driving speed
• Coverage of near-vehicle blind spot enables varied environments: industrial, residential, commercial
PonyAlpha is a product of extensive cross-site testing to determine the best overall solution
PonyAlpha fleet launched during 2018 WAIC
Raw data-level sensor fusion
June, 2017
From early mastery of complex road scenarios in US...
July, 2017
...to complex scenarios typical in China
Dec, 2017
Even extreme scenarios like this…
Or this…
Powered by deep learning data and model workflow
• Data collection from log
• Auto + manual data selection and labeling
• Model training, evaluation and deployment
• On-vehicle test
Real-time inference for onboard perception module
• Deploy Caffe and TensorFlow models using TensorRT
• Real-time object detection on single GPU (~70ms perception latency on 1 GPU)
• >99.9% obstacle recall rate within 60m range
• Traffic light accuracy > 99.99% in all mapped areas
Pipeline blocks: Camera Driver, LiDAR Driver, Data Fusion, Semantic Segmentation, Instance Segmentation, Obstacle Classification, Traffic Light Classification, Tracking, To Planner
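To make the recall figure above concrete, here is a minimal sketch of how a range-limited recall could be computed. The input format (a list of labeled obstacles with a distance and a detected flag) and the function name are assumptions for illustration; the slides do not describe the evaluation tooling.

```python
def recall_within(ground_truth, max_range_m):
    """Recall over labeled obstacles no farther than max_range_m.

    ground_truth: list of (distance_m, detected) pairs, one per labeled obstacle
    (hypothetical format). Returns None when no obstacle is in range.
    """
    in_range = [detected for distance_m, detected in ground_truth
                if distance_m <= max_range_m]
    if not in_range:
        return None
    # True counts as 1, so the sum is the number of detected obstacles.
    return sum(in_range) / len(in_range)


# Toy labels: the 80m obstacle is outside the 60m evaluation range.
labels = [(10.0, True), (30.0, True), (59.0, False), (80.0, False)]
recall = recall_within(labels, 60.0)
```

Missed obstacles beyond the range limit do not count against the metric, which is why the range is part of the claim.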
Deep learning adapted to perception use
Loss from deep learning: Regression, Classification, Pixel labeling
Fitting deep learning to perception: Object classification, Object detection
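As an illustration of how classification and regression losses can be combined for object detection, the sketch below adds a cross-entropy term to a smooth L1 box term. This is a common textbook formulation chosen for illustration, not something the slides specify; `box_weight` and all function names are assumptions.

```python
import math


def cross_entropy(class_probs, label):
    """Classification loss: negative log-probability of the true class."""
    return -math.log(class_probs[label])


def smooth_l1(pred, target):
    """Regression loss: quadratic for small errors, linear for large ones."""
    total = 0.0
    for p, t in zip(pred, target):
        d = abs(p - t)
        total += 0.5 * d * d if d < 1.0 else d - 0.5
    return total


def detection_loss(class_probs, label, box_pred, box_target, box_weight=1.0):
    """Weighted sum of the classification and box-regression terms."""
    return cross_entropy(class_probs, label) + box_weight * smooth_l1(box_pred, box_target)


loss = detection_loss([0.5, 0.5], 0, [0.0, 0.0], [0.5, 2.0])
```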
HD mapping on GPUs
• Tile-based 3D occupancy grids
• Point cloud SLAM with GPU accelerated point set registration
• GPU laser tracing in occupancy grids
  • Scatter - Distribute rays to tiles
  • Gather - Per-cell occupancy estimation for each tile
• ~15x overall speedup (16-core CPU vs. 6 GPUs)
• Performance limited by disk I/O after GPU acceleration
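The scatter and gather steps above can be sketched on the CPU as follows. This is a simplified 2D illustration under stated assumptions: the tile size, the fixed-step ray sampling, and the hits / (hits + misses) estimate are all choices made here, not details from the slides, and the real system runs on 3D grids on GPUs.

```python
import math
from collections import defaultdict

TILE = 4  # cells per tile side (assumed value)


def cells_on_ray(x0, y0, x1, y1, step=0.25):
    """Return the grid cells a ray crosses, in order, by fixed-step sampling."""
    n = max(1, int(math.hypot(x1 - x0, y1 - y0) / step))
    cells = []
    for i in range(n + 1):
        t = i / n
        cell = (math.floor(x0 + t * (x1 - x0)), math.floor(y0 + t * (y1 - y0)))
        if not cells or cells[-1] != cell:
            cells.append(cell)
    return cells


def scatter(rays):
    """Scatter: distribute each ray's per-cell observations to the owning tile."""
    tiles = defaultdict(list)
    for x0, y0, x1, y1 in rays:
        cells = cells_on_ray(x0, y0, x1, y1)
        for k, cell in enumerate(cells):
            hit = k == len(cells) - 1  # the ray ends (returns) in its last cell
            tiles[(cell[0] // TILE, cell[1] // TILE)].append((cell, hit))
    return tiles


def gather(tiles):
    """Gather: per-cell occupancy estimate, computed independently per tile."""
    occupancy = {}
    for observations in tiles.values():
        counts = defaultdict(lambda: [0, 0])  # cell -> [hits, misses]
        for cell, hit in observations:
            counts[cell][0 if hit else 1] += 1
        for cell, (hits, misses) in counts.items():
            occupancy[cell] = hits / (hits + misses)
    return occupancy


rays = [(0.5, 0.5, 5.5, 0.5), (0.5, 0.5, 5.5, 0.5), (0.5, 0.5, 3.5, 0.5)]
occupancy = gather(scatter(rays))
```

Because each tile's bucket is processed without touching any other tile, the gather loop is the part that maps naturally onto independent GPU work items.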
GPU-based localization
• Online localization with GPU scanmatch:
  • Parallelize point cloud arithmetic calculation.
  • Parallelize memory access.
• Asynchronous map loading
• Results also combine other sensor input (e.g. GPS, IMU, etc.)

                   CPU     GPU     Speedup
Mean               60ms    4.5ms   13x
99th percentile    97ms    13ms    7.5x
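One way to see why scan matching parallelizes well is a brute-force search over candidate offsets, where every candidate is scored independently. The sketch below is a toy 2D, translation-only version; the scoring rule (count of scan points landing on occupied map cells), the search window, and all names are assumptions for illustration, not the method used on the vehicle.

```python
def score(scan, occupied, dx, dy):
    """Number of scan points that land on an occupied map cell after shifting."""
    return sum((round(x + dx), round(y + dy)) in occupied for x, y in scan)


def scan_match(scan, occupied, search=2):
    """Return the (dx, dy) offset in the search window with the best score.

    Each candidate is scored independently, so on a GPU the candidates
    (and the points within each candidate) could be evaluated in parallel.
    """
    candidates = [(dx, dy)
                  for dx in range(-search, search + 1)
                  for dy in range(-search, search + 1)]
    return max(candidates, key=lambda c: score(scan, occupied, c[0], c[1]))


# Toy map: an L-shaped wall. The scan sees the same wall from a pose that is
# off by (1, -2) cells, with a little measurement noise.
occupied = {(5, 5), (6, 5), (7, 5), (7, 6)}
scan = [(x - 1 + 0.1, y + 2 - 0.1) for x, y in sorted(occupied)]
offset = scan_match(scan, occupied)
```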
Road test: challenging localization scenarios
Road tests: night scenarios
Onboard occupancy grid estimation on GPU
• Online LiDAR laser ray tracing
• LiDAR laser points are grouped by sectors
• One GPU thread per occupancy cell
• ~13x speedup
• Results are consumed by the perception module
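The per-cell formulation above can be sketched as a function that decides one cell's state from the points in its sector, which is the work a single GPU thread would do. The polar grid layout, the three-state output, and every name below are assumptions for illustration only.

```python
import math


def group_by_sector(points, num_sectors):
    """Group LiDAR returns (x, y) by angular sector, keeping only their range."""
    sectors = [[] for _ in range(num_sectors)]
    width = 2 * math.pi / num_sectors
    for x, y in points:
        angle = math.atan2(y, x) % (2 * math.pi)
        index = min(int(angle / width), num_sectors - 1)  # guard against rounding
        sectors[index].append(math.hypot(x, y))
    return sectors


def cell_state(ranges, ring, ring_size):
    """State of one (sector, ring) cell; independent of every other cell."""
    near, far = ring * ring_size, (ring + 1) * ring_size
    if any(near <= r < far for r in ranges):
        return "occupied"  # a return landed inside this cell
    if any(r >= far for r in ranges):
        return "free"      # a ray passed through this cell to something beyond
    return "unknown"       # no ray reached this cell


def occupancy_grid(points, num_sectors=8, num_rings=5, ring_size=1.0):
    sectors = group_by_sector(points, num_sectors)
    # The double loop stands in for launching one thread per cell.
    return [[cell_state(sectors[s], k, ring_size) for k in range(num_rings)]
            for s in range(num_sectors)]


grid = occupancy_grid([(2.5, 0.1), (0.1, 3.5)])
```

Grouping by sector first means each cell only scans the points that could affect it, rather than the whole point cloud.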
GPU image codec for camera data recording
• 6 HD cameras with data rate ~360MB/s
• ~35% CPU cost reduction
• Consumes ~4% of overall GPU time in our system
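A quick back-of-the-envelope check shows why camera recording is worth offloading. It assumes the ~360MB/s figure is the aggregate rate across all six cameras and uses decimal units (1 TB = 10^6 MB); both are assumptions, since the slide does not say.

```python
CAMERAS = 6
TOTAL_MB_PER_S = 360  # aggregate rate from the slide (approximate)

per_camera_mb_per_s = TOTAL_MB_PER_S / CAMERAS   # average per camera
per_hour_tb = TOTAL_MB_PER_S * 3600 / 1e6        # data volume per hour of driving
```

Under these assumptions that is about 60MB/s per camera and roughly 1.3 TB per hour of driving at that rate.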
Real-time onboard task scheduling
• Asynchronous CPU-GPU task scheduling
• Optimized for end-to-end latency from sensor trigger to control signal
• Overall GPU utilization > 80% (1 GPU on vehicle)
• Maximum reaction time < 100ms
Timeline (from diagram):
CPU: Planning & Control | Camera/Lidar/Fusion Perception | Planning & Control
GPU: Localization | Camera/Lidar/Fusion Perception | Localization
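The idea of asynchronous CPU-GPU scheduling can be sketched with two single-worker executors, one standing in for the GPU queue and one for the CPU. The stage functions, names, and use of Python threads are assumptions for illustration; a real onboard scheduler would work with GPU streams and hard deadlines.

```python
from concurrent.futures import ThreadPoolExecutor


def gpu_perception(frame):
    """Stand-in for GPU work (hypothetical): detect obstacles in one frame."""
    return f"obstacles[{frame}]"


def cpu_planning(obstacles):
    """Stand-in for CPU work (hypothetical): plan from perception output."""
    return f"plan({obstacles})"


def plan_when_ready(perception_future):
    # Blocks only the CPU worker until this frame's perception result exists.
    return cpu_planning(perception_future.result())


def run_pipeline(frames):
    with ThreadPoolExecutor(max_workers=1) as gpu, \
         ThreadPoolExecutor(max_workers=1) as cpu:
        # The GPU queue starts on frame i+1 while the CPU plans for frame i.
        perceptions = [gpu.submit(gpu_perception, f) for f in frames]
        plans = [cpu.submit(plan_when_ready, p) for p in perceptions]
        return [p.result() for p in plans]


plans = run_pipeline([0, 1, 2])
```

Neither worker waits for the other to finish a whole frame, which is what keeps both busy and shortens the path from sensor trigger to control signal.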
New challenges and research opportunities
• Peta-scale sensor data collection, selection, labeling and training
• City-scale HD mapping at daily frequency
• Software-hardware co-design for minimal end-to-end system latency
• Real-time sensor, system and dynamics simulation for rapid iteration
THANK YOU!