Beyond Pedestrian Detection: Deep Neural Networks Level-Up...

Preview:

Citation preview

Copyright (C) 2014 DENSO IT LABORATORY, INC. All Rights Reserved. 3/26/2014

Ikuro Sato, Hideki Niihara

R&D Group, Denso IT Laboratory, Inc.

Beyond Pedestrian Detection:

Deep Neural Networks Level-Up Automotive Safety

1/26

Advanced Driver Assistance Systems (ADAS)

3/26/2014 Copyright (C) 2014 DENSO IT LABORATORY, INC. All Rights Reserved.

pedestrian detection with infrared camera forerunning car detection with radar

Images taken from DENSO’s TV ad.

2/26

What would a skilled driver do?

3/26/2014 Copyright (C) 2014 DENSO IT LABORATORY, INC. All Rights Reserved. 3/26

What would a skilled driver do?

3/26/2014 Copyright (C) 2014 DENSO IT LABORATORY, INC. All Rights Reserved. 4/26

Today’s Technology

• Only to detect pedestrians with a camera

– limited autonomous maneuvers possible, e.g., slows down to stop

3/26/2014 Copyright (C) 2014 DENSO IT LABORATORY, INC. All Rights Reserved.

today’s industrial technology

5/26

The Goal

• Able to extract richer information about pedestrians

– various autonomous maneuvers possible like a skilled driver

3/26/2014 Copyright (C) 2014 DENSO IT LABORATORY, INC. All Rights Reserved.

future industrial technology

old-aged man finishing crossing

running toward the vehicle

man holding smartphone, crossing intersection

6/26

Three Technical Challenges

• multiclass classification

– Need to identify various annotative info on top of binary classification

• visual feature design

– Who knows what visual features best separate a man with a

smartphone from a man without?

• real-time processing

3/26/2014 Copyright (C) 2014 DENSO IT LABORATORY, INC. All Rights Reserved. 7/26

Promising Tools

• multiclass classification

– need to identify various “states” on top of binary classification

• visual feature design

– Who knows what visual features best separate a man with a cell phone

from a man without?

• real-time processing

3/26/2014 Copyright (C) 2014 DENSO IT LABORATORY, INC. All Rights Reserved.

Deep Neural Networks

GPGPUs

8/26

• Why Deep Neural Networks?

• Why GPGPUs?

• Demo Setup

3/26/2014 Copyright (C) 2014 DENSO IT LABORATORY, INC. All Rights Reserved. 9/26

Basics

• Neural Network: one of the machine learning algorithms for

classification and/or regression

3/26/2014 Copyright (C) 2014 DENSO IT LABORATORY, INC. All Rights Reserved.

input 𝑥

hidden ℎ

simple Neural Network

By tuning 𝑊𝑘, one obtains an arbitrary function.

weight 𝑊1

𝑊2

ℎ = 𝑔 𝑊1𝑥

𝑦 = 𝑔 𝑊2ℎ

𝑔: differentiable nonlinear function

(simplified)

output 𝑦

10/26

Basics

• Neural Network: one of the machine learning algorithms for

classification and/or regression

3/26/2014 Copyright (C) 2014 DENSO IT LABORATORY, INC. All Rights Reserved.

input 𝑥 output

𝑦

hidden ℎ

simple Neural Network

By tuning 𝑊𝑘, one obtains an arbitrary function.

weight 𝑊1

𝑊2

Can have arbitrary many units

multiclass

11/26

• Convolutional Neural Network (CNN) comprising

– Convolutional layer(s) local feature extraction

– Pooling layer(s) dimensionality reduction

– Fully-connected layer(s) classification/regression

simple CNN

Neural Networks for Visual Tasks

3/26/2014 Copyright (C) 2014 DENSO IT LABORATORY, INC. All Rights Reserved.

input image C P F F

12/26

• Convolutional Neural Network (CNN) comprising

– Convolutional layer(s) local feature extraction

– Pooling layer(s) dimensionality reduction

– Fully-connected layer(s) classification/regression

Neural Networks for Visual Tasks

3/26/2014 Copyright (C) 2014 DENSO IT LABORATORY, INC. All Rights Reserved.

input image C P F F

visual features automatically extracted

simple CNN

13/26

• Transforming data vectors at multiple layers help to increase the

level of abstraction of visual contents (Bengio, 2009).

Why is “Deep” Architecture Advantageous?

3/26/2014 Copyright (C) 2014 DENSO IT LABORATORY, INC. All Rights Reserved.

C P P P F F F C C

edges set of segments parts classifier

Deep CNN

14/26

• Why Deep Neural Networks?

• Why GPGPUs?

• Demo Setup

3/26/2014 Copyright (C) 2014 DENSO IT LABORATORY, INC. All Rights Reserved. 15/26

Major Reason

• Classification can be boosted.

– Deep CNN can run at real-time with a compact Tegra K1 processor.

3/26/2014 Copyright (C) 2014 DENSO IT LABORATORY, INC. All Rights Reserved.

why?

Deep CNN is very GPU friendly!

A Deep CNN typically has a small number of sequential operations, each of which is dominated by highly-parallelizable, massive sums-of-products.

16/26

Minor Reasons

• Classification can be boosted.

– Deep CNN can run at real-time with compact Tegra K1 processor.

• Learning can be boosted.

– It can take a month without GPUs. Takes only several days with GPUs.

• Development can be boosted.

– Capability of processing generic programing language (i.e., CUDA)

makes R&D process efficient.

– No more fixed-points!

3/26/2014 Copyright (C) 2014 DENSO IT LABORATORY, INC. All Rights Reserved. 17/26

Minor Reasons

• Classification can be boosted.

– Deep CNN can run at real-time with compact Tegra K1 processor.

• Learning can be boosted.

– It can take a month without GPUs. Takes only several days with GPUs.

• Development can be boosted.

– Capability of processing generic programing language (i.e., CUDA)

makes R&D process efficient.

– No more fixed-points!

3/26/2014 Copyright (C) 2014 DENSO IT LABORATORY, INC. All Rights Reserved. 18/26

• Why Deep Neural Networks?

• Why GPGPUs?

• Demo Setup

3/26/2014 Copyright (C) 2014 DENSO IT LABORATORY, INC. All Rights Reserved. 19/26

Problem Statement

• Real-time pedestrian detection with depth, height, and body

orientation estimations

– requires discrete-valued classification and real-valued regression tasks

3/26/2014 Copyright (C) 2014 DENSO IT LABORATORY, INC. All Rights Reserved. 20/26

Deep CNN Model

• A single deep CNN with 9 (processing) layers

– 3 Convolutional + (Max) Pooling layers

– 3 Fully-connected layers

3/26/2014 Copyright (C) 2014 DENSO IT LABORATORY, INC. All Rights Reserved.

16 maps 16 maps

16 maps

16 units

32 units 140x60

gray image

21/26

Deep CNN Model

• Output layer contains:

3/26/2014 Copyright (C) 2014 DENSO IT LABORATORY, INC. All Rights Reserved.

1 binary classification unit for pedestrian/background

1 regression unit for relative x-coord. of the body center

1 regression unit for relative y-coord. of the top of the head

1 regression unit for relative y-coord. of the toe

4 classification units for body orientations

𝑥

𝑦

𝑦head

𝑦toe

𝑥center

scanning window

22/26

Deep CNN Model

• Output layer contains:

3/26/2014 Copyright (C) 2014 DENSO IT LABORATORY, INC. All Rights Reserved.

1 binary classification unit for pedestrian/background

1 regression unit for relative x-coord. of the body center

1 regression unit for relative y-coord. of the top of the head

1 regression unit for relative y-coord. of the toe

4 classification units for body orientations

𝑥

𝑦

𝑦head

𝑦toe

scanning window converted to depth & height

𝑥center

23/26

Processor

• NVIDIA’s Jetson platform equipped with Tegra K1 processor

– 192 Kepler CUDA cores

– 200 GFLOPS (automotive K1)

– memory shared between GPU and CPU (ARM)

– low power consumption for automotive applications, e.g., ADAS

3/26/2014 Copyright (C) 2014 DENSO IT LABORATORY, INC. All Rights Reserved.

Jetson

24/26

Processor

• NVIDIA’s Jetson platform equipped with Tegra K1 processor

– 192 Kepler CUDA cores

– 200 GFLOPS (automotive K1)

– memory shared between GPU and CPU (ARM)

– low power consumption for automotive applications, e.g., ADAS

3/26/2014 Copyright (C) 2014 DENSO IT LABORATORY, INC. All Rights Reserved.

Jetson Deep CNN

25/26

Take-Home Messages

1. Extracting rich information about pedestrians will be

one of the key technologies for automotive safety.

2. Deep Neural Networks are the best algorithms for

visual recognition tasks.

3. GPGPUs enable real-time processing.

– Demonstrated that Tegra K1 indeed does!

3/26/2014 Copyright (C) 2014 DENSO IT LABORATORY, INC. All Rights Reserved. 26/26

Recommended