
Page 1:

Computer vision and machine learning for perception and decision making in driverless cars

Ljubljana, 12.06.2018

Danijel Skočaj

Visual Cognitive Systems Lab

Faculty of Computer and Information Science

University of Ljubljana

Page 2:

Driverless cars as intelligent agents

Requirements

Perception-action loop

Perception and decision making

Computer vision

Deep learning

Computer vision and machine learning for perception and decision making in driverless cars

Page 3:

Stone age of driverless driving

DARPA 2004 Grand Challenge

Goal: autonomous drive of a car through the Mojave Desert

Main reward: 1,000,000 USD (it was not awarded!)

In 2004, most of the algorithms did not work well in an uncontrolled environment!

Increase the robustness, adaptability, and intelligence!

Page 4:

Driverless cars today

2017

Waymo

Page 5:

Enabling technologies

Improved sensors!

Improved perception!

Improved decision making!

Computer vision

Machine learning

Decision making

Perception-action cycle

Driverless cars = intelligent robots

Sensors online

Page 6:

Cognitive robot systems

[Diagram: cognitive robots positioned between industrial robots and SF, approaching human-level abilities; they combine communication, perception, action, attention, goals, planning, reasoning, and learning]

Page 7:

Perception

Visual information (image, video; RGB, BW, IR, …)

Sound (speech, music, noise, …)

Haptic information (haptic sensors, collision detectors, etc.)

Range/depth/space information (range images, 3D models, 3D maps, …)

Many different modalities – a very multimodal system

Attention

Selective attention

Handling complexity of input signals

Page 8:

Representation of visual information

image ≈ mean image + a1 · basis1 + a2 · basis2 + a3 · basis3 + … (an image reconstructed as a weighted sum of basis images)
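
A minimal numpy sketch of this kind of subspace representation, assuming a PCA/eigenfaces-style decomposition (which the formula above suggests); the data here are illustrative random images:

import numpy as np

# Stack n training images as rows, each flattened to a vector.
X = np.random.rand(100, 28 * 28)            # hypothetical image data
mean = X.mean(axis=0)

# Basis images (principal components) via SVD of the centred data.
U, S, Vt = np.linalg.svd(X - mean, full_matrices=False)
basis = Vt[:3]                               # first three basis images

# Represent an image by its coefficients a1, a2, a3.
x = X[0]
a = basis @ (x - mean)

# Approximate reconstruction: image ≈ mean + a1*basis1 + a2*basis2 + a3*basis3
x_hat = mean + a @ basis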

Page 9:

Representation of space

Metric information

Topological map

Hierarchical representation

Page 10:

Recognition

Recognition of

objects

properties

faces

rooms

affordances

actions

speech

relations

intentions,…

Categorisation

Multimodal recognition

Page 11:

Learning

Building representations

Continuous learning

Different learning modes

Multimodal learning

Forgetting, unlearning

Robustness

Nature vs. nurture

[Diagram: learning builds a representation from training samples; recognition applies it to test samples to produce a result; each new training sample updates the representation]

Page 12:

Reasoning, planning, decision making

In unpredictable environment

With incomplete information

With robot limitations

In dynamic environment

Considering different modalities

In real time

Page 13:

An example of a cognitive system

Autonomous car

City drive

Competencies

Perception (image, 3D, collision)

Planning

Reasoning

Learning

Navigation

Obstacle avoidance

Action

Flexibility

Robustness

Efficiency

Google self-driving car

Page 14:

Intelligent agents

Perception

Action

Reasoning, planning, decision making

Autonomy

Environment

[Diagram: agent-environment loop – the AGENT sees the ENVIRONMENT and acts; the environment transitions to the next state]

Page 15:

Perception-action cycle

Significant abstraction of the real world

sense → perception → modelling → planning → task execution → motor control → act

Page 16:

Simulation of robot perception and control

Page 17:

Sensors

Range sensors

Object recognition

Bumper – collision detector

Odometer

Page 18:

Planning and control

Planning

Control

Page 19:

Deep learning

Page 20:

Excellent results

ILSVRC results

Deep learning era

Page 21:

Modern deep learning

More data!

More computational power!

Improved learning details!

Page 22:

The main concept

Zeiler and Fergus, 2014

Page 23:

Shift in paradigm

Hand-crafted / engineered → data-driven / learning-based

Solutions

Features

Applications

Page 24:

Perceptron

Rosenblatt, 1957

Binary inputs and output

Weights

Threshold

Bias

Very simple!
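
A minimal Python sketch of such a perceptron; the AND example and its weights are illustrative, not from the slides:

import numpy as np

def perceptron(x, w, b):
    """Binary output: 1 if the weighted sum plus bias exceeds the threshold 0."""
    return 1 if np.dot(w, x) + b > 0 else 0

# Example: a perceptron computing logical AND of two binary inputs.
w, b = np.array([1.0, 1.0]), -1.5
print([perceptron(np.array(x), w, b) for x in [(0, 0), (0, 1), (1, 0), (1, 1)]])
# -> [0, 0, 0, 1]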

Page 25:

Sigmoid neurons

Real inputs and outputs from interval [0,1]

Activation function: sigmoid function

output = σ(∑_j w_j x_j + b) = 1 / (1 + exp(−∑_j w_j x_j − b))

Page 26:

Sigmoid neurons

Small changes in weights and biases cause a small change in the output

Enables learning!
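
A minimal Python sketch of a sigmoid neuron, illustrating this smoothness; inputs and weights are illustrative:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def neuron(x, w, b):
    """Real-valued output in (0, 1) instead of a hard threshold."""
    return sigmoid(np.dot(w, x) + b)

x, w, b = np.array([0.5, 0.8]), np.array([1.0, -1.0]), 0.1
print(neuron(x, w, b))            # output for the current weights
print(neuron(x, w + 1e-3, b))     # a tiny weight change -> a tiny output change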

Page 27:

Feedforward neural networks

Network architecture:

Page 28:

Example: recognizing digits

MNIST database of handwritten digits

28×28 pixels (= 784 input neurons)

10 digits

50,000 training images

10,000 validation images

10,000 test images

http://neuralnetworksanddeeplearning.com
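
A sketch of obtaining this split, assuming the Keras copy of MNIST (which ships as 60,000 training and 10,000 test images); the 50,000/10,000 train/validation split is carved out manually, as in the book linked above:

from tensorflow.keras.datasets import mnist

(x_train, y_train), (x_test, y_test) = mnist.load_data()   # 60k train, 10k test
x_val, y_val = x_train[50000:], y_train[50000:]            # 10,000 validation images
x_train, y_train = x_train[:50000], y_train[:50000]        # 50,000 training images

# Flatten each 28x28 image into a 784-dimensional input vector, scaled to [0, 1].
x_train = x_train.reshape(-1, 784) / 255.0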

Page 29:

Loss function

Given: the desired output y(x) for all training images x

Loss function:

C(w, b) = 1/(2n) ∑_x ||y(x) − a||²

(mean square error – quadratic loss function)

Find weights w and biases b that for a given input x produce an output a that minimizes the loss function C
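
A minimal numpy sketch of this quadratic loss; the targets and outputs are illustrative:

import numpy as np

def quadratic_loss(Y, A):
    """C = 1/(2n) * sum_x ||y(x) - a||^2 over the n training samples."""
    n = Y.shape[0]
    return np.sum((Y - A) ** 2) / (2 * n)

Y = np.eye(10)[[3, 7]]        # two one-hot target vectors (digits 3 and 7)
A = np.random.rand(2, 10)     # two hypothetical network outputs
print(quadratic_loss(Y, A))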

Page 30:

Gradient descent

Find the minimum of C(v)

Change of C: ΔC ≈ ∇C · Δv

Gradient of C: ∇C = (∂C/∂v_1, …, ∂C/∂v_n)^T

Change v in the opposite direction of the gradient: Δv = −η ∇C

Algorithm:

Initialize v

Until the stopping criterion is reached:

Apply the update rule v → v′ = v − η ∇C

η … the learning rate
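
A minimal Python sketch of this algorithm on a toy function C(v) = ||v||², whose gradient is 2v; the learning rate and stopping threshold are illustrative:

import numpy as np

def grad_C(v):
    return 2 * v                              # gradient of C(v) = ||v||^2

eta = 0.1                                     # learning rate
v = np.array([3.0, -4.0])                     # initialize v
while np.linalg.norm(grad_C(v)) > 1e-6:       # until the stopping criterion is reached
    v = v - eta * grad_C(v)                   # update rule: v -> v - eta * grad C
print(v)                                      # close to the minimum at (0, 0)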

Page 31:

Gradient descent in neural networks

Loss function C(w, b)

Update rules:

w_k → w_k′ = w_k − η ∂C/∂w_k

b_l → b_l′ = b_l − η ∂C/∂b_l

Consider all training samples

Very many parameters => computationally very expensive

Use stochastic gradient descent instead:

Compute the gradient only for a subset of m training samples (a mini-batch)
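
A minimal Python sketch of the stochastic variant; grad_loss is a hypothetical routine returning the gradient averaged over a mini-batch, and params/data are numpy arrays:

import numpy as np

def sgd(params, data, grad_loss, eta=0.1, m=32, epochs=10):
    """Per epoch: shuffle the data, then step on mini-batches of m samples."""
    n = len(data)
    for _ in range(epochs):
        np.random.shuffle(data)
        for k in range(0, n, m):
            batch = data[k:k + m]                        # subset of m training samples
            params = params - eta * grad_loss(params, batch)
    return params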

Page 32:

Backpropagation

All we need is the gradient of the loss function:

Rate of change of C w.r.t. a change in any weight: ∂C/∂w

Rate of change of C w.r.t. a change in any bias: ∂C/∂b

How to compute the gradient?

Numerically: simple, approximate, extremely slow

Analytically for the entire C: fast, exact, intractable

Chaining individual parts of the network: fast, exact, doable

=> Backpropagation!
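
The numerical option is easy to sketch and is commonly used to verify a backpropagation implementation (gradient checking), even though it is far too slow for training; this is an illustration, not code from the talk:

import numpy as np

def numerical_gradient(C, w, eps=1e-5):
    """Approximate dC/dw_i by central differences, one weight at a time."""
    grad = np.zeros_like(w)
    for i in range(w.size):
        w_plus, w_minus = w.copy(), w.copy()
        w_plus.flat[i] += eps
        w_minus.flat[i] -= eps
        grad.flat[i] = (C(w_plus) - C(w_minus)) / (2 * eps)
    return grad

# Example: C(w) = sum(w^2) has the exact gradient 2w.
w = np.array([1.0, -2.0, 0.5])
print(numerical_gradient(lambda w: np.sum(w ** 2), w))   # ~ [2, -4, 1]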

Page 33:

Main principle

We need the gradient of the Loss function

Two phases:

Forward pass (propagation): the input sample is propagated through the network and the error at the final layer is obtained

Backward pass (weight update): the error is backpropagated to the individual levels, the contribution of each individual neuron to the error is calculated, and the weights are updated accordingly

Page 34:

Backpropagation and SGD

For a number of epochs:

Until all training images are used:

Select a mini-batch of training samples

For each training sample x in the mini-batch:

Input: set the corresponding activation a^1

Feedforward: for each l = 2, 3, …, L compute z^l = w^l a^(l−1) + b^l and a^l = σ(z^l)

Output error: compute δ^L = ∇_a C ⊙ σ′(z^L)

Backpropagation: for each l = L−1, L−2, …, 2 compute δ^l = ((w^(l+1))^T δ^(l+1)) ⊙ σ′(z^l)

Gradient descent: for each l update:

w^l → w^l − (η/m) ∑_x δ^(x,l) (a^(x,l−1))^T

b^l → b^l − (η/m) ∑_x δ^(x,l)
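
A compact numpy sketch of these steps for one mini-batch, following the notation above (sigmoid activations, quadratic loss); this is an illustration, not the talk's own code. W and B are lists of per-layer weight matrices and bias vectors:

import numpy as np

def sigmoid(z): return 1.0 / (1.0 + np.exp(-z))
def sigmoid_prime(z): s = sigmoid(z); return s * (1 - s)

def update_mini_batch(W, B, batch, eta):
    """One SGD step: backpropagate each (x, y) in the batch, then update W, B."""
    m = len(batch)
    grad_W = [np.zeros_like(w) for w in W]
    grad_B = [np.zeros_like(b) for b in B]
    for x, y in batch:
        # Feedforward: compute and store z^l and a^l for every layer.
        a, activations, zs = x, [x], []
        for w, b in zip(W, B):
            z = w @ a + b
            zs.append(z)
            a = sigmoid(z)
            activations.append(a)
        # Output error: delta^L = (a^L - y) * sigma'(z^L) for the quadratic loss.
        delta = (activations[-1] - y) * sigmoid_prime(zs[-1])
        grad_W[-1] += np.outer(delta, activations[-2])
        grad_B[-1] += delta
        # Backpropagation: delta^l = ((w^(l+1))^T delta^(l+1)) * sigma'(z^l).
        for l in range(2, len(W) + 1):
            delta = (W[-l + 1].T @ delta) * sigmoid_prime(zs[-l])
            grad_W[-l] += np.outer(delta, activations[-l - 1])
            grad_B[-l] += delta
    # Gradient descent: average the gradients over the mini-batch and update.
    W = [w - (eta / m) * gw for w, gw in zip(W, grad_W)]
    B = [b - (eta / m) * gb for b, gb in zip(B, grad_B)]
    return W, B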

Page 35:

Convolutional neural networks

From feedforward fully-connected neural networks

To convolutional neural networks

Page 36:

Sparse connectivity

Local connectivity – neurons are only locally connected (receptive field)

Reduces memory requirements

Improves statistical efficiency

Requires fewer operations

[Figure: connectivity viewed from below and from above]

The receptive field of the units in the deeper layers is large => indirect connections!

Page 37:

Parameter sharing

Neurons share weights!

Tied weights

Every element of the kernel is used at every position of the input

All the neurons at the same level detect the same feature (everywhere in the input)

Greatly reduces the number of parameters!

Equivariance to translation

Shift then convolution = convolution then shift

Object moves => representation moves

Fully connected network with an infinitely strong prior over its weights

Tied weights

Weights are zero outside the kernel region

=> learns only local interactions and is equivariant to translations
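
A small Python sketch of both properties, one shared kernel applied everywhere and equivariance to translation, using scipy for the correlation; the image and kernel are illustrative:

import numpy as np
from scipy.ndimage import correlate

image = np.zeros((8, 8)); image[2, 3] = 1.0     # a single bright pixel
kernel = np.random.rand(3, 3)                    # one shared 3x3 kernel

out = correlate(image, kernel, mode='constant')
out_of_shifted = correlate(np.roll(image, (1, 2), axis=(0, 1)), kernel,
                           mode='constant')

# Filtering a shifted image equals shifting the filtered image (away from borders).
print(np.allclose(np.roll(out, (1, 2), axis=(0, 1)), out_of_shifted))   # True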

Page 38:

CNN layers

Layers used to build ConvNets:

INPUT: raw pixel values

CONV: convolutional layer

ReLU: introducing nonlinearity

POOL: downsampling

FC: for computing class scores
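
A minimal sketch of stacking these layers, here in PyTorch (an assumption; the slides do not name a framework), for an illustrative 32×32 RGB input and 10 classes:

import torch
import torch.nn as nn

convnet = nn.Sequential(
    # INPUT: raw pixel values, 3 x 32 x 32
    nn.Conv2d(3, 16, kernel_size=3, padding=1),    # CONV
    nn.ReLU(),                                     # ReLU: nonlinearity
    nn.MaxPool2d(2),                               # POOL: downsampling to 16x16
    nn.Conv2d(16, 32, kernel_size=3, padding=1),   # CONV
    nn.ReLU(),
    nn.MaxPool2d(2),                               # POOL: downsampling to 8x8
    nn.Flatten(),
    nn.Linear(32 * 8 * 8, 10),                     # FC: class scores
)

scores = convnet(torch.randn(1, 3, 32, 32))        # -> shape (1, 10)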

Page 39:

CNN architecture

Stack the layers in an appropriate order

Hu et al.

Babenko et al.

Page 40:

Detection of traffic signs

DFG database

200 categories

Tabernik and Skočaj, Deep Learning for Large-Scale Traffic Sign Detection and Recognition, submitted

Page 41:

Detection of traffic signs

Data augmentation

Mask R-CNN + online hard-example mining

Distribution of selected training samples

Sample weighting

Adjusting region pass-through during detection

Page 42:

Experimental results

[Results on the Swedish traffic sign database and the DFG traffic sign database]

Page 43:

Traffic sign detection

Page 44:

Knowing everything

Data, big data!

Machine learning!

Make sense of huge amounts of data!

Page 45:

Digitalisation of decision making

Different problem complexities (an increasing complexity axis):

Simple, well-defined problems → rule-based decision making → programming

Complex, vaguely defined problems → data-driven decision making → machine learning

Page 46:

Digital decision making

Decision making based on history

Decision making modelling based on previous decisions

Explicit capturing + digitalisation of all important attributes

Bias in favor of past decisions

Size and representativeness of the training set

Updating the model through time

Combining the training model with new rules

Explainability of decisions

Page 47:

Conclusion

[Timeline: T-60, T-30, T, T+30]