CS231n: Lecture 1 - Fei-Fei Li & Ranjay Krishna & Danfei Xu

CS231n: Convolutional Neural Networks for Visual Recognition

Lecture 1: Introduction

24-Mar-21


Welcome to CS231n


2015, 2016, 2017, 2018, 2019, 2020


Computer Vision and related fields: Neuroscience; Deep learning; Machine learning; Speech, NLP; Information retrieval; Mathematics; Computer Science (graphics, algorithms, theory, systems, architecture, …); Biology; Engineering; Physics; Robotics; Cognitive sciences; Psychology; Image processing; optics


Artificial Intelligence (AI)

Machine Learning (ML)

Deep Learning (DL)

Convolutional Neural Network (CNN)

Computer Vision

• Object detection
• Object classification
• Scene understanding
• Semantic scene segmentation
• 3D reconstruction
• Object tracking
• Human pose estimation
• Activity recognition
• VQA
• …


Jiajun Wu

Fei-Fei Li

Juan Carlos Niebles

Silvio Savarese


Today’s agenda

• A brief history of computer vision

• CS231n overview


Evolution’s Big Bang: Cambrian Explosion, 530-540 million years B.C.


Camera Obscura

• Leonardo da Vinci, 16th Century AD
• Gemma Frisius, 1545
• Encyclopedia, 18th Century


Where did we come from?

The known story – Neuroscience inspired AI


Hubel and Wiesel, 1959


Low-Level Details

Neural Networks (Digital)

Cortical Column (Biological)

High-Level Patterns


F. Rosenblatt, 1957

Rumelhart, Hinton & Williams, 1986


“The mere formulation of a problem is often far more essential than its solution, which […] requires creative imagination and marks real advances in science.”

- Albert Einstein, 1921


Where did we come from?

The not-so-known story – the search for computer vision’s “North Star”


Larry Roberts, 1963, 1st thesis of Computer Vision

1960s: Interpretation of synthetic world


David Marr, 1970s


Stages of Visual Representation, David Marr, 1970s

Input image → Edge image → 2 ½-D sketch → 3-D model

• Input Image: perceived intensities
• Primal Sketch: zero crossings, blobs, edges, bars, ends, virtual lines, groups, curves, boundaries
• 2 ½-D Sketch: local surface orientation and discontinuities in depth and in surface orientation
• 3-D Model Representation: 3-D models hierarchically organized in terms of surface and volumetric primitives


D. Lowe. IJCV, 1992

Edges, segmentation, and perception


3D reconstruction

S. Agarwal et al. ICCV, 2009


• Generalized Cylinder: Brooks & Binford, 1979
• Pictorial Structure: Fischler and Elschlager, 1973


D. Lowe. ICCV, 1999

Single Object Recognition


Spatial Pyramid Matching, Lazebnik, Schmid & Ponce, 2006


Histogram of Oriented Gradients (HoG): Dalal & Triggs, 2005 (histogram axes: orientation, frequency)

Deformable Part Model: Felzenszwalb, McAllester, Ramanan, 2009


Face Detection, Viola & Jones, 2001


CVPR topic distribution: 2000


In the meantime…


I. Biederman, Science, 1972


Potter et al., 1970s

Rapid Serial Visual Presentation (RSVP)


150 ms!!

Thorpe et al., Nature, 1996


Kanwisher et al., J. Neuro., 1997

Epstein & Kanwisher, Nature, 1998

Neural correlates of object & scene recognition


A Computer Vision/AI “holy grail” – Object Recognition

Fei-Fei et al. 2004

Everingham et al. 2006-2012


There are MANY objects; organized HIERARCHICALLY

• Biederman: Recognition by Component, 1987
• Eleanor Rosch: Principles of Categorization, 1978


George A. Miller, Psychology, Cognitive Science, Princeton University

G. A. Miller, Communications of the ACM, 1995

J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li & L. Fei-Fei. CVPR, 2009.

22,000 categories 15,000,000 images


The Image Classification Challenge: 1,000 object classes, 1,431,167 images

Russakovsky et al. IJCV 2015

Example (correct label: Steel drum)

✔ Output: Scale, T-shirt, Steel drum, Drumstick, Mud turtle
✗ Output: Scale, T-shirt, Giant panda, Drumstick, Mud turtle
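The ✔/✗ example above lists five guesses per image and marks the output as correct when the true label appears among them. A minimal Python sketch of that scoring rule follows. It assumes a top-5 criterion, which the slide implies but does not state; the function names `is_top_k_correct` and `top_k_error` are hypothetical and not from the lecture.

```python
def is_top_k_correct(predictions, true_label, k=5):
    """Return True if true_label is among the first k predicted labels."""
    return true_label in predictions[:k]


def top_k_error(all_predictions, true_labels, k=5):
    """Fraction of images whose true label is missing from the top-k guesses."""
    if len(all_predictions) != len(true_labels):
        raise ValueError("need exactly one true label per prediction list")
    misses = sum(
        not is_top_k_correct(preds, label, k)
        for preds, label in zip(all_predictions, true_labels)
    )
    return misses / len(true_labels)


# The two outputs from the slide, both scored against the label "Steel drum".
hit = ["Scale", "T-shirt", "Steel drum", "Drumstick", "Mud turtle"]
miss = ["Scale", "T-shirt", "Giant panda", "Drumstick", "Mud turtle"]

print(is_top_k_correct(hit, "Steel drum"))
print(is_top_k_correct(miss, "Steel drum"))
print(top_k_error([hit, miss], ["Steel drum", "Steel drum"]))
```

Under this assumed rule, the order of the five guesses does not matter; only membership in the list is checked.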


Classification Task: classification error by year

Year:  2010  2011  2012  2013  2014  2015   2016  2017
Error: 0.28  0.26  0.16  0.12  0.07  0.036  0.03  0.023

Chart labels: Human, Shallow models

Deng et al. CVPR, 2009; Russakovsky et al. IJCV, 2015


1998, LeCun et al.: # of transistors 10^6, # of pixels used in training 10^7

2012, Krizhevsky et al.: # of transistors 10^9 (GPUs), # of pixels used in training 10^14

Network diagram labels: Input, Image Maps, Convolutions, Subsampling, Fully Connected, Output

Figure copyright Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton, 2012. Reproduced with permission.


Year 2010: NEC-UIUC [Lin CVPR 2011]
Dense descriptor grid: HOG, LBP; Coding: local coordinate, super-vector; Pooling, SPM; Linear SVM

Year 2012: SuperVision [Krizhevsky NIPS 2012]

Year 2014: GoogLeNet [Szegedy arxiv 2014], VGG [Simonyan arxiv 2014]
GoogLeNet diagram legend: Pooling, Convolution, Softmax, Other
VGG layers: Image, conv-64, conv-64, maxpool, conv-128, conv-128, maxpool, conv-256, conv-256, maxpool, conv-512, conv-512, maxpool, conv-512, conv-512, maxpool, fc-4096, fc-4096, fc-1000, softmax

Year 2015: MSRA [He ICCV 2015]

Lion image by Swissfrog is licensed under CC BY 3.0

Figure copyright Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton, 2012. Reproduced with permission.
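To make the conv / maxpool / fc layer stack listed above more concrete, the following Python sketch walks through the layers in order and tracks the activation shape and parameter count at each step. The slide gives only layer names and widths, so the 224x224 RGB input, the 3x3 convolutions with padding that preserves spatial size, and the 2x2 max pooling are all assumptions; `trace_layers` and `VGG_LIKE_LAYERS` are hypothetical names. The softmax is left out because it has no parameters and does not change the shape.

```python
# Layer names as listed on the slide (softmax omitted: no parameters).
VGG_LIKE_LAYERS = [
    "conv-64", "conv-64", "maxpool",
    "conv-128", "conv-128", "maxpool",
    "conv-256", "conv-256", "maxpool",
    "conv-512", "conv-512", "maxpool",
    "conv-512", "conv-512", "maxpool",
    "fc-4096", "fc-4096", "fc-1000",
]


def trace_layers(layers, input_shape=(3, 224, 224)):
    """Return (name, output_shape, parameter_count) for each layer.

    Assumptions (not stated on the slide): 3x3 convolutions that keep the
    spatial size, 2x2 max pooling that halves it, and a bias per output unit.
    """
    channels, height, width = input_shape
    features = None  # set once the first fc layer flattens the volume
    rows = []
    for name in layers:
        if name.startswith("conv-"):
            out_channels = int(name.split("-")[1])
            params = out_channels * (channels * 3 * 3) + out_channels
            channels = out_channels
            shape = (channels, height, width)
        elif name == "maxpool":
            height, width = height // 2, width // 2
            params = 0
            shape = (channels, height, width)
        elif name.startswith("fc-"):
            out_features = int(name.split("-")[1])
            if features is None:
                in_features = channels * height * width  # flatten
            else:
                in_features = features
            params = in_features * out_features + out_features
            features = out_features
            shape = (features,)
        else:
            raise ValueError(f"unknown layer: {name}")
        rows.append((name, shape, params))
    return rows


rows = trace_layers(VGG_LIKE_LAYERS)
for name, shape, params in rows:
    print(f"{name:10s} {str(shape):18s} {params:>12,d}")
print("total parameters:", sum(params for _, _, params in rows))
```

Under these assumptions most of the parameters sit in the first fully connected layer, while the convolutional layers account for a comparatively small share.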

Image Captioning: Richer Descriptions

Prior Work

Image Captioning: A man riding a horse drawn carriage down a street

Dense Captioning (JKF, CVPR 2016): Horse pulling a cart. A wheel on a cart. A window on a building. A horse in a picture. A large white umbrella. A woman sitting on a bench. Man sitting on a motorcycle. A cart with a cart.

Our Recent Work

Paragraph Captioning (KJKF, CVPR 2017): A man is riding a carriage on a street. Two people are sitting on top of the horses. The carriage is made of wood. The carriage is black. The carriage has a white stripe down the side. The building in the background is a tan color.

Results: spatial, comparative, asymmetrical, verb, prepositional

Relationship graph labels: person, person, ski, shirt, snow (objects); left of, taller than, wear, wear, on (relationships)

Krishna*, Lu*, Bernstein, Fei-Fei, ECCV 2016


CVPR topic distribution: 2000 vs. 2013


The Deep Learning Revolution

Computation, Data, Algorithms


AI’s Explosive Growth & Impact

Source: The Gradient

Startups Developing AI Systems

Source: Crunchbase, VentureSource, Sand Hill Econometrics

Enterprise Application AI Revenue

Source: Statista

Attendance at AI conferences


Many Applications of computer vision


Slide source: World Capital Partners, 2017

How to take care of seniors while keeping them safe?

• Monitor Patients with Mild Symptoms
• Manage Chronic Conditions
• Early Symptom Detection of COVID-19

Mobility, Infection, Sleep, Diet

Low-cost, Burden-free, Versatile, Scalable


Today’s agenda

• A brief history of computer vision

• CS231n overview
