Deep neural networks, UC Davis (web.cs.ucdavis.edu/.../lee_lecture18_deeplearning.pdf)


Deep neural networks

June 1st, 2017

Yong Jae Lee, UC Davis

Many slides from Rob Fergus, Svetlana Lazebnik, Jia-Bin Huang, Derek Hoiem

Announcements
•  Post questions on Piazza for the review session (6/8 lecture)

Outline
•  Deep Neural Networks
•  Convolutional Neural Networks (CNNs)


Traditional Image Categorization: Training phase

Training Images → Image Features → Classifier Training (with Training Labels) → Trained Classifier

Traditional Image Categorization: Testing phase

Test Image → Image Features → Trained Classifier → Prediction (e.g., “Outdoor”)

Features have been key:

•  SIFT [Lowe IJCV 04]
•  HOG [Dalal and Triggs CVPR 05]
•  SPM [Lazebnik et al. CVPR 06]
•  DPM [Felzenszwalb et al. PAMI 10]
•  Color Descriptor [Van De Sande et al. PAMI 10]

All hand-crafted.


What about learning the features?

•  Learn a feature hierarchy all the way from pixels to classifier
•  Each layer extracts features from the output of the previous layer
•  Layers have (nearly) the same structure
•  Train all layers jointly

Image/Video Pixels → Layer 1 → Layer 2 → Layer 3 → Simple Classifier

Learning Feature Hierarchy

Goal: Learn useful higher-level features from images. Feature representation:

Input data (Pixels) → 1st layer “Edges” → 2nd layer “Object parts” → 3rd layer “Objects”

Lee et al., ICML 2009; CACM 2011

Slide: Rob Fergus

Learning Feature Hierarchy

•  Better performance
•  Other domains (unclear how to hand-engineer): Kinect, video, multi-spectral
•  Feature computation time: dozens of features now regularly used; getting prohibitive for large datasets (10s of sec/image)

Slide: R. Fergus


“Shallow” vs. “deep” architectures

Traditional recognition (“shallow” architecture):
Image/Video Pixels → Hand-designed feature extraction → Trainable classifier → Object Class

Deep learning (“deep” architecture):
Image/Video Pixels → Layer 1 → … → Layer N → Simple classifier → Object Class

Biological neuron and perceptrons

A biological neuron; an artificial neuron (perceptron) is a linear classifier.

Simple, complex, and hyper-complex cells

David H. Hubel and Torsten Wiesel (David Hubel’s Eye, Brain, and Vision) suggested a hierarchy of feature detectors in the visual cortex, with higher-level features responding to patterns of activation in lower-level cells, and propagating activation upwards to still higher-level cells.


Hubel/Wiesel Architecture and Multi-layer Neural Network

Hubel and Wiesel’s architecture; a multi-layer neural network is a non-linear classifier.

Neuron: Linear Perceptron
•  Inputs are feature values
•  Each feature has a weight
•  The sum is the activation
•  If the activation is:
–  Positive, output +1
–  Negative, output -1

Slide credit: Pieter Abbeel and Dan Klein
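A minimal sketch of the linear perceptron just described, plus the classic mistake-driven update rule (the update rule is standard, though this slide only defines the prediction):

```python
# Linear perceptron: activation = w . f(x); output +1 if positive, else -1.
def perceptron_predict(weights, features):
    activation = sum(w * f for w, f in zip(weights, features))
    return 1 if activation > 0 else -1

# On a mistake, nudge the weights toward the true label.
def perceptron_update(weights, features, label, lr=1.0):
    if perceptron_predict(weights, features) != label:
        weights = [w + lr * label * f for w, f in zip(weights, features)]
    return weights
```

On linearly separable data, repeating the update over the training examples drives the mistakes to zero.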

Two-layer perceptron network

Slide credit: Pieter Abbeel and Dan Klein


Two-layer perceptron network

Slide credit: Pieter Abbeel and Dan Klein

Learning w
•  Training examples
•  Objective: a misclassification loss
•  Procedure: gradient descent / hill climbing

Slide credit: Pieter Abbeel and Dan Klein


Hill climbing

•  Simple, general idea:
–  Start wherever
–  Repeat: move to the best neighboring state
–  If no neighbors are better than the current state, quit
–  Neighbors = small perturbations of w
•  What’s bad? Is it optimal?

Slide credit: Pieter Abbeel and Dan Klein
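The procedure on this slide can be sketched directly; the coordinate-wise perturbation scheme and step size below are illustrative choices, not from the slides:

```python
def hill_climb(loss, w, step=0.1, iters=1000):
    """Hill climbing on a weight vector: repeatedly try small perturbations
    (one coordinate at a time) and keep the best neighbor; quit when no
    neighbor is better than the current state."""
    best = loss(w)
    for _ in range(iters):
        # Neighbors = small perturbations of w.
        neighbors = [w[:i] + [w[i] + d] + w[i + 1:]
                     for i in range(len(w)) for d in (-step, step)]
        cand = min(neighbors, key=loss)
        if loss(cand) >= best:   # no neighbor better than current: quit
            return w
        w, best = cand, loss(cand)
    return w
```

The “what’s bad?” answer is visible in the code: it stops at the first point with no better neighbor, which may be a local optimum rather than the global one.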

Two-layer perceptron network

Slide credit: Pieter Abbeel and Dan Klein


Two-layer neural network

Slide credit: Pieter Abbeel and Dan Klein

Neural network properties
•  Theorem (universal function approximators): A two-layer network with a sufficient number of neurons can approximate any continuous function to any desired accuracy.
•  Practical considerations:
–  Can be seen as learning the features
–  A large number of neurons brings the danger of overfitting
–  The hill-climbing procedure can get stuck in bad local optima

Slide credit: Pieter Abbeel and Dan Klein. Approximation by Superpositions of a Sigmoidal Function, 1989.

Multi-layer Neural Network
•  A non-linear classifier
•  Training: find network weights w to minimize the error between the true training labels and the estimated labels
•  Minimization can be done by gradient descent, provided f is differentiable
•  This training method is called back-propagation
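A toy illustration of this training procedure, and of the two-layer approximation theorem above: fit the continuous function f(x) = x² with a tanh hidden layer and a linear output, trained by gradient descent with hand-derived back-propagation. The layer size, learning rate, and target function are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.linspace(-1.0, 1.0, 50).reshape(-1, 1)
y = X ** 2   # continuous target to approximate

W1 = rng.normal(0.0, 1.0, (1, 16)); b1 = np.zeros(16)
W2 = rng.normal(0.0, 1.0, (16, 1)); b2 = np.zeros(1)

lr = 0.2
for _ in range(20000):
    # Forward pass: x -> tanh hidden layer -> linear output.
    h = np.tanh(X @ W1 + b1)
    out = h @ W2 + b2
    # Backward pass: gradients of the mean squared error (back-propagation).
    d_out = 2.0 * (out - y) / len(X)
    d_h = (d_out @ W2.T) * (1.0 - h ** 2)   # tanh' = 1 - tanh^2
    W2 -= lr * h.T @ d_out; b2 -= lr * d_out.sum(0)
    W1 -= lr * X.T @ d_h;   b1 -= lr * d_h.sum(0)
```

After training, the network output tracks x² closely on [-1, 1], which is the universal-approximation claim in miniature.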


Outline
•  Deep Neural Networks
•  Convolutional Neural Networks (CNNs)

Convolutional Neural Networks (CNN, ConvNet, DCN)
•  A CNN is a multi-layer neural network with:
–  Local connectivity: neurons in a layer are only connected to a small region of the layer before it
–  Shared weight parameters across spatial positions: learning shift-invariant filter kernels

Image credit: A. Karpathy

Neocognitron [Fukushima, Biological Cybernetics 1980]

Deformation-resistant recognition: S-cells (“simple”) extract local features; C-cells (“complex”) allow for positional errors.


LeNet [LeCun et al. 1998]

Gradient-based learning applied to document recognition [LeCun, Bottou, Bengio, Haffner 1998]; LeNet-1 dates from 1993.
•  Stack multiple stages of feature extractors
•  Higher stages compute more global, more invariant features
•  A classification layer at the end

Convolutional Neural Networks
Input Image → Convolution (learned) → Non-linearity → Spatial pooling → Feature maps

What is a Convolution?
•  A weighted moving sum: slide a filter over the input and take a weighted sum at every position, producing a feature (activation) map.
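A rough sketch of this weighted moving sum, assuming a single-channel NumPy image (CNN libraries actually compute cross-correlation rather than flipped convolution, but the principle is the same):

```python
import numpy as np

def conv2d(image, kernel):
    """Weighted moving sum: slide the kernel over the image and take a
    dot product at every valid position, producing a feature map."""
    H, W = image.shape
    kh, kw = kernel.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out
```

For example, a vertical-edge filter responds strongly where a dark-to-bright column boundary passes under it and is silent in flat regions.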


Why convolution?
•  Few parameters (filter weights)
•  Dependencies are local
•  Translation invariance

Convolutional Neural Networks
Input Image → Convolution (learned) → Non-linearity (Rectified Linear Unit, ReLU) → Spatial pooling → Feature maps

slide credit: S. Lazebnik


Non-Linearity

•  Applied per element (independently)
•  Options:
–  Tanh
–  Sigmoid: 1/(1+exp(-x))
–  Rectified linear unit (ReLU): max(0, x)
•  ReLU makes learning faster, simplifies back-propagation, and avoids saturation issues → the preferred option
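A minimal sketch of the three per-element options listed above:

```python
import numpy as np

def tanh(x):
    return np.tanh(x)           # saturates toward -1/+1 for large |x|

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))   # saturates toward 0/1 for large |x|

def relu(x):
    # Preferred: cheap, and its gradient does not saturate for x > 0.
    return np.maximum(0.0, x)
```

The saturation issue is why ReLU wins: for inputs far from zero, tanh and sigmoid gradients shrink toward zero, while ReLU passes gradients through unchanged wherever the input is positive.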

Convolutional Neural Networks
Input Image → Convolution (learned) → Non-linearity → Spatial pooling → Normalization → Feature maps

Max pooling: a non-linear down-sampling that provides translation invariance.

Spatial Pooling
•  Average or max
•  Non-overlapping / overlapping regions
•  Role of pooling:
–  Invariance to small transformations
–  Larger receptive fields (see more of the input)
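A minimal sketch of non-overlapping max/average pooling on a single-channel NumPy feature map (the function name and `mode` flag are illustrative):

```python
import numpy as np

def pool2d(fmap, size=2, mode="max"):
    """Non-overlapping spatial pooling: downsample a feature map by taking
    the max (or average) over each size x size block."""
    H, W = fmap.shape
    # Crop any ragged border, then view the map as a grid of blocks.
    blocks = fmap[:H - H % size, :W - W % size] \
        .reshape(H // size, size, W // size, size)
    if mode == "max":
        return blocks.max(axis=(1, 3))
    return blocks.mean(axis=(1, 3))
```

Because only the strongest (or mean) response per block survives, shifting the input by a pixel within a block often leaves the pooled output unchanged, which is the translation-invariance role described above.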


Engineered vs. learned features

Engineered: Image → Feature extraction → Pooling → Classifier → Label

Learned: Image → Convolution/pool → Convolution/pool → Convolution/pool → Convolution/pool → Convolution/pool → Dense → Dense → Dense → Label

Convolutional filters are trained in a supervised manner by back-propagating the classification error.

Compare: SIFT descriptor [Lowe IJCV 2004]
Image pixels → Apply oriented filters → Spatial pool (sum) → Normalize to unit length → Feature vector

Compare: Spatial Pyramid Matching [Lazebnik, Schmid, Ponce CVPR 2006]
SIFT features → Filter with visual words → Take max visual-word response → Multi-scale spatial pool (sum) → Global image descriptor


Previous ConvNet successes

•  Handwritten text/digits
–  MNIST (0.17% error [Ciresan et al. 2011])
–  Arabic & Chinese [Ciresan et al. 2012]
•  Simpler recognition benchmarks
–  CIFAR-10 (9.3% error [Wan et al. 2013])
–  Traffic sign recognition: 0.56% error vs. 1.16% for humans [Ciresan et al. 2011]

ImageNet Challenge 2012 [Deng et al. CVPR 2009]

•  ~14 million labeled images, 20k classes
•  Images gathered from the Internet
•  Human labels via Amazon Mechanical Turk
•  ImageNet Challenge: 1.2 million training images, 1000 classes

AlexNet: similar framework to LeCun ’98, but:
•  Bigger model (7 hidden layers, 650,000 units, 60,000,000 parameters)
•  More data (10^6 vs. 10^3 images)
•  GPU implementation (50x speedup over CPU); trained on two GPUs for a week

A. Krizhevsky, I. Sutskever, and G. Hinton, ImageNet Classification with Deep Convolutional Neural Networks, NIPS 2012


AlexNet for image classification

Fixed input size: 224x224x3 → AlexNet → “car”

ImageNet Classification Challenge

http://image-net.org/challenges/talks/2016/ILSVRC2016_10_09_clsloc.pdf

Industry Deployment
•  Used at Facebook, Google, Microsoft
•  Startups
•  Image recognition, speech recognition, …
•  Fast at test time

Taigman et al., DeepFace: Closing the Gap to Human-Level Performance in Face Verification, CVPR ’14


Visualizing CNNs
•  What input pattern originally caused a given activation in the feature maps?

Layer 1 and Layer 2 visualizations.

Visualizing and Understanding Convolutional Networks [Zeiler and Fergus, ECCV 2014]


Layer 3, 4, and 5 visualizations.

Visualizing and Understanding Convolutional Networks [Zeiler and Fergus, ECCV 2014]

Beyond classification
•  Detection
•  Segmentation
•  Regression
•  Pose estimation
•  Matching patches
•  Synthesis
•  and many more…


R-CNN: Regions with CNN features
•  Trained on ImageNet classification
•  Fine-tune the CNN on PASCAL

R-CNN [Girshick et al. CVPR 2014]

Labeling Pixels: Semantic Labels
Fully Convolutional Networks for Semantic Segmentation [Long et al. CVPR 2015]

Labeling Pixels: Edge Detection
DeepEdge: A Multi-Scale Bifurcated Deep Network for Top-Down Contour Detection [Bertasius et al. CVPR 2015]


CNN for Regression
DeepPose [Toshev and Szegedy CVPR 2014]

CNN as a Similarity Measure for Matching
FaceNet [Schroff et al. 2015]; stereo matching [Zbontar and LeCun CVPR 2015]; patch comparison [Zagoruyko and Komodakis 2015]; matching ground and aerial images [Lin et al. CVPR 2015]; FlowNet [Fischer et al. 2015]

CNN for Image Generation
Learning to Generate Chairs with Convolutional Neural Networks [Dosovitskiy et al. CVPR 2015]


Chair Morphing
Learning to Generate Chairs with Convolutional Neural Networks [Dosovitskiy et al. CVPR 2015]

Transfer Learning
•  Improvement of learning in a new task through the transfer of knowledge from a related task that has already been learned
•  Used as weight initialization for CNNs

Learning and Transferring Mid-Level Image Representations using Convolutional Neural Networks [Oquab et al. CVPR 2014]

Deep learning libraries
•  TensorFlow
•  Caffe
•  Torch
•  MatConvNet


Fooling CNNs

Intriguing Properties of Neural Networks [Szegedy et al. ICLR 2014]

What is going on?

The adversarial update nudges the image x along the gradient of the error with respect to the input: x ← x + α ∂E/∂x

http://karpathy.github.io/2015/03/30/breaking-convnets/
Explaining and Harnessing Adversarial Examples [Goodfellow et al., ICLR 2015]
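The sign-of-gradient variant of this update (the fast gradient sign method from the Goodfellow et al. paper) can be sketched on a made-up linear "classifier"; the weights, input, and loss below are purely illustrative, not a network from the slides:

```python
import numpy as np

def fgsm(x, grad_wrt_x, alpha=0.1):
    """Fast gradient sign method: step the input in the sign of the
    gradient of the loss with respect to the input."""
    return x + alpha * np.sign(grad_wrt_x)

# Illustrative linear classifier: score = w . x, predict +1 if score > 0.
w = np.array([1.0, -2.0, 0.5])
x = np.array([0.3, 0.1, 0.2])   # score = 0.2 -> predicted +1
# For a loss E = -score (rises as the score falls), dE/dx = -w.
x_adv = fgsm(x, -w, alpha=0.2)  # small perturbation flips the prediction
```

Even a small alpha flips the sign of the score here, which is the linear-models explanation of adversarial examples: many small, coordinated input changes add up through the dot product.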

Questions?

See you Tuesday!