OverFeat: Integrated Recognition, Localization and Detection using Convolutional Networks
Sermanet et al.
Presentation by Eric Holmdahl

Stanford University: vision.stanford.edu/teaching/cs231b_spring1415/slides/overfeat_eric.pdf

Roadmap
1. Goal
2. Background
3. Related Work
4. Algorithm Overview
5. Breakdown By Task
   1. Classification
   2. Localization
   3. Detection

Goal  

Perform classification, localization, and detection on the ImageNet dataset

Classification
• Determining what the main object in an image is

Localization
• Determining where an object is located in an image

Detection
• Performing localization for all objects present in an image

Background:  Feed  Forward  Neural  Networks  

Background: Convolutional Nets
• Alternating convolution and max pooling layers feed into a fully connected neural net
• Max pooling: with window size k×k, outputs the highest value in each window
• Convolution: scanning window, with the same weights shared across all window positions
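The two operations in the bullets above can be sketched in a few lines of NumPy. This is a minimal illustration, assuming a single-channel image, stride 1, no padding, and non-overlapping pooling; the function names `conv2d_valid` and `max_pool` are hypothetical and not taken from the slides.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (stride 1, no padding).

    The same kernel weights are reused at every window position,
    which is the weight sharing mentioned in the convolution bullet.
    """
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(feature_map, k):
    """Non-overlapping k x k max pooling; rows/columns that do not fit are dropped."""
    h, w = feature_map.shape[0] // k, feature_map.shape[1] // k
    trimmed = feature_map[:h * k, :w * k]
    # Group pixels into (h, k, w, k) blocks and take the max inside each block.
    return trimmed.reshape(h, k, w, k).max(axis=(1, 3))

image = np.arange(16, dtype=float).reshape(4, 4)
features = conv2d_valid(image, np.ones((2, 2)))  # 3x3 feature map
pooled = max_pool(image, 2)                      # 2x2 pooled map
```

A real convolutional layer would add multiple input and output channels, a bias, and a nonlinearity; those are omitted here to keep the scanning-window idea visible.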

Related  Work  

   

Krizhevsky et al.: ImageNet Classification With Deep Convolutional Neural Networks

Review: Krizhevsky Architecture
• Large CNNs used to densely process images with overlapping windows
• ReLU nonlinear neuron output
• Dropout
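As a rough illustration of the last two bullets, the sketch below implements ReLU and dropout in NumPy. It assumes the "inverted" dropout variant, which rescales surviving units during training; that choice, the drop probability, and the function names are assumptions for illustration rather than details from the slides.

```python
import numpy as np

def relu(x):
    """Rectified linear unit: negative activations are clamped to zero."""
    return np.maximum(x, 0.0)

def dropout(x, drop_prob, rng, training=True):
    """Inverted dropout (assumed variant): zero each unit with probability
    `drop_prob` during training and rescale the survivors so the expected
    activation is unchanged. At test time the input passes through as-is."""
    if not training or drop_prob == 0.0:
        return x
    keep_mask = rng.random(x.shape) >= drop_prob
    return x * keep_mask / (1.0 - drop_prob)

rng = np.random.default_rng(0)
activations = relu(np.array([-2.0, -0.5, 0.0, 1.5, 3.0]))
dropped = dropout(activations, 0.5, rng)                  # training-time pass
unchanged = dropout(activations, 0.5, rng, training=False)  # test-time pass
```

With a drop probability of 0.5, every surviving unit is doubled, so each entry of `dropped` is either zero or twice the corresponding activation.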

Krizhevsky Results
• Brought CNNs to the forefront of the classification/localization/detection problem

Giusti et al.: Fast Image Scanning With Deep Convolutional Networks

Giusti Fast Scanning
• Problem: CNNs perform a great deal of redundant computation of convolutions due to overlapping patches
• Solution: Apply convolution to the entire image at once!
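The redundancy described above can be checked numerically. The sketch below, assuming a single convolution layer with stride 1 and a hypothetical 5x5 sliding window, shows that convolving the whole image once reproduces the feature map of every overlapping patch. It covers only the convolution step, not how pooling layers are handled.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Plain stride-1 convolution without padding (weights shared across positions)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

rng = np.random.default_rng(0)
image = rng.random((8, 8))
kernel = rng.random((3, 3))
patch = 5  # hypothetical sliding-window size

# Patch-by-patch: neighbouring windows recompute the same products.
per_patch = {}
for top in range(image.shape[0] - patch + 1):
    for left in range(image.shape[1] - patch + 1):
        window = image[top:top + patch, left:left + patch]
        per_patch[(top, left)] = conv2d_valid(window, kernel)

# Whole image at once: every convolution output is computed exactly once.
full = conv2d_valid(image, kernel)
out = patch - kernel.shape[0] + 1  # feature-map side length per patch
matches = all(
    np.allclose(per_patch[(t, l)], full[t:t + out, l:l + out])
    for (t, l) in per_patch
)
```

Here the 16 patches together compute 16 × 9 = 144 outputs, while the full-image pass computes 36, and each patch's feature map is just a crop of the full one.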

[Figure panels: Convolutional Layer; Max Pooling Layer]

Giusti et al.: Results

Provides massive improvements in speed for sliding window CNNs!

Algorithm  Overview  

Algorithm Overview: Training
1. Train Classifier
2. Train Localization Regressor

• Input: Images with class labels and bounding boxes
• Training objective: Minimize the L2 norm between the generated bounding box and the ground truth
• One regressor generated for each possible image class
• Output: (x, y) coordinates of the top-left and bottom-right corners of the bounding box
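A minimal sketch of the training objective in the bullets above, assuming boxes are stored as (x1, y1, x2, y2) and that the class-specific regressors are represented as one row of box coordinates per class. The function names and the example numbers are illustrative only.

```python
import numpy as np

def l2_box_loss(predicted_box, target_box):
    """L2 norm between a predicted and a ground-truth box, both (x1, y1, x2, y2)."""
    diff = np.asarray(predicted_box, dtype=float) - np.asarray(target_box, dtype=float)
    return float(np.linalg.norm(diff))

def class_specific_loss(per_class_boxes, true_class, target_box):
    """Only the regressor belonging to the labelled class contributes to the loss.

    per_class_boxes: array of shape (num_classes, 4), one predicted box per class.
    """
    return l2_box_loss(per_class_boxes[true_class], target_box)

per_class_boxes = np.array([
    [0.0, 0.0, 10.0, 10.0],    # prediction of the class-0 regressor
    [10.0, 10.0, 50.0, 50.0],  # prediction of the class-1 regressor
])
loss = class_specific_loss(per_class_boxes, true_class=1, target_box=[13.0, 14.0, 50.0, 50.0])
```

In this example only the top-left corner is off, by (3, 4) pixels, so the loss is 5.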

   

Algorithm  Overview:  Runtime  

1. Perform classification at each location using the trained CNN

2. Perform localization on all classified regions generated by the classifier

3. Merge bounding boxes whose localization outputs overlap sufficiently and which the classifier is sufficiently confident belong to the same object
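The merging step can be sketched as a greedy loop. This is a simplified stand-in, not the method's exact procedure: it assumes intersection-over-union as the overlap measure, fixed thresholds `min_iou` and `min_conf`, coordinate averaging for merged boxes, and summed confidences. All names and thresholds are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def merge_boxes(detections, min_iou=0.5, min_conf=0.5):
    """Greedily merge same-class boxes that overlap enough.

    detections: list of (label, confidence, box) tuples.
    Low-confidence detections are dropped first; merged boxes average
    their coordinates and accumulate their confidences.
    """
    kept = [d for d in detections if d[1] >= min_conf]
    merged_something = True
    while merged_something:
        merged_something = False
        for i in range(len(kept)):
            for j in range(i + 1, len(kept)):
                (label_a, conf_a, box_a), (label_b, conf_b, box_b) = kept[i], kept[j]
                if label_a == label_b and iou(box_a, box_b) >= min_iou:
                    averaged = tuple((p + q) / 2.0 for p, q in zip(box_a, box_b))
                    kept[i] = (label_a, conf_a + conf_b, averaged)
                    del kept[j]
                    merged_something = True
                    break
            if merged_something:
                break
    return kept

detections = [
    ("cat", 0.9, (0.0, 0.0, 10.0, 10.0)),
    ("cat", 0.8, (1.0, 1.0, 11.0, 11.0)),
    ("dog", 0.7, (50.0, 50.0, 60.0, 60.0)),
    ("cat", 0.2, (0.0, 0.0, 10.0, 10.0)),  # dropped: below min_conf
]
result = merge_boxes(detections)
```

The two confident "cat" boxes collapse into one averaged box, the "dog" box is left alone, and the low-confidence box never enters the merge.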

Breakdown  By  Task  

Classification

OverFeat Feature Extraction
• First 5 layers of deep convolutional neural net: similar to Krizhevsky’s
• Images downsampled to 256x256
• No contrast normalization, non-overlapping pooling

OverFeat Classification: Dense Sliding Window

Multi-Scale Classification
• Classification performed at 6 scales at test time, but only 1 scale at training time
• Increases robustness of model
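One way to combine the per-scale outputs is sketched below. It assumes each scale yields a spatial map of class scores, takes the spatial maximum per class at each scale, and then averages across scales; this max-then-average rule and the name `combine_scales` are assumptions for illustration, and only two scales are shown instead of six.

```python
import numpy as np

def combine_scales(score_maps):
    """Combine class-score maps from several scales into one score per class.

    score_maps: list of arrays shaped (num_classes, H_s, W_s), one per scale;
    the spatial size may differ from scale to scale.
    """
    # Spatial max: best score of each class anywhere in the map at that scale.
    per_scale = [m.reshape(m.shape[0], -1).max(axis=1) for m in score_maps]
    # Average the per-class maxima over all scales.
    return np.mean(per_scale, axis=0)

maps = [
    np.array([[[0.1, 0.7]], [[0.2, 0.3]]]),  # scale 1: 2 classes on a 1x2 grid
    np.array([[[0.5]], [[0.8]]]),            # scale 2: 2 classes on a 1x1 grid
]
scores = combine_scales(maps)
predicted_class = int(np.argmax(scores))
```

Class 0 wins here because its averaged maximum (0.6) exceeds that of class 1 (0.55), even though class 1 scores higher at the second scale alone.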

Classification: CNNs and Sliding Windows

ConvNets for Detection
● Single output:
  ○ 1x1 output
  ○ no feature space
  ○ blue: feature maps
  ○ green: operation kernel
  ○ typical training setup

OverFeat • Pierre Sermanet • New York University

Classification: CNNs and Sliding Windows

ConvNets for Detection
● Multiple outputs:
  ○ 2x2 output
  ○ input stride 2x2
  ○ recompute only extra yellow areas
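The relation between input size and output grid can be worked through with simple arithmetic. The sketch below assumes a toy network of a 5x5 convolution, a 2x2 pooling layer with stride 2, and a second 5x5 convolution; these layer sizes are assumptions chosen so that the training-size input gives a 1x1 output, and are not read off the slide.

```python
def output_size(input_size, layers):
    """Spatial output size after a stack of (kernel, stride) layers without padding."""
    size = input_size
    for kernel, stride in layers:
        size = (size - kernel) // stride + 1
    return size

# Assumed toy network: conv 5x5, pool 2x2 with stride 2, conv 5x5.
toy_net = [(5, 1), (2, 2), (5, 1)]

# 14 pixels is the assumed training size; larger inputs yield a grid of outputs.
sizes = {n: output_size(n, toy_net) for n in (14, 16, 18)}
```

Under these assumptions a 14-pixel input gives a single output, while each additional 2 input pixels adds one output position, which is what an effective input stride of 2x2 means.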

Classification: CNNs and Sliding Windows

ConvNets for Detection
● With feature space:
  ○ 3 input channels
  ○ 4 feature maps
  ○ 2 feature maps
  ○ 4 feature maps
  ○ 2 outputs (e.g. 2-class classifier)

Classification: Results

Localization  

Training Localizer
• Use same first 5 layers as trained classifier
• Remove fully connected layers, replace with regressor
• Train again on labeled input with bounding boxes

Localization:  Fully  Connected  Layers  

Localization:  Bounding  Boxes  Produced  By  Regression  

Localization: Combining Predictions
Algorithm:

Localization:  Results  

Detection  

ImageNet Challenge 2013
● Detection:
  ○ 200 classes
  ○ Smaller objects than classification/localization
  ○ Any number of objects (including zero)
  ○ Penalty for false positives

Differences Between Detection and Localization
• Can now have many objects instead of just one
• Penalized for incorrect guesses
• Need to distinguish background from objects

Training Detector
• Almost identical to classification/localization training
• New class added – background
• Background class updated on the fly: extremely incorrect classifications are used to train the background class
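The on-the-fly background update can be sketched as a selection step over the current classifier outputs. The example assumes per-window class scores, a known mask of windows that contain no object, and a fixed budget `top_k` of negatives per pass; the function name and all of these parameters are hypothetical.

```python
import numpy as np

def mine_background_examples(scores, is_background, background_class, top_k):
    """Pick the background windows the classifier gets most confidently wrong.

    scores: array (num_windows, num_classes) of current classifier outputs.
    is_background: boolean array, True where the window contains no object.
    Returns indices of the `top_k` background windows with the highest
    score for any non-background class.
    """
    # Strongest object-class score per window, ignoring the background column.
    object_scores = np.delete(scores, background_class, axis=1).max(axis=1)
    candidates = np.flatnonzero(is_background)
    order = np.argsort(-object_scores[candidates])
    return candidates[order[:top_k]]

scores = np.array([
    [0.90, 0.05, 0.05],  # background window mistaken for class 0
    [0.20, 0.10, 0.70],  # background window, correctly scored
    [0.60, 0.30, 0.10],  # background window, moderately wrong
    [0.10, 0.10, 0.80],  # window that actually contains an object
])
is_background = np.array([True, True, True, False])
hard_negatives = mine_background_examples(scores, is_background, background_class=2, top_k=2)
```

The selected windows (indices 0 and 2 here) would then be added to the training set with the background label for the next training pass.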

Detection  Results  

Conclusion  

OverFeat provides a way to extract powerful CNN-based features for image classification, localization and detection with high speed and precision

Thanks!