21
YOLO: You Only Look Once Unified Real-Time Object Detection Slides by: Andrea Ferri For: Computer Vision Reading Group (08/03/16) Joseph Redmon, Santosh Divvala, Ross Girshick, Ali Farhadi [Website ] [Paper ] [arXiv ] [Reviews ]

YOLO: You Only Look Once - UniFI · YOLO: You Only Look Once Unified Real-Time Object Detection Slides by: Andrea Ferri For: Computer Vision Reading Group (08/03/16) Joseph Redmon,

  • Upload
    hanhi

  • View
    246

  • Download
    1

Embed Size (px)

Citation preview

Page 1: YOLO: You Only Look Once - UniFI · YOLO: You Only Look Once Unified Real-Time Object Detection Slides by: Andrea Ferri For: Computer Vision Reading Group (08/03/16) Joseph Redmon,

YOLO:

You Only Look OnceUnified Real-Time Object Detection

Slides by: Andrea FerriFor: Computer Vision Reading Group (08/03/16)

Joseph Redmon, Santosh Divvala, Ross Girshick, Ali Farhadi

[Website] [Paper] [arXiv] [Reviews]

Page 2: YOLO: You Only Look Once - UniFI · YOLO: You Only Look Once Unified Real-Time Object Detection Slides by: Andrea Ferri For: Computer Vision Reading Group (08/03/16) Joseph Redmon,

INTRODUCTION

Page 3: YOLO: You Only Look Once - UniFI · YOLO: You Only Look Once Unified Real-Time Object Detection Slides by: Andrea Ferri For: Computer Vision Reading Group (08/03/16) Joseph Redmon,

Nowadays State of the Art approach, are so architected:

Conv

Layer 5C

on

v

layers

RPN RPN Proposals

RPN Proposals

Class probabilities

RoI pooling layer

FC layers

Class scores

Page 4: YOLO: You Only Look Once - UniFI · YOLO: You Only Look Once Unified Real-Time Object Detection Slides by: Andrea Ferri For: Computer Vision Reading Group (08/03/16) Joseph Redmon,

This complex pipeline means that:

Slow Pipeline

Single Pipelines Hard to Optimize

Need Parallel Training for Components

Page 5: YOLO: You Only Look Once - UniFI · YOLO: You Only Look Once Unified Real-Time Object Detection Slides by: Andrea Ferri For: Computer Vision Reading Group (08/03/16) Joseph Redmon,

WHAT’S NEW?(In the architecture approach.)

Page 6: YOLO: You Only Look Once - UniFI · YOLO: You Only Look Once Unified Real-Time Object Detection Slides by: Andrea Ferri For: Computer Vision Reading Group (08/03/16) Joseph Redmon,

Developed as Single Convolutional Network

Reason Globally on the Entire Image

Learns Generalizable Representations

Easy & Fast

Detection as Single Regression Problem

Concepts

Page 7: YOLO: You Only Look Once - UniFI · YOLO: You Only Look Once Unified Real-Time Object Detection Slides by: Andrea Ferri For: Computer Vision Reading Group (08/03/16) Joseph Redmon,

Unified Detection

Page 8: YOLO: You Only Look Once - UniFI · YOLO: You Only Look Once Unified Real-Time Object Detection Slides by: Andrea Ferri For: Computer Vision Reading Group (08/03/16) Joseph Redmon,

Divide the image into a SxS grid.

If the center of an object fall into a grid cell, it will be the responsible for the object.

Each grid cell predict:

B bounding boxes;

B confidence scores as C=Pr(Obj)*IOU;

Confidence Prediction is obtained as IOU of predicted box and any ground truth box.

C cond. Class prob. as P=Pr(𝑪𝒍𝒂𝒔𝒔𝒊|Object);

Page 9: YOLO: You Only Look Once - UniFI · YOLO: You Only Look Once Unified Real-Time Object Detection Slides by: Andrea Ferri For: Computer Vision Reading Group (08/03/16) Joseph Redmon,

We obtain the class-specific confidence score as:

Pr(𝑪𝒍𝒂𝒔𝒔𝒊|Object)*Pr(Object)*IOU=

Pr(𝑪𝒍𝒂𝒔𝒔𝒊)*IOU

Page 10: YOLO: You Only Look Once - UniFI · YOLO: You Only Look Once Unified Real-Time Object Detection Slides by: Andrea Ferri For: Computer Vision Reading Group (08/03/16) Joseph Redmon,

Design

Page 11: YOLO: You Only Look Once - UniFI · YOLO: You Only Look Once Unified Real-Time Object Detection Slides by: Andrea Ferri For: Computer Vision Reading Group (08/03/16) Joseph Redmon,

Loss-Function

Page 12: YOLO: You Only Look Once - UniFI · YOLO: You Only Look Once Unified Real-Time Object Detection Slides by: Andrea Ferri For: Computer Vision Reading Group (08/03/16) Joseph Redmon,

LimitationsStruggle with Small Object.

Loss function threats errors in different boxes ratio at the same.

Struggle with Different aspects and ratios of objects.Loss function is an approximation.

Page 13: YOLO: You Only Look Once - UniFI · YOLO: You Only Look Once Unified Real-Time Object Detection Slides by: Andrea Ferri For: Computer Vision Reading Group (08/03/16) Joseph Redmon,

EXPERIMENTS(How performs?.)

Page 14: YOLO: You Only Look Once - UniFI · YOLO: You Only Look Once Unified Real-Time Object Detection Slides by: Andrea Ferri For: Computer Vision Reading Group (08/03/16) Joseph Redmon,

General Comparison

Page 15: YOLO: You Only Look Once - UniFI · YOLO: You Only Look Once Unified Real-Time Object Detection Slides by: Andrea Ferri For: Computer Vision Reading Group (08/03/16) Joseph Redmon,

Fast R-CNN & YOLO

Page 16: YOLO: You Only Look Once - UniFI · YOLO: You Only Look Once Unified Real-Time Object Detection Slides by: Andrea Ferri For: Computer Vision Reading Group (08/03/16) Joseph Redmon,

Fast R-CNN & YOLOUsing YOLO accuracy for Big object to avoid detection mistakes into Fast R-CNN:

Page 17: YOLO: You Only Look Once - UniFI · YOLO: You Only Look Once Unified Real-Time Object Detection Slides by: Andrea Ferri For: Computer Vision Reading Group (08/03/16) Joseph Redmon,

Fast R-CNN & YOLO

Page 18: YOLO: You Only Look Once - UniFI · YOLO: You Only Look Once Unified Real-Time Object Detection Slides by: Andrea Ferri For: Computer Vision Reading Group (08/03/16) Joseph Redmon,

SUMMARY(Why is an interesting approach.)

Page 19: YOLO: You Only Look Once - UniFI · YOLO: You Only Look Once Unified Real-Time Object Detection Slides by: Andrea Ferri For: Computer Vision Reading Group (08/03/16) Joseph Redmon,

The fastest general-purpose object

detector in the literature.

Trained on a loss function that

directly corresponds to detection performance.

The entire model is trained jointly.

At least detection at 45fps.

Pros

Page 20: YOLO: You Only Look Once - UniFI · YOLO: You Only Look Once Unified Real-Time Object Detection Slides by: Andrea Ferri For: Computer Vision Reading Group (08/03/16) Joseph Redmon,

• You Only Look Once: Unified, Real-Time Object Detection,

Joseph Redmon, Santosh Divvala, Ross Girshick, Ali Farhadi.

References

Page 21: YOLO: You Only Look Once - UniFI · YOLO: You Only Look Once Unified Real-Time Object Detection Slides by: Andrea Ferri For: Computer Vision Reading Group (08/03/16) Joseph Redmon,

QUESTIONS?

THANKS !!!