Learning Visual Semantics: Models, Massive Computation, and Innovative Applications

Tutorial at CVPR 2014, June 23rd, 1:00pm-5:00pm, Columbus, OH

Introduction

Instructors:

Shih-Fu Chang, John Smith, Rogerio Feris, Liangliang Cao

Columbia University; IBM T. J. Watson Research Center

1970s

Early Days of Computer Vision

First Digital Camera (1975)

0.01 Megapixels

23 seconds to record a photo to cassette

Datasets with 5 or 10 images

Large-Scale Experiment: 800 photos (Takeo Kanade Thesis, 1973)

[D. Marr, 1976]

Today

Visual Data is Exploding!

Announcement of Pope Benedict in 2005

Announcement of Pope Francis in 2013

Rapid proliferation of mobile devices equipped with cameras

Billions of cell phones equipped with cameras

~500 billion consumer photos are taken each year worldwide

~500 million photos are taken per year in NYC alone

Hundreds of millions of Facebook photo uploads per day

Era of Big Visual Data

Exciting Time for Computer Vision

+ DATA

+ Computational Processing

+ Advances in Computer Vision and Machine Learning

Major opportunities for systems that automatically extract visual semantics from images and videos

Examples of Practical Application Areas

Smart Surveillance

“Show me all images of people matching the suspect description from time X to time Y from all cameras in area Z.”

Visual Semantics: Fine-grained person attributes

Slide credit: Rogerio Feris

Medical Imaging

Example image labels: MRI Brain Axial, DX Torso, DX Cervical Spine, PET Color, DX Appendage, MRI Knee

Visual Semantics: Medical Image Modality and Anatomy

Slide credit: John Smith

Astronomy

[Cui et al, WACV 2015]

http://www.galaxyzoo.org/

Visual Semantics: morphological galaxy attributes

Slide credit: Rogerio Feris

Huge dataset of galaxy images makes manual labeling infeasible

(important to understand star formation, gas fraction, galaxy evolution, …)

Nature / Ecology

http://www.youtube.com/watch?v=AUL03ivS8bY

http://www.snapshotserengeti.org/

Understanding how competing species coexist is a fundamental theme in ecology, with important implications for biodiversity, and the sustainability of life on Earth

Snapshot Serengeti

Visual Semantics: species of animals from camera traps

Slide credit: Rogerio Feris

Nature / Ecology

Slide credit: Rogerio Feris

Plant Species

[Kumar et al, ECCV 2012]

Bird Species

http://www.vision.caltech.edu/visipedia/

Understanding of migration, conservation, …

Used by botanists, educators, …

Social Media: Visual Sentiment Analysis

Example image labels: Colorful clouds, Misty night, Colorful butterfly, Crying baby

[Borth et al, ACM MM 2013]

Slide credit: Rogerio Feris

Many more applications …

Google Goggles

Amazon

[Kovashka et al, CVPR 2012]

Slide credit: Rogerio Feris

Tutorial Overview

Objectives:

Cover state-of-the-art techniques for learning visual semantics from images and videos

Focus on intuitive, semantic visual representations

Provide tools for scalable learning of semantic models

Cover innovative and practical applications

Provide pointers to related source code and datasets

Part I: Feature Extraction, Coding, and Pooling (Liangliang)

Brief introduction to local feature descriptors, coding, and pooling

Focus on modern representations such as Fisher Vector and Sparse Coding

Connections to feature learning approaches (e.g., deep convolutional neural networks)

Picture credit: Kai Yu
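
To make the coding and pooling steps named in Part I concrete, here is a minimal Python sketch. It uses the simplest baseline (hard-assignment coding against a codebook, followed by sum or max pooling), not Fisher Vector or Sparse Coding. The codebook, the descriptors, and the function names `hard_assign` and `pool` are hypothetical, chosen only for illustration.

```python
import math

def hard_assign(descriptor, codebook):
    """Vector-quantization coding: one-hot code for the nearest codeword."""
    dists = [math.dist(descriptor, word) for word in codebook]
    nearest = dists.index(min(dists))
    return [1.0 if k == nearest else 0.0 for k in range(len(codebook))]

def pool(codes, mode="sum"):
    """Pool per-descriptor codes into a single image-level vector."""
    columns = list(zip(*codes))
    if mode == "sum":
        return [sum(col) for col in columns]
    if mode == "max":
        return [max(col) for col in columns]
    raise ValueError(f"unknown pooling mode: {mode}")

# Toy data (assumed): three 2-D codewords and four local descriptors.
codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
descriptors = [(0.1, 0.1), (0.9, 0.2), (0.8, -0.1), (0.2, 0.9)]

codes = [hard_assign(d, codebook) for d in descriptors]
histogram = pool(codes, "sum")  # bag-of-visual-words counts per codeword
presence = pool(codes, "max")   # 1.0 for every codeword that occurs at all
```

In practice the descriptors would come from a local feature extractor and the codebook from clustering. Soft-assignment schemes such as sparse coding replace the one-hot code with a weighted combination of codewords.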

Part II: Large-Scale Semantic Modeling (John Smith)

Semantic Concept Modeling: Historic Overview

Picture credit: John Smith

How to deal with class imbalance?

How to scale to millions of semantic unit models?

Picture credit: John Smith
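
One common way to counter class imbalance is to reweight training examples inversely to class frequency. The sketch below illustrates that general heuristic only; this section does not say which technique the tutorial itself uses. The function name `balanced_class_weights` and the toy label counts are assumptions.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}

# Toy concept detector (assumed): 5 positive examples against 95 negatives.
labels = ["positive"] * 5 + ["negative"] * 95
weights = balanced_class_weights(labels)

# After weighting, both classes contribute the same total mass to the loss.
positive_mass = 5 * weights["positive"]
negative_mass = 95 * weights["negative"]
```

Under these weights a rare positive example counts for much more than a common negative one, so a classifier can no longer reach a low loss by predicting the majority class everywhere.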

Part III: Shifting from naming to describing: semantic attribute models (Rogerio Feris)

Scalable learning with Attribute Models / Zero-Shot Learning

[Lampert et al, CVPR 2009]
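
The idea behind attribute-based zero-shot learning can be sketched in a few lines of Python: an unseen class is described by an attribute signature, and an image is assigned to the class whose signature best agrees with the attribute predictions for that image. This is a simplified illustration in the spirit of [Lampert et al, CVPR 2009], not their exact formulation (attribute priors are omitted). The class names, attributes, and probabilities are made up.

```python
import math

def zero_shot_classify(attr_probs, class_signatures, eps=1e-9):
    """Pick the class whose binary attribute signature is most likely.

    attr_probs: attribute name -> predicted probability that it is present.
    class_signatures: class name -> {attribute name: True/False}.
    """
    scores = {}
    for cls, signature in class_signatures.items():
        log_score = 0.0
        for name, present in signature.items():
            p = attr_probs[name]
            log_score += math.log(max(p if present else 1.0 - p, eps))
        scores[cls] = log_score
    return max(scores, key=scores.get), scores

# Signatures for classes with no training images (hypothetical).
class_signatures = {
    "zebra": {"striped": True, "four_legged": True, "aquatic": False},
    "whale": {"striped": False, "four_legged": False, "aquatic": True},
}

# Outputs of attribute classifiers on one test image (hypothetical).
attr_probs = {"striped": 0.9, "four_legged": 0.8, "aquatic": 0.1}

predicted, scores = zero_shot_classify(attr_probs, class_signatures)
```

The attribute classifiers are trained once on seen classes, so adding a new class only requires writing down its signature. That is what makes the approach attractive for scaling.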

Attribute-based Search

Slide credit: Rogerio Feris
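
As a rough illustration of attribute-based search, the Python sketch below ranks indexed images by how well their attribute scores match a query such as "red shirt and beard". The index contents, the attribute names, and the product scoring rule are assumptions made for this example; the tutorial's actual ranking method is not described in this section.

```python
def attribute_search(index, query, top_k=2):
    """Return the top_k image ids ranked by the product of queried attribute scores."""
    def score(attrs):
        total = 1.0
        for name in query:
            total *= attrs.get(name, 0.0)  # a missing attribute counts as absent
        return total

    ranked = sorted(index, key=lambda image_id: score(index[image_id]), reverse=True)
    return ranked[:top_k]

# Hypothetical index: image id -> attribute detector scores in [0, 1].
index = {
    "cam1_0001": {"red_shirt": 0.9, "beard": 0.8, "hat": 0.1},
    "cam2_0417": {"red_shirt": 0.2, "beard": 0.9, "hat": 0.7},
    "cam3_0032": {"red_shirt": 0.95, "beard": 0.1, "hat": 0.9},
}

results = attribute_search(index, ["red_shirt", "beard"])
```

Taking the product rewards images where all queried attributes are present at once. A sum would instead let one very confident attribute compensate for a missing one.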

Part IV: High-level Semantic Modeling: Visual Sentiment Analysis (Shih-Fu Chang)

Semantic models for encoding emotions in social media