ECS 289H: Visual Recognition
web.cs.ucdavis.edu/~yjlee/teaching/ecs289h-fall2014/introduction.pdf


ECS 289H: Visual Recognition
Fall 2014

Yong Jae Lee

Department of Computer Science

Plan for today

• Topic overview

• Introductions

• Course requirements and administrivia

• Syllabus tour (if time permits)

Computer Vision

• Let computers see!

• Automatic understanding of visual data (i.e., images and video)

─ Computing properties of the 3D world from visual data (measurement)

─ Algorithms and representations to allow a machine to recognize objects, people, scenes, and activities (perception and interpretation)

Kristen Grauman

What does recognition involve?

Fei-Fei Li

Detection: are there people?

Activity: What are they doing?

Object categorization

mountain

building

tree

banner

vendor

people

street lamp

Instance recognition

Potala Palace

A particular sign

Scene categorization

• outdoor

• city

• …

Attribute recognition

flat

gray

made of fabric

crowded

Why recognition?

• Recognition is a fundamental part of perception
─ e.g., robots, autonomous agents

• Organize and give access to visual content
• Connect to information

• Detect trends and themes

• Why now?

Kristen Grauman

Posing visual queries

Kooaba, Bay & Quack et al.

Yeh et al., MIT

Belhumeur et al.

Kristen Grauman

Interactive systems

Shotton et al.

Autonomous agents

Google self-driving car

Mars rover

Exploring community photo collections

Snavely et al.

Simon & Seitz; Kristen Grauman

Discovering visual patterns

Actions

Categories

Characteristic elements

Wang et al.

Doersch et al.

Lee & Grauman

What kinds of things work best today?

• Recognizing flat, textured objects (e.g., books, CD covers, posters)

• Reading license plates, zip codes, checks

• Fingerprint recognition

• Frontal face detection

Kristen Grauman

What are the challenges?

What we see vs. what a computer sees

Challenges: robustness

• Illumination

• Object pose

• Viewpoint

• Intra-class appearance

• Occlusions

• Clutter

Kristen Grauman

Context cues

Kristen Grauman

Challenges: context and human experience


Context cues Function Dynamics

Kristen Grauman


Fei-Fei Li, Rob Fergus, Antonio Torralba


Challenges: scale, efficiency

• Half of the cerebral cortex in primates is devoted to processing visual information

• 6 billion images; 70 billion images; 10 billion images (various photo-sharing and social sites)

• 1 billion images served daily

• 100 hours of video uploaded per minute

• Almost 90% of web traffic is visual!


Biederman 1987


Fei-Fei Li, Rob Fergus, Antonio Torralba

Challenges: learning with minimal supervision


Kristen Grauman

Slide from Pietro Perona, 2004 Object Recognition workshop


Inputs in 1963…

L. G. Roberts, Machine Perception

of Three Dimensional Solids,

Ph.D. thesis, MIT Department of

Electrical Engineering, 1963.

Kristen Grauman

Personal photo albums

Surveillance and security

Movies, news, sports

Medical and scientific images

… and inputs today

Svetlana Lazebnik

Understand, organize, and index all this data!

Introductions

Course Information

• Website: https://sites.google.com/a/ucdavis.edu/ecs-289h-visual-recognition/

• Office hours: by appointment (email)

• Email: start subject with “[ECS289H]”

This course

• Survey state-of-the-art research in computer vision

• High-level vision and learning problems

Goals

• Understand, analyze, and discuss state-of-the-art techniques

• Identify interesting open questions and future directions

Prerequisites

• Interest in computer vision!

• Courses in computer vision, machine learning, or image processing are a plus (but not required)

Requirements

• Class Participation [25%]

• Paper Reviews [25%]

• Paper Presentations (1-2 times) [25%]

• Final Project [25%]

Class participation

• Actively participate in discussions

• Read all assigned papers before each class

Paper reviews

• Detailed review of one paper before each class

• One page in length:
─ A summary of the paper (2-3 sentences)

─ Main contributions

─ Strengths and weaknesses

─ Experiments convincing?

─ Extensions?

─ Additional comments, including unclear points, open research questions, and applications

• Email by 11:59 pm the day before each class

• Email subject “[ECS 289H] paper review”

Paper presentations

• 1 or 2 times during the quarter

• ~20 minutes long, well-organized, and polished:
─ Clear statement of the problem

─ Motivate why the problem is interesting, important, and/or difficult

─ Describe key contributions and technical ideas

─ Experimental setup and results

─ Strengths and weaknesses

─ Interesting open research questions, possible extensions, and applications

• Slides should be clear and mostly visual (figures, animations, videos)

• Look at background papers as necessary

Paper presentations

• Search for relevant material, e.g., from the authors' webpage, project pages, etc.

• Each slide that is not your own must be clearly cited

• Meet with me one week before your presentation with prepared slides

• Email slides on day of presentation

• Email subject "[ECS 289H] presentation slides"

• Skip paper review the day you are presenting

Final project

• One of:
─ Design and evaluation of a novel approach

─ An extension of an approach studied in class

─ In-depth analysis of an existing technique

─ In-depth comparison of two related existing techniques

• Work in pairs

• Talk to me if you need ideas

Final project

• Final Project Proposal (5%) due 10/31
─ 1 page

• Final Project Presentation (10%) due 12/9, 12/11
─ 15 minutes

• Final Project Paper (10%) due TBD
─ 6-8 pages

Local features and matching

• Local invariant features; detectors and descriptors

• Matching, indexing, bag-of-words
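The bag-of-words idea above can be sketched as: quantize each local descriptor to its nearest codebook word and accumulate a histogram of word counts. A minimal toy sketch (the 2-D descriptors and codebook values are hypothetical; real systems use high-dimensional descriptors such as SIFT and codebooks learned by clustering):

```python
import math

def quantize(descriptor, codebook):
    """Index of the nearest codeword (Euclidean distance)."""
    best, best_dist = 0, float("inf")
    for i, word in enumerate(codebook):
        d = math.dist(descriptor, word)
        if d < best_dist:
            best, best_dist = i, d
    return best

def bag_of_words(descriptors, codebook):
    """Histogram over visual words: one bin per codebook entry."""
    hist = [0] * len(codebook)
    for desc in descriptors:
        hist[quantize(desc, codebook)] += 1
    return hist

# Toy 2-D "descriptors" and a 3-word codebook (hypothetical values)
codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
descriptors = [(0.1, 0.1), (0.9, 0.1), (0.05, 0.95), (0.0, 0.9)]
print(bag_of_words(descriptors, codebook))  # → [1, 1, 2]
```

The resulting histogram is the fixed-length representation used for matching and indexing, regardless of how many local features the image contains.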

Image classification

• Object and scene classification

• Global and object-based representations
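One elementary classifier over global representations is nearest-neighbor matching of image histograms, e.g., with histogram intersection. A toy sketch (the labels and histogram values below are made up for illustration; they are not from any real dataset):

```python
def hist_intersection(h1, h2):
    """Histogram intersection similarity: sum of bin-wise minima."""
    return sum(min(a, b) for a, b in zip(h1, h2))

def classify(query_hist, training_set):
    """Label of the most similar training histogram (1-nearest neighbor)."""
    best_hist, best_label = max(
        training_set, key=lambda item: hist_intersection(query_hist, item[0])
    )
    return best_label

# Hypothetical normalized global histograms with scene labels
training_set = [
    ([0.7, 0.2, 0.1], "beach"),
    ([0.1, 0.2, 0.7], "forest"),
]
print(classify([0.6, 0.3, 0.1], training_set))  # → beach
```

State-of-the-art classifiers replace the nearest-neighbor rule with learned models (e.g., SVMs or neural networks), but the pipeline shape, global representation in and category label out, is the same.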

Object detection

• Discriminative methods

• Faces, people, objects
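Discriminative detectors are typically run in a sliding-window fashion: score every window location with a classifier and keep locations above a threshold. A minimal sketch with a hypothetical scoring function (a real detector scores learned features, scans multiple scales, and applies non-maximum suppression):

```python
def sliding_window_detect(score_fn, img_w, img_h, win, step, thresh):
    """Scan fixed-size windows over the image; keep scores above thresh."""
    detections = []
    for y in range(0, img_h - win + 1, step):
        for x in range(0, img_w - win + 1, step):
            s = score_fn(x, y, win)
            if s > thresh:
                detections.append((x, y, s))
    return detections

# Hypothetical scorer: response peaks when the window centers on (32, 32)
def toy_score(x, y, win):
    cx, cy = x + win // 2, y + win // 2
    return 1.0 / (1.0 + abs(cx - 32) + abs(cy - 32))

hits = sliding_window_detect(toy_score, 64, 64, 16, 8, 0.5)
print(hits)  # → [(24, 24, 1.0)]
```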

Unsupervised/weakly-supervised discovery

• Clustering, context

• Mid-level visual elements

Segmentation

• Grouping pixels
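One elementary way to group pixels is k-means clustering on intensities. A toy sketch on a 1-D grayscale strip (hypothetical values; real segmentation methods cluster color, texture, and position, or use graph-based grouping such as normalized cuts):

```python
def kmeans_segment(pixels, k, iters=10):
    """Cluster pixel intensities with k-means; return a label per pixel."""
    # Initialize centers spread evenly across the observed intensity range
    lo, hi = min(pixels), max(pixels)
    centers = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    labels = [0] * len(pixels)
    for _ in range(iters):
        # Assignment step: each pixel joins its nearest center
        labels = [min(range(k), key=lambda c: abs(p - centers[c])) for p in pixels]
        # Update step: move each center to the mean of its members
        for c in range(k):
            members = [p for p, l in zip(pixels, labels) if l == c]
            if members:
                centers[c] = sum(members) / len(members)
    return labels

# Toy grayscale strip: a dark region followed by a bright region
strip = [10, 12, 11, 200, 210, 205]
print(kmeans_segment(strip, 2))  # → [0, 0, 0, 1, 1, 1]
```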

Human-in-the-loop

• Interactive systems, crowdsourcing

• Active learning

Activity recognition

• What are the humans in the image or video doing?

Human pose

• Finding people and their poses

Attributes

• Describing properties of objects, people, scenes

Image search and mining

• Large-scale instance-based recognition

• Discovering connections

Language and images

• Generating image descriptions

Big data

• Data-driven techniques

• Dataset bias

First-person vision

• Recognition from the first-person view

• Social scene understanding

Deep convolutional neural networks

Coming up

• Carefully read class webpage

• Sign-up for papers you want to present

• Next class: overview of my research

Recommended