Learning realistic human actions from movies

by Ivan Laptev, Marcin Marszalek, Cordelia Schmid, and Benjamin Rozenfeld

PRESENTATION BY KERRY SEITZ

The Problem

Recognize natural human actions

Realistic videos

Getting out of a car

Answering a phone

Performing CPR

Kissing

[LAPTEV ET AL. 2008]

Challenges

Lack of datasets

Variations in:

◦ Expression, posture, motion, and clothing

◦ Camera motion and perspective

◦ Illumination

◦ Occlusion and surroundings

Automatic Annotation of Human Actions

Use movie scripts

Problems

◦ No time information

◦ Script and movie don’t always match

◦ Variations in phrasing

Script-to-Video Alignment

Alignment score (a) for each scene

◦ Script-subtitle misalignment

◦ a = (# matched words) / (# all words)

Types of errors when a = 1

◦ Misaligned in time (10%)

◦ Outside the field of view (10%)

◦ Missing in the video (10%)
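The alignment score a = (# matched words) / (# all words) can be illustrated with a minimal sketch. The slides do not say how words are tokenized or matched, so the function names `tokenize` and `alignment_score`, the tokenization rule, and the rule that a script word is "matched" when it also occurs in the aligned subtitle text are all assumptions made for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (assumed rule)."""
    return re.findall(r"[a-z']+", text.lower())


def alignment_score(script_words, subtitle_words):
    """Return a = (# matched words) / (# all words) for one scene.

    Assumption: a script word is matched if it also occurs in the subtitle
    text; each subtitle word can be used for at most one match.
    """
    if not script_words:
        return 0.0
    remaining = Counter(subtitle_words)
    matched = 0
    for word in script_words:
        if remaining[word] > 0:
            remaining[word] -= 1
            matched += 1
    return matched / len(script_words)


script = tokenize("I'm going to get out of the car")
subtitle = tokenize("I'm going to get out of here")
score = alignment_score(script, subtitle)  # 6 of 8 script words match: 0.75
```

Note that even a = 1 only says the text lines up; as the error types above show, the action itself may still be misaligned, off screen, or missing.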

Text Retrieval of Human Actions

Phrasing variations

◦ “Will gets out of the Chevrolet.”

◦ “A black car pulls up. Two army officers get out.”

◦ “Erin exits her new truck.”

False positives

◦ “About to sit down, he freezes.”

Keyword search is insufficient!

Train classifier for each action (bag of features model)

◦ Words

◦ Adjacent pairs of words

◦ Pairs of words within a window of N words (2 ≤ N ≤ 8)

Regularized perceptron

◦ Equivalent to SVM

◦ Trained on manually labeled scene descriptions

◦ Tuned using validation set
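The three text feature types listed above (words, adjacent pairs of words, and pairs of words within a window of N words) can be sketched as follows. This is an illustration only: the function name `text_features`, the tokenization, and the reading of "window of N words" as "positions differ by at most N - 1" are assumptions, and the classifier itself is not shown.

```python
import re


def text_features(sentence, n=8):
    """Return the set of bag-of-features text features for one sentence.

    Features are tuples tagged by type:
      ("word", w)           single words
      ("adjacent", w1, w2)  pairs of neighbouring words
      ("window", w1, w2)    non-adjacent pairs that fit in a window of n words
    """
    words = re.findall(r"[a-z']+", sentence.lower())
    features = set()
    for i, first in enumerate(words):
        features.add(("word", first))
        # Both words fit in a window of n words if their positions differ
        # by at most n - 1 (assumed interpretation).
        for j in range(i + 1, min(i + n, len(words))):
            kind = "adjacent" if j == i + 1 else "window"
            features.add((kind, first, words[j]))
    return features


feats = text_features("Will gets out of the Chevrolet.", n=3)
```

With pair features, "gets out" can be told apart from a sentence that merely contains "out", which is the kind of distinction plain keyword search misses.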

The Datasets

Manual and Test Sets

◦ Manually annotated scripts

◦ Manually selected visually-correct action samples

Automatic Set

◦ Automatically annotated scripts

◦ Automatically selected action samples

◦ a > 0.5

◦ Length < 1,000 frames
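The two selection criteria for the automatic set (a > 0.5 and length < 1,000 frames) amount to a simple filter. A minimal sketch, where the `Clip` record and its field names are hypothetical and only the two thresholds come from the slide:

```python
from dataclasses import dataclass


@dataclass
class Clip:
    label: str        # action label retrieved from the script text
    alignment: float  # script-to-video alignment score a
    num_frames: int   # clip length in frames


def select_automatic_samples(clips, min_alignment=0.5, max_frames=1000):
    """Keep clips with a > min_alignment and fewer than max_frames frames."""
    return [
        c for c in clips
        if c.alignment > min_alignment and c.num_frames < max_frames
    ]


clips = [
    Clip("GetOutCar", 0.9, 240),
    Clip("Kiss", 0.4, 300),          # rejected: alignment too low
    Clip("AnswerPhone", 0.8, 1500),  # rejected: too long
]
selected = select_automatic_samples(clips)
```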

KTH Dataset

Action Recognition

Sparse space-time features

◦ Compact representation

◦ Tolerant to background clutter, occlusions, and scale changes

Interest point detection – Harris operator

◦ Multiple levels of spatio-temporal scales
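The slides name only the Harris operator. As a hedged illustration, a commonly used space-time extension scores each point with H = det(μ) − k·trace³(μ), where μ is the 3×3 second-moment matrix of smoothed spatio-temporal gradients at a given spatial and temporal scale. The sketch below computes only that response from a given μ; the function name and the default value of k are assumptions, and computing μ from video and searching over scales are not shown.

```python
def harris3d_response(mu, k=0.005):
    """Space-time Harris response H = det(mu) - k * trace(mu)**3.

    mu is a 3x3 second-moment matrix (nested sequences) built from the
    x, y and t gradients around one point. k = 0.005 is an assumed default.
    """
    (a, b, c), (d, e, f), (g, h, i) = mu
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    trace = a + e + i
    return det - k * trace ** 3


# Strong variation along x, y and t: positive response (interest point).
corner_like = harris3d_response([[1, 0, 0], [0, 1, 0], [0, 0, 1]])
# Variation along x only: negative response (no interest point).
edge_like = harris3d_response([[1, 0, 0], [0, 0, 0], [0, 0, 0]])
```

Interest points are then taken at local maxima of the response, evaluated separately for each combination of spatial and temporal scale.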

Interest Point Detection

Features at the Interest points

Histogram of descriptors of space-time volumes

◦ Volumes divided into (nx, ny, nt) grid of cuboids

◦ Compute histogram of oriented gradients (HoG)

◦ Compute histogram of optic flow (HoF)
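The grid-of-cuboids descriptor can be sketched for the HoG case as follows. This is a simplified illustration, not the presented implementation: the function name, the default grid (3, 3, 2), the 4 orientation bins, the central-difference gradients, and the final L1 normalization are all assumptions. A HoF descriptor would follow the same pattern with optic-flow vectors in place of intensity gradients.

```python
import math


def hog_descriptor(volume, nx=3, ny=3, nt=2, n_bins=4):
    """Concatenated gradient-orientation histograms over an (nx, ny, nt) grid.

    volume[t][y][x] holds grey values of the space-time volume around one
    interest point. Each voxel votes into the orientation bin of its spatial
    gradient, weighted by gradient magnitude, in the cuboid it falls into.
    """
    T, H, W = len(volume), len(volume[0]), len(volume[0][0])
    hist = [0.0] * (nx * ny * nt * n_bins)
    for t in range(T):
        for y in range(H):
            for x in range(W):
                # Central differences, clamped at the volume borders.
                gx = volume[t][y][min(x + 1, W - 1)] - volume[t][y][max(x - 1, 0)]
                gy = volume[t][min(y + 1, H - 1)][x] - volume[t][max(y - 1, 0)][x]
                mag = math.hypot(gx, gy)
                if mag == 0:
                    continue
                angle = math.atan2(gy, gx) % (2 * math.pi)
                b = min(int(angle / (2 * math.pi) * n_bins), n_bins - 1)
                # Index of the cuboid that contains this voxel.
                cell = ((t * nt // T) * ny + (y * ny // H)) * nx + (x * nx // W)
                hist[cell * n_bins + b] += mag
    total = sum(hist)
    return [v / total for v in hist] if total > 0 else hist


# A horizontal intensity ramp: every gradient points along +x (bin 0).
ramp = [[[x for x in range(3)] for _ in range(3)] for _ in range(2)]
descriptor = hog_descriptor(ramp, nx=1, ny=1, nt=1)
```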

[IKIZLER ET AL. 2008]

Spatio-Temporal Bag-of-Features

k-means with 4,000 clusters

Different grid sizes

Classify with non-linear SVM
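The bag-of-features step can be sketched as: assign each descriptor to its nearest visual word, count the assignments separately in each cell of a spatio-temporal grid, and compare the resulting histograms with a non-linear kernel. In the sketch below the vocabulary is given rather than learned (k-means with 4,000 clusters is not shown), the feature positions are assumed to be normalized to [0, 1), and the χ² kernel is one common choice of non-linear kernel for histograms, used here as an assumption. All function names are hypothetical.

```python
import math


def nearest_word(descriptor, vocabulary):
    """Index of the closest visual word (squared Euclidean distance)."""
    return min(
        range(len(vocabulary)),
        key=lambda k: sum((d - c) ** 2 for d, c in zip(descriptor, vocabulary[k])),
    )


def bof_histogram(features, vocabulary, grid=(1, 1, 1)):
    """Normalized visual-word histogram, one block per grid cell.

    features: list of ((x, y, t), descriptor), positions in [0, 1).
    grid: number of cells along x, y and t.
    """
    gx, gy, gt = grid
    k = len(vocabulary)
    hist = [0.0] * (gx * gy * gt * k)
    for (x, y, t), desc in features:
        cx = min(int(x * gx), gx - 1)
        cy = min(int(y * gy), gy - 1)
        ct = min(int(t * gt), gt - 1)
        cell = (ct * gy + cy) * gx + cx
        hist[cell * k + nearest_word(desc, vocabulary)] += 1.0
    total = sum(hist)
    return [v / total for v in hist] if total > 0 else hist


def chi2_distance(h1, h2):
    """Chi-square distance between two histograms."""
    return 0.5 * sum((a - b) ** 2 / (a + b) for a, b in zip(h1, h2) if a + b > 0)


def chi2_kernel(h1, h2, scale=1.0):
    """Kernel value exp(-chi2 / scale); scale is an assumed free parameter."""
    return math.exp(-chi2_distance(h1, h2) / scale)


vocabulary = [[0.0, 0.0], [1.0, 1.0]]
features = [
    ((0.1, 0.1, 0.1), [0.1, 0.0]),
    ((0.9, 0.2, 0.5), [0.9, 1.0]),
    ((0.8, 0.8, 0.8), [1.0, 1.2]),
]
global_hist = bof_histogram(features, vocabulary)             # 1x1x1 grid
split_hist = bof_histogram(features, vocabulary, (2, 1, 1))   # 2 cells along x
```

Each grid layout yields its own histogram (a "channel"), so different grid sizes can be evaluated individually or in combination.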

Evaluation of Spatio-Temporal Grids

Comparison to the State-of-the-Art

KTH dataset divided into:

◦ Training/validation set (8+8 people)

◦ Test set (9 people)

Use best performing channel combination

Confusion Matrix

Noise in Training Data

Results for Real-World Videos

Examples

Summary

Automatic annotation using movie scripts

Action recognition outperforms the state of the art on the KTH dataset

System tolerant to errors in training data

Future Work

Improve script-to-video alignment

Improve tolerance of classifier

◦ Iterative learning

Experiment with other space-time low-level features

Questions?

References

Learning Realistic Human Actions from Movies. I. Laptev, M. Marszalek, C. Schmid, and B. Rozenfeld. CVPR 2008.

Human Action Recognition with Line and Flow Histograms. N. Ikizler, G. Cinbis, and P. Duygulu. ICPR 2008.
