46
1 Video Surveillance E6998 -007 Senior/Feris/Tian Behavior Analysis Rogerio Feris IBM TJ Watson Research Center [email protected] http://rogerioferis.com

Behavior Analysis - Rogerio Ferisrogerioferis.com/ClassMarch31/BehaviorClassMarch31.pdf · Local Self-Similarities [Shechtman and Irani, CVPR’07] The descriptor implicitly handles

  • Upload
    others

  • View
    11

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Behavior Analysis - Rogerio Ferisrogerioferis.com/ClassMarch31/BehaviorClassMarch31.pdf · Local Self-Similarities [Shechtman and Irani, CVPR’07] The descriptor implicitly handles

1 Video Surveillance E6998 -007 Senior/Feris/Tian

Behavior Analysis

Rogerio Feris

IBM TJ Watson Research Center

[email protected]

http://rogerioferis.com

Page 2: Behavior Analysis - Rogerio Ferisrogerioferis.com/ClassMarch31/BehaviorClassMarch31.pdf · Local Self-Similarities [Shechtman and Irani, CVPR’07] The descriptor implicitly handles

2 Video Surveillance E6998 -007 Senior/Feris/Tian

Outline

� Motivation

� Action Recognition

• Template-Based Approaches

• State-Space Approaches

� Detecting Suspicious Behavior

Page 3: Behavior Analysis - Rogerio Ferisrogerioferis.com/ClassMarch31/BehaviorClassMarch31.pdf · Local Self-Similarities [Shechtman and Irani, CVPR’07] The descriptor implicitly handles

3 Video Surveillance E6998 -007 Senior/Feris/Tian

Motivation

� Action Recognition in Surveillance Video

Detecting people fighting Falling person detection

Page 4: Behavior Analysis - Rogerio Ferisrogerioferis.com/ClassMarch31/BehaviorClassMarch31.pdf · Local Self-Similarities [Shechtman and Irani, CVPR’07] The descriptor implicitly handles

4 Video Surveillance E6998 -007 Senior/Feris/Tian

Motivation

� Detecting suspicious behavior

[Boiman and Irani, 2005]

Fence Climbing

Page 5: Behavior Analysis - Rogerio Ferisrogerioferis.com/ClassMarch31/BehaviorClassMarch31.pdf · Local Self-Similarities [Shechtman and Irani, CVPR’07] The descriptor implicitly handles

5 Video Surveillance E6998 -007 Senior/Feris/Tian

� Find all locations where objects enter or exit (green)� Find all ‘normal’ routes between these locations- average path and

observed deviations.

Motivation

Page 6: Behavior Analysis - Rogerio Ferisrogerioferis.com/ClassMarch31/BehaviorClassMarch31.pdf · Local Self-Similarities [Shechtman and Irani, CVPR’07] The descriptor implicitly handles

6 Video Surveillance E6998 -007 Senior/Feris/Tian

Tracks anomalies (not matching trained routes)

Motivation

Page 7: Behavior Analysis - Rogerio Ferisrogerioferis.com/ClassMarch31/BehaviorClassMarch31.pdf · Local Self-Similarities [Shechtman and Irani, CVPR’07] The descriptor implicitly handles

7 Video Surveillance E6998 -007 Senior/Feris/Tian

Motivation

� Long-term reasoning / object interaction

[Ivanov and Bobick, 2000]

Car/person interactions (e.g., car picking up a person)

Page 8: Behavior Analysis - Rogerio Ferisrogerioferis.com/ClassMarch31/BehaviorClassMarch31.pdf · Local Self-Similarities [Shechtman and Irani, CVPR’07] The descriptor implicitly handles

8 Video Surveillance E6998 -007 Senior/Feris/Tian

Challenges

� Strong appearance variation in semantically similar events (e.g., people performing actions with different clothing

� Viewpoint Variation

� Duration of the action / frame rate

� Action segmentation – determining beginning and end of the action

Page 9: Behavior Analysis - Rogerio Ferisrogerioferis.com/ClassMarch31/BehaviorClassMarch31.pdf · Local Self-Similarities [Shechtman and Irani, CVPR’07] The descriptor implicitly handles

9 Video Surveillance E6998 -007 Senior/Feris/Tian

Outline

� Motivation

� Action Recognition

• Template-Based Approaches

• State-Space Approaches

� Detecting Suspicious Behavior

Page 10: Behavior Analysis - Rogerio Ferisrogerioferis.com/ClassMarch31/BehaviorClassMarch31.pdf · Local Self-Similarities [Shechtman and Irani, CVPR’07] The descriptor implicitly handles

10 Video Surveillance E6998 -007 Senior/Feris/Tian

Action Recognition – Template-Based

� Motion History Image (MHI): Scalar-valued image where brighter pixels correspond to more recently moving pixels

Temporal Templates [Bobick and Davis, 1996]

Binary image indicating regions of motion

Page 11: Behavior Analysis - Rogerio Ferisrogerioferis.com/ClassMarch31/BehaviorClassMarch31.pdf · Local Self-Similarities [Shechtman and Irani, CVPR’07] The descriptor implicitly handles

11 Video Surveillance E6998 -007 Senior/Feris/Tian

Action Recognition – Template-Based

� Motion History Image (MHI): Scalar-valued image where brighter pixels correspond to more recently moving pixels

Temporal Templates [Bobick and Davis, 1996]

Page 12: Behavior Analysis - Rogerio Ferisrogerioferis.com/ClassMarch31/BehaviorClassMarch31.pdf · Local Self-Similarities [Shechtman and Irani, CVPR’07] The descriptor implicitly handles

12 Video Surveillance E6998 -007 Senior/Feris/Tian

Action Recognition – Template-Based

� At the current frame, statistical descriptors based on moments (translation and scale invariant) are extracted from the current MHI and matched against stored exemplars for classification

� Three actions: sitting, arm waving , and crouching. View-based approach to handle camera view changes.

� Problems with ambiguities, occlusions, poor motion segmentation

Temporal Templates [Bobick and Davis, 1996]

Page 13: Behavior Analysis - Rogerio Ferisrogerioferis.com/ClassMarch31/BehaviorClassMarch31.pdf · Local Self-Similarities [Shechtman and Irani, CVPR’07] The descriptor implicitly handles

13 Video Surveillance E6998 -007 Senior/Feris/Tian

Action Recognition – Template-Based

Recognizing Action at a Distance [Efros et al, ICCV’03]

� 3-pixel man� Blob tracking

� vast surveillance literature

� 300-pixel man� Limb tracking

� e.g. Yacoob & Black, Rao & Shah, etc.

Page 14: Behavior Analysis - Rogerio Ferisrogerioferis.com/ClassMarch31/BehaviorClassMarch31.pdf · Local Self-Similarities [Shechtman and Irani, CVPR’07] The descriptor implicitly handles

14 Video Surveillance E6998 -007 Senior/Feris/Tian

The 30-Pixel Man

Action Recognition – Template-Based

Recognizing Action at a Distance [Efros et al, ICCV’03]

Page 15: Behavior Analysis - Rogerio Ferisrogerioferis.com/ClassMarch31/BehaviorClassMarch31.pdf · Local Self-Similarities [Shechtman and Irani, CVPR’07] The descriptor implicitly handles

15 Video Surveillance E6998 -007 Senior/Feris/Tian

Action Recognition – Template-Based

Appearance versus Motion

Recognizing Action at a Distance [Efros et al, ICCV’03]

Page 16: Behavior Analysis - Rogerio Ferisrogerioferis.com/ClassMarch31/BehaviorClassMarch31.pdf · Local Self-Similarities [Shechtman and Irani, CVPR’07] The descriptor implicitly handles

16 Video Surveillance E6998 -007 Senior/Feris/Tian

� Tracking • Simple correlation-based tracker

• User-initialized

Figure-centric Representation

Page 17: Behavior Analysis - Rogerio Ferisrogerioferis.com/ClassMarch31/BehaviorClassMarch31.pdf · Local Self-Similarities [Shechtman and Irani, CVPR’07] The descriptor implicitly handles

17 Video Surveillance E6998 -007 Senior/Feris/Tian

input sequence

� “Explain” novel motion sequence by matching to previously seen video clips

• For each frame, match based on some temporal extent

Challenge: how to compare motions?

motion analysisrun

walk leftswing

walk rightjog

database

Action Recognition – Template-Based

Recognizing Action at a Distance [Efros et al, ICCV’03]

Page 18: Behavior Analysis - Rogerio Ferisrogerioferis.com/ClassMarch31/BehaviorClassMarch31.pdf · Local Self-Similarities [Shechtman and Irani, CVPR’07] The descriptor implicitly handles

18 Video Surveillance E6998 -007 Senior/Feris/Tian

Spatial Motion Descriptor

Image frame Optical flow yxF ,

yx FF , +−+−yyxx FFFF ,,, blurred +−+−

yyxx FFFF ,,,

Page 19: Behavior Analysis - Rogerio Ferisrogerioferis.com/ClassMarch31/BehaviorClassMarch31.pdf · Local Self-Similarities [Shechtman and Irani, CVPR’07] The descriptor implicitly handles

19 Video Surveillance E6998 -007 Senior/Feris/Tian

t

…Σ

Sequence A

Sequence B

Temporal extent E

Bframe-to-frame

similarity matrix

A

motion-to-motionsimilarity matrix

A

B

I matrix

E

E

blurry I

E

E

Two ‘person running’ sequences - periodic behavior

Page 20: Behavior Analysis - Rogerio Ferisrogerioferis.com/ClassMarch31/BehaviorClassMarch31.pdf · Local Self-Similarities [Shechtman and Irani, CVPR’07] The descriptor implicitly handles

20 Video Surveillance E6998 -007 Senior/Feris/Tian

Action Recognition – Template-Based

Recognizing Action at a Distance [Efros et al, ICCV’03]

� Classification is done for each frame. The spatial-temporal descriptor centered at the current frame is matched against the database ofactions (previously stored spatial-temporal descriptors).

� For each frame of the probe sequence, the maximum score in the corresponding row of the motion-to-motion similarity matrix (between probe and one sequence of the database) will indicate the best match to the spatial-temporal descriptor centered at this frame.

� K-nearest neighbors is used to determine the action.

� Good results were demonstrated in sequences related to tennis, soccer, and dancing.

Page 21: Behavior Analysis - Rogerio Ferisrogerioferis.com/ClassMarch31/BehaviorClassMarch31.pdf · Local Self-Similarities [Shechtman and Irani, CVPR’07] The descriptor implicitly handles

21 Video Surveillance E6998 -007 Senior/Feris/Tian

2D Skeleton Transfer

� The database is annotated with 2D joint positions

� After matching, data is transfered to novel sequence

Input sequence:

Transferred 2D skeletons:

Action Recognition – Template-Based

Recognizing Action at a Distance [Efros et al, ICCV’03]

Page 22: Behavior Analysis - Rogerio Ferisrogerioferis.com/ClassMarch31/BehaviorClassMarch31.pdf · Local Self-Similarities [Shechtman and Irani, CVPR’07] The descriptor implicitly handles

22 Video Surveillance E6998 -007 Senior/Feris/Tian

Actor Replacement

Show Video GregWordCup.avi

http://graphics.cs.cmu.edu/people/efros/research/action/

Action Recognition – Template-Based

Recognizing Action at a Distance [Efros et al, ICCV’03]

Page 23: Behavior Analysis - Rogerio Ferisrogerioferis.com/ClassMarch31/BehaviorClassMarch31.pdf · Local Self-Similarities [Shechtman and Irani, CVPR’07] The descriptor implicitly handles

23 Video Surveillance E6998 -007 Senior/Feris/Tian

Action Recognition – Template-Based

� Proposed for image similarity. Action detection is a particularapplication

Local Self-Similarities [Shechtman and Irani, CVPR’07]

How to measure similarity in these images?

Page 24: Behavior Analysis - Rogerio Ferisrogerioferis.com/ClassMarch31/BehaviorClassMarch31.pdf · Local Self-Similarities [Shechtman and Irani, CVPR’07] The descriptor implicitly handles

24 Video Surveillance E6998 -007 Senior/Feris/Tian

Action Recognition – Template-Based

Local Self-Similarities [Shechtman and Irani, CVPR’07]

Page 25: Behavior Analysis - Rogerio Ferisrogerioferis.com/ClassMarch31/BehaviorClassMarch31.pdf · Local Self-Similarities [Shechtman and Irani, CVPR’07] The descriptor implicitly handles

25 Video Surveillance E6998 -007 Senior/Feris/Tian

Action Recognition – Template-Based

Local Self-Similarities [Shechtman and Irani, CVPR’07]

The descriptor implicitly handles the similarity between people wearing different clothes. Also, the spatial-temporal log-polar binning allows for better matching under different action durations / frame rate.

Page 26: Behavior Analysis - Rogerio Ferisrogerioferis.com/ClassMarch31/BehaviorClassMarch31.pdf · Local Self-Similarities [Shechtman and Irani, CVPR’07] The descriptor implicitly handles

26 Video Surveillance E6998 -007 Senior/Feris/Tian

Action Recognition – Template-Based

� Complex actions performed by different people wearing different clothes with different backgrounds, are detected with no prior learning, based on a single example clip.

Local Self-Similarities [Shechtman and Irani, CVPR’07]

Page 27: Behavior Analysis - Rogerio Ferisrogerioferis.com/ClassMarch31/BehaviorClassMarch31.pdf · Local Self-Similarities [Shechtman and Irani, CVPR’07] The descriptor implicitly handles

27 Video Surveillance E6998 -007 Senior/Feris/Tian

Action Recognition – Template-BasedSpatial-Temporal Bag of Words [Niebles et al, CVPR’06]

Page 28: Behavior Analysis - Rogerio Ferisrogerioferis.com/ClassMarch31/BehaviorClassMarch31.pdf · Local Self-Similarities [Shechtman and Irani, CVPR’07] The descriptor implicitly handles

28 Video Surveillance E6998 -007 Senior/Feris/Tian

Outline

� Motivation

� Action Recognition

• Template-Based Approaches

• State-Space Approaches

� Detecting Suspicious Behavior

Page 29: Behavior Analysis - Rogerio Ferisrogerioferis.com/ClassMarch31/BehaviorClassMarch31.pdf · Local Self-Similarities [Shechtman and Irani, CVPR’07] The descriptor implicitly handles

29 Video Surveillance E6998 -007 Senior/Feris/Tian

Action Recognition – State-Space

Hidden Markov Models [Rabiner, 1989]

Page 30: Behavior Analysis - Rogerio Ferisrogerioferis.com/ClassMarch31/BehaviorClassMarch31.pdf · Local Self-Similarities [Shechtman and Irani, CVPR’07] The descriptor implicitly handles

30 Video Surveillance E6998 -007 Senior/Feris/Tian

Action Recognition – State-Space

Hidden Markov Models [Rabiner, 1989]

Page 31: Behavior Analysis - Rogerio Ferisrogerioferis.com/ClassMarch31/BehaviorClassMarch31.pdf · Local Self-Similarities [Shechtman and Irani, CVPR’07] The descriptor implicitly handles

31 Video Surveillance E6998 -007 Senior/Feris/Tian

Action Recognition – State-Space

Hidden Markov Models [Rabiner, 1989]

Three Basic Problems:

Forward-Backward Algorithm

Page 32: Behavior Analysis - Rogerio Ferisrogerioferis.com/ClassMarch31/BehaviorClassMarch31.pdf · Local Self-Similarities [Shechtman and Irani, CVPR’07] The descriptor implicitly handles

32 Video Surveillance E6998 -007 Senior/Feris/Tian

Action Recognition – State-Space

Hidden Markov Models [Rabiner, 1989]

Three Basic Problems:

Viterbi Algorithm

Page 33: Behavior Analysis - Rogerio Ferisrogerioferis.com/ClassMarch31/BehaviorClassMarch31.pdf · Local Self-Similarities [Shechtman and Irani, CVPR’07] The descriptor implicitly handles

33 Video Surveillance E6998 -007 Senior/Feris/Tian

Action Recognition – State-Space

Hidden Markov Models [Rabiner, 1989]

Three Basic Problems:

Baum-Welch Algorithm

Page 34: Behavior Analysis - Rogerio Ferisrogerioferis.com/ClassMarch31/BehaviorClassMarch31.pdf · Local Self-Similarities [Shechtman and Irani, CVPR’07] The descriptor implicitly handles

34 Video Surveillance E6998 -007 Senior/Feris/Tian

Action Recognition – State-Space

Hidden Markov Models [Rabiner, 1989]

Action Recognizer:

� Learn an HMM model for each action in the database (e.g., HMM for ‘running’, HMM for ‘fighting’, etc.) – Baum-Welch algorithm

� Given an action sequence, compare it with all HMMs in the database and select the one which best explains the probe sequence – Forward-Backward algorithm

Page 35: Behavior Analysis - Rogerio Ferisrogerioferis.com/ClassMarch31/BehaviorClassMarch31.pdf · Local Self-Similarities [Shechtman and Irani, CVPR’07] The descriptor implicitly handles

35 Video Surveillance E6998 -007 Senior/Feris/Tian

Action Recognition – State-Space

� [Yamato et al, 1992] - First application of HMMs for gesture recognition (for recognizing tennis strokes)

� From there on HMMs have been extensively applied in many gesture recognition problems (Sign Language Recognition, Head Gesture, etc.)

� Many variations have been proposed (see e.g., coupled HMMs). More recently, Conditional Random Fields (CRFs)have proven to be very successful to model human motion [Sminchisescu et al, ICCV 2005]

Page 36: Behavior Analysis - Rogerio Ferisrogerioferis.com/ClassMarch31/BehaviorClassMarch31.pdf · Local Self-Similarities [Shechtman and Irani, CVPR’07] The descriptor implicitly handles

36 Video Surveillance E6998 -007 Senior/Feris/Tian

Action Recognition – State-Space

Modeling Interactions with Stochastic Grammars [Ivanov and Bobick, 2000]

� Recognize actions with larger temporal range

� Two-Stage Approach:

• Detection of low-level discrete events (e.g., using HMMs or tracking)

• Action Recognition using Stochastic Grammars

Page 37: Behavior Analysis - Rogerio Ferisrogerioferis.com/ClassMarch31/BehaviorClassMarch31.pdf · Local Self-Similarities [Shechtman and Irani, CVPR’07] The descriptor implicitly handles

37 Video Surveillance E6998 -007 Senior/Feris/Tian

Action Recognition – State-Space

Modeling Interactions with Stochastic Grammars [Ivanov and Bobick, 2000]

Background: Earley Parsing for Context-free Grammars

� See description in wikipedia

� Three main steps: Prediction, Scanning, Completion

Page 38: Behavior Analysis - Rogerio Ferisrogerioferis.com/ClassMarch31/BehaviorClassMarch31.pdf · Local Self-Similarities [Shechtman and Irani, CVPR’07] The descriptor implicitly handles

38 Video Surveillance E6998 -007 Senior/Feris/Tian

Earley Parsing Example

Page 39: Behavior Analysis - Rogerio Ferisrogerioferis.com/ClassMarch31/BehaviorClassMarch31.pdf · Local Self-Similarities [Shechtman and Irani, CVPR’07] The descriptor implicitly handles

39 Video Surveillance E6998 -007 Senior/Feris/Tian

Action Recognition – State-Space

Modeling Interactions with Stochastic Grammars [Ivanov and Bobick, 2000]

Probabilistic Earley Parsing

� Production rules are augmented with probabilities

� Parse tree with highest probability is generated [Stolcke, Bayesian Learning of Probabilistic Language Models,1994]

Page 40: Behavior Analysis - Rogerio Ferisrogerioferis.com/ClassMarch31/BehaviorClassMarch31.pdf · Local Self-Similarities [Shechtman and Irani, CVPR’07] The descriptor implicitly handles

40 Video Surveillance E6998 -007 Senior/Feris/Tian

Action Recognition – State-Space

Modeling Interactions with Stochastic Grammars [Ivanov and Bobick, 2000]

Car/Person Interaction

Low-level discrete event detection

� Track moving blobs

� Generate events: {person,car}+{enter,found,exit,lost,stopped}

Page 41: Behavior Analysis - Rogerio Ferisrogerioferis.com/ClassMarch31/BehaviorClassMarch31.pdf · Local Self-Similarities [Shechtman and Irani, CVPR’07] The descriptor implicitly handles

41 Video Surveillance E6998 -007 Senior/Feris/Tian

Modeling Interactions with Stochastic Grammars [Ivanov and Bobick, 2000]

Page 42: Behavior Analysis - Rogerio Ferisrogerioferis.com/ClassMarch31/BehaviorClassMarch31.pdf · Local Self-Similarities [Shechtman and Irani, CVPR’07] The descriptor implicitly handles

42 Video Surveillance E6998 -007 Senior/Feris/Tian

Outline

� Motivation

� Action Recognition

• Template-Based Approaches

• State-Space Approaches

� Detecting Suspicious Behavior

Page 43: Behavior Analysis - Rogerio Ferisrogerioferis.com/ClassMarch31/BehaviorClassMarch31.pdf · Local Self-Similarities [Shechtman and Irani, CVPR’07] The descriptor implicitly handles

43 Video Surveillance E6998 -007 Senior/Feris/Tian

Suspicious Behavior

� Problem: given a few “regular” examples, compute the likelihood of a new observation

Detecting Irregularities [Boiman and Irani, ICCV 2005]

Database Query

� Construct the likelihood using chuncks of data from the examples. Large matching chunks imply large likelihood.

Page 44: Behavior Analysis - Rogerio Ferisrogerioferis.com/ClassMarch31/BehaviorClassMarch31.pdf · Local Self-Similarities [Shechtman and Irani, CVPR’07] The descriptor implicitly handles

44 Video Surveillance E6998 -007 Senior/Feris/Tian

Suspicious Behavior

� Problem: given a few “regular” examples, compute the likelihood of a new observation

Detecting Irregularities [Boiman and Irani, ICCV 2005]

Database

� Construct the likelihood using chuncks of data from the examples. Large matching chunks imply large likelihood.

Query

Page 45: Behavior Analysis - Rogerio Ferisrogerioferis.com/ClassMarch31/BehaviorClassMarch31.pdf · Local Self-Similarities [Shechtman and Irani, CVPR’07] The descriptor implicitly handles

45 Video Surveillance E6998 -007 Senior/Feris/Tian

Suspicious Behavior

Detecting Irregularities [Boiman and Irani, ICCV 2005]

Page 46: Behavior Analysis - Rogerio Ferisrogerioferis.com/ClassMarch31/BehaviorClassMarch31.pdf · Local Self-Similarities [Shechtman and Irani, CVPR’07] The descriptor implicitly handles

46 Video Surveillance E6998 -007 Senior/Feris/Tian

Suspicious Behavior

� [Zhong et al, Detecting Unusual Activity in Video, CVPR’04]

See Also:

� [Stauffer and Grimson, Learning patterns of activity using real-time tracking, 2000]

� [Lei Chen et al, Robust and fast similarity search for moving object trajectories, 2005]

Motion Trajectory Behavior: