13
The Shanghai-Hongkong Team at MediaEval2012: Violent Scene Detection Using Trajectory-based Features Yu-Gang Jiang*, Qi Dai*, Chun Chet Tan**, Xiangyang Xue*, Chong-Wah Ngo** *School of Computer Science, Fudan University, Shanghai **Department of Computer Science, City University of Hong Kong, HK MediaEval 2012 Workshop, Oct 4-5, Pisa, Italy

The Shanghai-Hongkong Team at MediaEval2012: Violent Scene Detection Using Trajectory-based Features

Embed Size (px)

DESCRIPTION

 

Citation preview

Page 1: The Shanghai-Hongkong Team at MediaEval2012: Violent Scene Detection Using Trajectory-based Features

The Shanghai-Hongkong Team at MediaEval2012: Violent

Scene Detection Using Trajectory-based Features

Yu-Gang Jiang*, Qi Dai*, Chun Chet Tan**, Xiangyang Xue*, Chong-Wah Ngo**

*School of Computer Science, Fudan University, Shanghai

**Department of Computer Science, City University of Hong Kong, HK

MediaEval 2012 Workshop, Oct 4-5, Pisa, Italy

Page 2: The Shanghai-Hongkong Team at MediaEval2012: Violent Scene Detection Using Trajectory-based Features

Outlines• Introduction

• Framework

• Feature Extraction

• Classifiers

• Temporal Smoothing

• Results

• Discussions

• First 20 clips retrieved

Page 3: The Shanghai-Hongkong Team at MediaEval2012: Violent Scene Detection Using Trajectory-based Features

Introduction• Violent Scene Detection task [1] -

practical challenge, great potential in applications.

• Focus on novel features.

• Top performance in mAP@20, runner-up in mAP@100

[1] C.-H. Demarty, C. Penet, G. Gravier, and M. Soleymani. The MediaEval 2012 Affect Task: Violent Scenes Detection. In MediaEval 2012 Workshop, Pisa, Italy, 2012.

Page 4: The Shanghai-Hongkong Team at MediaEval2012: Violent Scene Detection Using Trajectory-based Features

Framework

The circled numbers indicate the 5 submitted runs

Feature extraction

Trajectory-based (7 features)

Spatial-temporal interest point

MFCC audio feature

χ2 kernel SVM

Classifiers

SIFT

Concept-based

5

4

Video shots

3

Detection score-level temporal

smoothing

1

All features except

concept-based

χ2 kernel SVM

Temporal feature

smoothing2

Page 5: The Shanghai-Hongkong Team at MediaEval2012: Violent Scene Detection Using Trajectory-based Features

Feature Extraction• Trajectory-based features [2]:

- dense trajectory, HOG, HOF, MBH [5]

- TrajMF (relative locations and motions between trajectory pairs)

- Trajectory shape feature

• Advantages: robust to camera movement, rich information, implicitly capture object-object and object-background relationships.

[2] Y.-G. Jiang, Q. Dai, X. Xue, W. Liu, and C.-W. Ngo. Trajectory-based modeling of human actions with motion reference points. In ECCV, 2012.

[5] H. Wang, A. Klaser, C. Schmid, and C.-L. Liu. Action recognition by dense trajectories. In CVPR, 2011.

Page 6: The Shanghai-Hongkong Team at MediaEval2012: Violent Scene Detection Using Trajectory-based Features

Feature Extraction• SIFT [4]

• STIP [3]

• MFCC

• Concept-based Features (10 concepts: blood, carchase, coldarms, fights, fire, firearms, gore, explosions, gunshots, screams)

[3] I. Laptev. On space-time interest points. International Journal of Computer Vision, 64:107-123, 2005.

[4] D. Lowe. Distinctive image features from scale-invariant keypoints. International Journal on Computer Vision, 60:91-110, 2004.

Page 7: The Shanghai-Hongkong Team at MediaEval2012: Violent Scene Detection Using Trajectory-based Features

Classifiers• BoW representation

• Chi-squared kernel SVMs

• Kernel level early fusion is used to combine multiple features

Page 8: The Shanghai-Hongkong Team at MediaEval2012: Violent Scene Detection Using Trajectory-based Features

Temporal Smoothing• Feature Smoothing – averaged

features over a three-shot window.

• Score Smoothing – averaged prediction scores over a three-shot window.

Page 9: The Shanghai-Hongkong Team at MediaEval2012: Violent Scene Detection Using Trajectory-based Features

r3 r2 r5 r4 r10

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

Mea

n Av

erag

e Pr

ecis

ion

at 2

0

Results (mAP@20)

• Run 5: 7 dense trajectory features

• Run 4: Run 5 + SIFT + STIP + MFCC

• Run 3: Run 4 + concept scores

• Run 2: Run 4 + feature smoothing

• Run 1: Run 4 + score smoothing

Page 10: The Shanghai-Hongkong Team at MediaEval2012: Violent Scene Detection Using Trajectory-based Features

Results (mAP@100)

• Run 5: 7 dense trajectory features

• Run 4: Run 5 + SIFT + STIP + MFCC

• Run 3: Run 4 + concept scores

• Run 2: Run 4 + feature smoothing

• Run 1: Run 4 + score smoothing

r3 r4 r5 r2 r10

0.1

0.2

0.3

0.4

0.5

0.6

0.7

Mea

n Av

erag

e Pr

ecis

ion

at 1

00

Page 11: The Shanghai-Hongkong Team at MediaEval2012: Violent Scene Detection Using Trajectory-based Features

Discussions• SIFT + STIP + MFCC show insignificant

improvement. TrajMF has encoded the rich information of SIFT and STIP.

• Concept-based scores do not improve the performances - overfitting SVMs due to insufficient training data. In fact, using mid-level concept detectors is a promising direction.

• Score smoothing boosts the performances. Feature smoothing that “blurs” the features across shots might not be a good option.

Page 12: The Shanghai-Hongkong Team at MediaEval2012: Violent Scene Detection Using Trajectory-based Features

First 20 clips retrieved

Page 13: The Shanghai-Hongkong Team at MediaEval2012: Violent Scene Detection Using Trajectory-based Features

Thank You