
New Challenges in Semantic Concept Detection - NIST · 2007. 11. 28.


  • New Challenges in Semantic Concept Detection

    M.-F. Weng, C.-K. Chen, Y.-H. Yang, R.-E. Fan, Y.-T. Hsieh, Y.-Y. Chuang, W. H. Hsu, and C.-J. Lin

    National Taiwan University

    Preliminaries for Semantic Concept Detection

    • What are the preliminaries for building a semantic concept detection system?
      – A lexicon of well-defined concepts
      – Training resources
        • Video data
        • Annotations
        • Features
      – Tools
        • Tagging or labeling tools (e.g., the CMU and IBM tools)
        • Feature extractors
        • Machine learning tools (e.g., LIBSVM)
        • Tools tailored for semantic concept detection

    Semantic Concepts

    • MediaMill 101
    • Columbia374
    • LSCOM 449
    • TRECVID-2006 LSCOM-Lite (39)
    • TRECVID-2005 10 Concepts

    Video Data Sets

    [Bar chart: video length in hours of the development and test sets for the TV03-TV07 datasets]
    • ~190 hours: news video (TV03/TV04)
    • ~330 hours: multi-language broadcast news video (TV05/TV06)
    • 100 hours: Sound and Vision video (TV07)

    Annotations

    [Chart: annotated development and test portions of the TV05-TV07 datasets, covering NIST truth judgments and common annotations]

    Features, Detectors, Scores

    • MediaMill Baseline: visual and text features; 5 sets of 101 classifiers; scores on the TV05/06 datasets
    • Columbia374: EDH, GBR, and GCM features; Columbia 374 detectors; scores on the TV06/07 datasets
    • VIREO-374: color moment, wavelet texture, and keypoint features; VIREO-374 detectors; scores on the TV07 dataset

    Available Resources

    • Concept definitions are sufficient
    • Training resources are plentiful
    • No feature extractors or tailored tools are available

    [Diagram: well-defined concepts; training resources (video data, annotations, features); tools (tagging tools, feature extractors, machine learning tools, tailored tools)]

    The New Challenges

    • Challenge 1: Easy and Efficient Tools
      – L datasets, M concepts, and N features imply L*M*N classifiers
      – Each classifier has many parameters to consider
      – Time is very limited for validating each parameter and training all classifiers
    • Challenge 2: Resource Exploitation or Reuse
      – Resources are precious
      – Existing resources are potentially useful for a new dataset
      – Plentiful resources have not been fully utilized

    Facing the New Challenges

    • Challenge 1:
      – Extended LIBSVM to improve training efficiency
      – Developed an efficient and easy-to-use toolkit tailored for semantic concept detection
    • Challenge 2:
      – Reused classifiers of past data to improve accuracy by late aggregation
      – Exploited contextual relationships and temporal dependencies from annotations to boost accuracy

    Facing the New Challenges

    • Challenge 1: Easy and Efficient Tools
      – Extended LIBSVM to improve training efficiency
      – Developed an efficient and easy-to-use toolkit tailored for semantic concept detection
    • Challenge 2
      – Reused classifiers of past data to improve accuracy by late aggregation
      – Exploited contextual relationships and temporal dependencies from annotations to boost accuracy

    A Tailored Toolkit

    • We extended LIBSVM in three aspects for semantic concept detection:
      – Using dense representations
      – Exploiting parallelism across independent concepts, features, and SVM model parameters
      – Narrowing the parameter search down to a safe range
    • Overall, the training time of our baseline was reduced from about 14 days to about 3 days
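    The combination of a narrowed parameter grid and cross-concept parallelism can be sketched as follows. This is a minimal illustration using scikit-learn's `SVC` in place of the authors' extended LIBSVM; the `(C, gamma)` grid values and the toy data are hypothetical stand-ins, not the toolkit's actual settings.

    ```python
    from concurrent.futures import ThreadPoolExecutor  # parallelism across concepts
    from itertools import product
    import numpy as np
    from sklearn.model_selection import cross_val_score
    from sklearn.svm import SVC

    # Hypothetical narrowed (C, gamma) grid -- a "safe range" stand-in.
    GRID = list(product([1.0, 8.0, 64.0], [0.5, 0.125]))

    def train_one_concept(X, y):
        """Pick the best (C, gamma) for one concept's binary SVM, then refit."""
        scores = [cross_val_score(SVC(C=C, gamma=g), X, y, cv=3).mean()
                  for C, g in GRID]
        C, g = GRID[int(np.argmax(scores))]
        return SVC(C=C, gamma=g, probability=True).fit(X, y)

    # X is one dense feature matrix shared by all concepts; Y[:, m] holds
    # concept m's binary labels (toy data here).
    rng = np.random.default_rng(0)
    X = rng.normal(size=(60, 5))
    Y = (rng.random((60, 3)) < 0.5).astype(int)

    # Independent concepts train in parallel.
    with ThreadPoolExecutor() as pool:
        models = list(pool.map(lambda m: train_one_concept(X, Y[:, m]),
                               range(Y.shape[1])))
    ```

    The same pattern extends to parallelism over features and grid points; the key saving is that no concept's grid search waits on another's.
    
    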

    Facing the New Challenges

    • Challenge 1
      – Extended LIBSVM to improve training efficiency
      – Developed an efficient and easy-to-use toolkit tailored for semantic concept detection
    • Challenge 2: Resource Exploitation or Reuse
      – Reused classifiers of past data to improve accuracy by late aggregation
      – Exploited contextual relationships and temporal dependencies from annotations to boost accuracy

    Reuse Past Data

    • Early aggregation
      – Must re-train classifiers
      – Causes considerable training time
    • Late aggregation
      – Simple and direct
      – May be biased

    Late Aggregation

    • We adopt late aggregation to reuse existing classifiers via two strategies:
      – Equally averaged aggregation: simply average the scores of past and newly trained classifiers
      – Concept-dependent weighted aggregation: use concept-dependent weights to aggregate the classifiers
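    Both strategies amount to a weighted average of per-shot detection scores. A minimal sketch follows; the classifier scores and the concept-dependent weights are invented for illustration.

    ```python
    import numpy as np

    def aggregate(scores, weights=None):
        """Late-aggregate per-shot scores from several classifiers.

        scores:  (n_classifiers, n_shots) array of detection scores.
        weights: one weight per classifier, or None for the equal average.
        """
        scores = np.asarray(scores, dtype=float)
        if weights is None:                       # equally averaged aggregation
            weights = np.full(scores.shape[0], 1.0 / scores.shape[0])
        weights = np.asarray(weights, dtype=float)
        weights = weights / weights.sum()         # normalize to sum to 1
        return weights @ scores

    old = [0.9, 0.2, 0.6]   # past classifier (e.g., trained on an earlier dataset)
    new = [0.7, 0.4, 0.5]   # newly trained classifier

    equal = aggregate([old, new])                        # equal average
    weighted = aggregate([old, new], weights=[0.3, 0.7]) # concept-dependent weights
    ```

    In the weighted strategy, a separate weight vector would be chosen per concept, e.g., by validating each candidate weighting on held-out annotations.
    
    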

    Aggregation Benefits

    [Bar chart: per-concept infAP on TV07 comparing the TV07 classifiers alone, average aggregation, and weighted aggregation, for concepts including Flag-US, Police_Security, Explosion_Fire, Airplane, Charts, Military, Desert, Truck, Weather, Mountain, People-Marching, Sports, Boat_Ship, Maps, Meeting, Computer_TV-screen, Car, Animal, Office, Waterscape_Waterfront, and the overall score]

    Overall improvement ratio:
    • Average aggregation: 22%
    • Weighted aggregation: 30%

    Facing the New Challenges

    • Challenge 1
      – Extended LIBSVM to improve training efficiency
      – Developed an efficient and easy-to-use toolkit tailored for semantic concept detection
    • Challenge 2: Resource Exploitation or Reuse
      – Reused classifiers of past data to improve accuracy by late aggregation
      – Exploited contextual relationships and temporal dependencies from annotations to boost accuracy

    Observation in Annotations

    A sequence of video shots annotated against a lexicon of concepts:

      car      1 1 1 1 1 1 1 1
      outdoor  1 1 1 1 1 1 1 1
      building 1 0 0 1 1 1 1 0
      sky      0 1 1 1 0 1 1 0
      people   0 0 0 1 0 1 1 0
      urban    1 1 0 1 1 1 1 1

    The columns reveal a contextual relationship between concepts; the rows reveal a temporal dependency across shots.
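    One simple way to quantify the contextual relationship in such an annotation matrix is pairwise concept co-occurrence. This is an illustrative measure, not necessarily the one used in the presented system; the matrix below is the one from the slide.

    ```python
    import numpy as np

    # Annotation matrix from the slide: rows = concepts, columns = 8 shots.
    A = np.array([
        [1, 1, 1, 1, 1, 1, 1, 1],   # car
        [1, 1, 1, 1, 1, 1, 1, 1],   # outdoor
        [1, 0, 0, 1, 1, 1, 1, 0],   # building
        [0, 1, 1, 1, 0, 1, 1, 0],   # sky
        [0, 0, 0, 1, 0, 1, 1, 0],   # people
        [1, 1, 0, 1, 1, 1, 1, 1],   # urban
    ])
    concepts = ["car", "outdoor", "building", "sky", "people", "urban"]

    def cooccurrence(A):
        """Fraction of shots in which concept i and concept j are both present."""
        return (A @ A.T) / A.shape[1]

    C = cooccurrence(A)
    ```

    High entries such as C[car, outdoor] suggest one concept's detector scores can help rerank another's, which is the idea the reranking step exploits.
    
    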

    Post-processing Framework

    [Flow diagram]
    • Mining phase: temporal dependency mining over the annotations drives temporal filter design
    • Detecting phase: video sequence → video segmentation → feature extraction → concept detection → shot ranking
    • Processing phase: temporal filtering and concept reranking (by unsupervised contextual fusion), followed by combination

    Temporal Filtering

    [Framework diagram repeated, with the temporal filtering step highlighted]

    Temporal Dependency

    • Different concepts have different levels of dependency at different temporal distances
      – E.g., sports, weather, maps, explosion

    [Plot: chi-square statistic χ²_k versus temporal distance k = 1..20 for sports, weather, maps, and explosion]
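    The dependency between a concept's label at shot t and at shot t+k can be measured with a chi-square statistic on the 2x2 contingency table of label pairs, as the plot suggests. A sketch, assuming binary labels in which both values occur on each side of the pairing:

    ```python
    import numpy as np

    def chi2_at_distance(labels, k):
        """Chi-square statistic of the 2x2 contingency table between a concept's
        binary label at shot t and at shot t+k."""
        a = np.asarray(labels[:-k])   # label at shot t
        b = np.asarray(labels[k:])    # label at shot t+k
        table = np.array([[np.sum((a == i) & (b == j)) for j in (0, 1)]
                          for i in (0, 1)], dtype=float)
        # Expected counts under independence of the two positions.
        expected = np.outer(table.sum(axis=1), table.sum(axis=0)) / table.sum()
        return float(((table - expected) ** 2 / expected).sum())

    # Toy label sequence with an 8-shot on/off pattern.
    labels = [1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0]
    x1 = chi2_at_distance(labels, 1)
    x4 = chi2_at_distance(labels, 4)
    ```

    Plotting the statistic over k = 1..20, as the slide does, shows how far each concept's dependency reaches, which informs the filter length chosen per concept.
    
    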

    Temporal Filter

    An SVM classifier produces a per-shot prediction P(l_t = 1 | x_t). The temporal filter re-estimates shot t from its neighbors x_{t-d}, ..., x_t, ..., x_{t+d}, applying the weights w_0, w_1, ..., w_d symmetrically around t:

      P̂(l_t = 1) = Σ_{k=-d}^{d} w_{|k|} · P(l_{t+k} = 1 | x_{t+k})
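    Such a filter, i.e., a symmetric weighted combination of neighboring shots' scores, is a normalized convolution over the score sequence. A minimal sketch; the weights are hypothetical and the sequence edges are zero-padded:

    ```python
    import numpy as np

    def temporal_filter(p, weights):
        """Smooth per-shot concept probabilities p[t] with weights w_0..w_d
        applied symmetrically to shots t-d..t+d (edges zero-padded)."""
        p = np.asarray(p, dtype=float)
        d = len(weights) - 1
        # Build the symmetric kernel w_d .. w_1 w_0 w_1 .. w_d and normalize it.
        kernel = np.array(list(weights[::-1]) + list(weights[1:]), dtype=float)
        kernel = kernel / kernel.sum()
        padded = np.pad(p, d)
        return np.convolve(padded, kernel, mode="valid")

    # An isolated spike is spread over its neighbors (hypothetical weights).
    smoothed = temporal_filter([0, 0, 1, 0, 0], [0.5, 0.25])
    ```

    In practice the weights (and the window length d) would be set per concept from the mined temporal dependency, e.g., larger d for long-running concepts such as sports.
    
    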

    Filtering Prediction

    • A sequence of shots for predicting sports
      – Classifier prediction results
      – After temporal filtering

    [Two plots of prediction scores over 20 shots, before and after filtering: a misclassified shot is picked up and its rank rises]

    Concept Reranking

    [Framework diagram repeated, with the concept reranking step highlighted]

    Target concept: 'boat' (search or detection). An initial ranking list is produced by a baseline method.

    Step 1. Randomly split the initial list into a training set and a test set.
    Step 2. From shot pairs in the training set, learn to maintain the ranking order, discovering (1) the related concepts and (2) the importance of each concept (for 'boat': ocean, waterscape).
    Step 3. Apply context fusion to the test data using the related concepts.
    Step 4. Merge the training portion with the reranked test portion.
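    Steps 2 and 3 can be approximated by learning weights over related-concept scores with a pairwise ranking (hinge) objective and then fusing. This is a simplified stand-in for the unsupervised contextual fusion, not the authors' actual algorithm, and the toy scores below are invented.

    ```python
    import numpy as np

    def learn_context_weights(init_scores, context, n_iter=100, lr=0.01):
        """Step 2 sketch: learn weights on related-concept scores so the fused
        scores preserve the ordering of the initial (training) list.
        Pairwise hinge updates over shot pairs (i ranked above j)."""
        init_scores = np.asarray(init_scores, dtype=float)
        context = np.asarray(context, dtype=float)   # (n_shots, n_related)
        n, m = context.shape
        w = np.zeros(m)
        order = np.argsort(-init_scores)             # shots from best to worst
        pairs = [(order[i], order[j]) for i in range(n) for j in range(i + 1, n)]
        for _ in range(n_iter):
            for i, j in pairs:
                margin = ((init_scores[i] + context[i] @ w)
                          - (init_scores[j] + context[j] @ w))
                if margin < 1.0:                     # violated pair: push apart
                    w += lr * (context[i] - context[j])
        return w

    def context_fusion(scores, context, w):
        """Step 3 sketch: fuse target scores with weighted related-concept scores."""
        return np.asarray(scores, float) + np.asarray(context, float) @ w

    # Toy data: column 0 is a helpful related concept (e.g., ocean),
    # column 1 an unhelpful one.
    init = [0.9, 0.6, 0.4, 0.1]
    ctx = [[1.0, 0.0], [0.8, 0.3], [0.2, 0.9], [0.0, 1.0]]
    w = learn_context_weights(init, ctx)
    fused = context_fusion(init, ctx, w)
    ```

    The learned weight magnitudes play the role of "importance of each concept": related concepts that consistently agree with the initial ranking receive larger weights.
    
    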

    Combination

    [Framework diagram repeated, with the combination step highlighted]

    Post-processing Benefits

    [Bar chart: per-concept infAP on TV07 for classification alone, concept reranking, temporal filtering, and their parallel combination, for concepts including Flag-US, Police_Security, Explosion_Fire, Military, Airplane, Weather, Charts, Truck, Mountain, Meeting, People-Marching, Boat_Ship, Sports, Desert, Computer_TV-screen, Animal, Car, Maps, Office, Waterscape_Waterfront, and the overall score; selected concepts gain +45%, +29%, +37%, and +17%]

    Overall improvement ratio:
    • Parallel combination: +10%

    Conclusion

    • We reduce the training time of detectors by using a toolkit tailored for semantic concept detection
    • The proposed aggregation methods reuse the classifiers of past data and can boost detection accuracy
    • Our post-processing approaches exploit existing resources and can further improve detection results

    Thank You for Your Attention