Bridge Semantic Gap: A Large Scale Concept Ontology for Multimedia (LSCOM)

Guo-Jun Qi

Beckman Institute

University of Illinois at Urbana-Champaign

LSCOM (Large Scale Concept Ontology for Multimedia)

A broadcast news video dataset

200+ news videos/ 170 hours

61,901 shots

Language

◦ English/Arabic/Chinese

Why a broadcast news ontology?

Critical mass of users, content providers, and applications

Good content availability (TRECVID LDC FBIS)

Shares a large set of core concepts with other domains

LSCOM Provides

Richly annotated video content for accomplishing the required access and analysis functions over a massive amount of video content

A large-scale, useful, well-defined semantic lexicon

◦ More than 3000 concepts

◦ 374 annotated concepts

◦ Bridging semantic gap from low-level features to high-level concepts

An LSCOM concept

000 - Parade

Concept ID: 000

Name: Parade

Definition: Multiple units of marchers, devices, bands, banners or music.

Labeled: Yes

LSCOM Hierarchy

http://www.lscom.org/ontology/index.html

Thing

.Individual

..Dangerous_Thing

...Dangerous_Situation

....Emergency_Incident

.....Disaster_Event

......Natural_Disaster

....Natural_Hazard

.....Avalanche

.....Earthquake

.....Mudslide

.....Natural_Disaster

.....Tornado

...Dangerous_Tangible_Thing

....Cutting_Device
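
The dot-indented listing above can be read as a tree in which the number of leading dots gives the depth of a concept. The following is a minimal Python sketch under that assumption; the function name `parse_hierarchy` and the tuple layout are illustrative choices, not part of LSCOM, and only a subset of the listed concepts is used as input.

```python
def parse_hierarchy(lines):
    """Parse dot-indented concept names into (name, depth, parent) tuples."""
    nodes = []
    stack = []  # stack[d] holds the most recent concept seen at depth d
    for raw in lines:
        text = raw.strip()
        if not text:
            continue
        name = text.lstrip(".")
        depth = len(text) - len(name)  # number of leading dots
        parent = stack[depth - 1] if depth > 0 else None
        del stack[depth:]              # drop deeper levels from earlier branches
        stack.append(name)
        nodes.append((name, depth, parent))
    return nodes


listing = [
    "Thing",
    ".Individual",
    "..Dangerous_Thing",
    "...Dangerous_Situation",
    "....Emergency_Incident",
    ".....Disaster_Event",
    "......Natural_Disaster",
    "....Natural_Hazard",
    ".....Earthquake",
    "...Dangerous_Tangible_Thing",
    "....Cutting_Device",
]
nodes = parse_hierarchy(listing)

# A concept may appear under more than one parent, so keep a list of parents.
parents = {}
for name, depth, parent in nodes:
    parents.setdefault(name, []).append(parent)
```

Keeping a list of parents per concept allows for a concept that appears in more than one branch, as Natural_Disaster does in the listing.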

Definition: What is an ontology?

(Wikipedia) An ontology is a formal representation of the knowledge by a set of concepts within a domain and the relationships between those concepts. It is used to reason about the properties of that domain, and may be used to describe the domain.

Ontology

Represents the visual knowledge base in a structured way

◦ Graph structure

◦ Tree (hierarchy) structure

Images/videos can be effectively learned and retrieved by exploiting the coherence between concepts

◦ Logical coherence

◦ Statistical coherence

An Ontology Hierarchy: Military Vehicle

An example from Wikipedia

Ontology Tree for LSCOM

A Light Scale Concept Ontology for Multimedia Understanding (LSCOM-Lite)

The aim is to partition the semantic space using a few concepts (39 concepts).

Selection Criteria

◦ Semantic Coverage

The light concept set should cover as many of the semantic concepts in news videos as possible.

◦ Compactness

The concepts should not overlap semantically.

◦ Modelability

The concepts should be ones that can be modeled with a smaller semantic gap.

Selected concept dimensions

Divide the semantic space into a multi-dimensional space in which the dimensions are nearly orthogonal

◦ Program Category

◦ Setting/Scene/Site

◦ People

◦ Objects

◦ Activities

◦ Events

◦ Graphics

Histogram of LSCOM-Lite Concepts

Some example keyframes

Applications

Application I: Conceptual Fusion (most basic, early fusion)

Application II: Cross-Category Classification (inter-class relation)

Application III: Event Dynamics in Concept Space

Application I: Conceptual Fusion

[Figure: diagram with the elements Video, Visual Features, Classifier, and Concept 1, Concept 2, Concept 3, ..., Concept n]

LSCOM 374 Models

374 LIBSVM models

◦ http://www.ee.columbia.edu/ln/dvmm/columbia374/

◦ Features used (MPEG-7 descriptors)

Color Moments

Edge Histogram

Wavelet Texture

◦ LIBSVM: a library for support vector machines, at http://www.csie.ntu.edu.tw/~cjlin/libsvm/

Application II: Cross-Category Classification with Concept Transfer

G.-J. Qi et al., Towards Cross-Category Knowledge Propagation for Learning Visual Concepts, in CVPR 2011

Instance-Level Concept Correlation

[Figure: example images with +1 / -1 labels for the concepts Mountain and Castle, including an image labeled "Mountain and castle"]

Transfer Function

[Figure labels: Mountain, Castle; Mountain; Castle; None of them]

Model Concept Relations

Automatically constructing an ontology in a data-driven manner

Application III: Event Dynamics in Concept Space

Event Detection with Concept Dynamics

W. Jiang et al., Semantic event detection based on visual concept prediction, ICME, Germany, 2008.

Open Problems

Cross-Dataset Gap

◦ Generalize from the LSCOM dataset to other datasets (e.g., non-news video datasets)

Cross-Domain Gap

◦ Can the text scripts associated with news videos help information extraction for visual concepts?

Automatic ontology construction

◦ Task-dependent vs. task-independent

◦ Data-driven vs. prior knowledge (e.g., WordNet)

◦ Incorporate prior human knowledge (logical relations, etc.)

TRECVID Competition

Task I: High-Level Feature Extraction

◦ Input: subshot

◦ Output: detection results for the 39 LSCOM-Lite concepts in the subshot

High-Level Feature Extraction

Each concept is assumed to be binary (absent or present) in each subshot

Submission: Find the subshots that contain a certain concept, rank them by detection confidence score, and submit the top 2000.

Evaluation: NIST evaluated 20 medium-frequency concepts out of the 39, using a 50% random sample of all the submission pools
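
The submission rule above (rank by confidence, keep the top 2000) could be sketched as follows. This is a minimal illustration that assumes the detector output is a dictionary from subshot id to confidence score; the function name `make_submission` and the example ids are hypothetical, not part of any TRECVID tooling.

```python
def make_submission(scores, max_results=2000):
    """Return subshot ids sorted by descending confidence, truncated to max_results."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [subshot_id for subshot_id, _ in ranked[:max_results]]


# Hypothetical confidence scores for one concept.
scores = {"shot1_1": 0.10, "shot1_2": 0.92, "shot2_1": 0.55}
submission = make_submission(scores)
```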

20 Evaluated Concepts

Evaluation Metric: Average Precision

Relevant subshots should be ranked higher than the irrelevant ones.

R is the total number of relevant images, R_j is the number of relevant images among the top j images, I_j indicates whether the j-th image is relevant (I_j = 1) or not (I_j = 0), and N is the number of ranked images.

$$\text{Average Precision} = \frac{1}{R} \sum_{j=1}^{N} \frac{R_j}{j} \, I_j$$
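
As a hedged illustration of the metric, the sketch below computes average precision from a ranked list of 0/1 relevance flags: for every relevant item at rank j it adds R_j / j, then divides by R. The function name and the optional `total_relevant` argument (for the case where some relevant items never appear in the ranked list) are assumptions made for this example.

```python
def average_precision(relevance, total_relevant=None):
    """Average precision for a ranked list.

    relevance: 0/1 flags I_j in ranked order (j = 1..N).
    total_relevant: R; defaults to the number of relevant items in the list.
    """
    R = sum(relevance) if total_relevant is None else total_relevant
    if R == 0:
        return 0.0
    hits = 0      # R_j: relevant items among the top j
    total = 0.0
    for j, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            hits += 1
            total += hits / j   # precision at rank j, counted only when I_j = 1
    return total / R


ap_mixed = average_precision([1, 0, 1, 0])   # (1/1 + 2/3) / 2
ap_perfect = average_precision([1, 1, 0])    # all relevant items ranked first
```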

Results

TRECVID Competition

Task II: Video Search

◦ Input: 24 text-based topics

◦ Output: relevant subshots in the database

Topics to search

Topics to search (cont’d)

Topics to search

Three Types of Search Systems

Results: Automatic Runs

Results: Manual Runs

Results: Interactive Runs

Machine Problem 7: Shot Boundary Detection in Videos

Goals

Detect the abrupt content changes between consecutive frames.

◦ Scene changes

◦ Scene cuts

Steps

Step 1: Measure the change of content between video frames

◦ Visual/Acoustic measurements

Step 2: Compare the content distance between successive frames. If the distance is larger than a certain threshold, then a shot boundary may exist.
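
The two steps above can be sketched in Python as follows. This is a minimal illustration, assuming each frame has already been reduced to a feature vector; the Euclidean distance, the function names, and the threshold value are choices made for the example rather than requirements of the assignment.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def detect_boundaries(features, threshold, distance=euclidean):
    """Return indices i where a boundary is flagged between frame i-1 and frame i."""
    boundaries = []
    for i in range(1, len(features)):
        if distance(features[i - 1], features[i]) > threshold:
            boundaries.append(i)
    return boundaries


# Toy 2-D features: frames 0-1 look alike, frames 2-3 look alike.
frames = [[0.0, 1.0], [0.1, 0.9], [0.9, 0.1], [1.0, 0.0]]
cuts = detect_boundaries(frames, threshold=0.5)
```

Passing the distance as a parameter makes it easy to swap in another measure (for example an L1 or histogram-intersection distance) when comparing visual and acoustic features.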

Measuring Content Based on Visual Information

256-dimensional Color Histogram

◦ In RGB space, normalize r, g, b to [0,1]

◦ Color space: normalized coordinates nr and ng

◦ 8×8 histogram over (nr, ng)

Color Histograms

Divide each image into four parts. Each part has an 8×8 histogram, giving 4 × 64 = 256-dimensional features in total.
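
A possible way to build this 256-dimensional feature is sketched below in plain Python. Several details are assumptions, since the slides do not spell them out: the normalized coordinates are taken to be nr = r/(r+g+b) and ng = g/(r+g+b), the four parts are taken to be the four quadrants of the image, and each 8×8 histogram is normalized by its pixel count. The function names are illustrative.

```python
def rg_histogram(pixels, bins=8):
    """bins x bins histogram over (nr, ng) for an iterable of (r, g, b) pixels."""
    hist = [0.0] * (bins * bins)
    count = 0
    for r, g, b in pixels:
        total = r + g + b
        if total == 0:
            nr = ng = 1.0 / 3.0          # assumption: treat black as neutral
        else:
            nr, ng = r / total, g / total
        i = min(int(nr * bins), bins - 1)  # clamp nr == 1.0 into the last bin
        j = min(int(ng * bins), bins - 1)
        hist[i * bins + j] += 1.0
        count += 1
    return [h / count for h in hist] if count else hist


def frame_feature(image, bins=8):
    """image: list of rows of (r, g, b) tuples. Returns 4 * bins * bins values."""
    height, width = len(image), len(image[0])
    mid_y, mid_x = height // 2, width // 2
    feature = []
    # Quadrant order: top-left, top-right, bottom-left, bottom-right.
    for rows in (range(0, mid_y), range(mid_y, height)):
        for cols in (range(0, mid_x), range(mid_x, width)):
            block = [image[y][x] for y in rows for x in cols]
            feature.extend(rg_histogram(block, bins))
    return feature


# Tiny 2x2 test image: one pixel per quadrant.
image = [[(255, 0, 0), (0, 255, 0)],
         [(0, 0, 255), (10, 10, 10)]]
feature = frame_feature(image)
```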

Acoustic Features

12 cepstral coefficients

Energy (sum of squares of the raw signal samples)

Zero crossing rates (ZCR)

ZCR = sum(|sign(S(2:N))-sign(S(1:N-1))|)

Hint: normalize the energy so that it does not dominate the distance computed between successive frames
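
The energy and ZCR definitions above translate directly into Python. The sketch below follows the slide's ZCR formula term by term, using 0-based indexing instead of the 1-based `S(1:N)` notation; the function names and the sample values are illustrative, and the cepstral coefficients are left out.

```python
def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def energy(samples):
    """Sum of squares of the raw signal samples."""
    return sum(s * s for s in samples)


def zero_crossing_rate(samples):
    """sum(|sign(S(2:N)) - sign(S(1:N-1))|), as written on the slide."""
    return sum(abs(sign(b) - sign(a)) for a, b in zip(samples, samples[1:]))


frame = [0.5, -0.5, 0.25, 0.25, -1.0]
e = energy(frame)
zcr = zero_crossing_rate(frame)
```

With this definition each sign change between nonzero samples contributes 2 to the sum, so the value is twice the number of such crossings.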

Datasets

Two videos, each a little over one minute long

Manually label the shot boundaries

What to submit

Source code

Report

◦ Compare the shot boundary detection results returned by your algorithm with the manually labeled boundaries

◦ Compare

◦ Explain your choice of threshold

◦ Explain the differences between the acoustic-based and visual-based detection results
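
One possible way to compare detected boundaries with the manual labels is to count matches within a small frame tolerance and report precision and recall. The slides do not prescribe a comparison method, so the sketch below is only an assumption; the function name, the `tolerance` parameter, and the frame numbers are hypothetical.

```python
def match_boundaries(detected, labeled, tolerance=0):
    """Return (precision, recall) for detected vs. manually labeled boundaries.

    A detected boundary counts as a hit if it lies within `tolerance` frames
    of a labeled boundary that has not already been matched.
    """
    unused = sorted(labeled)
    hits = 0
    for d in sorted(detected):
        for label in unused:
            if abs(d - label) <= tolerance:
                unused.remove(label)  # each labeled boundary matches at most once
                hits += 1
                break
    precision = hits / len(detected) if detected else 0.0
    recall = hits / len(labeled) if labeled else 0.0
    return precision, recall


# Hypothetical frame indices.
precision, recall = match_boundaries([30, 95, 200], [31, 150, 200], tolerance=2)
```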

Where and when to submit

Email to [email protected]

Due: May 2nd

Thanks! Q&A