Video Table-of-Contents: Construction and Matching Master of Philosophy 3rd Term Presentation - Presented by Ng Chung Wing


Video Table-of-Contents: Construction and Matching

Master of Philosophy 3rd Term Presentation

- Presented by Ng Chung Wing

Outline
  • Overview of Research
  • Previous Work
  • ADVISE: Advanced Digital Video Information Segmentation Engine
  • Future Work and Conclusion

Overview of Research
  • Situation
      • A large volume of video content on the Internet
  • Problems
      • Not enough information to describe the video contents
      • Difficult to search for videos with similar contents

Overview of Research
  • Web-based Video Retrieval System
      • Provides a Video Table-of-Contents to describe the structure of a video
      • Applies tree matching algorithms to measure the similarity between videos
      • Allows retrieval of similar videos according to the tree matching results

Overview of Research (Cont’d)
  • In the last semester
      • Definition of Video Tree Structure
      • Video Tree Matching Algorithms
  • In this semester
      • ADVISE
          • Generation of Video Tree Structure
          • Web-based presentation of the structure as a Video Table-of-Contents
  • In the coming semester
      • Video Retrieval

Review on Previous Work: Video Tree Structure
  • Decompose a video into 5 levels:
      • Video Frames
      • Video Shots
      • Video Groups
      • Video Scenes
      • Whole Video
  • Hierarchical representation of a video

Review on Previous Work: Video Tree Structure (Cont’d)

Example:

[Figure: example video tree. The root Video node branches into Scene 1 and Scene 2, which contain Groups 1 to 3, with the shots at the leaves.]

Review on Previous Work: Video Tree Structure (Cont’d)
  • 4-level tree structure
  • Regarded as: Video Table-of-Contents (V-ToC)

Review on Previous Work: Video Tree Matching Algorithms
  • Measure the similarity of videos
      • Matching on their video tree structures
  • Two approaches:
      • Ordered Tree Matching Algorithm
          • Constrained by temporal ordering
      • Non-ordered Tree Matching Algorithm
          • Not constrained by temporal ordering
  • Video feature used
      • Color histograms of video frames

ADVISE
Advanced Digital Video Information Segmentation Engine

3 modules:
  • Generates V-ToC to describe videos
  • Presents V-ToC on the Internet using XML
  • Allows video customization according to the V-ToC with SMIL

[Figure: ADVISE architecture. On the web server, Video Structure Construction (scene, group and shot detection) feeds the Presentation of V-ToC in XML and the Generation of SMIL presentation. Interaction with the online user terminal: 1. Describe the video content to the online user; 2. Submit selection request; 3. Return customized SMIL video to the user.]

ADVISE: Video Structure Construction
  • Video Shots Detection
      • Color histogram based method with weighted regions
      • 5 regional and 1 overall color histograms for each video frame
      • Catch local color features in the video frame
      • Different weights to regions according to importance

ADVISE: Video Structure Construction (Cont’d)

Calculate the frame-to-frame color difference:

Difference in Region i:
$$RD_i(t) = \sum_{k} \left| Hist_{i,t}(k) - Hist_{i,t-1}(k) \right|, \quad \text{for all color values } k$$

Frame's Color Difference:
$$FD(t) = W_{RD_1 \text{ to } RD_4} \left( RD_1(t) + RD_2(t) + RD_3(t) + RD_4(t) \right) + W_{RD_5}\, RD_5(t) + W_{RD_6}\, RD_6(t)$$

where $Hist_{i,t}(k)$ denotes the k-th color value in the histogram for region i in frame t, and $W_{RD_X}$ are the weights of the regions.
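The weighted regional difference can be sketched in a few lines of Python. This is only an illustration under stated assumptions: frames are taken to be 2-D lists of already quantized color indices, the region layout (four quadrants, a centre block, the whole frame) and the default weights are hypothetical, and the per-region difference is assumed to be a sum of absolute histogram differences. The names `histogram`, `region_boxes`, `region_differences` and `frame_difference` are introduced here and do not come from the slides.

```python
from typing import List, Sequence, Tuple

Frame = List[List[int]]          # rows of quantized color indices in range(bins)
Box = Tuple[int, int, int, int]  # (top, left, bottom, right), end-exclusive


def histogram(frame: Frame, box: Box, bins: int) -> List[int]:
    """Count the color indices that fall inside the given box."""
    top, left, bottom, right = box
    hist = [0] * bins
    for row in frame[top:bottom]:
        for value in row[left:right]:
            hist[value] += 1
    return hist


def region_boxes(height: int, width: int) -> List[Box]:
    """Assumed layout: regions 1-4 are quadrants, 5 is the centre, 6 is the whole frame."""
    h2, w2 = height // 2, width // 2
    h4, w4 = height // 4, width // 4
    return [
        (0, 0, h2, w2), (0, w2, h2, width),
        (h2, 0, height, w2), (h2, w2, height, width),
        (h4, w4, height - h4, width - w4),
        (0, 0, height, width),
    ]


def region_differences(prev: Frame, cur: Frame, bins: int) -> List[int]:
    """RD_i(t): sum over k of |Hist_{i,t}(k) - Hist_{i,t-1}(k)| for each region i."""
    diffs = []
    for box in region_boxes(len(cur), len(cur[0])):
        h_prev = histogram(prev, box, bins)
        h_cur = histogram(cur, box, bins)
        diffs.append(sum(abs(a - b) for a, b in zip(h_cur, h_prev)))
    return diffs


def frame_difference(rd: Sequence[float], w_outer: float = 1.0,
                     w_centre: float = 2.0, w_overall: float = 1.0) -> float:
    """FD(t): one shared weight for regions 1-4, separate weights for regions 5 and 6."""
    return w_outer * sum(rd[:4]) + w_centre * rd[4] + w_overall * rd[5]
```

Calling `frame_difference(region_differences(prev, cur, bins))` for each pair of consecutive frames yields the sequence of frame color differences that the thresholding step works on.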

ADVISE: Video Structure Construction (Cont’d)
  • Find the sudden changes in color content as the video shot boundaries
  • Need a threshold to determine the shot boundaries:

$$\text{Frame Color Difference } FD(t) > \text{Threshold} \;\Rightarrow\; \text{shot boundary occurs}$$
$$\text{Frame Color Difference } FD(t) \le \text{Threshold} \;\Rightarrow\; \text{not a shot boundary}$$

  • Not suitable to assign a fixed threshold to different videos
  • Use adaptive threshold

ADVISE: Video Structure Construction (Cont’d)
  • Employed an entropic thresholding method
      • Divide the frame-to-frame histogram differences into two populations at a threshold point
      • Measure the entropies of the populations
      • Find the maximum sum of the two entropies over different threshold points

[Figure: distribution of histogram differences, from 0 to the maximum difference. The optimal threshold (with the most informative entropies) separates the non-shot breaks from the shot breaks.]
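One way to read the entropic thresholding step is sketched below: bin the frame differences, try every split point, and keep the split that maximizes the sum of the entropies of the two populations. The number of bins, the tie-breaking rule (first maximum wins) and the function names are assumptions made for this sketch; the slides do not specify them.

```python
import math
from typing import List, Sequence


def entropy(counts: Sequence[int]) -> float:
    """Shannon entropy of a population given as bin counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def entropic_threshold(differences: Sequence[float], bins: int = 64) -> float:
    """Return the split value that maximizes entropy(low) + entropy(high)."""
    max_diff = max(differences)
    if max_diff == 0:
        return 0.0
    counts = [0] * bins
    for d in differences:
        counts[min(int(d / max_diff * bins), bins - 1)] += 1

    best_split, best_score = 1, float("-inf")
    for split in range(1, bins):
        low, high = counts[:split], counts[split:]
        if sum(low) == 0 or sum(high) == 0:
            continue  # both populations must be non-empty
        score = entropy(low) + entropy(high)
        if score > best_score:
            best_split, best_score = split, score
    return best_split * max_diff / bins


def shot_boundaries(differences: Sequence[float], threshold: float) -> List[int]:
    """Indices whose frame difference exceeds the threshold."""
    return [t for t, fd in enumerate(differences) if fd > threshold]
```

Because the threshold is derived from each video's own difference distribution, it adapts per video, which is the motivation given for not using a fixed value.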

ADVISE: Video Structure Construction (Cont’d)
  • Video Groups Formation
      • For each shot s, compare its key frame (the first frame) with the key frame from the most recent shot in group g.

$$RD_i(t_s, t_g) = \sum_{k} \left| Hist_{i,t_s}(k) - Hist_{i,t_g}(k) \right|, \quad \text{for all color values } k$$

$$FD(t_s, t_g) = W_{RD_1 \text{ to } RD_4} \left( RD_1(t_s, t_g) + RD_2(t_s, t_g) + RD_3(t_s, t_g) + RD_4(t_s, t_g) \right) + W_{RD_5}\, RD_5(t_s, t_g) + W_{RD_6}\, RD_6(t_s, t_g)$$

where $t_s$ and $t_g$ are the key frames of shot s and of the most recent shot in group g.

ADVISE: Video Structure Construction (Cont’d)
  • After comparing with all groups, we assign the shot to a group if:
      • The difference is the smallest amongst the groups
      • The difference is smaller than the calculated threshold
      • The shot is temporally not far apart from the group
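The three assignment conditions can be sketched as follows. The `Shot` and `Group` classes, the `difference` callback, and the `max_gap` parameter (a limit on the temporal distance, measured in shot indices) are hypothetical names introduced for illustration. Opening a new group when the conditions fail is also an assumption, since the slides only state when a shot joins an existing group.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Shot:
    index: int          # temporal position of the shot in the video
    key_frame: object   # first frame of the shot


@dataclass
class Group:
    shots: List[Shot] = field(default_factory=list)


def assign_shot(shot: Shot, groups: List[Group],
                difference: Callable[[object, object], float],
                threshold: float, max_gap: int) -> Group:
    """Place the shot in the best-matching group, or open a new group."""
    best_group, best_diff = None, float("inf")
    for group in groups:
        # Compare against the key frame of the most recent shot in the group.
        diff = difference(shot.key_frame, group.shots[-1].key_frame)
        if diff < best_diff:
            best_group, best_diff = group, diff

    close_in_time = (best_group is not None
                     and shot.index - best_group.shots[-1].index <= max_gap)
    if best_group is None or best_diff >= threshold or not close_in_time:
        best_group = Group()  # assumption: unmatched shots start a new group
        groups.append(best_group)
    best_group.shots.append(shot)
    return best_group
```

In practice `difference` would be the weighted regional key-frame difference; any function of two key frames that returns a number works for this sketch.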

ADVISE: Video Structure Construction (Cont’d)
  • Video Scenes Formation
      • Construct a continuous video sequence from video groups
  • Video scenes
      • For each group with the first and the last shots m and n
          • If m is within a scene, add the group to the scene and extend the scene to n if necessary: Case (i) & (ii)
          • If m is not within any scene, add it to a new scene: Case (iii)

[Figure: timeline with Scene A spanning a to b and Scene B spanning c to d, and a group G1 spanning m to n.
  Case (i): c<m<n<d. Assign G1 to B.
  Case (ii): c<m<d & d<n. Assign G1 to B and extend B to n.
  Case (iii): d<m. Create Scene C and assign G1 to C.]
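The three cases reduce to a short merge loop, sketched here. Each group is represented only by its first and last shot indices `(m, n)`. The `Scene` class and the choice to process groups in order of their first shot are assumptions made for this sketch.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Tuple


@dataclass
class Scene:
    start: int
    end: int
    groups: List[Tuple[int, int]] = field(default_factory=list)


def form_scenes(groups: Iterable[Tuple[int, int]]) -> List[Scene]:
    """Merge groups, given as (first shot m, last shot n), into scenes."""
    scenes: List[Scene] = []
    for m, n in sorted(groups):
        for scene in scenes:
            if scene.start <= m <= scene.end:
                # Cases (i) and (ii): m lies inside the scene;
                # extend the scene to n only if n is beyond its end.
                scene.groups.append((m, n))
                scene.end = max(scene.end, n)
                break
        else:
            # Case (iii): m is not within any scene, so open a new one.
            scenes.append(Scene(m, n, [(m, n)]))
    return scenes
```

For example, the groups `(0, 4)`, `(1, 3)`, `(2, 7)` and `(9, 10)` give two scenes: the first three groups merge into one scene covering shots 0 to 7, and the last group starts a new scene.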

ADVISE: Video Structure Construction (Cont’d)

User Interface of implemented system

ADVISE: Video Structure Construction (Cont’d)
  • Experiments:
      • Evaluate 4 different settings of video structure construction
          • Single color histogram, and fixed threshold
          • Single color histogram, and adaptive threshold
          • Weighted regional color histograms, and fixed threshold
          • Weighted regional color histograms, and adaptive threshold
      • Compare the generated video structure with human judgments

ADVISE: Video Structure Construction (Cont’d)
  • The setting with weighted regional color histograms and an adaptive threshold generates the most accurate video structure.

ADVISE: XML Presentation
  • 4 benefits of storing the video structure in XML
      • Nested hierarchy
          • Fits our video tree structure
      • Plain-text format
          • Easy to search and modify
      • Extensibility
          • The video structure can be extended
      • Application to the Internet

ADVISE: XML Presentation (Cont’d)
  • Defined XML grammar for video structure in DTD

DTD for XML Video Tree Structure:

    <?xml version="1.0"?>
    <!ELEMENT advise (video+)>
    <!ELEMENT video (scene+)>
      <!ATTLIST video src CDATA #REQUIRED>
    <!ELEMENT scene (group+)>
      <!ATTLIST scene id CDATA #REQUIRED>
    <!ELEMENT group (shot+)>
      <!ATTLIST group id CDATA #REQUIRED>
    <!ELEMENT shot (time+, keyframe+)>
      <!ATTLIST shot id CDATA #REQUIRED>
    <!ELEMENT time EMPTY>
      <!ATTLIST time value CDATA #REQUIRED>
    <!ELEMENT keyframe EMPTY>
      <!ATTLIST keyframe img CDATA #REQUIRED>

XML Video Tree Structure:

    <?xml version="1.0"?>
    <!DOCTYPE advise SYSTEM "./toc.dtd">
    <advise>
      <video src="rtsp:// source video on server">
        <scene id="1">
          <group id="1">
            <shot id="1">
              <time value="0"/>
              <keyframe img="./sh_1.jpg"/>
            </shot>
            <shot id="2">
              <time value="11"/>
              <keyframe img="./sh_2.jpg"/>
            </shot>
          </group>
        </scene>
      </video>
    </advise>
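To illustrate how the nested XML maps back onto the video tree, the following sketch walks a V-ToC document with Python's standard `xml.etree.ElementTree` and lists every shot with its scene and group. The sample document mirrors the structure on the slide. The XML declaration and the DOCTYPE are left out, the `rtsp://example.invalid/video` address is a placeholder, and `list_shots` is a name introduced here.

```python
import xml.etree.ElementTree as ET

# Sample V-ToC following the slide's structure (placeholder video address).
VTOC_XML = """
<advise>
  <video src="rtsp://example.invalid/video">
    <scene id="1">
      <group id="1">
        <shot id="1"><time value="0"/><keyframe img="./sh_1.jpg"/></shot>
        <shot id="2"><time value="11"/><keyframe img="./sh_2.jpg"/></shot>
      </group>
    </scene>
  </video>
</advise>
"""


def list_shots(xml_text):
    """Return (scene id, group id, shot id, start time in seconds, key frame) rows."""
    root = ET.fromstring(xml_text)
    rows = []
    for video in root.findall("video"):
        for scene in video.findall("scene"):
            for group in scene.findall("group"):
                for shot in group.findall("shot"):
                    rows.append((
                        scene.get("id"),
                        group.get("id"),
                        shot.get("id"),
                        int(shot.find("time").get("value")),
                        shot.find("keyframe").get("img"),
                    ))
    return rows
```

Each row carries the same fields that the web presentation displays: scene, group, shot, key frame and start time.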

ADVISE: XML Presentation (Cont’d)
  • Web-based presentation of XML using XSL
      • Transformation to HTML
      • Sorting and filtering of XML data

    <xsl:for-each select="advise/video/scene/group/shot" order-by="../@id">
      <tr class="nfont">
        <th><xsl:value-of select="../../@id"/></th>
        <th><xsl:value-of select="../@id"/></th>
        <th><xsl:value-of select="@id"/></th>
        <th align="left">
          <img width="55" height="45">
            <xsl:attribute name="src"><xsl:value-of select="keyframe/@img"/></xsl:attribute>
          </img>
          at <xsl:value-of select="time/@value"/> sec
        </th>
      </tr>
    </xsl:for-each>

ADVISE: XML Presentation (Cont’d)

An example XML presentation

V-ToC describes the structure of a video

ADVISE: SMIL Generation
  • Video customization
      • Allow users to pick the video segments that they are interested in from the V-ToC
  • SMIL
      • Designed for performing synchronized multimedia presentations on the Internet
      • Use RealPlayer to browse
      • Benefits
          • Easy to generate because of the XML plain-text property
          • Dynamically adapts to different network and client conditions

ADVISE: SMIL Generation (Cont’d)
  • Defined a SMIL template:

    <?xml version="1.0"?>
    <smil>
      <head> ... Define the Layout </head>
      <body>
        <seq>
          <par>
            <video src="rtsp:// source video on server" clip-begin="3s" clip-end="15s" region="video" fill="freeze"/>
            <textstream src="desc.rt" clip-begin="3s" clip-end="15s" region="description" fill="freeze"/>
            <img src="./sh_2.jpg" region="keyframe" fill="freeze"/>
          </par>
          <par>
            <video src="rtsp:// source video on server" clip-begin="35s" clip-end="50s" region="video" fill="freeze"/>
            <textstream src="desc.rt" clip-begin="35s" clip-end="50s" region="description" fill="freeze"/>
            <img src="./sh_4.jpg" region="keyframe" fill="freeze"/>
          </par>
        </seq>
      </body>
    </smil>

ADVISE: SMIL Generation (Cont’d)

[Figure: customized SMIL video presentation. The user interface for customization submits a request to a script on the web server, which returns SMIL. The script:
  1. Interprets the request
  2. Selects video segments according to the XML video structure
  3. Generates the customized SMIL presentation]
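The generation step of the server script might look roughly like the sketch below, which fills the SMIL template with one `<par>` block per selected segment. It uses only Python's standard library. The function name `build_smil`, the `(begin, end, keyframe)` segment tuples and the placeholder video address in the usage note are assumptions; the slides do not show the actual script, and the layout inside `<head>` is left empty because the template elides it.

```python
import xml.etree.ElementTree as ET


def build_smil(video_src, segments, description_src="desc.rt"):
    """Build a SMIL document that plays the selected segments in sequence.

    segments: iterable of (begin_seconds, end_seconds, keyframe_image) tuples.
    """
    smil = ET.Element("smil")
    ET.SubElement(smil, "head")  # layout definition is elided in the template
    seq = ET.SubElement(ET.SubElement(smil, "body"), "seq")
    for begin, end, keyframe in segments:
        par = ET.SubElement(seq, "par")
        clip = {"clip-begin": f"{begin}s", "clip-end": f"{end}s"}
        ET.SubElement(par, "video", {"src": video_src, **clip,
                                     "region": "video", "fill": "freeze"})
        ET.SubElement(par, "textstream", {"src": description_src, **clip,
                                          "region": "description", "fill": "freeze"})
        ET.SubElement(par, "img", {"src": keyframe,
                                   "region": "keyframe", "fill": "freeze"})
    return ET.tostring(smil, encoding="unicode")
```

For instance, `build_smil("rtsp://example.invalid/video", [(3, 15, "./sh_2.jpg"), (35, 50, "./sh_4.jpg")])` produces a body with the same two `<par>` blocks as the template.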

Future Work
  • Video retrieval system framework
      • Integrates ADVISE and video tree matching algorithms
      • Explore the capability of using video tree matching for video retrieval
      • Video clustering
  • Efficient retrieval of video using XML
      • Hierarchy of V-ToC
      • Textual search of video information

Conclusion
  • Overview of the research on video retrieval system
      • Based on the structure of video (V-ToC)
  • Described ADVISE
      • Generates video tree structure (V-ToC)
      • Provides V-ToC in XML as descriptions of videos on the Internet
      • Enables video customization based on V-ToC using SMIL