M4 – WP3 Multimodal integration Progress report Viper group Computer Vision and Multimedia Lab University of Geneva 30-01-03


M4 – WP3

Multimodal integration
Progress report

Viper group

Computer Vision and Multimedia Lab

University of Geneva
30-01-03


S. Marchand-Maillet http://viper.unige.ch/ M4 meeting #3, Sheffield, UK, January 2003 2

Progress report

UniGE
  Information retrieval setup / extension
  Video data processing
  Information management framework

WP3:
  Issues
  Status – deliverable


Information retrieval setup (initial)

[Diagram labels: Segmentation; Event definition; A/V/text input; URLisation; SQL DB; Time codes; URLs; Characterisation; Feature definition; Feature files; GIFT indexing; Keyframes; Index file; GIFT; Text; QBE query; Text query; Interface; MRML; Query client]


Information retrieval setup (planned)

[Diagram labels: Segmentation; Event definition; A/V/text input; URLisation; SQL DB; Time codes; URLs; Characterisation; Feature definition; Feature files; GIFT indexing; Keyframes; Index file; Text; QBE query; Text query; Interface; MRML; Query client; Text; Audio query; GIFT]


Video processing (1)

OVAL: Video Access Library
  C++ Video Object Model
  Accepts plugins for specific formats
    MPEG-1: Dali from Cornell
    LibDV, « XML » video plugin
  Provides a generic API
    Open, Close, GetProp stream
    GetFrame(s)
    Specific (MPEG: getMV, getDCT)
  Does not accommodate image processing functionalities
    Use of Matlab Mex with persistent memory
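
The plugin arrangement can be pictured with a small sketch. It is written in Python rather than OVAL's C++, and every name in it (VideoPlugin, register_plugin, open_video, CountingPlugin, the "frame_count" property) is hypothetical, chosen only to mirror the Open / Close / GetProp / GetFrame(s) operations listed on the slide. It does not reproduce OVAL's actual signatures.

```python
from abc import ABC, abstractmethod


class VideoPlugin(ABC):
    """Hypothetical format plugin exposing the generic operations from the slide."""

    @abstractmethod
    def open(self, path): ...

    @abstractmethod
    def close(self): ...

    @abstractmethod
    def get_prop(self, name): ...

    @abstractmethod
    def get_frame(self, index): ...

    def get_frames(self, start, stop):
        """Default GetFrames: repeated GetFrame over the range [start, stop)."""
        return [self.get_frame(i) for i in range(start, stop)]


_PLUGINS = {}  # file extension -> plugin class (assumed dispatch rule)


def register_plugin(extension, plugin_cls):
    _PLUGINS[extension.lower()] = plugin_cls


def open_video(path):
    """Pick the plugin from the file extension and open the stream."""
    extension = path.rsplit(".", 1)[-1].lower()
    if extension not in _PLUGINS:
        raise ValueError(f"no plugin registered for .{extension}")
    plugin = _PLUGINS[extension]()
    plugin.open(path)
    return plugin


class CountingPlugin(VideoPlugin):
    """Toy format for illustration only: frame i is just the integer i."""

    def open(self, path):
        self.path, self.is_open = path, True

    def close(self):
        self.is_open = False

    def get_prop(self, name):
        return {"frame_count": 10}[name]

    def get_frame(self, index):
        if not 0 <= index < self.get_prop("frame_count"):
            raise IndexError(index)
        return index


register_plugin("toy", CountingPlugin)
```

With this layout, format-specific calls such as the MPEG motion-vector and DCT accessors would live on the individual plugin class rather than on the generic interface.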


Video Processing (2)

Video segmentation
  Classical techniques
  Based on spatio-temporal features (ongoing)
    Mixed colour/motion information
  Needs to be extended to event-based segmentation
    Integration of M4 features

Video characterisation
  Estimation on feature pattern model (motion)
  Support Vector Regression
    Non-linear Prediction of Chaotic Time Series using SVM, NNSP'97 (Mukherjee, Osuna, Girosi)
    Predicting Time Series with SVM, ICANN'97 (Müller, Smola, Schölkopf, Vapnik)


Video Similarity Measure

S(V1, V2) = 1 − E_{V1}(V2)

Problems:
  S(V1, V1) ≠ 0
  S(V1, V2) ≠ S(V2, V1)

Artificial symmetrisation
  D(V1, V2) = 0.5 * [S(V1, V2) + S(V2, V1)]
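
The symmetrisation step is simple enough to sketch directly. The snippet is illustrative only: symmetrise is a hypothetical helper name, and toy_measure is a made-up asymmetric function standing in for S, since the slide does not give enough detail to reproduce the prediction-error measure itself.

```python
def symmetrise(measure, v1, v2):
    """D(V1, V2) = 0.5 * [S(V1, V2) + S(V2, V1)] for any two-argument measure S."""
    return 0.5 * (measure(v1, v2) + measure(v2, v1))


def toy_measure(v1, v2):
    """Made-up asymmetric stand-in for S: mean of v2 scaled by the length of v1."""
    return len(v1) * sum(v2) / len(v2)
```

Averaging the two directions removes the asymmetry only. D(V1, V1) is still equal to S(V1, V1), so the first problem listed on the slide remains.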


Video Classification

Distance matrix computed with prediction error D(Vi, Vj)
  For all pairs of videos <i,j> in the given database
    Di,j = D(Vi, Vj)

Curvilinear Component Analysis is applied on D
  ⇒ gives a 2-dimensional mapping of the feature space
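
The two steps (fill a pairwise distance matrix, then project it to 2-D) can be sketched as follows. Curvilinear Component Analysis is not available in the Python standard library, so classical multidimensional scaling is substituted here purely for illustration. It is not the method used on the slide, and the function names (euclidean, distance_matrix, classical_mds_2d) are hypothetical.

```python
import math


def euclidean(p, q):
    """Plain Euclidean distance, used here as a placeholder for D(Vi, Vj)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def distance_matrix(items, dist):
    """D[i][j] = dist(items[i], items[j]) for every pair <i, j>."""
    return [[dist(a, b) for b in items] for a in items]


def classical_mds_2d(d, iterations=500):
    """2-D embedding of a symmetric distance matrix by classical MDS.

    NOTE: classical MDS is a stand-in; it is not Curvilinear Component Analysis.
    """
    n = len(d)
    squared = [[d[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(row) / n for row in squared]
    grand_mean = sum(row_mean) / n
    # Double centring turns squared distances into a Gram (inner-product) matrix.
    gram = [[-0.5 * (squared[i][j] - row_mean[i] - row_mean[j] + grand_mean)
             for j in range(n)] for i in range(n)]
    coords = [[0.0, 0.0] for _ in range(n)]
    for axis in range(2):
        # Power iteration from a fixed start vector, so the result is deterministic.
        vec = [1.0 + 0.1 * i for i in range(n)]
        for _ in range(iterations):
            nxt = [sum(gram[i][j] * vec[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in nxt))
            if norm == 0.0:
                vec = [0.0] * n
                break
            vec = [x / norm for x in nxt]
        eigenvalue = sum(vec[i] * gram[i][j] * vec[j]
                         for i in range(n) for j in range(n))
        scale = math.sqrt(max(eigenvalue, 0.0))
        for i in range(n):
            coords[i][axis] = scale * vec[i]
        # Deflate so the next pass converges to the following eigenvector.
        gram = [[gram[i][j] - eigenvalue * vec[i] * vec[j] for j in range(n)]
                for i in range(n)]
    return coords
```

In use, dist would be the symmetrised prediction-error distance rather than euclidean, and the returned coordinates give one 2-D point per video shot. CCA is a nonlinear projection, so its layout will generally differ from this linear one.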


Preliminary experiment

29 video shots containing mainly TV news and sport activities


Ongoing…

Text retrieval
  Inclusion within GIFT
  Multimodal embedding (visual + text query)
  Query expansion (e.g. using WordNet)

Event characterisation
  High-level model
  Feature-based inference
  ⇒ Characterisation of well-known events
  ⇒ Suitable for restricted contexts (M4)


Information management

MRML: Going toward version 2.0
  More multimedia
  More like an XML protocol (as defined by W3C - XMLP)
  Truly multimedia / multimodal
  ⇒ Spec proposal release mid-Feb
  ⇒ Expected validation software: this summer

DEVA (Annotation model)
  Based on RDF and Dublin Core (XML)
  DAML+OIL (OWL) compatible
  Makes existing software available (Xerces, Jena, …)
  Allows multiple extensions (WordNet, …)


WP3: Initial work plan


WP3: Deliverables

D3.1: Report on baseline information access methods
  m12 (Feb 2003)
  Technical doc of the working system in place

D3.2: Report on methods for multimodal integration and NLP
  m24 (Feb 2004)
  Define intuitive way for meeting data querying and retrieval

D3.3: Final report on multimodal information access
  m36 (Feb 2005)
  Technical doc of the meeting manager


D3.1

Gathered basic information
  Group-based
Template sent by next week
  Activity-based
  Description of what you can contribute in one field
Response by Feb 20th
  Fill in where you feel is relevant
Edited by end of Feb
  Smoothed out gaps…
Sent to Steve by mid-March


WP3: Issues

Visual data is not usable alone
  Need for text transcripts
  Use of « external » data
Need for common format for data exchange
  Annotation (explicit)
  Processing results
Increase collaboration
  Integration


WP3 breakdown

Year 1 (-> 03/2003)
  Emphasis on multimedia information processing and retrieval
    Image, Video: Visual + Motion
    Audio (speech), Text
  Framework: Architecture, integration
Year 2 (-> 03/2004)
  Emphasis on multimodal interaction (query processing)
    Information from text, speech (text?), gesture, ...
    Natural language processing
Year 3 (-> 03/2005)
  Emphasis on data summarisation
    Video, dialogs, documents


????


The framework

[Diagram labels: CBIR server; TCP/IP; Client; Multimedia data; http server; socket; MRML layer; QBE query formulator (e.g. PHP interface); Existing tool; Tool plugin (e.g. GIMP plugin); Assessor (e.g. Viper evaluation script); Open socket; GIFT plugins; MRML; PluginX; PluginY; Multimedia feature storage; MRML logging; Multimedia data; Online / Offline; Feature extraction; …features; URL abstraction (temporary local copy); Queries; Response; Relevance feedback]
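
To make the client side of the framework concrete, here is a minimal sketch of how a query-by-example request with relevance feedback might be serialised before being written to the socket. The element and attribute names (mrml, query-step, user-relevance-element-list, user-relevance-element, image-location, user-relevance and so on) are assumptions recalled from MRML 1.0 and should be checked against the specification. The function name build_qbe_query and the example URLs and identifiers are hypothetical.

```python
import xml.etree.ElementTree as ET


def build_qbe_query(session_id, collection_id, algorithm_id, examples, result_size=20):
    """Serialise a query-by-example request.

    examples is a list of (url, relevance) pairs; a positive relevance marks
    a wanted example and a negative one an unwanted example.
    All tag and attribute names are assumed, not taken from the slides.
    """
    root = ET.Element("mrml", {"session-id": session_id})
    step = ET.SubElement(root, "query-step", {
        "session-id": session_id,
        "collection-id": collection_id,
        "algorithm-id": algorithm_id,
        "result-size": str(result_size),
    })
    elements = ET.SubElement(step, "user-relevance-element-list")
    for url, relevance in examples:
        ET.SubElement(elements, "user-relevance-element", {
            "image-location": url,
            "user-relevance": str(relevance),
        })
    return ET.tostring(root, encoding="unicode")
```

Because the examples are referenced by URL, the same message shape would serve any client in the diagram (PHP interface, tool plugin or evaluation script) without the client needing direct access to the feature storage.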