M4 – WP3
Multimodal integration
Progress report
Viper group
Computer Vision and Multimedia Lab
University of Geneva
30-01-03
S. Marchand-Maillet http://viper.unige.ch/ M4 meeting #3, Sheffield, UK, January 2003 2
Progress report
UniGE
  Information retrieval setup / extension
  Video data processing
  Information management framework
WP3:
  Issues
  Status – deliverable
Information retrieval setup (initial)
Segmentation
Event definition
A/V/text input
URLisation
SQL DB
Time codes
URLs
Characterisation
Feature definition
Feature files
GIFT indexing
Keyframes
Index file
GIFT
Text
QBE query
Text query
Interface
MRML
Query client
Information retrieval setup (planned)
Segmentation
Event definition
A/V/text input
URLisation
SQL DB
Time codes
URLs
Characterisation
Feature definition
Feature files
GIFT indexing
Keyframes
Index file
Text
QBE query
Text query
Interface
MRML
Query client
Text
Audio query
GIFT
Video processing (1)
OVAL: Video Access Library
  C++ Video Object Model
  Accepts plugins for specific formats
    MPEG-1: Dali from Cornell
    LibDV, « XML » video plugin
  Provides a generic API
    Open, Close, GetProp stream
    GetFrame(s)
    Specific (MPEG: getMV, getDCT)
  Does not accommodate image processing functionalities
    Use of Matlab MEX with persistent memory
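The plugin idea behind the generic API (Open, Close, GetProp, GetFrame) can be illustrated with a minimal sketch. OVAL itself is a C++ library; the Python class and function names below (VideoPlugin, DummyPlugin, register, open_video) are hypothetical and are not OVAL's actual interface.

```python
from abc import ABC, abstractmethod


class VideoPlugin(ABC):
    """Format-specific backend (hypothetical interface, not OVAL's real one)."""

    @abstractmethod
    def open(self, path): ...

    @abstractmethod
    def close(self): ...

    @abstractmethod
    def get_prop(self, name): ...

    @abstractmethod
    def get_frame(self, index): ...


class DummyPlugin(VideoPlugin):
    """Stand-in plugin that serves synthetic frames instead of decoding a file."""

    def __init__(self, n_frames=3):
        self._n_frames = n_frames
        self._is_open = False

    def open(self, path):
        self._is_open = True

    def close(self):
        self._is_open = False

    def get_prop(self, name):
        return {"frame_count": self._n_frames}[name]

    def get_frame(self, index):
        if not self._is_open:
            raise RuntimeError("stream is not open")
        if not 0 <= index < self._n_frames:
            raise IndexError(index)
        return [[index]]  # a 1x1 "image" holding the frame number


PLUGINS = {}  # file extension -> plugin class


def register(extension, plugin_cls):
    """Associate a file extension with the plugin that handles it."""
    PLUGINS[extension.lower()] = plugin_cls


def open_video(path):
    """Pick a plugin from the file extension and return an opened stream."""
    extension = path.rsplit(".", 1)[-1].lower()
    plugin = PLUGINS[extension]()
    plugin.open(path)
    return plugin


register("dummy", DummyPlugin)
```

Client code only calls open_video and the four generic methods; format-specific access (such as motion vectors for MPEG) would live in the plugin subclass.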
Video Processing (2)
Video segmentation
  Classical techniques
  Based on spatio-temporal features (ongoing)
    Mixed colour/motion information
  Needs to be extended to event-based segmentation
    Integration of M4 features
Video characterisation
  Estimation on feature pattern model (motion)
  Support Vector Regression
    Non-linear Prediction of Chaotic Time Series using SVM, NNSP’97 (Mukherjee, Osuna, Girosi)
    Predicting Time Series with SVM, ICANN’97 (Muller, Smola, Schölkopf, Vapnik)
Video Similarity Measure
$S(V_1, V_2) = 1 - E_{V_1}(V_2)$
Problems:
  S(V1, V1) ≠ 0
  S(V1, V2) ≠ S(V2, V1)
Artificial symmetrization
  D(V1, V2) = 0.5 * [S(V1, V2) + S(V2, V1)]
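A minimal sketch of why a prediction-based score is asymmetric and how the averaging repairs it. It assumes the score is one minus a normalised prediction error, and it swaps in a least-squares AR(1) predictor for the Support Vector Regression used in the actual work; the normalisation and the toy series are also assumptions made for illustration.

```python
def fit_ar1(series):
    """Least-squares coefficient a of the predictor x[t] ~ a * x[t-1]."""
    num = sum(x1 * x0 for x0, x1 in zip(series, series[1:]))
    den = sum(x0 * x0 for x0 in series[:-1])
    return num / den


def prediction_error(model_series, test_series):
    """Error of the model fitted on model_series when predicting test_series.

    Normalised by the energy of the predicted samples (an assumption).
    """
    a = fit_ar1(model_series)
    residual = sum((y1 - a * y0) ** 2
                   for y0, y1 in zip(test_series, test_series[1:]))
    energy = sum(y1 * y1 for y1 in test_series[1:])
    return residual / energy


def similarity(v1, v2):
    """S(V1, V2): one minus the error of V1's model applied to V2."""
    return 1.0 - prediction_error(v1, v2)


def symmetrized(v1, v2):
    """D(V1, V2) = 0.5 * [S(V1, V2) + S(V2, V1)]."""
    return 0.5 * (similarity(v1, v2) + similarity(v2, v1))


# Toy one-dimensional "motion feature" series, made up for this example.
V1 = [1.0, 2.0, 4.0, 8.0]
V2 = [1.0, 3.0, 9.0, 27.0]

print(similarity(V1, V2), similarity(V2, V1))    # the two directions differ
print(symmetrized(V1, V2), symmetrized(V2, V1))  # identical after averaging
```

In this toy setting S(V1, V1) is also non-zero, since a model predicts its own training series with no error.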
Video Classification
Distance matrix computed with prediction error D(Vi, Vj)
  For all pairs of videos <i,j> in the given database
  Di,j = D(Vi, Vj)
Curvilinear Component Analysis is applied on D
  ⇒ gives a 2-dimensional mapping of the feature space
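Building the matrix D over a database can be sketched as follows. The score function is passed in as a parameter; toy_score and the numeric "videos" are made-up stand-ins, and the Curvilinear Component Analysis step itself is not shown.

```python
def distance_matrix(videos, score):
    """Return the symmetric matrix D[i][j] = 0.5 * (S(Vi, Vj) + S(Vj, Vi))."""
    n = len(videos)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):  # upper triangle, mirrored below
            d = 0.5 * (score(videos[i], videos[j]) + score(videos[j], videos[i]))
            matrix[i][j] = d
            matrix[j][i] = d
    return matrix


def toy_score(a, b):
    """Deliberately asymmetric stand-in for a prediction-based score."""
    return abs(a - b) / (1.0 + a)


videos = [1.0, 2.0, 4.0]  # placeholders for per-video feature models
D = distance_matrix(videos, toy_score)
for row in D:
    print(row)
```

Any projection method that accepts a precomputed distance matrix could then be applied to D to obtain the two-dimensional map.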
Preliminary experiment
29 video shots containing mainly TV news and sport activities
Ongoing…
Text retrieval
  Inclusion within GIFT
  Multimodal embedding (visual + text query)
  Query expansion (eg using WordNet)
Event characterisation
  High-level model
  Feature-based inference
    ⇒ Characterisation of well-known events
    ⇒ Suitable for restricted contexts (M4)
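Query expansion, listed above as ongoing work, can be illustrated with a short sketch. The synonym table is a hand-made toy, not WordNet data, and expand_query is a hypothetical helper rather than part of GIFT.

```python
# Toy synonym table (an assumption for illustration, not taken from WordNet).
SYNONYMS = {
    "meeting": ["conference", "gathering"],
    "talk": ["presentation", "speech"],
}


def expand_query(query, synonyms=SYNONYMS):
    """Return the query terms followed by their synonyms, without duplicates."""
    expanded = []
    for term in query.lower().split():
        for candidate in [term] + synonyms.get(term, []):
            if candidate not in expanded:
                expanded.append(candidate)
    return expanded


print(expand_query("Meeting talk"))
```

Terms with no entry in the table pass through unchanged, so the expanded query always contains the original terms.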
Information management
MRML: Going toward version 2.0
  More multimedia
  More like an XML protocol (as defined by W3C - XMLP)
  Truly multimedia / multimodal
  ⇒ Spec proposal release mid-Feb
  ⇒ Expected validation software: this summer
DEVA (Annotation model)
  Based on RDF and Dublin Core (XML)
  DAML+OIL (OWL) compatible
  Makes existing software available (Xerces, Jena,…)
  Allows multiple extensions (WordNet,…)
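As an illustration of an annotation based on RDF and Dublin Core, the sketch below builds a small RDF/XML description with Python's standard library. The two namespace URIs are the standard RDF and Dublin Core ones; the choice of fields, the annotate helper and the example values are assumptions and do not reproduce the DEVA model.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("dc", DC_NS)


def annotate(resource_url, title, creator, date):
    """Serialise a Dublin Core description of one resource as RDF/XML."""
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    description = ET.SubElement(
        root, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": resource_url}
    )
    for name, value in (("title", title), ("creator", creator), ("date", date)):
        ET.SubElement(description, f"{{{DC_NS}}}{name}").text = value
    return ET.tostring(root, encoding="unicode")


# Placeholder values, made up for the example.
print(annotate("http://example.org/meeting1.mpg", "Meeting 1",
               "Example Annotator", "2003-01-30"))
```

Because the output is ordinary RDF/XML, generic parsers and RDF toolkits can consume it without custom code.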
WP3: Initial work plan
WP3: Deliverables
D3.1: Report on baseline information access methods
  m12 (Feb 2003)
  Technical doc of the working system in place
D3.2: Report on methods for multimodal integration and NLP
  m24 (Feb 2004)
  Define an intuitive way of querying and retrieving meeting data
D3.3: Final report on multimodal information access
  m36 (Feb 2005)
  Technical doc of the meeting manager
D3.1
Gathered basic information
  Group-based
Template sent by next week
  Activity-based
  Description of what you can contribute in one field
Response by Feb 20th
  Fill in where you feel it is relevant
Edited by end of Feb
  Smoothed out gaps…
Sent to Steve by mid-March
WP3: Issues
Visual data is not usable alone
  Need for text transcripts
  Use of « external » data
Need for common format for data exchange
  Annotation (explicit)
  Processing results
Increase collaboration
  Integration
WP3 breakdown
Year 1 (-> 03/2003)
  Emphasis on multimedia information processing and retrieval
    Image, Video: Visual + Motion
    Audio (speech), Text
  Framework: Architecture, integration
Year 2 (-> 03/2004)
  Emphasis on multimodal interaction (query processing)
    Information from text, speech (text?), gesture,...
    Natural language processing
Year 3 (-> 03/2005)
  Emphasis on data summarisation
    Video, dialogs, documents
????
The framework
CBIR server
TCP/IP
Client
Multimedia data
http server
…
socket
MRML layer
QBE query formulator (eg PHP interface)
Existing tool
socket
MRML layer
Tool plugin (eg GIMP plugin)
MRML layer
socket
Assessor (eg Viper evaluation script)
Open socket
GIFT
plugins
MRML
PluginX
PluginY
…
Multimedia feature storage
MRML logging
Multimedia data
Online
Offline
Feature extraction
…features
URL abstraction (temporary local copy)
Queries
Response
Relevance feedback
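In this framework every client (query formulator, tool plugin, assessor) reaches the server through an MRML layer over a socket. The sketch below shows what such a layer might build for a query-by-example step with relevance feedback. The element and attribute names are written from memory of MRML 1.0 and should be treated as assumptions to check against the specification; build_query_step, the session identifier and the URLs are hypothetical.

```python
import xml.etree.ElementTree as ET


def build_query_step(session_id, examples, result_size=10):
    """Build an MRML-style query message from (url, relevance) pairs.

    Element and attribute names are assumptions to verify against the
    MRML specification before use with a real server.
    """
    root = ET.Element("mrml", {"session-id": session_id})
    step = ET.SubElement(root, "query-step", {"result-size": str(result_size)})
    relevance_list = ET.SubElement(step, "user-relevance-element-list")
    for url, relevance in examples:
        ET.SubElement(
            relevance_list,
            "user-relevance-element",
            {"image-location": url, "user-relevance": str(relevance)},
        )
    return ET.tostring(root, encoding="unicode")


# One positive and one negative example (placeholder URLs).
message = build_query_step(
    "s-1",
    [("http://example.org/keyframe1.jpg", 1),
     ("http://example.org/keyframe2.jpg", -1)],
)
print(message)
```

A client would write this string to the TCP socket opened to the server and parse the XML reply; the transport code is left out because it depends on the deployment.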