M4 – WP3 Multimodal integration
Progress report
Viper group, Computer Vision and Multimedia Lab, University of Geneva
30-01-03

Progress report
- UniGE
  - Information retrieval setup / extension
  - Video data processing
  - Information management framework
- WP3:
  - Issues
  - Status – deliverable
S. Marchand-Maillet http://viper.unige.ch/ M4 meeting #3, Sheffield, UK, January 2003

Information retrieval setup (initial)
[Diagram. Labels: event definition, feature definition, A/V/text input, feature files, characterisation, segmentation, index file, GIFT indexing, GIFT, URLisation, MRML, keyframes, text query, QBE query, URLs, interface, text, SQL DB, query client, time codes]

Information retrieval setup (planned)
[Diagram. Labels: event definition, feature definition, A/V/text input, feature files, characterisation, segmentation, index file, GIFT indexing, keyframes, GIFT, URLisation, MRML, text, text query, QBE query, audio query, URLs, interface, text, SQL DB, query client, time codes]

Video processing (1)
OVAL: Video Access Library
- C++ video object model
- Accepts plugins for specific formats
  - MPEG-1: Dali from Cornell
  - LibDV, « XML » video plugin
- Provides a generic API
  - Open, Close, GetProp stream
  - GetFrame(s)
  - Format-specific calls (MPEG: getMV, getDCT)
- Does not accommodate image processing functionalities
  - Use of Matlab MEX with persistent memory

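The plugin idea behind the generic API can be illustrated with a small sketch. This is not OVAL's actual C++ interface: the class names, method names, the ".fake" format and the toy frames are all hypothetical, chosen only to mirror the Open / Close / GetProp / GetFrame(s) calls listed on the slide.

```python
from abc import ABC, abstractmethod


class VideoPlugin(ABC):
    """Hypothetical per-format plugin (not the real OVAL interface)."""

    @abstractmethod
    def open(self, path): ...

    @abstractmethod
    def close(self): ...

    @abstractmethod
    def get_prop(self, name): ...

    @abstractmethod
    def get_frame(self, index): ...


class FakePlugin(VideoPlugin):
    """Toy in-memory plugin standing in for a real decoder."""

    def __init__(self):
        self._frames = []

    def open(self, path):
        # Pretend the file holds three 2x2 grey-level frames.
        self._frames = [[[i, i], [i, i]] for i in range(3)]

    def close(self):
        self._frames = []

    def get_prop(self, name):
        props = {"frame_count": len(self._frames), "width": 2, "height": 2}
        return props[name]

    def get_frame(self, index):
        return self._frames[index]


class VideoStream:
    """Generic front end: selects a plugin from the file extension."""

    _registry = {}

    @classmethod
    def register(cls, extension, plugin_cls):
        cls._registry[extension.lower()] = plugin_cls

    def __init__(self, path):
        extension = path.rsplit(".", 1)[-1].lower()
        if extension not in self._registry:
            raise ValueError(f"no plugin registered for .{extension}")
        self._plugin = self._registry[extension]()
        self._plugin.open(path)

    def get_prop(self, name):
        return self._plugin.get_prop(name)

    def get_frames(self, start, stop):
        # Frames in the half-open index range [start, stop).
        return [self._plugin.get_frame(i) for i in range(start, stop)]

    def close(self):
        self._plugin.close()


# The made-up ".fake" extension is handled by the toy plugin.
VideoStream.register("fake", FakePlugin)
```

Client code only talks to `VideoStream`; supporting another format would mean registering one more plugin class, which is the design the slide describes.
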
Video Processing (2)
- Video segmentation
  - Classical techniques
  - Based on spatio-temporal features (ongoing)
  - Mixed colour/motion information
  - Needs to be extended to event-based segmentation
  - Integration of M4 features
- Video characterisation
  - Estimation on feature pattern model (motion)
  - Support Vector Regression
    - Non-linear Prediction of Chaotic Time Series using SVM, NNSP’97 (Mukherjee, Osuna, Girosi)
    - Predicting Time Series with SVM, ICANN’97 (Müller, Smola, Schölkopf, Vapnik)

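The characterisation bullets rely on time-series prediction: fit a predictor to a feature sequence, then measure how well it predicts the next value. The sketch below illustrates only that principle. It swaps a one-coefficient linear autoregressive model in place of Support Vector Regression to stay dependency-free, and the function names and toy motion values are invented for illustration.

```python
def fit_ar1(series):
    """Least-squares fit of x[t+1] ~ a * x[t]; returns the coefficient a."""
    num = sum(x * y for x, y in zip(series, series[1:]))
    den = sum(x * x for x in series[:-1])
    if den == 0:
        raise ValueError("series too short or all zero")
    return num / den


def prediction_error(coef, series):
    """Mean squared one-step-ahead error of the model on a series."""
    errors = [(y - coef * x) ** 2 for x, y in zip(series, series[1:])]
    if not errors:
        raise ValueError("need at least two samples")
    return sum(errors) / len(errors)


# Toy motion-magnitude sequences (invented values).
slow = [1.0, 0.5, 0.25, 0.125, 0.0625]  # halves at every frame
fast = [1.0, 2.0, 4.0, 8.0, 16.0]       # doubles at every frame

model = fit_ar1(slow)
error_on_own_sequence = prediction_error(model, slow)
error_on_other_sequence = prediction_error(model, fast)
```

A model fitted on one sequence predicts that sequence well and a differently behaving sequence poorly, so the prediction error can serve as a dissimilarity score between shots.
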
Video Similarity Measure
S(V_1, V_2) = 1 − E_{V_1}(V_2)
- Problems:
  - S(V_1, V_1) ≠ 0
  - S(V_1, V_2) ≠ S(V_2, V_1)
- Artificial symmetrization:
  D(V_1, V_2) = 0.5 * [S(V_1, V_2) + S(V_2, V_1)]

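The symmetrization step can be sketched in a few lines. The score table below is invented and merely stands in for an asymmetric measure S; `symmetrize` and `toy_score` are hypothetical names.

```python
def symmetrize(score):
    """Wrap an asymmetric score S into D(a, b) = 0.5 * (S(a, b) + S(b, a))."""
    def distance(a, b):
        return 0.5 * (score(a, b) + score(b, a))
    return distance


# Invented asymmetric scores between two videos "v1" and "v2".
TOY_SCORES = {
    ("v1", "v1"): 0.125,
    ("v1", "v2"): 0.25,
    ("v2", "v1"): 0.75,
    ("v2", "v2"): 0.0625,
}


def toy_score(a, b):
    return TOY_SCORES[(a, b)]


D = symmetrize(toy_score)
```

With these values `D("v1", "v2")` and `D("v2", "v1")` are both 0.5. Averaging only repairs the asymmetry: `D("v1", "v1")` still equals `S("v1", "v1")`, so the non-zero self-score remains.
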
Video Classification
- Distance matrix computed with prediction error D(V_i, V_j)
- For every pair of videos (i, j) in the given database: D_{i,j} = D(V_i, V_j)
- Curvilinear Component Analysis is applied on D
  ⇒ gives a 2-dimensional mapping of the feature space

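Filling the matrix D_{i,j} over all pairs can be sketched as follows, assuming the distance is already symmetric so that each pair is evaluated once. The function name and the toy one-dimensional "videos" are hypothetical; the Curvilinear Component Analysis step is not shown.

```python
def distance_matrix(items, dist):
    """Return the full n x n matrix D[i][j] = dist(items[i], items[j])."""
    n = len(items)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # Start at j = i so the diagonal is computed too: a
        # prediction-error distance need not be zero on itself.
        for j in range(i, n):
            d = dist(items[i], items[j])
            matrix[i][j] = d
            matrix[j][i] = d  # relies on dist being symmetric
    return matrix


# Toy stand-ins: each "video" is one number, the distance is the gap.
toy_videos = [0.0, 1.0, 3.0]
toy_matrix = distance_matrix(toy_videos, lambda a, b: abs(a - b))
```

The resulting matrix is the input that a dimensionality-reduction step would take to produce the 2-dimensional map.
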
Preliminary experiment
- 29 video shots containing mainly TV news and sports activities

Ongoing…
- Text retrieval
  - Inclusion within GIFT
  - Multimodal embedding (visual + text query)
  - Query expansion (e.g. using WordNet)
- Event characterisation
  - High-level model
  - Feature-based inference
  ⇒ Characterisation of well-known events
  ⇒ Suitable for restricted contexts (M4)

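Query expansion can be illustrated with a small sketch. The synonym table is a hypothetical stand-in for a WordNet lookup, and `expand_query` is an invented helper, not part of GIFT.

```python
# Hypothetical synonym table standing in for a WordNet lookup.
TOY_SYNONYMS = {
    "meeting": ["conference", "session"],
    "talk": ["presentation", "speech"],
}


def expand_query(query, synonyms):
    """Return the query terms plus their synonyms, in order, without duplicates."""
    expanded = []
    for term in query.lower().split():
        for candidate in [term] + synonyms.get(term, []):
            if candidate not in expanded:
                expanded.append(candidate)
    return expanded


expanded_terms = expand_query("Meeting talk slides", TOY_SYNONYMS)
```

Terms with no entry in the table, such as "slides" here, pass through unchanged, so expansion only widens the query and never drops a term.
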
Information management
- MRML: going toward version 2.0
  - More multimedia
  - More like an XML protocol (as defined by W3C - XMLP)
  - Truly multimedia / multimodal
  ⇒ Spec proposal release mid-Feb
  ⇒ Expected validation software: this summer
- DEVA (annotation model)
  - Based on RDF and Dublin Core (XML)
  - DAML+OIL (OWL) compatible
  - Makes existing software available (Xerces, Jena, …)
  - Allows multiple extensions (WordNet, …)

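To give a feel for an RDF annotation that uses Dublin Core elements, here is a minimal sketch. It is not the DEVA schema: the choice of properties, the `describe` helper and the example values are assumptions. Only the two namespace URIs are the standard RDF and Dublin Core ones.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"

# Use the conventional prefixes when serialising.
ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("dc", DC_NS)


def describe(resource_url, title, creator, date):
    """Build an RDF/XML description of one resource with three DC elements."""
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    description = ET.SubElement(
        root,
        f"{{{RDF_NS}}}Description",
        {f"{{{RDF_NS}}}about": resource_url},
    )
    for name, value in (("title", title), ("creator", creator), ("date", date)):
        ET.SubElement(description, f"{{{DC_NS}}}{name}").text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical resource and values, for illustration only.
annotation_xml = describe(
    "http://example.org/meeting1.mpg", "Meeting 1", "Example Author", "2003-01-30"
)
```

Because the output is ordinary RDF/XML, generic XML and RDF tooling can parse it, which is the reuse argument made on the slide.
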
WP3: Initial work plan

WP3: Deliverables
- D3.1: Report on baseline information access methods
  - m12 (Feb 2003)
  - Technical doc of the working system in place
- D3.2: Report on methods for multimodal integration and NLP
  - m24 (Feb 2004)
  - Define an intuitive way of querying and retrieving meeting data
- D3.3: Final report on multimodal information access
  - m36 (Feb 2005)
  - Technical doc of the meeting manager

D3.1
- Gathered basic information
  - Group-based
    - Template sent by next week
  - Activity-based
    - Description of what you can contribute in one field
- Response by Feb 20th
  - Fill in where you feel is relevant
- Edited by end of Feb
  - Smoothed out gaps…
- Sent to Steve by mid-March

WP3: Issues
- Visual data is not usable alone
  - Need for text transcripts
  - Use of « external » data
- Need for a common format for data exchange
  - Annotation (explicit)
  - Processing results
- Increase collaboration
  - Integration

WP3 breakdown
- Year 1 (-> 03/2003)
  - Emphasis on multimedia information processing and retrieval
  - Image, Video: Visual + Motion
  - Audio (speech), Text
  - Framework: Architecture, integration
- Year 2 (-> 03/2004)
  - Emphasis on multimodal interaction (query processing)
  - Information from text, speech (text?), gesture, ...
  - Natural language processing
- Year 3 (-> 03/2005)
  - Emphasis on data summarisation
  - Video, dialogs, documents

????

The framework
[Architecture diagram. Client side: QBE query formulator (e.g. PHP interface), relevance feedback, existing tool with tool plugin (e.g. GIMP plugin), assessor plugins (e.g. Viper evaluation script), each behind an MRML layer and a socket. Link: MRML over TCP/IP, carrying queries and responses. CBIR server side: open socket, MRML layer, GIFT, feature extraction, PluginX, PluginY, …, features, multimedia feature storage, MRML logging, multimedia data URL abstraction, http server. Multimedia data: offline, or online (temporary local copy).]