Center for Digital Video Processing
CDVP & TRECVID-2003 News Story Segmentation Task
Csaba Czirjek, Gareth J.F. Jones, Seán Marlow, Noel Murphy, Noel E. O’Connor, Neil O’Hare, Alan F. Smeaton
TREC-2003 (Neil O’Hare)

Contents
• Introduction
  – Structure of News Broadcast
  – System Overview
• Story Segmentation System
  – Feature Extraction Process
  – Combination of Features using Support Vector Machine
  – Submitted Runs
• Results
• Conclusions

Structure of a News Broadcast
• We assume stories are delimited by shots of the anchorperson
• Features of anchor shots:
  – All anchor shots within a broadcast are taken from the same camera setup
  – They are filmed with a static camera, with little object motion
  – Anchor shots in a single broadcast are visually similar to each other

Structure of a News Broadcast (continued)
[Figure: diagram of a news broadcast with labels Commercial Break, Anchorperson Shots, News Report Shots]

System Overview
• We use the TRECVID 2003 common shot boundaries provided by CLIPS-IMAG
• Extracted features are combined to detect anchor shots
• Story boundaries are logged at the start of anchor shots
• The aim is to extract features that are robust to changes across broadcasters (e.g. faces, motion, shot length)
• This would give a generic news segmentation system

System Overview (continued)
[Figure: system block diagram. Labels: 30 Minute News Program; Shot Boundary Detection (donated by CLIPS-IMAG); Shot Level Feature Extraction: Shot Clustering, Face Detection, Motion Activity Analysis x 2, Shot Length, Text Segmentation (donated by StreamSage); Support Vector Machine; News Story Detection; News Stories]

Feature Extraction 1 - Shot Clustering
• Shots are clustered based on visual similarity (colour histogram)
• Anchor shots are grouped together
• Anchor clusters are identified using heuristics:
  – They tend to be dispersed throughout the broadcast
  – Their average shot length is longer than that of other clusters
  – Anchor shots are very similar to each other: they form ‘tighter’ clusters
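
The three anchor-cluster heuristics can be illustrated as per-cluster statistics. The following is only a sketch under stated assumptions: the slides do not give the distance metric or the decision rule, so the L1 histogram distance, the dictionary keys (`index`, `length`, `hist`) and the function names are all hypothetical.

```python
from itertools import combinations
from statistics import mean


def l1_distance(h1, h2):
    """L1 distance between two normalised colour histograms (assumed metric)."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


def cluster_stats(cluster, n_shots_in_broadcast):
    """Summarise one shot cluster with the three cues named on the slide.

    `cluster` is a list of dicts with hypothetical keys:
      'index'  - position of the shot in the broadcast (0-based)
      'length' - shot length in frames
      'hist'   - normalised colour histogram of the shot
    """
    indices = [shot["index"] for shot in cluster]
    # Dispersion: fraction of the broadcast spanned by the cluster's shots.
    dispersion = (max(indices) - min(indices)) / max(1, n_shots_in_broadcast - 1)
    # Average shot length in frames.
    avg_length = mean(shot["length"] for shot in cluster)
    # Tightness: mean pairwise histogram distance (smaller means tighter).
    pairs = list(combinations(cluster, 2))
    spread = mean(l1_distance(a["hist"], b["hist"]) for a, b in pairs) if pairs else 0.0
    return {"dispersion": dispersion, "avg_length": avg_length, "spread": spread}
```

A cluster with high dispersion, long average length and small spread would be a plausible anchor candidate; how the original system weighed these cues is not stated in the slides.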

Feature Extraction 2 - Face Detection
• Coarse-to-fine approach to extract candidate regions:
  – Skin-like pixels are identified based on colour
  – Morphological filtering is used to obtain smoothed areas of connected pixels
  – Shape and size heuristics eliminate candidate regions that are unlikely to be faces
• Remaining candidates are passed to a Principal Component Analysis (PCA) module for final classification
• Every 12th frame (I-frames) is used for processing

Face Detection
[Figure: face detection pipeline applied to every 12th frame of the original video file: skin filtering + morphological adjustment, then size/shape heuristics, then PCA (with a face database). Panels: filtered image after morphological adjustment; image after applying size/shape heuristics; detected faces with confidence scores (0.7, 0.8, 0.5, 0.2)]

Feature Extraction 3 - Activity Measure
• Motion activity analysis is based on MPEG-1 motion vectors
• Every P-frame is analysed
• We count the number of zero-length motion vectors in a P-frame (excluding I-blocks)
• Activity measure = (No. of zero-length vectors) / (Total No. of macroblocks)
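
The per-frame activity measure is a simple ratio, sketched below. The data layout is an assumption: each macroblock is an `(dx, dy)` tuple, and `None` stands for an intra-coded block (I-block), which has no motion vector and so is never counted as zero-length. The function name is hypothetical.

```python
def activity_measure(motion_vectors):
    """Fraction of macroblocks in one P-frame with a zero-length motion vector.

    `motion_vectors` holds one entry per macroblock: an (dx, dy) tuple, or
    None for an intra-coded block (I-block). I-blocks are excluded from the
    zero-vector count but still count towards the total number of macroblocks.
    """
    total = len(motion_vectors)
    if total == 0:
        return 0.0
    zero = sum(1 for mv in motion_vectors if mv is not None and mv == (0, 0))
    return zero / total
```

Under this definition a high value indicates a largely static frame, which matches the static-camera, low-motion character of anchor shots.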

Feature Extraction 3 - Activity Measure (continued)
• Two separate shot-level measures are used:
  – The least active P-frame is used to represent the shot
  – All motion vectors across a shot are added to form a cumulative motion vector; the activity measure is then calculated from the cumulative motion vector
• Cumulative frame example:

  frame a               frame b               frame a + frame b
   0,-1   0,1  -3,5      0,1   1,0  -2,4       0,0   1,1  -5,9
   0,0    0,0   4,3      3,0   0,0   0,0       3,0   0,0   4,3
  -2,1    1,-1  1,0     -2,1   0,1   0,1      -4,2   1,0   1,1
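
Both shot-level measures can be sketched in a few lines. Several points here are assumptions rather than details from the slides: "least active" is read as the P-frame with the highest zero-vector ratio, every frame is taken to have the same macroblock grid, I-blocks are ignored for simplicity, and all function names are hypothetical. The two small frames are illustrative 3x3 grids flattened row by row.

```python
def zero_ratio(vectors):
    """Fraction of motion vectors that are exactly (0, 0)."""
    return sum(1 for v in vectors if v == (0, 0)) / len(vectors)


def least_active_frame_measure(frames):
    """Represent the shot by its least active P-frame.

    Assumed interpretation: the least active frame is the one with the
    highest proportion of zero-length motion vectors.
    """
    return max(zero_ratio(frame) for frame in frames)


def cumulative_measure(frames):
    """Add the vectors of all P-frames macroblock by macroblock, then apply
    the activity measure to the resulting cumulative frame."""
    cumulative = [
        (sum(v[0] for v in block), sum(v[1] for v in block))
        for block in zip(*frames)
    ]
    return zero_ratio(cumulative)


# Two illustrative P-frames, each a 3x3 grid of (dx, dy) vectors.
frame_a = [(0, -1), (0, 1), (-3, 5), (0, 0), (0, 0), (4, 3), (-2, 1), (1, -1), (1, 0)]
frame_b = [(0, 1), (1, 0), (-2, 4), (3, 0), (0, 0), (0, 0), (-2, 1), (0, 1), (0, 1)]

print(least_active_frame_measure([frame_a, frame_b]))
print(cumulative_measure([frame_a, frame_b]))
```

Note that opposite vectors can cancel in the cumulative frame, so a macroblock that moves back and forth may still count as static over the whole shot.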

Feature Extraction 4 - Shot Length
• Shot length is used as a feature
• Measured in frames

Feature Extraction 5 - Text Analysis
• To allow us to complete the required runs, we used text analysis provided by StreamSage
• StreamSage text output is used as a binary feature

Combination of Features - SVM
• Extracted features are combined using a Support Vector Machine
• Trained on 10 hours of the TRECVID 2003 development set (5 CNN, 5 ABC)
• The resulting SVM classifier detects anchor shots
• Story boundaries are logged at the beginning of anchor shots
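
The last step, turning per-shot anchor decisions into story boundaries, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the trained SVM is replaced by a hypothetical threshold rule (`toy_classifier`), and the shot dictionary keys (`start`, `features`, `face`, `activity`) are assumptions made for the example.

```python
def story_boundaries(shots, is_anchor):
    """Return the start frame of every shot classified as an anchor shot.

    `shots` is a list of dicts with hypothetical keys 'start' (first frame
    of the shot) and 'features' (the per-shot feature values).
    `is_anchor` is any callable mapping a feature dict to True/False; in the
    described system this role is played by the trained SVM classifier.
    """
    return [shot["start"] for shot in shots if is_anchor(shot["features"])]


def toy_classifier(features):
    """Stand-in for the SVM: a hand-written rule, for illustration only."""
    return features["face"] > 0.5 and features["activity"] > 0.8


shots = [
    {"start": 0, "features": {"face": 0.9, "activity": 0.95}},
    {"start": 400, "features": {"face": 0.1, "activity": 0.30}},
    {"start": 900, "features": {"face": 0.8, "activity": 0.90}},
]

print(story_boundaries(shots, toy_classifier))
```

Passing the classifier in as a callable keeps the boundary logic independent of how anchor shots are detected, so a generic or broadcaster-specific model could be swapped in without changing it.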

Submitted Runs
• 3 Required Runs
  – A/V only: generic system for ABC and CNN (DCU03_REQ_AV)
  – A/V + text: generic system for ABC and CNN (DCU03_REQ_AV_TEXT)
  – Text only: text analysis provided by StreamSage (DCU03_REQ_TEXT_ONLY)
• 2 Additional Optional Runs
  – Specialised systems for ABC and CNN, with separate SVMs for each broadcaster (DCU03_OPT_AV)
  – Clustering algorithm in isolation (DCU03_OPT_CLUSTER)

DCU Results

System ID              Recall   Precision
DCU03_REQ_AV           0.328    0.409
DCU03_REQ_AV_TEXT      0.294    0.453
DCU03_REQ_TEXT_ONLY    0.049    0.208
DCU03_OPT_AV           0.313    0.453
DCU03_OPT_CLUSTER      0.364    0.304

Overall Results - All Groups
[Figure: scatter plot of Precision (y-axis, 0 to 1) against Recall (x-axis, 0 to 1) for all groups: DCU, Fudan, IBM, KDDI, NUS, StreamSage, UCF, Iowa]
