VIREO-TNO @ TRECVID 2014 Zero-Shot Event Detection and Recounting

Speaker: Maaike de Boer (TNO)
Yi-Jie Lu (1), Hao Zhang (1), Chong-Wah Ngo (1), Maaike de Boer (2), John Schavemaker (2), Klamer Schutte (2), Wessel Kraaij (2)
(1) VIREO Group, City University of Hong Kong, Hong Kong
(2) Netherlands Organization for Applied Scientific Research (TNO), Netherlands

Outline
– 0-Shot System
  • System Overview
  • Findings
– MER System
  • System Workflow
  • Results

Semantic Query Generation (SQG)
– Given an event query, SQG translates the query description into a representation of semantic concepts drawn from the concept bank.
– Example: for the event query "Attempting a Bike Trick", the semantic query lists relevant concepts with relevance scores:
  • Objects: Bike 0.60; Motorcycle 0.60; Mountain bike 0.60
  • Actions: Bike trick 1.00; Riding bike 0.62; Flipping bike 0.61
  • Scenes: Parking lot 0.01
– Concept bank sources pictured: Research Collection, ImageNet, UCF101, HMDB51, TRECVID SIN
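The exact-matching flavour of SQG can be illustrated with a minimal sketch. It assumes a concept matches when its name occurs word-for-word in the event description; the function names and the score (the share of query words covered by the concept name) are hypothetical placeholders, not the relevance scores shown on the slide.

```python
import re


def tokenize(text):
    """Lowercase a string and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def exact_match_concepts(event_description, concept_bank, top_k=None):
    """Return (concept, score) pairs for concepts named verbatim in the query.

    Assumption: the score is the fraction of query tokens covered by the
    concept name. This is a stand-in, not the scoring used by the authors.
    """
    query_tokens = tokenize(event_description)
    matches = []
    for concept in concept_bank:
        concept_tokens = tokenize(concept)
        n = len(concept_tokens)
        if n == 0:
            continue
        # Look for the concept name as a contiguous token run in the query.
        found = any(
            query_tokens[i:i + n] == concept_tokens
            for i in range(len(query_tokens) - n + 1)
        )
        if found:
            matches.append((concept, n / len(query_tokens)))
    # Highest score first; optionally keep only the top few concepts.
    matches.sort(key=lambda pair: pair[1], reverse=True)
    return matches if top_k is None else matches[:top_k]


bank = ["Bike", "Bike trick", "Parking lot", "Motorcycle"]
print(exact_match_concepts("Attempting a bike trick", bank))
```

The optional `top_k` argument mirrors the idea of retaining only the top few matched concepts.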

Concept Bank
– Research collection (497 concepts)
– ImageNet ILSVRC’12 (1000 concepts)
– SIN’14 (346 concepts)
– Sources pictured: Research Collection, ImageNet, UCF101, HMDB51, TRECVID SIN

Event Search
– Videos are ranked according to the semantic query (SQ) and the concept responses.
– Ranking score: $s = \sum_i q_i c_i$, where $q_i$ is the relevance score of concept $i$ in the semantic query and $c_i$ is the response of concept $i$.
– The semantic query for "Attempting a Bike Trick" serves as the example input; the output is a video ranking.
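A minimal sketch of this ranking step, reading the slide's score as the sum over query concepts of relevance score times concept response. The dictionary layout, the function names, and the response values are assumptions made for illustration.

```python
def event_score(semantic_query, concept_responses):
    """Weighted sum of concept responses over the concepts in the query.

    semantic_query: {concept name: relevance score}
    concept_responses: {concept name: detector response for one video}
    Concepts with no detector response contribute 0 (an assumption).
    """
    return sum(
        score * concept_responses.get(name, 0.0)
        for name, score in semantic_query.items()
    )


def rank_videos(semantic_query, responses_by_video):
    """Return (video id, score) pairs sorted from most to least relevant."""
    scored = [
        (video_id, event_score(semantic_query, responses))
        for video_id, responses in responses_by_video.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


query = {"bike trick": 1.0, "bike": 0.6, "parking lot": 0.01}
videos = {
    "v1": {"bike trick": 0.9, "bike": 0.8},   # made-up responses
    "v2": {"parking lot": 0.9, "bike": 0.1},  # made-up responses
}
for video_id, score in rank_videos(query, videos):
    print(video_id, round(score, 3))
```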

SQG Experiments
– Exact matching vs. WordNet/ConceptNet matching
– How many concepts are used to represent an event?
– To further improve the weighting:
  • TF-IDF
  • Term specificity

Exact matching vs. WordNet matching
[Figure: average precision per event ID for three series: WordNet, ExactMatching, and EM-TOP. EM-TOP is exact matching that retains only the top few concepts. The chart carries a "7%" annotation.]

Number of concepts used to represent an event
– The best MAP is reached by retaining only the top 8 concepts.
[Figure: mean average precision, MAP(all), against the top k concepts; x-axis ticks run from 1 to 26.]

Insights: Event 21, Attempting a bike trick
[Figure: average precision against the top k concepts. Concepts labeled in the plot: Paddle wheel, Trick, Wheel, Person riding, Jumping, Car wheel, Potter wheel.]

Insights: Event 31, Beekeeping
[Figure: average precision against the top k concepts. Concepts labeled in the plot: Bee house (ImageNet), Cutting (research collection), Cutting down tree (research collection), Bee (ImageNet), Honeycomb (ImageNet).]

Insights: Event 23, Dog show
[Figure: average precision against the top k concepts. Concepts labeled in the plot: Dog show (research collection), Brush dog (research collection).]

Improvements by TF-IDF and word specificity

Method                                       MAP (on MED14-Test)
Exact Matching Only                          0.0306
Exact Matching + TF                          0.0420
Exact Matching + TFIDF                       0.0495
Exact Matching + TFIDF + Word Specificity    0.0502

[Figure: bar chart of the same four MAP values, labeled EM Only, EM + TF, EM + TFIDF, EM + TFIDF + Spec.]
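The slides do not say how TF and IDF are computed, so the following is only a sketch of one textbook variant: TF is the number of times a matched concept term occurs in an event description, and IDF is log(N / df) over the N event descriptions. The function name and the toy data are hypothetical.

```python
import math
from collections import Counter


def tfidf_weights(matched_terms_by_event):
    """Compute TF-IDF weights for matched concept terms per event.

    matched_terms_by_event: {event name: list of matched terms, with repeats}
    Assumption: tf = raw count, idf = log(N / document frequency). Other
    variants (smoothing, normalisation) are equally plausible.
    """
    n_events = len(matched_terms_by_event)
    document_frequency = Counter()
    for terms in matched_terms_by_event.values():
        document_frequency.update(set(terms))

    weights = {}
    for event, terms in matched_terms_by_event.items():
        term_frequency = Counter(terms)
        weights[event] = {
            term: count * math.log(n_events / document_frequency[term])
            for term, count in term_frequency.items()
        }
    return weights


toy = {
    "bike_trick": ["bike", "bike", "trick", "person"],
    "dog_show": ["dog", "dog", "person"],
}
print(tfidf_weights(toy)["bike_trick"])
```

In this variant a term that appears in every event description, such as "person" in the toy data, receives weight 0, which is the intended effect of down-weighting unspecific concepts.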

Findings
1. Exact matching performs better than matching with WordNet and/or ConceptNet.
2. Performance improves further when only the top few exactly matched concepts are retained.
3. Adding both TF-IDF and word specificity increases performance.

Why would ontology-based mapping not work?
– A sample query in TRECVID 2009

Why would ontology-based mapping not work?
[Figure: ontology neighbourhood of the concept “dog” for the event Dog Show. Nodes shown: red wolf, kit fox, cat, horse, mammal, carnivore, animal. Source labels shown: ImageNet, SIN.]

Why would ConceptNet mapping not work?
[Figure: ConceptNet neighbourhood for the event Tailgating. Nodes shown: driver, tailgating, car, engine, food, bus, helmet, parking lot, team uniform, portable shelter. One edge is labeled "desires".]

Findings
– It is difficult to harness ontology-based mapping while constraining the mapping by event context.

In the Ad-Hoc event “Extinguishing a Fire”
– Key concepts are missing from the concept bank:
  • Fire extinguisher
  • Firefighter

Findings
– It is reasonable to scale up the number of concepts, thus increasing the chance of exact matching.

MED14-Eval-Full Results
– PS 000Ex
  • Automatic semantic query generation and search
  • Fusion of 0-Shot and OCR system
  • Achieves a MAP of 5.2
– AH 000Ex
  • Same system as in PS 000Ex
  • Achieves a MAP of 2.6
  • Performance drops due to the lack of key concepts

MER System
In algorithm design, we aim to optimize:
– Concept-to-event relevancy
  • First, we require that candidate shots are relevant to the event.
  • Second, we do concept-to-shot alignment.
– Evidence diversity
  • In concept-to-shot alignment, we recount each shot with a unique concept, different from those of the other shots.
– Viewing time of evidential shots
  • Select only the three most confident shots as key evidence.
  • Each shot is about 5 seconds long.

Key Evidence Localization
1. Extract keyframes uniformly.
2. Apply the concept detectors from the concept bank to obtain concept responses.
3. Choose the keyframes that are most relevant to the event. All concepts in the semantic query are taken into account by calculating the weighted sum $s = \sum_i w_i r_i$, where $w_i$ appears to be the weight of concept $i$ in the semantic query and $r_i$ its response on the keyframe.
4. Expand keyframes to shots.
5. The top 3 shots are selected as key evidence.
6. The rest are non-key evidence.
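The localization steps can be sketched as follows. The slides do not specify how a keyframe is expanded to a shot, so this sketch assumes a fixed window of about 5 seconds centred on the keyframe; the function names, the data layout, and the response values are likewise assumptions, and overlapping shots are not merged.

```python
def keyframe_relevance(weights, responses):
    """Weighted sum of concept responses for one keyframe."""
    return sum(w * responses.get(concept, 0.0) for concept, w in weights.items())


def localize_key_evidence(keyframes, weights, shot_length=5.0, num_key=3):
    """Split keyframe-centred shots into key and non-key evidence.

    keyframes: list of (timestamp in seconds, {concept: response})
    Assumption: each shot is a fixed-length window around its keyframe.
    Returns (key_shots, non_key_shots), both ordered by descending score.
    """
    shots = []
    for timestamp, responses in keyframes:
        start = max(0.0, timestamp - shot_length / 2)
        shots.append({
            "start": start,
            "end": start + shot_length,
            "score": keyframe_relevance(weights, responses),
        })
    shots.sort(key=lambda shot: shot["score"], reverse=True)
    return shots[:num_key], shots[num_key:]


weights = {"bike trick": 1.0, "bike": 0.6}
keyframes = [
    (0.0, {"bike": 0.5}),                      # made-up responses
    (10.0, {"bike trick": 0.9}),
    (20.0, {"bike trick": 0.2, "bike": 0.5}),
    (30.0, {}),
]
key, non_key = localize_key_evidence(keyframes, weights)
print([shot["start"] for shot in key], len(non_key))
```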

Concept-to-Shot Alignment
– The top concept in the key evidence is selected as the representative concept.
– We choose a unique concept for each shot.
[Figure: the semantic query for the bike-trick event (Objects: Bike, Motorcycle, Mountain bike; Actions: Bike trick, Riding bike, Flipping bike; Scenes: Parking lot) aligned with shots marked Key or Non-Key. Shot labels shown: Riding bike, Bike trick, Bike.]
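One simple way to give each shot a unique representative concept is a greedy pass over the shots in order of confidence, which is what this sketch does. The greedy strategy, the scoring by weight times response, and all names and values are assumptions; the slides only state that the top concept is chosen and that concepts are unique per shot.

```python
def align_concepts(shots, weights):
    """Assign each shot its best concept that no earlier shot has used.

    shots: list of {concept: response}, ordered from most to least confident
    weights: {concept: weight in the semantic query}
    Returns one concept name per shot, or None if no unused concept fires.
    """
    used = set()
    labels = []
    for responses in shots:
        candidates = [
            (weights[concept] * response, concept)
            for concept, response in responses.items()
            if concept in weights and concept not in used
        ]
        if candidates:
            best = max(candidates)[1]  # highest weighted response wins
            used.add(best)
            labels.append(best)
        else:
            labels.append(None)
    return labels


weights = {"bike trick": 1.0, "riding bike": 0.62, "bike": 0.6}
shots = [
    {"bike trick": 0.9, "bike": 0.8},          # made-up responses
    {"bike trick": 0.8, "riding bike": 0.7},
    {"bike trick": 0.7, "bike": 0.9},
]
print(align_concepts(shots, weights))
```

Without the uniqueness constraint all three shots would be labeled "bike trick"; with it, later shots fall back to their next-best concept, which is how evidence diversity is enforced here.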

MER14 Results: percentage of "strongly agree"
[Figure: two bar charts with axis ticks from 0% to 30%. (a) Evidence quality, bars in the order Team2, VIREO, Team4, Team3, Team6, Team1, Team5. (b) Event query quality, bars in the order VIREO, Team1, Team2, Team3, Team4, Team6, Team5.]

MER14 Results: percentage of both "agree" and "strongly agree"
[Figure: two bar charts; axis ticks run to 70% on one panel and 90% on the other. (a) Evidence quality, bars in the order Team2, VIREO, Team4, Team1, Team6, Team5, Team3. (b) Event query quality, bars in the order Team1, Team2, Team3, VIREO, Team4, Team5, Team6.]

Summary: 0-Shot System
– Simple exact matching performs the best.
– The quality of the concepts selected to represent an event is more important than their quantity.
– How to harness ontology-based mapping remains an open problem.

Summary: MER System
– In key evidence localization, we emphasize event relevancy first, then the hot concepts.
– We recommend three shots as key evidence, each about 5 seconds long.

Thanks!
