Fine-grained Video-Text Retrieval with Hierarchical Graph Reasoning
Shizhe Chen 1, Yida Zhao 1, Qin Jin 1, Qi Wu 2
1 Renmin University of China, 2 University of Adelaide
Video-Text Cross-modal Retrieval
• Task: using sentences to retrieve videos
• Sentences contain richer and more structured details than keywords
Motivation
• Understanding fine-grained semantics in the query sentence
• Hierarchical sentence structure
  o Event
  o Actions, with action-action relationships
  o Entities, with action-entity relationships
• Fine-grained local components & how they compose to the event
• Limitations of previous works
  o Global matching (one vector): hard to capture fine-grained details
  o Local matching (word level): cannot express complex relationships among words
The Proposed Method
• Hierarchical Graph Reasoning Model (HGR)
  o Hierarchical Textual Encoding
  o Hierarchical Video Encoding
  o Multi-level Video-Text Matching
Hierarchical Textual Encoding
• Semantic role graph with event, action and entity nodes
• Node initialization: contextual word embeddings (with max pooling)
• Attention-based graph reasoning
  o Capture interactive context via an attentive relational GCN
  o Factorize the relational matrix to reduce parameters (see the sketch below)
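To make the factorized relational GCN concrete, here is a minimal PyTorch sketch of one reasoning layer. The factorization W_r ≈ W · diag(v_r) (one shared matrix plus a per-relation scaling vector), and all names and shapes, are illustrative assumptions, not the released HGR code.

```python
import torch
import torch.nn as nn

class FactorizedRelationalGCN(nn.Module):
    """One attention-based relational GCN layer (illustrative sketch).

    A full relational GCN keeps a separate D x D matrix per relation type.
    Here each relation owns only a D-dim scaling vector v_r, and a single
    D x D matrix W is shared, i.e. W_r ~ W * diag(v_r), cutting parameters
    from O(R * D^2) to O(R * D + D^2). This factorization is an assumption,
    not necessarily the authors' exact formulation.
    """

    def __init__(self, dim, num_relations):
        super().__init__()
        self.rel_scale = nn.Embedding(num_relations, dim)  # v_r per relation
        self.shared = nn.Linear(dim, dim, bias=False)      # shared W

    def forward(self, nodes, adj, rel):
        # nodes: (N, D) node embeddings (event / action / entity nodes)
        # adj:   (N, N) 0/1 mask, adj[i, j] = 1 if node j sends to node i
        # rel:   (N, N) long tensor of semantic-role relation ids per edge
        msg = self.shared(nodes.unsqueeze(0) * self.rel_scale(rel))   # (N, N, D)
        # each node attends over its incoming, role-transformed messages
        scores = (msg * nodes.unsqueeze(1)).sum(-1) / nodes.size(-1) ** 0.5
        att = torch.softmax(scores.masked_fill(adj == 0, float('-inf')), dim=1)
        att = torch.nan_to_num(att)       # nodes with no in-edges get zeros
        return nodes + (att.unsqueeze(-1) * msg).sum(dim=1)          # residual
```

Stacking a few such layers lets entity nodes pick up verb context and vice versa, which is the "interactive context" referred to above.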
Hierarchical Video Encoding
• Videos contain multiple aspects: objects, actions, events
• Challenging to parse videos directly as we parse texts: that would require object detection, tracking, action segmentation, etc.
• Instead, learn different frame weights for each level
  o Use each level of the text hierarchy as guidance to learn diverse video representations (see the sketch below)
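The per-level weighting can be sketched as separate attention-pooling heads over the same frame features; the text-side matching loss is what pushes each head toward its level. Module and parameter names below are assumptions for illustration, not the released code.

```python
import torch
import torch.nn as nn

class LevelwiseVideoEncoder(nn.Module):
    """Sketch: one attention-pooled video embedding per semantic level.

    Each level (event / action / entity) owns its own attention scorer and
    projection, so the same frame features yield three different summaries.
    """

    def __init__(self, frame_dim, embed_dim, num_levels=3):
        super().__init__()
        self.attn = nn.ModuleList([nn.Linear(frame_dim, 1) for _ in range(num_levels)])
        self.proj = nn.ModuleList([nn.Linear(frame_dim, embed_dim) for _ in range(num_levels)])

    def forward(self, frames, mask):
        # frames: (B, T, frame_dim) frame features; mask: (B, T) 1 for real frames
        outs = []
        for attn, proj in zip(self.attn, self.proj):
            scores = attn(frames).squeeze(-1).masked_fill(mask == 0, float('-inf'))
            w = torch.softmax(scores, dim=1)                 # (B, T) frame weights
            outs.append(proj((w.unsqueeze(-1) * frames).sum(dim=1)))
        return outs  # [event, action, entity] embeddings, each (B, embed_dim)
```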
Multi-level Cross-modal Matching
• Multi-level fusion
  o Event level: global matching (cosine similarity)
  o Action & entity levels: local matching (weakly supervised attentive alignment)
• Training objective: contrastive ranking loss (see the sketch below)
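The contrastive ranking loss is a standard bidirectional max-margin objective over the fused similarity matrix; a sketch with hardest in-batch negatives is below. The margin value and the hardest-negative choice are assumptions, not necessarily the paper's exact setting.

```python
import torch

def contrastive_ranking_loss(sim, margin=0.2):
    """Bidirectional max-margin ranking loss over a similarity matrix.

    sim[i, j] is the fused similarity between video i and sentence j;
    the diagonal holds the positive pairs.
    """
    pos = sim.diag().view(-1, 1)                     # (B, 1) positive scores
    cost_s = (margin + sim - pos).clamp(min=0)       # video -> wrong sentence
    cost_v = (margin + sim - pos.t()).clamp(min=0)   # sentence -> wrong video
    eye = torch.eye(sim.size(0), dtype=torch.bool, device=sim.device)
    cost_s = cost_s.masked_fill(eye, 0)              # ignore the positives
    cost_v = cost_v.masked_fill(eye, 0)
    # hardest in-batch negative per row / column
    return cost_s.max(dim=1)[0].mean() + cost_v.max(dim=0)[0].mean()
```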
Experimental Settings
• Datasets

  Dataset        Train   Validation    Test   # sent/video
  MSR-VTT         6573          497    2990   20
  TGIF           79451        10651   11310   1
  VATEX          25991         1500    1500   10
  Youtube2Text       -            -     670   41.5

• Evaluation metrics
  o R@K, K = {1, 5, 10}
  o MedR (median rank) & MnR (mean rank)
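For reference, these metrics can be computed from a query-by-gallery similarity matrix as sketched below, assuming one relevant item per query sitting on the diagonal (which holds for the text-to-video direction here).

```python
import numpy as np

def retrieval_metrics(sim):
    """Compute R@{1,5,10}, median rank (MedR) and mean rank (MnR).

    sim[i, j]: similarity of query i to gallery item j, ground truth
    on the diagonal. Ranks are 1-indexed (rank 1 = perfect retrieval).
    """
    order = np.argsort(-sim, axis=1)  # gallery indices, best match first
    ranks = np.array([np.where(order[i] == i)[0][0]
                      for i in range(len(sim))]) + 1
    return {
        **{f"R@{k}": float((ranks <= k).mean() * 100) for k in (1, 5, 10)},
        "MedR": float(np.median(ranks)),
        "MnR": float(ranks.mean()),
    }
```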
Experimental Results
• In-domain cross-modal retrieval
  o The HGR model achieves consistent improvements on all three datasets
[Results table: MSR-VTT dataset]
Experimental Results
• In-domain cross-modal retrieval: ablation study
  o Textual encoding: graph attention & semantic role awareness
  o Video encoding: different video weights at each level
[Ablation table: MSR-VTT dataset]
Experimental Results
• Cross-dataset video-text retrieval
  o Train on MSR-VTT, test on Youtube2Text
  o The HGR model also generalizes better
[Tables: in-domain vs. cross-dataset results]
Experimental Results
• Fine-grained binary selection
  o Evaluates a model's fine-grained textual discrimination ability
  o Better performance, especially on incomplete events
• Example pairs
  o positive: "a man is cutting pizza." / negative: "pizza is cutting a man."
  o positive: "a dog hits a man's hands with its paws while standing." / negative: "a dog hits a man's hands."
Conclusion
• Contributions
  o Decompose videos and texts at the event, action and entity levels for multi-level cross-modal matching
  o Utilize attention-based graph reasoning on the textual semantic role graph to generate hierarchical embeddings
  o Results on in-domain, cross-dataset and fine-grained binary selection experiments demonstrate the advantages of our model
• Future work
  o Improve video encoding with multiple modalities and fine-grained spatial-temporal information
Code is released at: https://github.com/cshizhe/hgr_v2t