Introduction, Joining Audio and Visual Information, Data and Evaluation, Results and Analysis, Conclusion
Leveraging Multimodal LDA for Hyperlinking
Anca Roxana Simon, Ronan Sicre, Rémi Bois, Guillaume Gravier, Pascale Sébillot
IRISA – France

1. Introduction

Hyperlinking: Linking video fragments
For machines and for humans
◮ “Advanced tasks” (e.g., video summarization)
◮ Media workers, companies (e.g., analytics)
◮ Generic user (e.g., recommendation)
For machines
◮ Near-duplicates (can be used for clustering or automatic summarization)
◮ Fragments that are part of the timeline (i.e., related events that happened just before or just after)
For humans
◮ Diverse targets to cover the potential interests of the user

2. Joining Audio and Visual Information

Latent Dirichlet Allocation
The idea
◮ Latent topics are extracted from a collection
◮ A document is represented by its topic probabilities
◮ Topic distributions can be compared
◮ Documents that do not share vocabulary can still have a high similarity
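To illustrate the last point, here is a minimal sketch of comparing two documents through their topic distributions. The topic proportions and word sets are hypothetical toy values, not output from the model in these slides, and cosine similarity is only one possible choice of measure.

```python
import math

def cosine(p, q):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0

# Hypothetical documents: A and B share no word at all.
doc_a_words = {"concert", "singer", "stage"}
doc_b_words = {"band", "guitar", "album"}

# Hypothetical topic proportions over 3 latent topics (toy values).
doc_a_topics = [0.80, 0.15, 0.05]
doc_b_topics = [0.75, 0.05, 0.20]
doc_c_topics = [0.05, 0.90, 0.05]

shared = doc_a_words & doc_b_words          # empty set: no common vocabulary
sim_ab = cosine(doc_a_topics, doc_b_topics)  # high: same dominant topic
sim_ac = cosine(doc_a_topics, doc_c_topics)  # low: different dominant topics
```

Documents A and B have an empty word overlap, yet their topic vectors are close because both are dominated by the same latent topic.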

Building two modalities jointly
Using both audio and visual information
◮ Idea: from comparable documents in two languages, build topics in both languages jointly
◮ We treat audio and visual information as two different languages and build cross-modality topics
◮ For each visual topic, there is a corresponding audio topic

Examples of mappings between audio and visual
Most probable words from topic no. 3 in our model:
Audio: love, home, feel, day, life, baby, made, thing, la
Visual: singer, microphone, sax, concert, master-of-ceremonies, cornet, flute, trombone, banjo
Most probable words from topic no. 25 in our model:
Audio: years, technology, computer, find, key, future, power, machine, speed, science
Visual: equipment, machine, tape-player, computer, appliance-recording, memory-tape, CD-player

From visual to audio
Objective
By learning this mapping, we can apply the usual topic similarities (i.e., audio → audio or visual → visual). We can also apply cross-modality similarities (i.e., audio → visual or visual → audio).
New kinds of links
Cross-modality similarities correspond to:
◮ Seeing more about what is said
◮ Hearing more about what is shown
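A cross-modality comparison of this kind can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: topic proportions are estimated with a simple EM fold-in against fixed per-topic word distributions, the function name `fold_in` is hypothetical, and the probabilities are toy values (only the words are borrowed from the topic examples). Because topics are paired across modalities, index k means the same topic on both sides, so an audio document and a visual document can be compared directly in topic space.

```python
import math

def cosine(p, q):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0

def fold_in(words, topic_word, iters=50, eps=1e-6):
    """Estimate topic proportions for a bag of words, keeping the
    per-topic word distributions fixed (simple EM fold-in)."""
    k = len(topic_word)
    theta = [1.0 / k] * k
    if not words:
        return theta
    for _ in range(iters):
        new = [0.0] * k
        for w in words:
            # Responsibility of each topic for this word occurrence;
            # eps stands in for words a topic has never seen.
            resp = [theta[t] * topic_word[t].get(w, eps) for t in range(k)]
            z = sum(resp)
            for t in range(k):
                new[t] += resp[t] / z
        theta = [v / len(words) for v in new]
    return theta

# Toy paired topics (probabilities are made up): index 0 and index 1
# refer to the same topic in both modalities.
audio_topics = [
    {"love": 0.3, "home": 0.2, "feel": 0.2, "life": 0.2, "baby": 0.1},
    {"technology": 0.3, "computer": 0.3, "future": 0.2, "machine": 0.2},
]
visual_topics = [
    {"singer": 0.3, "microphone": 0.3, "concert": 0.2, "sax": 0.2},
    {"equipment": 0.3, "machine": 0.3, "computer": 0.2, "CD-player": 0.2},
]

# Audio -> visual: the anchor is described by its transcript, the
# candidate targets by their detected visual concepts.
anchor_transcript = ["love", "life", "feel", "baby"]
target_concepts = {
    "concert_clip": ["singer", "microphone", "concert"],
    "lab_clip": ["equipment", "computer", "machine"],
}

anchor_theta = fold_in(anchor_transcript, audio_topics)
scores = {
    name: cosine(anchor_theta, fold_in(concepts, visual_topics))
    for name, concepts in target_concepts.items()
}
best = max(scores, key=scores.get)
```

Here the transcript and the concert clip share no vocabulary (spoken words versus visual concept labels), but both load on the same paired topic, so the concert clip ranks first.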

3. Data and Evaluation

Our system
What we used
◮ Automatic transcriptions from LIMSI
◮ Visual concepts from Leuven
Method and reranking
Run1: Visual similarity (no topics) with visual reranking (top 50)
Run2: Audio to visual with visual reranking (top 50)
Run3: Visual to audio with n-gram reranking (top 50)
Run4: Rank aggregation
Reranking
◮ CNN trained on ImageNet ILSVRC (VGG-16) for visual reranking
◮ Unigram, bigram and trigram cosine similarity for n-gram reranking
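The n-gram reranking step might look roughly like the sketch below. The slides do not say how the unigram, bigram and trigram similarities are combined, so pooling all three orders into a single count vector is an assumption here, as are the whitespace tokenization and the names `ngram_counts` and `rerank`.

```python
import math
from collections import Counter

def ngram_counts(tokens, max_n=3):
    """Count all unigrams, bigrams and trigrams in a token list."""
    counts = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(v * b[k] for k, v in a.items() if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def rerank(anchor_text, candidates, top_k=50):
    """Reorder the first top_k candidates by n-gram cosine similarity
    to the anchor; candidates beyond top_k keep their original order.
    candidates: list of (target_id, transcript) in initial rank order."""
    anchor = ngram_counts(anchor_text.lower().split())
    head, tail = candidates[:top_k], candidates[top_k:]
    rescored = sorted(
        head,
        key=lambda c: cosine(anchor, ngram_counts(c[1].lower().split())),
        reverse=True,
    )
    return rescored + tail

# Hypothetical transcripts, in the order returned by the first pass.
candidates = [
    ("t1", "football results from this weekend"),
    ("t2", "the lisbon treaty was rejected by ireland"),
    ("t3", "weather forecast for ireland"),
]
ranked = rerank("ireland votes no to the lisbon treaty", candidates)
ranked_ids = [target_id for target_id, _ in ranked]
```

The target sharing several unigrams plus the bigram and trigram around "lisbon treaty" moves to the top, the one sharing a single word comes second, and the unrelated one drops to last.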

4. Results and Analysis

Near-median scores but hard to compare
         Minimum  25%    50%    75%    Maximum
Prec 10  0.017    0.198  0.275  0.524  0.608
Run   Prec 10
Run1  0.207
Run2  0.017
Run3  0.224
Run4  0.156
Table: Results for our four runs
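Assuming "Prec 10" denotes precision at rank 10, the score can be sketched as below: the fraction of the top 10 proposed targets judged relevant, averaged over anchors. The function names and the toy judgments are illustrative only and are not taken from the evaluation.

```python
def precision_at_k(ranked_targets, relevant, k=10):
    """Fraction of the top-k ranked targets that are judged relevant."""
    top = ranked_targets[:k]
    return sum(1 for t in top if t in relevant) / k

def mean_precision_at_k(rankings, judgments, k=10):
    """Average precision at k over all anchors.
    rankings:  {anchor_id: [target_id, ...]} in rank order
    judgments: {anchor_id: set of relevant target_ids}"""
    scores = [
        precision_at_k(targets, judgments.get(anchor, set()), k)
        for anchor, targets in rankings.items()
    ]
    return sum(scores) / len(scores) if scores else 0.0

# Toy example: two anchors with ten proposed targets each.
rankings = {
    "anchor_a": [f"t{i}" for i in range(1, 11)],
    "anchor_b": [f"u{i}" for i in range(1, 11)],
}
judgments = {
    "anchor_a": {"t1", "t4", "t9"},   # 3 relevant targets in the top 10
    "anchor_b": {"u2"},               # 1 relevant target in the top 10
}
score_a = precision_at_k(rankings["anchor_a"], judgments["anchor_a"])
mean_score = mean_precision_at_k(rankings, judgments)
```

With three relevant targets out of ten for the first anchor and one out of ten for the second, the mean over both anchors is 0.2.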

Some of our relevant targets (Run3)
Anchor 85
◮ Talks about Ireland saying “No” to the Lisbon Treaty
◮ Europe is not happy; Mandelson (UK politician) is blamed by Nicolas Sarkozy, but Gordon Brown supports Mandelson
Target 3
◮ Almost identical content (another news show 3 hours later)
Target 8
◮ Explains the successive difficulties of the EU in ratifying treaties
◮ Focuses on times when referendums were used as opposed to parliamentary ratification

Some of our non-relevant targets
Anchor 85
◮ Talks about Ireland saying “No” to the Lisbon Treaty
◮ Europe is not happy; Mandelson (UK politician) is blamed by Nicolas Sarkozy, but Gordon Brown supports him
Target 6
◮ The UK Parliament debates the answer that should be given to Ireland: push for another referendum or do not pressure them
◮ Gordon Brown is in favor of pressuring them while the opposition calls for inaction

Suggestions for the evaluation
What we think
◮ Almost identical targets should be identified
◮ There should be several Turkers per anchor/target pair
What we know
◮ There would be a low inter-annotator agreement
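With several judgments per anchor/target pair, agreement could be quantified. The slides name no measure; as one possibility, here is a minimal sketch of Cohen's kappa for two annotators, with hypothetical relevance labels (1 = relevant, 0 = not relevant).

```python
from collections import Counter

def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: share of items given the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each annotator's label frequencies.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    expected = sum(
        count_a[label] * count_b[label] for label in set(count_a) | set(count_b)
    ) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label
    return (observed - expected) / (1.0 - expected)

# Hypothetical judgments from two Turkers on eight anchor/target pairs.
turker_1 = [1, 1, 0, 0, 1, 0, 1, 0]
turker_2 = [1, 0, 0, 1, 1, 0, 0, 0]
kappa = cohen_kappa(turker_1, turker_2)
```

In this toy case the two Turkers agree on 5 of 8 pairs, but half of that agreement is expected by chance, which gives a kappa of 0.25.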

5. Conclusion

Strengths and weaknesses
Strengths
◮ Brings more diversity
◮ A new way to exploit cross-modality
◮ More control over link creation
Weaknesses
◮ Works badly on some anchors (e.g., visual → audio on anchors showing an anchorman)

Push the community for more diversity
