Sequence to Sequence Video to Text Subhashini Venugopalan, Marcus - PowerPoint PPT Presentation

Jul 03, 2023


Sequence to Sequence Video to Text
Subhashini Venugopalan, Marcus Rohrbach, Jeff Donahue, Raymond Mooney, Trevor Darrell, Kate Saenko
Outline
● Objective
● Experimental Setup
● Current model
● A Simple Extension
● How is information distributed within the video?
● Does the model capture temporal information?
● Conclusions & Future Work
Objective
● Generate video descriptions.
Experimental Setup
● Code: forked from the author's GitHub account
● Frame sampling: 1 in 10 (unless otherwise mentioned)
● Network architecture: VGG CNN + 2-layer LSTM
● Dataset: MSVD YouTube dataset (average length 10.2 s, 41 sentences per video)
● Vocabulary: MSVD + MPII-MD + MVAD
● Performance metric: METEOR
● Evaluation tool: coco_evaluation
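The "1 in 10" frame sampling can be illustrated with a minimal sketch. The function name `sample_frames` and the 30 frames-per-second rate are assumptions made for illustration; the slides give only the sampling ratio and the average clip length.

```python
def sample_frames(frames, step=10):
    """Keep one frame out of every `step` frames, starting with the first.

    `frames` can be any sequence (file paths, arrays, indices).
    `step=10` mirrors the "1 in 10" setting from the slides.
    """
    if step < 1:
        raise ValueError("step must be a positive integer")
    return list(frames[::step])


# Assumption: 30 frames per second, so an average 10.2 s clip has 306 frames.
frame_indices = list(range(306))
sampled = sample_frames(frame_indices, step=10)
print(len(sampled), sampled[:3])
```

Under that assumed frame rate, an average clip is reduced to a few dozen frames before they are passed to the CNN.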
Forward Model
● Able to learn abstract attributes such as "young" to a reasonable extent.
● Able to capture the main content of the video in most cases.
● PROBLEMS: Long sentences repeat words multiple times, which lowers sentence quality:
- The boys are playing with a group of a group of a group of people is sitting on a group of a group of people are watching a gym
- A woman is cutting a piece of a piece of a pair of a pair of a pair.
- A man is cutting a large of a large large large large floor.
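The repetition problem can be made measurable with a small helper that counts repeated word n-grams in a generated sentence. This is only an illustrative diagnostic: `repeated_ngrams` is a hypothetical name, and the slides do not say that any such check was used.

```python
import re
from collections import Counter


def repeated_ngrams(sentence, n=3):
    """Return every word n-gram that occurs more than once, with its count."""
    tokens = re.findall(r"\w+", sentence.lower())
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    counts = Counter(ngrams)
    return {" ".join(gram): c for gram, c in counts.items() if c > 1}


# One of the forward-model outputs quoted in the slides.
caption = "A woman is cutting a piece of a piece of a pair of a pair of a pair."
print(repeated_ngrams(caption))
```

A caption with no repeated trigram, such as "Two boys are dancing.", yields an empty dictionary.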
Backward Model
● Processes frames in reverse order.
● Seems to perform better than the forward model on the validation set, but performs almost the same on the test set.
● How to choose the best backward model?
Bidirectional Model
● Motivated by bidirectional n-gram models used for language modelling in NLP.
● Combines the forward and backward models. Open questions:
- How do we select the forward and backward models?
- What combining strategy should be used?
- How are the weights selected?
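One possible combining strategy, shown here only as a sketch, is a weighted sum of the sentence log-probabilities assigned by the forward and backward models. The slides do not state which strategy was used; the function names, the single `weight` parameter, and the scores below are all invented placeholders for illustration.

```python
def reverse_frames(frames):
    """Input for the backward model: the same frames in reverse order."""
    return list(reversed(frames))


def combine_scores(forward_logprob, backward_logprob, weight=0.5):
    """Weighted sum of two log-probabilities (hypothetical combining rule)."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return weight * forward_logprob + (1.0 - weight) * backward_logprob


def pick_caption(candidates, weight=0.5):
    """candidates: list of (caption, forward_logprob, backward_logprob)."""
    best = max(candidates, key=lambda c: combine_scores(c[1], c[2], weight))
    return best[0]


# Placeholder scores, not results from the slides.
candidates = [
    ("The boys are playing.", -4.0, -5.0),
    ("Two boys are dancing.", -6.0, -3.5),
]
print(pick_caption(candidates, weight=0.5))
print(pick_caption(candidates, weight=0.2))
```

In a setup like this, `weight` would typically be tuned on the validation set, which is one way to approach the weight-selection question raised above.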
Your description?
FORWARD: The boys are playing with a group of a group of a group of people is sitting on a group of a group of people are watching a gym !!
BACKWARD: Two boys are dancing.
BIDIRECTIONAL: The boys are playing.
LABEL: Three men are dancing in beach towels.
This example shows the utility of the bidirectional model.
Your description?
FORWARD: A man is using a piece of a sharp.
BACKWARD: A person is cutting a piece of a brush.
BIDIRECTIONAL: A man is cutting a piece of a brush.
LABEL: A person is performing some card tricks.
All fail :(
How is information distributed within the video?
Conjecture: For most videos, the central part contains more relevant information than the frames at the beginning and end.
Does the Model Capture Temporal Information?
Conclusions
● The bidirectional model is more powerful than the forward or backward model alone.
● Frames at the start and end contain less information.
Future Work
● Try combining the bidirectional model with an optical flow model.
● Try Gaussian sampling centred on the video's centre.
● Is the approach better suited to specific kinds of videos, such as generating sports commentary?
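The Gaussian sampling idea could look roughly like the following sketch, which draws frame indices from a normal distribution centred on the middle of the video. The function name, the `sigma_fraction` parameter, and its default value are assumptions; the slides propose the idea without specifying any of these details.

```python
import random


def gaussian_frame_indices(num_frames, num_samples, sigma_fraction=0.2, seed=0):
    """Sample distinct frame indices, concentrated around the video's centre.

    sigma_fraction (hypothetical) sets the standard deviation as a fraction
    of the video length; smaller values focus more tightly on the centre.
    """
    if num_frames < 1 or sigma_fraction <= 0:
        raise ValueError("need num_frames >= 1 and sigma_fraction > 0")
    rng = random.Random(seed)          # fixed seed keeps the sketch reproducible
    centre = (num_frames - 1) / 2.0
    sigma = sigma_fraction * num_frames
    target = min(num_samples, num_frames)
    chosen = set()
    while len(chosen) < target:
        idx = int(round(rng.gauss(centre, sigma)))
        if 0 <= idx < num_frames:      # discard draws outside the video
            chosen.add(idx)
    return sorted(chosen)              # restore temporal order


indices = gaussian_frame_indices(num_frames=306, num_samples=30)
print(indices)
```

Compared with uniform 1-in-10 sampling, this keeps roughly the same number of frames but takes more of them from the middle of the clip, in line with the conjecture that the start and end carry less information.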
References
● Subhashini Venugopalan, Marcus Rohrbach, Jeff Donahue, Raymond Mooney, Trevor Darrell, Kate Saenko. Sequence to Sequence Video to Text.
Thank You :)
