Scene Graph Parsing as Dependency Parsing
Authors: Yu-Siang Wang, Chenxi Liu, Xiaohui Zeng, Alan Yuille
Conference: North American Chapter of the Association for Computational Linguistics, 2018

Outline
● Introduction
● Method
● Experiments
● Conclusion

Introduction
● Many multimodal tasks fit into this picture
Figure: “A young boy wearing black shirt is in front of a goal”; Intermediate Representation

Image Generation from Text
Figure: “A young boy wearing black shirt is in front of a goal”; Intermediate Representation

Image Captioning
Figure: “A young boy wearing black shirt is in front of a goal”; Intermediate Representation

Image Retrieval
Figure: “A young boy wearing black shirt is in front of a goal”; Intermediate Representation

Neural Network Embedding
● Neural network embeddings are often used as the intermediate representation
● Pro: easy training; similarity with cosine distance
● Con: no explicit structure; no easy interpretability
Figure: “A young boy wearing black shirt is in front of a goal” next to an embedding of numbers (1.2, -1.3, 4.6, …, -3.7; 2.3, -2.2, -2.6, …, 5.3; 3.8, -7.4, -5.9, …, -3.2)
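As a small illustration of the "similarity with cosine distance" point, the sketch below computes cosine similarity between embedding vectors in plain Python. The three vectors are hypothetical values invented for this example; they are not taken from the slide or from any trained model.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical 3-dimensional embeddings, invented for illustration only.
caption_vec = [1.0, 2.0, 0.0]
image_vec = [2.0, 4.0, 0.0]   # points in the same direction as caption_vec
other_vec = [0.0, 0.0, 3.0]   # orthogonal to caption_vec

sim_same = cosine_similarity(caption_vec, image_vec)    # close to 1
sim_other = cosine_similarity(caption_vec, other_vec)   # close to 0
```

The score depends only on direction, which makes comparison easy, but it says nothing about which objects or relations the two inputs share. That is the "no explicit structure" drawback listed on the slide.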

Scene Graph
● More recently, people have started exploring a more explainable representation
● Has 3 types of nodes: object, attribute, relation
Figure: scene graph for “A young boy wearing black shirt is in front of a goal”
Ref: Johnson et al., Image Retrieval Using Scene Graphs, CVPR 2015
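One possible way to hold the three node types in code is sketched below. The class name `SceneGraph` and its field layout are assumptions made for this example, and the graph contents are reconstructed from the example sentence rather than copied from the slide's figure.

```python
from dataclasses import dataclass, field

@dataclass
class SceneGraph:
    # Hypothetical container; the field names are illustrative only.
    objects: list = field(default_factory=list)      # object nodes
    attributes: list = field(default_factory=list)   # (object, attribute) pairs
    relations: list = field(default_factory=list)    # (subject, relation, object) triples

    def node_count(self):
        # Each attribute and each relation is a node of its own,
        # in addition to the object nodes.
        return len(self.objects) + len(self.attributes) + len(self.relations)

# Assumed graph for "A young boy wearing black shirt is in front of a goal".
graph = SceneGraph(
    objects=["boy", "shirt", "goal"],
    attributes=[("boy", "young"), ("shirt", "black")],
    relations=[("boy", "wearing", "shirt"), ("boy", "in front of", "goal")],
)
```

Unlike an embedding vector, every entry here can be read off directly, which is the sense in which the representation is more explainable.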

Our Goal
● Parsing from sentence to scene graph (i.e., scene graph parsing)
Figure: “A young boy wearing black shirt is in front of a goal” and its scene graph

Previous Work: Separated Two-stage
Figure: “a young boy wearing black shirt is in front of a man”; stage labels: Standard Dependency Parsing; Heuristic rules; Simple classifier
Ref: Anderson et al., SPICE: Semantic Propositional Image Caption Evaluation, ECCV 2016

Our Work: End-to-end One-stage
Figure: “a young boy wearing black shirt is in front of a man”; labels: Customized Dependency Parsing; Equivalent
Ref: Anderson et al., SPICE: Semantic Propositional Image Caption Evaluation, ECCV 2016

Method

Scene Graph
Figure: Node-centric View

Pushing Labels from Node to Arc
Figure: Node-centric View and Edge-centric View, marked Equivalent. Arc types: Object node to attribute node; Object node to relation node; Relation node to object node
● Different colors are different arc labels
● Under the edge-centric view, scene graphs begin to look like dependency parses

Review of Dependency Parsing
1. Get a Corpus!
2. Define a Label Space! (NSUBJ, NMOD, CASE, DET, ...)
3. Pick a System (e.g. Arc-Hybrid) and its Actions! (LEFT, RIGHT, SHIFT, ...)

How do we do Scene Graph Parsing?
1. Get a Corpus! ?
2. Define a Label Space! ?
3. Pick a System (e.g. Arc-Hybrid) and its Actions! ?

Visual Genome
● In Visual Genome, every image is annotated with 30 regions on average
● Every region is annotated with a (region) description and a (region) scene graph
Figure: region descriptions “A kid is sitting on the ground” (scene graph: kid sit on ground) and “A young boy wearing black shirt is in front of a goal”
Ref: Krishna et al., Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations, IJCV 2017

Alignment Strategy
● To mimic a dependency parsing training corpus, we need alignment between nodes in the scene graph and words in the sentence
● We propose a two-round alignment strategy:
  ○ Within each round, object, attribute, relation nodes are aligned in this order
  ○ First round is more “conservative” (word-by-word match)
  ○ Second round is more “aggressive” (synonyms match)
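The two-round idea can be sketched roughly as follows. This is a simplified illustration, not the authors' procedure. It assumes single-word node names, lets each word be claimed by at most one node, and takes the first unclaimed match from left to right. `SYNONYMS` is a hypothetical lookup table standing in for whatever synonym resource is actually used, and treating "sitting" as a match for "sit" is an assumption made only for this example.

```python
# Alignment order within each round: object, attribute, relation.
ORDER = {"object": 0, "attribute": 1, "relation": 2}

def align(words, nodes, synonyms):
    """Return {node_index: word_index} after two alignment rounds."""
    alignment = {}
    used_words = set()
    by_type = sorted(range(len(nodes)), key=lambda i: ORDER[nodes[i][0]])

    def run_round(matches):
        for i in by_type:
            if i in alignment:          # already aligned in an earlier round
                continue
            name = nodes[i][1]
            for j, word in enumerate(words):
                if j not in used_words and matches(name, word):
                    alignment[i] = j
                    used_words.add(j)
                    break

    # Round 1 ("conservative"): exact word match.
    run_round(lambda name, word: name == word)
    # Round 2 ("aggressive"): match through the synonym table.
    run_round(lambda name, word: word in synonyms.get(name, ()))
    return alignment

words = "a kid is sitting on the ground".split()
nodes = [("relation", "sit"), ("object", "kid"), ("object", "ground")]
SYNONYMS = {"sit": {"sitting"}}          # hypothetical entries
result = align(words, nodes, SYNONYMS)
```

Here "kid" and "ground" are aligned in the first round, and "sit" is only picked up in the second round through the table. Running the conservative round first keeps the looser matching from claiming words that an exact match could have used.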

Alignments made in Round 1
Figure: a young boy wearing black shirt is in front of a goal ROOT

Alignments made in Round 2
Figure: a young boy wearing black shirt is in front of a goal ROOT

Alignment Result
Figure: a young boy wearing black shirt is in front of a goal ROOT

How do we do Scene Graph Parsing?
1. Get a Corpus!
2. Define a Label Space! ?
3. Pick a System (e.g. Arc-Hybrid) and its Actions! ?

Regular Labels
1. ATTR: Object to Attribute
2. SUBJ: Object to Relation
3. OBJT: Relation to Object
Figure: a young boy wearing black shirt is in front of a goal ROOT, with arcs labeled SUBJ, OBJT, SUBJ, ATTR, OBJT, ATTR
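The three regular labels can be read as a direct rewrite of node-centric annotations into labeled arcs. The sketch below is one way to express that rewrite. The function name `to_labeled_arcs` and the tuple layout are assumptions for this example, the input graph is reconstructed from the example sentence, and the multi-word relation "in front of" is kept as a single node name rather than being split into words.

```python
def to_labeled_arcs(attributes, relations):
    """Turn node-centric annotations into (head, dependent, label) arcs."""
    arcs = []
    for obj, attr in attributes:
        arcs.append((obj, attr, "ATTR"))    # object -> attribute
    for subj, rel, obj in relations:
        arcs.append((subj, rel, "SUBJ"))    # object -> relation
        arcs.append((rel, obj, "OBJT"))     # relation -> object
    return arcs

# Assumed annotations for "a young boy wearing black shirt is in front of a goal".
attributes = [("boy", "young"), ("shirt", "black")]
relations = [("boy", "wearing", "shirt"), ("boy", "in front of", "goal")]
arcs = to_labeled_arcs(attributes, relations)
```

This yields two arcs of each label, which agrees with the six arc labels listed for the example sentence on this slide.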

Auxiliary Labels
1. ATTR: Object to Attribute
2. SUBJ: Object to Relation
3. OBJT: Relation to Object
4. CONT: Phrase
5. BEGN: ROOT to Obj without Head
Figure: a young boy wearing black shirt is in front of a goal ROOT, with arcs labeled BEGN, SUBJ, OBJT, SUBJ, ATTR, OBJT, CONT, CONT, ATTR

How do we do Scene Graph Parsing?
1. Get a Corpus!
2. Define a Label Space! (BEGN, SUBJ, OBJT, CONT, ATTR)
3. Pick a System (e.g. Arc-Hybrid) and its Actions! ?

Transition-Based Arc-Hybrid System
Ref: Kuhlmann et al., Dynamic programming algorithms for transition-based dependency parsers, ACL 2011

Augmented Arc-Hybrid
● We augment Arc-Hybrid with one more action, REDUCE
● This is because we don’t require every word to have a head (e.g. “is”)
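A minimal sketch of the augmented system is given below, assuming the usual arc-hybrid definitions of SHIFT, LEFT and RIGHT and reading REDUCE as "pop the stack top without adding an arc". The function name `apply_action` and the list-based state are choices made for this example. The action list replays the first seven actions from the deck's step table (steps 0 to 6); no later actions are guessed.

```python
def apply_action(stack, buffer, arcs, action, label=None):
    """Apply one transition in place; arcs holds (head, dependent, label)."""
    if action == "SHIFT":        # move the front of the buffer onto the stack
        stack.append(buffer.pop(0))
    elif action == "LEFT":       # buffer front becomes head of the stack top
        arcs.append((buffer[0], stack.pop(), label))
    elif action == "RIGHT":      # second stack item becomes head of the stack top
        dependent = stack.pop()
        arcs.append((stack[-1], dependent, label))
    elif action == "REDUCE":     # drop the stack top without adding an arc
        stack.pop()
    else:
        raise ValueError(f"unknown action: {action}")

stack, arcs = [], []
buffer = "a young boy wearing black shirt is in front of a goal ROOT".split()

# Steps 0 to 6 as listed in the deck's step table.
slide_actions = [("SHIFT", None), ("REDUCE", None), ("SHIFT", None),
                 ("LEFT", "ATTR"), ("SHIFT", None), ("SHIFT", None),
                 ("SHIFT", None)]
for action, label in slide_actions:
    apply_action(stack, buffer, arcs, action, label)
```

After these seven actions the word "a" has been discarded by REDUCE without receiving a head, and the only arc built so far is boy → young with label ATTR. Without REDUCE, every word would eventually need a head before it could leave the stack, which does not suit words such as "is" that play no part in the scene graph.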

How do we do Scene Graph Parsing?
1. Get a Corpus!
2. Define a Label Space! (BEGN, SUBJ, OBJT, CONT, ATTR)
3. Define Actions in a System (e.g. Arc-Hybrid)! (LEFT, RIGHT, SHIFT, REDUCE)

Detailed Architecture
1. Initialization
2. Predict the next action to take
Step | Stack | Buffer | Action
0 | (empty) | a young boy wearing black shirt is in front of a goal ROOT | SHIFT
1 | a | young boy wearing black shirt is in front of a goal ROOT | REDUCE
2 | (empty) | young boy wearing black shirt is in front of a goal ROOT | SHIFT
3 | young | boy wearing black shirt is in front of a goal ROOT | LEFT(ATTR)
4 | (empty) | boy wearing black shirt is in front of a goal ROOT | SHIFT
Ref: Kiperwasser and Goldberg, Simple and Accurate Dependency Parsing Using Bidirectional LSTM Feature Representations, TACL 2016

Detailed Architecture
Figure: BiLSTM over the words “a young boy wearing black shirt is in front of a goal ROOT”, with 2 fully connected layers on top, shown with steps 0 to 4 of the action table
Ref: Kiperwasser and Goldberg, Simple and Accurate Dependency Parsing Using Bidirectional LSTM Feature Representations, TACL 2016

Step | Stack | Buffer | Action
0 | (empty) | a young boy wearing black shirt is in front of a goal ROOT | SHIFT
1 | a | young boy wearing black shirt is in front of a goal ROOT | REDUCE
2 | (empty) | young boy wearing black shirt is in front of a goal ROOT | SHIFT
3 | young | boy wearing black shirt is in front of a goal ROOT | LEFT(ATTR)
4 | (empty) | boy wearing black shirt is in front of a goal ROOT | SHIFT
5 | boy | wearing black shirt is in front of a goal ROOT | SHIFT
6 | boy wearing | black shirt is in front of a goal ROOT | SHIFT
Figure: a young boy wearing black shirt is in front of a goal ROOT, with the ATTR arc drawn from step 3 onward
