A Fast and Accurate One-Stage Approach to Visual Grounding
Zhengyuan Yang, Boqing Gong, Liwei Wang, Wenbing Huang, Dong Yu, Jiebo Luo
Presenter: Tianlang Chen
Visual grounding
• Grounding a language query onto a region of the image
– Phrase localization
– Referring expression comprehension
• Example query: bottom right grass
Existing framework
• Two-stage framework: first generate region candidates, then rank them against the query
• Example query: center building
Existing framework
• Performance is capped by the quality of the region candidates
• Slow inference
One-stage visual grounding
• One-stage approach: no region candidate generation
• Generally applicable to the sub-tasks of grounding
Why one-stage visual grounding
• No region candidates → 7–20% higher accuracy
• One-stage pipeline → 10× faster
Architecture overview
• Encoder
• Fusion module
• Grounding module
Architecture: Encoder
• Visual encoder: DarkNet-53 + FPN
• Language encoder: BERT, LSTM, or FV
• Spatial encoder: normalized coordinate features, for location-related queries (see the sketch below)
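A minimal sketch of one plausible spatial encoding: an 8-channel map of normalized cell coordinates attached to every position of the feature grid. The channel layout and the helper name spatial_coords are illustrative assumptions, not the authors' exact code.

```python
import torch

def spatial_coords(h, w):
    """Build an 8-channel map of normalized coordinates for an h x w grid:
    (x_min, y_min, x_max, y_max, x_center, y_center, cell_w, cell_h),
    one vector per spatial position of the feature map."""
    ys = torch.arange(h, dtype=torch.float32)
    xs = torch.arange(w, dtype=torch.float32)
    gy, gx = torch.meshgrid(ys, xs, indexing="ij")
    x_min, y_min = gx / w, gy / h
    x_max, y_max = (gx + 1) / w, (gy + 1) / h
    x_c, y_c = (gx + 0.5) / w, (gy + 0.5) / h
    cell_w = torch.full((h, w), 1.0 / w)
    cell_h = torch.full((h, w), 1.0 / h)
    return torch.stack(
        [x_min, y_min, x_max, y_max, x_c, y_c, cell_w, cell_h], dim=0
    )  # shape: (8, h, w)
```

Encoding coordinates explicitly lets the model resolve queries like "bottom right grass" that depend on position rather than appearance.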
Architecture: Fusion module
• Image-level fusion
– Performed at multiple resolutions
– Concatenates three parts of input features: visual, language, spatial (see the sketch below)
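One plausible implementation of the image-level fusion at a single resolution: the sentence embedding is broadcast over the spatial grid, concatenated with the visual and spatial feature maps, and mixed by a 1×1 convolution. The channel sizes and the module name FusionBlock are assumptions for illustration.

```python
import torch
import torch.nn as nn

class FusionBlock(nn.Module):
    """Fuse visual, language, and spatial features at one resolution."""

    def __init__(self, vis_ch=512, lang_dim=768, spat_ch=8, out_ch=512):
        super().__init__()
        self.mix = nn.Sequential(
            nn.Conv2d(vis_ch + lang_dim + spat_ch, out_ch, kernel_size=1),
            nn.BatchNorm2d(out_ch),
            nn.LeakyReLU(0.1),
        )

    def forward(self, vis, lang, spat):
        # vis:  (B, vis_ch, H, W)   visual feature map
        # lang: (B, lang_dim)       sentence embedding
        # spat: (B, spat_ch, H, W)  normalized-coordinate map
        b, _, h, w = vis.shape
        # Broadcast the sentence embedding to every spatial position.
        lang_map = lang[:, :, None, None].expand(b, lang.size(1), h, w)
        return self.mix(torch.cat([vis, lang_map, spat], dim=1))
```

A 1×1 convolution keeps the fusion cheap, so running it at multiple FPN resolutions adds little overhead.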
Architecture: Grounding module
• Output format: bounding box + confidence score
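The grounding module can be sketched as a dense detection head: a convolution predicts, at every grid cell and anchor, four box offsets plus one confidence score, and the single highest-confidence box is taken as the grounded region at test time. The anchor count, layer width, and class name GroundingHead below are assumptions, not the released implementation.

```python
import torch
import torch.nn as nn

class GroundingHead(nn.Module):
    """Predict (tx, ty, tw, th, conf) per anchor per cell; the top-scoring
    box over all cells and anchors is the grounded region."""

    def __init__(self, in_ch=512, num_anchors=3):
        super().__init__()
        self.num_anchors = num_anchors
        # 5 outputs per anchor: 4 box offsets + 1 confidence score.
        self.head = nn.Conv2d(in_ch, num_anchors * 5, kernel_size=1)

    def forward(self, fused):
        b, _, h, w = fused.shape
        out = self.head(fused).view(b, self.num_anchors, 5, h, w)
        boxes = out[:, :, :4]               # (B, A, 4, H, W) box offsets
        conf = out[:, :, 4].reshape(b, -1)  # (B, A*H*W) confidence scores
        best = conf.argmax(dim=1)           # index of top box per image
        return boxes, conf, best
```

Because grounding returns exactly one region per query, no proposal ranking or non-maximum suppression is needed; an argmax over the confidence map suffices.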
Datasets
• Phrase localization: Flickr30K Entities
• Referring expression comprehension: ReferItGame
• Example query: the black backpack on the bottom right
Comparison to other methods
Qualitative results
• Reasons for improvement
– Union of multiple objects
– Stuff as opposed to things
– Challenging regions
(Figure: ground truth vs. two-stage prediction vs. ours)
A Fast and Accurate One-Stage Approach to Visual Grounding
Code & models: https://github.com/zyang-ur/onestage_grounding
Poster: #26
Contact: zyang39@cs.rochester.edu