
A Fast and Accurate One-Stage Approach to Visual Grounding



  1. A Fast and Accurate One-Stage Approach to Visual Grounding • Authors: Zhengyuan Yang, Boqing Gong, Liwei Wang, Wenbing Huang, Dong Yu, Jiebo Luo • Presenter: Tianlang Chen

  2. Visual grounding • Grounding a language query onto a region of the image – Phrase localization – Referring expression comprehension • Example query: bottom right grass

  3. Existing framework • Two-stage framework: generate region candidates, then rank them against the query • Example query: center building

  4. Existing framework • Accuracy is capped by the quality of the region candidates • Slow inference

  5. One-stage visual grounding • One-stage approach • Generally applicable to the sub-tasks of grounding

  6. Why one-stage visual grounding • No region candidates -> 7–20% higher accuracy • One-stage -> 10x faster

  7. Architecture overview • Encoder • Fusion module • Grounding module

  8. Architecture • Encoder • Fusion module • Grounding module • Visual encoder: DarkNet53+FPN • Language encoder: BERT, LSTM, FV • Spatial encoder: for location-related queries

  9. Architecture • Encoder • Fusion module • Grounding module • Image-level fusion – Multiple resolutions – Three parts of input features

  10. Architecture • Encoder • Fusion module • Grounding module • Output format: box + confidence

  11. Datasets • Phrase localization: Flickr 30K Entities • Referring expression comprehension: ReferItGame • Example query: the black backpack on the bottom right

  12. Comparison to other methods

  13. Qualitative results • Figure legend: ground truth (gt), two-stage prediction, our prediction • Reasons for improvement – Union of multiple objects – Stuff as opposed to things – Challenging regions

  14. A Fast and Accurate One-Stage Approach to Visual Grounding • Code & models: https://github.com/zyang-ur/onestage_grounding • Poster: #26 • Contact: zyang39@cs.rochester.edu
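
The architecture slides (encoder, fusion module, grounding module) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the 8-value coordinate encoding, the single feature resolution, and the "pick the highest-confidence box" rule are all assumptions made for illustration. It shows only how three feature parts (visual, text, spatial) might be concatenated at every image location and how a "box + confidence" output could be reduced to one prediction.

```python
# Illustrative sketch only. All names, shapes, and the selection rule below
# are assumptions, not taken from the released code.

def spatial_features(width, height):
    """Build a height x width grid of normalized coordinate features.

    Assumed encoding per cell: top-left corner, center, bottom-right corner,
    and cell size, all normalized to [0, 1].
    """
    grid = []
    for j in range(height):
        row = []
        for i in range(width):
            row.append([
                i / width, j / height,                  # top-left corner
                (i + 0.5) / width, (j + 0.5) / height,  # cell center
                (i + 1) / width, (j + 1) / height,      # bottom-right corner
                1 / width, 1 / height,                  # cell size
            ])
        grid.append(row)
    return grid


def fuse(visual, text, spatial):
    """Concatenate the three feature parts at every grid cell.

    visual:  H x W x Dv nested list (one resolution of the visual encoder)
    text:    flat list of length Dt, broadcast to every cell
    spatial: H x W x Ds nested list from spatial_features()
    """
    return [
        [v_cell + text + s_cell for v_cell, s_cell in zip(v_row, s_row)]
        for v_row, s_row in zip(visual, spatial)
    ]


def pick_box(predictions):
    """Return the (box, confidence) pair with the highest confidence.

    predictions: list of ((x1, y1, x2, y2), confidence) tuples.
    """
    return max(predictions, key=lambda p: p[1])


# Toy example: a 2 x 2 feature map with 2 visual channels and a 3-value query embedding.
visual = [[[0.1, 0.2], [0.3, 0.4]], [[0.5, 0.6], [0.7, 0.8]]]
text = [1.0, 0.0, -1.0]
fused = fuse(visual, text, spatial_features(2, 2))
best_box, best_conf = pick_box([((0, 0, 10, 10), 0.2), ((5, 5, 20, 20), 0.9)])
```

In this toy example each fused cell holds 2 + 3 + 8 = 13 values. With multiple resolutions, the same fusion would be repeated once per feature map; in a real model a learned prediction head, not a hand-written list, would produce the boxes and confidences.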
