Deep Sketch Hashing: Fast Free-hand Sketch-Based Image Retrieval [CVPR '17]
Paper presentation, 2018-11-01, Taeun Hwang (황태운)
CS688: Web-Scale Image Retrieval
Review
● SuBiC: A Supervised, Structured Binary Code for Image Search [ICCV 2017], presented by Huisu Yun
● Raw feature vectors are very long; SuBiC compresses them into binary codes
● Nominal code length in SuBiC: KM (M one-hot blocks of size K)
● Actual storage can easily be reduced to M log_2 K bits (store one index per block)
● One-hot code blocks allow distance computation with only M additions (worked through in the sketch below)
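To make the storage and distance arithmetic concrete, here is a minimal NumPy sketch; M, K, and all data are illustrative assumptions, not SuBiC's trained pipeline:

```python
import numpy as np

# A minimal sketch of SuBiC-style block codes (assumed parameters):
# M one-hot blocks of size K each.
M, K = 8, 256                        # 8 blocks, 256 entries per block

# Naive one-hot storage: M*K bits; compact storage: M indices of log2(K) bits.
naive_bits = M * K                   # 2048 bits
compact_bits = int(M * np.log2(K))   # 64 bits

# Database entry: one active index per block.
db_code = np.random.randint(0, K, size=M)

# Query side stays continuous: a score table of shape (M, K).
query_scores = np.random.randn(M, K)

# Asymmetric scoring needs only M table lookups and M additions,
# one per block.
score = sum(query_scores[m, db_code[m]] for m in range(M))
print(naive_bits, compact_bits, score)
```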
Contents
● Introduction
● Main Idea
● Method
● Experiments & Results
Introduction
Introduction
● Sketch-Based Image Retrieval (SBIR): image retrieval given free-hand sketch queries
[Figure: illustration of SBIR]
Challenges in SBIR
● Geometric distortion between sketches and natural images (e.g., backgrounds, varying viewpoints)
[Figure: sketch vs. natural image]
● Search efficiency: most SBIR techniques rely on nearest-neighbor (NN) search over real-valued features, with computational complexity O(Nd); inappropriate for large-scale SBIR
Main Idea
● Geometric distortion: diminish it using "sketch-tokens"
● Speed: embed sketches and natural images into two sets of compact binary codes
● In large-scale SBIR, this greatly reduces the heavy continuous-valued distance computation (see the back-of-envelope sketch below)
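A back-of-envelope contrast between the two regimes; N, d, and the code length are assumed values, not figures from the paper:

```python
import numpy as np

# Continuous NN search costs O(N*d) float operations over raw features;
# compact binary codes shrink both storage and per-item distance cost.
N, d, bits = 1_000_000, 4096, 64

float_storage_gb = N * d * 4 / 1e9        # 32-bit floats
code_storage_mb = N * bits / 8 / 1e6      # packed binary codes

print(f"raw features: {float_storage_gb:.1f} GB, "
      f"{bits}-bit codes: {code_storage_mb:.1f} MB")
# Distance per item also drops: d multiply-adds vs. bits/64 XOR+popcounts.
```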
DSH: Method
Deep Sketch Hashing (DSH): Fast Free-hand Sketch-Based Image Retrieval
Sketch tokens: background
● Sketch Tokens: A Learned Mid-level Representation for Contour and Object Detection [J. J. Lim et al., CVPR '13]
● Sketch-tokens: hand-drawn contours in images, learned as a mid-level representation
Sketch tokens: background
● Sketch-tokens have stroke patterns and appearance similar to free-hand sketches
● They reflect only the essential edges of natural images, without detailed texture information
● In this work: used to diminish the geometric distortion between sketches and real images
Network structure
● Inputs of DSH: natural images, their sketch-tokens, and free-hand sketches
Network structure
● Semi-heterogeneous deep architecture
● Discrete binary code learning
[Figure: semi-heterogeneous deep architecture]
Network structure
● C1-Net (CNN) for natural images
● C2-Net (CNN) for sketches and sketch-tokens
Semi-heterogeneous Deep Architecture
● Cross-weight late-fusion net
Semi-heterogeneous Deep Architecture
● Cross-weight late-fusion net: connect the last pooling and FC layers with cross-weights [S. Rastegar et al., CVPR '16]
● This maximizes the mutual information across both modalities, while the information from each individual net is also preserved
Semi-heterogeneous Deep Architecture
● Cross-weight late-fusion net: late-fuse C1-Net and C2-Net (Middle) into a unified binary coding layer, hash_C1
● The learned codes can thus fully benefit from both natural images and their corresponding sketch-tokens (a simplified sketch follows)
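A simplified, hypothetical PyTorch sketch of cross-weight late fusion; the layer sizes, module names, and the additive fusion are my assumptions, not the paper's exact architecture:

```python
import torch
import torch.nn as nn

# Features from the image net (C1) and the sketch-token net (C2, Middle)
# are linearly mapped into each other's space and fused into one binary
# coding layer (all dimensions are assumed for illustration).
class CrossWeightFusion(nn.Module):
    def __init__(self, dim_c1=512, dim_c2=512, code_len=64):
        super().__init__()
        self.c1_to_c2 = nn.Linear(dim_c1, dim_c2)            # cross weights
        self.c2_to_c1 = nn.Linear(dim_c2, dim_c1)            # cross weights
        self.hash_c1 = nn.Linear(dim_c1 + dim_c2, code_len)  # fused coding layer

    def forward(self, feat_c1, feat_c2):
        # Each modality is enriched with information mapped from the other,
        # so the fused code benefits from images and sketch-tokens alike.
        f1 = feat_c1 + self.c2_to_c1(feat_c2)
        f2 = feat_c2 + self.c1_to_c2(feat_c1)
        return torch.tanh(self.hash_c1(torch.cat([f1, f2], dim=1)))

fused = CrossWeightFusion()(torch.randn(4, 512), torch.randn(4, 512))
print(fused.shape)  # (4, 64); sign(fused) would give the image codes B^I
```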
Semi-heterogeneous Deep Architecture
● Shared-weight sketch net
Semi-heterogeneous Deep Architecture
● Shared-weight sketch net: Siamese architecture for C2-Net (Top) and C2-Net (Middle)
● This exploits the similar characteristics and implicit correlations between sketch-tokens and free-hand sketches
Semi-heterogeneous Deep Architecture
● Binary coding layer hash_C2 produces the hash codes of free-hand sketches
● The learned shared-weight net decreases the geometric difference between images and sketches during SBIR (see the minimal sketch below)
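A minimal sketch of the shared-weight (Siamese) idea; the conv stack is a stand-in for C2-Net, and all sizes are assumptions:

```python
import torch
import torch.nn as nn

# One network instance encodes both free-hand sketches and sketch-tokens,
# so the two branches share every parameter by construction.
c2_net = nn.Sequential(
    nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(64, 64),  # binary coding layer (hash_C2 stand-in)
)

sketch = torch.randn(4, 1, 200, 200)   # free-hand sketches
tokens = torch.randn(4, 1, 200, 200)   # sketch-tokens of natural images

# Calling the same module twice is exactly weight sharing (Siamese).
code_sketch = torch.tanh(c2_net(sketch))
code_tokens = torch.tanh(c2_net(tokens))
print(code_sketch.shape, code_tokens.shape)
```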
Semi-heterogeneous Deep Architecture
● Result: deep hash functions
  B^S = sign(F_2(A))
  B^I = sign(F_1(B, C))
● A = weights of C2-Net (Top), applied to sketches
● B, C = weights of C2-Net (Middle) and C1-Net, applied to sketch-tokens and natural images
Discrete binary code learning
● There are two loss functions:
● Cross-view pairwise loss
● Semantic factorization loss
Discrete binary code learning
● Cross-view pairwise loss: a cross-view similarity matrix encodes whether a sketch and a natural image belong to the same category
● The binary codes of natural images and sketches from the same category are pulled as close as possible (and pushed far apart otherwise); a hedged sketch of such a loss follows
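One common contrastive form of a cross-view pairwise loss, as a hedged sketch; the paper's exact formulation may differ, and all names and the margin value are assumptions:

```python
import torch

# Same-category sketch/image code pairs are pulled together; different-
# category pairs are pushed beyond a margin m (assumed value).
def cross_view_pairwise_loss(b_img, b_skt, W, m=2.0):
    # b_img: (N, L) relaxed image codes, b_skt: (N, L) relaxed sketch codes
    # W: (N, N) cross-view similarity, 1 if same category else 0
    d = torch.cdist(b_img, b_skt)                      # pairwise distances
    pos = W * d.pow(2)                                 # pull similar pairs
    neg = (1 - W) * torch.clamp(m - d, min=0).pow(2)   # push dissimilar pairs
    return (pos + neg).mean()

loss = cross_view_pairwise_loss(torch.randn(8, 64), torch.randn(8, 64),
                                (torch.rand(8, 8) > 0.5).float())
print(loss)
```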
Discrete binary code learning
● Semantic factorization loss: a word-embedding model is applied to the label matrix Y
● This preserves the intra-set semantic relationships for both the image set and the sketch set
● Using Word2Vec, the semantic distances between labels are taken into account
Discrete binary code learning
● Semantic factorization loss, example: the semantic embedding of "cheetah" will be closer to "tiger" but further from "dolphin" (see the toy step below)
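A toy factorization step under assumed shapes: binary codes B should reconstruct the label word embeddings E through a dictionary D (E ≈ D B); with B fixed, D has a closed-form least-squares update:

```python
import numpy as np

# Shapes and the exact objective are assumptions for illustration.
rng = np.random.default_rng(0)
L, emb_dim, N = 64, 300, 100              # code length, Word2Vec dim, samples

E = rng.standard_normal((emb_dim, N))     # label embeddings, column per sample
B = np.sign(rng.standard_normal((L, N)))  # current binary codes in {-1, +1}

# With B fixed, D = E B^T (B B^T)^{-1} (small ridge added for stability).
D = E @ B.T @ np.linalg.inv(B @ B.T + 1e-6 * np.eye(L))
print(np.linalg.norm(E - D @ B))          # reconstruction error to minimize
```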
Discrete binary code learning
● Final objective function = cross-view pairwise loss + semantic factorization loss
Optimization (training)
● The objective function is non-convex and non-smooth; with the binary constraints it is in general NP-hard
● Solution: sequentially update the parameters (alternating optimization, sketched below)
● Parameters: D, B^I, B^S, and the deep hash functions F_1, F_2
[Figure: the DSH alternating optimization scheme]
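A hedged skeleton of such an alternating scheme, restricted to the D/B subproblems on toy data; the deep-network updates for F_1, F_2 are omitted, and the sign-based B-step is a common heuristic, not necessarily the paper's solver:

```python
import numpy as np

# Fix everything but one variable, update it in closed form or by a
# discrete heuristic, and repeat.
rng = np.random.default_rng(0)
L, emb_dim, N = 32, 50, 200
E = rng.standard_normal((emb_dim, N))       # label embeddings
B = np.sign(rng.standard_normal((L, N)))    # codes (one shared set, for a toy)

for it in range(10):
    # D-step: closed-form least squares with B fixed.
    D = E @ B.T @ np.linalg.inv(B @ B.T + 1e-6 * np.eye(L))
    # B-step: discrete update with D fixed; sign() keeps the binary constraint.
    B = np.sign(D.T @ E)
    B[B == 0] = 1                            # avoid zero entries
    print(it, np.linalg.norm(E - D @ B))     # track the objective
```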
Test
● Given a sketch query, compute B^S = sign(F_2(A))
● Compare its Hamming distance with the B^I codes in the retrieval database (see the sketch below)
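A minimal sketch of query time under assumed shapes: hash the sketch query, then rank database images by Hamming distance between binary codes:

```python
import numpy as np

rng = np.random.default_rng(0)
bits, N = 64, 5000

B_I = rng.integers(0, 2, size=(N, bits), dtype=np.uint8)  # image codes (DB)
b_S = rng.integers(0, 2, size=bits, dtype=np.uint8)       # sketch query code

# Hamming distance = popcount of XOR over packed bit strings
# (unpackbits is fine for illustration; real systems use hardware popcount).
packed_db = np.packbits(B_I, axis=1)
packed_q = np.packbits(b_S)
hamming = np.unpackbits(np.bitwise_xor(packed_db, packed_q), axis=1).sum(axis=1)

top20 = np.argsort(hamming)[:20]   # indices of the 20 nearest images
print(top20)
```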
Result
Experiments
● Datasets: TU-Berlin Extension, Sketchy
● All images have relatively complex backgrounds
[Figure: top-20 retrieval results (red box: false positive)]
Result
● Comparison with other SBIR methods
End