Delving Deep into Computer Vision
Caner Hazirbas
Machine Learning Meetup #1
Delving Deep into Computer Vision
• FlowNet
• FuseNet
• PoseLSTM
• DDFF
FlowNet
[Figure: FlowNetSimple architecture, conv1 through conv6 (7x7, 5x5 and 3x3 kernels, 64 up to 1024 channels) on a 384x512 input, followed by refinement and flow prediction]
FlowNet: Learning Optical Flow with Convolutional Networks (ICCV'15)
[Figure: the two proposed architectures. FlowNetSimple: a single conv1 through conv6 encoder on the stacked image pair, followed by refinement and flow prediction. FlowNetCorr: separate conv1 through conv3 streams for the two frames, a correlation layer (441 output channels) combined with a 1x1 conv_redir shortcut, then conv3_1 through conv6, refinement and flow prediction]
FlowNet: Flying Chairs
[Figure: examples from the synthetic Flying Chairs dataset used to train FlowNet]
FlowNet: FlowNetSimple
[Figure: FlowNetSimple, the two frames stacked into a 6-channel input and passed through a single conv1 through conv6 encoder, followed by refinement and flow prediction]
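To make the FlowNetSimple idea concrete, here is a minimal PyTorch sketch of the same principle: the two frames are stacked into a 6-channel input and a single convolutional encoder plus an upsampling refinement part regresses the 2-channel flow field. The class name, channel counts and number of layers are illustrative assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class TinyFlowNetSimple(nn.Module):
    """Toy version of the FlowNetSimple idea: both RGB frames are stacked
    into a 6-channel input and a plain conv encoder-decoder regresses a
    2-channel flow field. Channel counts are illustrative only."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(6, 64, 7, stride=2, padding=3), nn.ReLU(inplace=True),
            nn.Conv2d(64, 128, 5, stride=2, padding=2), nn.ReLU(inplace=True),
            nn.Conv2d(128, 256, 3, stride=2, padding=1), nn.ReLU(inplace=True),
        )
        # "refinement": upconvolutions back toward input resolution
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(256, 128, 4, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(inplace=True),
        )
        self.predict_flow = nn.Conv2d(64, 2, 3, padding=1)

    def forward(self, img1, img2):
        x = torch.cat([img1, img2], dim=1)      # B x 6 x H x W
        return self.predict_flow(self.decoder(self.encoder(x)))

flow = TinyFlowNetSimple()(torch.randn(1, 3, 384, 512), torch.randn(1, 3, 384, 512))
print(flow.shape)  # torch.Size([1, 2, 192, 256]); flow is predicted at reduced resolution
```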
FlowNet: FlowNetCorr
[Figure: FlowNetCorr, separate conv1 through conv3 streams for the two frames, a correlation layer (441 channels) combined with a 1x1 conv_redir shortcut, then conv3_1 through conv6, refinement and flow prediction]
The correlation layer compares patches of the two feature maps $f_1$ and $f_2$:

$$c(\mathbf{x}_1, \mathbf{x}_2) = \sum_{\mathbf{o} \in [-k,k] \times [-k,k]} \langle f_1(\mathbf{x}_1 + \mathbf{o}),\, f_2(\mathbf{x}_2 + \mathbf{o}) \rangle, \qquad K := 2k + 1$$
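The correlation layer can be written down naively as below; this is a sketch of the equation above for the k = 0 case (single-pixel comparisons) with a bounded displacement range, which is what produces the 441 channels in the diagram (21 x 21 displacements for a maximum displacement of 10). The function name and the normalisation by channel count are assumptions.

```python
import torch
import torch.nn.functional as F

def correlation(f1, f2, max_disp=10):
    """Naive correlation layer for the k = 0 case of the formula above.
    The displacement x2 - x1 is limited to [-max_disp, max_disp] in both
    directions, giving (2*max_disp + 1)^2 output channels: 441 for
    max_disp = 10, matching the diagram.
    f1, f2: B x C x H x W feature maps from the two images."""
    B, C, H, W = f1.shape
    D = 2 * max_disp + 1
    f2_pad = F.pad(f2, [max_disp] * 4)          # zero-pad height and width
    out = f1.new_zeros(B, D * D, H, W)
    for i, dy in enumerate(range(-max_disp, max_disp + 1)):
        for j, dx in enumerate(range(-max_disp, max_disp + 1)):
            f2_shift = f2_pad[:, :,
                              max_disp + dy: max_disp + dy + H,
                              max_disp + dx: max_disp + dx + W]
            out[:, i * D + j] = (f1 * f2_shift).sum(dim=1)  # dot product over channels
    return out / C  # the paper normalises the correlation; dividing by C is an assumption

cost = correlation(torch.randn(1, 256, 48, 64), torch.randn(1, 256, 48, 64))
print(cost.shape)  # torch.Size([1, 441, 48, 64])
```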
FlowNet: Simple vs. Corr (Flying Chairs)
[Figure: qualitative flow predictions on Flying Chairs, FlowNetS vs. FlowNetCorr]
FlowNet: Simple vs. Corr (Sintel)
[Figure: qualitative flow predictions on Sintel, FlowNetS vs. FlowNetCorr]
FlowNet: Learning Optical Flow with Convolutional Networks
Delving Deep into Computer Vision
• FlowNet
• FuseNet
FuseNet: Incorporating Depth into Semantic Segmentation via Fusion-Based CNN Architecture (ACCV'16)
FuseNet - A conventional way: HHA
Multi-Scale Convolutional Architecture for Semantic Segmentation, Raj et al., Tech. Report CMU-RI-TR-15-21, 2015
FuseNet: A deep way…
[Figure: FuseNet architecture, an RGB encoder and a depth encoder whose features are fused, followed by a decoder for semantic segmentation; a minimal sketch of the fusion follows]
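A minimal sketch of the fusion idea, assuming element-wise summation of depth-encoder features into the RGB encoder as in the FuseNet paper; the block structure and channel counts below are illustrative, not the paper's VGG-16 configuration.

```python
import torch
import torch.nn as nn

def conv_block(c_in, c_out):
    return nn.Sequential(nn.Conv2d(c_in, c_out, 3, padding=1),
                         nn.BatchNorm2d(c_out), nn.ReLU(inplace=True))

class TinyFuseNetEncoder(nn.Module):
    """Sketch of the FuseNet fusion idea: a second encoder processes the
    depth map and its features are fused into the RGB encoder by
    element-wise summation before each pooling step. Two blocks only;
    channel counts are illustrative."""
    def __init__(self):
        super().__init__()
        self.rgb1, self.rgb2 = conv_block(3, 64), conv_block(64, 128)
        self.d1, self.d2 = conv_block(1, 64), conv_block(64, 128)
        self.pool = nn.MaxPool2d(2)

    def forward(self, rgb, depth):
        r, d = self.rgb1(rgb), self.d1(depth)
        r = self.pool(r + d)                    # fuse depth features, then downsample
        d = self.pool(d)
        r, d = self.rgb2(r), self.d2(d)
        return self.pool(r + d)                 # fused features handed to the decoder

feats = TinyFuseNetEncoder()(torch.randn(1, 3, 240, 320), torch.randn(1, 1, 240, 320))
print(feats.shape)  # torch.Size([1, 128, 60, 80])
```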
FuseNet: Why a second encoder for the depth input?
FuseNet: Are we any better than HHA?
• The proposed network improves all segmentation metrics.
FuseNet: What about the others?
• The proposed network improves all segmentation metrics.
• Metrics:
  • Global: fraction of correctly classified pixels over all pixels
  • Mean: average per-class accuracy
  • IoU: average intersection over union across classes
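For reference, all three metrics can be computed from a single confusion matrix; below is a small NumPy sketch (the function name and the ignore-label convention are assumptions).

```python
import numpy as np

def segmentation_metrics(pred, gt, num_classes):
    """Global pixel accuracy, mean class accuracy and mean IoU from a
    confusion matrix. pred, gt: integer label arrays of the same shape;
    labels >= num_classes in gt are treated as unlabelled and ignored."""
    mask = gt < num_classes
    conf = np.bincount(num_classes * gt[mask] + pred[mask],
                       minlength=num_classes ** 2).reshape(num_classes, num_classes)
    tp = np.diag(conf).astype(float)
    global_acc = tp.sum() / conf.sum()                            # "Global"
    class_acc = tp / np.maximum(conf.sum(axis=1), 1)              # per-class recall
    iou = tp / np.maximum(conf.sum(axis=1) + conf.sum(axis=0) - tp, 1)
    return global_acc, class_acc.mean(), iou.mean()               # Global, Mean, IoU

gt = np.random.randint(0, 5, (240, 320))
pred = np.random.randint(0, 5, (240, 320))
print(segmentation_metrics(pred, gt, num_classes=5))
```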
Delving Deep into Computer Vision
• FlowNet
• FuseNet
• PoseLSTM
[Figure: preview of the PoseLSTM pipeline, described on the next slide]
PoseLSTM: Image-Based Localization Using LSTMs for Structured Feature Correlation (ICCV'17)
[Figure: PoseLSTM pipeline, a pretrained GoogLeNet CNN, an FC layer producing y ∈ R^2048, reshaped to Y ∈ R^{32×64}, LSTMs producing z ∈ R^128, and an FC layer regressing position p ∈ R^3 and orientation quaternion q ∈ R^4]
PoseLSTM: PoseNet
[Figure: PoseNet baseline, a pretrained GoogLeNet CNN, an FC layer producing y ∈ R^2048, an FC layer (R^128), and a final regression of position p ∈ R^3 and orientation quaternion q ∈ R^4]
PoseLSTM: Structured Feature Correlation
[Figure: PoseLSTM, a pretrained GoogLeNet CNN, an FC layer producing y ∈ R^2048, reshaped to Y ∈ R^{32×64}, LSTMs applied to Y producing z ∈ R^128, and an FC layer regressing position p ∈ R^3 and orientation quaternion q ∈ R^4]
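A minimal sketch of this regression head, assuming the reshaped feature matrix Y is swept by LSTMs along its rows and columns in both directions; the hidden size is chosen so the concatenated final states match z ∈ R^128, but the class name and layer details are illustrative assumptions.

```python
import torch
import torch.nn as nn

class PoseLSTMHead(nn.Module):
    """Sketch of the PoseLSTM regression head: the 2048-D CNN feature y is
    reshaped to a 32 x 64 matrix Y, bidirectional LSTMs sweep its rows and
    columns, and the concatenated final hidden states z (128-D) are mapped
    to position p in R^3 and orientation quaternion q in R^4."""
    def __init__(self, hidden=32):
        super().__init__()
        self.row_lstm = nn.LSTM(64, hidden, batch_first=True, bidirectional=True)
        self.col_lstm = nn.LSTM(32, hidden, batch_first=True, bidirectional=True)
        self.fc_p = nn.Linear(4 * hidden, 3)    # position
        self.fc_q = nn.Linear(4 * hidden, 4)    # orientation quaternion

    def forward(self, y):                        # y: B x 2048 (e.g. GoogLeNet features)
        Y = y.view(-1, 32, 64)                   # B x 32 x 64
        _, (h_row, _) = self.row_lstm(Y)                      # sweep over the 32 rows
        _, (h_col, _) = self.col_lstm(Y.transpose(1, 2))      # sweep over the 64 columns
        z = torch.cat([h_row, h_col], dim=0).permute(1, 0, 2).flatten(1)  # B x 128
        return self.fc_p(z), self.fc_q(z)

p, q = PoseLSTMHead()(torch.randn(2, 2048))
print(p.shape, q.shape)  # torch.Size([2, 3]) torch.Size([2, 4])
```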
PoseLSTM - Winner in outdoor scenes: SIFT
PoseLSTM: Where SIFT dies… (TUM-LSI dataset)
• The SIFT-based map cannot be reconstructed due to a lack of sufficient matches: repeated structures and textureless areas.
Delving Deep into Computer Vision
• FlowNet
• FuseNet
• PoseLSTM
• DDFF
DDFF: Deep Depth From Focus
• The image of a point converges on the camera sensor exactly when the point is in focus.
• Therefore, sharpness identifies the in-focus regions of an image.
https://inst.eecs.berkeley.edu/~cs39j/sp02/session12.html
DDFF: Conventional DFF methods
• The image of a point converges on the camera sensor when the point is in focus.
• Therefore, sharpness identifies the in-focus regions of an image.
• The distance of a point from the camera can thus be formulated with respect to focus: a sharpness measure [Pertuz et al.] combined with an optimizer [Moeller et al.], as sketched below.
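A minimal sketch of such a conventional pipeline, using a simple Laplacian-energy focus measure and a per-pixel argmax in place of a proper optimizer; the function name and parameters are assumptions, and [Pertuz et al.] survey many alternative sharpness measures.

```python
import numpy as np
from scipy.ndimage import laplace, uniform_filter

def depth_from_focus(stack, focus_dists):
    """Classical depth-from-focus baseline: score sharpness per focal slice
    (local Laplacian energy here) and pick, per pixel, the focus distance of
    the sharpest slice. Variational methods such as [Moeller et al.] replace
    this argmax with a regularised optimisation.
    stack: S x H x W grayscale focal stack; focus_dists: S focus distances."""
    sharpness = np.stack([uniform_filter(laplace(img.astype(float)) ** 2, size=9)
                          for img in stack])    # S x H x W focus measure
    best = sharpness.argmax(axis=0)             # index of the sharpest slice per pixel
    return np.asarray(focus_dists)[best]        # depth map in the same units

stack = np.random.rand(10, 120, 160)
depth = depth_from_focus(stack, focus_dists=np.linspace(0.1, 1.0, 10))
print(depth.shape)  # (120, 160)
```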
DDFF: Deep Depth From Focus
• Focus gradually changes across the images in the stack.
• End-to-end trained convolutional auto-encoder.
• Depth (disparity) regressed directly from the focal stack (a minimal sketch follows).
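A toy sketch of the idea, assuming the focal stack is concatenated along the channel axis and fed to a small convolutional auto-encoder; the real DDFF network is much deeper and its exact input handling may differ.

```python
import torch
import torch.nn as nn

class TinyDDFF(nn.Module):
    """Sketch of the DDFF idea: a whole focal stack of S images is fed to a
    convolutional auto-encoder that directly regresses a disparity map.
    Layer sizes and the channel-concatenation input are illustrative."""
    def __init__(self, stack_size=10):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3 * stack_size, 64, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(64, 1, 4, stride=2, padding=1),     # disparity map
        )

    def forward(self, stack):                    # stack: B x S x 3 x H x W
        B, S, C, H, W = stack.shape
        return self.net(stack.view(B, S * C, H, W))

disp = TinyDDFF()(torch.randn(1, 10, 3, 256, 384))
print(disp.shape)  # torch.Size([1, 1, 256, 384])
```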
DDFF: How to get data?