CNN Architectures ILSVRC: Imagenet Large Scale Visual Recognition

CS4501: Introduction to Computer Vision CNN Architectures

ILSVRC: Imagenet Large Scale Visual Recognition Challenge [Russakovsky et al 2014]

The Problem: Classification
Classify an image into 1000 possible classes, e.g. Abyssinian cat, Bulldog, French Terrier, Cormorant, Chickadee, red fox, banjo, barbell, hourglass, knot, maze, viaduct, etc.
Example output: cat, tabby cat (0.71); Egyptian cat (0.22); red fox (0.11); …

The Data: ILSVRC
Imagenet Large Scale Visual Recognition Challenge (ILSVRC): Annual Competition
• 1000 categories
• ~1000 training images per category
• ~1 million images in total for training
• ~50k images for validation
• Test set: only the images are released, without annotations; evaluation is performed centrally by the organizers (max 2 submissions per week)

The Evaluation Metric: Top-K error
True label: Abyssinian cat
Predictions, ranked by score: cat, tabby cat (0.61); Egyptian cat (0.22); red fox (0.11); Abyssinian cat (0.10); French terrier (0.03); …
• Top-1 error: 1.0, Top-1 accuracy: 0.0
• Top-2 error: 1.0, Top-2 accuracy: 0.0
• Top-3 error: 1.0, Top-3 accuracy: 0.0
• Top-4 error: 0.0, Top-4 accuracy: 1.0
• Top-5 error: 0.0, Top-5 accuracy: 1.0
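The per-image numbers on this slide can be reproduced with a few lines of Python. This is only a sketch: the function name `top_k_error` and the dictionary layout are illustrative choices, not part of the lecture, and the scores are copied from the slide as they appear.

```python
def top_k_error(ranked_labels, true_label, k):
    """Return 0.0 if the true label is among the k highest-ranked labels, else 1.0."""
    return 0.0 if true_label in ranked_labels[:k] else 1.0


# Scores copied from the slide example.
predictions = {
    "cat, tabby cat": 0.61,
    "Egyptian cat": 0.22,
    "red fox": 0.11,
    "Abyssinian cat": 0.10,
    "French terrier": 0.03,
}

# Rank labels from highest to lowest score.
ranked = sorted(predictions, key=predictions.get, reverse=True)

# Top-k error for k = 1..5; top-k accuracy is simply 1 - error.
errors = {k: top_k_error(ranked, "Abyssinian cat", k) for k in range(1, 6)}
accuracies = {k: 1.0 - e for k, e in errors.items()}
```

Over a whole validation set, the reported top-k error would be the average of these per-image values.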

Top-5 error on this competition (2012)

Alexnet (Krizhevsky et al NIPS 2012)

Alexnet https://www.saagie.com/fr/blog/object-detection-part1

Pytorch Code for Alexnet • In-class analysis https://github.com/pytorch/vision/blob/master/torchvision/models/alexnet.py

Dropout Layer (Srivastava et al 2014) • model.train(): dropout is active • model.eval(): dropout is disabled
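To make the difference between training and evaluation mode concrete, here is a minimal pure-Python sketch of "inverted" dropout, the variant in which surviving activations are rescaled by 1/(1 - p) during training so that nothing needs to change at evaluation time. The function name `dropout` and its `training` flag are illustrative stand-ins for what `model.train()` and `model.eval()` toggle; this is not the PyTorch implementation.

```python
import random


def dropout(values, p=0.5, training=True, rng=random):
    """Inverted dropout over a list of activations (requires 0 <= p < 1).

    In training mode each value is zeroed with probability p and the
    survivors are scaled by 1 / (1 - p). In evaluation mode the input
    is returned unchanged.
    """
    if not training or p == 0.0:
        return list(values)
    keep = 1.0 - p
    return [v / keep if rng.random() < keep else 0.0 for v in values]


activations = [1.0, 2.0, 3.0, 4.0]
train_out = dropout(activations, p=0.5, training=True, rng=random.Random(0))
eval_out = dropout(activations, p=0.5, training=False)
```

With p = 0.5, every entry of `train_out` is either 0.0 or twice the input value, while `eval_out` equals the input.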

Preprocessing and Data Augmentation
• Resize the image to 256x256
• Crop a 224x224 patch
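A rough sketch of the crop step, assuming the image has already been resized to 256x256 and is stored as a list of pixel rows. The random horizontal flip is an extra assumption taken from common AlexNet-style augmentation and is not shown on these slides; the function name `random_crop_flip` is hypothetical.

```python
import random


def random_crop_flip(image, crop=224, rng=random):
    """Take a random crop x crop patch and flip it horizontally half the time.

    `image` is a list of rows, each row a list of pixel values.
    """
    height, width = len(image), len(image[0])
    top = rng.randint(0, height - crop)    # inclusive bounds
    left = rng.randint(0, width - crop)
    patch = [row[left:left + crop] for row in image[top:top + crop]]
    if rng.random() < 0.5:                 # assumed augmentation, see lead-in
        patch = [row[::-1] for row in patch]
    return patch


# Dummy 256x256 single-channel image whose pixel value encodes its position.
image = [[r * 256 + c for c in range(256)] for r in range(256)]
patch = random_crop_flip(image, crop=224, rng=random.Random(0))
```

At test time a deterministic center crop is the usual choice, so that the same image always yields the same prediction.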

True label: Abyssinian cat

Some Important Aspects
• Using ReLUs instead of Sigmoid or Tanh
• Momentum + Weight Decay
• Dropout (randomly sets unit outputs to zero during training)
• GPU Computation!
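The first two items can be sketched in plain Python. The update below is the classical momentum form with L2 weight decay folded into the gradient; the function names and the default hyperparameter values (learning rate 0.01, momentum 0.9, weight decay 0.0005) are assumptions chosen for illustration rather than values given on the slide.

```python
def relu(x):
    """Rectified linear unit: passes positive inputs, zeroes the rest."""
    return max(0.0, x)


def sgd_momentum_step(weights, grads, velocity,
                      lr=0.01, momentum=0.9, weight_decay=0.0005):
    """One SGD step with momentum and L2 weight decay.

    Returns new (weights, velocity) lists and leaves the inputs untouched.
    """
    new_weights, new_velocity = [], []
    for w, g, v in zip(weights, grads, velocity):
        g = g + weight_decay * w       # weight decay acts as an extra gradient term
        v = momentum * v - lr * g      # accumulate a decaying sum of past steps
        new_velocity.append(v)
        new_weights.append(w + v)
    return new_weights, new_velocity


weights, velocity = sgd_momentum_step([1.0], [0.5], [0.0])
```

Unlike sigmoid or tanh, ReLU does not saturate for positive inputs, which is the usual explanation for why it trains faster in deep networks.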

What is happening? https://www.saagie.com/fr/blog/object-detection-part1

SIFT + FV + SVM (or softmax): Feature extraction (SIFT) → Feature encoding (Fisher vectors) → Classification (SVM or softmax)
Deep Learning: Convolutional Network (includes both feature extraction and classifier)

VGG Network Top-5: https://github.com/pytorch/vision/blob/master/torchvision/models/vgg.py Simonyan and Zisserman, 2014. https://arxiv.org/pdf/1409.1556.pdf

GoogLeNet https://github.com/kuangliu/pytorch-cifar/blob/master/models/googlenet.py Szegedy et al. 2014 https://www.cs.unc.edu/~wliu/papers/GoogLeNet.pdf

Further Refinements – Inception v3, e.g. GoogLeNet (Inception v1) → Inception v3

ResNet (He et al CVPR 2016) https://github.com/pytorch/vision/blob/master/torchvision/models/resnet.py

BatchNormalization Layer https://arxiv.org/abs/1502.03167
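As a rough illustration of what the layer computes in training mode, the sketch below normalizes a mini-batch of scalar activations to zero mean and unit variance and then applies a scale `gamma` and shift `beta`. It handles a single feature only and omits the running statistics that are used at evaluation time; the function name and default values are illustrative assumptions.

```python
import math


def batch_norm(batch, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize a list of scalar activations using the batch statistics."""
    n = len(batch)
    mean = sum(batch) / n
    var = sum((x - mean) ** 2 for x in batch) / n   # biased batch variance
    return [gamma * (x - mean) / math.sqrt(var + eps) + beta for x in batch]


normalized = batch_norm([1.0, 2.0, 3.0])
```

In a real network `gamma` and `beta` are learned per channel, so the layer can recover the unnormalized activations if that turns out to be useful.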

Slide by Mohammad Rastegari

DenseNet (Huang et al) https://arxiv.org/pdf/1608.06993.pdf

Object Detection deer cat

Object Detection as Classification: CNN → deer? cat? background?

Object Detection as Classification with Sliding Window: CNN → deer? cat? background?
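The sliding-window idea can be sketched as follows: enumerate fixed-size boxes over the image, run a classifier on each one, and keep the boxes that are not labeled background. The window size, stride, and the `classify` callback are hypothetical placeholders; in the lecture's setting the callback would be a CNN applied to the cropped region.

```python
def sliding_windows(width, height, window=64, stride=32):
    """Yield (left, top, right, bottom) boxes covering the image on a regular grid."""
    for top in range(0, height - window + 1, stride):
        for left in range(0, width - window + 1, stride):
            yield (left, top, left + window, top + window)


def detect(width, height, classify, window=64, stride=32):
    """Classify every window and keep the ones that are not background."""
    detections = []
    for box in sliding_windows(width, height, window, stride):
        label = classify(box)
        if label != "background":
            detections.append((box, label))
    return detections


# Toy classifier standing in for the CNN: calls the top-left window a cat.
def toy_classifier(box):
    return "cat" if box[:2] == (0, 0) else "background"


boxes = list(sliding_windows(128, 128))
found = detect(128, 128, toy_classifier)
```

Covering several scales and aspect ratios multiplies the number of windows quickly, which is the cost that box proposal methods aim to reduce.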

Object Detection as Classification with Box Proposals

Box Proposal Method – SS: Selective Search. Segmentation as Selective Search for Object Recognition. van de Sande et al. ICCV 2011

RCNN https://people.eecs.berkeley.edu/~rbg/papers/r-cnn-cvpr.pdf Rich feature hierarchies for accurate object detection and semantic segmentation. Girshick et al. CVPR 2014.

Questions?
