Going Deeper with Convolutions
Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, and Andrew Rabinovich
Presented by: Kaylee Yuhas and Kyle Coffey
About Neural Networks
• Neural networks can be used in many different capacities across AI applications:
• Object classification, such as with images: given images of two different wolves, identifying the subspecies
• Speech recognition
• Interactive media such as video games, where networks can model how people respond to different stimuli in various environments and situations
• This work requires a hefty amount of computational resources to run smoothly
• Traditional neural network architecture has remained mostly constant
How to improve on traditional neural network setups?
• Increasing the performance of a neural network by increasing its size, while seemingly sound, has severe drawbacks:
• An increased number of parameters makes the network prone to overfitting
• A larger network requires more computational resources
[Figure: illustration of overfitting (Chabacano, 2008); the green line shows an overfitted model]
How to improve on traditional neural network setups?
• Introducing sparsity into the architecture by replacing fully connected layers with sparse ones, even inside the convolutions, is key
• This mimics biological systems
• How to improve performance without more hardware?
• By approximating the optimal sparse structure with dense building blocks, since current hardware is highly efficient at dense matrix computation
• This sparse architecture is named Inception, after the 2010 film of the same name (and its "we need to go deeper" line)
Inception Architecture: Naïve Version
• In short: inputs come from the previous layer and pass through several parallel convolutional layers; the parallel pooling layer helps control overfitting by reducing spatial information
• The authors settled on 1x1, 3x3, and 5x5 filter sizes, "the decision based more on convenience than necessity"
• Modules like this can be stacked repeatedly for scaling (see the sketch below)
• This choice of filter sizes also avoids patch-alignment issues
• However, 5x5 convolutions quickly become prohibitively expensive on convolutional layers with a large number of filters
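Below is a minimal sketch of the naïve Inception module as described on this slide, written in PyTorch purely for illustration (the original work did not use PyTorch, and the channel counts here are arbitrary placeholders):

```python
import torch
import torch.nn as nn

class NaiveInception(nn.Module):
    """Naive Inception module: parallel 1x1, 3x3, 5x5 convs plus max pooling."""
    def __init__(self, in_channels, c1x1, c3x3, c5x5):
        super().__init__()
        # Padding keeps the spatial size identical across branches so the
        # outputs can be concatenated along the channel dimension.
        self.branch1 = nn.Conv2d(in_channels, c1x1, kernel_size=1)
        self.branch3 = nn.Conv2d(in_channels, c3x3, kernel_size=3, padding=1)
        self.branch5 = nn.Conv2d(in_channels, c5x5, kernel_size=5, padding=2)
        self.pool = nn.MaxPool2d(kernel_size=3, stride=1, padding=1)

    def forward(self, x):
        return torch.cat(
            [self.branch1(x), self.branch3(x), self.branch5(x), self.pool(x)],
            dim=1,
        )

# Example: a 192-channel, 28x28 feature map keeps its spatial size, but the
# channel count grows with every stacked module (here 64 + 128 + 32 + 192 = 416).
x = torch.randn(1, 192, 28, 28)
print(NaiveInception(192, 64, 128, 32)(x).shape)  # torch.Size([1, 416, 28, 28])
```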
Inception Architecture: Dimensionality Reduction
• Computing reductions with 1x1 convolutions before reaching the more expensive 3x3 and 5x5 convolutions tremendously reduces the necessary processing power (see the sketch below)
• The use of dimensionality reductions allows for significant increases in the number of units at each stage without a sharp increase in the computational resources needed at later, more complex stages
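A sketch of the dimensionality-reduction variant, again in PyTorch as an illustrative assumption: 1x1 convolutions shrink the channel count before the expensive 3x3 and 5x5 convolutions and after the pooling branch. The closing comment gives a rough sense of the savings, using filter counts similar to the paper's inception(3a) module.

```python
import torch
import torch.nn as nn

class InceptionReduce(nn.Module):
    """Inception module with 1x1 dimensionality reductions."""
    def __init__(self, in_ch, c1x1, r3x3, c3x3, r5x5, c5x5, pool_proj):
        super().__init__()
        self.branch1 = nn.Conv2d(in_ch, c1x1, kernel_size=1)
        self.branch3 = nn.Sequential(
            nn.Conv2d(in_ch, r3x3, kernel_size=1),           # 1x1 reduction
            nn.Conv2d(r3x3, c3x3, kernel_size=3, padding=1),
        )
        self.branch5 = nn.Sequential(
            nn.Conv2d(in_ch, r5x5, kernel_size=1),           # 1x1 reduction
            nn.Conv2d(r5x5, c5x5, kernel_size=5, padding=2),
        )
        self.branch_pool = nn.Sequential(
            nn.MaxPool2d(kernel_size=3, stride=1, padding=1),
            nn.Conv2d(in_ch, pool_proj, kernel_size=1),       # 1x1 projection after pooling
        )

    def forward(self, x):
        return torch.cat(
            [self.branch1(x), self.branch3(x), self.branch5(x), self.branch_pool(x)],
            dim=1,
        )

# Rough cost intuition: a 5x5 conv from 192 to 32 channels needs
# 5*5*192*32 ≈ 154k weights; reducing to 16 channels first needs only
# 1*1*192*16 + 5*5*16*32 ≈ 16k weights.
```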
GoogLeNet
• The incarnation of Inception the paper's authors submitted to the 2014 ImageNet Large Scale Visual Recognition Competition (ILSVRC).
• The network was designed to be efficient enough to run with a low memory footprint on individual devices with limited computational resources.
• If CNNs are to gain a foothold in private industry, keeping overhead costs low is especially important.
• Here is a small sample of the GoogLeNet architecture; note the use of dimensionality reduction as opposed to the naïve version.
GoogLeNet
• Only a portion of the network is shown at a time, because the entirety of the architecture is far too large to fit legibly on one slide.
GoogLeNet
• The GoogLeNet incarnation of the Inception architecture, laid out layer by layer.
• The "#3x3 reduce" and "#5x5 reduce" columns stand for the number of 1x1 filters in the reduction layers used before the 3x3 and 5x5 convolutions.
• While there are many layers, the main goal is for the final "softmax" layer to give "scores" to the image classes (e.g. dog breeds, skin diseases, etc.).
• A loss function determines how good or bad each score is; a sketch of this step follows below.
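The following is a hedged sketch of that final classification step (PyTorch, for illustration only): the last layer produces one score (logit) per class, softmax turns the scores into probabilities, and a cross-entropy loss, the usual companion to softmax, measures how good the scores are.

```python
import torch
import torch.nn.functional as F

logits = torch.randn(4, 1000)           # 4 images, 1000 ImageNet classes
probs = F.softmax(logits, dim=1)        # per-class "scores" that sum to 1
target = torch.randint(0, 1000, (4,))   # ground-truth class indices
loss = F.cross_entropy(logits, target)  # low when the correct class scores high
print(probs.shape, loss.item())
```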
GoogLeNet
• GoogLeNet is 22 layers deep when counting only layers with parameters (27 if pooling layers are counted), with about 100 total layers.
• It could be trained to convergence on a few high-end GPUs within about a week, with memory usage being the main limitation.
• It was trained to classify images into one of 1000 leaf-node categories in the ImageNet hierarchy.
• ImageNet is a large visual database designed for use in visual object recognition software research.
• GoogLeNet performed quite well in this contest.
GoogLeNet
• Left: GoogLeNet's results at the 2014 ILSVRC, where it came in first place.
• Right: a breakdown of its classification performance.
• Using multiple different CNNs and averaging their scores to get a predicted class for an image yields better results than a single CNN; see the entry with 7 CNNs, and the averaging sketch below.
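A minimal sketch of that averaging idea (PyTorch, illustrative only; the model objects are placeholders, not the actual ILSVRC ensemble):

```python
import torch
import torch.nn.functional as F

def ensemble_predict(models, images):
    # models: list of trained CNNs; images: a batch of input tensors
    probs = [F.softmax(m(images), dim=1) for m in models]
    avg = torch.stack(probs).mean(dim=0)  # average softmax outputs over the ensemble
    return avg.argmax(dim=1)              # predicted class index per image
```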
Summary
• Convolutional neural networks remain among the top performers in neural network research.
• The Inception framework allows for large-scale networks while minimizing processing bottlenecks and the "choke points" where scaling past a certain point becomes inefficient.
• It also runs well on machines without powerful hardware.
• Reducing dimensionality with 1x1 convolutions before passing data to the 3x3 and 5x5 convolutions has proven efficient and effective.
• Further study: is mimicking actual biological systems universally the best approach for neural network architecture?

Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke, V., & Rabinovich, A. (2015). Going deeper with convolutions. 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). doi:10.1109/CVPR.2015.7298594
Chabacano. (2008). Overfitting. Retrieved April 08, 2017, from https://en.wikipedia.org/wiki/Overfitting