GoogLeNet: Deeper than Deep
(Some slides are from Christian Szegedy.)
GoogLeNet architecture diagram (legend: Convolution, Pooling, Softmax, Other)
GoogLeNet vs. previous architectures: the Zeiler-Fergus architecture (1 tower) (legend: Convolution, Pooling, Softmax, Other)
Why is the deep learning revolution arriving just now? The Rectified Linear Unit (ReLU).
Glorot, X., Bordes, A., & Bengio, Y. (2011). Deep sparse rectifier neural networks. In Proceedings of the 14th International Conference on Artificial Intelligence and Statistics (AISTATS), JMLR W&CP Vol. 15, pp. 315-323.
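For concreteness (my own illustration, not from the slides): the ReLU is simply max(0, x), which keeps gradients alive on the positive side instead of saturating the way sigmoid or tanh do.

```python
import numpy as np

def relu(x):
    """Rectified Linear Unit: pass positive values through, zero out negatives."""
    return np.maximum(0.0, x)

x = np.array([-2.0, -0.5, 0.0, 1.5, 3.0])
print(relu(x))  # -> [0.  0.  0.  1.5 3. ]
```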
Theoretical breakthroughs: Arora, S., Bhaskara, A., Ge, R., & Ma, T. (2014). Provable bounds for learning some deep representations. ICML 2014.
Hebbian principle: "Cells that fire together, wire together."
Cluster units according to their activation statistics: Layer 1 is built on the Input.
Cluster according to correlation statistics: Layer 2 is built on Layer 1.
Cluster according to correlation statistics: Layer 3 is built on Layer 2.
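As a loose, hypothetical sketch of this clustering idea (all names and numbers are my own, not from the slides or from Arora et al.): record unit activations over many inputs, then group units whose outputs correlate; each group would become one unit of the next layer.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

# Synthetic activations: 16 units driven by 4 latent features (4 units each),
# so units that share a feature "fire together".
rng = np.random.default_rng(0)
latent = rng.standard_normal((1000, 4))
mix = np.repeat(np.eye(4), 4, axis=1)            # unit j copies feature j // 4
acts = latent @ mix + 0.1 * rng.standard_normal((1000, 16))

# Correlated units -> small distance -> same cluster.
corr = np.corrcoef(acts, rowvar=False)
dist = 1.0 - np.abs(corr)
labels = fcluster(linkage(dist[np.triu_indices(16, k=1)], method="average"),
                  t=0.5, criterion="distance")
print(labels)  # four groups of four units
```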
In images, correlations tend to be local
Cover very local clusters with 1x1 convolutions (a stack of 1x1 filters).
Less spread-out correlations remain.
Cover more spread-out clusters with 3x3 convolutions.
Cover the most spread-out clusters with 5x5 convolutions.
The result is a heterogeneous set of convolutions: 1x1, 3x3, and 5x5 side by side.
Schematic view (naive version): Previous layer → [1x1 convolutions | 3x3 convolutions | 5x5 convolutions] → Filter concatenation.
Naive idea (does not work!): Previous layer → [1x1 convolutions | 3x3 convolutions | 5x5 convolutions | 3x3 max pooling] → Filter concatenation.
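To make the "does not work" point concrete, here is a minimal PyTorch sketch (my own illustration; the branch widths loosely follow the paper's inception (3a) sizes). The pooling branch passes all input channels through untouched, so the concatenated output is strictly wider than the input, and stacking modules compounds the blow-up:

```python
import torch
import torch.nn as nn

class NaiveInception(nn.Module):
    """Naive module: parallel 1x1/3x3/5x5 convs plus 3x3 max pooling,
    concatenated along the channel dimension."""
    def __init__(self, in_ch, c1, c3, c5):
        super().__init__()
        self.b1 = nn.Conv2d(in_ch, c1, kernel_size=1)
        self.b3 = nn.Conv2d(in_ch, c3, kernel_size=3, padding=1)
        self.b5 = nn.Conv2d(in_ch, c5, kernel_size=5, padding=2)
        self.pool = nn.MaxPool2d(kernel_size=3, stride=1, padding=1)

    def forward(self, x):
        # The pooling branch keeps all in_ch channels, so the output has
        # c1 + c3 + c5 + in_ch channels: width grows at every module.
        return torch.cat([self.b1(x), self.b3(x), self.b5(x), self.pool(x)], dim=1)

y = NaiveInception(192, 64, 128, 32)(torch.randn(1, 192, 28, 28))
print(y.shape)  # torch.Size([1, 416, 28, 28]): 64 + 128 + 32 + 192 channels
```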
Inception module (with dimension reduction): Previous layer → [1x1 convolutions | 1x1 → 3x3 convolutions | 1x1 → 5x5 convolutions | 3x3 max pooling → 1x1 convolutions] → Filter concatenation.
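By contrast, a sketch of the dimension-reduced module (sizes here are those of the paper's inception (3a); the ReLUs after each convolution are omitted for brevity): 1x1 convolutions shrink the channel count before the expensive 3x3 and 5x5 convolutions and after the pooling branch, so the output width is chosen rather than forced.

```python
import torch
import torch.nn as nn

class Inception(nn.Module):
    """Inception module with dimension reduction via 1x1 'bottleneck' convs."""
    def __init__(self, in_ch, c1, r3, c3, r5, c5, cp):
        super().__init__()
        self.b1 = nn.Conv2d(in_ch, c1, 1)                       # plain 1x1 branch
        self.b3 = nn.Sequential(nn.Conv2d(in_ch, r3, 1),        # reduce, then 3x3
                                nn.Conv2d(r3, c3, 3, padding=1))
        self.b5 = nn.Sequential(nn.Conv2d(in_ch, r5, 1),        # reduce, then 5x5
                                nn.Conv2d(r5, c5, 5, padding=2))
        self.bp = nn.Sequential(nn.MaxPool2d(3, stride=1, padding=1),
                                nn.Conv2d(in_ch, cp, 1))        # pool, then project

    def forward(self, x):
        return torch.cat([self.b1(x), self.b3(x), self.b5(x), self.bp(x)], dim=1)

m = Inception(192, c1=64, r3=96, c3=128, r5=16, c5=32, cp=32)
print(m(torch.randn(1, 192, 28, 28)).shape)  # torch.Size([1, 256, 28, 28])
```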
Inception: why does it have so many layers??? (full architecture diagram; legend: Convolution, Pooling, Softmax, Other)
Inception: 9 Inception modules stacked, a network in a network in a network... (legend: Convolution, Pooling, Softmax, Other)
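A sketch of the stacking (explicitly reusing the Inception class defined above; the second module's sizes are those of the paper's inception (3b)):

```python
import torch.nn as nn

# "Network in a network in a network": Inception modules stacked in sequence,
# occasionally interleaved with pooling to halve the spatial resolution.
stack = nn.Sequential(
    Inception(192, c1=64, r3=96, c3=128, r5=16, c5=32, cp=32),    # out: 256 ch
    Inception(256, c1=128, r3=128, c3=192, r5=32, c5=96, cp=64),  # out: 480 ch
    nn.MaxPool2d(3, stride=2, padding=1),
)
```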
The width of the Inception modules ranges from 256 filters (in the early modules) to 1024 (in the top Inception modules).
• The fully connected layers on top can be removed completely.
• The number of parameters is reduced to 5 million.
• Computational cost is increased by less than 2x compared to Krizhevsky's network (<1.5Bn operations/evaluation).
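A back-of-the-envelope check (my own arithmetic, using the inception (3a) sizes from the sketch above) of why the 1x1 reductions keep the cost down:

```python
# Multiply-adds of one conv layer ~ H * W * k * k * C_in * C_out.
H = W = 28

def conv_ops(k, c_in, c_out):
    return H * W * k * k * c_in * c_out

direct = conv_ops(5, 192, 32)                          # 5x5 straight on 192 ch
reduced = conv_ops(1, 192, 16) + conv_ops(5, 16, 32)   # 1x1 down to 16 ch first
print(f"{direct:,} vs {reduced:,}")  # 120,422,400 vs 12,443,648: ~10x cheaper
```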
Efficient Gradient Propagation
• A shallower network can already provide good performance, so the intermediate features are discriminative
• Auxiliary classifiers connected to intermediate layers inject extra gradient there during training
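A hedged sketch of how the auxiliary classifiers enter training (the 0.3 weight is the value used in the GoogLeNet paper; `main_logits` and `aux_logits_list` are placeholder names of my own):

```python
import torch
import torch.nn.functional as F

def googlenet_style_loss(main_logits, aux_logits_list, target, aux_weight=0.3):
    """Total loss = main loss + 0.3 * each auxiliary classifier's loss.
    The auxiliary heads are used only for training and discarded at inference."""
    loss = F.cross_entropy(main_logits, target)
    for aux_logits in aux_logits_list:
        loss = loss + aux_weight * F.cross_entropy(aux_logits, target)
    return loss

# Toy usage: batch of 4 images, 1000 classes, two auxiliary heads.
target = torch.randint(0, 1000, (4,))
loss = googlenet_style_loss(torch.randn(4, 1000),
                            [torch.randn(4, 1000), torch.randn(4, 1000)],
                            target)
print(loss.item())
```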
Multiple Models and Crops: performance breakdown
Classification performance
Where Are We Now?
• The task is very hard for humans, even when the number of choices is reduced to 1000
• It is time-consuming: about 1 image per minute
• Human performance: 13-15% error without training, 5.1% with training
• GoogLeNet: 6.7% error