Band-limited Training and Inference for Convolutional Neural Networks
FFT-based convolution (Mathieu et al., “Fast Training of Convolutional Networks through FFTs”; Vasilache et al., “Fast Convolutional Nets With fbfft: A GPU Performance Evaluation”):

Data: x → xfft = FFT(x)
Filter: y → yfft = FFT(y)
offt = xfft ⊙ yfft
Out: o = IFFT(offt)

cuDNN: substantial memory workspace needed for intermediate results.
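The pipeline above can be sketched in a few lines of NumPy (a minimal illustration, not the authors' implementation; the name `fft_conv2d` is my own):

```python
import numpy as np

def fft_conv2d(x, y):
    """Circular 2-D convolution via pointwise product in the frequency domain.

    Mirrors the slide's pipeline: xfft = FFT(x), yfft = FFT(y),
    offt = xfft (*) yfft, o = IFFT(offt).
    (A cross-correlation would use np.conj(yfft) instead of yfft.)
    """
    xfft = np.fft.fft2(x)           # Data: x -> xfft
    yfft = np.fft.fft2(y)           # Filter: y -> yfft
    offt = xfft * yfft              # pointwise (Hadamard) product
    return np.fft.ifft2(offt).real  # Out: o = IFFT(offt)
```

For this to match spatial convolution, the filter must first be zero-padded to the input size; the padded FFT maps are exactly the "intermediate results" whose workspace cost the cuDNN remark refers to.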
Band-limiting = masking out high frequencies:

Data: x → xfft = FFT(x) → xCfft = band-limited(xfft)
Filter: y → yfft = FFT(y) → yCfft = band-limited(yfft)
offt = xCfft ⊙ yCfft
Out: o = IFFT(offt)

Benefits: less memory used, faster computation.
Goal: preserve enough of the spectrum to retain high accuracy of models.
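The band-limiting step can be sketched with a low-pass mask in NumPy (the function names and the `keep_frac` knob are my own illustration of the compression rate; the paper's masking operates inside the network layers):

```python
import numpy as np

def band_limited_fft2(x, keep_frac=0.5):
    """FFT of x with the highest frequencies masked out (set to zero).

    keep_frac is roughly the fraction of frequencies kept per axis,
    a hypothetical stand-in for the talk's compression rate.
    """
    X = np.fft.fft2(x)
    fy = np.abs(np.fft.fftfreq(X.shape[0]))[:, None]  # per-row frequencies
    fx = np.abs(np.fft.fftfreq(X.shape[1]))[None, :]  # per-column frequencies
    mask = (fy <= keep_frac / 2) & (fx <= keep_frac / 2)  # low-pass mask
    return X * mask

def band_limited_conv2d(x, y, keep_frac=0.5):
    """Convolution on band-limited spectra: o = IFFT(xCfft (*) yCfft)."""
    xCfft = band_limited_fft2(x, keep_frac)
    yCfft = band_limited_fft2(y, keep_frac)
    return np.fft.ifft2(xCfft * yCfft).real
```

Note that the memory and speed benefits come from physically dropping the masked coefficients rather than zeroing them as done here; this sketch only illustrates the arithmetic.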
Structure of the FFT map of a real-valued input (slide build):
2. Conjugate symmetry: coefficients come in conjugate pairs such as 1−j and 1+j, so half of the map is redundant.
3. Real values: the DC component and the other symmetry points are purely real.
4. No constraints on the remaining coefficients.
5. 1st compression: discard the conjugate-symmetric half.
6. 2nd compression: discard the highest frequencies (band-limiting).
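The 1st compression (conjugate symmetry) is exactly what NumPy's real-input FFT already exploits; a small sketch:

```python
import numpy as np

x = np.random.default_rng(0).standard_normal((8, 8))

full = np.fft.fft2(x)    # 8 x 8 complex coefficients
half = np.fft.rfft2(x)   # 8 x 5: conjugate symmetry makes the rest redundant
assert half.shape == (8, 8 // 2 + 1)

# For real x the discarded coefficients satisfy X[-u, -v] == conj(X[u, v]),
# so the signal is fully recoverable from the half-size map.
assert np.allclose(np.fft.irfft2(half, s=x.shape), x)
```

The 2nd compression (band-limiting) then drops high frequencies from this already-halved map, which is where the lossy savings come from.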
[Figure: test accuracy (%) vs. compression rate (%). ResNet-18 on CIFAR-10: 93.5% uncompressed vs. 92% at high compression. DenseNet-121 on CIFAR-100: 75.3% uncompressed vs. 71.2% at high compression.]
Cross-correlate input data and filter: x ∗c y

F(x)(ω) = F(x[n]),  F(y)(ω) = F(y[n])
x ∗c y = F⁻¹( F(x)(ω) ⊙ F(y)(ω) )

Spectrum of the convolution: S(ω) = F(x)(ω) ⊙ F(y)(ω)

Band-limiting mask: Mc(ω) = 1 if ω ≤ c, else 0

x ∗c y = F⁻¹[ (F(x)(ω) ⊙ Mc(ω)) ⊙ (F(y)(ω) ⊙ Mc(ω)) ] = F⁻¹( S(ω) ⊙ Mc(ω) )

Energy (Parseval's theorem): Σₙ₌₀^{N−1} |y[n]|² = (1/N) Σ_ω₌₀^{N−1} |Y(ω)|²
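Parseval's theorem, which licenses measuring how much signal energy band-limiting preserves in the frequency domain, can be checked numerically (note that NumPy's unnormalized FFT puts the 1/N factor on the frequency side):

```python
import numpy as np

rng = np.random.default_rng(0)
y = rng.standard_normal(16)
Y = np.fft.fft(y)

time_energy = np.sum(np.abs(y) ** 2)
freq_energy = np.sum(np.abs(Y) ** 2) / len(y)  # 1/N from NumPy's convention
assert np.allclose(time_energy, freq_energy)

# Band-limiting keeps a quantifiable fraction of this energy:
kept = np.sum(np.abs(Y[:len(Y) // 2]) ** 2) / len(y)
assert kept <= time_energy
```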
[Figure: DenseNet-121 on CIFAR-100, test accuracy (%) vs. inference compression rate (%) for train compression rates C = 0, 50, 75, 85.]
[Figure: ResNet-18 on CIFAR-10, normalized GPU memory allocated (%) and epoch time (%) vs. compression rate (%).]
[Figure: ResNet-18 on CIFAR-10, test accuracy (%) vs. inference compression rate (%) for train compression rates C = 0, 30, 50, 85.]
Accuracy degrades smoothly as the inference compression rate increases.
Best practice: apply the same compression rate to training and inference.
[Figure: test accuracy (%), normalized GPU memory allocated (%), and epoch time (%) vs. compression rate (%).]
“Speaking of longer term, it would be nice if the community migrated to a fully open sourced implementation for all of this [convolution operations, etc.]. This stuff is just too important to the progress of the field for it to be locked away in proprietary implementations. The more people working together on this the better for everyone. There's plenty of room to compete on the hardware implementation side.” — Scott Gray, https://github.com/soumith/convnet-benchmarks/issues/93