Tuning the Performance of Convolutional Neural Network for Image Classification on GPU
Agenda
• Adoption of image classification / image recognition at Alibaba
• Easy ways to improve the performance of Caffe
• Further performance optimization of the convolution layer
• Ongoing work
Image classification at Alibaba
• Product display classification (model-upper / item-bottom / multi-object)
• Fashion style classification (Sweet / Street / Office)
• Buy-by-photo mobile app: search for visually similar products by image
• All leverage the Caffe framework
Profiling Caffe
• The most expensive part: Caffe spends more than 70% of its time in the convolution layers!
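A common way to reproduce this kind of profile is Caffe's built-in layer timer; a minimal invocation (the model path below is illustrative):

    caffe time -model models/bvlc_alexnet/deploy.prototxt -gpu 0 -iterations 50

The per-layer forward/backward timings it prints make the dominance of the convolution layers easy to see.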
Convolution layer
• How the convolution layer works in Caffe: an image-to-column (im2col) transformation, followed by an SGEMM call
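Below is a minimal sketch of this im2col + GEMM scheme in CUDA and cuBLAS, assuming NCHW layout, square kernels, and a pre-allocated column buffer; the kernel follows the shape of Caffe's im2col_gpu, but the function names and buffer management here are illustrative rather than Caffe's actual API.

    #include <cublas_v2.h>
    #include <cuda_runtime.h>

    // Unroll each receptive field into one column of data_col, so the
    // convolution becomes a single matrix multiply. One thread per
    // output pixel per input channel.
    __global__ void im2col_kernel(const float* data_im, int channels,
                                  int height, int width, int ksize,
                                  int pad, int stride, int h_out,
                                  int w_out, float* data_col) {
        int index = blockIdx.x * blockDim.x + threadIdx.x;
        if (index >= channels * h_out * w_out) return;
        int w = index % w_out;
        int h = (index / w_out) % h_out;
        int c = index / (w_out * h_out);
        float* col = data_col + (c * ksize * ksize * h_out + h) * w_out + w;
        const float* im = data_im + c * height * width;
        for (int i = 0; i < ksize; ++i)
            for (int j = 0; j < ksize; ++j) {
                int hi = h * stride - pad + i, wi = w * stride - pad + j;
                *col = (hi >= 0 && hi < height && wi >= 0 && wi < width)
                           ? im[hi * width + wi] : 0.0f;
                col += h_out * w_out;   // next row of the column matrix
            }
    }

    // Forward pass for a single image. Filters: M x (C*K*K) row-major,
    // col buffer: (C*K*K) x (h_out*w_out), output: M x (h_out*w_out).
    void conv_forward(cublasHandle_t handle, const float* im,
                      const float* filters, float* col, float* out,
                      int M, int C, int H, int W, int K, int pad, int stride) {
        int h_out = (H + 2 * pad - K) / stride + 1;
        int w_out = (W + 2 * pad - K) / stride + 1;
        int n = C * h_out * w_out;
        im2col_kernel<<<(n + 255) / 256, 256>>>(im, C, H, W, K, pad, stride,
                                                h_out, w_out, col);
        int N = h_out * w_out, Kd = C * K * K;
        const float alpha = 1.0f, beta = 0.0f;
        // cuBLAS is column-major, so compute out^T = col^T * filters^T.
        cublasSgemm(handle, CUBLAS_OP_N, CUBLAS_OP_N, N, M, Kd,
                    &alpha, col, N, filters, Kd, &beta, out, N);
    }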
The gap
• Is it really fast?
[Chart: blue = Caffe (ImageNet model, from the ILSVRC12 challenge), red = cuBLAS SGEMM routine, green = peak of a K20]
How does cuBLAS SGEMM perform?
[Chart: cuBLAS SGEMM performance]
Easiest way to narrow the gap
• To overcome the low efficiency of SGEMM at small scales: instead of running image-to-column and a small GEMM once per single image inside the batch loop, run image-to-column over batch-coalesced images and issue one large GEMM per mini-batch (see the sketch below)
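A sketch of the batch-coalesced variant, reconstructed from the slide's diagram rather than taken from the talk's code: widen the column buffer so every image in the mini-batch lands in its own column range, then issue one large GEMM with N = batch x out_h x out_w, where SGEMM efficiency is much higher.

    #include <cublas_v2.h>
    #include <cuda_runtime.h>

    // im2col over the whole mini-batch: image b's columns occupy columns
    // [b*spatial, (b+1)*spatial) of one (C*K*K) x (batch*spatial) matrix.
    __global__ void im2col_batched(const float* ims, int batch, int channels,
                                   int height, int width, int ksize, int pad,
                                   int stride, int h_out, int w_out,
                                   float* data_col) {
        int index = blockIdx.x * blockDim.x + threadIdx.x;
        int spatial = h_out * w_out;
        if (index >= batch * channels * spatial) return;
        int w = index % w_out;
        int h = (index / w_out) % h_out;
        int c = (index / spatial) % channels;
        int b = index / (spatial * channels);
        int ld = batch * spatial;              // width of the coalesced matrix
        float* col = data_col + (c * ksize * ksize) * ld + b * spatial
                   + h * w_out + w;
        const float* im = ims + (b * channels + c) * height * width;
        for (int i = 0; i < ksize; ++i)
            for (int j = 0; j < ksize; ++j) {
                int hi = h * stride - pad + i, wi = w * stride - pad + j;
                *col = (hi >= 0 && hi < height && wi >= 0 && wi < width)
                           ? im[hi * width + wi] : 0.0f;
                col += ld;
            }
    }

    // One large GEMM per mini-batch instead of `batch` small ones. Note
    // the output comes out as M x (batch*spatial); restoring a per-image
    // NCHW layout would take one extra reordering pass.
    void conv_forward_fast(cublasHandle_t handle, const float* ims,
                           const float* filters, float* col, float* out,
                           int batch, int M, int C, int H, int W,
                           int K, int pad, int stride) {
        int h_out = (H + 2 * pad - K) / stride + 1;
        int w_out = (W + 2 * pad - K) / stride + 1;
        int spatial = h_out * w_out, Kd = C * K * K;
        int n = batch * C * spatial;
        im2col_batched<<<(n + 255) / 256, 256>>>(ims, batch, C, H, W, K, pad,
                                                 stride, h_out, w_out, col);
        int N = batch * spatial;
        const float alpha = 1.0f, beta = 0.0f;
        cublasSgemm(handle, CUBLAS_OP_N, CUBLAS_OP_N, N, M, Kd,
                    &alpha, col, N, filters, Kd, &beta, out, N);
    }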
Performance of fast mode
[Chart: Titan Black, mini-batch size 256]
Moving forward
• How is cuBLAS SGEMM implemented?
Use high-performance SGEMM routines
• Example: ImageNet convolution layer "conv5": M = 96, N = 3025, K = 363
• cuBLAS uses sgemm_64x16x64x16x16, which is slow!
• We use sgemm_128x8x128x16x16 to get the same result, 1.54x faster on a K20!
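For context, the GEMM shape falls out of the layer geometry: M is the number of filters, N the number of output pixels, and K the receptive-field size. The quoted numbers factor as 3025 = 55 x 55 and 363 = 3 x 11 x 11; the concrete layer parameters below are inferred from that arithmetic, not stated in the talk.

    #include <cstdio>

    int main() {
        // GEMM shape of a conv layer treated as im2col + SGEMM:
        // M = #filters, N = out_h * out_w, K = channels * ksize * ksize.
        int filters = 96, channels = 3, ksize = 11, out_h = 55, out_w = 55;
        printf("M=%d N=%d K=%d\n", filters, out_h * out_w,
               channels * ksize * ksize);   // prints M=96 N=3025 K=363
        return 0;
    }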
Implement our own conv layer
• Auto-generate GPU kernels for convolution layers
• Kernels are implemented in PTX assembly
• Example: conv2 from Alex's net (Height = 16; Width = 16; Channel = 5; Stride = 1; Ksize = 5; Pad = 2; Neuron = 32)
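For flavor, this is what PTX-level programming looks like from CUDA C (a generic inline-PTX illustration, not the generator's actual output):

    // Issue an FFMA explicitly via inline PTX; the auto-generated
    // kernels are written at this level rather than in CUDA C.
    __global__ void fma_ptx(const float* a, const float* b, float* c, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) {
            float d;
            asm("fma.rn.f32 %0, %1, %2, %3;"
                : "=f"(d) : "f"(a[i]), "f"(b[i]), "f"(c[i]));
            c[i] = d;
        }
    }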
Is PTXAS good enough?
• Problem
  – Register usage
  – Manipulating the "control code" on Kepler
• Our own assembler for Kepler
  – Probe native instructions
  – Probe control instructions
  – Ongoing
• Some users need a native assembler, please!

Snippet of instructions from an sgemm kernel, sm_35:

    /* 0x 09 00 10 1c1c 10 1c1c */
    FFMA R23, R83, R84, R23;
    FFMA R33, R88, R84, R33;
    FFMA R36, R88, R85, R36;
    NOP;
    FFMA R45, R89, R84, R45;
    FFMA R32, R89, R85, R32;
    NOP;
    /* 0x 08 80 10 14 10 14 10 14 */
    FFMA R5, R80, R86, R5;
    FFMA R2, R81, R86, R2;
    FFMA R14, R81, R87, R14;
    FFMA R7, R80, R92, R7;
    FFMA R3, R80, R87, R3;
    FFMA R8, R81, R92, R8;
    NOP;
Other ongoing work
• Convert models from single-precision floating point to (sketched below)
  – half precision (Maxwell)
  – flexible fixed point (FPGA)
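A minimal sketch of what such conversions could look like in CUDA (generic fp16 intrinsics and a hypothetical Q-format quantizer, not the talk's actual pipeline):

    #include <cuda_fp16.h>

    // Single-precision weights -> half precision (round to nearest even).
    __global__ void float_to_half(const float* src, __half* dst, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) dst[i] = __float2half(src[i]);
    }

    // Single-precision weights -> 16-bit fixed point with a configurable
    // number of fractional bits (saturating), as an FPGA front end might use.
    __global__ void float_to_fixed(const float* src, short* dst, int n,
                                   int frac_bits) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) {
            float q = rintf(src[i] * (float)(1 << frac_bits));
            dst[i] = (short)fminf(fmaxf(q, -32768.0f), 32767.0f);
        }
    }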
Thank You
• Download the mobile app at taobao.com and try out Buy-by-Photo