MIXED PRECISION TRAINING

Oct 25, 2022

MIXED PRECISION TRAINING Michael O’Connor
MIXED PRECISION
What is the benefit?
Using mixed precision and Volta, your networks can be:
1. 3-4x faster
2. lighter on memory consumption and bandwidth pressure
3. just as powerful, with no architecture change
A MIXED PRECISION SOLUTION
Imprecise weight updates -> "Master" weights in FP32
Gradients underflow -> Loss (gradient) scaling
Maintain precision -> Accumulate to FP32 (Tensor Cores)
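The first two problems on this slide can be reproduced with a few lines of arithmetic. The following is a minimal sketch, not code from the talk: it assumes Python's standard `struct` module (format `e`) as a stand-in for FP16 hardware, and the helper name `to_fp16`, the sample values, and the scale factor of 1024 are all illustrative choices.

```python
import struct

def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

# Problem 1: imprecise weight updates.
# Just above 1.0, adjacent FP16 values are 2**-10 (about 0.00098) apart,
# so an update of about 0.0001 is rounded away entirely.
weight = to_fp16(1.0)
update = to_fp16(1e-4)
update_lost = to_fp16(weight + update) == weight

# Problem 2: gradients underflow.
# The smallest positive FP16 value is 2**-24 (about 6e-8);
# a gradient of 1e-8 rounds to zero.
tiny_grad = 1e-8
underflowed = to_fp16(tiny_grad) == 0.0

# Loss scaling: multiply before storing in FP16, divide afterwards in
# higher precision. The scale of 1024 is an assumed example value.
scale = 1024.0
scaled_grad = to_fp16(tiny_grad * scale)  # now representable in FP16
recovered = scaled_grad / scale           # unscaled outside FP16

print(update_lost, underflowed, scaled_grad > 0.0)
```

The recovered gradient is not bit-exact, because the scaled value still lands in the FP16 subnormal range, but it is no longer zero.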
MIXED SOLUTION: FP32 MASTER WEIGHTS
[Diagram: FP32 Master Weights -> Copy -> FP16 Weights -> Forward Pass -> FP16 Loss -> FP16 Gradients -> FP32 Master Gradients -> Apply -> FP32 Master Weights]
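A toy comparison can illustrate why the update is applied to an FP32 master copy. This is a sketch under stated assumptions, not the speaker's implementation: `to_fp16`, `to_fp32`, and `train` are hypothetical helpers, rounding is emulated with the standard `struct` module, and the learning rate, gradient, and step count are made-up values.

```python
import struct

def to_fp16(x: float) -> float:
    """Round to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

def to_fp32(x: float) -> float:
    """Round to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def train(steps: int, lr: float, grad: float, use_master: bool) -> float:
    """Apply `steps` identical SGD updates and return the final FP16 weight."""
    master = to_fp32(1.0)   # FP32 master weight
    w16 = to_fp16(master)   # FP16 copy used by the forward and backward pass
    for _ in range(steps):
        g16 = to_fp16(grad)  # gradient as produced in FP16
        if use_master:
            # Convert the gradient to FP32, apply it to the master weight,
            # then refresh the FP16 copy.
            master = to_fp32(master - lr * to_fp32(g16))
            w16 = to_fp16(master)
        else:
            # Apply the update directly to the FP16 weight.
            w16 = to_fp16(w16 - lr * g16)
    return w16

fp16_only = train(1000, lr=0.01, grad=0.01, use_master=False)
with_master = train(1000, lr=0.01, grad=0.01, use_master=True)
print(fp16_only, with_master)
```

In the FP16-only run each update of roughly 0.0001 is smaller than half the spacing between FP16 values near 1.0, so the weight never moves. With the master copy, the small updates accumulate in FP32 and the FP16 weight ends up close to 0.9.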
GRADIENTS RANGE OFFSET
MIXED PRECISION TRAINING
[Diagram: FP32 Master Weights -> Copy -> FP16 Weights -> Forward Pass -> FP32 Loss -> Loss Scaling -> Scaled FP32 Loss -> Scaled FP16 Gradients -> Scaled FP32 Gradients -> Remove scale (+clip, etc.) -> FP32 Gradients -> Apply -> FP32 Master Weights]
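The loop named on this slide can be sketched as a single training step on a one-weight toy model. Everything here is an assumption made for illustration: the helpers `to_fp16`, `to_fp32`, `backward_fp16`, and `train_step`, the squared-error loss, the clip threshold, and the input values (chosen as powers of two so the arithmetic is exact). Precision is emulated with the standard `struct` module rather than real FP16 hardware.

```python
import struct

def to_fp16(x: float) -> float:
    return struct.unpack("<e", struct.pack("<e", x))[0]

def to_fp32(x: float) -> float:
    return struct.unpack("<f", struct.pack("<f", x))[0]

def backward_fp16(w16: float, x: float, y: float, loss_scale: float) -> float:
    """Scaled FP16 gradient of loss = 0.5 * (w*x - y)**2 with respect to w.

    Seeding backprop with loss_scale instead of 1 scales every gradient.
    """
    err = to_fp16(w16 * x - y)
    upstream = to_fp16(err * loss_scale)
    return to_fp16(upstream * x)

def train_step(master_w, x, y, lr, loss_scale, clip=1.0):
    w16 = to_fp16(master_w)                            # copy master -> FP16 weights
    scaled_g16 = backward_fp16(w16, x, y, loss_scale)  # forward + scaled backward
    scaled_g32 = to_fp32(scaled_g16)                   # scaled FP32 gradients
    g32 = to_fp32(scaled_g32 / loss_scale)             # remove scale
    g32 = max(-clip, min(clip, g32))                   # (+clip, etc.)
    return to_fp32(master_w - lr * g32)                # apply to FP32 master weights

x, y = 2.0 ** -13, 2.0 ** -14  # true gradient magnitude is 2**-27, below the FP16 range
w_unscaled = train_step(0.0, x, y, lr=1.0, loss_scale=1.0)
w_scaled = train_step(0.0, x, y, lr=1.0, loss_scale=1024.0)
print(w_unscaled, w_scaled)
```

Without scaling, the gradient flushes to zero in FP16 and the master weight does not change. With a scale of 1024 the gradient survives the FP16 backward pass and is restored to its true magnitude in FP32 before the update.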
NVCAFFE V0.16 TRAINING ALEXNET
[Chart: images per second from June 2016 to May 2017. Starting point: nvCaffe 0.15 @ 1265. Highest labeled value: 2568.]
Optimizations labeled on the chart:
Balance memory alloc between I/O & conv w.s.
Parallelize I/O decode & deserialize
Improved algo selection
CPU Affinity
Fused weight update
Parallel AllReduce
Single P100 GPU, Batch Size=128
RESNET-50 FP32 PERFORMANCE
[Chart: images per second (0 to 2000) on 1, 2, 4, and 8 GPUs for Caffe, Caffe2, TensorFlow, MXNet, Torch, CNTK, and Chainer]
4/30/2017: DGX-1 with Batch Size=64 per GPU. Chainer numbers are preliminary.
RESNET-50 MIXED PRECISION AND FP32
[Chart: images per second (0 to 7000) on 1, 2, 4, and 8 GPUs for MXNet FP32 (GTC 2017), MXNet FP32 (GTC 2018), and MXNet Mixed (GTC 2018)]
INFORMATION SOURCES
Where to learn about mixed precision training
CE8130 - Connect with the Experts: Deep Learning Training for Volta Tensor Cores (Tu 2PM)
S8923 - Training Neural Networks with Mixed Precision: Theory and Practice (Wed 2PM)
S81012 - Training Neural Networks with Mixed Precision: Real Examples (Th 9 AM)
CE8162 - Connect with the Experts: Deep Learning Training for Volta Tensor Cores (Th 2PM)
Mixed-Precision Training of Deep Neural Networks (NVIDIA Developer Blog)
Training with Mixed Precision (NVIDIA User Guide)
