DYNAMIC PRECISION NUMERICS USING A VARIABLE-PRECISION UNUM TYPE I HW COPROCESSOR ARITH’26 | BOCCO Andrea | 11 June 2019

INTRODUCTION: STATE OF THE ART
➢ Variable Precision (VP) computing has been investigated to improve the convergence of algorithms. It has been investigated in:
  ▪ Software (SW): GMP [2] and MPFR [3]
    ▪ Slow: they might not meet the requirements of high-speed applications
  ▪ Hardware (HW):
    ▪ Kulisch [4]: large fixed-point accumulator
    ▪ Schulte and Swartzlander [5]: mantissas divided into multiple words
➢ None of the previous works shows how to efficiently store VP Floating Point (FP) numbers in main memory
  ▪ They support the IEEE 754 [1] FP format in main memory
[1] IEEE. 2008. IEEE Standard for Floating-Point Arithmetic. IEEE 754-2008. https://doi.org/10.1109/IEEESTD.2008.4610935
[2] Torbjörn Granlund and the GMP development team. 2012. GNU MP: The GNU Multiple Precision Arithmetic Library. https://gmplib.org/
[3] Laurent Fousse, et al. MPFR: A Multiple-Precision Binary Floating-Point Library with Correct Rounding. https://doi.org/10.1145/1236463.1236468
[4] Ulrich Kulisch. 2013. Computer Arithmetic and Validity: Theory, Implementation, and Applications.
[5] M. J. Schulte and E. E. Swartzlander. 2000. A family of variable-precision interval arithmetic processors. https://doi.org/10.1109/12.859535

INTRODUCTION: MY WORK
Our previous work [6]: a VP FP hardware accelerator that:
• Supports the UNUM type I format in main memory
• Does computation internally with another (hardware-friendly) FP format
• Supports Interval Arithmetic (IA)
[Figure: Rocket Chip block diagram. Labels: Rocket tile, RISC-V Rocket core, FPU, LSU, L1 caches, RAM, RoCC, UNUM co-processor with its LSU and scratchpad; callouts 1 to 5.]
This work:
▪ Refines the UNUM type I FP format.
▪ Proposes a new VP FP architecture.
▪ Proposes a new programming model.
▪ Benchmarks our system.
[6] A. Bocco, Y. Durand, F. Dinechin. 2019. SMURF: Scalar Multiple-precision UNUM RISC-V Floating-point Accelerator for Scientific Computing.

OUTLINE
• Choice of the memory format: the UNUM type I
• Refinements on the UNUM type I FP format
• The adopted VP FP architecture
• The programming model
• System benchmark: Gauss elimination solver
• Conclusions

CHOICE OF THE MEMORY FORMAT: THE UNUM TYPE I
We decided to use the UNUM type I FP format in main memory
• It is a self-descriptive FP format with 6 sub-fields, 3 more than conventional IEEE 754 FP numbers:
  s (sign) | e (exponent, es bits) | f (fraction, fs bits) | u (ubit) | es-1 (exponent size) | fs-1 (fraction size)
• WHY?
  • UNUM is a VP FP format
  • It self-encodes the exponent and fraction field lengths
However, UNUM type I has some peculiarities to be fixed:
• How to organize UNUM arrays in main memory
• How to organize the UNUM fields in memory
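Because the two size fields are stored with the number, the total bit length of a UNUM follows directly from them. The sketch below illustrates that arithmetic. It assumes the usual UNUM type I environment parameters, here called `ess` and `fss` (the widths of the es-1 and fs-1 fields); the function names are hypothetical and are not taken from the slides.

```python
def unum_bits(es: int, fs: int, ess: int, fss: int) -> int:
    """Bit length of one UNUM: s + e + f + u + (es-1 field) + (fs-1 field)."""
    # The es-1 field is ess bits wide, so es ranges over 1 .. 2**ess.
    assert 1 <= es <= 2 ** ess, "es does not fit in the es-1 field"
    # The fs-1 field is fss bits wide, so fs ranges over 1 .. 2**fss.
    assert 1 <= fs <= 2 ** fss, "fs does not fit in the fs-1 field"
    return 1 + es + fs + 1 + ess + fss


def max_unum_bits(ess: int, fss: int) -> int:
    """Largest bit length reachable in the (ess, fss) environment."""
    return unum_bits(2 ** ess, 2 ** fss, ess, fss)


if __name__ == "__main__":
    print(unum_bits(1, 1, ess=3, fss=4))   # smallest UNUM in this environment
    print(max_unum_bits(ess=3, fss=4))     # largest UNUM in this environment
```

The spread between the two printed values is what makes the memory layout questions listed above non-trivial: elements of the same array may differ in length.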

REFINEMENTS ON THE UNUM TYPE I FP FORMAT: UNUM FIELD ORGANIZATION
For a UNUM/ubound that spans multiple addresses in main memory, it is important to have the descriptor fields present in the lower addresses.
➢ We have re-organized the order of the fields for UNUM and ubound (listed from LSB to MSB):
  ▪ UNUM: s | u | es-1 | fs-1 | e | f
  ▪ ubound: left (s | u | es-1 | fs-1) | right (s | u | es-1 | fs-1) | left e | right e | left f | right f
[Figure: two p-bit-wide memory maps, from address 00--00 to FF--FF, showing U1 at @1' and U2 at @2'.]
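The benefit of placing the descriptor fields first can be sketched in a few lines: once the low-order bits are read, the total length of the number is known before the rest is fetched. This is only an illustration under stated assumptions. The field widths `ess` and `fss` and both function names are hypothetical, and the packing is done on a Python integer rather than on real memory words.

```python
def pack_unum(s, u, es, fs, e, f, ess, fss):
    """Pack the fields LSB-first in the order s, u, es-1, fs-1, e, f.

    Returns (word, bit_length).
    """
    word, pos = 0, 0
    for value, width in ((s, 1), (u, 1), (es - 1, ess), (fs - 1, fss),
                         (e, es), (f, fs)):
        assert 0 <= value < (1 << width), "field value does not fit"
        word |= value << pos
        pos += width
    return word, pos


def read_descriptor(low_bits, ess, fss):
    """Decode s, u, es, fs and the total length from the low-order bits only."""
    s = low_bits & 1
    u = (low_bits >> 1) & 1
    es = ((low_bits >> 2) & ((1 << ess) - 1)) + 1
    fs = ((low_bits >> (2 + ess)) & ((1 << fss) - 1)) + 1
    total_bits = 2 + ess + fss + es + fs
    return s, u, es, fs, total_bits


if __name__ == "__main__":
    word, nbits = pack_unum(1, 0, 3, 5, 0b101, 0b10011, ess=3, fss=4)
    # Only the first 2 + ess + fss = 9 bits are needed to learn the length.
    print(read_descriptor(word & 0x1FF, ess=3, fss=4), nbits)
```

With the exponent and fraction first, as in the original field order, the same information would sit at a position that depends on the very lengths being looked up.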

REFINEMENTS ON THE UNUM TYPE I FP FORMAT: UNUM ARRAY ORGANIZATION
Handling a two-element UNUM array in main memory with p-bit parallelism
[Figure: U1 spans two p-bit words (U1_0, U1_1) and U2 spans three (U2_0, U2_1, U2_2). Two p-bit-wide memory maps, from address 00--00 to FF--FF, show U1 at @1', U2 at @2' or @2'', and the three-word result U3 = U1*U2 (U3_0, U3_1, U3_2).]
Array support: guarantee an affine addressing scheme

THE ADOPTED VP FP ARCHITECTURE
• 1 integer register file (iRF): 32 integer general-purpose registers (GPRs) + pc, in the main processor.
• 1 g-bound register file (gRF): 32 entries, in the co-processor.
• UNUMs/u-bounds are strictly considered as memory formats:
  • Load operations: load UNUMs/u-bounds from main memory and convert them into internal g-bounds.
  • Store operations: convert internal g-bounds (entries of the internal gRF) into u-bounds and store the latter in main memory.
• The coprocessor's internal parallelism is fixed to 64 bits
• Coprocessor's status registers:
  • DUE
  • SUE
  • MBB (new)
  • WGP
[Figure: Rocket Chip block diagram. Labels: Rocket tile, RISC-V Rocket core, FPU, LSU, L1 caches, RAM, RoCC, UNUM co-processor with its LSU and scratchpad; callouts 1 to 5.]

THE MBB: MAXIMUM BYTE BUDGET
The UNUM format is variable length (up to a maximum length)
▪ It is impossible to have compacted arrays with random access to their elements
➢ We define the Maximum Byte Budget (MBB) as the maximum length that a UNUM number can have in main memory
[Figure: LSU data path relating the g-bounds g0 to g4, the u-bounds u0 to u4 and the MBB-bounded values u'0 to u'4 through the G2U and BMF stages.]
➢ The user can address VP FP numbers by specifying their length with byte granularity.
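The addressing argument on this slide can be made concrete with a small sketch. With a fixed byte budget per element, the address of element i is an affine function of i; in a compacted array it depends on the lengths of all preceding elements. The function names and the example lengths below are hypothetical and chosen only for illustration.

```python
def mbb_address(base: int, index: int, mbb: int) -> int:
    """Affine addressing: every element occupies exactly `mbb` bytes."""
    return base + index * mbb


def compact_address(base: int, lengths: list, index: int) -> int:
    """Compacted array: must sum the byte lengths of all earlier elements."""
    return base + sum(lengths[:index])


if __name__ == "__main__":
    base = 0x1000
    lengths = [5, 9, 7, 12]          # hypothetical per-element byte lengths
    mbb = max(lengths)               # budget large enough for every element
    print(hex(mbb_address(base, 3, mbb)))
    print(hex(compact_address(base, lengths, 3)))
```

The first function needs only the index; the second needs a walk over the array, which is what rules out random access on compacted storage.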

THE BMF: BOUNDED MEMORY FORMAT
Case a: MBB >= max unum length (field widths: es-1 on ess' bits, fs-1 on fss' bits, e on es_max bits, f on fs_max bits; the remaining bits up to MBB*8 are unused)
      s  u  es-1   fs-1   e      f         value
  1a) 0  1  1--1   1--1   1--1   1-----1   qNaN
  2a) 1  1  1--1   1--1   1--1   1-----1   sNaN
  3a) 0  0  1--1   1--1   1--1   1-----1   +∞↓
  4a) 1  0  1--1   1--1   1--1   1-----1   -∞↓
  5a) 0  1  1--1   1--1   1--1   1----10   +∞) right
  6a) 1  1  1--1   1--1   1--1   1----10   (-∞ left
  7a) 0  1  es-1   fs-1   1--1   1-----1   +∞) right
  8a) 1  1  es-1   fs-1   1--1   1-----1   (-∞ left
  9a) s  u  es-1   fs-1   e      f   x
Case b: MBB < max unum length (field widths: es-1 on ess'' bits, fs-1 on fss'' bits, e on es bits, f on fs bits; the remaining bits up to MBB*8 are unused)
      s  u  es-1   fs-1   e      f         value
  1b) 0  1  1--1   1--1                    qNaN
  2b) 1  1  1--1   1--1                    sNaN
  3b) 0  0  1--1   1--1                    +∞↓
  4b) 1  0  1--1   1--1                    -∞↓
  5b) 0  1  es-1   fs-1   1--1   1-----1   +∞) right
  6b) 1  1  es-1   fs-1   1--1   1-----1   (-∞ left
  7b) s  u  es-1   fs-1   e      f   x

THE COPROCESSOR PROGRAMMING MODEL
Our hardware is best suited for VP kernels which exploit three different storage types:
• The external (main memory) storage
• The intermediate (L1 cache) storage
• The internal (register-level) storage
Example kernel solving Ax = b (the legend distinguishes the outermost, intermediate and innermost loops):
  k = 0
  while convergence not reached do
    for i := 1:n do
      σ = 0
      for j := 1:n do
        if j ≠ i then
          σ += a_ij x_j^(k)
        end
      end
      x_i^(k+1) = (1 / a_ii) (b_i − σ)
    end
    k = k + 1
  end
[Figure: Rocket tile block diagram (RISC-V core, FPU, LSU, L1 cache, RAM, RoCC, UNUM co-processor with LSU and scratchpad) with callouts 1, 2 and 3.]
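The kernel on this slide has the form of a Jacobi iteration. As a rough software stand-in, the sketch below runs the same three nested loops with Python's `decimal` module, whose working precision can be chosen at run time. This is an assumption made for illustration only: it does not model the UNUM formats, the g-bound registers or the co-processor, and the function name, the default tolerance and the stopping test are hypothetical.

```python
from decimal import Decimal, localcontext


def jacobi(A, b, digits=50, max_iter=1000):
    """Solve Ax = b by Jacobi iteration at `digits` decimal digits."""
    with localcontext() as ctx:
        ctx.prec = digits                       # run-time selectable precision
        n = len(b)
        A = [[Decimal(v) for v in row] for row in A]
        b = [Decimal(v) for v in b]
        tol = Decimal(10) ** -(digits - 5)      # hypothetical stopping threshold
        x = [Decimal(0)] * n
        for _ in range(max_iter):               # outermost loop (index k)
            x_new = []
            for i in range(n):                  # intermediate loop
                sigma = Decimal(0)
                for j in range(n):              # innermost loop
                    if j != i:
                        sigma += A[i][j] * x[j]
                x_new.append((b[i] - sigma) / A[i][i])
            converged = max(abs(p - q) for p, q in zip(x_new, x)) < tol
            x = x_new
            if converged:
                break
        return x


if __name__ == "__main__":
    # Diagonally dominant system whose exact solution is x = (1, 2).
    for value in jacobi([[4, 1], [1, 3]], [6, 7], digits=50):
        print(value)
```

In terms of the three storage types listed above, `sigma` plays the role of a register-level value, the row being processed is the natural candidate for intermediate storage, and `A`, `b` and `x` live in main memory.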

SYSTEM BENCHMARK: GAUSS ELIMINATION SOLVER
Our system, benchmarked with a Gauss elimination solver in both UNUM (scalar) and ubound (interval) modes, showed:
• A gain of up to 65 decimal digits over IEEE double
  • The result precision is constrained by the adopted precision in memory.
  • Intervals do not always converge, but they are useful for estimating the computational error (Ax-b).
• A speed-up of 4-10x with respect to the MPFR software library
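The error estimate Ax-b mentioned above is only meaningful if it is evaluated with more precision than the solution itself. A minimal sketch of that idea, using exact rational arithmetic from Python's `fractions` module as a stand-in for a higher-precision format, is given below; the function name is hypothetical and the example is unrelated to the benchmark data.

```python
from fractions import Fraction


def residual(A, x, b):
    """Return the residual Ax - b, evaluated exactly with rationals."""
    return [
        sum(Fraction(a) * Fraction(xj) for a, xj in zip(row, x)) - Fraction(bi)
        for row, bi in zip(A, b)
    ]


if __name__ == "__main__":
    A = [[2, 1], [1, 3]]
    b = [4, 7]
    print(residual(A, [1, 2], b))      # exact solution
    print(residual(A, [1, 2.5], b))    # perturbed solution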
