AN ANALYSIS AND SYNTHESIS OF FLOATING-POINT ROUTINES
Zvonimir Rakamarić
FLOATING-POINT COMPUTATIONS ARE UBIQUITOUS
CHALLENGES
- FP is "weird"
  - Does not faithfully match real arithmetic (finite precision)
  - Non-associative
  - Heterogeneous hardware support
- FP code is hard to get right
  - Lack of good understanding
  - Lack of good and extensive tool support
- FP software is large and complex
  - High-performance computing (HPC) simulations
  - Machine learning
FP IS WEIRD
- Finite precision and rounding: x + y in reals ≠ x + y in floating-point
- Non-associative: (x + y) + z ≠ x + (y + z)
- Creates issues with
  - Compiler optimizations (e.g., vectorization)
  - Concurrency (e.g., reductions)
- The standard completely specifies only +, -, *, /, comparison, remainder, and square root
  - Only recommendations for some functions (e.g., trigonometric)
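Both pitfalls are easy to reproduce; a minimal sketch in Python, whose floats are IEEE 754 doubles:

```python
# Finite precision: x + y in reals != x + y in floating-point.
# The nearest doubles to 0.1 and 0.2 do not sum to the nearest double to 0.3.
assert 0.1 + 0.2 != 0.3

# Non-associativity: (x + y) + z != x + (y + z).
x, y, z = 1e16, -1e16, 1.0
assert (x + y) + z == 1.0   # the huge terms cancel exactly first
assert x + (y + z) == 0.0   # the 1.0 is absorbed: -1e16 + 1.0 rounds back to -1e16
assert (x + y) + z != x + (y + z)
```

This is exactly why a compiler cannot freely reorder a reduction: the reassociated sum may produce a different result.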
FP IS WEIRD (cont.)
- Heterogeneous hardware support: x + y*z on Xeon ≠ x + y*z on Xeon Phi
  - Fused multiply-add
  - Intel's online article "Differences in Floating-Point Arithmetic Between Intel Xeon Processors and the Intel Xeon Phi Coprocessor"
- Common sense does not (always) work
  - x "is better than" log(e^x)
  - (e^x - 1)/x "can be worse than" (e^x - 1)/log(e^x)
  - Error cancellation
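The counterintuitive second claim can be checked directly; a small sketch with x chosen near machine epsilon, where the true value of (e^x - 1)/x is 1.0 to double precision:

```python
import math

x = 1e-15
# Naive form: the rounding error in exp(x) is amplified by the division by x.
naive = (math.exp(x) - 1.0) / x
# "Uglier" form: the errors in numerator and denominator are correlated and cancel.
clever = (math.exp(x) - 1.0) / math.log(math.exp(x))

# The naive form is off by roughly 10% here; the clever form is accurate.
assert abs(clever - 1.0) < abs(naive - 1.0)
```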
FLOATING-POINT NUMBERS
- IEEE 754 standard
- Sign (s), mantissa (m), exponent (exp): (-1)^s * 1.m * 2^exp
- Single precision: 1, 23, 8 bits
- Double precision: 1, 52, 11 bits
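The layout can be inspected directly; a sketch using Python's struct module to pull the three bit fields out of a double (1 sign bit, 11 exponent bits, 52 mantissa bits):

```python
import struct

def decompose(x):
    """Split a double into (sign, biased exponent, mantissa) bit fields."""
    bits = struct.unpack('<Q', struct.pack('<d', x))[0]
    sign = bits >> 63
    exponent = (bits >> 52) & 0x7FF       # 11 bits, biased by 1023
    mantissa = bits & ((1 << 52) - 1)     # 52 bits, implicit leading 1
    return sign, exponent, mantissa

# 1.5 = (-1)^0 * 1.1b * 2^0: sign 0, biased exponent 1023, mantissa 100...0b
assert decompose(1.5) == (0, 1023, 1 << 51)
# -2.0 = (-1)^1 * 1.0b * 2^1: sign 1, biased exponent 1024, mantissa 0
assert decompose(-2.0) == (1, 1024, 0)
```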
FLOATING-POINT NUMBER LINE
- 3 bits for precision
- Between any two consecutive powers of 2, there are 2^3 = 8 representable numbers
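With a 3-bit mantissa these values can be enumerated directly; a quick sketch showing that the count stays fixed while the spacing doubles with each binade:

```python
# Numbers in [1, 2) with a 3-bit mantissa have the form 1.m * 2^0 = 1 + k/8,
# so there are 2^3 = 8 representable values between consecutive powers of 2.
values = [1 + k / 8 for k in range(8)]
assert len(values) == 8
assert values[0] == 1.0 and values[-1] == 1.875

# In [2, 4) there are again 8 values, but the spacing doubles to 1/4.
values2 = [2 + k / 4 for k in range(8)]
assert len(values2) == 8 and values2[-1] == 3.75
```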
ROUNDING IS A SOURCE OF ERRORS
[Figure: real values y and z on the real number line (-∞ to ∞) are rounded to the nearest floating-point values ŷ and ẑ, introducing rounding errors (ŷ − y) and (ẑ − z)]
FLOATING-POINT OPERATIONS
- First normalize to the same exponent
  - Smaller exponent -> shift mantissa right
- Then perform the operation
- Losing bits when exponents are not the same!
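The bit loss from exponent alignment is visible in ordinary doubles; a minimal sketch:

```python
# Doubles carry 53 significant bits. When the exponents of the operands differ
# by more than that, the smaller mantissa is shifted entirely out of range.
big, small = 2.0 ** 53, 1.0
assert big + small == big            # 1.0 is lost: the sum rounds back to 2^53
assert (big + small) - big == 0.0    # ...so subtraction cannot recover it
assert (big + 2.0) - big == 2.0      # 2.0 survives: 2^53 + 2 is representable
```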
UTAH FLOATING-POINT TEAM
1. Ganesh Gopalakrishnan (prof)
2. Zvonimir Rakamarić (prof)
3. Ian Briggs (staff programmer)
4. Mark Baranowski (PhD)
5. Rocco Salvia (PhD)
6. Shaobo He (PhD)
7. Thanhson Nguyen (PhD)
Alumni: Alexey Solovyev (postdoc), Wei-Fan Chiang (PhD), Dietrich Geisler (undergrad), Liam Machado (undergrad)
RESEARCH THRUSTS
- Analysis
  - Verification of floating-point programs
  - Estimation of floating-point errors
    1. Dynamic: best effort, produces a lower bound (under-approximation)
    2. Static: rigorous, produces an upper bound (over-approximation)
- Synthesis
  - Rigorous mixed-precision tuning
- Constraint Solving
  - Search-based solving of floating-point constraints
  - Solving mixed real and floating-point constraints
ERROR ANALYSIS
FLOATING-POINT ERROR
- Input values: x, y
- Finite precision: z_fp = f_fp(x, y)
- Infinite precision: z_inf = f_inf(x, y) ≠ z_fp
- Absolute error: |z_fp − z_inf|
- Relative error: |(z_fp − z_inf) / z_inf|
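Both error measures can be computed against an exact reference; a sketch using Python's Fraction as the "infinite precision" side, for the illustrative function f(x, y) = x / (x + y):

```python
from fractions import Fraction

def errors(x, y):
    """Absolute and relative error of the double computation of x/(x+y)."""
    z_fp = x / (x + y)                                   # finite precision
    z_inf = Fraction(x) / (Fraction(x) + Fraction(y))    # exact rational arithmetic
    abs_err = abs(Fraction(z_fp) - z_inf)
    rel_err = abs_err / abs(z_inf)
    return abs_err, rel_err

abs_err, rel_err = errors(0.1, 0.3)
# Two correctly rounded double operations: relative error stays within a few
# units of 2^-53.
assert rel_err < Fraction(3, 2 ** 53)
```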
ERROR PLOT FOR MULTIPLICATION
[Figure: absolute error surface over X and Y input values]
ERROR PLOT FOR ADDITION
[Figure: absolute error surface over X and Y input values]
USAGE SCENARIOS
- Reason about floating-point computations
- Precisely characterize floating-point behavior of libraries
- Support performance-precision tuning and synthesis
- Help decide where error compensation is needed
- "Equivalence" checking
STATIC ANALYSIS http://github.com/soarlab/FPTaylor
CONTRIBUTIONS
- Handles non-linear and transcendental functions
- Tight error upper bounds
  - Better than previous work
- Rigorous over-approximation
  - Based on our own rigorous global optimizer
- Emits a HOL Light proof certificate
  - Verification of the certificate guarantees the estimate
- Tool called FPTaylor publicly available
FPTaylor TOOLFLOW
Given FP expression and input intervals -> Obtain symbolic Taylor form -> Obtain error function -> Maximize the error function -> Generate certificate in HOL Light
IEEE ROUNDING MODEL
- Consider op(x, y), where x and y are floating-point values and op is a function from floats to reals
- IEEE round-off errors are specified as
  fl(op(x, y)) = op(x, y) · (1 + e_op) + d_op
  - e_op captures the relative error for normal values; d_op captures the absolute error for subnormal values
- Only one of e_op or d_op is non-zero:
  - |e_op| ≤ 2^-24, |d_op| ≤ 2^-150 (single precision)
  - |e_op| ≤ 2^-53, |d_op| ≤ 2^-1075 (double precision)
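The relative-error part of the model is easy to validate empirically for normal doubles; a sketch measuring e_op exactly (via Fraction) for multiplication, where d_op = 0:

```python
from fractions import Fraction
import random

random.seed(0)
EPS = Fraction(1, 2 ** 53)  # bound on |e_op| for double precision

for _ in range(1000):
    x = random.uniform(1.0, 2.0)
    y = random.uniform(1.0, 2.0)
    exact = Fraction(x) * Fraction(y)     # op(x, y) in the reals
    rounded = Fraction(x * y)             # fl(op(x, y)), correctly rounded by IEEE
    e_op = abs(rounded - exact) / exact   # results here are normal, so d_op = 0
    assert e_op <= EPS
```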
ERROR ESTIMATION EXAMPLE
- Model the floating-point computation of F = x / (x + y) using reals as
  F̃ = (x / ((x + y) · (1 + e1))) · (1 + e2), with |e1| ≤ ε1, |e2| ≤ ε2
- The absolute rounding error is then |F̃ − F|
- We have to find the maximum of this function over
  - the input variables x, y (exponential in the number of inputs)
  - the additional variables e1, e2 for the operators (exponential in the floating-point routine size!)
SYMBOLIC TAYLOR EXPANSION
- Reduces the dimensionality of the optimization problem
- Basic idea:
  - Treat each e_i as a "noise" (error) variable
  - Expand based on Taylor's theorem
  - Coefficients are symbolic
  - Coefficients weigh the "noise" correctly and are correlated
- Apply global optimization to the reduced problem
  - Our own parallel rigorous global optimizer called Gelpia
  - Handles non-linear reals and transcendental functions
ERROR ESTIMATION EXAMPLE (cont.)
F̃ = (x / ((x + y) · (1 + e1))) · (1 + e2)
expands into
F̃ = F + (∂F̃/∂e1)(0) · e1 + (∂F̃/∂e2)(0) · e2 + M2(e1, e2)
where M2 summarizes the second- and higher-order error terms, and |e1| ≤ ε1, |e2| ≤ ε2.
The floating-point error is then bounded by
|F̃ − F| ≤ |(∂F̃/∂e1)(0)| · ε1 + |(∂F̃/∂e2)(0)| · ε2 + M2(e1, e2)
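For this particular example the two first-order coefficients both evaluate to ±x/(x+y) = ±F, so the Taylor bound is roughly 2·|F|·ε. A sketch checking that bound against exact arithmetic (the 1% inflation is an informal stand-in for the higher-order term M2):

```python
from fractions import Fraction
import random

random.seed(1)
EPS = Fraction(1, 2 ** 53)  # double-precision machine epsilon

for _ in range(1000):
    x = random.uniform(1.0, 2.0)
    y = random.uniform(1.0, 2.0)
    F = Fraction(x) / (Fraction(x) + Fraction(y))   # exact real value
    F_fp = x / (x + y)                              # two rounded operations (e1, e2)
    # dF~/de1(0) = -F and dF~/de2(0) = F, so the first-order bound is 2*|F|*eps;
    # inflate slightly to cover the second- and higher-order terms.
    bound = 2 * abs(F) * EPS * Fraction(101, 100)
    assert abs(Fraction(F_fp) - F) <= bound
```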
ERROR ESTIMATION EXAMPLE (cont.)
- Use global optimization to find constant bounds for the symbolic coefficients
  - M2 can be easily over-approximated
- Greatly reduced problem dimensionality: search only over the inputs x, y using our Gelpia optimizer
  ∀x, y. |(∂F̃/∂e1)(0)| = |x / (x + y)| ≤ V1
- Resulting bound: |F̃ − F| ≤ V1 · ε1 + V2 · ε2 + M2(e1, e2)
ERROR ESTIMATION EXAMPLE (cont.)
- If all operations are single-precision (32 bits): |F̃ − F| ≤ V1 · ε_32bit + V2 · ε_32bit
- If all operations are double-precision (64 bits): |F̃ − F| ≤ V1 · ε_64bit + V2 · ε_64bit
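The two instantiations differ only in which machine epsilon is plugged in (ε_32bit = 2^-24 vs ε_64bit = 2^-53). A sketch contrasting them on the same example, simulating single precision by round-tripping each intermediate result through a C float:

```python
import struct
from fractions import Fraction

def to_single(x):
    """Round a double to the nearest single-precision value."""
    return struct.unpack('<f', struct.pack('<f', x))[0]

EPS32, EPS64 = Fraction(1, 2 ** 24), Fraction(1, 2 ** 53)

x, y = 1.25, 2.5
F = Fraction(x) / (Fraction(x) + Fraction(y))   # exact value: 1/3

# Double precision: each operation rounded with eps_64bit (1% slack for M2).
F64 = x / (x + y)
assert abs(Fraction(F64) - F) <= 2 * abs(F) * EPS64 * Fraction(101, 100)

# Single precision: each intermediate result rounded with eps_32bit.
F32 = to_single(to_single(x) / to_single(x + y))
assert abs(Fraction(F32) - F) <= 2 * abs(F) * EPS32 * Fraction(101, 100)
```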
RESULTS FOR JETENGINE
SUMMARY
- New method for rigorous floating-point round-off error estimation
- Our method is embodied in the new tool FPTaylor
- FPTaylor performs well and returns tighter bounds than previous approaches
SYNTHESIS http://github.com/soarlab/FPTuner
MIXED-PRECISION TUNING
Goal: given a real-valued expression and an output error bound, automatically synthesize a precision allocation for operations and variables
APPROACH
- Replace the machine epsilons with symbolic variables t1, t2 ∈ {ε_32bit, ε_64bit}:
  |F̃ − F| ≤ V1 · t1 + V2 · t2
- Compute a precision allocation that satisfies the given error bound
  - Take care of type casts
- Implemented in the FPTuner tool
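FPTuner encodes this as an integer optimization problem (solved with Gurobi); the core idea can be sketched with a brute-force search over allocations. The coefficients V1, V2, the error threshold, and the cost model below are hypothetical values chosen for illustration:

```python
from fractions import Fraction
from itertools import product

EPS32, EPS64 = Fraction(1, 2 ** 24), Fraction(1, 2 ** 53)

# Hypothetical coefficients for |F~ - F| <= V1*t1 + V2*t2, a hypothetical
# error threshold, and a cost model in which lower precision is cheaper.
V1, V2 = Fraction(2), Fraction(1, 2)
ERROR_BOUND = Fraction(1, 10 ** 7)
cost = {EPS32: 1, EPS64: 2}

best = None
for t1, t2 in product([EPS32, EPS64], repeat=2):
    if V1 * t1 + V2 * t2 <= ERROR_BOUND:          # error constraint satisfied?
        alloc_cost = cost[t1] + cost[t2]
        if best is None or alloc_cost < best[0]:  # keep the cheapest allocation
            best = (alloc_cost, t1, t2)

assert best is not None
_, t1, t2 = best
# V1 is large, so the first operation needs double; the second can stay single.
assert t1 == EPS64 and t2 == EPS32
```

Real inputs have one such variable per operation, so FPTuner delegates the search to an integer optimizer instead of enumerating the exponentially many allocations.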
FPTuner TOOLFLOW
- Inputs: routine (real-valued expression) and user specifications (error threshold, operator weights, extra constraints)
- Generic error model, bounded using the Gelpia global optimizer
- Efficiency model
- Optimization problem solved with Gurobi
- Output: optimal mixed-precision allocation
EXAMPLE: JACOBI METHOD
- Inputs: 2x2 matrix, vector of size 2
- Error bound: 1e-14
- Available precisions: single, double, quad
- FPTuner automatically allocates precisions for all variables and operations
SUMMARY
- Supports mixed-precision allocation
- Based on rigorous formal reasoning
- Encoded as an optimization problem
- Extensive empirical evaluation
  - Includes real-world energy measurements showing the benefits of precision tuning
SOLVING http://github.com/soarlab/OL1V3R