LLVM Numerics Improvements
Michael C. Berg, Apple
LLVM Developers’ Meeting, Brussels, Belgium, April 2019
Agenda
• Handling Numerics via Flags
• Current LLVM Numerics Models
• How Unsafe Changes Behavior
• Mixed Mode
• Flag Guided Optimizations
• Conclusions
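Handling Numerics via Flags
[Diagram: compilation pipeline annotated with the points where flags are introduced and translated]
Pipeline stages, in order: Language, Front Ends, LLVM IR, Mid Level Optimizer, DAG Lowering, SDNode, SelectionDAG / GlobalISel, MachineInstr, Targeted Backends
Annotations:
• Module and IR flags introduced
• IR flags translated to SDNode
• IR flags translated to MachineInstr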
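The flag hand-off in the pipeline can be pictured with a toy model. This is only an illustrative sketch, not LLVM's API: the classes `IRInstruction`, `SDNode`, and `MachineInstr` here are hypothetical Python stand-ins, and the two lowering functions are assumptions that simply copy the flag set from one representation to the next.

```python
from dataclasses import dataclass
from enum import Flag, auto


class FMF(Flag):
    """Toy fast-math flag set (names follow the slide, not LLVM's C++ API)."""
    NSZ = auto()
    NNAN = auto()
    NINF = auto()
    ARCP = auto()
    CONTRACT = auto()
    REASSOC = auto()
    AFN = auto()


@dataclass(frozen=True)
class IRInstruction:      # hypothetical stand-in for an IR instruction
    opcode: str
    flags: FMF


@dataclass(frozen=True)
class SDNode:             # hypothetical stand-in for a SelectionDAG node
    opcode: str
    flags: FMF


@dataclass(frozen=True)
class MachineInstr:       # hypothetical stand-in for a machine instruction
    opcode: str
    flags: FMF


def lower_to_sdnode(inst: IRInstruction) -> SDNode:
    # IR flags are translated to the SDNode.
    return SDNode(inst.opcode, inst.flags)


def select_machine_instr(node: SDNode) -> MachineInstr:
    # Flags are carried over again when the machine instruction is built.
    return MachineInstr(node.opcode.upper(), node.flags)


def compile_instruction(inst: IRInstruction) -> MachineInstr:
    return select_machine_instr(lower_to_sdnode(inst))
```

The point of the sketch is that a flag set attached at the IR level is still visible to every later stage, so each stage can decide per operation which optimizations are permitted.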
Current LLVM Numerics Models
• Unsafe: module-wide scope; overrides Fast Math Flags (FMF).
• Fast-Math: IR scope, all FMFs set.
• Precise-Math: IR scope, all FMFs unset, IEEE 754.
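One way to read the three models is as a rule for computing the flags that apply to a single operation. The sketch below is an assumption-laden illustration: `effective_flags` and `ALL_FLAGS` are hypothetical names, and treating Unsafe as "every flag is in effect regardless of the IR annotation" is an interpretation of "overrides", not a statement of how LLVM implements it.

```python
# Flag names as listed on the slides.
ALL_FLAGS = frozenset(
    {"nsz", "nnan", "ninf", "arcp", "contract", "reassoc", "afn"}
)


def effective_flags(instruction_flags, unsafe_module=False):
    """Return the flags an optimizer may rely on for one operation.

    instruction_flags: flags annotated on the IR operation (IR scope).
    unsafe_module:     True when the module-wide Unsafe model is active.
    """
    if unsafe_module:
        # Module-wide scope: the per-operation annotation is overridden.
        return ALL_FLAGS
    # IR scope: only the flags actually present on the operation count.
    return frozenset(instruction_flags) & ALL_FLAGS


fast_math = effective_flags(ALL_FLAGS)        # all FMFs set
precise_math = effective_flags([])            # all FMFs unset
unsafe = effective_flags([], unsafe_module=True)
```

Under this reading, Fast-Math and Unsafe permit the same optimizations, but only the IR-scoped models allow a different choice for each operation.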
Current LLVM Numerics Models: Models and their Flags

Model         Nsz        Nnan       Ninf       Arcp       Contract   Reassoc    Afn
Unsafe        Overrides  Overrides  Overrides  Overrides  Overrides  Overrides  Overrides
Fast-Math     √          √          √          √          √          √          √
Precise-Math  X          X          X          X          X          X          X

• Nsz: Allow optimizations to treat the sign of a zero argument or result as insignificant.
• Nnan: Allow optimizations to assume the arguments and result are not NaN.
• Ninf: Allow optimizations to assume the arguments and result are not +/-Inf.
• Arcp: Allow optimizations to use reciprocal operations with approximate expressions.
• Contract: Allow floating-point contraction (e.g. fusing a multiply add/sub).
• Reassoc: Allow reassociation transformations on floating-point instructions.
• Afn: Allow substitution of approximate calculations for functions (sin, log, cos, etc.).
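The effect of several of these flags can be seen with ordinary IEEE 754 double arithmetic. The sketch below pairs a strict evaluation with the rewritten form each flag would permit; the function names are hypothetical, the rewrites are representative examples rather than a list of what LLVM actually performs, and the fused multiply-add is emulated with exact rational arithmetic (one rounding at the end) rather than a hardware instruction. Ninf and Afn are not covered.

```python
from fractions import Fraction


# nsz: folding x + 0.0 to x changes the sign of the result for x = -0.0.
def add_zero_strict(x):
    return x + 0.0


def add_zero_nsz(x):
    return x


# nnan: a self-comparison NaN test may be folded to a constant.
def is_nan_strict(x):
    return x != x


def is_nan_nnan(x):
    return False


# arcp: division replaced by multiplication with a reciprocal.
def div_strict(x, y):
    return x / y


def div_arcp(x, y):
    return x * (1.0 / y)


# contract: a*b + c with two roundings versus one fused rounding.
def mul_add_strict(a, b, c):
    return a * b + c


def mul_add_contract(a, b, c):
    # Exact product and sum, rounded once: an emulation of a fused multiply-add.
    return float(Fraction(a) * Fraction(b) + Fraction(c))


# reassoc: (a + b) + c regrouped as a + (b + c).
def sum3_strict(a, b, c):
    return (a + b) + c


def sum3_reassoc(a, b, c):
    return a + (b + c)
```

For example, `div_strict(3.0, 10.0)` and `div_arcp(3.0, 10.0)` differ in the last bit, and `sum3_strict(1e20, -1e20, 1.0)` gives 1.0 while `sum3_reassoc(1e20, -1e20, 1.0)` gives 0.0.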
Current LLVM Numerics Models: FMF Precision and Behavior

                              Nsz  Nnan  Ninf  Arcp  Contract  Reassoc  Afn  Fast
Math operation order changed  √    √     X     NA    √         √        NA   √
IEEE behavior changed         √    √     √     √     √         √        √    √
IEEE precision changed        √    √     √     √     √         √        √    √

Changing the order of operations may cause rounding differences; NaN and Inf instances may materialize in new ways or even disappear, generalizing the intended values expected in user code.

Note: the FMFs above, when set on IR, map to the same optimizations as Unsafe.
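A few lines of double-precision arithmetic illustrate how reordering can make Inf and NaN values appear or disappear. This is a sketch with hypothetical function names; the regroupings shown are examples of the kind of reordering that reassociation-style flags permit, not transformations confirmed to be performed by LLVM.

```python
import math


# Inf disappears: the strict form overflows before dividing.
def scale_strict(x):
    return (x * 10.0) / 10.0


def scale_reordered(x):
    return x * (10.0 / 10.0)


# NaN disappears: the strict form computes Inf - Inf.
def diff_strict(x):
    return (x + x) - (x + x)


def diff_reordered(x):
    return (x - x) + (x - x)


# Rounding difference: same three addends, different grouping.
def sum_left(a, b, c):
    return (a + b) + c


def sum_right(a, b, c):
    return a + (b + c)


big = 1e308  # close to the largest finite double
print(scale_strict(big), scale_reordered(big))   # inf versus 1e308
print(diff_strict(big), diff_reordered(big))     # nan versus 0.0
print(sum_left(0.1, 0.2, 0.3) == sum_right(0.1, 0.2, 0.3))  # False
```

User code that tests for Inf or NaN after the strict form would take a different branch after the reordered form, which is why these flags change IEEE behavior and not only precision.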
Current LLVM Numerics Models: Model Attributes

Model Attributes             Unsafe  Fast-math  Precise-math
Fine Grain Control           X       √          √
IR annotated with flags      NA      √          None or arcp
NaNs and Infs Preserved      X       X          √
Best Performance and Size    √       √          X
IEEE Compliant               X       X          √