Symbolic Crosschecking of Floating-Point and SIMD Code

Peter Collingbourne, Cristian Cadar, Paul H J Kelly
Department of Computing, Imperial College London
13 April, 2011

SIMD
◮ Single Instruction Multiple Data
◮ A popular means of improving the performance of programs by exploiting data-level parallelism
◮ SIMD vectorised code operates over one-dimensional arrays of data called vectors

    __m128 c = _mm_mul_ps(a, b);
    /* c = { a[0]*b[0], a[1]*b[1], a[2]*b[2], a[3]*b[3] } */

◮ SIMD code is typically translated manually based on a reference scalar implementation
◮ Manually translating scalar code into an equivalent SIMD version is a difficult and error-prone task

SIMD and Floating Point
◮ SIMD vectorised code frequently makes intensive use of floating point arithmetic
◮ Developers have to reason about subtle floating point semantics:
  ◮ Associativity
  ◮ Distributivity
  ◮ Precision
  ◮ Rounding

Spot the Difference

Scalar:
    out[0] = x[0] * y[0] * z[0];
SIMD:
    outv = _mm_mul_ps(xv, _mm_mul_ps(yv, zv));

Scalar:
    out[0] = std::min(x[0], y[0]);
SIMD:
    outv = _mm_min_ps(xv, yv);

min and max are not commutative or associative in FP!

Scalar:
    out[0] = std::min(x[0], y[0]);
SIMD:
    outv = _mm_min_ps(xv, yv);

◮ SSE _mm_min_ps: min(X, Y) = Select(X <_ord Y, X, Y)
◮ X <_ord Y evaluates to false if either of X or Y is NaN

    min(X, NaN) = NaN
    min(NaN, Y) = Y

    min(min(X, NaN), Y) = min(NaN, Y) = Y
    min(X, min(NaN, Y)) = min(X, Y)

    min(min(1, NaN), 200) = min(NaN, 200) = 200
    min(1, min(NaN, 200)) = min(1, 200) = 1

◮ libstdc++ std::min: min_stl(X, Y) = min(Y, X)
  ◮ out[0] = min(x[0], y[0])
  ◮ outv[0] = min(yv[0], xv[0])

Symbolic Execution for SIMD
◮ A novel automatic technique based on symbolic execution for verifying that the SIMD version of a piece of code is equivalent to its (original) scalar version
◮ Symbolic execution can automatically explore multiple paths through the program
◮ Determines the feasibility of a particular path by reasoning about all possible values using a constraint solver

Challenges
◮ Huge number of paths involved in typical SIMD vectorisations
◮ The current generation of symbolic execution tools lacks symbolic support for floating point and SIMD
  ◮ Due to a lack of available constraint solvers
  ◮ (Recent development: floating point support in CBMC)

Architecture
[Figure: crosschecking workflow. The scalar code (x[i] * y[i] * z[i]) and the SIMD code (_mm_mul_ps(xv, _mm_mul_ps(yv, zv))) are run by the execution engine under a test harness, assert(scalar(...) == simd(...)). The tool then repeatedly chooses a (scalar path, SIMD path) pair and asks "equiv?": "no" leads to "mismatch found!", "yes" returns to choosing paths, and when no more paths remain the result is "all paths equivalent".]

Symbolic Execution – Operation
◮ Program runs on symbolic input, initially unconstrained
◮ Each variable may hold either a concrete or a symbolic value
◮ Symbolic value: an input-dependent expression consisting of mathematical or boolean operations and symbols
  ◮ For example, an integer variable i may hold a value such as x + 3
◮ When the program reaches a branch depending on symbolic input
  ◮ Determine feasibility of each side of the branch
  ◮ If both are feasible, fork execution and follow each path separately, adding corresponding constraints on each side

Symbolic Execution – Example

    int x;
    mksymbolic(x);
    if (x > 0) { ... } else { ... }
    if (x > 10) { ... } else { ... }

[Figure: tree of path constraints. Execution starts with the empty constraint set ∅. The first branch forks it into x > 0 and ¬(x > 0). At the second branch, each of those nodes is drawn with two children, x > 10 and ¬(x > 10).]

Architecture
[Figure: the crosschecking workflow with a static path merging stage added. The scalar code and the SIMD code pass through static path merging before reaching the execution engine; the test harness, the choice of (scalar path, SIMD path) pairs, the "equiv?" check and the outcomes "mismatch found!" and "all paths equivalent" are unchanged.]

Static Path Merging

    for (unsigned i = 0; i < N; ++i) {
        diff[i] = x[i] > y[i] ? x[i] - y[i] : y[i] - x[i];
    }

◮ 2^N paths!

Static Path Merging

    diff(x, y) = x > y ? x − y : y − x

[Figure: control-flow graph before and after merging.]

Before merging, block A branches on x > y to B (taken when x > y) or C (taken when ¬(x > y)), and both rejoin at D:

    A:  ...
    B:  %r1 = "x − y"
    C:  %r2 = "y − x"
    D:  %r = phi [%r1, %B], [%r2, %C]
        ...

After merging, a single block ABCD remains:

    A': %p = "x > y"
        ...
    B:  %r1 = "x − y"
        ...
    C:  %r2 = "y − x"
    D': %r = select %p, %r1, %r2
        ...

◮ morph benchmark, 16 × 16 matrix: 2^256 → 1

Technique
◮ The requirements for equality of two floating point expressions are harder to satisfy than for integers
◮ Usually, the two expressions need to be built up in the same way to be sure of equality
◮ We can check expression equivalence via simple expression matching!

Architecture
[Figure: the crosschecking workflow with a canonicalisation stage added. As before, the scalar code and the SIMD code pass through static path merging and the execution engine under the test harness assert(scalar(...) == simd(...)). Each chosen (scalar path, SIMD path) pair now goes through canonicalisation before the "equiv?" check; "no" leads to "mismatch found!", "yes" returns to choosing paths, and "no more paths" yields "all paths equivalent".]

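Scalar/SIMD Implementation

    #include <stddef.h>     /* size_t */
    #include <xmmintrin.h>  /* SSE intrinsics: __m128, _mm_* */

    void zlimit(int simd, float *src, float *dst, size_t size) {
        if (simd) {
            /* SIMD path: process four floats per iteration */
            __m128 zero4 = _mm_set1_ps(0.f);
            while (size >= 4) {
                __m128 srcv = _mm_loadu_ps(src);
                __m128 cmpv = _mm_cmpgt_ps(srcv, zero4);  /* all-ones mask where src > 0 */
                __m128 dstv = _mm_and_ps(cmpv, srcv);     /* keep src where the mask is set, else +0 */
                _mm_storeu_ps(dst, dstv);
                src += 4; dst += 4; size -= 4;
            }
        }
        /* Scalar path: handles everything when simd == 0, otherwise the remainder */
        while (size) {
            *dst = *src > 0.f ? *src : 0.f;
            src++; dst++; size--;
        }
    }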
