
CUDA accelerated fault tree analysis with C-XSC


  1. CUDA accelerated fault tree analysis with C-XSC. Gabor Rebner (1), Michael Beer (2). (1) Department of Computer and Cognitive Sciences (INKO), University of Duisburg-Essen, Duisburg, Germany; (2) Institute for Risk & Uncertainty, University of Liverpool, Liverpool, UK. 19.09.2012

  2. Table of Contents: 1 Motivation; 2 Definitions (Verification, CUDA, Fault Tree Analysis); 3 Implementation (C++ and CUDA, Evaluation); 4 Conclusion (Conclusion, Future Work).

  3. Motivation. Implementation of verified fault tree analysis in C++ using high-performance GPU (Graphics Processing Unit) computing. Issues: using GPU-accelerated high-performance features to (1) reduce the trade-off between computation accuracy and computation time, and (2) use directed rounding based on the IEEE 754-2008 standard on the GPU.

  4. Verification. Definition: we use verification in its narrow sense of referring to a mathematical proof for correctness of a result obtained by a computer calculation. Tools: interval arithmetic provided by C-XSC; floating-point arithmetic with directed rounding on the Central Processing Unit (CPU) and with the Compute Unified Device Architecture (CUDA).

  5. A short introduction to CUDA. Compute Unified Device Architecture (CUDA): high-performance GPU architecture; Single Instruction, Multiple Data (SIMD) implementation; up to $2^{10}$ CUDA cores on the NVIDIA GTX 590; restricted to NVIDIA graphics cards. Support of IEEE 754 floating-point operations: double precision; directed rounding to the next floating-point number (such as $\mathrm{fl}_{\bigtriangledown}(x)$ and $\mathrm{fl}_{\bigtriangleup}(x)$ with $x \in \mathbb{R}$).

  6. Fault Tree Analysis. Fundamentals: the implementation is based on the approach by Traczinsky et al. (2006); verified on modern computer systems; CUDA.

  8. Complexity. Computation step: each computation of a logical gate (AND- or OR-gate) has a complexity of $O(n^3)$: computation of each interval element ($O(n \times n)$); computation of the mass assignment for each interval. Total complexity: $O(n^3)$. Improvements: the algorithm can be improved to obtain an upper bound of complexity slightly smaller than $O(n^3)$.
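The quadratic part of this cost can be pictured as a pairwise combination of interval elements. The following C++ sketch is an illustration under stated assumptions, not the implementation from the talk: it assumes each gate input is a list of n intervals with mass assignments, combines them for an AND-gate by multiplying endpoints and masses, and uses plain round-to-nearest arithmetic without verification. The names `FocalElement` and `and_combine` are hypothetical, and the additional factor that leads to $O(n^3)$ is not modeled.

```cpp
#include <cassert>
#include <vector>

// Hypothetical record: one interval [lo, hi] with its mass assignment.
struct FocalElement {
    double lo;
    double hi;
    double mass;
};

// Combine every element of a with every element of b (n x n pairs).
// Unverified round-to-nearest arithmetic, for illustration only.
std::vector<FocalElement> and_combine(const std::vector<FocalElement>& a,
                                      const std::vector<FocalElement>& b) {
    std::vector<FocalElement> out;
    out.reserve(a.size() * b.size());
    for (const FocalElement& x : a) {
        for (const FocalElement& y : b) {
            assert(x.lo <= x.hi && y.lo <= y.hi);
            out.push_back({x.lo * y.lo, x.hi * y.hi, x.mass * y.mass});
        }
    }
    return out;
}
```

Two inputs with n elements each yield n × n output elements, which is where the $O(n \times n)$ term for the interval elements comes from.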

  9. Verification under CUDA. Goal: compute correct results on computer systems using finite floating-point arithmetic. Approach: directed rounding (GPU source code); interval arithmetic (C-XSC in CPU source code).

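  10. Interval Notation. Real intervals ($\mathbb{IR}$): $\boldsymbol{x} = [\underline{x}, \overline{x}] \mid \underline{x} \le x \le \overline{x}$, with $\underline{x}$, $\overline{x}$ and $x \in \mathbb{R}$. Machine intervals ($\mathbb{IF}$): $\boldsymbol{x} = [\underline{x}, \overline{x}] \mid \underline{x} \le x \le \overline{x}$, with $\underline{x}$, $\overline{x}$ and $x \in \mathbb{F} \setminus \{\text{Not a number}, \pm\infty\}$. Description: $\boldsymbol{x}$ is an interval from the set $\mathbb{IR}$ or $\mathbb{IF}$; $\underline{x}$ is the infimum/minimum of $\boldsymbol{x}$; $\overline{x}$ is the supremum/maximum of $\boldsymbol{x}$.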
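The machine-interval definition can be mirrored by a small value type. The following is a minimal C++ sketch and not the C-XSC `interval` class; the names `MachineInterval`, `is_valid`, `make_interval` and `contains` are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for a machine interval (not the C-XSC type).
struct MachineInterval {
    double inf;  // infimum (lower endpoint)
    double sup;  // supremum (upper endpoint)
};

// True if both endpoints are finite floating-point numbers
// (no NaN, no +/-infinity) and the endpoints are ordered.
bool is_valid(const MachineInterval& x) {
    return std::isfinite(x.inf) && std::isfinite(x.sup) && x.inf <= x.sup;
}

// Build an interval and check the definition's constraints.
MachineInterval make_interval(double inf, double sup) {
    const MachineInterval x{inf, sup};
    assert(is_valid(x));
    return x;
}

// True if the value v lies between the infimum and the supremum.
bool contains(const MachineInterval& x, double v) {
    return x.inf <= v && v <= x.sup;
}
```

For example, `make_interval(0.25, 0.5)` is accepted, while an interval with a NaN or infinite endpoint is rejected by `is_valid`.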

  11. Verification under CUDA. Goal: compute correct results on computer systems using finite floating-point arithmetic. Problem: let $x = \frac{1}{3}$ and $x \in \mathbb{R}$. Then $x + x \neq \frac{2}{3}$ in floating-point arithmetic, but $\frac{2}{3} \in [\underbrace{\mathrm{fl}_{\bigtriangledown}(x + x)}_{\text{lower bound}}, \underbrace{\mathrm{fl}_{\bigtriangleup}(x + x)}_{\text{upper bound}}]$.
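The enclosure of 2/3 can be reproduced on the CPU with the rounding modes from `<cfenv>`. This is a hedged sketch, not the code from the talk: the helper `rounded` and the function `enclose_two_thirds` are hypothetical names, and the example assumes the compiler honors run-time rounding-mode changes (some compilers need an option such as GCC's `-frounding-math`; the `volatile` qualifiers are a common workaround). On the GPU, CUDA intrinsics such as `__dadd_rd` and `__dadd_ru` would play the role of the directed additions; whether the authors used them is not stated in the slides.

```cpp
#include <cassert>
#include <cfenv>

// Hypothetical helper: evaluates a / b or a + b under a directed rounding
// mode. The volatile qualifiers keep the compiler from folding the operation
// at compile time or moving it across the fesetround calls.
double rounded(char op, double a, double b, int mode) {
    assert(op == '/' || op == '+');
    assert(mode == FE_DOWNWARD || mode == FE_UPWARD);
    volatile double va = a, vb = b;
    const int old_mode = std::fegetround();
    std::fesetround(mode);
    volatile double r = (op == '/') ? va / vb : va + vb;
    std::fesetround(old_mode);  // restore the caller's rounding mode
    return r;
}

struct Enclosure {
    double lower;
    double upper;
};

// Enclose 1/3 + 1/3. Since 1/3 is not a floating-point number, the operand
// itself is rounded outward before the directed additions.
Enclosure enclose_two_thirds() {
    const double x_lo = rounded('/', 1.0, 3.0, FE_DOWNWARD);
    const double x_hi = rounded('/', 1.0, 3.0, FE_UPWARD);
    return {rounded('+', x_lo, x_lo, FE_DOWNWARD),
            rounded('+', x_hi, x_hi, FE_UPWARD)};
}
```

The returned bounds differ, and every floating-point approximation of 2/3 obtained with round-to-nearest lies between them.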

  12. Verification under CUDA. Let $\boldsymbol{x}$ and $\boldsymbol{y}$ be two scale elements (intervals) and $m_x$ and $m_y$ the corresponding mass assignments. Lower failure bound (OR-gate): $lb = \mathrm{fl}_{\bigtriangledown}\big(\mathrm{fl}_{\bigtriangledown}(\underline{x} + \underline{y}) - \mathrm{fl}_{\bigtriangleup}(\underline{x} \cdot \underline{y})\big)$ with $\underline{x}, \underline{y} \in [0, 1]$, $m_{lb} = \mathrm{fl}_{\bigtriangleup}(m_x \cdot m_y)$ with $m_x, m_y \in [0, 1]$.
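The OR-gate lower bound can be sketched on the CPU as follows. The sketch assumes that the bound is formed from the infima of the two intervals, rounds the sum downward, the product upward and the difference downward, and uses the `<cfenv>` rounding modes in place of GPU instructions. The names `Op`, `rounded`, `or_gate_lower` and `mass_upper` are hypothetical, and the code relies on the compiler honoring run-time rounding-mode changes.

```cpp
#include <cassert>
#include <cfenv>

enum class Op { Add, Sub, Mul };

// Hypothetical helper: one arithmetic operation under a directed rounding
// mode. volatile prevents constant folding and reordering across fesetround.
double rounded(Op op, double a, double b, int mode) {
    assert(mode == FE_DOWNWARD || mode == FE_UPWARD);
    volatile double va = a, vb = b;
    const int old_mode = std::fegetround();
    std::fesetround(mode);
    volatile double r = (op == Op::Add)   ? va + vb
                        : (op == Op::Sub) ? va - vb
                                          : va * vb;
    std::fesetround(old_mode);  // restore the caller's rounding mode
    return r;
}

// lb = fl_down( fl_down(x + y) - fl_up(x * y) ) for x, y in [0, 1].
double or_gate_lower(double x_lo, double y_lo) {
    assert(0.0 <= x_lo && x_lo <= 1.0 && 0.0 <= y_lo && y_lo <= 1.0);
    const double sum = rounded(Op::Add, x_lo, y_lo, FE_DOWNWARD);
    const double prod = rounded(Op::Mul, x_lo, y_lo, FE_UPWARD);
    return rounded(Op::Sub, sum, prod, FE_DOWNWARD);
}

// Mass of the combined element, rounded upward: fl_up(m_x * m_y).
double mass_upper(double m_x, double m_y) {
    assert(0.0 <= m_x && m_x <= 1.0 && 0.0 <= m_y && m_y <= 1.0);
    return rounded(Op::Mul, m_x, m_y, FE_UPWARD);
}
```

Rounding the subtracted product upward and everything else downward guarantees that the computed value never exceeds the exact value of $x + y - x \cdot y$ for the given operands.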

  13. Verification under CUDA. Let $\boldsymbol{x}$ and $\boldsymbol{y}$ be two scale elements (intervals) and $m_x$ and $m_y$ the corresponding mass assignments. Lower failure bound (AND-gate): $lb = \mathrm{fl}_{\bigtriangledown}(\underline{x} \cdot \underline{y})$ with $\underline{x}, \underline{y} \in [0, 1]$; $ub = \mathrm{fl}_{\bigtriangleup}(\overline{x} \cdot \overline{y})$ with $\overline{x}, \overline{y} \in [0, 1]$; $m = \mathrm{fl}_{\bigtriangleup}(m_x \cdot m_y)$ with $m_x, m_y \in [0, 1]$.
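A CPU-side sketch of the AND-gate follows. It assumes that the lower bound multiplies the infima with downward rounding and the upper bound multiplies the suprema with upward rounding, and it uses the `<cfenv>` rounding modes rather than GPU instructions. The names `mul_rounded`, `GateResult` and `and_gate` are hypothetical, and the code relies on the compiler honoring run-time rounding-mode changes.

```cpp
#include <cassert>
#include <cfenv>

// Hypothetical helper: a * b under a directed rounding mode. volatile
// prevents constant folding and reordering across the fesetround calls.
double mul_rounded(double a, double b, int mode) {
    assert(mode == FE_DOWNWARD || mode == FE_UPWARD);
    volatile double va = a, vb = b;
    const int old_mode = std::fegetround();
    std::fesetround(mode);
    volatile double r = va * vb;
    std::fesetround(old_mode);  // restore the caller's rounding mode
    return r;
}

struct GateResult {
    double lb;  // lower failure bound, rounded downward
    double ub;  // upper failure bound, rounded upward
    double m;   // mass assignment, rounded upward
};

// AND-gate on the intervals [x_lo, x_hi] and [y_lo, y_hi] within [0, 1].
GateResult and_gate(double x_lo, double x_hi, double y_lo, double y_hi,
                    double m_x, double m_y) {
    assert(0.0 <= x_lo && x_lo <= x_hi && x_hi <= 1.0);
    assert(0.0 <= y_lo && y_lo <= y_hi && y_hi <= 1.0);
    assert(0.0 <= m_x && m_x <= 1.0 && 0.0 <= m_y && m_y <= 1.0);
    return {mul_rounded(x_lo, y_lo, FE_DOWNWARD),
            mul_rounded(x_hi, y_hi, FE_UPWARD),
            mul_rounded(m_x, m_y, FE_UPWARD)};
}
```

Because all operands lie in [0, 1], the product is monotone in each argument, so the two directed products enclose every product of values taken from the input intervals.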

  14. Computation time. Wall-clock time [s] spent on computation. Configurations: Benchmark 1 (B1): n = 200, f = 20, l = 100; Benchmark 2 (B2): n = 5000, f = 100, l = 60.

            C++ (LB) [a]   DSI (LB) [b]   C++ (UB)   DSI (UB)
      B1           7              7          1685       1712
      B2         721            654         48070      46160

      [a] C++ utilizing C-XSC and CUDA
      [b] DSI 3.5.2 and INTLAB V6

  15. Computation time. Figure: wall-clock time [s] spent on computation (logarithmic scale, $10^1$ to $10^5$) for C++ & CUDA and MATLAB & INTLAB on benchmark 1 (LB), benchmark 1 (UB), benchmark 2 (LB) and benchmark 2 (UB).

  16. Conclusion. Achievements: reduction of the trade-off between accuracy and computation time; verified computation on the GPU using CUDA.

  17. Future Work. Perspective: using high-performance computing in MATLAB, utilizing the MEX interface with CUDA and C-XSC, to compute Markov set chains (imprecise Markov chains).

  18. References
      [1] Auer, E.; Luther, W.; Rebner, G.; Limbourg, P.: A Verified MATLAB Toolbox for the Dempster-Shafer Theory. In: Proceedings of the Workshop on the Theory of Belief Functions. www.udue.de/DSIPaperone, http://www.udue.de/DSI, 2010
      [2] Carreras, C.; Walker, I.: Interval Methods for Fault-Tree Analyses in Robotics. In: IEEE Transactions on Reliability 50 (2001), pp. 3–11. http://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=00935010
      [3] IEEE Computer Society: IEEE Standard for Floating-Point Arithmetic. In: IEEE Std 754-2008 (2008), 29, pp. 1–58. http://dx.doi.org/10.1109/IEEESTD.2008.4610935
      [4] Krämer, H.: C-XSC 2.0: A C++ Library for Extended Scientific Computing. In: Lecture Notes in Computer Science, vol. 2991/2004. Springer-Verlag, Heidelberg, 2004, pp. 15–35
      [5] Krämer, W.; Zimmer, M.; Hofschuster, W.: Using C-XSC for High Performance Verified Computing. In: Jónasson, Kristján (ed.): Applied Parallel and Scientific Computing, vol. 7134. Springer Berlin / Heidelberg, 2012, pp. 168–178. ISBN 978-3-642-28144-0. http://dx.doi.org/10.1007/978-3-642-28145-7_17
      [6] NVIDIA: Plattform für Parallel-Programmierung und parallele Berechnungen. Website. http://www.nvidia.de/object/cuda_home_new_de.html
      [7] Rebner, G.; Auer, E.; Luther, W.: A verified realization of a Dempster–Shafer based fault tree analysis. In: Computing 94 (2012), pp. 313–324. http://dx.doi.org/10.1007/s00607-011-0179-3. ISSN 0010-485X

  19. Thank you
