
Dataflow Super-Computing Jacob Bower YU INFO, February 2012



  1. Dataflow Super-Computing Jacob Bower YU INFO, February 2012

  2. Maxeler Technologies • Maxeler offers complete hardware, software and application acceleration solutions for high performance computing • ~70 people, offices in London, UK and Palo Alto, CA • Hardware – Card: PCI Express x16, compute, memory and local interconnect – Node: 1U solutions with 1 or 4 Cards – Rack: 10U, 20U or 40U, balancing compute, storage & network • Software – MaxelerOS: resource management of Dataflow Computing – Runtime support: memory management and data choreography – MaxCompiler: providing programmability • Consulting – HPC System Performance Architecture – Algorithms and Numerical Optimization – Integration into business and technical processes

  3. Overview • Dataflow Computing • Programming Dataflow Systems • Dataflow Engines and Platforms • Case Study: Accelerating Risk Computation

  4. DATAFLOW COMPUTING

  5. Computing with Instruction Processors

  6. Instruction Processor Spectrum • Single-Core CPU, Multi-Core, Many-Core • Examples: Intel, AMD; GPU (NVIDIA, AMD); Tilera, XMOS etc. • Hybrid: e.g. AMD Fusion, IBM Cell

  7. Computing with Dataflow

  8. Computation Resources • Intel 6-Core X5680 “Westmere” • Xilinx Virtex-6 SX475T [Figure labels: Computation (on each chip), MaxelerOS]

  9. PROGRAMMING DATAFLOW SYSTEMS

  10. Programming with MaxCompiler [Diagram labels: Host application, CPU, MaxCompilerRT, MaxelerOS, Memory, PCI Express, Dataflow Engine, Kernels (+, +, *), Manager, Memory]

  11. MaxCompiler Architecture

  12. Dataflow Kernel Programming: $y_i = (x_{i-1} + x_i + x_{i+1}) / 3$
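The three-point moving average on this slide can be sketched as a streaming computation in plain Python. This is not MaxCompiler code: the function name `moving_average_3` is hypothetical, and the choice to produce output only for interior elements (those with both neighbours) is an assumption, since the slide gives only the formula.

```python
from collections import deque
from typing import Iterable, Iterator


def moving_average_3(stream: Iterable[float]) -> Iterator[float]:
    """Yield (x[i-1] + x[i] + x[i+1]) / 3 for every interior element i."""
    # The window holds the three most recent inputs: x[i-1], x[i], x[i+1].
    window = deque(maxlen=3)
    for x in stream:
        window.append(x)
        # Assumption: emit nothing until both neighbours are available.
        if len(window) == 3:
            yield sum(window) / 3


result = list(moving_average_3([1.0, 2.0, 3.0, 4.0, 5.0]))
print(result)
```

Each output depends only on a fixed window of neighbouring inputs, which is why this kind of kernel maps naturally onto a pipeline that consumes one element per step.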

  13. DATAFLOW ENGINES AND PLATFORMS

  14. Various Dataflow Systems • MaxWorkstation: desktop development system • MaxNode: high-density compute system, 1-4 Dataflow Engines with up to 192GB RAM • MaxNode10G: low-latency connectivity platform, 1-2 Dataflow Engines with up to six 10GE connections • MaxRack: 10, 20 or 40 node rack systems, balanced compute, networking & storage

  15. MaxNode with MAX3 • 1U Form Factor • 4x MAX3 cards with Virtex-6 FPGAs • MaxRing interconnect • 2x Intel Xeon CPUs • Up to 192GB host RAM • Up to 192GB FPGA RAM • 3x 3.5” disks • ~700W Power

  16. CASE STUDY: ACCELERATING J.P. MORGAN RISK COMPUTATION

  17. Computational Finance • Compute the value of complex financial products • Compute risk: what’s the sensitivity to X changing? • Typically computed overnight on hundreds to thousands of CPU cores. But: – The market changes throughout the day! – We really need to evaluate scenarios: what happens if country X defaults?

  18. Credit Derivatives 101 • Bonds are a way for companies/countries to borrow – Investors make profit through coupon payments • Investors mitigate risk using – Credit Default Swaps (CDS) – Collateralized Debt Obligations (CDO)...

  19. CDOs [Diagram: a pool of CDS contracts]

  20. CDOs [Diagram: the CDS pool divided into tranches, ordered from high risk to low risk]

  21. CDOs [Diagram: the CDS pool ordered from high risk to low risk]

  22. CDOs [Diagram: the CDS pool ordered from high risk to low risk]

  23. Losses for Different Tranches [Chart: Amount of Loss for $1M notional in various tranches of 125 name pool. x-axis: Number of Names Defaulted (0 to 120). y-axis: Amount of Loss in Tranche ($) (0 to 1,000,000). Series: 0%-100% (CDSI), 0%-3%, 3%-7%, 7%-15%, 15%-30%, 30%-100%]
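The shape of such a chart can be reproduced with the standard attachment/detachment rule for tranche losses. The sketch below is illustrative only: the function name `tranche_loss`, the equal weighting of the 125 names, and the 40% recovery rate are assumptions that do not come from the slides, so the numbers will not necessarily match the original chart.

```python
def tranche_loss(defaults, attach, detach, names=125, recovery=0.4,
                 notional=1_000_000):
    """Loss on `notional` invested in the tranche [attach, detach].

    Assumptions (not from the slides): all names are equally weighted and
    every default loses the same fraction (1 - recovery) of its notional.
    """
    # Fraction of the whole pool lost after `defaults` names have defaulted.
    pool_loss = defaults / names * (1.0 - recovery)
    # The tranche absorbs only pool losses between its two boundaries.
    width = detach - attach
    absorbed = min(max(pool_loss - attach, 0.0), width)
    return notional * (absorbed / width)


# Tranche boundaries taken from the chart legend.
tranches = [(0.0, 1.0), (0.0, 0.03), (0.03, 0.07), (0.07, 0.15),
            (0.15, 0.30), (0.30, 1.0)]
for attach, detach in tranches:
    losses = [round(tranche_loss(n, attach, detach)) for n in (0, 10, 40, 125)]
    print(f"{attach:.0%}-{detach:.0%}: {losses}")
```

Junior tranches are wiped out after only a few defaults, while the most senior tranche is untouched until pool losses pass its attachment point.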

  24. Valuing Tranched Credit Derivatives [Diagram: the Unconditional Survival Probability for each name is combined with the Market factor M and the Correlation to give the Conditional Survival Probability for this name. Two plots, Good Market (M>>0) and Bad Market (M<<0), show Probability (0 to 1) against Amount of Loss (%) (0 to 100)]
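One common way to turn an unconditional survival probability into a conditional one is the one-factor Gaussian copula. The slides do not name the model used, so treating it as a Gaussian copula is an assumption, as are the function name `conditional_survival` and the sign convention that a large positive M means a good market.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def conditional_survival(survival, correlation, market_factor):
    """Survival probability of one name given the market factor M.

    Assumed model (not stated in the slides): one-factor Gaussian copula,
    with 0 < survival < 1 and 0 <= correlation < 1.
    """
    threshold = _STD_NORMAL.inv_cdf(survival)
    shifted = threshold + sqrt(correlation) * market_factor
    return _STD_NORMAL.cdf(shifted / sqrt(1.0 - correlation))


good = conditional_survival(0.9, 0.3, 2.0)   # good market, M >> 0
bad = conditional_survival(0.9, 0.3, -2.0)   # bad market, M << 0
print(good, bad)
```

Under this assumption a good market pushes the conditional survival probability above the unconditional value and a bad market pulls it below, which matches the two plots described on the slide.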

  25. Application Analysis

  26. Convoluter Design [Diagram: inputs are the Conditional Survival Probabilities, Weights and Notional Sizes; the design is unrolled across Credits (c) and Market Factors (m); the output is the Accumulated Loss Distribution (weighted sum)]
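The labels on this slide suggest a loss distribution built by convolving one credit at a time, repeated per market factor and combined as a weighted sum. The following is a minimal software sketch of that idea, not the hardware design itself. The function names, the use of integer loss units per name (standing in for notional sizes), and the reading of "Weights" as one weight per market factor are all assumptions.

```python
def conditional_loss_distribution(survival_probs, loss_units):
    """Distribution of total loss for one market factor value.

    dist[k] is the probability that the total loss equals k units.
    """
    dist = [1.0]  # before any credit is added, the loss is certainly zero
    for q, n in zip(survival_probs, loss_units):
        new = [0.0] * (len(dist) + n)
        for k, p in enumerate(dist):
            new[k] += q * p              # name survives: loss unchanged
            new[k + n] += (1.0 - q) * p  # name defaults: loss grows by n
        dist = new
    return dist


def accumulated_loss_distribution(survival_by_factor, weights, loss_units):
    """Weighted sum of the conditional distributions over market factors."""
    total = [0.0] * (sum(loss_units) + 1)
    for probs, w in zip(survival_by_factor, weights):
        cond = conditional_loss_distribution(probs, loss_units)
        for k, p in enumerate(cond):
            total[k] += w * p
    return total


# Two hypothetical credits and two hypothetical market factor values.
total = accumulated_loss_distribution(
    survival_by_factor=[[0.9, 0.8], [0.5, 0.5]],
    weights=[0.5, 0.5],
    loss_units=[1, 2],
)
print(total)
```

The inner update for each credit is independent across market factor values, which is the kind of structure that lends itself to unrolling in both the credit and market factor dimensions.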

  27. Credit Derivatives Acceleration • Calculation of current value and credit spread risk for a population of 2,925 bespoke tranches • Speedup from 1 MAX2: – 219x to 270x compared to 1 core – ~30x compared to an 8-core node • Power consumption drops from 250W/node to 240W/node with acceleration – >30x more power efficient
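The power-efficiency figure follows from the speedup and the two power numbers. The check below is a rough sketch that assumes energy per job is simply power multiplied by run time, and that the ~30x node-level speedup is the relevant one; the variable names are illustrative.

```python
speedup = 30.0          # approximate node-level speedup from the slide
power_before_w = 250.0  # W per node without acceleration
power_after_w = 240.0   # W per node with acceleration

# Energy per job = power * time, so the gain in energy efficiency is
# the speedup scaled by the ratio of the two power figures.
efficiency_gain = speedup * power_before_w / power_after_w
print(f"{efficiency_gain:.2f}x less energy per job")
```

Under these assumptions the gain comes out slightly above 30x, consistent with the ">30x more power efficient" claim.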

  28. Summary & Conclusions • Dataflow computing allows: – massive parallelism in computation – highly efficient use of silicon area on chips • Maxeler creates: – dataflow engines and systems – MaxCompiler to easily program these • Dataflow computing used at J.P. Morgan – around 30x speedup
