
PARALLEM: massively Parallel Landscape Evolution Modelling

PARALLEM: massively Parallel Landscape Evolution Modelling. Tuesday 28th November 2017, The University of Sheffield. A. Stephen McGough, Darrel Maddy, J. Wainwright, S. Liang, M. Rapoportas, A. Trueman, R. Grey, G. Kumar Vinod, and James Bell


  1. PARALLEM: massively Parallel Landscape Evolution Modelling • Tuesday 28th November 2017 • The University of Sheffield • A. Stephen McGough, Darrel Maddy, J. Wainwright, S. Liang, M. Rapoportas, A. Trueman, R. Grey, G. Kumar Vinod, and James Bell

  2. Outline • What is Landscape Evolution Modelling (LEM) • Parallelization of LEM • Preliminary Results • The Current Situation • Future Directions

  3. Landscape Evolution Modeling • Landscapes change over time due to water/weathering • Physical and Chemical Weathering require water to break down material • Higher energy flowing water both Erodes and Transports material until decreasing energy conditions result in Deposition of material • These processes take a long time • Many glacial-Interglacial Cycles • Cycles are ~100ka for last 800ka, prior to 800ka cycles were ~40ka in length • We want to use retrodiction to work out how the landscape has changed

  4. Landscape Evolution Modeling • Use a simulation to model how the landscape changes • 3D Landscape is discretized as a regular 2D grid (x, y) with cell values representing surface heights (z) derived from a digital elevation model (DEM) • Cells can be 10m x 10m or larger [Figure: example DEM grid of cell heights]

  5. Landscape Evolution Modeling (simplified) • Each iteration of the simulation: Flow Routing, Flow Accumulation, Erosion/Deposition • How much material will be removed? • How much material will be deposited? • Each step is ‘fairly’ fast… • But we want to do lots of them: 120K to 1M years • On landscapes of 6-56M cells • If we could simulate 1 year in 1 minute this would take 83 to 694 days! • assuming 1 year = 1 iteration • may need more • Sequential version is much slower than this… [Figure: example grids for flow routing, flow accumulation and erosion/deposition]
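The 83 to 694 day estimate on this slide can be checked with a few lines of arithmetic. This is only a sketch under the slide's own assumptions (one iteration per simulated year, one minute per iteration); the function and parameter names are illustrative.

```python
MINUTES_PER_DAY = 24 * 60


def wall_clock_days(simulated_years, minutes_per_iteration=1.0, iterations_per_year=1):
    """Wall-clock days needed, assuming a fixed cost per iteration."""
    total_minutes = simulated_years * iterations_per_year * minutes_per_iteration
    return total_minutes / MINUTES_PER_DAY


# The two ends of the range quoted on the slide: 120K and 1M simulated years.
for years in (120_000, 1_000_000):
    print(f"{years:,} years -> {wall_clock_days(years):.1f} days")
```

Raising `iterations_per_year` shows how quickly the cost grows if more than one iteration per year turns out to be needed.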

  6. Execution analysis of Sequential LEM • We started from an existing sequential LEM • 51x100 cells for just 120K years took 72 hours • Estimate for 25M cells: 64,000 years • This was non-optimal code • Optimisation reduced execution time from 72 to 4.7 hours • Estimate for 25M cells down from 64,000 years to 300 years • But this is still not enough for our needs

  7. Execution analysis of Sequential LEM • Performance Analysis: • ~74% of time spent routing and accumulating • Need orders of magnitude speedup • So focus was on flow routing / accumulation

  8. Outline • What is Landscape Evolution Modelling (LEM) • Parallelization of LEM • Preliminary Results • The Current Situation • Future Directions

  9. Parallel Flow Routing • Each cell can be done independently of all others • SFD (single flow direction) • 100% flow in the direction of steepest descent (normally lowest neighbour) • MFD (multiple flow direction) • Flow is proportioned between all lower neighbours • Proportional to slope to each neighbour • MFD is ‘better’ but much more computationally demanding • Almost linear speed-up • Problems with code divergence • CUDA warps split when code contains a fork [Figure: single flow direction vs multiple flow direction on the 3x3 neighbourhood 3 2 4 / 7 5 8 / 7 1 9]
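To make the SFD/MFD distinction concrete, here is a minimal sequential sketch for a single cell. It is not the PARALLEM kernel: the function names, the eight-neighbour layout and the use of drop divided by centre-to-centre distance as the slope are assumptions made for illustration. The example grid uses the 3x3 heights shown on the slide.

```python
import math

# Offsets (d_row, d_col) of the eight neighbours of a cell.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]


def downhill_slopes(dem, row, col, cell_size=1.0):
    """Map the offset of each lower neighbour to the slope towards it."""
    slopes = {}
    for d_row, d_col in NEIGHBOURS:
        r, c = row + d_row, col + d_col
        if 0 <= r < len(dem) and 0 <= c < len(dem[0]):
            drop = dem[row][col] - dem[r][c]
            if drop > 0:
                distance = cell_size * math.hypot(d_row, d_col)
                slopes[(d_row, d_col)] = drop / distance
    return slopes


def sfd(dem, row, col):
    """Single flow direction: offset of the steepest descent, or None for a sink or plateau cell."""
    slopes = downhill_slopes(dem, row, col)
    return max(slopes, key=slopes.get) if slopes else None


def mfd(dem, row, col):
    """Multiple flow direction: fraction of flow sent to each lower neighbour, proportional to slope."""
    slopes = downhill_slopes(dem, row, col)
    total = sum(slopes.values())
    return {offset: slope / total for offset, slope in slopes.items()}


dem = [[3, 2, 4],
       [7, 5, 8],
       [7, 1, 9]]
print(sfd(dem, 1, 1))  # offset of the single receiving neighbour
print(mfd(dem, 1, 1))  # one fraction per lower neighbour
```

Each call reads only the cell and its neighbours, which is why every cell can be processed independently. The `if drop > 0` test is the kind of data-dependent branch that can cause divergence within a CUDA warp.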

  10. Parallel Accumulation: Correct Flow • Iterate: • Do not compute a cell until it has no incorrect cells flowing into it • Sum all inputs and add self • All cells can work independently of each other • Some restriction on updates not happening immediately • Cell values are not normally 1, but the initial rainfall on the cell [Figure: example grids labelled Flow Routing, Accumulation and Correct]
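The ‘correct flow’ iteration can be sketched sequentially as follows. This is an illustrative reading of the slide, not the authors' GPU code: cells are numbered in one dimension, `receiver[i]` is a hypothetical name for the single cell that cell `i` drains to (`None` for an outlet), and each pass decides readiness from the flags of the previous pass to mimic updates not happening immediately.

```python
def accumulate_correct_flow(receiver, rainfall):
    """Return (accumulation per cell, number of iterations) for single-direction flow."""
    n = len(receiver)
    donors = [[] for _ in range(n)]
    for cell, target in enumerate(receiver):
        if target is not None:
            donors[target].append(cell)

    accumulation = [0.0] * n
    correct = [False] * n
    iterations = 0
    while not all(correct):
        # A cell is ready once every cell flowing into it was marked correct
        # in an earlier iteration.
        ready = [cell for cell in range(n)
                 if not correct[cell] and all(correct[d] for d in donors[cell])]
        if not ready:
            raise ValueError("flow directions contain a cycle")
        for cell in ready:
            # Sum all inputs and add self (the rainfall on the cell).
            accumulation[cell] = rainfall[cell] + sum(accumulation[d] for d in donors[cell])
        for cell in ready:
            correct[cell] = True
        iterations += 1
    return accumulation, iterations


# Cells 0 and 1 drain to 2; cells 2 and 3 drain to the outlet, cell 4.
receiver = [2, 2, 4, 4, None]
rainfall = [1.0] * 5
print(accumulate_correct_flow(receiver, rainfall))
```

Within one iteration the ready cells do not depend on each other, so on a GPU each could be handled by its own thread.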

  11. Not the whole story… • Sinks and Plateaus • Can’t work out flow routing on sinks and plateaus • Need to ‘fake’ a flow routing • Fill a sink until it can flow out • Turn it into a plateau • Fake flow directions on a plateau to the outlet

  12. Parallel Plateau routing • Need to find the outflow of a plateau and flow all water to it • A common solution is to use a breadth first search algorithm • Parallel implementation • Though result does look ‘unnatural’ • Alternative patterns are possible – but acceptable • We are investigating alternative solutions
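A breadth-first search over a plateau can be sketched as below. This is a sequential illustration only, and the data layout is an assumption: `plateau` is a set of `(row, col)` cells at equal height, and `outlets` are the plateau cells that already have a lower neighbour outside it. Each remaining cell is pointed at the neighbour from which the search first reached it.

```python
from collections import deque


def route_plateau(plateau, outlets):
    """Return a dict mapping each non-outlet plateau cell to the next cell towards an outlet."""
    flow_to = {}
    visited = set(outlets)
    queue = deque(outlets)
    while queue:
        cell = queue.popleft()
        row, col = cell
        for d_row in (-1, 0, 1):
            for d_col in (-1, 0, 1):
                neighbour = (row + d_row, col + d_col)
                if neighbour in plateau and neighbour not in visited:
                    visited.add(neighbour)
                    flow_to[neighbour] = cell  # fake flow direction towards the outlet
                    queue.append(neighbour)
    return flow_to


# A 2x3 plateau whose only outlet is the top-left cell.
plateau = {(r, c) for r in range(2) for c in range(3)}
print(route_plateau(plateau, [(0, 0)]))
```

Because cells are reached in rings of increasing distance from the outlet, the fake directions tend to form straight, fan-like patterns, which is consistent with the ‘unnatural’ look the slide mentions.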

  13. Sink filling • Dealing with a single sink is (relatively) simple • Fill sink until we end up with a plateau (lake) • But what if we have multiple nested sinks?

  14. Nested Sink filling • Implemented parallel version of the sink filling algorithm proposed by Arge et al. [2003] • Identify each sink (parallel) • Determine which cells flow into this sink - watershed (parallel) • Determine the lowest cell joining each pair of sinks (parallel/sequential) • Work out how high cells in each sink need to be raised to allow all cells to flow out of the DEM (sequential) • Fill all sink cells to this height (parallel)
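The sequential step, working out how high each sink must be raised, can be viewed as a graph problem: watersheds are nodes, the lowest cell joining two watersheds is an edge weighted by its height, and a sink's fill level is the lowest achievable maximum height along any route to the edge of the DEM. The sketch below solves that with a Dijkstra-style minimax search; this is a technique swapped in for illustration and is not claimed to be the authors' implementation. The `passes` dictionary and the `"outside"` label are hypothetical.

```python
import heapq


def fill_levels(passes, outside="outside"):
    """Return the water level each sink must reach before it can drain out of the DEM.

    passes maps (a, b) to the height of the lowest cell joining watersheds a and b;
    either name may be `outside`, meaning the edge of the DEM.
    """
    graph = {}
    for (a, b), height in passes.items():
        graph.setdefault(a, []).append((b, height))
        graph.setdefault(b, []).append((a, height))

    level = {outside: float("-inf")}
    heap = [(float("-inf"), outside)]
    while heap:
        height, node = heapq.heappop(heap)
        if height > level[node]:
            continue  # stale heap entry
        for neighbour, pass_height in graph.get(node, []):
            # The highest pass on the route decides how far the water must rise.
            candidate = max(height, pass_height)
            if candidate < level.get(neighbour, float("inf")):
                level[neighbour] = candidate
                heapq.heappush(heap, (candidate, neighbour))
    del level[outside]
    return level


# Sink A can spill into B over a pass at 12; B reaches the DEM edge at 15.
passes = {("A", "B"): 12, ("B", "outside"): 15, ("A", "outside"): 20, ("C", "outside"): 8}
print(fill_levels(passes))
```

In this example A and B merge into one lake at height 15, since A's direct exit at 20 is higher than the route through B. Cells below the returned level in each watershed would then be raised to it in the final parallel step.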

  15. Outline • What is Landscape Evolution Modelling (LEM) • Parallelization of LEM • Preliminary Results • The Current Situation • Future Directions

  16. Results: Performance • Overall performance [Figure: log-log plot of time (s) against DEM size (millions of cells); series: CybErosion-slim, Tesla single iteration, 580 single iteration, Tesla average 10, 580 average 10]

  17. Results: Performance • Flow Direction • Including sink & plateau solution [Figure: log-log plot of time (s) against DEM size (millions of cells); series: Sequential Flow Direction, Tesla Flow Direction, 580 Flow Direction, Terraflow Flow Direction]

  18. Results: Performance • Flow Accumulation [Figure: log-log plot of time (s) against DEM size (millions of cells); series: Sequential Flow Accumulation, Tesla Flow Accumulation, 580 Flow Accumulation, Terraflow Flow Accumulation, Tesla K20 Flow Accumulation, Tesla K20 - removed conditionals]

  19. Outline • What is Landscape Evolution Modelling (LEM) • Parallelization of LEM • Preliminary Results • The Current Situation • Future Directions

  20. The Current Simulation • Core Model now extended with processes • Most only affect individual cells (weathering, vegetation) • Some have cross DEM effects (mass movement) but can use same process as before

  21. The Current Simulation • Actively running landscape models on K40/K80 GPGPUs • Taking ~7 weeks to run our model (MFD) • Leading to interesting results • Not seen as models have traditionally been much smaller • Taking ~4 weeks for SFD • Currently running on just 1 GPGPU • Running multiple models simultaneously • Now have a multi-GPGPU code for running flow accumulation • Designed to ‘sweep’ over the landscape [Figure: Upper Thames Valley + 120K]

  22. Multi-GPU: Attempt 1 • Flow direction can be done without problems • Flow accumulation requires communication • Perform each flow direction as one kernel call • No branching • Communication easier between cards [Figure: GPU 1, GPU 2, GPU 3, GPU 4]

  23. Multi-GPU: Attempt 1 [Figure, panel "Whole Simulation": wall-clock runtime (seconds, log scale) against GPU count; series: 5m Active Cells (Kepler K40/K80), 20m Active Cells (Kepler K40/K80), 5m Active Cells (Pascal Titan XP), 5m Active Cells Sequential (CPU)] [Figure, panel "Flow Accumulation": wall-clock runtime (nanoseconds) against GPU count, split into Compute and Transfer]

  24. Problem: Landscape Cutting with SFD [Figure: series of plots, each with one axis from 0 to 100 and the other from 0 to 5000]

  25. [Figure: panels labelled SFD and MFD]

  26. Comparing ‘cut in’ between SFD and MFD [Figure: plot with vertical axis from 70 to 84 and horizontal axis from 0 to 3000; series: 1k, 20k SFD, 20k MFD]

  27. Problem: Algorithm Slow-down • Correct flow algorithm requires all input cells to be correct before progressing • Becomes a problem for rivers [Figure: correct flow completion profile; percentage complete (0 to 100) against iteration (1 to 10000, log scale)]
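Why rivers cause the slow-down can be seen on a toy landscape: a cell is finalised only after everything upstream of it, so a long river is completed one cell per iteration while most of the hardware sits idle. The sketch below is a sequential stand-in for the correct-flow scheme, and the landscape (900 hillslope cells draining straight into a 100-cell river) is invented purely for illustration.

```python
def completion_profile(receiver):
    """Fraction of cells finalised after each iteration of a correct-flow style scheme."""
    n = len(receiver)
    pending = [0] * n  # number of donors of each cell that are not yet correct
    for target in receiver:
        if target is not None:
            pending[target] += 1

    frontier = [cell for cell in range(n) if pending[cell] == 0]
    done, profile = 0, []
    while frontier:
        done += len(frontier)
        profile.append(done / n)
        next_frontier = []
        for cell in frontier:
            target = receiver[cell]
            if target is not None:
                pending[target] -= 1
                if pending[target] == 0:
                    next_frontier.append(target)
        frontier = next_frontier
    return profile


# River cells 0..99 form a chain ending at an outlet; cells 100..999 each drain into a river cell.
river = [i + 1 for i in range(99)] + [None]
hillslopes = [cell % 100 for cell in range(100, 1000)]
profile = completion_profile(river + hillslopes)
print(len(profile), profile[0], profile[-1])
```

In this toy case the first iteration finalises all the hillslope cells, and every later iteration finalises a single river cell, which gives the same long-tailed shape as the completion profile on the slide.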

  28. Outline • What is Landscape Evolution Modelling (LEM) • Parallelization of LEM • Preliminary Results • The Current Situation • Future Directions
