The improvement of START

Kenji Hasegawa (U. Tsukuba, CCS Kobe branch) and Takashi Okamoto (U. Tsukuba, CCS Kobe branch). Cosmological Radiative Transfer Comparison Project Workshop IV, Austin, Texas, Dec 11-14, 2012.


  1. The improvement of START. Kenji Hasegawa (U. Tsukuba, CCS Kobe branch), Takashi Okamoto (U. Tsukuba, CCS Kobe branch). Cosmological Radiative Transfer Comparison Project Workshop IV @ Austin, Texas, Dec 11-14, 2012

  2. Outline: Introduction; What is “START”; Previous Studies with START; Improvements; New Ray-tracing; Test & Scalability; Additional Process (Roles of Dust); Preliminary Results; Summary

  3. What is “START”: SPH with Tree-based Accelerated Radiative Transfer (KH & Umemura 2010)
     • Hydrodynamics: SPH (Smoothed Particle Hydrodynamics) method.
     • Non-equilibrium chemistry: e⁻, H⁺, H, H⁻, H₂, H₂⁺, He, He⁺, and He²⁺.
     • Radiative transfer: HI, HeI, and HeII ionizing photons, and H₂ photodissociating photons.
     SPH particles are directly used as grids for RT → spatial resolution changes adaptively. The RT calculation is accelerated by a tree algorithm.

  4. What is “START”: SPH with Tree-based Accelerated Radiative Transfer (KH & Umemura 2010)
     1) Make an oct-tree structure for sources.
     2) If a cell which contains sources is far enough away from an SPH particle, the cell is regarded as a virtual luminous source.

  5. What is “START”: SPH with Tree-based Accelerated Radiative Transfer (KH & Umemura 2010)
     1) Make an oct-tree structure for sources.
     2) If a cell which contains sources is far enough away from an SPH particle, i.e. l/d < θ_crit, the cell is regarded as a virtual luminous source.
     l: size of a cell; d: distance between an SPH particle and a cell.
     In the limit of θ_crit = 0.0, the scheme corresponds to RSPH (Susa 2006).
     The calculation cost is proportional to log(N_s), not N_s.
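
The source grouping on this slide can be illustrated with a minimal Python sketch. It is not the START implementation: the `Cell` class, the `effective_sources` function, the use of a luminosity-weighted position for the virtual source, and the `max_depth` guard are all assumptions made for illustration. Only the opening criterion l/d < θ_crit comes from the slide.

```python
import math
import random


class Cell:
    """Cubic tree cell storing the summed luminosity of the sources inside it."""

    def __init__(self, center, size, sources, depth=0, max_depth=32):
        self.size = size
        self.lum = sum(s[3] for s in sources)  # sources are (x, y, z, L) with L > 0
        # Assumption: the virtual source sits at the luminosity-weighted position.
        self.pos = tuple(sum(s[i] * s[3] for s in sources) / self.lum for i in range(3))
        self.children = []
        if len(sources) > 1 and depth < max_depth:
            buckets = {}
            for s in sources:
                key = tuple(int(s[i] >= center[i]) for i in range(3))
                buckets.setdefault(key, []).append(s)
            for key, group in buckets.items():
                # Child centres are offset by a quarter of the parent size.
                child_center = tuple(center[i] + (key[i] - 0.5) * size / 2 for i in range(3))
                self.children.append(Cell(child_center, size / 2, group, depth + 1, max_depth))


def effective_sources(cell, target, theta_crit):
    """Walk the tree and return (position, luminosity) pairs seen from `target`."""
    d = math.dist(cell.pos, target)
    # A leaf is always used as is; a cell is grouped when l/d < theta_crit.
    if not cell.children or (d > 0 and cell.size / d < theta_crit):
        return [(cell.pos, cell.lum)]
    result = []
    for child in cell.children:
        result.extend(effective_sources(child, target, theta_crit))
    return result


# Hypothetical test data: 200 equal-luminosity sources in a unit box.
random.seed(1)
sources = [(random.random(), random.random(), random.random(), 1.0) for _ in range(200)]
root = Cell((0.5, 0.5, 0.5), 1.0, sources)
target = (5.0, 5.0, 5.0)  # a particle far from the source distribution

exact = effective_sources(root, target, theta_crit=0.0)    # every source treated individually
grouped = effective_sources(root, target, theta_crit=0.5)  # distant cells merged
print(len(exact), len(grouped))
```

With θ_crit = 0 no cell ever satisfies the criterion, so every source is ray-traced individually, which is the RSPH limit mentioned on the slide. A larger θ_crit merges more cells and reduces the number of rays, at the price of accuracy.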

  6. What is “START”: a similar method for grid-based RT is ARGOT: Accelerated Radiative Transfer on grids using oct-tree (Okamoto, Yoshikawa & Umemura 2012)

  7. Previous Work with START: KH, Umemura & Suwa (2010); Umemura, Susa, KH, Suwa, & Semelin (2012)
     UV feedback on a secondary collapsing Pop III halo
     • RHD simulation including the transfer of diffuse recombination photons ⇨ N_source = N_SPH = 2 million
     The secondary core can survive!
     [Figure labels: first star; scale ~70 pc]

  8. Previous Work with START: KH & Semelin (2012)
     UV feedback on galaxies during the Epoch of Reionization
     • RHD simulation including internal UV (ionization and LW) feedback in each galaxy.
     [Figure: 5 cMpc box at z = 24, 9.5, 7.3, and 6.0]

  9. Previous Work with START: KH & Semelin (2012)
     We found:
     • The formation of galaxies during the EoR is controlled by internal UV & SN feedback.
     • Ionization and cosmic SF histories are very sensitive to the mass resolution.
     The box size is too small to show the cosmic reionization history; a much larger number of particles is required.
     [Figure: ionization history and cosmic SF history for the high resolution run (M_min,halo = 2 × 10^7 M_sun) and the low resolution run (M_min,halo = 1.6 × 10^8 M_sun)]

  10. What we need: ★ A powerful supercomputer ★ An RHD code which enables us to perform massively parallel simulations

  11. K Computer: Top500 list, Nov. 2012 (http://www.top500.org). ~82k nodes (650k cores) available. Peak performance ~10 PFlops.

  12. Ray-tracing: Old version. Ray-tracing is solved from all sources in all levels. [Figure: tree levels Lv. 1, Lv. 2, Lv. 3 versus distance]

  14. Ray-tracing: Old version. Ray-tracing is solved from all sources in all levels. The time for MPI communications increases dramatically with increasing N_node. [Figure: tree levels Lv. 1, Lv. 2, Lv. 3 versus distance]

  15. Ray-tracing: Improved version. Point: reuse of the information of the lower level. [Figure: tree levels Lv. 1, Lv. 2, Lv. 3 versus distance]

  17. Ray-tracing: Improved version. Point: reuse of the information of the lower level. Not only the MPI time but also the cost of the RT calculation can be reduced. [Figure: tree levels Lv. 1, Lv. 2, Lv. 3 versus distance]

  18. Tree walk. *In practice, an oct-tree is utilized. [Figure: tree levels Lv. 5, Lv. 4, Lv. 3, Lv. 2, Lv. 1]

  21. Tree walk. *In practice, an oct-tree is utilized. Parallelization via OpenMP. [Figure: tree levels Lv. 5, Lv. 4, Lv. 3, Lv. 2, Lv. 1]

  22. Parallelization: Between nodes
      ★ The size of each domain is adjusted every few steps so that all domains have equivalent calculation cost.

  23. Parallelization: Between nodes
      ★ The size of each domain is adjusted every few steps so that all domains have equivalent calculation cost.
      ★ Each domain asynchronously sends (receives) optical depths to downstream (from upstream) domains. (Same as RSPH by Susa 2006)

  25. Parallelization: Between nodes
      ★ The size of each domain is adjusted every few steps so that all domains have equivalent calculation cost.
      ★ Each domain asynchronously sends (receives) optical depths to downstream (from upstream) domains. (Same as RSPH by Susa 2006)
      Both measures improve the load balance.
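
The cost-based domain adjustment can be illustrated in one dimension. The sketch below is a deliberate simplification and not the decomposition used in START: it assumes the particles are already ordered along one axis and that a per-particle cost estimate is available. The function name `balance_cuts` and the cost values are hypothetical.

```python
from bisect import bisect_left
from itertools import accumulate


def balance_cuts(costs, n_domains):
    """Split `costs` into contiguous chunks of nearly equal total cost.

    Returns n_domains + 1 indices; domain k owns costs[cuts[k]:cuts[k + 1]].
    """
    cumulative = list(accumulate(costs))
    total = cumulative[-1]
    cuts = [0]
    for k in range(1, n_domains):
        # First particle at which the running cost reaches k/n of the total.
        target = total * k / n_domains
        cuts.append(bisect_left(cumulative, target) + 1)
    cuts.append(len(costs))
    return cuts


# Hypothetical costs: eight cheap particles followed by two expensive ones.
costs = [1] * 8 + [4] * 2
cuts = balance_cuts(costs, 2)
loads = [sum(costs[a:b]) for a, b in zip(cuts, cuts[1:])]
print(cuts, loads)
```

Splitting these ten particles by count would give loads of 5 and 11, whereas splitting by cumulative cost gives two domains of equal load. Repeating such an adjustment every few steps follows the changing cost distribution as the simulation evolves.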

  26. Test of the new method. DATA: the distributions of the SPH and stellar particles at z = 7.0 obtained by a cosmological hydrodynamic simulation. N_SPH = 128³, N_s ~ 300. [Figure: reference (by RSPH) density, temperature, and ionized fraction]

  27. Test of the new method. DATA: the distributions of the SPH and stellar particles at z = 7.0 obtained by a cosmological hydrodynamic simulation. N_SPH = 128³, N_s ~ 300. [Figure: temperature by new START at 10, 20, and 30 Myr for θ_crit = 0.5, 0.7, and 0.9, alongside the reference (by RSPH) density, temperature, and ionized fraction]

  28. Test of the new method
      * If we employ an appropriate tolerance parameter, RT can be solved accurately.
      * A similar method will be implemented in ARGOT by T. Okamoto.
      [Figure: temperature by new START at 10, 20, and 30 Myr for θ_crit = 0.5, 0.7, and 0.9, alongside the reference (by RSPH) density, temperature, and ionized fraction]

  29. START scalability: Hydro Part. Cosmological hydrodynamics, N = 512³ × 2: test on the K computer. Very good strong scaling up to 8k nodes (64k cores).

  30. START scalability: RT Part. Comparison between the improved and old versions. N_source = 16k, N_SPH = 256³, XE6 (Cray) @ Kyoto.
      • The speed-up is a factor of 2.
      • The time for MPI does not increase with increasing N_node.

  31. START scalability: RT Part. Dependence on the number of sources: comparison between the runs with N_s = 2k and N_s = 16k.
      • The calculation time is insensitive to the number of sources.

  32. START scalability: RT Part. Test with 512³ SPH particles and 16k source particles.
      • With 512³ particles, the scheme still shows good scalability.
      • It is expected that the scheme keeps good scalability even if we increase the number of nodes.

  33. Additional Processes
      • Evolution of spectrum: in the previous study (KH & Semelin 2012), we assumed a blackbody shape with 50,000 K for stellar sources. High-energy photons were overproduced. Population synthesis by PEGASE; (age, freq.) = (22, 60); M = M_sun.
      • Metal enrichment
      • Metal cooling
      • Roles of dust grains: molecular formation, absorption of photons, radiation force
      These processes affect STAR FORMATION and REIONIZATION.

  34. Role of Dust Absorption. Dust data from Draine & Lee (1984). Z = Z_sun, M_dust = 0.01 M_H.
      • Even if H and He atoms are ionized, the dust opacity does not change.
      • The opacity is sensitive to the size of dust in the frequency range above the Lyman limit.
      [Figure labels: Draine et al. (2007); 30 bins; 30 bins; H Lyman limit]
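
The first point on this slide can be made concrete with a small sketch that compares the gas and dust contributions to the optical depth at the Lyman limit. The H I photoionization cross-section of about 6.3 × 10^-18 cm² is the standard value; the dust cross-section per hydrogen nucleus and the column density are hypothetical numbers chosen only for illustration and are not taken from the talk.

```python
SIGMA_HI = 6.3e-18  # cm^2, H I photoionization cross-section at the Lyman limit


def optical_depths(column_h, neutral_fraction, sigma_dust_per_h):
    """Return (tau_gas, tau_dust) for a column of hydrogen nuclei in cm^-2.

    The gas term scales with the neutral fraction, while the dust term does
    not, because dust grains survive when H is ionized.
    """
    tau_gas = column_h * neutral_fraction * SIGMA_HI
    tau_dust = column_h * sigma_dust_per_h
    return tau_gas, tau_dust


# Hypothetical values for illustration only.
COLUMN_H = 1.0e20    # cm^-2
SIGMA_DUST = 1.0e-21  # cm^2 per H nucleus (assumed)

neutral = optical_depths(COLUMN_H, 1.0, SIGMA_DUST)
ionized = optical_depths(COLUMN_H, 1.0e-4, SIGMA_DUST)
print("neutral gas: tau_gas = %.3g, tau_dust = %.3g" % neutral)
print("ionized gas: tau_gas = %.3g, tau_dust = %.3g" % ionized)
```

For these assumed numbers the gas dominates the absorption while the medium is neutral, but once the neutral fraction drops the dust term becomes the larger one, since it does not depend on the ionization state.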

  35. Absorption by Dust. Dust size 0.1 micron (*found in the Local Group; *proposed by Nozawa+ (2007)). [Figure panels: Z = 0.01 Z_sun; without dust]

  36. Absorption by Dust. Dust size 0.01 micron (*typical size of first grains, proposed by Todini & Ferrara (2001)). [Figure panels: Z = 0.01 Z_sun; without dust]

  37. Summary
      New method:
      • Good strong scaling (so far up to 2,000 nodes). Probably an N_SPH = 1024³ run is possible using 8k-16k nodes of the K computer (in 1-2 weeks?).
      • Accurate (with small tolerance parameters).
      Simulations including metal enrichment:
      • The role of metals (especially dust) in the evolution of high-z galaxies and the IGM.
      • Compute SEDs, LFs, escape fractions, ... of high-z galaxies.
