Real-time visualisation and analysis of tera-scale datasets


1. Real-time visualisation and analysis of tera-scale datasets
Christopher Fluke, Amr Hassan (Swinburne; PhD student), David Barnes (Monash University), Virginia Kilborn (Swinburne)
Thank you to the SPS15 organizers for the invitation to speak.

2. Motivation: The Petascale Astronomy Data Era
MORE of the sky, MORE often, MORE pixels, MORE wavelengths, MORE data, MORE …
→ MORE computational work → MORE time passes before you can do… MORE science

3. Desktop Astronomy
How long are YOU prepared to wait for an “interactive” response at your desktop?

Volume       In memory?   Local disk?
Gigascale    Yes          Yes
Terascale    No           Yes (slow)
Petascale    No           No → scalable remote service

4. Australian SKA Pathfinder: Astronomy’s Petascale Present
• 36 antennas
• Phased-array feeds
• Wide field of view
• 700 MHz – 1.8 GHz
2012-13: BETA; 2014: Full science
“Hazards along the road include kangaroos, cattle, sheep, goats, goannas, eagles, emus, wild dogs….”
http://www.atnf.csiro.au/observers/visit/guide_murchison.html#directions
Credit: Swinburne Astronomy Productions

5. WALLABY: The ASKAP H I All-Sky Survey
B. Koribalski (ATNF), L. Staveley-Smith (ICRAR) + 100 others…
• Redshifted 21-cm H I
• ~0.5 million new galaxies
• 75% of sky covered
• z = 0.26 ~ 3 Gyr look-back
[Figure: spectral data cube with axes sky × sky × frequency (line-of-sight velocity), observed vs. emitted frequency]

6. WALLABY: The ASKAP H I All-Sky Survey
B. Koribalski (ATNF), L. Staveley-Smith (ICRAR) + 100 others…
Likely data products: 4096 x 4096 x 16384 channels ~ 1 TB per cube [ x 1200 cubes ]
Can we support real-time, interactive visualisation and data analysis?
387 HIPASS cubes: 1721 x 1721 x 1024 = 12 GB. Data: R. Jurek (HIPASS; ATNF)
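
As a sanity check on the quoted ~1 TB per cube, assuming single-precision (4-byte) voxels, which is an assumption here rather than something stated on the slide:

```latex
4096 \times 4096 \times 16384 \ \text{voxels} \times 4\ \text{bytes/voxel}
  = 2^{40}\ \text{bytes} = 1\ \text{TiB} \approx 1\ \text{TB per cube}
```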

7. gSTAR: GPU Supercomputer for Theoretical Astrophysics Research
Funding: AAL/Education Investment Fund + Swinburne
Peak: ~130 Tflop/s
100 x NVIDIA Tesla C2070 + 21 x NVIDIA Tesla M2090
Credit: Gin Tan

8. Graphics Processing Units (GPUs) are…
• Massively parallel
• Programmable*
• Computational co-processors
• Providing 10x–100x speed-ups for many scientific problems
• At low cost (TFLOP/$)
(But you can’t use existing code)
[* CUDA, OpenCL, PyCUDA, Thrust, OpenACC, CUFFT, cuBLAS ….]

9. The future of computing is massively parallel
• Lower price/performance for Tflop/s → save money
• Run an individual HPC problem faster → save time
• Solve a more complex problem in the same time → increased accuracy
• Run more problems in the same time → parameter space
• Solve a bigger problem in the same time → higher resolution
Is my algorithm suitable for a GPU? See: Barsdell et al. MNRAS (2010), Fluke et al. PASA (2011)

10. What types of problems are GPUs good for?
• Inherent data parallelism, e.g. pixel-by-pixel operations (SIMD): C_ij = A_ij * B_ij
• High arithmetic intensity: N* >> 1 (many operations per data element)
(Image: Abell 1689, NASA/Benitez et al.)
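
To make the pixel-by-pixel idea concrete, a minimal CUDA sketch of the element-wise operation C_ij = A_ij * B_ij; the kernel name, image size and test values are illustrative only:

```cuda
#include <cuda_runtime.h>
#include <cstdio>
#include <vector>

// Each thread handles one pixel: C[i] = A[i] * B[i] (SIMD-style data parallelism).
__global__ void multiply_pixels(const float* A, const float* B, float* C, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) C[i] = A[i] * B[i];
}

int main()
{
    const int n = 1024 * 1024;                     // e.g. a 1024 x 1024 image
    std::vector<float> hA(n, 1.5f), hB(n, 2.0f), hC(n);

    float *dA, *dB, *dC;
    cudaMalloc(&dA, n * sizeof(float));
    cudaMalloc(&dB, n * sizeof(float));
    cudaMalloc(&dC, n * sizeof(float));
    cudaMemcpy(dA, hA.data(), n * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(dB, hB.data(), n * sizeof(float), cudaMemcpyHostToDevice);

    int threads = 256;
    multiply_pixels<<<(n + threads - 1) / threads, threads>>>(dA, dB, dC, n);
    cudaMemcpy(hC.data(), dC, n * sizeof(float), cudaMemcpyDeviceToHost);

    printf("C[0] = %f\n", hC[0]);                  // expect 3.0
    cudaFree(dA); cudaFree(dB); cudaFree(dC);
    return 0;
}
```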

11. What are GPUs being used for in astronomy? (ADS abstract search: 1 February 2012)
• 115+ abstracts
• O(40) application areas
• Mostly single-GPU
• Early adopters (“low-hanging fruit”?)
Fluke (2011), arXiv:1111.5081

12. Volume Rendering via Ray Casting
Pipeline stages: ray casting → sampling → transfer function → shading → compositing
Data parallelism + high arithmetic intensity
Image: Wikimedia Commons
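
A stripped-down, single-GPU sketch of the ray-casting idea, reduced to an axis-aligned maximum intensity projection (the projection used in the benchmarks on slide 14); the kernel name, volume layout and test data are assumptions, and a full ray caster would handle arbitrary view directions, interpolation and transfer functions:

```cuda
#include <cuda_runtime.h>
#include <algorithm>
#include <cstdio>
#include <vector>

// One thread per output pixel: march along the z axis of the volume
// (stored flat, x fastest), keeping the maximum sample seen (MIP).
__global__ void mip_axis_aligned(const float* volume, float* image,
                                 int nx, int ny, int nz)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= nx || y >= ny) return;

    float best = -1e30f;
    for (int z = 0; z < nz; ++z)
        best = fmaxf(best, volume[((size_t)z * ny + y) * nx + x]);
    image[(size_t)y * nx + x] = best;
}

int main()
{
    const int nx = 256, ny = 256, nz = 256;        // small test volume
    size_t nvox = (size_t)nx * ny * nz;
    std::vector<float> h_vol(nvox, 0.0f);
    h_vol[nvox / 2] = 42.0f;                       // a single bright voxel

    float *d_vol, *d_img;
    cudaMalloc(&d_vol, nvox * sizeof(float));
    cudaMalloc(&d_img, (size_t)nx * ny * sizeof(float));
    cudaMemcpy(d_vol, h_vol.data(), nvox * sizeof(float), cudaMemcpyHostToDevice);

    dim3 block(16, 16);
    dim3 grid((nx + 15) / 16, (ny + 15) / 16);
    mip_axis_aligned<<<grid, block>>>(d_vol, d_img, nx, ny, nz);

    std::vector<float> h_img((size_t)nx * ny);
    cudaMemcpy(h_img.data(), d_img, h_img.size() * sizeof(float),
               cudaMemcpyDeviceToHost);
    printf("max pixel = %f\n", *std::max_element(h_img.begin(), h_img.end()));

    cudaFree(d_vol); cudaFree(d_img);
    return 0;
}
```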

13. Inter-node communication is the bottleneck
For details see: Hassan et al. (2010), NewA, and Hassan et al. (2012), PASA

14. Early Benchmarking: Maximum Intensity Projection
CSIRO GPU cluster: 64 CPU nodes, 128 GPUs (Tesla C1060, older; Tesla C2050, newer)
[Plot: time per frame vs. file size (4, 26, 66, 204 GB) for different output-frame resolutions, C1060 vs. C2050; frame rates of ~20 fps and ~50 fps marked; overhead = inter-node communication]
See Hassan et al. (2012), PASA, online early

15. Framework enhancements (Hassan et al. 2012, submitted)
• Dynamic peer-to-peer communication and merging via MPI
• Reduced computational load on server
• Supports arbitrary transfer function = quantitative visualisation or data analysis
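
A minimal sketch of the merge step that combines per-GPU partial frames into a single image, written here as an element-wise maximum via MPI_Reduce (which MPI implementations typically perform as a tree reduction among the ranks rather than a gather to one server); the frame size and variable names are illustrative, and this is not the framework's actual API:

```cuda
// Each MPI rank renders a partial MIP frame from its sub-volume on its GPU
// (e.g. with a kernel like mip_axis_aligned above), copies it back to the
// host, and the partial frames are merged with an element-wise maximum.
#include <mpi.h>
#include <vector>

int main(int argc, char** argv)
{
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    const int width = 1024, height = 1024;          // illustrative frame size
    std::vector<float> partial(width * height, 0.0f);
    std::vector<float> merged(width * height, 0.0f);

    // ... render this rank's sub-volume into 'partial' on the GPU,
    //     then cudaMemcpy the frame back to the host ...

    // Element-wise maximum across all ranks reproduces a global MIP.
    MPI_Reduce(partial.data(), merged.data(), width * height,
               MPI_FLOAT, MPI_MAX, /*root=*/0, MPI_COMM_WORLD);

    if (rank == 0) {
        // 'merged' now holds the full maximum intensity projection frame.
    }

    MPI_Finalize();
    return 0;
}
```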

16. By the numbers: put the whole cube in memory
• 48 x HIPASS (4 x 4 x 3 mosaic): 6884 x 6884 x 3072 voxels = 542.33 GB
• 96 GPUs: 90 Tesla C2070 + 6 Tesla M2090, 6 GB/GPU, 43392 cores
• Lustre file system: 113 strips, 546 sec ≈ 9 min load
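
The quoted size is consistent with single-precision (4-byte) voxels, an assumption here; dividing by the number of GPUs also shows why 6 GB per GPU is just enough:

```latex
6884 \times 6884 \times 3072\ \text{voxels} \times 4\ \text{bytes}
  = 582\,321\,635\,328\ \text{bytes} \approx 542.33\ \text{GiB},
\qquad
\frac{542.33\ \text{GiB}}{96\ \text{GPUs}} \approx 5.6\ \text{GiB per GPU} \ (< 6\ \text{GB})
```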

17. Visualisation: Scalability Testing

Configuration                                            Facility            Maximum size   Tested
32 nodes – 64 GPUs, 3 GB/GPU (min. 128 CPU cores)        CSIRO GPU Cluster   140 GB         Yes
64 nodes – 128 GPUs, 3 GB/GPU (min. 256 CPU cores)       CSIRO GPU Cluster   281 GB         Yes
32 nodes – 64 GPUs, 6 GB/GPU (min. 128 CPU cores)        gSTAR               300 GB         Yes
48 nodes – 96 GPUs, 6 GB/GPU (min. 192 CPU cores)        gSTAR               540 GB         Yes
64 nodes – 128 GPUs, 6 GB/GPU (min. 256 CPU cores)       Upgrade (2012?)     650 GB         Planned
128 nodes – 256 GPUs, 6 GB/GPU (min. 512 CPU cores)      Upgrade (2013?)     1.3 TB         No

Frame rates: > 10 fps for the CSIRO GPU Cluster configurations; ~7 fps for gSTAR (300 GB case).
WALLABY: 2014!

18. Analysing 0.5 TB (on 96 GPUs)

Task                                 Description                                            Time
Histogram                            Visit each data point once                             ~4 sec
Global mean and standard deviation   Summarising whole dataset into single value(s)         ~2 sec
Global median                        Multiple iterations to convergence (Torben’s method)   ~45 sec
3D spectrum tool                     Quantitative data interaction: click for spectrum      20 msec

Interactive 3D quantitative visualisation. Data: GASS (N. McClure-Griffiths; ATNF)
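
A single-GPU sketch of the "visit each data point once" style of pass, computing mean and standard deviation with Thrust (one of the libraries listed on slide 8); in the distributed case each GPU would reduce only its own sub-volume and the partial sums would then be combined across nodes. All names and sizes here are illustrative:

```cuda
#include <thrust/device_vector.h>
#include <thrust/functional.h>
#include <thrust/reduce.h>
#include <thrust/transform_reduce.h>
#include <cmath>
#include <cstdio>

// Square a value on the device; used to accumulate the sum of squares.
struct square
{
    __host__ __device__ float operator()(float x) const { return x * x; }
};

int main()
{
    const size_t n = 1 << 24;                      // illustrative sub-volume size
    thrust::device_vector<float> voxels(n, 1.5f);  // stand-in for real data

    // One reduction each for the sum and the sum of squares.
    float sum   = thrust::reduce(voxels.begin(), voxels.end(), 0.0f);
    float sumsq = thrust::transform_reduce(voxels.begin(), voxels.end(),
                                           square(), 0.0f, thrust::plus<float>());

    // In the distributed case these partial sums (and counts) would be
    // combined across GPUs/nodes, e.g. with MPI_Allreduce, before this step.
    float mean = sum / n;
    float std  = std::sqrt(sumsq / n - mean * mean);
    printf("mean = %f, std = %f\n", mean, std);
    return 0;
}
```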

19. Interactive data thresholding
[Four-panel figure: the same volume thresholded at 2σ, 3σ, 4σ and 7σ]
Real-time interaction = “Immediacy”
“What if?” questions = Knowledge Discovery
Hassan et al. 2012, submitted
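
A minimal sketch of a sigma-clipping threshold of the kind shown in the panels above; here the cut is applied by zeroing voxels below mean + k·σ, whereas an interactive viewer would instead apply the cut in the transfer function at sampling time so the data are never overwritten. Kernel and variable names are assumptions:

```cuda
#include <cuda_runtime.h>
#include <cstdio>

// Make every voxel below mean + k*sigma fully transparent (zero) so that
// only emission above the chosen significance survives compositing.
__global__ void sigma_threshold(float* volume, int n, float mean,
                                float sigma, float k)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n && volume[i] < mean + k * sigma)
        volume[i] = 0.0f;
}

int main()
{
    const int n = 1 << 20;                         // illustrative sub-volume size
    float* d_vol;
    cudaMalloc(&d_vol, n * sizeof(float));
    // ... fill d_vol with data and compute mean/sigma (e.g. as on slide 18) ...
    float mean = 0.0f, sigma = 1.0f;               // placeholder statistics

    int threads = 256;
    sigma_threshold<<<(n + threads - 1) / threads, threads>>>(d_vol, n,
                                                              mean, sigma, 2.0f);
    cudaDeviceSynchronize();
    printf("applied 2-sigma cut\n");
    cudaFree(d_vol);
    return 0;
}
```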

20. Future directions?
• Large-format displays
• Temporal data
• Polarisation (Stokes)
• New transfer functions, e.g. from medical imaging
[Image: 8000 × 8000 pixel volume rendering of the HIPASS dataset on the CSIRO Optiportal at Marsfield, NSW. Data: R. Jurek (ATNF) from 387 HIPASS cubes. Image: C. Fluke]

21. Conclusions
• Terascale real-time, interactive visualisation and data analysis?
  – Achievable with GPU clusters
  – Communication bound
• Wish list:
  – More memory/GPU
  – More GPUs/node (PCIe limit)
  – Faster inter-node communication
• Exciting parallel future!
