Cloud-based data analysis: GPU-accelerated Coherent X-ray Imaging & HERCULES school perspective
Vincent Favre-Nicolin
ESRF, X-ray NanoProbe
HERCULES director
HERCULES European School: Neutron & Synchrotron radiation for science
COHERENT IMAGING SOFTWARE: PYNX
30-100x increase in coherent flux after the ESRF upgrade
Need for software that is:
• Robust
    • Algorithms
    • Standard experimental protocols
• Fast (online/live analysis)
    • Computer clusters, GPU
• Extensible: flexible toolbox
• Simple for users
    • Online (during experiment) and offline (back in home laboratory)
    • Can run in a notebook (cloud-based software)
http://ftp.esrf.fr/pub/scisoft/PyNX/doc/
https://software.pan-data.eu/software/102/pynx
PaNOSC Kickoff meeting | Vincent FAVRE-NICOLIN | 2019/01/15
GPU VS CPU COST EFFICIENCY (AMAZON)
                          GPU (V100)        GPU (V100)    Xeon E5-2686
                          OpenCL (clFFT)    CUDA          4 cores, FFTW
2D FFT (16x1024x1024)     2.14 ms           1.09 ms       38 ms
3D FFT (128**3)           0.28 ms           0.12 ms       4 ms
3D FFT (256**3)           4.2 ms            0.7 ms        60 ms
3D FFT (512**3)           46 ms             5.54 ms       550 ms
Amazon price/hour         3 €               3 €           0.4 €           x7
Cost per 10**6 2D FFT     0.11 €            0.06 €        0.26 €          x5
Cost per 10**6 3D FFT     38 €              4.6 €         61 €            x13
NB: timing does not include data transfer to the GPU (this implies long on-GPU computing)
The 3D 512**3 FFT on the V100 runs at 3.3 Tflop/s
• GPUs are up to 2 orders of magnitude faster than the CPU (4 cores): about 33x to 100x with CUDA
• Price per FFT is ~1 order of magnitude cheaper on GPU
• GPU memory is limited to 48 GB at most
Notes: Xeon E5-2686 test on the Amazon V100 machine (4 cores = 8 vCPUs). The 256**3 and 512**3 3D FFTs are 10-20% faster on ESRF scisoft14 (Xeon Gold 6134). FFTW with FFTW_MEASURE.
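The cost and Tflop/s figures above can be cross-checked with a few lines of arithmetic. This is only a sketch: it assumes that the 2D cost is counted per single 1024x1024 transform (the timed batch holds 16 of them), that the 3D cost refers to the 512**3 case, and that a complex FFT of N points costs 5 N log2(N) floating-point operations. None of these conventions is stated on the slide, but together they reproduce the quoted numbers.

```python
import math

# Values copied from the table (seconds per call, EUR per hour).
PRICE_PER_HOUR = {"opencl": 3.0, "cuda": 3.0, "cpu": 0.4}
T_2D_BATCH = {"opencl": 2.14e-3, "cuda": 1.09e-3, "cpu": 38e-3}  # 16 x 1024 x 1024
T_3D_512 = {"opencl": 46e-3, "cuda": 5.54e-3, "cpu": 550e-3}     # 512**3
BATCH_SIZE = 16  # assumption: cost is quoted per single 2D transform


def cost_per_million(seconds_per_fft, price_per_hour):
    """Price in EUR of one million transforms at the given hourly rate."""
    return seconds_per_fft * 1e6 / 3600.0 * price_per_hour


def fft_flops(n_points):
    """Assumed operation count of a complex FFT: 5 N log2(N)."""
    return 5.0 * n_points * math.log2(n_points)


cost_2d = {k: cost_per_million(T_2D_BATCH[k] / BATCH_SIZE, PRICE_PER_HOUR[k])
           for k in PRICE_PER_HOUR}
cost_3d = {k: cost_per_million(T_3D_512[k], PRICE_PER_HOUR[k])
           for k in PRICE_PER_HOUR}
tflops_cuda = fft_flops(512 ** 3) / T_3D_512["cuda"] / 1e12

for k in PRICE_PER_HOUR:
    print(f"{k:7s} 2D: {cost_2d[k]:.2f} EUR   3D: {cost_3d[k]:.1f} EUR")
print(f"CUDA 512**3 FFT throughput: {tflops_cuda:.1f} Tflop/s")
```

The same helper can be reused with other hourly prices to compare instance types.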
DATA ANALYSIS WITH PYNX (GPU-BASED)
You can use PyNX (without any GPU knowledge) through:
• Python API with operators
• Command-line scripts
• Notebooks
For:
• Coherent Diffraction Imaging (CDI)
• Ptychography (near and far field)
• Small-angle and Bragg geometry
Simple installation script, but requires a GPU workstation with CUDA and/or OpenCL
Example command line:
    pynx-id16apty.py ptychomotors=mot_pos.txt,-x,y
        probe=focus,120e-6x120e-6,0.1
        h5meta=meta.h5 h5data=data..nxs
        algorithm=analysis,ML**200,DM**300,probe=1,nbprobe=3
        saveplot=object_phase save=all defocus=250e-6
EXAMPLE NOTEBOOK: PTYCHOGRAPHY
CXI: COHERENT X-RAY IMAGING FORMAT
The CXI file format aims to create a data format with the following requirements:
1. Simple: both writing and reading should be made simple.
2. Flexible: users should be able to easily extend it.
3. Fast: it should be efficient so as not to become a bottleneck.
4. Extendable: new features should be easily added without breaking compatibility with previous versions.
5. Unambiguous: it should be possible to interpret the files without using external information.
6. Compatible: the format should be as compatible as possible with existing formats.
Based on the HDF5 format
Now with a NeXus implementation (NXcxi_ptycho)
http://cxidb.org/cxi.html
EXAMPLE 1: 3D CDI
Test                  K20m     Titan X    Titan V    DGX-V100 (1 GPU)
3D CDI, 512**3        235 s    127 s      39 s       32 s
Time for 30 runs      2 h      1 h        20 mn      16 mn
• id10 dataset: Crystal Growth and Design, 14, 4183 (2017)
• Data acquisition time: 3 h
EBS outlook:
• id10 plans for 1k**3 and even 2k**3 datasets:
    • Size x2 along each dimension => memory x8 => computation time x10 (approximately)
• Faster data acquisition
• Expect a 10x to 50x increase in the data analysis load
• GPUs with large amounts of memory are essential. A 2k**3 dataset cannot fit in a current-generation GPU and needs a multi-GPU FFT (slow)
Notes:
• The 512**3 dataset needs 5 GB of memory
• Recipe used for the solution: 1000 cycles (800 RAAR + 200 ER)
• 30 runs are necessary to be sure of the correct solution. They can be distributed.
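The scaling rules and the 30-run totals can be reproduced with simple arithmetic. The sketch below assumes single-precision complex arrays (8 bytes per voxel), an N log N cost model for the 3D FFT, and an even split of the independent runs over the available GPUs. These are illustrative assumptions and the function names are hypothetical.

```python
import math

N_RUNS = 30
# Seconds per run, copied from the table.
TIME_PER_RUN_S = {"K20m": 235, "Titan X": 127, "Titan V": 39,
                  "DGX-V100 (1 GPU)": 32}


def total_minutes(seconds_per_run, n_runs=N_RUNS, n_gpus=1):
    """Wall time when independent runs are spread evenly over n_gpus."""
    return math.ceil(n_runs / n_gpus) * seconds_per_run / 60.0


def array_gib(n, bytes_per_voxel=8):
    """Size in GiB of one n**3 array (assumed complex64: 8 bytes per voxel)."""
    return n ** 3 * bytes_per_voxel / 2 ** 30


def fft_cost_ratio(n_small, n_large):
    """Relative cost of two 3D FFTs, assuming N log2(N) scaling."""
    a, b = n_small ** 3, n_large ** 3
    return (b * math.log2(b)) / (a * math.log2(a))


for gpu, t in TIME_PER_RUN_S.items():
    print(f"{gpu:18s} 30 runs: {total_minutes(t):6.1f} mn")
for n in (512, 1024, 2048):
    print(f"{n}**3: {array_gib(n):5.1f} GiB per complex64 array")
print(f"1k**3 vs 512**3 FFT cost: x{fft_cost_ratio(512, 1024):.1f}")
```

Under these assumptions a single 2k**3 array already takes 64 GiB, which is why such a dataset cannot fit in one GPU, and the FFT cost ratio of about 9 for a doubled size is consistent with the "x10" rule of thumb.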
3D PTYCHO-TOMO, NEAR FIELD
Test                                        K20m       Titan X    Titan V     DGX-V100 (1 GPU)
2D Ptycho, 17 frames 2k*2k                  65 mn      36 mn      13 mn       10.5 mn
Time for 720 projections (extrapolated)     33 days    18 days    6.6 days    5.2 days
• id16A dataset, courtesy of P. Cloetens & J. da Silva
• Reconstruction with ptypy took ~20 h (10 cores) per projection
• Extrapolation to a ptycho-tomo dataset with 720 angles
• Data acquisition time for 700-800 angles: ~14 h
EBS outlook:
• Faster data acquisition (10x)
• Up to 2000 projections, and 4k frames
• 50 to 100x increase in the data analysis load
Notes:
• Recipe used for the solution: 4000 DM + 300 ML, 3 probe modes
• Time does not include tomography reconstruction & unwrapping
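The extrapolated totals follow directly from the per-projection timings. A minimal sketch, assuming all 720 projections take the same time and can be split evenly over several GPUs (the `n_gpus` parameter is a hypothetical addition, not something measured on the slide):

```python
N_PROJECTIONS = 720
# Minutes per 2D ptychography reconstruction, copied from the table.
MINUTES_PER_PROJECTION = {"K20m": 65, "Titan X": 36, "Titan V": 13,
                          "DGX-V100 (1 GPU)": 10.5}
ACQUISITION_HOURS = 14  # approximate acquisition time for 700-800 angles


def days_for_tomo(minutes_per_projection, n_projections=N_PROJECTIONS,
                  n_gpus=1):
    """Total reconstruction time in days, assuming perfect load balancing."""
    return minutes_per_projection * n_projections / n_gpus / (60 * 24)


for gpu, minutes in MINUTES_PER_PROJECTION.items():
    days = days_for_tomo(minutes)
    ratio = days * 24 / ACQUISITION_HOURS
    print(f"{gpu:18s} {days:5.2f} days  ({ratio:4.1f}x acquisition time)")
```

Small differences from the table (for example 6.5 versus 6.6 days for the Titan V) presumably come from rounding of the per-projection timings.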
(COHERENT) IMAGING DATA ANALYSIS
• Analysis algorithms/workflows exist
• Data often need some tuning of algorithms or data corrections
• GPUs are needed
• Computing resources at the facility:
    • Will allow online analysis to follow the experiment and tune parameters/understand samples
    • Won't allow analysis of all acquired datasets
• Users need to continue data analysis in their home laboratory as seamlessly as possible:
    • Avoid relying on the facility scientist after the experiment
    • Avoid overly complicated software deployment
    • Hardware solutions (virtual or not) should be accessible
    • Reduce the time to publication and increase the publication rate after the experiment
• What about paid-for DaaS? (private companies)
• Can we have a DaaS marketplace?
CLOUD & GPU-BASED DATA ANALYSIS
• Need GPU for fast data analysis (cloud: tested with Amazon EC2 machines)
• Easy to provide virtual images with all the requirements
• Notebooks:
    • GPU analysis is possible
    • GPU memory is not released by kernels (persistent GPU context) => issue for multi-user machines
    • Client/GPU-server approach? At least for large datasets => broadcast jobs to a GPU cluster
• Need a reliable/stable API for 2D and 3D data display & manipulation
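One possible way around the persistent GPU context, short of a full client/GPU-server setup, is to run each heavy job in a short-lived child process, since the driver frees a process's GPU memory when that process exits. The sketch below only illustrates the pattern with the standard library: the child does no GPU work at all, and `CHILD_CODE`, `run_in_child` and the `n_cycles` parameter are hypothetical names, not part of PyNX.

```python
import json
import subprocess
import sys

# Code executed by the child process. In a real setup this is where the GPU
# reconstruction would run; here it is a placeholder that echoes its input.
CHILD_CODE = """
import json, sys
params = json.loads(sys.argv[1])
result = {"n_cycles": params["n_cycles"], "status": "done"}
print(json.dumps(result))
"""


def run_in_child(params):
    """Run one job in a separate Python process and return its JSON result.

    Any GPU context created by the child disappears when the child exits,
    so the notebook kernel itself never holds GPU memory.
    """
    completed = subprocess.run(
        [sys.executable, "-c", CHILD_CODE, json.dumps(params)],
        capture_output=True, text=True, check=True,
    )
    return json.loads(completed.stdout)


if __name__ == "__main__":
    print(run_in_child({"n_cycles": 1000}))
```

The same function signature could later be backed by a job submission to a GPU cluster without changing the notebook code that calls it.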
3D IMAGING ANALYSIS: SEGMENTATION
Coccoliths: CaCO3 shells around phytoplankton, responsible for storing >50% of human-produced CO2 (>500 Gt over the last 200 years)
Segmentation of coccoliths from 3D coherent diffraction imaging data.
=> Takes much longer than data acquisition & 3D reconstruction!
=> Automated data processing through known algorithms works fine in the cloud (notebooks), but parts requiring heavy user manipulation are still challenging (no solution or no stable API)
A. Gibaud, T. Beuvier, Y. Chushkin et al., in press
CLOUD-BASED TOOLS VALIDATION
Moving (user) communities to new tools:
• "Doing better than ImageJ is hard"
• "My users only want TIFF files, they can't use hdf5"
• "It took us 10 years to create our scripts"
• Workflows often have lots of options and need tuning to specific parameters
• Need to involve & convince user communities early
• Validate tools
• If all PaN facilities could use the same portal, that would be great!
HERCULES SCHOOL
• Annual school in ~March
• Training PhD students and post-docs since 1991 to use neutrons & synchrotron radiation, with a wide range of applications (from biology to condensed matter physics)
• 70-80 participants every year (>2000 since 1991)
• 5-week school, with 35-40% hands-on:
    • Practicals on instruments & beamlines
    • Tutorials with data analysis
    • 1 week (2 this year!) in groups at European partners
• 19 HERCULES Specialised Schools (HSC):
    • 5 days, focused on a single topic
    • Also with 2 days of practicals & tutorials
• Other schools:
    • Brazil in 2010, Taiwan in 2015, SESAME in 2019…
    • Calipsoplus 1-week regional schools (Solaris, …)
HERCULES CLOUD ANALYSIS TRAINING
Evangelise:
• The use of cloud-based data analysis
• What FAIR principles mean
• What Open Data is
• What resources are available
• Use cloud-based solutions for data analysis
The schedule is already packed, but we can promote this through lectures, tutorials and practicals
MOTIVATING TUTORS & LECTURERS
Main difficulty for data analysis evolution (workstation -> cloud):
• Scientists are already overwhelmed by running their beamline or instrument, and need to help users after their experiment too!
• Existing workflows 'work'
=> It is difficult to find extra time to develop e-learning documents/examples unless they fit an immediate purpose
However:
• Cloud-based solutions should make it easier for users to analyse data in their lab
• E-learning examples should be identical to real experimental workflows (or code/notebooks, …)