CBRAIN: An Integrated Web Platform for Neuroimaging
Tarek Sherif, Dr. Alan Evans Laboratory
EGI User Forum, April 2011
Funded by CANARIE, Canada's Advanced Research and Innovation Network (http://www.canarie.ca)
Network: CAnet & Global Lambda Facility
Canadian Brain Imaging Research Network / Global Brain Imaging Research Network
Summary
- Neuroimaging Overview
- Scientific Data Flow
- CBRAIN: A Distributed Computing Platform for Neuroimaging
- CBRAIN as a Web Service
What is Neuroimaging?
Neuroimaging sits at the intersection of clinical expertise, the physical sciences, basic neuroscience, imaging technology, and high performance computing. [Image: 3 Tesla MRI scanner]
Brain Imaging Techniques:
- Magnetic Resonance Imaging (MRI)
- Functional MRI (fMRI)
- Positron Emission Tomography (PET)
- Magnetoencephalography (MEG)
Neuroimaging Research
Population studies: Alzheimer's Disease, Multiple Sclerosis, Autism, Schizophrenia, normal brain development
[Images: Alzheimer's loss of cortical thickness; Multiple Sclerosis lesions; normal brain development in children]
Population Studies
- Hundreds to thousands of brain scans
- Images not aligned across subjects or scans
- Difficult to compare one brain to another
[Images: not registered vs. registered scans]
Extracting Features from Data
- Skull masking
- Registration in stereotaxic space
- Tissue classification
- 3D volumes
- Cortical thickness
- Gyrification index
- Lobe-based complexity
[Images: lobe-based cortical thickness; complexity; MS lesion model]
- 3-5 hours per scan
- Hundreds of scans per study
- GBs of data
- 1000s of CPU hours (rough estimate below)
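The last bullets imply the scale of a single study. A rough back-of-the-envelope check, where only the 3-5 hours per scan comes from this slide and the scan count and per-scan size are illustrative assumptions:

```python
# Rough scale estimate for one population study.
scans_per_study = 500            # illustrative stand-in for "hundreds of scans"
hours_per_scan = (3, 5)          # stated 3-5 hours of processing per scan
mb_per_scan = 5                  # assumed size of one raw anatomical scan

cpu_hours = [scans_per_study * h for h in hours_per_scan]
data_gb = scans_per_study * mb_per_scan / 1024

# Prints roughly 1500-2500 CPU hours and a few GB of raw input,
# i.e. "GBs of data" and "1000s of CPU hours".
print(f"{cpu_hours[0]}-{cpu_hours[1]} CPU hours, ~{data_gb:.1f} GB of raw input")
```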
Tools
Many valuable tools exist for neuroimaging, but they are typically:
- Desktop based
- Hard to install
- Hard to use
- Compute intensive
- Not collaborative
"Modern" Data Flow
[Diagram: Data → Compute → Analysis (visualisation) → Knowledge]
Data:
- Lots of formats (some are weird)!
- Lots of it (GB - TB+)
- Security
- Acquisition quality
- Completeness
- Annotation
Compute:
- Lots of tools of various quality
- Open source & proprietary
- Large amounts of compute
- Compute access
- Data transfer
- Cost
Analysis:
- Lots of tools of various quality
- Open source & proprietary
- 3D is often desktop based (not collaborative)
- Large data often requires infrastructure (cost)
CBRAIN: An Integrated Web Platform
Goal: Lightweight Distributed Architecture
- Nothing specific to neuroimaging
- Distributed data
- Distributed computing
- Distributed users
Simple Web Interface
[Screenshots: Data; Compute; Results Visualisation]
Distributed Components: separation of work is key
- CBRAIN Portal: presentation; models, logic, coordination (LIGHT network & compute)
- Database (MySQL): metadata; instances of data, users, jobs, tools, HPCs; DB states
- Execution Servers: control of resources (HPC, Web Services, ...) (HEAVY network & compute)
- Data Providers: files; networked file servers, databases
- Communication and data sync in the diagram use HTTP, SSL, XML, SQL and SSH
Distributed Platform
[Diagram, built up over several slides: a scientist in Montréal works through the Portal and its database; a Data Provider in Vancouver holds the data; compute sites (RQCHP in Sherbrooke, a second site in Vancouver) each run an Execution Controller on the cluster head node. The numbered steps show, roughly, the Portal contacting the Execution Controller, Workers being launched, input data being synced from the Data Provider, jobs being handed to the local Scheduler, results being written back to the Data Provider, and status/job control flowing back to the Portal.]
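A schematic sketch of the task lifecycle implied by that diagram, written in Python purely for illustration; none of these class or method names come from CBRAIN itself, which is not implemented this way:

```python
"""Illustrative sketch: portal -> execution controller -> scheduler/workers
-> data provider, as in the Distributed Platform diagram."""
from dataclasses import dataclass, field


@dataclass
class DataProvider:
    """Stands in for a networked file server holding study data."""
    files: dict = field(default_factory=dict)  # name -> contents

    def sync_to_cluster(self, names):
        # Copy input files from the provider to the cluster's work area.
        return {n: self.files[n] for n in names}

    def sync_from_cluster(self, results):
        # Copy results back to the provider when the job finishes.
        self.files.update(results)


@dataclass
class Scheduler:
    """Stands in for the cluster's batch scheduler launching workers."""
    def run(self, command, inputs):
        # A real scheduler would queue the job and start workers on nodes;
        # here we just pretend every input file was processed.
        return {f"{name}.out": f"{command}({name})" for name in inputs}


def execute_task(command, input_names, provider, scheduler):
    """What an Execution Controller on a cluster head node would do."""
    inputs = provider.sync_to_cluster(input_names)   # data sync in
    results = scheduler.run(command, inputs)         # submit + wait
    provider.sync_from_cluster(results)              # data sync out
    return {"status": "Completed", "outputs": sorted(results)}  # report to portal


if __name__ == "__main__":
    dp = DataProvider(files={"subject01.mnc": "...", "subject02.mnc": "..."})
    print(execute_task("civet", ["subject01.mnc", "subject02.mnc"], dp, Scheduler()))
```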
Achievements: Illustrative Performance Comparison
NIH-Pediatric-Obj1: up to 3 visits per subject; 866 CIVET pipeline runs to generate cortical thickness maps
Input: 866 x 3 x 5 MB = 15 GB; Output: 866 x 250 MB = 211 GB

Cluster                           | Total CPU-hrs   | Max perf: # cores | Max perf: time (h) | Typical: # cores | Typical: time (h)
mammouth-ms2 (RQCHP, Sherbrooke)  | 866 x 4 = 3464  | ~500              | 3                  | 176              | 20
CLUMEQ-Krylov (McGill)            | 866 x 6 = 5196  | ~90               | 58                 | 24               | 216
BIC (Linux)                       | 866 x 8 = 6928  | ~100              | 69                 | 40               | 173

In general, studies that used to take 1 week to 1 month now take 1 day.
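The typical-performance columns are internally consistent: cores times wall-clock hours roughly reproduces the total CPU-hours. A small check using only the numbers in the table:

```python
# Consistency check of the "typical performance" columns above:
# cores x wall-clock hours should roughly equal the total CPU-hours.
rows = {
    "mammouth-ms2":  {"cpu_hours": 866 * 4, "cores": 176, "hours": 20},
    "CLUMEQ-Krylov": {"cpu_hours": 866 * 6, "cores": 24,  "hours": 216},
    "BIC (Linux)":   {"cpu_hours": 866 * 8, "cores": 40,  "hours": 173},
}
for name, r in rows.items():
    print(name, r["cpu_hours"], "CPU-h ~", r["cores"] * r["hours"], "core-h")
```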
HPC Integration (8 compute installations, 80,000+ cores)
- JUROPA – Julich (26304 cores)
- Colosse – CLUMEQ (7616 cores)
- Orcinus – Westgrid (3072 cores)
- Mammouth II – RQCHP (2464 cores)
- GPC – SciNET (30240 cores)
- Kraken – SHARCNET (3774 cores)
- McGill – CLUMEQ & local servers (350 - 16000 cores)
Collaboration - The integrative approach of CBRAIN makes sharing resources extremely simple. - CBRAIN uses Projects to define permissions, similar to groups in Unix. • Each resource in the system (files, HPCs, Data Providers) is assigned a Project. • All users in a given Project have access to any resources associated with that Project.
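A minimal sketch of the access rule just described, i.e. resources carry a Project and users belong to Projects. The class and method names below are illustrative only, not CBRAIN's actual implementation:

```python
# Illustrative sketch of the Project-based permission rule described above.
from dataclasses import dataclass, field


@dataclass
class User:
    name: str
    projects: set = field(default_factory=set)


@dataclass
class Resource:
    """A file, Data Provider, HPC, ... assigned to a Project."""
    name: str
    project: str


def can_access(user: User, resource: Resource) -> bool:
    """All users in a Project have access to resources assigned to it."""
    return resource.project in user.projects


alice = User("alice", projects={"NIH-Pediatric"})
scan = Resource("subject01.mnc", project="NIH-Pediatric")
assert can_access(alice, scan)
```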
Infrastructures: Continental Access for Communities
[Diagram: research communities accessing linked infrastructures, including CBRAIN (McGill), LONI (UCLA) and neuGRID (FBF), connected through outGRID]
CBRAIN as a Web Service - Allow interactions from clients other than web browsers. - RESTful API. - XML and JSON-based interactions.
CBRAIN-LONI Interoperability Demo
CBRAIN as a Web Service
- What's been done:
• Most key CBRAIN resources are now available through a RESTful interface.
• Outside applications can now get lists of files and tasks, submit jobs, etc.
- What's left to be done:
• CBRAIN's integrated framework is meant to handle data already in the system, i.e. files are meant to be registered with the system so that CBRAIN can track them, avoid redundancy, etc.
• It must be decided how external systems will get their data into and out of CBRAIN in a reasonable manner.
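As a rough sketch of what such an outside client could look like, here is a hedged Python example using the requests library. The portal address, endpoint paths, credential fields and task parameters are placeholders, not a documented CBRAIN API:

```python
# Hypothetical client-side sketch of the RESTful interactions described above.
# The base URL, endpoint paths and parameter names are assumptions only.
import requests

BASE = "https://cbrain.example.org"   # placeholder portal address
session = requests.Session()

# Authenticate (endpoint and field names are assumptions).
session.post(f"{BASE}/session", data={"login": "user", "password": "secret"})

# List files already registered in CBRAIN, as JSON.
files = session.get(f"{BASE}/userfiles.json").json()

# Submit a processing task that refers to registered files.
task = {"tool": "civet", "file_ids": [f["id"] for f in files[:1]]}
resp = session.post(f"{BASE}/tasks.json", json=task)
print(resp.status_code, resp.json())
```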
Team
Montreal Neurological Institute, McGill University (Lead): http://cbrain.mcgill.ca
Contact: tsherif@bic.mni.mcgill.ca, alan.evans@mcgill.ca
Principal Investigator: Alan Evans
Program Manager: Reza Adalat
System Architect: Marc Rousseau
Developers: Pierre Rioux, Tarek Sherif, Angela McCloskey, Nicolas Kassis, Samir Das, David Brownlee
System Administrator: Tien Duc Nguyen
McGill Office of Technology Transfer (OTT): Francoys Labonte
National Research Council Canada: Louis Borgeat
Consultants: Rosanne Aleong, Claude Lepage, Pierre Bellec, Andrew Janke, Robert Vincent
Remote Sites:
Rotman Research Institute, University of Toronto. Principal Investigators: Stephen Strother, Randy McIntosh; Developers: Anda Pacurar, Anita Oder, Jacques Waller
Robarts Research Institute, University of Western Ontario. Principal Investigators: Ravi Menon, Mel Goodale; Developers: Martyn Klassen, Ronghai Tu
Unité de Neuroimagerie Fonctionnelle, Université de Montréal. Principal Investigators: Julien Doyon, Rick Hoge; Developer: Mathieu Desrosiers
Division of Neurology, University of British Columbia. Principal Investigators: Jon Stoessl, Max Cynader; Developers: Ryan Thomson, Nasim Vafai
NCMIR, University of California San Diego, USA. Principal Investigator: Mark Ellisman; System Administrator: Raj Singh
INM, Forschungszentrum Jülich, Germany. Principal Investigators: Karl Zilles, Uwe Pietrzyk; Scientist: Hartmut Mohlberg
CNA, Hanyang University, South Korea. Principal Investigator: Jong-min Lee
LONI, University of California Los Angeles, USA. Principal Investigator: Arthur Toga