The Australian National Medical Research Data Storage Facility



slide-1
SLIDE 1

This work is licensed under a Creative Commons Attribution 4.0 International License.

  • #THETA2015

The Australian National Medical Research Data Storage Facility med.data.edu.au

Jeff Christiansen, Mohammad Islam, Derek Van Dyk, Leonie Hellmers and Ian Gibson

slide-2
SLIDE 2

med.data.edu.au

  • What is it?
  • What are the drivers?
  • Where did it come from?
  • Who has signed up?
  • Data Types
  • Why does it matter?
  • What will it do and how will it work?
  • What’s its status?
slide-3
SLIDE 3

What is it?

slide-4
SLIDE 4

What is it?

  • a National Facility to provide:
  • Highly secure petabyte-scale data storage
  • Related high-speed networked and secure computational services

for Australian medical and health research organisations
slide-5
SLIDE 5

What are the drivers?

slide-6
SLIDE 6

What are the drivers?

  • 1. Strategic reviews have pointed to an increased requirement to consolidate, aggregate and collaborate to maximise research outcomes

www.mckeonreview.org.au/

slide-7
SLIDE 7

What are the drivers?

www.mckeonreview.org.au/

slide-8
SLIDE 8

What are the drivers?

www.mckeonreview.org.au/

slide-9
SLIDE 9

What are the drivers?

www.mckeonreview.org.au/

slide-10
SLIDE 10

What are the drivers?

  • 2. Fast growth of the volume of medical research data

slide-11
SLIDE 11

What are the drivers?

  • 2. Fast growth of the volume of medical research data – e.g. imaging

Jeff Lichtman/Harvard University, CC BY-NC-ND Dan Vogel, CC BY-2.0

slide-12
SLIDE 12

What are the drivers?

  • 2. Fast growth of the volume of medical research data – e.g. genomic

Guy Cochrane, EBI

slide-13
SLIDE 13

What are the drivers?

  • 2. Fast growth of the volume of medical research data

Illumina HiSeq X Ten

  • ~USD 1000 per genome
  • 0.6 terabases of sequence per day
  • 3-4 terabytes of data per day
  • 10 machines at the Kinghorn Centre for Clinical Genomics (Garvan Institute)

www.businesswire.com
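
As a rough sense of scale, the per-day figure above can be turned into a yearly volume. The sketch below is back-of-envelope arithmetic only. It assumes (the slide does not say) that 3-4 terabytes per day is a per-machine figure, that the machines run every day of the year, and that 1 PB = 1000 TB. The function name `yearly_volume_pb` is illustrative.

```python
# Back-of-envelope estimate of yearly data volume from the figures on this slide.
# Assumptions (not stated on the slide): the 3-4 TB/day figure is per machine,
# machines run 365 days a year, and decimal units are used (1 PB = 1000 TB).

TB_PER_PB = 1000


def yearly_volume_pb(tb_per_day: float, machines: int = 1, days: int = 365) -> float:
    """Return the data volume in petabytes produced over `days` days."""
    return tb_per_day * machines * days / TB_PER_PB


if __name__ == "__main__":
    for tb_per_day in (3, 4):
        single = yearly_volume_pb(tb_per_day)
        fleet = yearly_volume_pb(tb_per_day, machines=10)
        print(f"{tb_per_day} TB/day: {single:.2f} PB/year per machine, "
              f"{fleet:.2f} PB/year for 10 machines")
```

Under those assumptions a single machine produces roughly 1.1 to 1.5 PB per year and ten machines roughly 11 to 15 PB per year. If the per-day figure instead describes all ten machines together, the single-machine result is the relevant one.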

slide-14
SLIDE 14

What are the drivers?

  • 3. The rise of personalised medicine
slide-15
SLIDE 15

What are the drivers?

  • 3. The rise of personalised medicine

[Diagram labels: University; Clinical and administrative records; Research data]

slide-16
SLIDE 16

Where did it come from?

slide-17
SLIDE 17

Where did it come from?

  • Funded through RDSI, RDS and ANDS projects

  • All part of the NCRIS program
  • Over $1 billion over past few years
  • $150M planned for 2015-16
  • Longer term planning underway
slide-18
SLIDE 18

Where did it come from?

  • RDSI: established 8 large data storage facilities nationally

slide-19
SLIDE 19

Where did it come from?

  • RDSI: established 8 large data storage facilities nationally
  • Intended to store research data of value for future research

  • Total storage ~50 petabytes
  • Large Collections Program
  • Medical, Imaging, Astronomy, Ecology, Genomics
slide-20
SLIDE 20

Where did it come from?

  • RDS: making RDSI storage sustainable.
slide-21
SLIDE 21

Where did it come from?

  • RDS: making RDSI storage sustainable.
  • Optimising and maturing the RDSI capability for 9 data-intensive research communities: Medical, Climate, Genomics, Humanities, Astronomy, Ecology, Marine, Geoscience, Imaging
  • Integrating rapidly expanding data holdings with the cloud and supercomputing infrastructure to provide a truly high-performance data-using capability

slide-22
SLIDE 22

Where did it come from?

RDSI/RDS Medical

slide-23
SLIDE 23

Where did it come from?

  • ANDS: ensuring research data is:
  • Managed
  • Connected
  • Findable
  • Reusable
slide-24
SLIDE 24

Where did it come from?

  • ANDS: ensuring research data is:
  • Managed
  • Connected
  • Findable
  • Reusable
  • Data Management for all users of RDSI
  • Incl. independent Medical Research Institutes
slide-25
SLIDE 25

Who has signed up?

slide-26
SLIDE 26

Who has signed up?

  • 31 letters of support, including 20 Medical Research Institutes

slide-27
SLIDE 27

Who has signed up?

  • The Nold Laboratory
  • Rapid Autopsy in Melanoma Consortium
  • Melbourne Genomics Health Alliance
  • Melbourne Brain Centre Imaging Unit
  • Melbourne femur collection
  • Health and Biomedical Informatics Centre

53 organisations

slide-28
SLIDE 28

Data types

  • Imaging
  • MRI, CT, PET, X-ray, histology…
  • Clinical
  • Surveys, pathology, video assessments, patient data, longitudinal studies, clinical trials, eHealth, EEG, anatomy/morphology, personalised medicine, drug discovery/therapeutics…

  • “Omics”
  • Genomics, Transcriptomics, Proteomics…
  • Biobanking
  • Computational modelling
  • Health economics
slide-29
SLIDE 29

Why does it matter?

slide-30
SLIDE 30

Why does it matter?

  • Compliance and data security/privacy
  • Efficiency and cost effectiveness
  • Supporting translational impact
  • Enabling new research methods and outcomes
slide-31
SLIDE 31

What will it do?

slide-32
SLIDE 32

What will it do?

  • Store data

– securely, including identifiable data

  • Describe data

– at collection and item level

  • Find data

– at collection level

  • Share data

– under appropriate conditions

  • Use data

– via a number of associated tools (TBD)
– including secure options

slide-33
SLIDE 33

What will it do?

  • Store data

– securely, including identifiable (sensitive) data

  • Describe data

– at collection and item level

  • Share data

– only under appropriate conditions

  • Find data

– at collection level

  • Use data

– via a number of associated tools (TBD)
– to include highly secure options

slide-34
SLIDE 34

How will it work?

slide-35
SLIDE 35

How will it work?

  • Each node is an independent entity
  • Storage + (HPC, Cloud)
  • Connected via AARNet
  • Data custodian enters into data storage agreement with node of their choice – typically the local one

  • Data custodians retain control over access
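
One way to picture the custodian-controlled access model is sketched below. It is an illustration only, not the facility's implementation: the `Collection` class, its fields and `find_collections` are hypothetical names, and the rule that discovery exposes collection-level titles while item-level detail needs custodian approval is an assumed reading of the slides.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Collection:
    """A hypothetical data collection held at one storage node."""
    title: str
    custodian: str   # person or organisation that controls access
    node: str        # storage node chosen by the custodian
    items: list[str] = field(default_factory=list)         # item-level descriptions
    approved_users: set[str] = field(default_factory=set)  # users the custodian approved

    def grant_access(self, requested_by: str, user: str) -> None:
        """Approve a user; only the custodian may do this."""
        if requested_by != self.custodian:
            raise PermissionError("only the data custodian can grant access")
        self.approved_users.add(user)

    def list_items(self, user: str) -> list[str]:
        """Return item-level detail to the custodian or an approved user."""
        if user != self.custodian and user not in self.approved_users:
            raise PermissionError(f"{user} has no access to '{self.title}'")
        return list(self.items)


def find_collections(catalogue: list[Collection], keyword: str) -> list[str]:
    """Discovery at collection level: return matching titles, no item detail."""
    return [c.title for c in catalogue if keyword.lower() in c.title.lower()]


if __name__ == "__main__":
    # All values below are made up for illustration.
    demo = Collection("Example imaging collection", custodian="custodian",
                      node="local-node", items=["scan-001", "scan-002"])
    print(find_collections([demo], "imaging"))
    demo.grant_access("custodian", "researcher")
    print(demo.list_items("researcher"))
```

Keeping the approval list on the collection itself mirrors the idea that access decisions stay with the custodian rather than with the node that hosts the data.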
slide-36
SLIDE 36

How will it work?

  • Governance: Advisory Board (current)
slide-37
SLIDE 37

What’s its status?

slide-38
SLIDE 38

What’s its status?

  • Project team established
  • Advisory Board established
slide-39
SLIDE 39

What’s its status?

  • The Nold Laboratory
  • Rapid Autopsy in Melanoma Consortium
  • Melbourne Genomics Health Alliance
  • Melbourne Brain Centre Imaging Unit
  • Melbourne femur collection
  • Health and Biomedical Informatics Centre

slide-40
SLIDE 40

What’s its status?

  • 80 foundation data collections identified
  • 4 PB
  • 80 technical analyses underway
  • 53 organisations across 4 states
  • Security
  • Access control
  • Use cases
slide-41
SLIDE 41

What happens next?

slide-42
SLIDE 42

What happens next?

  • Signing up more organisations
  • Establish roadmap of services

– data, metadata standards, data management and curation, APIs, analytics

slide-43
SLIDE 43

What happens next?

  • Signing up more organisations
  • Establish roadmap of services

– data, metadata standards, data management and curation, APIs, analytics

  • Opportunity to align access methods across consenting collections
  • Conducting research across aggregated collections

slide-44
SLIDE 44

Thanks

slide-45
SLIDE 45

med.data.edu.au

Supported by: