
International sharing of genomic and clinical data: challenges and opportunities. A/Prof Zornitza Stark and Dr Alejandro Metke


  1. International sharing of genomic and clinical data: challenges and opportunities A/Prof Zornitza Stark and Dr Alejandro Metke

  2. First human genome (2003): >10 years, USD$3 billion

  3. Genome sequencing: cost and time. 19.5 hours

  4. Genomic testing in healthcare, next 5 years: 60,000,000 patients

  5. [Figure labels: Population screening; Common disease; Drug response; Rare disease; Infectious disease; Cancer]

  6. The world is changing. Percentage of whole genomes and exomes that are funded by healthcare systems: 2012: ~1%; 2018: ~20%; 2022: >80%. Areas of clinical uptake: infectious disease, cancer, rare disease, common/chronic

  7. The GA4GH Ecosystem: 3000+ Subscribers; 550+ Organizational Members; 70+ Countries
     Global Alliance members include:
     • Universities and research institutes (22%)
     • Academic medical centers and health systems (10%)
     • Disease advocacy organizations and patient groups (4%)
     • Consortia and professional societies (13%)
     • Funders and agencies (5%)
     • Life science and information technology companies (46%)

  8. Global Participation: 70+ countries represented. Afghanistan, Argentina, Australia, Austria, Belgium, Botswana, Brazil, Cameroon, Canada, China, Colombia, Congo, Costa Rica, Croatia, Czech Republic, Denmark, Egypt, Estonia, Finland, France, Georgia, Germany, Ghana, Greece, Hong Kong, India, Ireland, Israel, Italy, Japan, Kenya, Luxembourg, Malawi, Malaysia, Mali, Mauritius, Mexico, Morocco, Nepal, Netherlands, New Zealand, Nicaragua, Niger, Nigeria, Norway, Peru, Philippines, Portugal, Qatar, Russian Federation, Sierra Leone, Singapore, Slovenia, South Africa, South Korea, Spain, Sri Lanka, Sudan, Sweden, Switzerland, Taiwan, Tanzania, Tunisia, Turkey, Uganda, Ukraine, United Kingdom, United States, Uruguay, Venezuela, Virgin Islands (U.S.)

  9. Roadmap & Leadership Roles
     Driver Project Champions (DPCs):
     ● Give high-level input on GA4GH activities and Roadmap tools
     ● Act as ‘team leads’ (e.g. appointing Contributors on Work Streams)
     ● Support GA4GH tool implementation within the Driver Projects
     ● ALL DPCs: attend bi-annual in-person SC meetings
     ● 1 representative DPC from each project: join bi-annual SC calls
     Work Stream Contributors:
     ● Actively contribute to development of deliverables
     ● Represent needs of Driver Projects on WS calls
     ● Liaise between WS activities & DPCs
     Work Stream Leads:
     ● “Community-minded” leaders with bandwidth to ensure delivery of tools at the expected rate
     ● Ensure balanced input from multiple DPs on tool development
     ● Chair WS calls
     ● Participate in quarterly SC meetings

  10. Direct Engagement Indirect Engagement

  11. Data Sharing Challenges: legal and ethical; genomic data; clinical data

  12. A new paradigm: FROM data copying TO data visiting

  13. Federation
     Open research data analysis:
     • Aggregate data globally
     • Download, analyse locally
     • Continues for basic research
     Healthcare data with research use analysis:
     • Analyse data locally (via VMs)
     • Collate analyses
     • New approach for both research and healthcare
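The "analyse locally, collate analyses" pattern on the Federation slide can be sketched in a few lines. This is only an illustration under stated assumptions, not the GA4GH or Australian Genomics implementation: the site names, the genotype encoding and the functions `local_allele_counts` and `collate` are all hypothetical.

```python
from collections import Counter

# Hypothetical per-site genotype data: variant ID -> alt-allele count per
# individual (0, 1 or 2). In a federated setup this data never leaves the site.
SITE_DATA = {
    "site_a": {"var1": [0, 1, 0, 2], "var2": [0, 0, 0, 0]},
    "site_b": {"var1": [1, 0], "var2": [0, 1]},
}

def local_allele_counts(genotypes):
    """Runs at the site: return (alt_alleles, total_alleles) per variant."""
    return {v: (sum(g), 2 * len(g)) for v, g in genotypes.items()}

def collate(summaries):
    """Runs at the coordinator: combine site summaries into allele frequencies."""
    alt, total = Counter(), Counter()
    for summary in summaries:
        for variant, (a, n) in summary.items():
            alt[variant] += a
            total[variant] += n
    return {v: alt[v] / total[v] for v in total}

# Only the aggregate counts are shared, never individual-level genotypes.
summaries = [local_allele_counts(g) for g in SITE_DATA.values()]
frequencies = collate(summaries)
```

The point of the design is that the coordinator only ever sees per-variant counts, which is what distinguishes "data visiting" from "data copying".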

  14. Core Principles of Data Sharing
     • Enable international data sharing
     • Promote sharing across the translational continuum (discovery research, clinical trials, healthcare system, diagnostic labs, industry)
     • Encourage technology-enabled federated approaches (bring analysis to the data)
     • Promote interoperability:
       - Scientific: Standards adoption; transparent documentation
       - Technical: Standardized file formats, variant calling protocols, variant & gene annotation
       - Ethical: Consent policies to ensure data can be shared internationally

  15. Global Learning for Health: Healthcare; Research; Genomic Knowledge Exchanges; Interoperable APIs, standards & frameworks to support global data sharing

  16. Real World Driver Projects: Develop and test standards, tools and frameworks for data sharing

  17. Program Two Data Management Work Flow and Capabilities [diagram; recoverable components listed]
     • Clinical Sequencing Laboratories: upload genomic data (VCF + BAM, FASTQ) and associated metadata (laboratory, sequencer, library preparation etc.)
     • ‘Shariant’ Platform: classified variants (reported & unreported) and curation evidence shared across laboratories
     • Data Quality Assessment: quality reports on BAM + VCF data produced by qprofiler software
     • Genomic Data Repository: Phase 1 genomic data store; Phase 2 comprehensive genomic data catalogue
     • Data Governance Policies, Data Access Agreements, Data Access Committee
     • Data Access and Approvals System: approvals issued automatically (low-sensitivity) or via data access review (high-sensitivity); access to individual-level data and to summary-level data
     • Gen-Phen Database ‘Variant Atlas’: genotypes and phenotypes available for interactive summary-level queries and visualisations
     • Consent: CTRL Dynamic Consent Platform
     • Australian Genomics Study Database: Flagship patient phenotype data
     • Standardised Clinical Phenotypes: patient phenotype data coded in SNOMED / HPO terms and represented in FHIR format
     • Secondary use of data
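The approvals rule on this slide (automatic approval for low-sensitivity requests, data access review for high-sensitivity ones) amounts to a simple routing step. The sketch below is an assumption-laden illustration, not the project's Data Access and Approvals System: the `AccessRequest` fields, the `"low"`/`"high"` labels and the returned status strings are hypothetical, and the slide does not say which datasets fall into which tier.

```python
from dataclasses import dataclass

@dataclass
class AccessRequest:
    requester: str
    dataset: str
    sensitivity: str  # "low" or "high" (labels assumed for this sketch)

def route_request(request: AccessRequest) -> str:
    """Route a request according to the sensitivity of the data asked for."""
    if request.sensitivity == "low":
        # Low-sensitivity: approval is issued automatically.
        return "approved_automatically"
    if request.sensitivity == "high":
        # High-sensitivity: a data access review must decide.
        return "sent_to_data_access_review"
    # Anything unclassified is rejected rather than silently approved.
    raise ValueError(f"unknown sensitivity: {request.sensitivity!r}")

low = route_request(AccessRequest("researcher_1", "dataset_x", "low"))
high = route_request(AccessRequest("researcher_1", "dataset_y", "high"))
```

Raising on an unknown tier is a deliberate choice in this sketch: an unclassified dataset should never fall through to automatic approval.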

  18. Genomics England: 100,000 Genomes Project

  19. UK National Genomics Informatics Service (NGIS): an evolution of the 100,000 Genomes Project platform [diagram; recoverable labels listed]
     • Delivery partners: Zone Digital, Optum, GeL, Elucidata, NHS England’s National Genomics Unit, Illumina, NPEx, Congenica, Fabric
     • Services and components: Genomic Test Directory, Test Ordering Service, Sample Tracking Service, WGS Sequencing Service, Data Management Service, Identifiable Data Service, De-Identified Data Service, Decision Support Service, eConsent Service, Panel Assigner, Pedigree Tool, Authentication Service, Value Extraction Community
     • Users and sites: Genomic Lab Hubs, GMCs, MDTs, NHS Trusts, Primary Care; diagnostic returns

  20. Clinical data sharing: harmonizing data capture and exchange

  21. Diagnostics: analysis pipelines; variant-level interpretation; case-level interpretation. Clinical data. Research: normal variation; disease cohorts; matchmaking

  22. The problem. Constraints: time; effort; skill; scalability. Requirements: relevant; high quality; accurate; machine-readable; interoperable. Risk: poor quality, incomplete data = suboptimal interpretation

  23. Mapping, harmonization, exchange: clinical data; NGIS model

  24. Consensus clinical data. Common with other tests: • Name, surname • DOB • Gender: phenotypic/karyotypic • Contact details • Identifiers: study/hospital • Referring clinician/centre/contact details

  25. Consensus clinical data. Unique to genetics/genomics: • Consent status +/- additional findings, data sharing, research • Pedigree/consanguinity • Additional family members to be tested and affected/unaffected status/consent status • Suspected clinical diagnosis/gene • Gene panels for prioritized analysis

  26. Consensus phenotypic data: • Phenotype using standard terminology: HPO • (Relevant prior tests: genetic and non-genetic)
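The consensus fields on the three slides above can be read as a single structured record. The following is a minimal sketch of such a record, assuming Python dataclasses; the class name `GenomicTestRequest`, the field names, the example patient and the helper `invalid_hpo_terms` are hypothetical and do not come from the NGIS model or any GA4GH schema. The only external fact relied on is that HPO identifiers have the form `HP:` followed by seven digits.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional

HPO_ID = re.compile(r"^HP:\d{7}$")  # HPO identifier format, e.g. HP:0001250

@dataclass
class GenomicTestRequest:
    # Common with other tests
    name: str
    surname: str
    date_of_birth: str            # ISO 8601 date string (assumed convention)
    gender: str                   # phenotypic/karyotypic
    identifiers: List[str]        # study/hospital identifiers
    referring_clinician: str
    # Unique to genetics/genomics
    consent_additional_findings: bool
    consent_data_sharing: bool
    consent_research: bool
    consanguinity: Optional[bool] = None
    suspected_diagnosis: Optional[str] = None
    gene_panels: List[str] = field(default_factory=list)
    # Phenotype using standard terminology
    hpo_terms: List[str] = field(default_factory=list)

def invalid_hpo_terms(request: GenomicTestRequest) -> List[str]:
    """Return phenotype entries that are not well-formed HPO identifiers."""
    return [t for t in request.hpo_terms if not HPO_ID.match(t)]

# Entirely made-up example record; "seizures" is free text, not an HPO ID.
example = GenomicTestRequest(
    name="Jane", surname="Example", date_of_birth="2015-01-31",
    gender="female", identifiers=["STUDY-0001"],
    referring_clinician="Dr Example",
    consent_additional_findings=False, consent_data_sharing=True,
    consent_research=True, hpo_terms=["HP:0001250", "seizures"],
)
problems = invalid_hpo_terms(example)
```

A format check like this only catches free text entered in place of a coded term; it does not confirm that an identifier exists in the ontology, which needs a terminology server lookup.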

  27. Phenotype capture

  28. Phenotype capture Acute Care: HPO terms in REDCap (via Ontoserver and REDCap Ontology Module)

  29. Ethnicity: understanding rare variation. Clinical diagnostics: Is this variant actually rare/absent? Errors in interpretation: more VOUS, false positives, false negatives. Research: responsibility to build diverse and representative datasets
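The question "Is this variant actually rare/absent?" depends on which reference population the frequency is taken from. The sketch below illustrates the effect with invented numbers: the population labels, allele counts and the 1% threshold are all assumptions made for the example, not values from any real reference database.

```python
# Hypothetical (alt_alleles, total_alleles) for one variant in two reference
# populations. The numbers are invented purely to illustrate the effect.
REFERENCE_COUNTS = {
    "population_a": (2, 100_000),   # large, well-represented cohort
    "population_b": (150, 5_000),   # small, under-represented cohort
}
RARE_THRESHOLD = 0.01  # assumed cut-off for "rare" in this sketch

def frequency(alt: int, total: int) -> float:
    return alt / total

def is_rare(freq: float) -> bool:
    return freq < RARE_THRESHOLD

# Pooled frequency is dominated by the larger cohort.
pooled_alt = sum(a for a, _ in REFERENCE_COUNTS.values())
pooled_total = sum(n for _, n in REFERENCE_COUNTS.values())
pooled_freq = frequency(pooled_alt, pooled_total)

# Frequency in the population matching the patient's ancestry.
matched_freq = frequency(*REFERENCE_COUNTS["population_b"])

rare_when_pooled = is_rare(pooled_freq)
rare_when_matched = is_rare(matched_freq)
```

With these made-up counts the variant looks rare in the pooled data but is not rare in the matched population, which is the kind of interpretation error the slide warns about.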

  30. Capturing ethnicity: problems. • Ascertainment heterogeneity and ambiguity • Different levels of granularity • Population identifiers: Geographical? Racial? Cultural? Political? • Multicultural societies and mixed ancestries • Lack of standards/ontologies. Are census codes fit for purpose?

  31. Capturing ethnicity: current status
     12 REA categories: • European (non-Finnish) • European (Finnish) • Sub-Saharan Africa • Asian • North African/Middle Eastern • Other Oceanian • People of the Americas • Maori/Pacific Islander • Aboriginal/Torres Strait Islander • Australian/New Zealander • Ashkenazi Jewish • Sephardic Jewish
     16 REA categories: • Chinese • White: British • White: Irish • White: any other • Asian or British Asian: Pakistani • Asian or British Asian: Bangladeshi • Asian or British Asian: Indian • Asian or British Asian: any other • Black or British Black: Caribbean • Black or British Black: African • Black or British Black: any other • Mixed: x 4

  32. Capturing ethnicity: a way forward?

  33. Human Ancestry Ontology

  34. Human Ancestry Ontology

  35. Pedigree: Consanguinity? Affected 1st-degree relatives. Suspected mode of inheritance

  36. Building tools to support implementation

  37. GA4GH Clinical & Phenotypic Data Capture Work Stream: 2019 Roadmap. Deliverable #1: Definition of phenotype models for different clinical domains with driver projects. Deliverable #2: Phenopackets on FHIR. Deliverable #3: Pedigree representation
