Big Data Phenomics in the VA


  1. Big Data Phenomics in the VA
     Mary Whooley MD, Director, VA Measurement Science QUERI, San Francisco VA Health Care System, University of California, San Francisco
     Kelly Cho PhD MPH, Phenomics Lead, Million Veteran Program, VA Boston Health Care System, Harvard Medical School
     AcademyHealth Annual Research Meeting, June 27, 2017

  2. Outline • Importance of data standardization and interoperability • PCORnet and the Observational Medical Outcomes Partnership (OMOP) Common Data Model • Million Veteran Program (use case) • Coding algorithms for computable phenotypes

  4. Big Data are Messy • Data entry • Data coding • Data organization • Data harmonization • Data analysis

  5. VA Information Systems Technology Architecture (VistA): VA hospitals and clinics

  6. Example: How can we identify uncontrolled diabetics?

  7. Logical Observation Identifiers Names and Codes (LOINC): http://loinc.org

  8. Example: How can we identify uncontrolled diabetics?
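One way to approach the question above is sketched below. It assumes each site records hemoglobin A1c under a different local test name, that those names have been mapped to a shared LOINC code, and that "uncontrolled" means a most recent HbA1c above 9%. The local test names, the record layout, and the 9% cutoff are illustrative assumptions, not taken from the slides.

```python
# Hypothetical local lab-test names mapped to LOINC codes.
# 4548-4 is the LOINC code for hemoglobin A1c/total hemoglobin in blood;
# 2345-7 is glucose in serum or plasma. The local names are invented.
LOCAL_TO_LOINC = {
    "HGB A1C": "4548-4",
    "HEMOGLOBIN A1C": "4548-4",
    "A1C (DCA)": "4548-4",
    "GLUCOSE": "2345-7",
}
HBA1C_LOINC = "4548-4"
UNCONTROLLED_THRESHOLD = 9.0  # percent; an assumed cutoff for illustration


def uncontrolled_diabetics(lab_rows, threshold=UNCONTROLLED_THRESHOLD):
    """Return sorted IDs of patients whose most recent HbA1c exceeds the threshold."""
    latest = {}
    for row in lab_rows:
        # Normalize the local name, then keep only rows that map to HbA1c.
        loinc = LOCAL_TO_LOINC.get(row["test_name"].strip().upper())
        if loinc != HBA1C_LOINC:
            continue
        pid = row["patient_id"]
        # ISO-formatted date strings compare correctly as plain strings.
        if pid not in latest or row["date"] > latest[pid]["date"]:
            latest[pid] = row
    return sorted(pid for pid, row in latest.items() if row["value"] > threshold)


labs = [
    {"patient_id": 1, "test_name": "Hgb A1c", "date": "2016-01-15", "value": 10.2},
    {"patient_id": 1, "test_name": "HEMOGLOBIN A1C", "date": "2017-01-20", "value": 7.1},
    {"patient_id": 2, "test_name": "A1c (DCA)", "date": "2017-02-03", "value": 9.5},
    {"patient_id": 3, "test_name": "Glucose", "date": "2017-02-10", "value": 180.0},
]
print(uncontrolled_diabetics(labs))  # prints [2]
```

Without the mapping step, the three differently named HbA1c tests would look like three unrelated labs, which is the practical case for a shared coding standard.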

  9. VA Corporate Data Warehouse Data Tables

  10. Big Data are Messy • Data entry • Data coding • Data organization • Data harmonization • Data analysis

  11. Outline • Importance of data standardization and interoperability • PCORnet and the Observational Medical Outcomes Partnership (OMOP) Common Data Model • Million Veteran Program (use case) • Coding algorithms for computable phenotypes

  12. http://www.pcornet.org/

  14. http://pscanner.ucsd.edu/

  15. http://pscanner.ucsd.edu/

  16. 2000 to present • 16 million unique patients • 11 million w/ at least one encounter • 5 million deaths • 3 billion procedures • 2.5 billion conditions • 973,000 providers (Abstract presented Nov 2015, Am Medical Informatics Assoc)

  17. Mapping to Observational Medical Outcomes Partnership (OMOP) Common Data Model: query using the same SQL code (SQL = Structured Query Language)
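The point about querying with the same SQL code can be illustrated with a minimal sketch. It builds two in-memory SQLite databases standing in for two sites that have both been mapped to an OMOP-style `measurement` table, then runs one unchanged query against each. The column names follow the OMOP Common Data Model; the concept ID used for HbA1c, the site data, and the 9% cutoff are assumptions for illustration and should be checked against the vocabulary actually in use.

```python
import sqlite3

# Assumed OMOP concept ID for hemoglobin A1c; verify against your vocabulary tables.
HBA1C_CONCEPT_ID = 3004410

# One query, written once, run unchanged at every site.
QUERY = """
SELECT COUNT(DISTINCT person_id)
FROM measurement
WHERE measurement_concept_id = ?
  AND value_as_number > ?
"""


def make_site(rows):
    """Create an in-memory database with a simplified OMOP-style measurement table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE measurement ("
        "person_id INTEGER, measurement_concept_id INTEGER, "
        "measurement_date TEXT, value_as_number REAL)"
    )
    conn.executemany("INSERT INTO measurement VALUES (?, ?, ?, ?)", rows)
    return conn


def count_above(conn, threshold=9.0):
    """Count distinct patients with any HbA1c value above the threshold."""
    return conn.execute(QUERY, (HBA1C_CONCEPT_ID, threshold)).fetchone()[0]


# Invented data for two hypothetical sites.
site_a = make_site([
    (1, HBA1C_CONCEPT_ID, "2016-03-01", 9.8),
    (2, HBA1C_CONCEPT_ID, "2016-04-01", 6.5),
])
site_b = make_site([
    (10, HBA1C_CONCEPT_ID, "2016-05-01", 10.1),
    (11, HBA1C_CONCEPT_ID, "2016-06-01", 9.4),
    (11, HBA1C_CONCEPT_ID, "2016-09-01", 9.9),
])

counts = {name: count_above(conn) for name, conn in [("site_a", site_a), ("site_b", site_b)]}
print(counts)  # prints {'site_a': 1, 'site_b': 2}
```

Because both sites share table and column names, only the connection changes between runs; the query text does not.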

  18. Observational Medical Outcomes Partnership (OMOP) Common Data Model Implementations: > 600 million patients worldwide

  19. Outline • Importance of data standardization and interoperability • PCORnet and the Observational Medical Outcomes Partnership (OMOP) Common Data Model • Million Veteran Program (use case) • Coding algorithms for computable phenotypes

  21. Million Veteran Program (MVP) • National VA research initiative aiming to enroll one million users of the VHA in an observational cohort • Over 500,000 patients already enrolled • Blood collection for genotyping and storage • Access to electronic medical record • Goal is to create database of genomic, military exposure, lifestyle and electronic health information

  22. Currently enrolling at >50 VHA Facilities. Principal Investigators: John Concato MD MS MPH, J. Michael Gaziano MD MPH

  23. Genome-wide association study (GWAS): identify genotype(s) associated with specified phenotype. [Plot: x-axis, chromosome (genotype) 1 to 23; y-axis, strength of association with computable phenotype]

  24. Genome-wide association study (GWAS): identify genotype(s) associated with specified phenotype. [Plot: x-axis, chromosome (genotype) 1 to 23; y-axis, strength of association with computable phenotype; annotation: gene (on chromosome 6) linked with specified phenotype]
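The "strength of association" on the vertical axis of such a plot is commonly reported as -log10 of a p-value computed per variant. The sketch below uses a simple allelic chi-square test on a 2x2 table of allele counts in cases and non-cases. The variant names and counts are invented, and the slides do not say which test MVP uses, so treat this as one generic possibility rather than the program's method.

```python
import math


def allelic_chi2(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Pearson chi-square statistic (1 degree of freedom) for a 2x2 allele-count table."""
    n = case_alt + case_ref + ctrl_alt + ctrl_ref
    numerator = n * (case_alt * ctrl_ref - case_ref * ctrl_alt) ** 2
    denominator = (
        (case_alt + case_ref) * (ctrl_alt + ctrl_ref)
        * (case_alt + ctrl_alt) * (case_ref + ctrl_ref)
    )
    return numerator / denominator if denominator else 0.0


def neg_log10_p(chi2):
    """-log10 of the p-value for a chi-square statistic with 1 degree of freedom."""
    # For 1 degree of freedom, the upper-tail probability equals erfc(sqrt(chi2 / 2)).
    p = math.erfc(math.sqrt(chi2 / 2.0))
    return -math.log10(p) if p > 0 else float("inf")


# Hypothetical variants: (name, chromosome, case_alt, case_ref, ctrl_alt, ctrl_ref).
variants = [
    ("variant_a", 2, 250, 750, 250, 750),
    ("variant_b", 6, 300, 700, 200, 800),
    ("variant_c", 11, 260, 740, 250, 750),
]

# Compute one association score per variant, as plotted against chromosome position.
scores = {
    name: neg_log10_p(allelic_chi2(ca, cr, ta, tr))
    for name, _chrom, ca, cr, ta, tr in variants
}
top_variant = max(scores, key=scores.get)
print(top_variant)  # prints variant_b
```

The quality of the case and non-case labels feeding the counts is what ties this step back to computable phenotypes: a noisy phenotype definition weakens every score.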

  25. Outline • Importance of data standardization and interoperability • PCORnet and the Observational Medical Outcomes Partnership (OMOP) Common Data Model • Million Veteran Program (use case) • Coding algorithms for computable phenotypes

  26. What is a computable phenotype? (Electronic Health Record)
      Structured data: • ICD9/10 codes • CPT codes • Prescriptions • Lab results • Vital signs
      + Unstructured data: • Visit notes (signs/symptoms, smoking/alcohol, employment) • Radiology reports • Discharge summary • Pathology reports
      = Computable Phenotype
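A computable phenotype that combines structured and unstructured data can be sketched as a small rule. The example below flags a patient when a diagnosis code is present and is supported by either a prescription or a mention in a clinical note. The rule, the drug list, the note pattern, and the `PatientRecord` layout are hypothetical and unvalidated; they are not the PheKB or MVP algorithm, and the text match makes no attempt to handle negation.

```python
import re
from dataclasses import dataclass, field


@dataclass
class PatientRecord:
    """Hypothetical, simplified view of one patient's record."""
    icd_codes: list = field(default_factory=list)      # structured
    prescriptions: list = field(default_factory=list)  # structured
    notes: list = field(default_factory=list)          # unstructured free text


# ICD-9 250.x is diabetes mellitus; ICD-10 E11.x is type 2 diabetes.
DIABETES_CODE_PREFIXES = ("250", "E11")
# Illustrative drug list only.
DIABETES_DRUGS = {"metformin", "glipizide", "insulin glargine"}
# Naive text match; a real pipeline would handle negation ("no history of ...").
NOTE_PATTERN = re.compile(r"\b(type 2 diabetes|t2dm)\b", re.IGNORECASE)


def has_diabetes_phenotype(rec):
    """Illustrative rule: a diagnosis code plus supporting evidence from meds or notes."""
    coded = any(code.startswith(DIABETES_CODE_PREFIXES) for code in rec.icd_codes)
    treated = any(drug.lower() in DIABETES_DRUGS for drug in rec.prescriptions)
    mentioned = any(NOTE_PATTERN.search(note) for note in rec.notes)
    return coded and (treated or mentioned)


patients = {
    "A": PatientRecord(icd_codes=["250.00"], prescriptions=["Metformin"]),
    "B": PatientRecord(icd_codes=["401.9"], notes=["History of T2DM per patient."]),
    "C": PatientRecord(icd_codes=["E11.9"], notes=["Pt with type 2 diabetes, stable."]),
}
cases = sorted(pid for pid, rec in patients.items() if has_diabetes_phenotype(rec))
print(cases)  # prints ['A', 'C']
```

Requiring a code plus one supporting source is a common way to trade some sensitivity for fewer false positives from stray billing codes.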

  27. Phenotype Algorithms – https://phekb.org/phenotypes
      Phenotype | Methods | Owner
      Atrial Fibrillation | CPT Codes, ICD 9 Codes, Natural Language Processing | Vanderbilt
      Dementia | ICD 9 Codes, Medications | eMERGE Univ Washington
      Heart Failure | CPT, ICD 9 Codes, Labs, Meds, Natural Language Processing | eMERGE Mayo
      Coronary Disease | CPT Codes, ICD 9 Codes | PCORI MidSouth CDRN
      Sleep Apnea | CPT Codes, ICD 9 Codes | Beth Israel Deaconess
      Type 2 Diabetes | ICD 9 Codes, Labs, Medications | eMERGE Northwestern
      Venous Thromboembolism | CPT, ICD 9 Codes, Vital Signs, Natural Language Processing | eMERGE Mayo

  28. [Diagram: Electronic Health Record; Data Mart; Training Set and Validation Set; Predicted Cases + Non-cases]
      1. Identify cases and non-cases (often requires chart review)
      2. Iteratively refine & test classification algorithm (probabilistic approach)
      3. Validate final algorithm
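The refine-then-validate loop can be sketched with chart-review labels as the reference standard. The example below assumes the algorithm emits a probability-like score per patient, picks a score cutoff on the training set, and then reports performance on a held-out validation set. The scores, labels, candidate cutoffs, and the 0.9 PPV target are invented for illustration.

```python
def classify(scores, threshold):
    """Label a patient a predicted case when the score meets the threshold."""
    return [score >= threshold for score in scores]


def metrics(predicted, truth):
    """PPV, sensitivity, and specificity of predictions against chart-review labels."""
    pairs = list(zip(predicted, truth))
    tp = sum(1 for p, t in pairs if p and t)
    fp = sum(1 for p, t in pairs if p and not t)
    fn = sum(1 for p, t in pairs if not p and t)
    tn = sum(1 for p, t in pairs if not p and not t)
    return {
        "ppv": tp / (tp + fp) if tp + fp else 0.0,
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
    }


def pick_threshold(scores, truth, candidates, min_ppv=0.9):
    """Return the lowest candidate threshold whose training-set PPV meets min_ppv."""
    for threshold in sorted(candidates):
        if metrics(classify(scores, threshold), truth)["ppv"] >= min_ppv:
            return threshold
    return max(candidates)  # fall back to the strictest cutoff


# Invented algorithm scores and chart-review labels (True = confirmed case).
train_scores = [0.95, 0.80, 0.65, 0.40, 0.30, 0.10]
train_truth = [True, True, False, True, False, False]
val_scores = [0.90, 0.70, 0.85, 0.20]
val_truth = [True, True, False, False]

# Step 2: refine the cutoff on the training set only.
threshold = pick_threshold(train_scores, train_truth, candidates=[0.25, 0.50, 0.75])
# Step 3: validate the fixed cutoff on data not used for tuning.
validation = metrics(classify(val_scores, threshold), val_truth)
print(threshold, validation)
```

Keeping the validation set out of the tuning step is what makes the reported PPV an honest estimate; in this toy data the cutoff that looks perfect in training performs noticeably worse on validation.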

  29. J Am Med Inform Assoc 2013; Genome Medicine 2015

  30. MVP Phenomics Group
      Mission: 1) to provide a phenotyping framework for MVP Phenomics Science; 2) to manage and coordinate resources for MVP phenotyping projects; 3) to play a leading role towards “Mapping the Human Phenome”
      Organization:
      • Kelly Cho PhD MPH, Lead, MVP Phenotyping
      • Scott DuVall PhD, Lead, MVP-VINCI Collaboration
      • Jackie Honerlaw RN MPH, Manager, Phenomics Core
      • Kevin Malohi BS, Manager, VINCI Data Services
      • Mai Nguyen PhD, Manager, MVP Data Analytics
      • Anne Ho MPH, Lead, MVP Data Management
      • David Gagnon MD PhD, Lead, Biostatistics and Data Science

  31. Summary – Big Data Phenomics in the VA • Big data are messy • VA EHR data have been mapped to national VA Corporate Data Warehouse (CDW) • CDW data have been transformed to OMOP Common Data Model • Million Veteran Program actively using these data • Phenotype algorithms can be shared at PheKB.org
