DRN OC Updates, October 5, 2015

  1. DRN OC Updates October 5, 2015

  2. Agenda
     - Discussion of revised CDM Implementation FAQs: Shelley Rusincovitch
     - Phase 2 data characterization update: Laura Qualls
     - Defining analysis-ready: Lesley Curtis

  3. Revised CDM implementation FAQs Shelley Rusincovitch

  4. Relational Database Management Systems (RDBMSs) and SAS
     - In Phase II, different PCORnet activities will be based within the RDBMS and SAS instances at each datamart
       - For example, menu-driven querying within the RDBMS; data characterization within SAS

  5. Question: In order to support both SAS queries and menu-driven SQL queries, will we need SAS datasets, a relational database management system (RDBMS) database, or both?
     Response: Both the RDBMS and SAS instances need to be present. Therefore, this question is actually about the data stores.

  6. Data Stores for RDBMS-SAS
     Each site has 2 basic options:
     1. Most straightforward configuration: the site stores its data in 2 parallel instances: an RDBMS schema and a SAS dataset collection
     2. Option for advanced technical teams: the site configures its SAS instance to run distributed SAS programs against 1 data store in its RDBMS tables
     It is essential for each site to work with its institution’s SAS technical team to determine the optimal SAS configuration at the site.
     Read the complete SAS FAQs at https://pcornet.centraldesktop.com/p/aQAAAAACXx_y

  7. [Diagram: two data store configurations, each inside the institution’s firewall]
     - Configuration A: Parallel Data Stores
       - RDBMS instance (such as Oracle, SQL Server, etc.): CDM data stored in RDBMS tables; receives the PCORnet distributed SQL query
       - SAS instance: CDM data stored in a SAS dataset collection; receives the PCORnet distributed SAS module
     - Configuration B: Stand-alone RDBMS Data Store
       - RDBMS instance (such as Oracle, SQL Server, etc.): CDM data stored in RDBMS tables; receives the PCORnet distributed SQL query
       - SAS instance (SAS platform): receives the PCORnet distributed SAS module
     - Both configurations need the SAS platform

  8. Considerations for RDBMS-SAS configurations
     - Running distributed SAS programs against RDBMS tables is an advanced technical setup
       - This configuration may result in suboptimal performance, systems resource use, and response time:
         - As SAS reads the data, the RDBMS optimizer may not be able to compensate for these scans; performance optimization needs to be considered within the SAS basis as well
       - The PCORnet SAS programs still need to run without modification (except to change the libname to point it to the correct data source)

  9. Data Store Backups
     - Sites should be able to quickly (within 1-2 weeks) revert to and use the prior analysis-ready data store
       - Important for situations such as when the current datamart refresh has issues, is found to be unusable, etc.
     - Will be dependent upon the site’s configuration
       - For parallel RDBMS-SAS data stores, sites may choose to archive the SAS dataset collection
       - For a stand-alone RDBMS data store, RDBMS tables would be archived
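One way to picture the archive-and-revert step for a parallel SAS dataset collection is the minimal Python sketch below. It assumes the collection is simply a directory of dataset files; the function names, directory layout, and refresh labels are hypothetical and are not part of any PCORnet tooling. For a stand-alone RDBMS data store, the equivalent would be archiving the RDBMS tables with the database's own tools.

```python
import shutil
from pathlib import Path


def archive_collection(current_dir, archive_root, refresh_label):
    """Copy the current dataset collection into archive_root/refresh_label.

    Hypothetical helper: raises FileExistsError if that label was already
    archived, so an earlier archive is never silently overwritten.
    """
    dest = Path(archive_root) / refresh_label
    shutil.copytree(current_dir, dest)
    return dest


def revert_collection(current_dir, archive_root, refresh_label):
    """Replace the current collection with a previously archived refresh."""
    src = Path(archive_root) / refresh_label
    if not src.is_dir():
        raise FileNotFoundError(f"no archived refresh named {refresh_label!r}")
    current = Path(current_dir)
    if current.exists():
        shutil.rmtree(current)  # discard the unusable refresh
    shutil.copytree(src, current)
    return current
```

A site would call `archive_collection` once a refresh has been accepted as analysis-ready, and `revert_collection` if a later refresh turns out to be unusable.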

  10. Next Areas
      - Data characterization and analysis-ready classification
      - Role of the HARVEST table
      - Datamart refresh expectations
        - Participation in specific study activities

  11. Phase 2 Data Characterization Laura Qualls

  12. Phase 2 data characterization
      - Foundation for “analysis-ready”
      - Approach
        - Built upon the foundation of Phase 1 data characterization
        - Iterative changes to the query package to characterize additional tables
        - Enhanced analytic tools to expedite the characterization process and facilitate comparisons between DataMarts and between DataMart refreshes
      - Query distribution specifics
        - DRN Query Tool (PopMedNet) file distribution
        - SAS programs that expect SAS version 9.3+ and SAS data types (esp. for dates and times)

  13. Phase 2 data characterization, continued
      - Query package v3.0
        - Similar to Phase 1 data characterization package
        - Characterizes the 7 expected tables (Demographic, Enrollment, Encounter, Diagnosis, Procedures, Vital, and Harvest)
        - Includes approximately 15 new analytic queries, including some cross-table queries (e.g. patients with at least 1 diagnosis code and 1 vital measurement in the past year)
        - Includes DataMart metadata (HARVEST table; SAS version & installed components; operating system; etc.)
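The cross-table example above (patients with at least 1 diagnosis code and 1 vital measurement in the past year) can be sketched as follows. This is an illustration only, not the query package itself, which is distributed as SAS programs. It uses Python's built-in `sqlite3` with an in-memory database; the column names `PATID`, `ADMIT_DATE`, and `MEASURE_DATE`, the 365-day window, and all sample rows are assumptions made for the example.

```python
import sqlite3
from datetime import date, timedelta


def patients_with_dx_and_vital(conn, as_of, days=365):
    """Return PATIDs with >= 1 diagnosis and >= 1 vital in the look-back window."""
    start = (as_of - timedelta(days=days)).isoformat()
    end = as_of.isoformat()
    # Dates are stored as ISO-8601 text, so string comparison orders them correctly.
    sql = """
        SELECT DISTINCT d.PATID
        FROM DIAGNOSIS AS d
        WHERE d.ADMIT_DATE BETWEEN ? AND ?
          AND EXISTS (
              SELECT 1
              FROM VITAL AS v
              WHERE v.PATID = d.PATID
                AND v.MEASURE_DATE BETWEEN ? AND ?
          )
        ORDER BY d.PATID
    """
    return [row[0] for row in conn.execute(sql, (start, end, start, end))]


def build_demo_db():
    """Create a tiny in-memory database with made-up rows (not real CDM data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE DIAGNOSIS (PATID TEXT, DX TEXT, ADMIT_DATE TEXT)")
    conn.execute("CREATE TABLE VITAL (PATID TEXT, MEASURE_DATE TEXT)")
    conn.executemany("INSERT INTO DIAGNOSIS VALUES (?, ?, ?)", [
        ("P1", "DX-A", "2015-06-01"),  # diagnosis inside the window
        ("P2", "DX-B", "2013-01-15"),  # diagnosis too old
        ("P3", "DX-C", "2015-08-20"),  # inside the window, but no vital
    ])
    conn.executemany("INSERT INTO VITAL VALUES (?, ?)", [
        ("P1", "2015-07-04"),
        ("P2", "2015-09-01"),
    ])
    return conn


print(patients_with_dx_and_vital(build_demo_db(), date(2015, 10, 5)))  # ['P1']
```

Only P1 qualifies: P2's diagnosis falls outside the window and P3 has no vital measurement.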

  14. Phase 2 data characterization timeline
      - October 2015
        - Release code package for beta-testing query execution and response (not the full data characterization process)
      - November 2015
        - Refine code based on beta-testing results; finalize query package
      - December 2015-January 2016
        - Develop data characterization tools and reports
      - February 2016
        - Phase 2 data characterization onboarding begins
        - Locked/static DataMart required
        - Schedule TBD; DataMarts participating in demonstration projects will be prioritized
      - Estimated time to onboard each DataMart
        - Approximately 8-12 weeks
        - Timeline will vary depending on query response time and number of issues identified

  15. Defining Analysis-Ready Data for CDRNs DRN Operations Center Lesley Curtis

  16. What do we mean by analysis-ready?
      - Data that support feasibility assessments, prep-to-research queries, and dashboard metrics
      - Data that support interventional and observational CER studies with minimal additional curation (research queries)
      - Key assumption: Our definition of analysis-ready will evolve as demonstration projects get underway and use of the data increases!

  17. Analysis-ready according to the PFA
      - Full range of quality-checked data for a population of 1m
      - Transformed into the current version of the PCORnet CDM
      - Able to execute SAS queries against CDRN data without modification

  18. Analysis-ready according to the PFA* and FAQs
      - Full range of quality-checked data for a population of 1m*
      - Transformed into the current version of the PCORnet CDM*
      - Able to execute SAS queries against CDRN data without modification*
      - Shifting dates is not recommended
      - Locked, static instance (RDBMS and SAS dataset collection)

  19. Proposed requirements for feasibility and PTR query, dashboard metric readiness
      - DRN onboarding complete (PopMedNet)
      - DataMarts in PCORnet CDM v3.0
        - SAS and RDBMS
      - Static DataMart queryable with SAS
      - Unselected population
      - Initial data characterization complete with no significant issues
        - No primary key violations
        - PCORnet CDM tables populated for DEMOGRAPHIC, ENCOUNTER, DIAGNOSIS, PROCEDURES, HARVEST, ENROLLMENT, VITAL
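Two of the checks listed above, "no primary key violations" and "tables populated", lend themselves to a short sketch. The Python below uses the built-in `sqlite3` module purely for illustration; the actual characterization is SAS-based. The helper names and the demo rows are hypothetical, and treating `PATID` as the key of DEMOGRAPHIC is an assumption of the example.

```python
import sqlite3

# Table names as listed on the slide.
EXPECTED_TABLES = [
    "DEMOGRAPHIC", "ENCOUNTER", "DIAGNOSIS", "PROCEDURES",
    "HARVEST", "ENROLLMENT", "VITAL",
]


def duplicate_keys(conn, table, key_cols):
    """Return (key values..., count) for every key that appears more than once."""
    cols = ", ".join(key_cols)
    sql = (
        f"SELECT {cols}, COUNT(*) FROM {table} "
        f"GROUP BY {cols} HAVING COUNT(*) > 1 ORDER BY {cols}"
    )
    return conn.execute(sql).fetchall()


def unpopulated_tables(conn, tables):
    """Return the expected tables that are missing or contain zero rows."""
    existing = {
        row[0]
        for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
    }
    problems = []
    for table in tables:
        if table not in existing:
            problems.append(table)
        elif conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0] == 0:
            problems.append(table)
    return problems


# Made-up demo data: one duplicated PATID, one empty table, the rest missing.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE DEMOGRAPHIC (PATID TEXT)")
conn.executemany("INSERT INTO DEMOGRAPHIC VALUES (?)", [("P1",), ("P2",), ("P2",)])
conn.execute("CREATE TABLE VITAL (PATID TEXT)")

print(duplicate_keys(conn, "DEMOGRAPHIC", ["PATID"]))  # [('P2', 2)]
print(unpopulated_tables(conn, EXPECTED_TABLES))
```

Table and column names are interpolated into the SQL directly, which is acceptable here only because they come from a fixed list in the script rather than from user input.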

  20. Requirements for ‘research-ready’
      - Meets all requirements for feasibility and PTR query readiness
      - No date obfuscation
      - Ability to link with claims or actual linked datasets
      - No other significant findings in data characterization
        - High level of completeness for most data fields
        - No significant errors in data mapping
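The slides do not define what counts as a "high level of completeness", and objective metrics are listed as a next step. As a rough sketch of what a field-level completeness measure could look like, the Python below computes the share of populated values per field and flags fields under a cutoff. The 0.8 threshold, the field names, and the sample rows are hypothetical, and only `None` and empty strings are treated as missing; a real check would also need to decide how to treat coded "unknown" values.

```python
def field_completeness(rows, fields):
    """Return {field: share of rows with a populated value} for the given fields."""
    total = len(rows)
    result = {}
    for field in fields:
        filled = sum(1 for row in rows if row.get(field) not in (None, ""))
        result[field] = filled / total if total else 0.0
    return result


def fields_below(completeness, threshold):
    """Return the fields whose completeness falls below the threshold, sorted by name."""
    return sorted(f for f, share in completeness.items() if share < threshold)


# Made-up rows for illustration only.
rows = [
    {"SEX": "F", "BIRTH_DATE": "1970-01-01", "RACE": "05"},
    {"SEX": "M", "BIRTH_DATE": "1982-03-14", "RACE": None},
    {"SEX": "F", "BIRTH_DATE": "", "RACE": ""},
    {"SEX": "M", "BIRTH_DATE": "1990-12-31", "RACE": None},
]

completeness = field_completeness(rows, ["SEX", "BIRTH_DATE", "RACE"])
print(completeness)                     # {'SEX': 1.0, 'BIRTH_DATE': 0.75, 'RACE': 0.25}
print(fields_below(completeness, 0.8))  # ['BIRTH_DATE', 'RACE']
```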

  21. [Figure: Feasibility, PTR, and dashboard; Research]

  22. Data stoplight for feasibility and PTR queries, dashboard metrics
      - Not approved
        - Foundational requirements not met
        - Data characterization incomplete
      - Proceed with caution
        - Foundational requirements met
        - Data characterization complete
        - Data issues identified for resolution with next refresh
      - Approved
        - Foundational requirements met
        - Data characterization complete
        - No significant data issues identified
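The three stoplight levels above can be read as a simple decision rule, sketched below in Python. The function and enum names are hypothetical, and one interpretation is an assumption: a datamart is treated as "not approved" if either the foundational requirements are unmet or data characterization is incomplete, which the slide implies but does not state outright.

```python
from enum import Enum


class Stoplight(Enum):
    NOT_APPROVED = "Not approved"
    PROCEED_WITH_CAUTION = "Proceed with caution"
    APPROVED = "Approved"


def feasibility_stoplight(foundational_met, characterization_complete, issues_identified):
    """Classify a datamart for feasibility/PTR queries and dashboard metrics."""
    # Assumption: failing either precondition is enough for "not approved".
    if not (foundational_met and characterization_complete):
        return Stoplight.NOT_APPROVED
    # Both preconditions hold; open data issues downgrade the status.
    if issues_identified:
        return Stoplight.PROCEED_WITH_CAUTION
    return Stoplight.APPROVED


print(feasibility_stoplight(True, True, True).value)   # Proceed with caution
print(feasibility_stoplight(True, True, False).value)  # Approved
```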

  23. Data stoplight for research queries
      - Not approved
        - Not approved for feasibility and PTR queries, dashboard metrics
      - Proceed with caution
        - Linkage to claims possible for specific studies
        - Date obfuscation
        - Incomplete data at the field level
      - Approved
        - Linked population is ready for analysis
        - No date obfuscation
        - High level of completeness for most fields

  24. Data stoplight for research queries
      - Establish a uniform, objective standard
      - Determination of whether a given site/datamart is ready to participate in a given study will depend on the requirements of the protocol
        - ‘Proceed with caution’ may be sufficient for some studies while ‘approved’ may be insufficient for others.

  25. Next steps
      - Develop objective metrics for each requirement
        - A few are straightforward (no date obfuscation), most are not
        - Review, refine with EC
      - Clarify implications of ‘not approved’
      - Present to CDRN PIs, datamart technical teams
      - Implement with SAS-based data characterization process
      - Consider upstream ways to streamline the number of analysis-ready assessments.
