Ensuring the Quality of Data for Multi-Site Health Services Research
  Bradley G Hammill
  Duke School of Medicine & Duke Clinical Research Institute
  brad.hammill@duke.edu
Quality Across the Clinical Data Flow
  [Diagram: Clinical Encounter → Electronic Health Record → Research Database → Study-Specific Dataset]
  Many possible data quality intervention points
The PCORnet Experience
  Distributed Research Network
    – 13 Clinical Data Research Networks (CDRNs) comprising 80+ sites
    – Use of Common Data Model (CDM)
    – Primarily electronic health record data
    – Control of data is local, not central
    – Queries are used to generate summary results for return
  [Diagram: Clinical Encounter → Electronic Health Record → PCORnet Common Data Model → Study-Specific Dataset]
PCORnet Research Process
  Step 1: Research Needs
    • Purpose: State needs for addressing study objective(s)
  Step 2: Business Specifications
    • Purpose: Translate high-level clinical concepts to low(er)-level clinical concepts to obtain from the data. Fill in any gaps posed by the research question.
  Step 3: Technical Specifications
    • Purpose: Provide detailed instructions for defining concepts based on the PCORnet CDM
  Step 4: SAS Query
    • Purpose: Generate the SAS query to be sent to sites for execution
  “PCORnet Query Programming Guidelines”
ADAPTABLE Trial
  Aspirin Dosing: A Patient-centric Trial Assessing Benefits and Long-Term Effectiveness
    – Pragmatic clinical trial
    – Demonstration project of PCORnet
    – Leveraging EHR data
    – 20+ sites
  “…designed to reflect ‘real-world’ medical care by recruiting broad populations of patients, embedding the trial into the usual healthcare setting, and leveraging data from health systems to produce results that can be readily used to improve patient care.”
  General query strategy
    – 1-2 beta tests (limited sites) before official distribution
Step 1: Research Needs
  Among the enrolled population, describe medical history and prevalent conditions at baseline
    – Ex. Prior cardiac revascularization
  Among the enrolled population, summarize the rate of concurrent medication usage, at baseline and throughout the course of the trial
    – Ex. Aldosterone antagonist
  Among the enrolled population, compare event rates between treatment groups
    – Ex. Bleeding w/transfusion
Step 2: Business Specifications
  Define population
    – Enrolled patients
  Define relevant time periods
    – History: 1 year prior to enrollment
    – Follow-up: Up to 2.5 years following enrollment
    – Medication reporting: At baseline & every 6 months
  List of specific procedures that make up a concept
    – Prior cardiac revascularization: PCI, CABG, other?
    – Transfusion: Whole blood, red blood cells, other?
Step 2: Business Specifications (continued)
  List specific drug names and ingredients for each medication
    – Aldosterone antagonist
        Brand names: Inspra, Aldactone
        Ingredients: eplerenone, spironolactone
  List specific diagnoses that make up a concept
    – Bleeding: Intracranial hemorrhage, gastrointestinal hemorrhage, other?
  Other important things
    – Medication usage: Prescription or dispensing that covers a date
Step 3: Technical Specifications
  Codes, codes, codes (+ some logic)
  Some things to keep in mind
    – Do not make assumptions about the data (esp. based on your site’s data or experience with claims data)
    – Do specify comprehensive code lists
    – Do use validated algorithms where possible, but…
    – Do not limit yourself to validated algorithms
    – Do pre-test all queries
    – Do have a plan for site variability in data & results
    – Be flexible
Coding: Transfusions
  Specific transfusions (whole blood, red blood cells)
    – ICD-9-CM (Px): 99.03, 99.04
    – ICD-10-PCS: 3023[0|3][H|N|P]1, 3024[0|3][H|N|P]1, 3025[0|3][H|N|P]1, 3026[0|3][H|N|P]1
    – HCPCS: P9010, P9011, P9016, P9021, P9022, P9038, P9039, P9040, P9051, P9054, P9057, P9058
  Other or non-specific transfusions
    – ICD-9-CM (Px): 99.0x (except above)
    – ICD-10-PCS: [Many]
    – HCPCS: P90xx (except above)
    – CPT: 36430
  Issues:
    – Study period crosses ICD-10 implementation date / 01-Oct-2015
    – While at most sites other/non-specific transfusions are ~25% of all transfusions, some sites are as high as 75%
    – Annual transfusion rates in pre-test query (general CV population) were about 3%. What to do with sites with <0.5%?
    – What about revenue center codes?
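A minimal Python sketch of how the transfusion code lists on this slide could be turned into a classifier, for illustration only (the actual PCORnet queries are written in SAS). The function names, the code-type labels ("ICD9", "ICD10", "HCPCS", "CPT"), and the assumption that ICD-9 procedure codes are stored with a decimal point are all hypothetical. The slide's `[0|3]` notation is read as "0 or 3". The non-specific ICD-10-PCS list is left out because the slide only says "[Many]".

```python
import re

# Specific transfusions (whole blood, red blood cells), transcribed from the slide.
SPECIFIC_ICD9 = {"99.03", "99.04"}
SPECIFIC_ICD10PCS = re.compile(r"^302[3-6][03][HNP]1$")  # 3023..3026, then 0|3, then H|N|P, then 1
SPECIFIC_HCPCS = {
    "P9010", "P9011", "P9016", "P9021", "P9022", "P9038",
    "P9039", "P9040", "P9051", "P9054", "P9057", "P9058",
}

# Other or non-specific transfusions ("except above" is enforced by checking
# the specific lists first).
OTHER_ICD9 = re.compile(r"^99\.0\d$")
OTHER_HCPCS = re.compile(r"^P90\d\d$")
OTHER_CPT = {"36430"}


def classify_transfusion(code_type, code):
    """Return 'specific', 'other', or None for a (code_type, code) pair."""
    code = code.strip().upper()
    if code_type == "ICD9":
        if code in SPECIFIC_ICD9:
            return "specific"
        if OTHER_ICD9.match(code):
            return "other"
    elif code_type == "ICD10":
        # Non-specific ICD-10-PCS codes are not listed on the slide, so only
        # the specific pattern is recognized here.
        if SPECIFIC_ICD10PCS.match(code):
            return "specific"
    elif code_type == "HCPCS":
        if code in SPECIFIC_HCPCS:
            return "specific"
        if OTHER_HCPCS.match(code):
            return "other"
    elif code_type == "CPT":
        if code in OTHER_CPT:
            return "other"
    return None


def share_other(records):
    """Fraction of recognized transfusion records that are other/non-specific."""
    labels = [classify_transfusion(t, c) for t, c in records]
    n_specific = labels.count("specific")
    n_other = labels.count("other")
    total = n_specific + n_other
    return n_other / total if total else None
```

Computing `share_other` per site is one way to spot the sites where non-specific codes dominate, which is the variability the slide describes.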
Coding: Major Bleeding
  Specific diagnosis codes (too many to list)
    – ICD-9-CM (Dx)
    – ICD-10-CM
  Additional logic
    – Type of encounter: Inpatient
    – Diagnosis type: Primary
    – Timing: Between enrollment date & follow-up end date
  Issues:
    – Study period crosses ICD-10 implementation date / 01-Oct-2015
    – Annual major bleeding rates in pre-test query (general CV population) were about 5%. No concerning site outliers.
    – Validation studies have shown this to be a less-than-reliably coded outcome
    – Some sites in PCORnet do not have primary diagnosis indicators for IP encounters. How to handle?
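The slide leaves open how to handle sites without a primary diagnosis indicator. The Python sketch below shows one possible "defensive" option, not the approach the study adopted: apply the encounter-type and timing logic everywhere, and label each qualifying diagnosis by whether the primary indicator was present, so both definitions can be reported. The function name, argument names, the "IP" encounter label, and the tiny code set are hypothetical; the real code lists are described on the slide as too many to list.

```python
from datetime import date

# Illustrative placeholder only; NOT the study's code list.
EXAMPLE_BLEEDING_CODES = {"I62.9", "K92.2"}


def bleeding_event_status(enc_type, dx_code, dx_is_primary, admit_date,
                          enroll_date, followup_end, bleeding_codes):
    """Classify one diagnosis record.

    dx_is_primary: True, False, or None when the site has no indicator.
    Returns 'primary', 'any_position', or None (does not qualify).
    """
    if enc_type != "IP":                      # inpatient encounters only
        return None
    if not (enroll_date <= admit_date <= followup_end):
        return None                           # outside the follow-up window
    if dx_code not in bleeding_codes:
        return None
    if dx_is_primary is True:
        return "primary"
    if dx_is_primary is None:
        # Indicator unavailable at this site: keep the record, but label it
        # so results can be shown separately from primary-only results.
        return "any_position"
    return None                               # known secondary diagnosis
```

Reporting 'primary' and 'any_position' counts side by side makes the effect of the missing indicator visible per site rather than hiding it in a pooled rate.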
Coding: Aldosterone Antagonist
  Specific medication codes (too many to list)
    – Dispensing / National Drug Code (NDC)
    – Prescribing / RxNorm concept unique identifier (RxCUI)
  Dispensing window
    – DISPENSE_DATE → DISPENSE_DATE + DISPENSE_SUP
  Prescribing window (?)
    – RX_ORDER_DATE → RX_END_DATE
    – RX_START_DATE → RX_END_DATE
    – RX_ORDER_DATE → RX_ORDER_DATE + RX_DAYS_SUPPLY
    – RX_START_DATE → RX_START_DATE + RX_DAYS_SUPPLY
    – RX_START_DATE | RX_ORDER_DATE in a defined period
  Issues:
    – Use both medication tables? Not all sites have both.
    – Don’t forget to include discontinued codes.
    – Which RxCUI term types to include?
        o Need to know dose form?
        o Need to know strength?
    – How to handle missing end date & days supply information in prescribing table?
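The "covers a date" windows listed on this slide can be sketched in Python as follows. This is a hedged illustration, not the study's SAS logic: the function names are hypothetical, window endpoints are assumed to be inclusive, and the order of preference (start date over order date, end date over days supply) is one assumption among the alternatives the slide lists. When neither an end date nor a days supply is present, the sketch returns None instead of guessing, since the slide leaves that case open.

```python
from datetime import date, timedelta


def dispensing_covers(target, dispense_date, dispense_sup):
    """True if target falls in DISPENSE_DATE .. DISPENSE_DATE + DISPENSE_SUP."""
    if dispense_date is None or dispense_sup is None:
        return False
    return dispense_date <= target <= dispense_date + timedelta(days=dispense_sup)


def prescribing_covers(target, rx_order_date, rx_start_date,
                       rx_end_date, rx_days_supply):
    """True/False if the prescribing window can be built, None if it cannot."""
    # Assumption: prefer RX_START_DATE, fall back to RX_ORDER_DATE.
    start = rx_start_date if rx_start_date is not None else rx_order_date
    if start is None:
        return None
    # Assumption: prefer RX_END_DATE, fall back to start + RX_DAYS_SUPPLY.
    if rx_end_date is not None:
        end = rx_end_date
    elif rx_days_supply is not None:
        end = start + timedelta(days=rx_days_supply)
    else:
        return None  # missing end date and days supply: undetermined
    return start <= target <= end
```

Keeping the undetermined case distinct from "not covered" lets a site-level report show how many prescribing records could not be evaluated.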
Coding: Other issues
  Study population
    – Trials = Enrolled
    – Observational = Loyalty cohort?
  Relevant time periods
    – Medical history look-back
    – “Current” lab values
  Procedures in the CDM
    – Sites have made many different decisions
    – Ex: Injected medications, E&M codes, and many more
Dealing with Data & Coding Variability
  Understand the diversity
    – Data characterization results
    – Study-specific queries
  Have a plan
    – Write algorithms and specifications “defensively”
    – Include multiple concept specifications
  Test queries, then re-test
  Acknowledge the reality of the data
    – Potentially select sites based on initial query results
    – Show site-specific results as part of the report
  [Sidebar] fit-for-use /fit fawr yoos/ phrase
    Typically used to describe data that is capable of meeting specific study requirements
    Can also be used to describe sites that are capable
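As an illustration of reviewing site-specific results from a pre-test query, the Python sketch below flags sites whose event rate falls under a chosen threshold (for example, the <0.5% transfusion rate raised earlier in the talk). The function name, the input layout, and the idea of a single fixed threshold are assumptions for this sketch; the talk does not prescribe a rule for selecting sites.

```python
def flag_low_sites(site_counts, min_rate):
    """Return {site: rate} for sites whose rate is below min_rate.

    site_counts maps a site label to (events, patients). Sites with no
    patients get a rate of None and are flagged for manual review.
    """
    flagged = {}
    for site, (events, patients) in site_counts.items():
        rate = events / patients if patients else None
        if rate is None or rate < min_rate:
            flagged[site] = rate
    return flagged


# Hypothetical pre-test summary counts, not real PCORnet results.
example = {"site_A": (30, 1000), "site_B": (4, 1000), "site_C": (0, 0)}
low_sites = flag_low_sites(example, min_rate=0.005)
```

A flagged site is a prompt to investigate how the concept is captured there, not an automatic exclusion.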
Summary
  Data quality requires planning
  Data quality results from attention to detail
  Data quality means acknowledging when data are less than perfect
  Data quality means dealing with data variability
  THANK YOU! QUESTIONS?