A Comprehensive Clinical Research Database based on CDISC ODM and i2b2
F. Meineke, S. Stäubert, M. Loebe, A. Winter
MIE 2014, Istanbul, 4.9.2014
Core Unit Data Centre of the Integrated Research and Treatment Center Adiposity Diseases
• Support clinical trial conduct
– Project management
– Biometry/Statistics
– Data management
• Support care / outpatient clinic
– Enhance documentation data quality
– Optimize documentation processes
• Clinical Informatics group
– Secure infrastructure
– Integrate obesity-related research data
– Research Database with services and tools for data retrieval
Head: Prof. Markus Löffler
Obesity / Adiposity
• Definition
– too much body fat
– ICD10 E66, WHO degree 1: BMI ≥ 30 kg/m², degree 3: BMI ≥ 40 kg/m²
– one of the leading preventable causes of death
• Treatment
– diet, exercise, medication, bariatric surgery
• Research examples
– see www.ifb-adipositas.de/en (Strength or Endurance? Surgery methods? Economics? Depression? …)
[Figure: gastric band]
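The BMI thresholds on this slide can be illustrated with a minimal sketch. The function names are hypothetical, and the degree 2 threshold (BMI ≥ 35 kg/m²) is not on the slide; it is added here from the usual WHO classification.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index in kg/m^2."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


def who_obesity_degree(bmi_value: float) -> int:
    """Return 0 (not obese) or the WHO obesity degree 1-3.

    Degree 1 and 3 thresholds are taken from the slide; the degree 2
    threshold (>= 35) is an addition from the WHO classification.
    """
    if bmi_value >= 40:
        return 3
    if bmi_value >= 35:
        return 2
    if bmi_value >= 30:
        return 1
    return 0


# Example: 130 kg at 1.75 m gives a BMI of about 42.4, i.e. degree 3.
example_degree = who_obesity_degree(bmi(130, 1.75))
```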
IFB Research Infrastructure
[Figure: IFB research infrastructure]
SAP = SAP i.s.h.med = Hospital Information System
LIFE = Leipzig Research Centre for Civilization Diseases
Methods
• IT Landscape: Research IT Infrastructure
• Framework for Research Database: Open Archival Information System
• Transport Data Format: Operational Data Model
• Process Data Handling: Curation Lifecycle Model
• Supporting Tools: ETL, MetaDataRepository, i2b2, …
• Access Model: Data Curation Continuum Model (Andrew Treloar)
Methods: Open Archival Storage System
ISO 14721:2012 Open archival information system
[Figure: OAIS functional model. Entities: Producer, Consumer. Functions: Ingest, Archival Storage, Data Management, Access, Preservation Planning, Administration. Packages: SIP (Producer to Ingest), AIP (into and out of Archival Storage), DIP (Access to Consumer). Further flows: descriptive information, queries, query responses, orders.]
Results: OAIS Based Research Database
[Figure: research database arranged along the OAIS functions Ingest, Archival Storage and Access. Elements shown: Tabular Data, ETL, Research Data Base, SQL, ODM. Layer labels: Source-specific, Static / Proprietary, Generic. Questions noted on the figure: Where is Metadata? n jobs? Moving target? For each query… one response.]
Results: OAIS Based Research Database
[Figure: research database arranged along the OAIS functions Ingest, Archival Storage and Access. Elements shown: Tabular Data, ETL, ODM Import, ODM-Pool, Standard Terminology, Meta Data, Control, i2b2 for group 1, SQL for group 2. Layer labels: Source-specific, Static / Proprietary, Generic.]
Methods: Storage Layer: Operational Data Model
• “is designed to facilitate the regulatory-compliant acquisition, archive and interchange of the metadata and data for clinical research studies”
• “All of the information that needs to be shared among different software systems during the study setup, operation, analysis, submission or for long term retention as part of an archive is included” (www.cdisc.org/odm)
• Mapping: HIS (e.g. SAP) → ODM / Clinical Trial
– Encounter → StudyEvent
– Document/Form → Form
– Group → ItemGroup
– Data column → Item
– Coding → CodeList
• Hypothesis: structured HIS Data (HIS, Lab, Biobank) can be mapped to CDISC ODM and technically …
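The HIS-to-ODM mapping on this slide can be sketched as follows. This is a minimal illustration, not the authors' importer: the input record, the OID prefixes and the function name are hypothetical, and a conforming ODM file would additionally need the ODM namespace, file-level attributes and a metadata section. Only the element nesting (ClinicalData, SubjectData, StudyEventData, FormData, ItemGroupData, ItemData) follows the ODM 1.3 clinical data structure.

```python
import xml.etree.ElementTree as ET

# Hypothetical HIS export: one encounter with one document and one data group.
his_record = {
    "patient_pseudonym": "P0001",
    "encounter": "OUTPATIENT_VISIT",   # maps to StudyEvent
    "document": "ANTHROPOMETRY",       # maps to Form
    "group": "BODY_MEASURES",          # maps to ItemGroup
    "columns": {"WEIGHT_KG": "130", "HEIGHT_CM": "175"},  # map to Items
}


def his_to_odm(record, study_oid="ST.DEMO", mdv_oid="MDV.1"):
    """Build a bare ODM ClinicalData tree from one HIS record (sketch only)."""
    odm = ET.Element("ODM")
    clinical = ET.SubElement(
        odm, "ClinicalData", StudyOID=study_oid, MetaDataVersionOID=mdv_oid
    )
    subject = ET.SubElement(
        clinical, "SubjectData", SubjectKey=record["patient_pseudonym"]
    )
    event = ET.SubElement(
        subject, "StudyEventData", StudyEventOID="SE." + record["encounter"]
    )
    form = ET.SubElement(event, "FormData", FormOID="F." + record["document"])
    group = ET.SubElement(
        form, "ItemGroupData", ItemGroupOID="IG." + record["group"]
    )
    for column, value in record["columns"].items():
        ET.SubElement(group, "ItemData", ItemOID="I." + column, Value=value)
    return odm


odm_root = his_to_odm(his_record)
xml_text = ET.tostring(odm_root, encoding="unicode")
```

Codings would map to CodeList definitions in the ODM metadata section, which this sketch leaves out.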
Ingest Layer: ETL from Source to ODM
• Data Preparation
– de-identification / anonymization
– curation / cleaning (for non-trial data)
– annotation (groups, metadata)
– normalize (e.g. use of UCUM)
– map structure (events/forms/…)
– prepare Record Linkage (IDs)
• Tools
– Highly flexible, customizable generic importer (tabular data to conforming file)
[Figure credit: Digital Curation Center, University of Edinburgh]
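Two of the preparation steps above, de-identification and unit normalization, might look roughly like this. The sketch makes several assumptions: keyed hashing (HMAC-SHA256) stands in for whatever pseudonymization service is actually used, the unit table is a tiny hypothetical excerpt, and all function and field names are invented for illustration.

```python
import hashlib
import hmac

# Hypothetical lookup from locally used unit spellings to UCUM codes.
UCUM_UNITS = {
    "kg": "kg",
    "Kg": "kg",
    "cm": "cm",
    "CM": "cm",
    "mg/dl": "mg/dL",
    "kg/m²": "kg/m2",
}


def pseudonymize(patient_id: str, secret_key: bytes) -> str:
    """Replace a patient ID with a stable keyed hash (assumed technique)."""
    digest = hmac.new(secret_key, patient_id.encode("utf-8"), hashlib.sha256)
    return "P" + digest.hexdigest()[:16]


def normalize_unit(raw_unit: str) -> str:
    """Map a source unit to its UCUM code; unknown units are rejected."""
    key = raw_unit.strip()
    if key not in UCUM_UNITS:
        raise KeyError(f"no UCUM mapping for unit {raw_unit!r}")
    return UCUM_UNITS[key]


def prepare_row(row: dict, secret_key: bytes) -> dict:
    """Return a de-identified, unit-normalized copy of one tabular row."""
    return {
        "subject_key": pseudonymize(row["patient_id"], secret_key),
        "item": row["item"],
        "value": row["value"],
        "unit": normalize_unit(row["unit"]),
    }


prepared = prepare_row(
    {"patient_id": "4711", "item": "BMI", "value": "42.4", "unit": "kg/m²"},
    secret_key=b"demo-key-not-for-production",
)
```

Because the same key always yields the same pseudonym, rows from different sources can still be linked after the original ID has been dropped.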
Access Layer (1): Relational Database / Datamart
- SQL based
- Generic odm2sql ETL job
Mapping: ODM → Relational Database
- Study → Schema
- StudyEvent → (implicit)
- Form → (implicit)
- ItemGroup → Table
- Item → Column
- CodeList → Foreign Key Table
Pros
- Fine-grained access rights
- Well suited for biostatisticians
- Support for score card generation and reporting
Cons
- SQL knowledge necessary
- Complex overarching joins
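The ODM-to-relational mapping on this slide can be sketched as a small DDL generator. This is not the odm2sql job itself: the metadata dictionary, the table and column names and the `build_ddl` helper are hypothetical, and SQLite is used only because it ships with Python.

```python
import sqlite3

# Hypothetical, already simplified ODM metadata for one ItemGroup.
metadata = {
    "item_group": "body_measures",
    "items": [
        {"name": "weight_kg", "type": "REAL", "codelist": None},
        {"name": "smoker", "type": "INTEGER", "codelist": "cl_yes_no"},
    ],
    "codelists": {"cl_yes_no": {0: "no", 1: "yes"}},
}


def build_ddl(meta):
    """ItemGroup -> table, Item -> column, CodeList -> foreign key table.

    Identifiers come from trusted metadata; they are not sanitized here.
    """
    statements = []
    for codelist in meta["codelists"]:
        statements.append(
            f"CREATE TABLE {codelist} (code INTEGER PRIMARY KEY, label TEXT NOT NULL)"
        )
    columns = ["subject_key TEXT NOT NULL"]
    for item in meta["items"]:
        column = f'{item["name"]} {item["type"]}'
        if item["codelist"]:
            column += f' REFERENCES {item["codelist"]}(code)'
        columns.append(column)
    column_sql = ", ".join(columns)
    statements.append(f'CREATE TABLE {meta["item_group"]} ({column_sql})')
    return statements


conn = sqlite3.connect(":memory:")
for statement in build_ddl(metadata):
    conn.execute(statement)
conn.executemany(
    "INSERT INTO cl_yes_no (code, label) VALUES (?, ?)",
    list(metadata["codelists"]["cl_yes_no"].items()),
)
conn.execute("INSERT INTO body_measures VALUES (?, ?, ?)", ("P0001", 130.0, 1))

# Decoding a coded item requires a join, which hints at the "complex joins" con.
rows = conn.execute(
    "SELECT b.subject_key, c.label FROM body_measures b "
    "JOIN cl_yes_no c ON c.code = b.smoker"
).fetchall()
```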
Access Layer (2): Data Warehouse
- Star schema / EAV
- Generic odm2i2b2 ETL job
Example query
- non-smoking woman aged 18-30
- BMI greater than 40
- no known diabetes
- written consent to be contacted
- outpatient visit in the last 3 months
- Answer: patient count, patient ID list
Mapping: ODM → i2b2
- Study → 1st level
- StudyEvent → (n.a.)
- Form → 2nd
- ItemGroup → 3rd
- Item → 4th
- CodeList → 5th
- Catalogs → 5th to n-th
Pros
- Simple GUI (really…)
- cohort searching, feasibility tests
- data exploration, data creativity
- overarching queries
- more use cases to come
Cons
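The level mapping and the cohort query on this slide can be illustrated with a small in-memory sketch. It assumes backslash-delimited concept paths in the style i2b2 uses for its ontology, models the fact table as plain tuples, and uses invented study, form and item names; it is not the odm2i2b2 job.

```python
def concept_path(study, form, item_group, item, code=None):
    """Build a hierarchical concept path; StudyEvent has no level of its own."""
    levels = [study, form, item_group, item]
    if code is not None:
        levels.append(code)  # CodeList entries form the 5th level
    return "\\" + "\\".join(levels) + "\\"


# Hypothetical concepts.
SEX_F = concept_path("IFB", "Demographics", "Person", "Sex", "female")
NON_SMOKER = concept_path("IFB", "Anamnesis", "Lifestyle", "Smoking", "no")
BMI = concept_path("IFB", "Anthropometry", "BodyMeasures", "BMI")

# EAV-style facts: (patient, concept path, numeric value or None).
facts = [
    ("P1", SEX_F, None), ("P1", NON_SMOKER, None), ("P1", BMI, 43.1),
    ("P2", SEX_F, None), ("P2", BMI, 44.0),  # smoking status not recorded
    ("P3", SEX_F, None), ("P3", NON_SMOKER, None), ("P3", BMI, 32.5),
]


def find_cohort(facts, required_paths, numeric_path, threshold):
    """Patients that have every required concept and a value above threshold."""
    concepts_by_patient = {}
    above_threshold = set()
    for patient, path, value in facts:
        concepts_by_patient.setdefault(patient, set()).add(path)
        if path == numeric_path and value is not None and value > threshold:
            above_threshold.add(patient)
    required = set(required_paths)
    return sorted(
        patient for patient in above_threshold
        if required <= concepts_by_patient[patient]
    )


cohort = find_cohort(facts, [SEX_F, NON_SMOKER], BMI, 40)
patient_count = len(cohort)
```

The answer has the same shape as on the slide: a patient count plus a patient ID list.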
Discussion
• Benefits of the ODM-centric approach
– Data is stored in standardized, stable, archivable, partly self-documenting, human-readable files
– Writing ETL jobs that target a stable file-based format is much less complex than targeting an evolving proprietary database
– Data quality improves, as missing metadata and curation deficits become visible during data conversion and data usage
– The OAIS framework helps to build well-defined RDB components
• Experience so far (after 9 months)
Outlook
Now (Private Research Domain)
- Small research database, structurally comprehensive
- HIS datasets imported
- Pilot phase
Near Future (Shared Research Domain)
- Integrate more obesity-related data including selected clinical trials
- General service for IFB
[Figure credit: Andrew Treloar, Digital Curation Continuum]
I thank you for your attention!
frank.meineke@imise.uni-leipzig.de