Global surveillance of infectious diseases Open science, open data, open for all Frank M. Aarestrup www.compare-europe.eu www.genomicepidemiology.org
The charge of the light brigade
Strategy to win a war Putting up the right defense Knowing: Where What When How Communicate & Prioritise
IMPORTANCE OF INFECTIOUS DISEASES • Direct cause of 22% of all global deaths (15 million) • Huge burden on expected life length and disability (DALYs) • Devastating effects on economy: • Absence from work • Trade • Travel • Health system • Expected to increase; e.g. AMR alone causing 10 million deaths in 2050 (cancer 8.2 million in mainly older people)
Current infectious disease situation • Dynamics of common infectious diseases are changing – Demographic change, population density, anti vaccine, AMR, etc. • New diseases / variants emerge frequently – Population growth, travel, trade, climate change • Effects are difficult to predict due to complexity – Rapid flexible response • Public health, diagnostic, vaccine development and clinical response depend on global capacity for disease surveillance – Rapid sharing, comparison and analysis of data from multiple sources and using multiple methodologies
2 main aspects § Most EID come from animals § Once introduced in people, many opportunities for contact increase the opportunities for transmission § Complexity increased by food trade Wolfe et al., 2007; http://rambaut.github.io/EBOV_Visualization/Makona_1561_D3/; Gytis et al., 2017
Ambition: Reduce impact of EID by improving prediction, prevention, detection, control and treatment
Response to ID outbreaks usually fragmented and late [Figure: infected patients over time, plotted against the public health response, preclinical research response and clinical research response. Adapted; courtesy Frank Deege]
Response to ID outbreaks with improved detection and sharing of data [Figure: infected patients over time, plotted against the public health response, preclinical research response and clinical research response. Courtesy Frank Deege]
NGS advantages • Laboratory diagnostics increasingly rely on (pathogen) genomic information • RNA / DNA are common across pathogens, therefore methods to analyse pathogen genomes are potentially universal • Next generation sequencing capacity is developing fast, and costs are becoming competitive • Data are easy to share electronically and are in a standardized format Ø Capturing NGS developments may provide a universal language that can be harnessed for early detection of outbreaks across disciplines and domains Ø If the technology keeps developing, less equipped labs may leapfrog
Preparing for Global Surveillance - Center for Genomic Epidemiology (CGE) • Provide a proof of concept of combining bioinformatics with global epidemiology in real-time (2010-2016) • Provide foundation for web based solutions (plug and play tool) – Rapid sequence assembly and annotation • What is it • How dangerous is it • Have we seen it before • Can it be treated • www.genomicepidemiology.org
CGE tool box http://cge.cbs.dtu.dk/services/all.php This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 643476.
CGE Batch upload – simple upload and "all in one go" 12 June 2017
Uploaded data and analysis easy to search and download
User Statistics Moving tools to the data – integration into ENA repository Until now: >1,000,000 submissions from 15,000 IP addresses in more than 100 countries
Why a central public repository? Besides the language and altruistic issues • The data comparison problem • Allowing easy transfer between levels of access including public • Allow access to bioinformatics for the frontline • Allowing for constantly improving the analytic pipelines
Data comparison problem [Figure: global repositories (> 0.1 - 10 Pb data) connected over the Internet (~1-10 Gb/hour) to a client (~1-100 Gb data)] Bring the tools to the data
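The figures on this slide can be turned into a rough transfer-time estimate. The sketch below is an illustration only: it assumes "Pb" and "Gb" use the same base unit with decimal prefixes (1 Pb = 1,000,000 Gb), and the function and variable names are hypothetical, not part of any CGE tool.

```python
def transfer_hours(dataset_gb: float, rate_gb_per_hour: float) -> float:
    """Hours needed to move dataset_gb over a link of rate_gb_per_hour."""
    if rate_gb_per_hour <= 0:
        raise ValueError("rate must be positive")
    return dataset_gb / rate_gb_per_hour


# Assumption: 1 Pb = 1,000,000 Gb (decimal prefixes, same base unit).
REPO_MIN_GB = 100_000       # 0.1 Pb, smallest repository on the slide
REPO_MAX_GB = 10_000_000    # 10 Pb, largest repository on the slide
CLIENT_MIN_GB = 1           # smallest client dataset on the slide
CLIENT_MAX_GB = 100         # largest client dataset on the slide
RATE_SLOW = 1               # Gb/hour, slowest link on the slide
RATE_FAST = 10              # Gb/hour, fastest link on the slide

# Data to the tools: download the repository to the client.
download_best_hours = transfer_hours(REPO_MIN_GB, RATE_FAST)
download_worst_hours = transfer_hours(REPO_MAX_GB, RATE_SLOW)

# Tools to the data: upload only the client's own data.
upload_best_hours = transfer_hours(CLIENT_MIN_GB, RATE_FAST)
upload_worst_hours = transfer_hours(CLIENT_MAX_GB, RATE_SLOW)

print(f"download repository: {download_best_hours / 24:.0f} "
      f"to {download_worst_hours / 24:.0f} days")
print(f"upload own data: {upload_best_hours:.1f} "
      f"to {upload_worst_hours:.1f} hours")
```

Under these assumptions, even the smallest repository over the fastest link takes more than a year to download, while a client's own data uploads in hours, which is the argument for bringing the tools to the data.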
mcr-1 • mcr-1 gene added to ResFinder database on Nov. 24 • Nov. 25 – screening of online available human and food isolates initiated using an iPhone on a plane to Paris – 5 hits • Nov. 25-26 – the authorities notified and the screening extended to more than 3,000 available genomes • Nov. 29 – meeting with the authorities – please wait • Dec. 2 – press release and submission of manuscript • Dec. 9 – publication
Establishing and improving real-time surveillance • Rapid sharing – Online bioinformatic tools • >1,000 jobs per day – Facilities for rapid sharing • Natural reservoirs – Major issues with individual samples • National ethical committees • The Rio Convention • The Nagoya Protocol
Global sewage surveillance - 2016 India Brazil
[Figure: scatter plot with fitted trend line, R² = 0.30873; axis label: AMR]
Copenhagen according to sewage Samples collected since Nov. 2015 Sequenced routinely since June 2016 Released publicly real-time Very preliminary data Drug use the previous year
Global sewage 2017
50,000 laboratories 1000 cities 1000 slaughterhouses Global surveillance of: - Healthy humans - Healthy animals - Diseased humans & animals Explanatory variables: - Trade (food) - Travel (flight) - Demographic (World Bank)
Building a global genomic epidemiological infrastructure – making data accessible [Figure: Raw NGS data feed the online bioinformatic tools (bioinformatics for dummies): - Assembly - Mapping - Phylogeny. Own analysis and own data feed the online epidemiological tools (epidemiology for dummies). Search and download. Sharing site: - Own data - Shared data - Public data. Data: - World Bank (demographic/economic) - WHO (health) - FAO (food production and trade) - Flight connections - Antimicrobial use]
Research is not enough • E-learning – Massive Open Online Courses • Ring-trials • Advisory service 46 laboratories in 22 countries have provided data for at least one of the PT components • No free ride – If you will not share – pay
Challenges • Scientific – Make sense of (or even analyse) more than 50 million genes among exabytes of data explained by more than 2,000 variables and global connectivities • False discovery rate – min. 5 billion false positives using classical statistics • Sustainability/analytic pipeline – 500 million € to establish, 50 million €/y to run • Sharing of data • Scientific credit • Political and other consequences – When we report diseases where they should not exist!
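The "min. 5 billion" figure is consistent with a simple multiple-testing count. The sketch below is one possible reading, not a calculation taken from the slides: it assumes one test per gene-variable pair, a classical significance threshold of 0.05, and no true associations. The Bonferroni threshold is added only as a familiar point of comparison; all names are hypothetical.

```python
def expected_false_positives(n_genes: int, n_variables: int, alpha: float) -> float:
    """Expected number of false positives if every gene-variable pair is
    tested once at significance level alpha and no association is real."""
    n_tests = n_genes * n_variables
    return n_tests * alpha


N_GENES = 50_000_000    # "+50 million genes" from the slide (lower bound)
N_VARIABLES = 2_000     # "+2,000 variables" from the slide (lower bound)
ALPHA = 0.05            # assumption: conventional per-test threshold

n_tests = N_GENES * N_VARIABLES
false_positives = expected_false_positives(N_GENES, N_VARIABLES, ALPHA)

# For comparison only (not from the slides): the per-test threshold a
# Bonferroni correction would require to keep the family-wise error at ALPHA.
bonferroni_alpha = ALPHA / n_tests

print(f"tests: {n_tests:.1e}")
print(f"expected false positives: {false_positives:.1e}")
print(f"Bonferroni per-test threshold: {bonferroni_alpha:.1e}")
```

Because both counts on the slide are lower bounds, the resulting number of false positives is a minimum as well.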
How to get involved • Build a database and work with us • Come with some interesting questions and engage us • Make a pipeline and challenge us • Support us financially – 500 million € to establish, 50 million €/y to run – I will be standing outside accepting collections
Our vision: one system serves all Guiding principles: - Cross sector, cross domain, open source (not commercial) - Interaction with the rest of the world (all inclusive) - Data for action (actionable outputs) - Central repository (tools to the data) (ENA, DDBJ, NCBI) There can be no real-time disease surveillance without real-time data sharing
Complete sequencing of food products
Kingdom    Phylum    Class     Family            Genus         Species                     IKEA-BigMac  Lyngby-Whopper
Eukaryota  Chordata  Mammalia  Bovidae           Bos           Bos mutus                   47.728767    55.138006
Eukaryota  Chordata  Mammalia  Bovidae           Bubalus       Bubalus bubalis             16.571103    16.052984
Eukaryota  Chordata  Mammalia  Bovidae           Capra         Capra hircus                0.779368     0.858108
Eukaryota  Chordata  Mammalia  Bovidae           Ovis          Ovis aries                  0.362244     0.592654
Eukaryota  Chordata  Mammalia  Vespertilionidae  Myotis        Myotis davidii              0.286175     0.062029
Eukaryota  Chordata  Mammalia  Muridae           Rattus        Rattus norvegicus           0.165231     0.151688
Eukaryota  Chordata  Mammalia  Macropodidae      Macropus      Macropus eugenii            0.153902     0.189751
Eukaryota  Chordata  Mammalia  Bovidae           Pantholops    Pantholops hodgsonii        0.129919     0.102347
Eukaryota  Chordata  Mammalia  Bovidae           Bos           Bos taurus                  0.088133     0.203566
Eukaryota  Chordata  Mammalia  Balaenopteridae   Balaenoptera  Balaenoptera acutorostrata  0.077098     0.045253
Eukaryota  Chordata  Mammalia  Ursidae           Ailuropoda    Ailuropoda melanoleuca      0.012359     0.008599