A CSIR led team India consortium with global partnership for - PowerPoint PPT Presentation

Open Source Drug Discovery (OSDD): Connecting Minds & Machines
A CSIR-led Team India consortium with global partnership for affordable healthcare for all
Anshu Bhardwaj, Scientist & Community Builder, OSDD, CSIR, India
National Knowledge Network “First Annual Workshop”: “The e-Infrastructure of India”, 31st Oct – 1st Nov 2012

OSDD Focus: Neglected Tropical Diseases
First disease target: Tuberculosis; now extended to Malaria
Tuberculosis (TB) is one of the leading causes of death, ranking second only to HIV as an infectious killer of adults worldwide.
• At least one person in the world is newly infected with TB bacilli every second
• Over 1000 deaths a day or 3 deaths every 2 mins
• No new TB drugs in the past 50 years
[Map: New TB cases, 2010]
Source: http://www.globalhealthfacts.org/data/topic/map.aspx?ind=12

Research Spending Per New Drug
Company                    Drugs approved   R&D spending per drug ($Mil)   Total R&D spending 1997-2011 ($Mil)
AstraZeneca                5                11,790.93                      58,955
GlaxoSmithKline            10               8,170.81                       81,708
Sanofi                     8                7,909.26                       63,274
Roche Holding AG           11               7,803.77                       85,841
Pfizer Inc.                14               7,727.03                       108,178
Johnson & Johnson          15               5,885.65                       88,285
Eli Lilly & Co.            11               4,577.04                       50,347
Abbott Laboratories        8                4,496.21                       35,970
Merck & Co Inc             16               4,209.99                       67,360
Bristol-Myers Squibb Co.   11               4,152.26                       45,675
Novartis AG                21               3,983.13                       83,646
Amgen Inc.                 9                3,692.14                       33,229
Slate’s Bad Math: $55 million on each new drug
Source: http://www.forbes.com/sites/matthewherper/2012/02/10/the-truly-staggering-cost-of-inventing-new-drugs/
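The per-drug column in this table is total R&D spending divided by the number of drugs approved. The following is a minimal Python sketch of that arithmetic, using three rows copied from the slide; the function name is illustrative and not part of the original deck. Because the totals on the slide are rounded, the computed values differ slightly from the printed per-drug figures.

```python
# Rows copied from the slide: (company, drugs approved, total R&D 1997-2011 in $ million).
ROWS = [
    ("AstraZeneca", 5, 58_955),
    ("Pfizer Inc.", 14, 108_178),
    ("Amgen Inc.", 9, 33_229),
]


def spending_per_drug(total_musd: float, drugs_approved: int) -> float:
    """Return total R&D spending divided by the number of approved drugs ($ million)."""
    if drugs_approved <= 0:
        raise ValueError("drugs_approved must be positive")
    return total_musd / drugs_approved


if __name__ == "__main__":
    for company, drugs, total in ROWS:
        print(f"{company:<15} {spending_per_drug(total, drugs):>10,.2f} $Mil per drug")
```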

Drug Discovery is a Long, Risky Process with Low Probability of Success
Source: http://www.bayerpharma.com/en/research-and-development/processes/index.php

Prediction of non-toxic targets & inhibitors
• Efficacy: the inhibitor should target the right protein in the pathogen (Mycobacterium tuberculosis)
• Toxicity: the inhibitor should not target any crucial protein in the host (human)
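The two criteria can be read as a simple filter: keep pathogen proteins that matter to the pathogen and drop any that resemble a host protein. The Python sketch below illustrates only that logic. The protein identifiers are hypothetical placeholders, and the homolog table is assumed to come from a separate sequence-similarity search that is not shown here; it is not the OSDD pipeline.

```python
from __future__ import annotations


def select_targets(pathogen_essential: set[str], human_homologs: dict[str, str]) -> list[str]:
    """Keep essential pathogen proteins that have no known human homolog.

    pathogen_essential: identifiers of proteins the pathogen needs (efficacy).
    human_homologs: pathogen protein -> similar human protein (toxicity risk).
    """
    return sorted(p for p in pathogen_essential if p not in human_homologs)


if __name__ == "__main__":
    # Hypothetical identifiers, for illustration only.
    essential = {"mtb_protein_A", "mtb_protein_B", "mtb_protein_C"}
    homologs = {"mtb_protein_B": "human_protein_X"}
    print(select_targets(essential, homologs))
```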

Biology is complex!
From a mathematical point of view, creating an accurate model of a single mammalian cell may require generating and then solving somewhere between 100,000 and one million equations.
The human brain can only process seven pieces of data at a time!
Need automation & new technology to address the complexity.
Source: http://news.vanderbilt.edu/2011/10/robot-biologist/

Predictive Science in the Drug Discovery (DD) Process
• Systems Level Models for DD
– Target Identification
– Pharmacomodeling
– Off-target binding predictions
• Virtual Screening for selected targets & Models for predicting antiTB and mutagenic properties
• Predicting toxicity and metabolism of drugs
• Systems Biology for predicting
– Drug-targets MOA
• HPC for OSDD Community by Garuda/CMMACS
• Prediction tools and models to prioritize candidate molecules

Why Open Source Drug Discovery?
• Many eyeballs make the bug shallow!
• Lack of market incentive for TB
• Successful Open Source Models
– Human Genome Sequencing Initiative
– Open Source Software Initiative (e.g. Linux OS)
– Android
– The WWW

Real Innovation lies in “Innovating how we innovate”…
“We cannot solve our problems with the same thinking we used when we created them.” (Albert Einstein)

Open TB Drug Discovery Platform: Informatics to Experimental Validation to Clinical Trials
[Diagram; recoverable box labels:]
• Systems Biology
• Cheminformatics
• Target Validation of in silico targets
• OSDD Screening Facility
• Assay Development
• Mtb Strain and Clone Repository
• Chem and Directed Synthesis
• Target Identification for Leads
• Lead Identification
• Lead Optimization
• Safety Pharmacology
• In vivo efficacy
• DMPK
• Pre-Clinical Candidate
• Pharmacogenomics
• Phase I-III

Unconventional Collaborative Network
[Diagram; recoverable labels around the OSDD portal:]
• Pharmacogenomics expert
• Data upload
• Virtual Screening
• Disease experts
• Gene/Protein Expression Analysis
• Virtual Lab
• Mathematical modeling
• Administrator: manages server
• Computer Scientists

Shaping Science 2.0 OSDD Semantic Web Architecture

OSDD Platform (released April 2010): System Architecture
Reference: “Collaborative tools to accelerate neglected diseases research”, in the book “Collaborative Computational Technologies for Biomedical Research”. Wiley and Sons, May 2011.

Scientific Workflow Management Systems
• Experimental data from biology and chemistry need to be managed and analyzed systematically
• Large datasets and compute-intensive analyses need compute infrastructure
• http://galaxyproject.org/
• http://www.taverna.org.uk/
• http://www.tavaxy.org/
• https://kepler-project.org/

Weka Workflow
a. Split the CSV into train and test files.
b. Convert both CSVs to ARFF files: output_file1 is always the train file and output_file2 is the test file.
c. Select the two input files for the Classifier. Change the parameters in the right-side panel for each tool.
d. Evaluate the model file: the Classifier will be Misc -> SerializedClassifier.
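Steps a and b can be sketched outside Galaxy in plain Python. This is a minimal illustration, not the OSDD Galaxy tool: it assumes every feature column is numeric and the last column is a nominal class label, and the function names, column names, and sample data are all hypothetical.

```python
import csv
import io
import random


def split_rows(rows, test_fraction=0.2, seed=0):
    """Shuffle rows reproducibly and split them into (train, test)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def to_arff(relation, header, rows, class_values):
    """Render rows as ARFF text: numeric features, nominal class in the last column."""
    lines = [f"@relation {relation}", ""]
    for name in header[:-1]:
        lines.append(f"@attribute {name} numeric")
    lines.append(f"@attribute {header[-1]} {{{','.join(class_values)}}}")
    lines += ["", "@data"]
    lines += [",".join(row) for row in rows]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical descriptor table; a real run would read a CSV file instead.
    raw = "mol_weight,logp,activity\n310.2,2.1,active\n150.7,0.4,inactive\n" \
          "420.9,3.3,active\n98.1,-0.2,inactive\n275.0,1.8,active\n"
    header, *rows = list(csv.reader(io.StringIO(raw)))
    # Derive class labels from the full data so both ARFF headers match.
    class_values = sorted({row[-1] for row in rows})
    train, test = split_rows(rows, test_fraction=0.2, seed=42)
    print(to_arff("train", header, train, class_values))  # output_file1
    print(to_arff("test", header, test, class_values))    # output_file2
```

The class labels are collected before splitting because Weka expects the train and test files to declare the same attributes, including the same set of nominal values.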

Customized workflow with grid infrastructure & applications
• Electronic lab note books: http://sysborg2.osdd.net
• APIs to submit workflow method to lab note book
• APIs to extract files from lab note books
• APIs to submit results to lab note book
• Jobs are invoked from Customized Galaxy and submitted to Gridway
• Customized Gridway runner: input file + parameters -> job template
• Gridway meta scheduler -> LRM (PBS, Torque) -> Clusters -> Programs
• Job status may be checked using the DRMAA API
• More than 250 applications integrated
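The "input file + parameters -> job template" step can be pictured as rendering a small text file that the meta scheduler reads. The Python sketch below is an illustration under stated assumptions, not OSDD's customized runner: the key names (EXECUTABLE, ARGUMENTS, INPUT_FILES, OUTPUT_FILES, STDOUT_FILE, STDERR_FILE) are written from memory of the GridWay job template format and should be checked against the GridWay documentation, and the tool and file names are hypothetical.

```python
def render_job_template(executable, arguments, input_files=(), output_files=()):
    """Render a plain-text job template as KEY = value lines.

    Key names are assumed to follow the GridWay job template format.
    """
    lines = [
        f"EXECUTABLE = {executable}",
        f"ARGUMENTS = {' '.join(arguments)}",
    ]
    if input_files:
        lines.append(f"INPUT_FILES = {', '.join(input_files)}")
    if output_files:
        lines.append(f"OUTPUT_FILES = {', '.join(output_files)}")
    # ${JOB_ID} is left literal so the scheduler can substitute it (assumed behaviour).
    lines.append("STDOUT_FILE = stdout.${JOB_ID}")
    lines.append("STDERR_FILE = stderr.${JOB_ID}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical tool and file names, for illustration only.
    print(render_job_template(
        executable="run_classifier.sh",
        arguments=["train.arff", "test.arff"],
        input_files=["train.arff", "test.arff"],
        output_files=["model.out"],
    ))
```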

Custom APIs for importing input files from OSDD’s open lab note book into Galaxy
• “Get Data” customized for extracting files from the open lab note book

Custom APIs for exporting results to OSDD’s open lab note book
• Workflows and the results of the workflows are stored as separate lab note books
• Lab note book has details of the experiments performed
• Results of one experiment may be invoked for analysis in another experiment
• All versions of the workflow and the results are stored
• Flexibility to execute nested workflows

List of >250 modules integrated as web services by OSDD Community
S. No   Resources                                                                               Clients
1       KEGG: Kyoto Encyclopedia of Genes and Genomes                                           60
2       GetEntry: DDBJ sequence search by accession ID                                          43
3       GPSR: tools                                                                             33
4       PDB: Protein Data Bank                                                                  30
5       BioModel: mathematical models of biological DB                                          25
6       Gtps: Gene Trek in Prokaryote Space                                                     8
7       WSDbfetch: retrieve entries from biological dbs using entry identifiers or accession no.   7
8       Gibv: Genome Information Broker for Viruses                                             7
9       DDBJ: DNA Data Bank of Japan                                                            7
10      Mafft: a multiple sequence alignment program                                            4
11      Fasta: DDBJ database                                                                    4
12      Ensembl: maintains automatic annotation                                                 4
13      VecScreen: vector contamination                                                         4
14      OMIM: Online Mendelian Inheritance in Man                                               4
15      Gtop: Gene-product Informatics                                                          3
16      GO: Gene Ontology                                                                       3
17      SPS: Splicing Profile based Score                                                       2
18      GIBIS: Genome Information Broker for Insertion Sequence                                 1
19      RefSeq: database of sequence                                                            1
20      GIB: Genome Information Broker                                                          1
21      GIBEnv: DDBJ database                                                                   1
22      TxSearch: Database indexing & searching                                                 1

Ongoing: Cheminformatics Community of about 400
[Diagram; recoverable labels:]
• Data sources: PubChem, ChEMBL, DrugBank
• HT Virtual screening
• Cheminformatics Models
• Experimental Assays
• Curated molecule datasets
• Data Mining and Analysis
Other Active Communities:
• OSDD Women Scientists Forum
• OSDD Junior Scientists Forum

Background and Premise

Why are we doing this?

Crowd-Sourcing Large-Scale Data-Driven Cheminformatics Analysis
[Diagram; recoverable labels:]
• Bioassay Datasets
• People
• Computational Tools and Resources
• Machine Learning based Computational Models
• Standard re-usable models / Publications

Data amplification in Cheminformatics
• PubChem Bioassay data (approx. 1 lakh, i.e. 100,000, molecules/dataset)
• 6000 descriptors/molecule
• Successful Models -> Screen PubChem (30 million) -> Potential Hits
o Downsizing and random validation require multiple calculations for validation of results
o Cross-validation up to 50+ times for each experiment
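The figures on this slide multiply out quickly, which is the point of "data amplification". The Python sketch below works through the arithmetic. The molecule, descriptor, and repeat counts come from the slide; storing each descriptor as an 8-byte floating-point value is an assumption made only to give a sense of scale.

```python
MOLECULES_PER_DATASET = 100_000      # "approx. 1 lakh molecules/dataset"
DESCRIPTORS_PER_MOLECULE = 6_000     # "6000 descriptors/molecule"
CV_REPEATS = 50                      # "cross-validation up to 50+ times"
PUBCHEM_MOLECULES = 30_000_000       # "PubChem (30 million)"
BYTES_PER_VALUE = 8                  # assumption: one 64-bit float per descriptor


def descriptor_values(molecules: int, descriptors: int) -> int:
    """Number of descriptor values in a molecules x descriptors matrix."""
    return molecules * descriptors


def matrix_gigabytes(molecules: int, descriptors: int, bytes_per_value: int = BYTES_PER_VALUE) -> float:
    """Approximate size of the dense descriptor matrix in gigabytes (10^9 bytes)."""
    return descriptor_values(molecules, descriptors) * bytes_per_value / 1e9


if __name__ == "__main__":
    per_dataset = descriptor_values(MOLECULES_PER_DATASET, DESCRIPTORS_PER_MOLECULE)
    print(f"Values per bioassay dataset: {per_dataset:,}")
    print(f"Dense size per dataset: {matrix_gigabytes(MOLECULES_PER_DATASET, DESCRIPTORS_PER_MOLECULE):.1f} GB")
    print(f"Model fits per experiment at {CV_REPEATS} repeats: {CV_REPEATS}")
    print(f"Dense size for all of PubChem: {matrix_gigabytes(PUBCHEM_MOLECULES, DESCRIPTORS_PER_MOLECULE):,.0f} GB")
```

Under these assumptions a single dataset holds 600 million descriptor values (about 4.8 GB), and screening all of PubChem with the same descriptors would need roughly 1.44 TB, before any repeated validation runs.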

The Problem

C-DAC’s Garuda Grid: Indian Grid Computing Initiative
• C-DAC is an R&D organization under the Ministry of Communication & Information Technology, India
• C-DAC’s Garuda Grid aims to provide a facility that gives the scientific community seamless access to distributed resources
• Compute power of GARUDA: ~70 TF (6000 CPUs)
• Currently there are 55 Garuda Partners
• Has NKN (National Knowledge Network) connectivity at 10 Gbps

OSDD-Garuda Interface
[Diagram; recoverable labels: Internet/NKN, Results, NKN]

Weka in Galaxy

OSDD – Garuda Activities
• Created the OSDD Virtual Organization (VO); 70 users are registered under this VO
• Garuda Portal customized to support OSDD requirements
• Galaxy, a biology workbench, has been customized as per OSDD requirements
• JNU head node was set up for hosting Galaxy
• Common data has been uploaded to the Data Location for accessibility through Galaxy and the Portal by all OSDD users
• Three cluster resources have been provided for OSDD activities
– Hyderabad Cluster with 320 CPUs
– Chennai Cluster with 304 CPUs
– Param Yuva at Pune with 4368 CPUs
• Hand-holding users from the community & resolving their queries
