g-INFO portal. Doan Trung Tung, Aurélien BERNARD, Ana Lucia DA-COSTA, Vincent BLOCH, Thanh-Hoa LE, Yannick LEGRE, Lydia MAIGNE, Jean SALZEMANN, Hong-Quang NGUYEN, Vincent BRETON 1
Outline: Introduction; Overview of g-INFO; Implementation of g-INFO; Conclusions and perspectives 2
Why g-INFO? H5N1 (avian flu): 262 deaths, 436 cases (WHO, July 2009); 287 deaths, 486 cases (WHO, March 2010) 3
Influenza surveillance. Data collection: BioHealthBase, NCBI, LosAlamos. Data processing in batch mode: general phylogenetic pipelines, specific phylogenetic pipelines. g-INFO (Grid-based International Network for Flu Observation): deployment of phylogenetic tools on clusters / grids 4
Global Surveillance Network g-INFO’s overview 5
g-INFO’s goals: integration of influenza virus data sources into a federation of databases; automatic phylogenetic pipelines; specific molecular epidemiology studies 6
Architecture of g-INFO system: Each data provider has its own server(s) to store its data. Data providers export only selected data to a data grid interface server. The exported data is integrated into a common schema on the interface servers. Providers keep the privilege of granting access rights to their data. 7
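The export model on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the g-INFO implementation: the `Provider` class, the `query` function, and all field names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Provider:
    """Hypothetical data provider that keeps its records on its own server."""
    name: str
    id_key: str            # name of the identifier field in the local layout
    records: list          # local records (dicts), in the provider's own layout
    field_map: dict        # local field name -> common-schema field name
    exported_ids: set = field(default_factory=set)   # records selected for export
    allowed_users: set = field(default_factory=set)  # access granted by the provider

    def to_common(self, rec):
        """Rename mapped fields into the common schema; unmapped fields stay private."""
        common = {self.field_map[k]: v for k, v in rec.items() if k in self.field_map}
        common["provider"] = self.name
        return common

    def export(self):
        """Return only the records the provider selected, in the common schema."""
        return [self.to_common(r) for r in self.records
                if r[self.id_key] in self.exported_ids]


def query(providers, user):
    """Collect exported records from every provider that granted access to `user`."""
    results = []
    for provider in providers:
        if user in provider.allowed_users:
            results.extend(provider.export())
    return results
```

Each provider decides both which records leave its server (`exported_ids`) and who may read them (`allowed_users`), which mirrors the two guarantees listed on the slide.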
Architecture of g-INFO system: Epidemiologic pipelines will be deployed on the grid: BLAST, alignment, phylogenetic trees, visualisation ... and more 8
Phylogenetic workflow g-INFO’s implementation 9
Data collection
>ABV25634
MKAILLVLLCAFAATNADTLCIGYHANNSTDTVDTVLEKNVTVTHSVNLLEDSHNGKLCRLGGIAPLQLG
KCNIAGWLLGNPECDLLLTVSSWSYIVETSNSDNGTCYPGDFIDYEELREQLSSVSSFEKFEIFPKTSSW
PNHETTRGVTAACPYAGASSFYRNLLWLVKKENSYPKLSKSYVNNKGKEVLVLWGVHHPPTSTDQQSLYQ
NADAYVSVGSSKYDRRFTPEIAARPKVRGQAGRMNYYWTLLEPGDTITFEATGNLVAPRYAFALNRGSES
GIITSDAPVHDCDTKCQTPHGAINSSLPFQNIHPVTIGECPKYVKSTKLRMVTGLRNIPSIQSRGLFGAI
AGFIEGGWTGLIDGWYGYHHQNGQGSGYAADQKSTQNAIDGITNKVNSVIEKMNTQFTVVGKEFNNLERR
IKNLNKKVDDGFLDVWTYNAELLVLLENERTLDFHDSNVKNLYEKARSQLRNNAKEIGNGCFEFYHKCDD
ACMESVRNGTYDYPKYSEESKLNREEIDGVKLESMMVYQILAIYSTVASSLVLLVSLGAISFWMCSNGSL
QCRICI
[Diagram labels: NCBI, FTP, daily updates, Grid DB; metadata, sequences (protein, nucleotide, coding region), IDs] 10
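The record on this slide is in FASTA format: a header line starting with `>` followed by sequence lines. As a minimal sketch of how collected records could be read before being stored, the hypothetical helper below (`parse_fasta` is not part of g-INFO) splits FASTA text into identifier and sequence pairs. The sample uses only the first 20 residues of the slide's record.

```python
def parse_fasta(text):
    """Yield (identifier, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)  # emit the previous record
            parts = line[1:].split()
            header = parts[0] if parts else ""  # identifier = first token
            chunks = []
        elif header is not None:
            chunks.append(line)  # sequence lines are concatenated
    if header is not None:
        yield header, "".join(chunks)  # emit the last record


# First 20 residues of the record shown on the slide, wrapped over two lines.
sample = ">ABV25634\nMKAILLVLLC\nAFAATNADTL\n"
records = list(parse_fasta(sample))
```

Here `records` holds one pair whose sequence is the two wrapped lines joined back together.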
WISDOM Production Environment 11
Integration of g-INFO into WPE [Diagram labels: g-INFO data collection; Data Manager; Job Manager; Job Submitter; Task Manager; WISDOM Information System; AMGA; data service; MUSCLE, Gblocks, PhyML, BLAST] 12
Automatic phylogenetic pipeline [Diagram labels: g-INFO pipeline; Task Manager; Job Manager; jobs; Wisdom IS; g-INFO portal; g-INFO database] 13
Automatic phylogenetic pipeline: run a phylogenetic workflow on the grid daily. NCBI > prepare data in correct format > AMGA (metadata; sequences: protein, nucleotide, coding region; IDs) > alignment + curation (MUSCLE + Gblocks) > phylogenetic analysis (PhyML) > visualisation tool, grid portal 14
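The stage ordering on this slide can be sketched as a chain of functions, each consuming the previous stage's output. The stage bodies below are deliberately simplified stand-ins and do not call or reproduce MUSCLE, Gblocks, or PhyML: `align` merely pads sequences with gaps, `curate` drops any column containing a gap, and `analyse` returns pairwise p-distances instead of a maximum-likelihood tree. All function names are hypothetical.

```python
def prepare(raw):
    """Normalise raw (id, sequence) pairs: upper-case, drop empty sequences."""
    return [(sid, seq.upper()) for sid, seq in raw if seq]


def align(records):
    """Stand-in for MUSCLE: right-pad with '-' so all rows have equal length."""
    if not records:
        return []
    width = max(len(seq) for _, seq in records)
    return [(sid, seq.ljust(width, "-")) for sid, seq in records]


def curate(records):
    """Stand-in for Gblocks: keep only columns that contain no gap."""
    if not records:
        return []
    keep = [i for i in range(len(records[0][1]))
            if all(seq[i] != "-" for _, seq in records)]
    return [(sid, "".join(seq[i] for i in keep)) for sid, seq in records]


def analyse(records):
    """Stand-in for PhyML: fraction of differing sites for each pair."""
    dist = {}
    for a, (id_a, seq_a) in enumerate(records):
        for id_b, seq_b in records[a + 1:]:
            diffs = sum(x != y for x, y in zip(seq_a, seq_b))
            dist[(id_a, id_b)] = diffs / len(seq_a) if seq_a else 0.0
    return dist


# Stage order taken from the slide: prepare, align, curate, analyse.
STAGES = [prepare, align, curate, analyse]


def run_pipeline(raw, stages=STAGES):
    """Feed the output of each stage into the next one."""
    data = raw
    for stage in stages:
        data = stage(data)
    return data


distances = run_pipeline([("s1", "MKAI"), ("s2", "MKTI")])
```

Keeping the stages in a list makes it straightforward to swap a stand-in for a wrapper around the real tool, or to schedule each stage as a separate grid job.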
Manual phylogenetic workflow [Diagram labels: MOTEUR desktop tool; MOTEUR web services; Task Manager; Job Manager; jobs; Wisdom IS; g-INFO portal; g-INFO database] 15
Workflow execution example 16
Phylogenetic workflow g-INFO portal 17
g-INFO portal. Collaboration among IFI, IOIT and HPC. IFI: web services to interact between the portal and the system. IOIT: design and development of the portal. HPC: visualization tool. Technologies: JSF 2.0, Ajax, web services, Java applet, … 18
g-INFO portal [Diagram labels: g-INFO portal; intermediate web services; MOTEUR web services; Task Manager; Job Manager; jobs; Wisdom IS; g-INFO database; JSF 2.0, Ajax, web services, JDBC] 19
g-INFO portal – home page 20
g-INFO portal - search 21
g-INFO portal – search results 22
g-INFO portal – search results 23
g-INFO portal – working sessions 24
g-INFO portal – define working session template 25
g-INFO portal – define working session template 26
g-INFO portal – define working session template 27
g-INFO portal – define working session template 28
g-INFO portal – run working session 29
g-INFO portal – run working session [Diagram: a WorkingSession holding Input01 … inputN, Pipeline01 … PipelineN and result01 … resultN] 30
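One way to read this diagram is that a working session groups several input and pipeline pairs and collects one result per pair. The sketch below follows that reading. It is an assumption about the diagram, not the portal's actual data model, and the `WorkingSession` class here is hypothetical, with pipelines modelled as plain Python callables.

```python
from dataclasses import dataclass, field


@dataclass
class WorkingSession:
    """Hypothetical container pairing inputs with pipelines."""
    name: str
    tasks: list = field(default_factory=list)    # (input, pipeline) pairs
    results: list = field(default_factory=list)  # one result per pair, in order

    def add(self, data, pipeline):
        """Register one input together with the pipeline to run on it."""
        self.tasks.append((data, pipeline))

    def run(self):
        """Run every pipeline on its input; on the grid each pair would be a job."""
        self.results = [pipeline(data) for data, pipeline in self.tasks]
        return self.results


session = WorkingSession("demo")
session.add("mkaill", str.upper)  # Input01 with Pipeline01
session.add("MKAILL", len)        # Input02 with Pipeline02
results = session.run()
```

Storing the pairs before running them matches the portal's separation between defining a working session template and running it.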
g-INFO portal – visualization 31
Conclusions: A success in terms of international collaboration. An example of developing a grid application in Vietnam. A complementary service for the public health research community. 32
Perspectives: Provide more tools and pipelines. Import other database resources. Improve the system’s performance. We expect the research community to contribute more useful tools. Can be applied to other emerging diseases. 33
Thank you! 34