RobinViz: Reliability Oriented Bioinformatic Networks Visualization
http://code.google.com/p/robinviz/
A. E. Aladağ, C. Erten, M. Sözdinler
Overview
• PPI Related Concepts
• PPI Prediction and Verification
• RobinViz
  o Visualization Model
  o Layout Algorithms
  o Extra Features
• Final Remarks
PPI Networks: Proteins
• Proteins:
  o Amino acid polymers linked by peptide bonds
  o Control and mediate most processes in the cell: enzymatic, transport, structural roles, etc.
Figure: P38 (MAPK1) from PDB
PPI Networks: Interactions
• Proteins function via interactions
Figure: P38 interactions with other kinase-activity proteins, from RobinViz
PPI Networks: Prediction
• How to Predict Interactions?
  o Experimental: AP-MS, SDS-PAGE, Y2H, ...
  o Computational: Gene Order, Coevolutionary Profiling, Coexpression Analysis
Figure: from Junker et al. '08
PPI Networks: Verification
• Problems
  o Many false positives/negatives
  o 80K yeast protein interactions across all experiment types
  o Only 3% are supported by more than one type
• How to Verify Interactions?
  o Co-function
  o Co-localization
  o Co-expression
• Reliability Assignment
  o Computational methods using: verification concepts, orthology
PPI Networks: Visualization
• Graph Visualization
• Issues:
  o Very large graphs: thousands of nodes, tens of thousands of interactions
  o Models for integration of: interaction verification, interaction reliabilities
• Previous Tools:
  o Cytoscape [Shannon et al. '03]
  o GenePro [Vlasblom et al. '06]
  o SpringScape [Ebbels et al. '06]
  o Lack verification integration and/or reliabilities
  o No real-time data retrieval from major databases
RobinViz: Underlying PPI Network
• Edge-weighted, undirected graph
  o Construct SQLite database: Naming
  o User-specified multiple PPI (organism, experiment type): retrieve from BioGrid, unify naming and merge
  o Interaction reliabilities (specified organisms): retrieve from HitPredict, unify naming and merge
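The merge step above can be sketched as follows. This is a minimal illustration, not RobinViz's actual code: the function name `build_weighted_ppi`, the `aliases` table standing in for the naming database, the default weight for interactions without a reliability score, and the identifiers `P1`..`P3` are all assumptions made for the example.

```python
from collections import defaultdict

def build_weighted_ppi(interaction_sources, reliabilities, aliases, default_weight=0.0):
    """Merge several interaction lists into one edge-weighted, undirected graph.

    interaction_sources: list of lists of (protein_a, protein_b) pairs
    reliabilities: {frozenset({a, b}): score}, keyed by unified names
    aliases: {alternative_name: unified_name} (hypothetical naming table)
    """
    def canon(name):
        return aliases.get(name, name)

    graph = defaultdict(dict)
    for source in interaction_sources:
        for a, b in source:
            u, v = canon(a), canon(b)
            if u == v:
                continue  # this sketch ignores self-interactions
            weight = reliabilities.get(frozenset((u, v)), default_weight)
            graph[u][v] = weight  # undirected: store both directions
            graph[v][u] = weight
    return dict(graph)

# Hypothetical identifiers and scores, for illustration only.
aliases = {"P1_alt": "P1"}
sources = [[("P1", "P2")], [("P2", "P1_alt"), ("P1", "P3")]]
scores = {frozenset(("P1", "P2")): 0.9}
ppi = build_weighted_ppi(sources, scores, aliases)
```

Duplicate interactions reported by different sources collapse into one edge once names are unified, which is the point of unifying before merging.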
RobinViz: Visualization Model
• Underlying PPI network, G = (V, E)
• Two-level visualization
  o Central view graph, G_c = (V_c, E_c): node/edge weighted; each u in V_c is a subset of V; each (u,v) in E_c is a union of edges from E
  o Peripheral view graphs: edge weighted; subgraph of G induced by each u in V_c
• Determine
  o V_c and weights of nodes in V_c
  o Mapping M: V_c → P(V), where P is the power set
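A small sketch of the two-level model, assuming edges are stored as `{frozenset({u, v}): weight}` and M as a dictionary from central node to a set of proteins. Summing the weights of the underlying edges to get a central edge weight is an assumption made here; the slides only state that a central edge is a union of edges from E.

```python
from itertools import combinations

def induced_subgraph(edges, nodes):
    """Peripheral view: edges of G with both endpoints in `nodes`."""
    nodes = set(nodes)
    return {e: w for e, w in edges.items() if e <= nodes}

def central_view(edges, mapping):
    """Central view: one edge per pair of central nodes that share edges of G.

    The central edge weight is the sum of the contributing edge weights
    (an assumption for this sketch).
    """
    central = {}
    for c1, c2 in combinations(sorted(mapping), 2):
        s1, s2 = set(mapping[c1]), set(mapping[c2])
        between = []
        for e, w in edges.items():
            u, v = tuple(e)
            if (u in s1 and v in s2) or (u in s2 and v in s1):
                between.append(w)
        if between:
            central[(c1, c2)] = sum(between)
    return central

# Hypothetical toy network and mapping M.
edges = {
    frozenset(("a", "b")): 0.9,
    frozenset(("b", "c")): 0.5,
    frozenset(("c", "d")): 0.7,
}
mapping = {"X": {"a", "b"}, "Y": {"c", "d"}}
central = central_view(edges, mapping)
peripheral_x = induced_subgraph(edges, mapping["X"])
```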
RobinViz: Central View
• Co-ontology
  o V_c: (User) GO categories from AmiGO
  o M: (User) Annotation repositories from GO
      Each repository: gene_i → cat_x, cat_y, cat_z, ...
      Convert: cat_x → gene_i, gene_j, gene_k
      Filter with V_c, unify names, merge sources
  o Node Weights: PPI hit ratio
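The convert/filter/merge steps for M amount to inverting gene-to-category maps. A minimal sketch, assuming each repository is a dictionary `{gene: [categories]}`; the function name, the `aliases` table and all identifiers below are hypothetical.

```python
from collections import defaultdict

def invert_annotations(repositories, selected_categories, aliases=None):
    """Build M: category -> set of genes, from gene -> categories repositories.

    Only categories in `selected_categories` (V_c) are kept; gene names are
    unified through `aliases` before the sources are merged.
    """
    aliases = aliases or {}
    selected = set(selected_categories)
    mapping = defaultdict(set)
    for repo in repositories:
        for gene, categories in repo.items():
            name = aliases.get(gene, gene)
            for cat in categories:
                if cat in selected:
                    mapping[cat].add(name)
    return dict(mapping)

# Hypothetical repositories and category names.
repo_a = {"gene_i": ["cat_x", "cat_y"], "gene_j": ["cat_x"]}
repo_b = {"gene_k": ["cat_x", "cat_z"], "gene_i_alt": ["cat_y"]}
M = invert_annotations(
    [repo_a, repo_b],
    selected_categories=["cat_x", "cat_y"],
    aliases={"gene_i_alt": "gene_i"},
)
```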
RobinViz: Central View
• Co-expression
  o V_c:
      (User) Expression matrix from GEO
      (User) Biclustering algorithm: CC, BIMAX, REAL
      Set of genes in each resulting bicluster
  o M: Same as co-ontology
  o Node Weights: PPI hit ratio, H-value, functional enrichment
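The slides do not define the H-value. Assuming it refers to the mean squared residue score of Cheng and Church (the CC algorithm listed above), it can be computed for a bicluster as sketched below; lower values indicate a more coherent bicluster. The function name and the toy matrices are made up for the example.

```python
def mean_squared_residue(matrix, rows, cols):
    """Mean squared residue of the submatrix given by `rows` x `cols`.

    residue(i, j) = a_ij - rowmean_i - colmean_j + overall_mean
    """
    sub = [[matrix[i][j] for j in cols] for i in rows]
    n, m = len(rows), len(cols)
    row_mean = [sum(r) / m for r in sub]
    col_mean = [sum(sub[i][j] for i in range(n)) / n for j in range(m)]
    all_mean = sum(row_mean) / n
    total = 0.0
    for i in range(n):
        for j in range(m):
            residue = sub[i][j] - row_mean[i] - col_mean[j] + all_mean
            total += residue * residue
    return total / (n * m)

# Rows that differ only by a constant shift give a score of zero.
additive = [[1, 2, 3], [2, 3, 4], [4, 5, 6]]
h_additive = mean_squared_residue(additive, rows=[0, 1, 2], cols=[0, 1, 2])

# A checkerboard pattern is not coherent under this score.
checker = [[1, 0], [0, 1]]
h_checker = mean_squared_residue(checker, rows=[0, 1], cols=[0, 1])
```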
RobinViz: Bidirectional Verification
• From trustworthy central clusters to mistrusted PPI: false negatives, false positives
RobinViz: Bidirectional Verification
• From trustworthy PPI to mistrusted central clusters: false negatives, false positives
RobinViz: Detailed Analysis
• Co-ontology
  o Detailed online information via AmiGO
  o GO Table within the system
RobinViz: Detailed Analysis
• Co-expression
  o GO Table, Functional Enrichment Table
  o Heatmap and Parallel Plots for biclustering results
RobinViz: Layout Algorithms
• Spring Embedder Layout (Weighted)
RobinViz: Layout Algorithms
• Spring Embedder Layout
  o Energy based: nodes connected with springs
  o Heuristic for minimizing the energy of the system
  o Edges short, non-edges long, symmetric layout
  o Random initial positions
  o For k iterations:
      For each edge (u,v):
        Displace u, v proportional to F_attr between u, v
      For each node u:
        For each node v:
          Displace u proportional to F_rep between u, v
  o Straight-line edges for each edge (u,v)
  o Running time Θ(|V|^2 k); in practice Θ(|V|^3), i.e. k proportional to |V|
• Reliabilities: weighted modification in force formula
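The loop above can be sketched in Python as follows. The force formulas are assumptions borrowed from the Fruchterman-Reingold style of spring embedder (attraction d^2/k, repulsion k^2/d), and scaling the attraction by the edge reliability is one possible reading of the "weighted modification"; the slides do not give RobinViz's actual formulas, and its layout code is written in C++ with LEDA.

```python
import math
import random

def spring_layout(nodes, edges, iterations=50, size=1.0, seed=0):
    """Weighted spring embedder sketch. edges: {(u, v): reliability}."""
    rng = random.Random(seed)
    pos = {u: [rng.uniform(0, size), rng.uniform(0, size)] for u in nodes}
    k = size / math.sqrt(len(nodes))  # ideal edge length
    step = size / 10.0                # maximum displacement per iteration
    for _ in range(iterations):
        disp = {u: [0.0, 0.0] for u in nodes}
        # Attraction along each edge, scaled by its weight (assumption).
        for (u, v), w in edges.items():
            dx, dy = pos[u][0] - pos[v][0], pos[u][1] - pos[v][1]
            d = math.hypot(dx, dy) or 1e-9
            f = w * d * d / k
            disp[u][0] -= dx / d * f
            disp[u][1] -= dy / d * f
            disp[v][0] += dx / d * f
            disp[v][1] += dy / d * f
        # Repulsion between every ordered pair of nodes: |V|^2 work.
        for u in nodes:
            for v in nodes:
                if u == v:
                    continue
                dx, dy = pos[u][0] - pos[v][0], pos[u][1] - pos[v][1]
                d = math.hypot(dx, dy) or 1e-9
                f = k * k / d
                disp[u][0] += dx / d * f
                disp[u][1] += dy / d * f
        # Move each node along its net force, capped by the current step.
        for u in nodes:
            dx, dy = disp[u]
            d = math.hypot(dx, dy)
            if d > 0:
                move = min(d, step)
                pos[u][0] += dx / d * move
                pos[u][1] += dy / d * move
        step *= 0.95  # cool down so the layout settles
    return {u: tuple(p) for u, p in pos.items()}

# Hypothetical toy graph with reliabilities as edge weights.
nodes = ["a", "b", "c", "d"]
edges = {("a", "b"): 1.0, ("b", "c"): 0.5, ("c", "d"): 1.0}
layout = spring_layout(nodes, edges)
```

Each iteration costs |E| + |V|^2 force evaluations, which matches the Θ(|V|^2 k) bound on the slide.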
RobinViz: Layout Algorithms
• Sugiyama-style Layout (Weighted)
RobinViz: Layout Algorithms
• Sugiyama-style Layout (Weighted)
  o Layer assignment:
      Modify Coffman-Graham '72
      Longest path with promotion heuristic
      Minimizing weighted edge lengths, drawing area
  o Order within layers:
      Layer-by-layer sweep
      Weighted crossing minimization
  o Coordinate assignment:
      x-coord: 2-bends with Brandes et al. '02
      y-coord: Proportional to density of weighted edges
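Two building blocks of the pipeline above, sketched under stated assumptions: plain longest-path layering (without the promotion heuristic or the Coffman-Graham modification), and a weighted crossing count between two adjacent layers in which a crossing costs the product of the two edge weights. The product cost is an assumption; the slides do not specify how RobinViz weights crossings.

```python
def longest_path_layers(nodes, edges):
    """Layer a DAG: sinks on layer 0, every other node one layer above
    its highest successor. edges: iterable of (u, v) meaning u -> v."""
    succ = {u: [] for u in nodes}
    for u, v in edges:
        succ[u].append(v)
    layer = {}

    def visit(u):
        if u not in layer:
            layer[u] = 1 + max((visit(v) for v in succ[u]), default=-1)
        return layer[u]

    for u in nodes:
        visit(u)
    return layer

def weighted_crossings(upper_order, lower_order, edges):
    """Sum of w1 * w2 over all pairs of crossing edges between two layers.
    edges: {(u, v): w} with u in the upper layer and v in the lower one."""
    up = {u: i for i, u in enumerate(upper_order)}
    low = {v: i for i, v in enumerate(lower_order)}
    items = list(edges.items())
    total = 0.0
    for a in range(len(items)):
        (u1, v1), w1 = items[a]
        for b in range(a + 1, len(items)):
            (u2, v2), w2 = items[b]
            # Two edges cross when their endpoints appear in opposite order.
            if (up[u1] - up[u2]) * (low[v1] - low[v2]) < 0:
                total += w1 * w2
    return total

# Hypothetical toy inputs.
layers = longest_path_layers(
    ["a", "b", "c", "d"],
    [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")],
)
two_layer_edges = {("a", "y"): 2.0, ("b", "x"): 3.0}
crossed = weighted_crossings(["a", "b"], ["x", "y"], two_layer_edges)
uncrossed = weighted_crossings(["a", "b"], ["y", "x"], two_layer_edges)
```

A layer-by-layer sweep would reorder one layer at a time to reduce this count while the neighbouring layer stays fixed.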
RobinViz: Other Features
• Node Coloring
  o 1st level in: Process, Compartment, Function, All
  o Peripheral (Func), Central (Func), Central (All)
• Detailed 1-hop/2-hop neighborhoods
• Zooming/Animation/Selection Focus
• Search Panel
• Save/Load Session, Automatic Database Update
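The 1-hop/2-hop neighborhoods listed above can be extracted with a depth-limited breadth-first search. A minimal sketch, assuming the network is stored as an adjacency dictionary; the function name and the toy graph are hypothetical.

```python
from collections import deque

def k_hop_neighborhood(adj, start, k):
    """Nodes reachable from `start` in at most k edges (excluding start)."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        if dist[u] == k:
            continue  # do not expand beyond k hops
        for v in adj.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return set(dist) - {start}

# Hypothetical path graph a - b - c - d.
adj = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
one_hop = k_hop_neighborhood(adj, "a", 1)
two_hop = k_hop_neighborhood(adj, "a", 2)
```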
RobinViz: Libraries, Databases etc.
• Code and Libraries
  o Graph Layout: C++ LEDA
  o GUI and Data Processing: Python and PyQt4
  o Settings Files: YAML
  o Website Parsing: BeautifulSoup
  o ~30,000 lines of code
  o Windows and Linux versions
• Databases
  o PPI Networks: BioGrid
  o Interaction Reliabilities: HitPredict
  o GO Tree: GeneOntology.org TermDB
  o GO Associations: GeneOntology.org Association
  o Gene Expression: GEO (NCBI)
• Web Services
  o BioGrid.org and AmiGO
Future Work
• RobinViz Extensions:
  o Hypothetical network creation: Nomenclature
  o Embedded Network Analysis
  o Hierarchical Clustering
  o Integration of other popular databases
• Graph unions, new definitions
• PPI network prediction from multiple species