4/24/09
CSCI1950-Z Computational Methods for Biology
Lecture 22
Ben Raphael
April 22, 2009
http://cs.brown.edu/courses/csci1950-z/

Outline
Network alignment and querying
• PathBLAST
• Color coding and randomized algorithms

PathBLAST
• Goal: identify conserved pathways (chains)
• Idea: can be done efficiently by dynamic programming if networks are DAGs
[Figure: path A B C D in one network aligned to path A X’ D’ in the other. Score: match + gap + mismatch + match]
Kelley et al. (2003)

Why paths?

PathBLAST (Kelley, et al. PNAS 2003)
• Find conserved pathways in protein interaction maps of two species
• Model & Scoring: (Whiteboard)

PathBLAST Scoring

PathBLAST
• Problem: Networks are neither acyclic nor directed
• Solution: Randomize
  – Impose a random ordering on the nodes, perform DP; repeat many times
[Figure: a five-node network (nodes 1 to 5) redrawn under different random node orderings]
• On average, the highest-scoring path is preserved in a fraction 2/L! of the random acyclic subgraphs
• Finds conserved paths of length L within networks of size n in O(L! n) expected time
• Drawbacks
  – Computationally expensive
  – Restricts search to specific topology
Kelley et al. (2003)

PathBLAST
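
The randomization step above can be sketched in Python. This is only an illustration under simplifying assumptions, not the PathBLAST implementation: it works on a single undirected graph with edge scores (PathBLAST runs on an alignment of two networks), and the names `best_path_one_trial` and `find_best_path` are hypothetical. Each trial orients every edge from the lower-ranked to the higher-ranked node under a random ordering, which yields a DAG, and then runs DP in rank order.

```python
import random

def best_path_one_trial(nodes, edges, L, rng):
    """One trial: random node ordering -> acyclic orientation -> DP.

    edges maps an undirected pair (u, v) to its score (assumed input format).
    Returns (score, path) for the best path with exactly L vertices, or None.
    """
    order = list(nodes)
    rng.shuffle(order)
    rank = {v: i for i, v in enumerate(order)}
    # Keep each edge only in the direction lower rank -> higher rank.
    incoming = {v: [] for v in nodes}
    for (u, v), w in edges.items():
        lo, hi = (u, v) if rank[u] < rank[v] else (v, u)
        incoming[hi].append((lo, w))
    # best[v][n] = (score, path) of the best directed path with n vertices ending at v.
    best = {v: {1: (0.0, (v,))} for v in nodes}
    for v in order:  # increasing rank is a topological order of the DAG
        for u, w in incoming[v]:
            for n, (s, path) in best[u].items():
                if n < L and (n + 1 not in best[v] or s + w > best[v][n + 1][0]):
                    best[v][n + 1] = (s + w, path + (v,))
    full = [best[v][L] for v in nodes if L in best[v]]
    return max(full, key=lambda t: t[0], default=None)

def find_best_path(nodes, edges, L, trials, seed=0):
    """Repeat the randomized trial and keep the best path seen."""
    rng = random.Random(seed)
    best = None
    for _ in range(trials):
        cand = best_path_one_trial(nodes, edges, L, rng)
        if cand is not None and (best is None or cand[0] > best[0]):
            best = cand
    return best
```

A fixed path on L vertices survives a random orientation only if its vertices appear in increasing or decreasing rank, which happens with probability 2/L!. On the order of L!/2 trials are therefore needed before a given path is expected to be seen once, which is where the L! factor in the running time comes from.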

PathBLAST: Computational Formulation
• I = {start vertices}, e.g. receptors.
• Goal: Find highest scoring paths from I to v for all v in G.
Scott, et al. JCB 2006

PathBLAST: Computational Formulation
• Given:
  – Undirected weighted graph G = (V, E, w)
  – Set of start vertices I, and end vertex v
• Find: a minimum-weight simple path P = (v_1, e_1, v_2, e_2, …, e_{k-1}, v_k) starting in I and ending at v:
  – v_1 in I and v_k = v.
• Recall: Simple path: v_i ≠ v_j if i ≠ j
• NP-hard in general (reduction from TSP)
• Let w_k(v) = weight of the minimum-weight path above (k vertices, ending at v).
  – Dynamic programming solution (whiteboard)
Scott, et al. JCB 2006

Color-coding (Alon, Yuster, & Zwick)
• Assign each vertex a random color between 1 and k.
• Colorful path: path whose vertices all have distinct colors.
• Colorful path ⇒ simple path.
• Goal: find colorful paths
  – Dynamic programming solution (whiteboard)
• A high-scoring path is not discovered when two of its vertices have the same color.
• Repeat for many random colorings. (How many?)

Adding extra constraints
• Require a protein: assign it a unique color.
• Require a specific number of proteins from a set T: W(v, S, c) = min. weight of path … (same as above) and contains exactly c vertices in T.
• Order constraint on proteins in path
  – Membrane proteins → transcription factors.
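
The color-coding procedure (random coloring, DP over colorful paths, repeated trials) can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the code of Scott et al.: the graph is a dict of dicts `adj[u][v] = weight`, the DP state is the pair (end vertex, set of colors used so far), and the function names are hypothetical. The extra constraints on this slide are not modeled.

```python
import math
import random

def colorful_min_path(adj, starts, k, colors):
    """Min weight of a colorful path with k vertices from `starts` to each end vertex.

    W[(v, S)] = min weight of a path that starts in `starts`, ends at v,
    and uses exactly the colors in S, each once.
    """
    W = {(v, frozenset([colors[v]])): 0.0 for v in starts}
    for _ in range(k - 1):  # extend every partial path by one vertex
        nxt = {}
        for (u, S), wt in W.items():
            for v, w in adj[u].items():
                if colors[v] in S:  # color already used: path would not be colorful
                    continue
                key = (v, S | {colors[v]})
                if wt + w < nxt.get(key, math.inf):
                    nxt[key] = wt + w
        W = nxt
    out = {}
    for (v, _), wt in W.items():
        out[v] = min(wt, out.get(v, math.inf))
    return out

def min_weight_paths(adj, starts, k, trials, seed=0):
    """Repeat over random colorings and keep the best weight per end vertex."""
    rng = random.Random(seed)
    best = {}
    for _ in range(trials):
        colors = {v: rng.randint(1, k) for v in adj}
        for v, wt in colorful_min_path(adj, starts, k, colors).items():
            if wt < best.get(v, math.inf):
                best[v] = wt
    return best

def trials_needed(k, delta):
    """Colorings needed so a fixed k-vertex path is missed with probability <= delta."""
    p = math.factorial(k) / k ** k  # chance that one coloring makes the path colorful
    return math.ceil(math.log(1 / delta) / p)
```

On "How many?": a fixed path with k vertices is colorful with probability k!/k^k, which is at least e^(-k), so about e^k · ln(1/δ) colorings keep the failure probability below δ. The DP itself tracks 2^k color sets per vertex rather than all vertex subsets, which is what makes each trial tractable for small k.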

Adding extra constraints
• Rooted trees: Rooted at v. Every leaf is in I.

Color-coding (Alon, Yuster, & Zwick)
• Extends to many other cases of the subgraph isomorphism problem:
  – Does a graph G have a subgraph isomorphic to graph H?
• H = simple path of length k.
• H = simple cycle of length k.
• H = tree.
• H = graph of fixed (bounded) tree-width.

Additional Problems
1. Efficient querying of a network (e.g. QNET)
2. Find conserved subgraphs
   Heavy subgraphs in product graph
3. Multiple network alignment

Sources
• Kelley BP, Sharan R, Karp RM, Sittler T, Root DE, Stockwell BR, Ideker T. (2003) Conserved pathways within bacteria and yeast as revealed by global protein network alignment. Proc Natl Acad Sci U S A. 100(20):11394-9.
• Scott J, Ideker T, Karp RM, Sharan R. (2006) Efficient algorithms for detecting signaling pathways in protein interaction networks. J Comput Biol. 13(2):133-44.
• Alon N, Yuster R, Zwick U. (1995) Color-coding. J ACM. 42(4).