Non-Homogeneous Hidden Markov Model Qingyuan Liu
Introduction (Why Homogeneous HMM) • Classify new sequences into new family • Add related sequences into MSA • Compute MSA for groups of related sequences
Introduction (Building an HMM) • Seed sequences for HMM building • Ultra-large multiple sequence alignment using Phylogeny-aware Profiles (UPP) • Parameters – Emission Probability – Transition Probability
Background (Long Indels) • HMM cannot deal with long indels. • Example: 10 consecutive residue losses • Assume 0.5 for each deletion transition probability • 0.5^10 ≈ 0.001 is extremely small
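The arithmetic in this example can be checked with a short sketch. It assumes, as the slide does, a probability of 0.5 for each deletion transition and that the ten deletions are traversed one after another; the function name is illustrative and not part of any library.

```python
import math


def consecutive_deletion_probability(p_delete: float, run_length: int) -> float:
    """Probability of taking `run_length` deletion transitions in a row,
    each with the same fixed probability `p_delete` (homogeneous case)."""
    if not 0.0 <= p_delete <= 1.0:
        raise ValueError("p_delete must be a probability in [0, 1]")
    if run_length < 0:
        raise ValueError("run_length must be non-negative")
    return p_delete ** run_length


# Values taken from the slide: 0.5 per deletion, 10 consecutive losses.
prob = consecutive_deletion_probability(0.5, 10)
# The same quantity in log space, which is how HMM scores are usually accumulated.
log_prob = 10 * math.log(0.5)

print(prob)      # 0.0009765625
print(log_prob)
```

With a fixed per-position probability the score shrinks geometrically with the length of the indel, which is why a long deletion is penalized so heavily.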
Significance • Cause: Emission probability is fixed • Make the HMM non-homogeneous instead – Emission probability is not fixed – Different parameters for different cases
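One way to read "different parameters for different cases" is to let the parameters depend on the position in the sequence. The sketch below is a generic forward algorithm under that assumption, not the method proposed in this project: `transition_at` and `emission_at` are hypothetical callbacks, and the two-state example values are made up for illustration.

```python
from typing import Callable, List, Sequence


def forward_likelihood(
    observations: Sequence[str],
    initial: Sequence[float],
    transition_at: Callable[[int], List[List[float]]],
    emission_at: Callable[[int, int, str], float],
) -> float:
    """Likelihood of `observations` under an HMM whose transition matrix and
    emission probabilities may change with the position t."""
    if not observations:
        raise ValueError("observations must not be empty")
    n_states = len(initial)
    # alpha[s] = P(observations[0..t], state at t = s)
    alpha = [initial[s] * emission_at(0, s, observations[0]) for s in range(n_states)]
    for t in range(1, len(observations)):
        trans = transition_at(t)  # matrix used for the step from t-1 to t
        alpha = [
            emission_at(t, j, observations[t])
            * sum(alpha[i] * trans[i][j] for i in range(n_states))
            for j in range(n_states)
        ]
    return sum(alpha)


def transition_at(t: int) -> List[List[float]]:
    # Hypothetical rule: staying in state 1 becomes more likely as t grows.
    stay = min(0.5 + 0.1 * t, 0.9)
    return [[0.9, 0.1], [1.0 - stay, stay]]


def emission_at(t: int, state: int, symbol: str) -> float:
    # Hypothetical rule: state 1 emits "A" with a different probability after t = 0.
    p_a = [0.8, 0.3 if t == 0 else 0.4][state]
    return p_a if symbol == "A" else 1.0 - p_a


initial = [0.5, 0.5]
likelihood = forward_likelihood(["A", "B", "B"], initial, transition_at, emission_at)
print(likelihood)
```

A homogeneous HMM is the special case in which both callbacks ignore `t`.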
Project • Literature review for how to build a non-homogeneous HMM. • Propose ideas for how to build a non-homogeneous HMM for MSA • Literature review for other possible MSA methods to deal with long indels • Combined tree- and profile-based alignment • Simulation-based approach • Group-to-group sequence alignment
Literature: • Sarkar, Abhra, Anindya Bhadra, and Bani K. Mallick. "Nonparametric Bayesian Approaches to Non-homogeneous Hidden Markov Models." (n.d.): n. pag. 8 May 2012. Web. 4 Apr. 2017. • Ghavidel, Fatemeh Zamanzad, Jürgen Claesen, and Tomasz Burzykowski. "A Nonhomogeneous Hidden Markov Model for Gene Mapping Based on Next-Generation Sequencing Data." Journal of Computational Biology 22.2 (2015): 178-88. Web. • Grzegorczyk, Marco. "A Non-homogeneous Dynamic Bayesian Network with a Hidden Markov Model Dependency Structure among the Temporal Data Points." Machine Learning 102.2 (2015): 155-207. Web. • Gowri-Shankar, V., & Rattray, M. (2007). A Reversible Jump Method for Bayesian Phylogenetic Inference with a Nonhomogeneous Substitution Model. Molecular Biology and Evolution, 24(6), 1286-1299. doi:10.1093/molbev/msm046 • Aalen, O., & Johansen, S. (1978). An Empirical Transition Matrix for Non-Homogeneous Markov Chains Based on Censored Observations. Scandinavian Journal of Statistics, 5(3), 141-150. Retrieved from http://www.jstor.org/stable/4615704
Literature: • Vassiliou, P. G. (1997). The evolution of the theory of non-homogeneous Markov systems. Applied Stochastic Models and Data Analysis, 13(3-4), 159-176. doi:10.1002/(sici)1099-0747(199709/12)13:3/4<159::aid-asm309>3.3.co;2-h • Loytynoja, A., & Goldman, N. (2008). A model of evolution and structure for multiple sequence alignment. Philosophical Transactions of the Royal Society B: Biological Sciences, 363(1512), 3913-3919. doi:10.1098/rstb.2008.0170 • Karin, E. L., Rabin, A., Ashkenazy, H., Shkedy, D., Avram, O., Cartwright, R. A., & Pupko, T. (2015). Inferring Indel Parameters using a Simulation-based Approach. Genome Biology and Evolution, 7(12), 3226-3238. doi:10.1093/gbe/evv212 • Yamada, S., Gotoh, O., & Yamana, H. (2006). BMC Bioinformatics, 7(1), 524. doi:10.1186/1471-2105-7-524