
miRNA Discovery & Prediction Algorithms
Sergei Lebedev
October 13, 2012

What is miRNA?
• microRNA (miRNA) is a ≈ 22-nucleotide-long non-coding RNA;
• miRNAs are mostly expressed in a tissue-specific manner and play crucial roles in cell proliferation, apoptosis and differentiation during cell development;
• they are thought to be involved in post-transcriptional control in plants and animals;
• they are linked to disease¹; for example, hsa-miR-126 is associated with retinoblastoma, breast cancer, lung cancer, kidney cancer, asthma, etc.
¹ See http://www.mir2disease.org for details.

miRNA in action: nucleus [1]
• pri-miRNA is transcribed by RNA polymerase II and seems to possess promoter and enhancer regions, similar to protein-coding genes;
• pri-miRNA is then cleaved into (possibly multiple) pre-miRNAs by the Drosha enzyme complex.

miRNA in action: cytoplasm [1]
• Dicer removes the loop of the pre-miRNA stem-loop, leaving two complementary sequences: miRNA and miRNA*; the latter is not known to have any regulatory function.
• Mature miRNA base-pairs with the 3’ UTR of target mRNAs and blocks protein synthesis or causes mRNA degradation.

miRNA identification
• Biological methods: northern blots, qRT-PCR², microarrays, RNA-seq or miRNA-seq.
• Bioinformatics to the rescue! The usual strategy: first sequence everything (RNA-seq in this case), then try to make sense of whatever the result is.
• In this talk: miRDeep [2], MiRAlign [3], miRank [4].
• A lot of existing tools are out of scope; most can be described with a one-liner: “We’ve developed a novel method for miRNA identification, based on a machine learning approach, SVM, HMM!”.
² RT stands for reverse transcription, not real-time.

miRDeep

MiRAlign

miRank: overview
• Treat the miRNA identification problem as an information retrieval problem, where novel miRNAs are retrieved from a set of candidates using known query samples, i.e. “true” miRNAs.
• More formally, given a set of known pre-miRNAs $X_Q$ as query samples and a set of putative candidates $X_U$ as unknown samples, rank $X_U$ with respect to $X_Q$.
• To do so, compute the relevancy values $f_i \in [0, 1]$ for all unknown samples, assuming $f_i = 1$ for query samples.
• After that, simply select the $n$ top-ranked samples, which constitute the predicted pre-miRNAs.
• Makes sense, right?
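The final selection step can be sketched in a few lines of Python. This is only an illustration of "rank, then keep the top n"; the function name `select_top_n`, the candidate identifiers and the relevancy values are all made up for the example.

```python
def select_top_n(relevancy, n):
    """Return the ids of the n unknown samples with the highest relevancy.

    `relevancy` maps a candidate id to its relevancy value f_i in [0, 1].
    """
    ranked = sorted(relevancy, key=relevancy.get, reverse=True)
    return ranked[:n]


# Hypothetical candidates with made-up relevancy values.
scores = {"cand_a": 0.91, "cand_b": 0.15, "cand_c": 0.67, "cand_d": 0.40}
predicted = select_top_n(scores, n=2)
```

Here `predicted` holds the two candidates that would be reported as putative pre-miRNAs.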

miRank: how does it work?
• miRank models a belief propagation process by doing Markov random walks on a graph, where each vertex corresponds to either a known pre-miRNA or a putative candidate, and two vertices are connected by an edge if they are “close to each other”.
• Each edge of the graph is assigned a weight $w_{ij}$, proportional to the Euclidean distance between the samples $v_i$ and $v_j$ (see next slide on how samples are represented).
• When a random walker transits from $v_i$ to $v_j$, it transmits the relevancy information of $v_i$ to $v_j$ by the following update rule:

$$f_i^{(k+1)} = \alpha \sum_{x_j \in X_U} p_{ij} f_j^{(k)} + \sum_{x_j \in X_Q} p_{ij} f_j, \qquad p_{ij} = \frac{w_{ij}}{\deg(v_i)}$$
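A rough Python sketch of this kind of relevancy propagation follows. It is not the miRank implementation. Several choices are assumptions made for the example: the graph is fully connected (the slide does not say how "close" is decided), the weights come from a Gaussian kernel on the Euclidean distance with a hypothetical `sigma`, the degree of a vertex is taken as the sum of its edge weights, and the walk runs for a fixed number of iterations.

```python
import math


def gaussian_weights(points, sigma=1.0):
    """Symmetric edge weights from Euclidean distances (assumed Gaussian kernel)."""
    n = len(points)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                d = math.dist(points[i], points[j])
                w[i][j] = math.exp(-d * d / (2 * sigma ** 2))
    return w


def propagate(w, is_query, alpha=0.5, iters=100):
    """Iterate f_i = alpha * sum_{j in U} p_ij f_j + sum_{j in Q} p_ij f_j,
    with p_ij = w_ij / deg(v_i) and f fixed to 1 for query samples."""
    n = len(w)
    deg = [sum(row) for row in w]  # weighted degree of each vertex
    p = [[w[i][j] / deg[i] for j in range(n)] for i in range(n)]
    f = [1.0 if q else 0.0 for q in is_query]
    for _ in range(iters):
        new_f = []
        for i in range(n):
            if is_query[i]:
                new_f.append(1.0)  # query samples keep f = 1
                continue
            from_unknown = sum(p[i][j] * f[j] for j in range(n) if not is_query[j])
            from_query = sum(p[i][j] for j in range(n) if is_query[j])  # f_j = 1
            new_f.append(alpha * from_unknown + from_query)
        f = new_f
    return f


# Toy feature vectors: one query sample, one nearby and one distant candidate.
points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0)]
is_query = [True, False, False]
f = propagate(gaussian_weights(points), is_query)
```

In this toy setup the candidate next to the query sample ends up with a higher relevancy than the distant one, and every value stays within [0, 1] because each row of transition probabilities sums to 1 and `alpha` is at most 1.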

miRank: features
Global
• normalized minimum free energy of folding (MFE);
• normalized number of paired nucleotides on both arms;
• normalized loop length.
Local – RNAfold
GUAGCACUAAAGUGCUUAUAGUGCAGGUAGUGUUUAGUUAUCUACUGCAUUAUGAGCACUUAAAGUACUGC
((((.(((.(((((((((((((((((.(((((......)).))))))))))))))))))))..))).))))
• Each nucleotide is either paired, denoted by a bracket (( for the 5’ arm, ) for the 3’ arm), or unpaired, denoted by a dot (.);
• Each local feature is a “word” of length 3, further distinguished by the nucleotide in the middle position; examples: ((. and .((
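The structure-derived features can be sketched in Python from a sequence and its dot-bracket string. This is a hedged illustration under several assumptions: the hairpin has a single terminal loop and at least one base pair, normalization means dividing by sequence length, and the brackets of the two arms are kept distinct. The function names are hypothetical, and the MFE feature is left out because it needs a folding program such as RNAfold.

```python
from collections import Counter


def local_features(seq, struct):
    """Count length-3 structure words keyed by the middle nucleotide, e.g. 'C((.'."""
    if len(seq) != len(struct):
        raise ValueError("sequence and structure must have the same length")
    feats = Counter()
    for i in range(1, len(seq) - 1):
        feats[seq[i] + struct[i - 1:i + 2]] += 1
    return feats


def global_structure_features(struct):
    """Return (paired fraction, loop fraction) for a single-hairpin structure."""
    n = len(struct)
    paired = struct.count("(") + struct.count(")")
    # Terminal loop: unpaired positions between the last '(' and the first ')'.
    loop = struct.index(")") - struct.rindex("(") - 1
    return paired / n, loop / n


# Toy hairpin, not a real pre-miRNA.
feats = local_features("GCAUGC", "((..))")
paired_frac, loop_frac = global_structure_features("((..))")
```

For the toy hairpin, each of the four interior positions contributes one word, such as `C((.` for the C in second position.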

miRank: good parts, bad parts & magic
• The method doesn’t require any genomic annotations, except for the set of query samples.
• ≈ 75% precision and ≈ 70% recall even with very few query samples (1, 5); this is hard to validate, because the source code was never released.
• The notion of similarity between query samples, which defines the graph structure, is unclear, even though it looks critical for algorithm performance.
• Two user-specified parameters: $n$, the number of predicted samples, and $\alpha$, the weight of unknown samples in the relevancy value. How do they affect precision and recall, and how should they be chosen?
• Overall, it seems like miRank isn’t used much by biologists³.
³ http://www.ncbi.nlm.nih.gov/pubmed?linkname=pubmed_pubmed_citedin&from_uid=18586744
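One way to probe how the number of predicted samples affects precision and recall is to sweep it over a ranked candidate list and score each cut-off against a set of known positives. The sketch below does that on made-up data; the identifiers, the ranking and the helper name `precision_recall` are hypothetical, and nothing here reproduces the figures reported for miRank.

```python
def precision_recall(predicted, true_positives):
    """Precision and recall of a predicted set against known positives."""
    predicted, true_positives = set(predicted), set(true_positives)
    hits = len(predicted & true_positives)
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(true_positives) if true_positives else 0.0
    return precision, recall


# Made-up ranking (best first) and made-up set of validated pre-miRNAs.
ranked = ["c1", "c2", "c3", "c4", "c5"]
known = {"c1", "c3", "c6"}

# Precision and recall for every cut-off n = 1..len(ranked).
curve = {n: precision_recall(ranked[:n], known) for n in range(1, len(ranked) + 1)}
```

The same sweep could be repeated for several values of the other parameter by re-running the ranking each time.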

References
[1] K. Chen and N. Rajewsky. The evolution of gene regulation by transcription factors and microRNAs. Nat. Rev. Genet., 8(2):93–103, Feb 2007.
[2] M. R. Friedlander, W. Chen, C. Adamidi, J. Maaskola, R. Einspanier, S. Knespel, and N. Rajewsky. Discovering microRNAs from deep sequencing data using miRDeep. Nat. Biotechnol., 26(4):407–415, Apr 2008.
[3] X. Wang, J. Zhang, F. Li, J. Gu, T. He, X. Zhang, and Y. Li. MicroRNA identification based on sequence and structure alignment. Bioinformatics, 21(18):3610–3614, Sep 2005.
[4] Y. Xu, X. Zhou, and W. Zhang. MicroRNA prediction with a novel ranking algorithm based on random walks. Bioinformatics, 24(13):i50–58, Jul 2008.
