Systematic Annotation
Mark Voorhies
4/5/2012

Review
RTFM
PNAS 95:14863

The Gene Ontology
Three directed acyclic graphs (aspects):
Biological Process
Molecular Function
Cellular Component

The AmiGO browser

The Gene Ontology
How might we annotate genes with GO terms?
How do we calculate the significance of the GO terms associated with a particular group of genes?

Associating GO terms
How might we annotate genes with GO terms?
By sequence homology (e.g., BLAST)
By domain homology (e.g., InterProScan)
By mapping from an annotated relative (e.g., INPARANOID)
By human curation of the literature (e.g., SGD)

Associating GO terms: Evidence codes
Experimental:
EXP: Inferred from Experiment
IDA: Inferred from Direct Assay
IPI: Inferred from Physical Interaction
IMP: Inferred from Mutant Phenotype
IGI: Inferred from Genetic Interaction
IEP: Inferred from Expression Pattern
Computational Analysis:
ISS: Inferred from Sequence or Structural Similarity
ISO: Inferred from Sequence Orthology
ISA: Inferred from Sequence Alignment
ISM: Inferred from Sequence Model
IGC: Inferred from Genomic Context
RCA: Inferred from Reviewed Computational Analysis
Author Statement:
TAS: Traceable Author Statement
NAS: Non-traceable Author Statement
Curator Statement:
IC: Inferred by Curator
ND: No biological Data available
Automatically assigned:
IEA: Inferred from Electronic Annotation
Obsolete:
NR: Not Recorded

Sampling with replacement: Mutagenesis
How many transformants do we have to screen in order to "cover" a genome?
Probability that a transformant has one disrupted gene: $p_m$
Number of genes in the organism: $N_g$
Probability that a specific gene is disrupted in a specific transformant:
$$p_d = p_m \left( \frac{1}{N_g} \right) = \frac{p_m}{N_g} \tag{1}$$
Probability of not disrupting that gene:
$$p_u = 1 - \frac{p_m}{N_g} \tag{2}$$

Sampling with replacement: Mutagenesis
Probability of not disrupting that gene:
$$p_u = 1 - \frac{p_m}{N_g} \tag{3}$$
The probability of not disrupting that gene in any of $n$ independent transformants is:
$$p_{u,n} = \left( 1 - \frac{p_m}{N_g} \right)^n \tag{4}$$
And the probability of disrupting that gene at least once in $n$ independent transformants is:
$$p_{d,n} = 1 - p_{u,n} = 1 - \left( 1 - \frac{p_m}{N_g} \right)^n \tag{5}$$
This is also the expected genome coverage.
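The coverage formula can be inverted to answer the opening question of how many transformants to screen. The following is a minimal Python sketch under stated assumptions: the function names and the example values of p_m and N_g are hypothetical and do not come from the slides.

```python
import math

def expected_coverage(n, p_m, n_genes):
    """Probability that a given gene is disrupted at least once in n transformants."""
    return 1.0 - (1.0 - p_m / n_genes) ** n

def transformants_needed(target, p_m, n_genes):
    """Smallest n whose expected coverage reaches `target` (0 < target < 1)."""
    p_u = 1.0 - p_m / n_genes  # chance of missing the gene in one transformant
    return math.ceil(math.log(1.0 - target) / math.log(p_u))

# Hypothetical values chosen only for illustration.
p_m, n_genes = 0.5, 6000
n95 = transformants_needed(0.95, p_m, n_genes)
print(n95, expected_coverage(n95, p_m, n_genes))
```

Solving for n rather than scanning a range of n keeps the calculation exact up to floating-point rounding in the logarithms.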

Sampling with replacement: Mutagenesis
[Figure: "p_i or coverage" (y-axis, 0.0 to 1.0) plotted against n (x-axis, 0 to 200000).]

Sampling with replacement: General Cases
Calculating the probability of zero events was easy.
$$p_{0,n} = \left( 1 - \frac{p_m}{N_g} \right)^n \tag{6}$$
What about exactly $k$ events?
Binomial distribution:
$$p_{k,n} = \binom{n}{k} p_m^k (1 - p_m)^{n-k} \tag{7}$$
Here $p_m$ is the per-trial probability of the event; setting $k = 0$ and replacing $p_m$ with $p_m / N_g$ recovers (6).
What if there is more than one type of event?
Multinomial distribution, with $\sum_i k_i = n$:
$$p_{k_1, k_2, \ldots, n} = n! \prod_i \frac{p_i^{k_i}}{k_i!} \tag{8}$$
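The binomial and multinomial probabilities can be checked numerically. This is a small Python sketch, assuming Python 3.8 or later for `math.comb` and `math.prod`; the function names are illustrative and not from the slides.

```python
from math import comb, factorial, prod

def binomial_pmf(k, n, p):
    """Probability of exactly k events in n independent trials, each with probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def multinomial_pmf(counts, probs):
    """Probability of seeing counts[i] events of type i, where probs[i] sum to 1."""
    n = sum(counts)
    coeff = factorial(n)
    for k in counts:
        coeff //= factorial(k)  # exact integer division at every step
    return coeff * prod(p**k for p, k in zip(probs, counts))

# With two event types the multinomial reduces to the binomial.
print(binomial_pmf(3, 10, 0.3), multinomial_pmf([3, 7], [0.3, 0.7]))
```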

Sampling without replacement: GO Annotation
The binomial distribution assumes that event probabilities are constant:
$$p_{k,n} = \binom{n}{k} p_m^k (1 - p_m)^{n-k} \tag{9}$$
What if there are $m$ virulence factors in our genome, and every time we discover one it is magically removed from our library?
Hypergeometric distribution, where $N$ is the total number of genes, $n$ the number of genes drawn, and $k$ the number of drawn genes that are virulence factors:
$$p_{k,m,n} = \frac{\binom{m}{k} \binom{N-m}{n-k}}{\binom{N}{n}} \tag{10}$$
More than one disjoint type of label:
$$p_{k_1, k_2, \ldots, m_1, m_2, \ldots, n} = \frac{\prod_i \binom{m_i}{k_i}}{\binom{N}{n}} \tag{11}$$
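One common way to turn the hypergeometric distribution into a significance score for a GO term is to sum its upper tail: the probability of drawing k or more labeled genes by chance. The slides do not spell out this step, so the sketch below is an assumption about how the formula would be applied; the function names and the example numbers are hypothetical.

```python
from math import comb

def hypergeom_pmf(k, m, n, N):
    """P(exactly k labeled genes) when drawing n genes from N, of which m are labeled."""
    return comb(m, k) * comb(N - m, n - k) / comb(N, n)

def enrichment_pvalue(k, m, n, N):
    """P(k or more labeled genes in the draw): the upper tail of the distribution."""
    return sum(hypergeom_pmf(j, m, n, N) for j in range(k, min(m, n) + 1))

# Hypothetical example: 6000 genes, 40 carry the GO term,
# and a cluster of 100 genes contains 5 of them.
print(enrichment_pvalue(5, 40, 100, 6000))
```

Using exact integer binomial coefficients avoids the overflow and rounding problems that appear when the factorials are computed in floating point.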

Extracting gene lists from JavaTreeView

The SGD GO Slim Mapper

Multiple Hypothesis Testing
http://xkcd.com/882/
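To see why testing many GO terms at once needs care, the sketch below computes the chance of at least one false positive across independent tests and applies a Bonferroni correction. Bonferroni is one standard choice that the slide itself does not name, so treat it as an illustrative assumption; the function names and p-values are hypothetical.

```python
def family_wise_error_rate(n_tests, alpha=0.05):
    """Chance of at least one false positive across n independent tests at level alpha."""
    return 1.0 - (1.0 - alpha) ** n_tests

def bonferroni(pvalues, alpha=0.05):
    """Flag each p-value as significant only if it passes alpha divided by the test count."""
    threshold = alpha / len(pvalues)
    return [p <= threshold for p in pvalues]

# Hypothetical p-values for four GO terms tested on the same gene list.
pvalues = [0.001, 0.02, 0.04, 0.30]
print(family_wise_error_rate(20))
print(bonferroni(pvalues))
```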

Alternatives to Hierarchical Clustering
GORDER and pre-clustering by SOM
Pre-calling number of clusters: k-means and k-medians
Principal Component Analysis (PCA)

Homework
Download PyMOL
