connections between cs and biology: computing science and biology



  1. connections between cs and biology
     introduction and overview

     computing science and biology (1)
     ● biology is the science of life
     ● progress through observation, experimentation, theory
     ● technology in part drives advances in biology

     example: bacteria
     ● van Leeuwenhoek (1683) discovered that in the white matter between his teeth there were millions of microscopic "animals – more, in fact, than there were human beings in the united Netherlands ... very prettily a-moving"
     ● Lister (1867) linked bacteria with disease
     … today, we have treatments, prevention for many bacterial diseases; appreciation for roles of bacteria in our environment

     example: evolution and genes
     ● Mendel (1865) experimented with pea plants to show inheritance of organism's traits
     ● Avery et al. (1944) established that genes, coded in DNA, carry our hereditary information
     … today, these insights are leading to diagnoses and treatments of genetic diseases

     the more we know, the more we know we don't know
     ● 99% of bacteria are unidentified, since they can't be cultured (grown) in a lab environment
     ● we don't know how many genes we have or what functions are associated with most of these genes

     clues to further understanding lie at the molecular level
     (image: http://www.umaryland.edu/graduate/mcb/images/DNA2-smallest.gif)
     new technologies, including computers, are essential to the study of molecular biology

  2. goal for today
     ● see some roles that computing science plays in advancing research in molecular biology
     ● but first, let's look at some of the molecules in the cell

     our genome
     ● stores our genetic information in DNA molecules
     ● a double-stranded bead necklace with four different kinds of beads (bases, nucleotides): A, C, G, T
     ● beads are paired: A-T, G-C
     ● 3 billion base pairs in each cell
     ● on the order of one hundred trillion cells in an adult body (including bacterial cells, which have their own genomes)
     ● to store the raw information in our cells would require 30 trillion CDs!

     proteins
     ● to keep our body functioning, proteins are constantly manufactured in our cells
     ● proteins are the body's activists: carry blood, digest food, form hair, fingernails, and much more
     ● beads on a necklace, with 20 different bead types (amino acids)
     ● the beads fold into interesting shapes
     ● the shape is key to the function of the molecule
     (image: http://www.nigms.nih.gov/news/science_ed/structlife/)

     genes
     ● genes – segments of our DNA – contain codes for proteins
     ● a codon – three bases of DNA – codes for one amino acid
     ● the genetic code specifies the correspondence between codons and amino acids

     the genetic code
     TTT phenylalanine       TCT serine     TAT tyrosine       TGT cysteine
     TTC phenylalanine       TCC serine     TAC tyrosine       TGC cysteine
     TTA leucine             TCA serine     TAA stop           TGA stop
     TTG leucine             TCG serine     TAG stop           TGG tryptophan
     CTT leucine             CCT proline    CAT histidine      CGT arginine
     CTC leucine             CCC proline    CAC histidine      CGC arginine
     CTA leucine             CCA proline    CAA glutamine      CGA arginine
     CTG leucine             CCG proline    CAG glutamine      CGG arginine
     ATT isoleucine          ACT threonine  AAT asparagine     AGT serine
     ATC isoleucine          ACC threonine  AAC asparagine     AGC serine
     ATA isoleucine          ACA threonine  AAA lysine         AGA arginine
     ATG methionine (start)  ACG threonine  AAG lysine         AGG arginine
     GTT valine              GCT alanine    GAT aspartic acid  GGT glycine
     GTC valine              GCC alanine    GAC aspartic acid  GGC glycine
     GTA valine              GCA alanine    GAA glutamic acid  GGA glycine
     GTG valine              GCG alanine    GAG glutamic acid  GGG glycine
     example: Methionine – Isoleucine – Phenylalanine – Aspartic Acid – Glycine … is coded by ATGATCTTTGACGGG … (as well as by other codes)

     more on genes
     ● by the recent estimates, humans have perhaps as few as 20,000 genes
     ● we share
       – 99.9% of our genome with each other
       – 98% of our genome with chimpanzees
       – 50% of our genome with the roundworm
     ● mutations – changes in the bases of a gene – can cause genetic diseases
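The codon-to-amino-acid lookup described by the genetic code can be sketched in a few lines of Python. This is only an illustration, not part of the original slides: the function name `translate`, the one-letter amino acid codes, and the choice to stop at the first stop codon are assumptions made for the example.

```python
# A minimal sketch of translating DNA with the standard genetic code.
# Assumption: amino acids are written with one-letter codes, "*" marks a stop codon.
BASES = "TCAG"
AMINO = (
    "FFLLSSSSYY**CC*W"   # codons starting with T
    "LLLLPPPPHHQQRRRR"   # codons starting with C
    "IIIMTTTTNNKKSSRR"   # codons starting with A
    "VVVVAAAADDEEGGGG"   # codons starting with G
)

# Build the 64-entry table in the same T, C, A, G order as the amino acid string.
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a DNA string codon by codon, stopping at the first stop codon.

    Any trailing one or two bases that do not form a full codon are ignored.
    """
    protein = []
    for start in range(0, len(dna) - 2, 3):
        amino = CODON_TABLE[dna[start:start + 3].upper()]
        if amino == "*":
            break
        protein.append(amino)
    return "".join(protein)


# M = methionine, I = isoleucine, F = phenylalanine, D = aspartic acid, G = glycine
print(translate("ATGATCTTTGACGGG"))  # prints MIFDG
```

Because several codons map to the same amino acid, the reverse direction is not unique, which is why the same protein fragment can be coded "by other codes" as well.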

  3. challenges in molecular biology
     ● what are our genes?
     ● what are our proteins?
     ● what do these proteins do?
     ● what genes, proteins do other organisms have?
     success in answering these questions will lead to understanding and ultimately to better prevention and cure of diseases

     what do computers provide?
     ● tools to determine genomic sequences
     ● access to data: annotated databases of genomic and protein data
     ● tools for analyzing data: learning what the data means: what is the structure of molecules, where are the genes
     ● tools for visualizing data: enabling visual interpretation of data
     let's see some concrete examples

     example: determining genomic sequences
     ● on 12 April 2003, a group at the BC Cancer Agency's Genome Sciences Centre in Vancouver, led by Caroline Astell, became the first group worldwide to sequence the genomic material of the SARS virus
     ● computer assembly of sequence data was a major part of the effort

     examples: providing access to data
     ● National Center for Biotechnology Information: repository of sequence data, including whole genomes of over 800 organisms
     ● Protein Information Resource: protein databases and analysis tools
       – founded in 1984, building on work of Margaret Dayhoff, who published the first comprehensive "Atlas of Protein Sequence and Structure" and who pioneered development of computer methods for comparing protein sequences
     ● specialized sites for organisms

     example: how bacteria cause disease
     ● "Our laboratory is using computer-based analysis, combined with laboratory experimentation, to gain a better understanding of how some bacteria cause disease." – Fiona Brinkman, SFU, winner of the 2003 B.C. Science Council Young Innovator Award
     ● the genome sequence of a bacterium can be analyzed by computer to gain knowledge about virulent proteins produced by the bacterium
     ● many advantages over traditional approaches to understanding bacteria

     example: how organisms are related
     ● Charles Darwin and his successors relied on comparison of visible traits of organisms to guess at the evolutionary tree
     ● nowadays, DNA of organisms is compared, yielding more reliable trees
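To make the idea of comparing the DNA of organisms concrete, here is a rough Python sketch that counts the fraction of differing positions between sequences. It is an illustration only: the sequences and species names are invented, the sequences are assumed to be already aligned to the same length, and real phylogenetic tools use far more careful models than this simple distance.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Fraction of positions at which two equal-length, aligned sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        raise ValueError("sequences must not be empty")
    return sum(x != y for x, y in zip(a, b)) / len(a)


# Hypothetical toy sequences, invented for this example only.
toy_sequences = {
    "species_a": "ATGATCTTTGACGGG",
    "species_b": "ATGATCTTTGATGGG",
    "species_c": "ATGCTCTTCGATGGA",
}

# Distance for every pair of species.
distances = {
    (m, n): p_distance(toy_sequences[m], toy_sequences[n])
    for m, n in combinations(toy_sequences, 2)
}

# The pair with the smallest distance would be grouped first when building a tree.
closest = min(distances, key=distances.get)
print(closest)  # prints ('species_a', 'species_b')
```

Pairwise distances like these are one possible starting point for grouping organisms into a tree, with the most similar pair joined first.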

  4. example: how organisms are related
     "My research arose from a fascination with the diversity of forms and behaviours of jumping spiders, which led to systematics, which led to phylogenetic theory and computer programming." – Wayne Maddison, Professor and Canada Research Chair, UBC
     ● Wayne and his brother maintain the ‘Tree of Life’ and MacClade websites; MacClade is a tool for analyzing phylogenetic trees.

     what about influence of biology on computing?
     ● viruses, worms have taken on new meanings!
     ● evolution through genetic mutation is successful at "finding good solutions" to nature's "optimisation problems"; similar methods can be used in computations
     ● nature's ways of communication (e.g. ants) are also emulated in computational settings
     ● if DNA is such a remarkable means for information storage, could DNA be used for computing?

     summary
     ● molecular approach to biology, with its associated vast quantities of sequence data, relies on sophisticated computational tools
       – databases
       – visualization and graphics
       – software engineering
       – algorithms
       – human-computer interaction
     ● at the same time, nature has made its mark on computational methods for solving problems

     resources
     ● the structure of life
       – http://www.nigms.nih.gov/news/science_ed/structlife/
     ● sequencing SARS
       – http://www.vanmag.com/0306/sars.html
     ● Fiona Brinkman's lab:
       – http://www.pathogenomics.sfu.ca/brinkman/index.html
     ● Wayne Maddison's ‘Tree of Life’ site:
       – http://tolweb.org/tree/phylogeny.html
     ● UBC’s Bioinformatics Centre BioTeach site:
       – http://www.bioteach.ubc.ca/Bioinformatics/
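The remark that evolution through mutation is good at "finding good solutions", and that similar methods can be used in computations, can be illustrated with a toy mutation-and-selection loop in Python. This is a sketch under stated assumptions, not a method taken from the slides: the target string, the fitness function (number of matching positions), and all function names are invented for the example.

```python
import random

BASES = "ACGT"


def fitness(candidate: str, target: str) -> int:
    """Number of positions where the candidate matches the target."""
    return sum(c == t for c, t in zip(candidate, target))


def mutate(seq: str, rng: random.Random) -> str:
    """Replace one randomly chosen position with a randomly chosen base."""
    i = rng.randrange(len(seq))
    return seq[:i] + rng.choice(BASES) + seq[i + 1:]


def evolve(target: str, generations: int = 2000, seed: int = 0):
    """Keep a single candidate; accept a mutation only if it is no worse."""
    rng = random.Random(seed)
    current = "".join(rng.choice(BASES) for _ in target)
    history = [fitness(current, target)]
    for _ in range(generations):
        child = mutate(current, rng)
        if fitness(child, target) >= fitness(current, target):
            current = child
        history.append(fitness(current, target))
    return current, history


# Hypothetical target, chosen only so that progress is easy to measure.
best, history = evolve("ATGATCTTTGACGGG")
print(best, history[0], history[-1])
```

Since a mutation is kept only when it does not lower the fitness, the score recorded in `history` never decreases. Practical evolutionary algorithms usually add a population of candidates and recombination on top of this basic loop.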
