A Genomics extravaganza
Genomics overview Genomics analysis of the structure and function of very large numbers of genes (often the entire genome) undertaken in a simultaneous fashion. Structural genomics includes the genetic mapping, physical mapping and sequencing of entire genomes. Functional genomics determines the functions of genes and other non-coding parts of the genome. The sequencing of the Human genome will leave many questions unanswered, including the function of most of the estimated 30,000-40,000 human genes. Comparative genomics is the analysis and comparison of genomes from different species. The nature and significance of differences between genomes also provides a powerful tool for determining the relationship between genotype and phenotype through comparative genomics and morphological and physiological studies.
The Genome A genome is all* of a living thing's genetic material. It is the entire set of hereditary instructions for building, running, and maintaining an organism, and passing life on to the next generation. * Mitochondria? Plasmids?
The Genome Bacterial Genome Eukaryotic Genome karyotype Genes VII, Benjamin Lewin Borrelia burgdorferi http://www.genomenewsnetwork.org
The Genome Size and Complexity Genes VII, Benjamin Lewin
The Genome Bacterial Genome Eukaryotic Genome Genes VII, Benjamin Lewin http://ww2.mcgill.ca/biology/undergra/c200a/sec2-3.htm
The Genome C Value and G Value Paradoxs I Value? The high incident of alternative splicing and post-translational chemical modifications may provide 3x the number of proteins per gene in humans versus Fly and Round Worms Genes VII, Benjamin Lewin
The Genome Human 3200 Mb coding sequences comprise less than 5% of the genome repeat sequences account for at least 50% (cot) (1) transposon-derived repeats (2) pesudogenes (3) simple sequence repeats (4) segmental duplications Typed in 10-pitch font, stretches for more than 5,000 miles . (5) blocks of tandemly repeated sequences Repeats are often described as junk but contain: evolutionary record passive markers structural components Martina McGloughlin, UC Davis
Genomics Historical Timeline 1856 Gregor Mendel 1977 Gilbert and Sanger Principles of Heredity DNA sequencing 1983 Kary Mullis Polymerase Chain Reaction 1953 Watson & Crick Molecular Structure of DNA 1986 Leroy Hood 1961 Marshall Nirenberg Automated Sequencer Triplet Code 2001 Human Genome 1970 Hamilton Smith Published Site specific restriction enzyme www.genomenewsnetwork.org
Structural Genomics Sequencing Sequenced Genomes- Over 100 completed Aeropyrum pernix Haemophilus ducreyi Pyrococcus abyssi Agrobacterium tumefaciens Haemophilus influenzae Pyrococcus furiosus Anabaena Halobacterium Pyrococcus horikoshii Anopheles gambiae Helicobacter hepaticus Ralstonia solanacearum Aquifex aeolicus Helicobacter pylori Rhodopirellula baltica Arabidopsis thaliana Homo sapiens Rickettsia conorii Archaeoglobus fulgidus Lactobacillus plantarum Rickettsia prowazekii Bacillus anthracis Lactococcus lactis Rickettsia siberica Bacillus cereus Leptospira interrogans Saccharomyces cerevisiae Bacillus halodurans Listeria innocua Salmonella typhi Bacillus subtilis Listeria monocytogenes Salmonella typhimurium Bacteroides thetaiotaomicron Magnaporthe grisea Schizosaccharomyces pombe Bifidobacterium longum Mesorhizobium loti Shewanella oneidensis Blochmannia floridanus Methanobacterium thermoautotrophicum Shigella flexneria Bordetella bronchiseptica Methanococcoides burtonii Sinorhizobium meliloti Bordetella parapertussis Methanococcus jannaschii Staphylococcus aureus Bordetella pertussis Methanogenium frigidum Staphylococcus epidermidis Borrelia burgdorferi Methanopyrus kandleri Streptococcus agalactiae Bradyrhizobium japonicum Methanosarcina acetivorans Streptococcus mutans Brucella melitensis Methanosarcina mazei Streptococcus pneumoniae Brucella suis Mus musculus Streptococcus pyogenes Buchnera aphidicola Mycobacterium bovis Streptomyces avermitilis Caenorhabditis elegans Mycobacterium leprae Streptomyces coelicolor Campylobacter jejuni Mycobacterium paratuberculosis Sulfolobus solfataricus Caulobacter crescentus Mycobacterium tuberculosis Sulfolobus tokodaii Chlamydia muridarum Mycoplasma gallisepticum Synechococcus Chlamydia trachomatis Mycoplasma genitalium Synechocystis Chlamydophila caviae Mycoplasma penetrans Thermoanaerobacter tengcongensis Chlamydophila pneumoniae Mycoplasma pneumoniae Thermoplasma acidophilum Chlorobium tepidum Mycoplasma pulmonis Thermoplasma volcanium Ciona intestinalis Neisseria meningitidis Thermosynechococcus elongatus Clostridium acetobutylicum Neurospora crassa Thermotagoa maritima Clostridium perfringens Nitrosomonas europaea Treponema pallidum Clostridium tetani Oceanobacillus iheyensis Tropheryma whipplei Corynebacterium efficiens Oryza sativa Ureaplasma urealyticum Coxiella burnetii Pasteurella multocida Vibrio cholerae Deinococcus radiodurans Plasmodium falciparum Vibrio parahaemolyticus Drosophila melanogaster Plasmodium yoelii yoelii Vibrio vulnificus Encephalitozoon cuniculi Prochlorococcus marinus Wigglesworthia glossinidia Enterococcus faecalis Pseudomonas aeruginosa Xanthomonas axonopodis Escherichia coli Pseudomonas putida Xanthomonas campestris Fugu rubripes Pseudomonas syringae Xylella fastidiosa Fusobacterium nucleatum Pyrobaculum aerophilum Yersinia pestis Guillardia theta
Structural Genomics Sequencing Some Notable Sequences http://gnn.tigr.org/sequenced_genomes/genome_guide_p1.shtml
Structural Genomics Sequencing 87 88 89 90 91 92 93 94 95 96 97 1,400,000,000 1,200,000,000 1,000,000,000 800,000,000 600,000,000 400,000,000 200,000,000 0 0 5 10 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85 90 95 100 105 GenBank Release Numbers Growth in GenBank is exponential. http://www.ncbi.nlm.nih.gov/Genbank/GenbankOverview.html
Structural Genomics Human Genome Human Genome Project (HGP) 1990 DOE and NIH provide $3 Billion for 15 year HGP 2 major goals: 1) Map the Human Genome as well as several Model organisms E. Coil Yeast C. Elegens Drosphilia Arabidopsis thaliana 2) Sequence Entire Genomes of these Species Celera Genomics 1998 founded by Applera Corporation and Dr. J. Venter Primary goal to sequence and assemble Human Genome in 3 years
Structural Genomics HGP Mapping 2 Types: Genetic Linkage Maps- Based on Marker Recombination Frequency (inheritance) Physical Linkage Maps- Splicing matching overlapping regions http://www.web-books.com/MoBio/Free/Ch8D1.htm
Structural Genomics HGP Mapping Resolution
Structural Genomics HPG From Map to Sequencing Clone by Clone Technique •Top Down Approach • MAP • Break Genome in to 200,000bp pieces • Make BAC Library and align BAC’s to Map • Make sub-clones of 500+ bp pieces • DNA of sub-clones is Sequenced • Sequenced DNA ordered (contigs) and aligned to MAP to produce highest resolution Genomic map Over sampling is essential http://www.ornl.gov/TechResources/Human_Genome/publicat/tko/04a_img.html
Bacterial Artificial Chromosome Cloning -needs Origin of Replication -selectable marker Ampicilin / LacZ B-galactosidase http://images.google.ca/imgres?imgurl=www.blc.arizona.edu/courses/181gh
Structural Genomics HGP Sequencing Automated Sanger method- Chain Termination/Shotgun sequence read assemble Capillary Electrophoresis http://gnn.tigr.org/whats_a_genome/Chp2_2.shtml
Structural Genomics Celera Sequencing Celera’s Method- Whole Genome Shotgun • Bottom up approach • Genome Randomly Fractured (sonication) • Size selection using Gel Electrophoresis •Pieces up to 800bp inserted in Bacteria • Sequencing • Computer Assembly utilizing overlapping regions Genes VII, Benjamin Lewin
Structural Genomics Assembly
Structural Genomics HPH Versus Celera Whole Genome Shotgun -Faster but more difficult to reassemble? -No Mapping -Less expensive (less labor intensive) -requires large IT resources -works best with existing scaffolding -easier as more genomes are sequenced (homologous DNA) Clone by Clone -Slower, Labor Intensive -allows sequencing to be divided amongst labs/countries -can adapt strategy to specific portions of genome Two genomes differ at about 1 in 1250 bases HGP Celera 0.65% unidentified bases 8.7% Unidentified bases 2.84GBp 2.66Gbp
Structural Genomics Errors Maximum Error Rate for HPG 1 in 10kb -must be smaller than rate of Polymorphisms -allow pseudo genes to be identified Sources of Error -DNA Contamination (from Bacterial Clone) -Repeats causing incorrect assembly -Sampled DNA Fragments are not Random (wgss) -sequencing (read) errors Error Correction -Over-sampling 8x -Comparative Analysis Genes VII, Benjamin Lewin
Structural Genomics New sequencing high volume techniques Dna Chips Pyrosequencing http://www.pyrosequencing.se/
Functional Genomics http://www.ornl.gov/TechResources/Human_Genome/posters/chromosome/chromo20.html
Functional Genomics Genome to Protein Finding Genes Genes can be identified in Genome sequence from common structure (Open Reading Frame) Finding AA Sequence Proper AA sequence can be determined from Triplet-code
Recommend
More recommend