
Uncovering the wealth of grapevine genetic diversity through whole genome sequencing and assembly



  1. Uncovering the wealth of grapevine genetic diversity through whole genome sequencing and assembly. Dario Cantù, Associate Professor and Louis P. Martini Endowed Chair

  2. Grape Genomics: Outline. 1. Why do we need more genome references? 2. What do we need to generate and study more (high quality) grape genomes? 3. Where are we now and where are we going? 4. What do we still need?

  3. Grape Genomics: The grape genome. PN40024 12X V2 (Canaguier et al., 2017). Genome size: 500 ± 62.9 Mbp; repeats: 50 ± 2.4%; genes: 36,283 ± 3,029. [Figure: chromosome sizes (Mbp); assembly size (Mbp), repeat content (%), and number of genes.] Pierozzi and Moura, 2016

  4. Grape Genomics: The first grapevine genomes (2007). Vitis vinifera var. PN40024 (487 Mb; 29,791 protein coding genes); Vitis vinifera cv. Pinot noir ENTAV 115 (504.6 Mb; 29,585 protein coding genes)

  5. Grape Genomics: 1. Why do we need more grape genome references?

  6. Grape Genomics: Thompson Seedless, Corvina, Tannat

  7. Grape Genomics: Categorization of the 1,873 genes not shared with PN40024. Number of genes found in common among Tannat, Corvina, and Pinot Noir (ENTAV 115). [Figure labels: flavonoid biosynthesis; V. vinifera cv. Tannat.] Da Silva C et al. Plant Cell 2013;25:4777-4788

  8. Grape Genomics: [Figure: varieties and traits.] Varieties: Syrah, Muscat, Riesling, Cabernet Sauvignon, Sauvignon blanc, Cabernet Franc. Traits: maturity, WUE, rotundone, TDN, terpenes, methoxypyrazines. Wolkovich et al., 2017 Nature Climate Change

  9. Grape Genomics: 2. What do we need to generate and study more (high quality) grape genomes?

  10. Grape Genomics: The challenge of heterozygosity (Minio et al., 2017)

  11. Grape Genomics: The challenge of heterozygosity. Wild grapes are dioecious. [Images: female and male; Carmona et al., 2008; photos: MA Walker.]
      - Obligate out-crossers, so highly heterozygous
      - High recessive load and strong inbreeding depression
      - Hermaphrodites are rare in the wild since selfing expresses deleterious recessive alleles
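The last bullet on slide 11 rests on a standard population-genetics expectation that is not spelled out on the slide: each generation of selfing halves the expected heterozygosity, so recessive alleles that were masked in heterozygotes become homozygous and are expressed. A minimal sketch, assuming no selection or mutation (the function name is hypothetical):

```python
def heterozygosity_after_selfing(h0, generations):
    """Expected heterozygosity after repeated selfing.

    Assumes no selection, mutation, or outcrossing, so the
    heterozygosity is simply halved in every generation.
    """
    return h0 * 0.5 ** generations


# Start from a fully heterozygous locus (h0 = 1.0) and self for 3 generations.
for generation in range(4):
    print(generation, heterozygosity_after_selfing(1.0, generation))
```

After one generation, half of the originally heterozygous loci are expected to be homozygous, and half of those carry two copies of the recessive allele.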

  12. Grape Genomics: Cultivated varieties are hermaphroditic, but suffer from inbreeding depression. Examples of “famous crosses”. Varieties shown: Pinot, Sémillon, Chardonnay, Riesling, Gouais blanc, Aligoté, Carménère, Sauvignon Blanc, Cabernet Franc, Cabernet Sauvignon, Merlot. Bowers et al., 1999 Science; Bowers and Meredith, 1997 Nat Genetics

  13. Grape Genomics: Cultivated varieties are hermaphroditic, but suffer from inbreeding depression. Forward simulations under a model of recessive selection for three demographic scenarios and two mating systems. [Figure: number of heterozygous deleterious mutations per accession; time points at 30 kya (demographic shift) and 8 kya (onset of clonal propagation); clonal propagation.] Zhou et al., 2017 PNAS

  14. Grape Genomics: The challenge of heterozygosity. First attempts to assemble Cabernet Sauvignon (~2012):
      Seq technology | Assembler | Assembly size | N. contigs | NG50 | LG50
      Illumina PE | SOAPdenovo2 | 631 Mb | 994,414 | 2,325 | 49,584
      Illumina PE | SPAdes | 482 Mb | 245,348 | 7,719 | 15,644
      Illumina PE + PacBio (5x) | Celera | 574 Mb | 154,787 | 24,598 | 5,591

  15. Grape Genomics: 2. What do we need to generate and study more (high quality) grape genomes?
      a) Better sequencing technologies: longer reads (single molecule real time, nanopore), optical maps, and long-range scaffolding/phasing
      b) Methods to extract pure and high-molecular-weight DNA from grape tissue
      c) Assembly algorithms that enable the assembly of highly contiguous diploid genomes

  16. Grape Genomics: SMRT sequencing of Cabernet Sauvignon FPS Clone 08. 20 kb & 30 kb libraries; P6-C4 chemistry; PacBio RSII; 74 cells; mean read length: 10.7 kbp; ~140X coverage. Assembly with FALCON-unzip and Quiver into diploid contigs (primary + haplotigs). Chin et al., 2016 Nature Methods
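The figures on slide 16 can be cross-checked with simple coverage arithmetic (coverage = total sequenced bases / genome size). The sketch below assumes that the ~140X was computed against the 480 Mb genome size used for the NG values elsewhere in the deck; the slide itself does not say so, and the variable names are illustrative.

```python
# Values from slide 16; genome_size_bp is an assumption (480 Mb, the size
# used for NG values later in the deck), not stated on this slide.
genome_size_bp = 480e6
coverage = 140
mean_read_bp = 10_700
cells = 74

total_bases = coverage * genome_size_bp      # bases needed for ~140X
n_reads = total_bases / mean_read_bp         # approximate number of reads
gb_per_cell = total_bases / cells / 1e9      # implied yield per SMRT cell

print(f"total bases: {total_bases / 1e9:.1f} Gb")
print(f"reads:       {n_reads / 1e6:.2f} M")
print(f"per cell:    {gb_per_cell:.2f} Gb")
```

Under that assumption the run corresponds to roughly 67 Gb of sequence, or a little under 1 Gb per cell.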

  17. Grape Genomics: Common assemblers: contigs break at loci of heterozygous structural variation. Falcon: haploid consensus. Falcon-unzip. [Figure label: SNPs.] Chin et al., 2016 Nature Methods

  18. Grape Genomics: Sauvignon Blanc × Cabernet Franc → Cabernet Sauvignon (Bowers and Meredith, Nat Genetics 1997)

  19. Grape Genomics: FALCON-unzip assembly of Cabernet Sauvignon.
      Metric | Primary contigs | Haplotigs
      # of contigs | 718 | 2,609
      Assembly size (Mb) | 591 | 372
      N50 size (Mb) | 2.17 | 0.768
      [Figure: cumulative contig size (Mbp) relative to genome size (480 Mb), for primary contigs and haplotigs.]
      Seq technology | Assembler | Assembly size | N. contigs | NG50 | LG50
      Illumina PE | SOAPdenovo2 | 631,320,289 | 994,414 | 2,325 | 49,584
      Illumina PE | SPAdes | 481,817,163 | 245,348 | 7,719 | 15,644
      Illumina PE + PacBio (5x) | Celera | 573,589,710 | 154,787 | 24,598 | 5,591
      PacBio (20 + 30 kb lib) | FALCON-unzip | 590,964,935 | 718 | 2,767,687 | 53
      NG values are computed on a 480 Mb genome size.
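The NG50 and LG50 columns on slide 19 differ from N50 and L50 only in the reference length: N50 uses half of the assembly size, while NG50 uses half of an assumed genome size (480 Mb on the slide). A minimal sketch of the usual definition, with a hypothetical function name and toy contig lengths rather than the real assemblies:

```python
def n50_stats(contig_lengths, genome_size=None):
    """Return (N50, L50), or (NG50, LG50) when genome_size is given.

    Contigs are sorted from longest to shortest and accumulated until
    half of the reference length is covered. The length of the contig
    that crosses the threshold is N50/NG50; the number of contigs used
    is L50/LG50.
    """
    lengths = sorted(contig_lengths, reverse=True)
    reference = genome_size if genome_size is not None else sum(lengths)
    target = reference / 2
    running = 0
    for count, length in enumerate(lengths, start=1):
        running += length
        if running >= target:
            return length, count
    # The assembly covers less than half of the genome size.
    return None, None


toy_contigs = [50, 40, 30, 20, 10]               # hypothetical lengths
print(n50_stats(toy_contigs))                    # N50, L50 on assembly size
print(n50_stats(toy_contigs, genome_size=200))   # NG50, LG50 on genome size
```

Using a fixed genome size makes assemblies of different total length comparable, which is why the slide reports NG values for all four assemblies.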

  20. Grape Genomics: Optical maps (phasing and scaffolding); alignment to PN40024 and pseudomolecules.
      Assembly | Size (Mb) | N. seqs | N50 (L50) | N90 (L90)
      Contigs | 591 | 718 | 2.17 Mb (72) | 0.42 Mb (300)
      Scaffolds (i) | 592 | 330 | 9.4 Mb (21) | 0.96 Mb (87)
      Scaffolds (ii) | 592 | 246 | 11.3 Mb (19) | 1.7 Mb (66)
      Scaffolds (iv) | 559 | 182 | 11.8 Mb (18) | 2.8 Mb (52)
      Hyperscaffolds ALT1 | 443 | 56 | 16.5 Mb (11) | 6.9 Mb (26)
      Hyperscaffolds ALT2 | 330 | 33 | 14 Mb (10) | 6.1 Mb (23)

  21. Grape Genomics: [Figure: Cabernet Sauvignon ALT1 pseudomolecules (Chr_01 ALT1 to Chr_19 ALT1) aligned to PN40024 V2 chromosomes (Chr_01 to Chr_19).]

  22. Grape Genomics: 68% of the whole genome is phased in two haplotypes.
      Assembly | Size (Mb) | N. chr | N. contigs | Gaps (Mb)
      ALT_1 | 455.7 | 19 | 525 | 14.2 (3.1%)
      ALT_2 | 310.0 | 19 | 422 | 13.5 (4.3%)
      Protein coding sequences | ALT1 | ALT2
      Number of CDS | 29,294 | 16,806
      Mean CDS length (kb) | 1.2 | 1.2
      Mean number of exons / CDS | 5 | 5
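One reading of the 68% figure on slide 22, which is an assumption since the slide does not state how it was derived, is the ratio of the ALT_2 to the ALT_1 pseudomolecule size: the second haplotype is only assembled where the two haplotypes could be separated.

```python
# Sizes in Mb, taken from the table on slide 22.
alt1_mb = 455.7
alt2_mb = 310.0

# Assumed definition: share of ALT_1 that has a phased ALT_2 counterpart.
phased_fraction = alt2_mb / alt1_mb
print(f"phased fraction: {phased_fraction:.0%}")

# The same ratio for annotated coding sequences, for comparison.
cds_fraction = 16_806 / 29_294
print(f"CDS with a second allele: {cds_fraction:.0%}")
```

The size ratio reproduces the 68% on the slide; the lower CDS ratio is shown only for comparison and is not a figure the slide reports.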

  23. Grape Genomics: Structural comparison between homologous chromosomes. [Figure: Cabernet Sauvignon (CS) Chr_08 ALT1 vs. Chr_08 ALT2 (chromosome 08) and Chr_09 ALT1 vs. Chr_09 ALT2 (chromosome 09).]

  24. Grape Genomics: 3. Where are we and where are we going?

  25. Grape Genomics: Cabernet Sauvignon (Myles et al., 2011 PNAS)

  26. Grape Genomics: Optimization of SMRT sequencing and FALCON assembly. Different coverage (5x - 115x) and assembly parameters. Carmenere: 616 assemblies. [Figure: contig length (Mbp); NG values relative to 480 Mb genome size.]

  27. Grape Genomics: Expanding the gene space. Genotypes added cumulatively (1 to 6 genotypes): PN40024.gV2.aV3, + Corvina, + Tannat, + Cab. Sauv, + Chardonnay, + Zinfandel. [Figure: number of alleles (axis from 30,000 to 480,000) and composition (%, axis from 90 to 100).] Categories: core (shared by all); variable (shared by at least 2 genotypes but not all); unique (only in one genotype)
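The three categories on slide 27 follow directly from a presence/absence table of genes across genotypes. A minimal sketch of that classification, assuming a simple mapping from gene to the set of genotypes carrying it; the function name and toy gene names are hypothetical and the deck does not describe the actual pipeline:

```python
from collections import Counter


def classify_gene_space(presence):
    """Label each gene as core, variable, or unique.

    presence maps a gene identifier to the set of genotypes in which
    it was found. Core genes occur in every genotype, unique genes in
    exactly one, and variable genes in at least two but not all.
    """
    all_genotypes = set().union(*presence.values())
    labels = {}
    for gene, genotypes in presence.items():
        if len(genotypes) == len(all_genotypes):
            labels[gene] = "core"
        elif len(genotypes) >= 2:
            labels[gene] = "variable"
        else:
            labels[gene] = "unique"
    return labels


# Toy presence/absence data with three of the genotypes named on the slide.
toy_presence = {
    "geneA": {"PN40024", "Corvina", "Tannat"},
    "geneB": {"Corvina", "Tannat"},
    "geneC": {"Tannat"},
}
toy_labels = classify_gene_space(toy_presence)
print(toy_labels)
print(Counter(toy_labels.values()))
```

As more genotypes are added, a gene can only move from core toward variable, which is consistent with the core share shrinking in the composition plot.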

  28. Grape Genomics: Genomic structural variation.
      Metric | SNPs + indels* | SVs*
      N. variants in V. vinifera ssp. sylvestris | 5.36 M | 0.21 M
      N. variants in V. vinifera ssp. vinifera | 4.91 M | 0.19 M
      Total size | 7.44 Mbp | 14.28 Mbp
      * relative to Chardonnay

  29. Grape Genomics: What’s Next? North American Vitis; Southwest Vitis. Species shown: V. girdiana, V. flexuosa, V. treleasei, V. rupestris, V. riparia, V. acerifolia, V. arizonica, V. blancoi, V. bloodworthiana, V. californica, V. monticola, V. labrusca, V. vulpina, V. aestivalis, V. cinerea, V. biformis, V. palmata, V. shuttleworthii, V. mustangensis, V. nesbittiana, V. popenoei, V. rotundifolia. [Figure label: 18 my.] Wan et al. BMC Evol. Biol., 2013. RESEARCH-PGR #1741627

  30. Grape Genomics: [Figure: loci and their sources.] Loci: Ren1, Ren2, Ren3, Ren4, Ren5, Ren6, Ren7, Run1, Run2. Sources shown: V. romanetii, V. sylvestris/vinifera, V. cinerea x V. rupestris, American sp. (`Regent`), V. piasezkii, M. rotundifolia

  31. Grape Genomics

  32. Grape Genomics:
      Species | Disease | Isolates sequenced | Assembly size (Mb) | Number of contigs | N50 (Kb) | Number of genes | Citation
      Eutypa lata | Eutypa dieback | 11 | 54.5 | 10 | 6,542.31 | 15,313 | Blanco-Ulate et al., 2013a; Morales-Cruz et al., 2015
      Neofusicoccum parvum | Botryosphaeria dieback | 16 | 43.7 | 27 | 2,555.42 | 13,124 | Blanco-Ulate et al., 2013b; Morales-Cruz et al., 2015; Massonnet et al., 2016
      Phaeoacremonium minimum | Esca complex | 5 | 47.3 | 24 | 5,520.70 | 14,790 | Blanco-Ulate et al., 2013c; Morales-Cruz et al., 2015; Massonnet et al., 2018
      Phaeomoniella chlamydospora | Esca complex | 2 | 27.5 | 702 | 178.60 | 6,986 | Morales-Cruz et al., 2015
      Diplodia seriata | Botryosphaeria dieback | 1 | 37.1 | 695 | 304.20 | 9,398 | Morales-Cruz et al., 2015
      Diaporthe ampelina | Phomopsis dieback | 1 | 47.4 | 2,392 | 132.30 | 10,801 | Morales-Cruz et al., 2015
      Erysiphe necator | Powdery mildew | 5 | 52.5 | 5,936 | 21.4 | 6,533 | Jones et al., 2014

  33. Grape Genomics: The vineyard metagenome (Dr. Abraham Morales-Cruz). meta-RNA-seq; meta-DNA-seq; multi-species reference for RNA-seq read mapping. Morales-Cruz et al., 2017 Mol Plant Pathology
