
Population-based detection of Structural Variants in normal and aberrant genomes



  1. Population-based detection of Structural Variants in normal and aberrant genomes. Jean Monlong, PhD2, Guillaume Bourque’s group. Research Day, June 5, 2014, Human Genetics Dept.

  2. What is structural variation? Genetic variation involving more than 500 bp. Abbreviations: Structural Variant (SV); Copy Number Variation (CNV). Figure credits: Baker 2012, Nature Methods; Raphael Lab, Brown University.

  3. Why is it important? ◮ Major role in evolution. ◮ Population genetics: widespread variation across humans. ◮ Association with diseases and cancer. SV detection using High-Throughput Sequencing: ◮ The sample is sequenced. ◮ Reads are mapped to the reference genome. ◮ Unexpected mapping patterns can be explained by the presence of SVs.

  4. SV detection using High-Throughput Sequencing. [Figure from Baker 2012, Nature Methods.]

  5. Limitation: low mappability. ◮ Noisy or reduced signal in repeat-rich regions, centromeres, telomeres. ◮ Unpredictable segmentation → reduced sensitivity/specificity. ◮ Filtering problematic regions reduces the genome range tested. [Figures: number of reads mapped per genomic window.]

  6. Objective: test the entire genome, including low-mappability regions, and detect subtle abnormal coverage. PopSV, a population-based approach: use a set of reference experiments to detect abnormal patterns. [Figure: number of reads mapped per genomic window, tested sample against reference samples.]

  7. PopSV: Population-based approach. [Figure: number of reads mapped per genomic window, tested sample against reference samples.] Workflow: 1. The genome is divided into bins. 2. Reads in each bin are counted, for each sample. 3. The bin counts are normalized. 4. Each bin of each sample is tested for divergence from the reference samples (Z-score). 5. P-values are estimated and corrected for multiple testing.
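
The workflow steps on slide 7 can be illustrated with a toy sketch. This is not the PopSV implementation: the total-count normalization, the normal approximation for the p-values, and the Benjamini-Hochberg correction are assumptions chosen for simplicity, and all function names and the synthetic data are hypothetical.

```python
import math
from statistics import mean, stdev

def normalize(counts):
    """Scale bin counts to a total of 1e6 (assumed stand-in for the normalization step)."""
    total = sum(counts)
    return [c * 1e6 / total for c in counts]

def z_scores(sample, references):
    """Z-score of each bin of `sample` against the same bin in the reference samples."""
    scores = []
    for i, value in enumerate(sample):
        ref_values = [ref[i] for ref in references]
        mu, sd = mean(ref_values), stdev(ref_values)
        scores.append((value - mu) / sd if sd > 0 else 0.0)
    return scores

def two_sided_p(z):
    """Two-sided p-value under a standard normal null (an assumption)."""
    return math.erfc(abs(z) / math.sqrt(2))

def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    n = len(pvals)
    order = sorted(range(n), key=lambda i: pvals[i])
    adjusted = [0.0] * n
    running_min = 1.0
    for rank in range(n, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * n / rank)
        adjusted[i] = running_min
    return adjusted

# Synthetic data: 50 bins, 6 reference samples with different depths and small noise.
base = [100 + 10 * (i % 5) for i in range(50)]
references = [
    [round(b * (1 + 0.1 * s) * (1 + ((7 * i + 3 * s) % 5 - 2) * 0.02))
     for i, b in enumerate(base)]
    for s in range(6)
]
# Tested sample: same profile, with bin 20 at 1.5x coverage (a simulated gain).
sample = [round(b * 1.25 * (1.5 if i == 20 else 1.0)) for i, b in enumerate(base)]

norm_refs = [normalize(ref) for ref in references]
z = z_scores(normalize(sample), norm_refs)
pvals = [two_sided_p(score) for score in z]
adjusted = benjamini_hochberg(pvals)
flagged = [i for i, q in enumerate(adjusted) if q < 0.05]
print(flagged)
```

Because each bin is compared with the same bin in the reference samples, a region with systematically low or noisy coverage is only flagged when the tested sample departs from what the references show there.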

  8. CageKid: Renal Cell Cancer. Whole-Genome Sequencing of 100 individuals, ∼40X coverage, Illumina paired-end 100 bp, paired normal and tumor samples. ◮ Normal samples → reference samples. ◮ 10 kb bins. ◮ Only properly paired and mapped read pairs are counted. Validation and benchmark: ◮ Are germline events also detected in the tumor samples? ◮ Are calls concordant with SNP-array calls? ◮ Twin dataset: are calls concordant with the pedigree? ◮ Are calls concordant when using different bin sizes? PopSV detected more concordant calls than other methods.
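
As an illustration of the counting setup on slide 8 (10 kb bins, properly paired and mapped reads only), here is a minimal sketch. The FLAG bit values come from the SAM specification; the function names, the in-memory `(chrom, pos, flag)` tuples, and the choice to skip secondary and supplementary alignments are assumptions for the example, not details taken from the slides.

```python
from collections import Counter

BIN_SIZE = 10_000  # 10 kb bins, as on the slide

# SAM FLAG bits (SAM specification)
PAIRED = 0x1
PROPER_PAIR = 0x2
UNMAPPED = 0x4
SECONDARY = 0x100
SUPPLEMENTARY = 0x800

def is_properly_paired(flag):
    """True for a mapped, primary read that belongs to a properly paired read pair."""
    if flag & (UNMAPPED | SECONDARY | SUPPLEMENTARY):
        return False
    return bool(flag & PAIRED) and bool(flag & PROPER_PAIR)

def count_reads_per_bin(reads, bin_size=BIN_SIZE):
    """Count properly paired reads per (chromosome, bin index).

    `reads` is an iterable of (chrom, pos, flag) with 1-based positions,
    so bin 0 covers positions 1 to bin_size.
    """
    counts = Counter()
    for chrom, pos, flag in reads:
        if is_properly_paired(flag):
            counts[(chrom, (pos - 1) // bin_size)] += 1
    return counts

# Hypothetical reads for illustration.
reads = [
    ("chr1", 1, 99),       # paired, proper pair, first in pair
    ("chr1", 9_999, 147),  # paired, proper pair, second in pair
    ("chr1", 10_001, 99),  # falls in the second bin
    ("chr1", 10_500, 97),  # paired but not a proper pair: skipped
    ("chr1", 12_000, 4),   # unmapped: skipped
]
bin_counts = count_reads_per_bin(reads)
print(dict(bin_counts))
```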

  9. Example: Partial tumoral event. [Figure: read coverage against position (100.75 to 101.00 Mb) for tumor sample D000GMU compared with the normal samples; bins are marked abnormal or normal.] Chr. 1, overlapping the CDC14A gene (cell division cycle); not detected by other approaches.

  10. Example: Telomeric region. [Figure: read coverage against position (135.11 to 135.15 Mb) for normal sample D000GQ9 compared with the other normal samples; bins are marked abnormal or normal.] Chr. 10, overlapping genes (PRAP1, CALY); not detected by other approaches.

  11. PopSV flexibility. Custom binning (repeat annotation): ◮ Increased resolution in regions of interest. ◮ Promising results: enrichment in centromeres/telomeres. Counting discordant reads: ◮ Detect an excess of discordant reads. ◮ Promising results, including on repeats.
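
The two extensions on slide 11 can be combined in a small sketch: counting discordant reads in bins of unequal width. The slides do not define "discordant"; the definition used here (paired, both mates mapped, not flagged as a proper pair) is an assumption, as are the function names and the example intervals and reads.

```python
from bisect import bisect_right

# SAM FLAG bits (SAM specification)
PAIRED, PROPER_PAIR, UNMAPPED, MATE_UNMAPPED = 0x1, 0x2, 0x4, 0x8

def is_discordant(flag):
    """Assumed definition: paired read, both mates mapped, not a proper pair."""
    return (bool(flag & PAIRED)
            and not flag & (UNMAPPED | MATE_UNMAPPED)
            and not flag & PROPER_PAIR)

def count_in_custom_bins(positions, bins):
    """Count positions falling in each bin.

    `bins` is a sorted list of non-overlapping, half-open (start, end) intervals;
    positions outside every bin are ignored.
    """
    starts = [start for start, _ in bins]
    counts = [0] * len(bins)
    for pos in positions:
        i = bisect_right(starts, pos) - 1  # last bin starting at or before pos
        if i >= 0 and pos < bins[i][1]:
            counts[i] += 1
    return counts

# Hypothetical custom bins: narrow bins over a region of interest (e.g. an annotated repeat).
bins = [(0, 10_000), (10_000, 10_500), (10_500, 11_000), (11_000, 20_000)]
reads = [
    (500, 99),      # proper pair
    (10_100, 97),   # discordant
    (10_200, 65),   # discordant
    (10_700, 81),   # discordant
    (10_800, 73),   # mate unmapped: not counted
    (15_000, 147),  # proper pair
    (25_000, 97),   # discordant, but outside every bin
]
discordant_positions = [pos for pos, flag in reads if is_discordant(flag)]
discordant_counts = count_in_custom_bins(discordant_positions, bins)
print(discordant_counts)
```

The resulting per-bin counts could then be tested against the reference samples in the same way as read-depth counts.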

  12. Conclusion. Robust and sensitive approach: ◮ Detection in low-mappability regions and of partial tumoral signal. ◮ Superior to other Read-Depth methods. ◮ Wider range of the genome tested. Work in progress: ◮ Explore results and application to other projects (e.g. Pan-Cancer Analysis of Whole Genomes). ◮ Custom binning: repeat annotation, Whole-Exome Sequencing. ◮ More than a CNV caller. ◮ Excess of discordant read pairs. ◮ Combination with orthogonal approaches (PEM, Assembly).

  13. Acknowledgment: ◮ Guillaume Bourque ◮ Mathieu Bourgey ◮ Simon Gravel ◮ Louis Letourneau ◮ Mathieu Blanchette ◮ Francois Lefebvre ◮ Mehran Karimzadeh Reghbati ◮ Eric Audemard ◮ Toby Hocking
