
Inferring Linkage Information from Sequencing Chromatograms



  1. Inferring Linkage Information from Sequencing Chromatograms. Bastian Beggel, Max-Planck-Institute for Informatics, Saarbrücken

  2. At the limits of Sanger sequencing data. YMDD Motif of HBV: two resistance mutations (AG, GT). Are the resistance mutations on the same strain?

  3. What is the mixture composition? YMDD Motif of HBV.

     | Clone | Position 610 | Position 612 | Comment   | Proportion |
     |-------|--------------|--------------|-----------|------------|
     | 1     | WT           | WT           | Wild type | 25.0%      |
     | 2     | WT           | RAM          | M204V     | 25.0%      |
     | 3     | RAM          | WT           | M204I     | 32.5%      |
     | 4     | RAM          | RAM          | M204V     | 17.5%      |

  4. Outline: 1. Introduction 2. Related Work 3. Model 4. Model Evaluation

  5. Sanger Sequencing (chain terminator sequencing)
     • PCR creates fragments of different length by incorporating chain terminators
     • The four dideoxy-nucleotide chain terminators are labeled with different fluorescent dyes
     Figure: separation by size (e.g. capillary electrophoresis), laser, detector, chromatogram
     Takeaway: polymerases show context-dependent incorporation of dideoxy-nucleotides
     Source: http://www.daviddarling.info/encyclopedia/D/DNA_sequencing.htm

  6. Early chromatograms showed unequal peak heights. Figure: chromatograms from Kwok et al. 1994

  7. Sequence context-dependent incorporation of dideoxy-nucleotides. Figure: DNA Polymerase. Source: http://spine.rutgers.edu/cellbio/assets/flash/dnapol.htm

  8. The peak heights of a mixture are the proportion-weighted sum of the peak heights of the underlying clonal variants. Data from Carr et al. 2009: peak heights of a dilution series

  9. Peak height profiles vary significantly for different mixtures. Artificial experiment: mixture 1 + 4 (left), mixture 2 + 3 (right)

  10. Outline: 1. Introduction 2. Related Work 3. Model 4. Model Evaluation

  11. The observed chromatogram is the sum of all single-molecule fluorescence impulses. The polymerase processes DNA at the single-molecule level. Observed chromatogram: 50% C1, 50% C2, 0% C3, 0% C4. Source: http://spine.rutgers.edu/cellbio/assets/flash/dnapol.htm

  12. The linear model assumption is combined with a Gaussian error model
      Input Data:
      • Peak heights of the query chromatogram, $h_i$
      • Peak heights of the clonal variants, $p_{ji}$
      Conditional Distribution:
      • $\gamma_{B[i]} \cdot h_i \mid M, \alpha, \gamma_{B[i]} \sim \mathcal{N}(m_i, \sigma)$ with $m_i = \sum_j \alpha_j \cdot p_{ji}$
      Assumption:
      • $\sigma = \text{const.}$
      Marginal Likelihood:
      • $P(D \mid M) = \int P(D \mid M, \alpha) \, P(\alpha) \, d\alpha$
      Model Selection:
      • With uniform priors for $P(\alpha)$ and $P(M)$

  13. Outline: 1. Introduction 2. Related Work 3. Model 4. Model Evaluation

  14. The accuracy of the model predictions depends on the distance between the two ambiguous positions. 1 base in-between: Correct 93%, Incorrect 0%, Uncertain 7%. 3 bases in-between: Correct 55%, Incorrect 19%, Uncertain 26%

  15. Conclusions
      Findings:
      • Sequence context-dependent incorporation provides linkage information
      • Peak height profiles of mixtures can be computed
      Predictions:
      • Proportion estimates
      • Reconstruction of linkage
      Limitations:
      • Limited accuracy
      • Limited range
      • Profiles of the clonal variants are required

  16. Thank you for your attention • Thomas Lengauer • Alex Thielen • Maria Neumann-Fraune • Rolf Kaiser

  17. End

  18. Subsequent improvements led to almost equal peak heights
      Figure panels: Chain Terminators (Lee LG et al. 1992), Polymerase (Ying Li et al. 1999), Protocol and Sequencer (ABI 3100 sequencer)
      • Chain terminator sequencing has required 15 years of research
      • Different companies use different materials/protocols
      • Context-dependent incorporation of dideoxy-nucleotides was seen as a burden
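
The linear mixture assumption from slide 8 and the model selection step from slide 12 can be illustrated with a minimal Python sketch. This is not the author's implementation: the peak-height values, the noise level `sigma` (treated here as a standard deviation), and all function names are hypothetical, and the base-specific scaling factor γ is assumed to be 1. The sketch simulates a 50/50 mixture of clones 1 and 4, then compares the two-clone models {1, 4} and {2, 3} by their marginal likelihood, integrating the proportion α over a uniform prior on a grid.

```python
import numpy as np

# Hypothetical peak heights (rows: clones 1-4, columns: five peaks around
# positions 610-612). These numbers are made up for illustration only.
CLONE_PROFILES = np.array([
    [1.0, 0.0, 0.9, 1.0, 0.0],  # clone 1
    [1.1, 0.0, 0.9, 0.0, 0.7],  # clone 2
    [0.0, 0.8, 1.2, 1.1, 0.0],  # clone 3
    [0.0, 0.9, 1.2, 0.0, 0.6],  # clone 4
])


def mixture_profile(alpha, profiles):
    """Linear model: m_i = sum_j alpha_j * p_ji."""
    return np.asarray(alpha, dtype=float) @ np.asarray(profiles, dtype=float)


def log_likelihood(h, alpha, profiles, sigma):
    """Gaussian log-likelihood of observed peak heights h (gamma assumed 1)."""
    resid = np.asarray(h, dtype=float) - mixture_profile(alpha, profiles)
    return (-0.5 * np.sum(resid ** 2) / sigma ** 2
            - resid.size * np.log(sigma * np.sqrt(2.0 * np.pi)))


def two_clone_evidence(h, profiles, sigma, grid_size=201):
    """Log marginal likelihood of a two-clone model under a uniform prior
    on alpha in [0, 1], using the trapezoidal rule on a grid.
    Also returns the grid value of alpha with the highest likelihood."""
    grid = np.linspace(0.0, 1.0, grid_size)
    ll = np.array([log_likelihood(h, [a, 1.0 - a], profiles, sigma)
                   for a in grid])
    peak = ll.max()                      # subtract the peak to avoid underflow
    w = np.exp(ll - peak)
    integral = np.sum((w[1:] + w[:-1]) / 2.0) * (grid[1] - grid[0])
    return peak + np.log(integral), grid[np.argmax(ll)]


sigma = 0.02  # assumed constant noise level (hypothetical value)

# Simulated, noise-free query chromatogram: 50% clone 1 + 50% clone 4.
query = mixture_profile([0.5, 0.5], CLONE_PROFILES[[0, 3]])

lml_14, alpha_14 = two_clone_evidence(query, CLONE_PROFILES[[0, 3]], sigma)
lml_23, alpha_23 = two_clone_evidence(query, CLONE_PROFILES[[1, 2]], sigma)

print("log evidence, model {1,4}:", lml_14, "alpha estimate:", alpha_14)
print("log evidence, model {2,3}:", lml_23, "alpha estimate:", alpha_23)
print("log posterior odds ({1,4} vs {2,3}):", lml_14 - lml_23)
```

With equal prior probability for both models, the difference of the two log marginal likelihoods is the log posterior odds. Under these made-up profiles the {1, 4} model is preferred, and the grid estimate of α recovers the simulated proportion.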
