
Using DNA from many samples to distinguish pedigree relationships of close relatives

Amy L. Williams (@amythewilliams), February 24, 2020, Family History Technology Workshop


  1. Using DNA from many samples to distinguish pedigree relationships of close relatives. Amy L. Williams (@amythewilliams). February 24, 2020, Family History Technology Workshop.

  2. Massive datasets: Many close relatives / small pedigrees. Dataset sizes shown: >100,000 samples; >9 million samples; ~500,000 samples; >14 million samples. In a dataset with $n$ individuals, have $\binom{n}{2} = \frac{n(n-1)}{2} = \mathcal{O}(n^2)$ pairs.
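
To give a sense of scale, the pair count n(n-1)/2 can be evaluated for the dataset sizes quoted on this slide. This is only an illustrative sketch: the function name `num_pairs` is made up here, and the sizes are the slide's rounded lower bounds, not exact cohort counts.

```python
from math import comb


def num_pairs(n: int) -> int:
    """Number of unordered pairs among n individuals: n(n-1)/2."""
    return comb(n, 2)


# Rounded dataset sizes as quoted on the slide.
for n in (100_000, 500_000, 9_000_000, 14_000_000):
    print(f"{n:>12,} samples -> {num_pairs(n):,} pairs")
```

Even the smallest of these datasets yields roughly five billion pairs, which is why most of those pairs can only be screened cheaply.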

  3. Goal: detect and reconstruct pedigrees using only DNA …

  4. Signal: Identical by descent (IBD) sharing
     • Close (and some distant) relatives share large regions identical by descent (IBD)
       – Represented here as same color
     • Each generation, parents transmit a random ½ of their genome to children
       → Relatives separated by $M$ generations share an average of $1/2^M$ of their genome
     • Average IBD sharing fractions:
       – Full siblings: 50%, Aunt-nephew: 25%, First cousins: 12.5%
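
The halving rule on this slide can be checked against the three averages it lists. The sketch below assumes the exponent corresponds to the degree of relatedness (1 for full siblings, 2 for aunt-nephew, 3 for first cousins); that mapping is an interpretation chosen to match the slide's numbers, and the function name is hypothetical.

```python
def expected_ibd_fraction(degree: int) -> float:
    """Average fraction of the genome shared IBD: (1/2) ** degree."""
    return 0.5 ** degree


# Degree assignments are an assumption chosen to match the slide's averages.
examples = {"full siblings": 1, "aunt-nephew": 2, "first cousins": 3}
for name, degree in examples.items():
    print(f"{name}: {expected_ibd_fraction(degree):.1%}")
```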

  5. Second degree relatives: All share ~25% of genome IBD. Types shown: grandparent-grandchild (GP), avuncular (AV), half-sibling (HS). → Difficult to distinguish using only data from the pairs

  6. IBD sharing rates for these relationships heavily overlap

  7. Idea: analyze IBD sharing of pair to other relatives

  8. CREST: Classification of Relationship Types (Ying Qiao, Jens Sannerud)

  9. Approach: ratios of IBD sharing in three samples versus two. $R_1 = \frac{\mathrm{Length}(\mathrm{IBD}(x_1, y) \cap \mathrm{IBD}(x_2, y))}{\mathrm{Length}(\mathrm{IBD}(x_1, y))}$, $R_2 = \frac{\mathrm{Length}(\mathrm{IBD}(x_1, y) \cap \mathrm{IBD}(x_2, y))}{\mathrm{Length}(\mathrm{IBD}(x_2, y))}$. For GP, expect $R_1 = 1/4$, $R_2 = 1$. [Pedigree diagram with samples $x_1$, $x_2$, $y$] (Ying Qiao)

  10. Approach: ratios of IBD sharing in three samples versus two. $R_1 = \frac{\mathrm{Length}(\mathrm{IBD}(x_1, y) \cap \mathrm{IBD}(x_2, y))}{\mathrm{Length}(\mathrm{IBD}(x_1, y))}$, $R_2 = \frac{\mathrm{Length}(\mathrm{IBD}(x_1, y) \cap \mathrm{IBD}(x_2, y))}{\mathrm{Length}(\mathrm{IBD}(x_2, y))}$. For GP, expect $R_1 = 1/4$, $R_2 = 1$. For AV, expect $R_1 = 1/4$, $R_2 = 1/2$. [Pedigree diagram with samples $x_1$, $x_2$, $y$] (Ying Qiao)

  11. Approach: ratios of IBD sharing in three samples versus two. $R_1 = \frac{\mathrm{Length}(\mathrm{IBD}(x_1, y) \cap \mathrm{IBD}(x_2, y))}{\mathrm{Length}(\mathrm{IBD}(x_1, y))}$, $R_2 = \frac{\mathrm{Length}(\mathrm{IBD}(x_1, y) \cap \mathrm{IBD}(x_2, y))}{\mathrm{Length}(\mathrm{IBD}(x_2, y))}$. For GP, expect $R_1 = 1/4$, $R_2 = 1$. For AV, expect $R_1 = 1/4$, $R_2 = 1/2$. For HS, expect $R_1 = 1/2$, $R_2 = 1/2$. [Pedigree diagram with samples $x_1$, $x_2$, $y$] (Ying Qiao)
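
The two ratios on these slides divide the length of the overlap between the IBD regions each member of the pair shares with a third relative by the length of each member's own IBD regions with that relative. A minimal sketch of that arithmetic follows. It assumes IBD regions are given as (start, end) intervals on a single chromosome; the function names and toy coordinates are invented for illustration and are not taken from CREST itself.

```python
from typing import List, Tuple

Segment = Tuple[float, float]  # (start, end) on one chromosome, e.g. in cM


def merge(segments: List[Segment]) -> List[Segment]:
    """Sort segments and merge any that overlap or touch."""
    merged: List[Segment] = []
    for start, end in sorted(segments):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def intersect(a: List[Segment], b: List[Segment]) -> List[Segment]:
    """Regions covered by both segment lists."""
    a, b = merge(a), merge(b)
    out: List[Segment] = []
    i = j = 0
    while i < len(a) and j < len(b):
        lo, hi = max(a[i][0], b[j][0]), min(a[i][1], b[j][1])
        if lo < hi:
            out.append((lo, hi))
        # Advance whichever segment ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out


def length(segments: List[Segment]) -> float:
    """Total length covered by the segments, counting overlaps once."""
    return sum(end - start for start, end in merge(segments))


def sharing_ratios(ibd_x1_y: List[Segment], ibd_x2_y: List[Segment]) -> Tuple[float, float]:
    """Return (R1, R2); both inputs must cover a nonzero length."""
    shared = length(intersect(ibd_x1_y, ibd_x2_y))
    return shared / length(ibd_x1_y), shared / length(ibd_x2_y)


# Toy data: x2's sharing with y lies entirely inside x1's sharing with y.
ibd_x1_y = [(0.0, 40.0), (60.0, 100.0)]
ibd_x2_y = [(10.0, 30.0)]
print(sharing_ratios(ibd_x1_y, ibd_x2_y))  # (0.25, 1.0)
```

The toy intervals were chosen to land on the GP expectation (1/4, 1). Real data would span all chromosomes and would scatter around the expected values.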

  12. CREST uses kernel density estimators to infer relationships. Trained kernel density estimators (KDEs) using simulated data. Features: $R_1$, $R_2$
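
To illustrate KDE-based classification on the two ratio features, the sketch below fits one Gaussian kernel density estimate per relationship type and assigns a query point to the type with the highest density. Everything specific here is an assumption: the training points are Gaussian noise around the expected ratios from the earlier slides (a stand-in for real pedigree simulation), and the bandwidth, noise level, and function names are arbitrary. It does not reproduce CREST's actual training procedure.

```python
import math
import random

# Expected (R1, R2) per relationship type, as stated on the earlier slides.
EXPECTED = {"GP": (0.25, 1.0), "AV": (0.25, 0.5), "HS": (0.5, 0.5)}


def simulate_training(n_per_class=200, noise_sd=0.05, seed=0):
    """Toy stand-in for simulated pedigrees: Gaussian noise around EXPECTED."""
    rng = random.Random(seed)
    return {
        label: [(rng.gauss(m1, noise_sd), rng.gauss(m2, noise_sd))
                for _ in range(n_per_class)]
        for label, (m1, m2) in EXPECTED.items()
    }


def kde_density(point, samples, bandwidth=0.05):
    """2-D Gaussian product-kernel density estimate at `point`."""
    x, y = point
    norm = 1.0 / (2.0 * math.pi * bandwidth ** 2 * len(samples))
    total = 0.0
    for sx, sy in samples:
        sq_dist = (x - sx) ** 2 + (y - sy) ** 2
        total += math.exp(-sq_dist / (2.0 * bandwidth ** 2))
    return norm * total


def classify(point, training):
    """Pick the relationship type whose KDE gives `point` the highest density."""
    return max(training, key=lambda label: kde_density(point, training[label]))


training = simulate_training()
print(classify((0.27, 0.95), training))
print(classify((0.45, 0.52), training))
```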

  13. Can combine multiple relatives ($y_j$'s) by taking union of IBD sharing: $R_i = \frac{\mathrm{Length}\left(\bigcup_j \mathrm{IBD}(x_1, y_j) \cap \bigcup_j \mathrm{IBD}(x_2, y_j) \cap \mathrm{IBD}(x_1, x_2)\right)}{\mathrm{Length}\left(\bigcup_j \mathrm{IBD}(x_i, y_j)\right)}$
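
With several mutual relatives, the same ratio can be formed after pooling each pair member's IBD regions across those relatives and restricting the overlap to regions the pair itself shares. The sketch below assumes single-chromosome (start, end) intervals; the helper names and toy coordinates are hypothetical.

```python
from typing import List, Tuple

Segment = Tuple[float, float]  # (start, end) on one chromosome


def union(*segment_lists: List[Segment]) -> List[Segment]:
    """Union of several segment lists as sorted, non-overlapping segments."""
    merged: List[Segment] = []
    for start, end in sorted(seg for segs in segment_lists for seg in segs):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def intersect(a: List[Segment], b: List[Segment]) -> List[Segment]:
    """Intersection of two sorted, non-overlapping segment lists."""
    out: List[Segment] = []
    i = j = 0
    while i < len(a) and j < len(b):
        lo, hi = max(a[i][0], b[j][0]), min(a[i][1], b[j][1])
        if lo < hi:
            out.append((lo, hi))
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out


def length(segments: List[Segment]) -> float:
    return sum(end - start for start, end in segments)


def combined_ratios(ibd_x1_ys, ibd_x2_ys, ibd_x1_x2):
    """(R1, R2) from per-relative IBD lists for x1 and x2, plus IBD(x1, x2)."""
    u1 = union(*ibd_x1_ys)   # everything x1 shares with any y_j
    u2 = union(*ibd_x2_ys)   # everything x2 shares with any y_j
    shared = intersect(intersect(u1, u2), union(ibd_x1_x2))
    return length(shared) / length(u1), length(shared) / length(u2)


# Toy data with two mutual relatives y_1 and y_2.
ibd_x1_ys = [[(0.0, 30.0)], [(20.0, 50.0)]]
ibd_x2_ys = [[(10.0, 20.0)], [(40.0, 45.0)]]
ibd_x1_x2 = [(0.0, 100.0)]
print(combined_ratios(ibd_x1_ys, ibd_x2_ys, ibd_x1_x2))  # (0.3, 1.0)
```

Taking the union first means overlapping contributions from different relatives are counted once, so adding relatives can only increase the genome coverage behind each ratio.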

  14. CREST highly sensitive, highly specific. Ran PADRE, CREST on 200 replicates of various pedigree structures. [Figure legend: CREST, PADRE] Qiao, Sannerud et al. (in revision, 2019)

  15. CREST infers relative types in Generation Scotland data. Generation Scotland data: 205 GP, 1,949 AV, and 121 HS pairs with at least one mutual relative. Given data equivalent to one first cousin (10% of genome covered by IBD regions), CREST's sensitivity is 0.99 in GP, 0.86 in AV, and 0.95 in HS pairs. Qiao, Sannerud et al. (in revision, 2019)

  16. Secondary aim: infer whether relatives are paternal or maternal. [Diagram: paternal versus maternal lineages for grandparent and half-sibling pairs]

  17. Key insight: males / females have different crossover locations. [Figure: female rate (cM/Mb) and male rate (cM/Mb) versus physical position (Mb); data from human chromosome 10] Average number of crossovers: • Females: 2.04 • Males: 1.27. Genetic map from Bhérer et al. (2017)

  18. CREST infers maternal / paternal type in Generation Scotland. Analyzed all 848 GP and 381 HS pairs in Generation Scotland. Using $\mathrm{LOD} = 0$ as boundary: • 99.7% of HS • 93.5% of GP inferred correctly. [Figure panels: Half-siblings; Grandparent-grandchild] Qiao, Sannerud et al. (in revision, 2019)

  19. Conclusions
     • CREST classifies second degree relationship types
       – Enabled by multi-way IBD sharing
     • Male / female crossovers reveal the paternal / maternal type of half-siblings and grandparent-grandchild pairs
     • Can apply to pedigree reconstruction: other methods subject to ambiguities for second degree pairs
     • Preliminary results indicate CREST also applies to third degree pairs

  20. Acknowledgements: Generation Scotland; Caroline Hayward; Archie Campbell; Ying Qiao; Jens Sannerud; Nancy E. and Peter C. Meinig

  21. Approach: IBD segment ends approximate crossover locations
     • Model IBD segments as regions flanked by two crossovers
       – No-crossover interval: interior of IBD segment, $i_{\mathrm{int}}$
       – Locations of crossovers: windows $w_0$, $w_1$ surrounding IBD segment ends
     • For each IBD segment $i$, likelihood of parent being $S \in \{F, M\}$ is $P(i \mid S) = P(w_0 \mid S) \cdot P(i_{\mathrm{int}} \mid S) \cdot P(w_1 \mid S)$
     • Taking all IBD segments to be independent, we compute $\mathrm{LOD} = \log_{10} \frac{\prod_i P(i \mid F)}{\prod_i P(i \mid M)}$
     (Jens Sannerud)
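
The log-odds score on this slide can be sketched numerically once the three per-segment probabilities are given a concrete form. The slide does not say how they are computed, so the code below assumes a simple Poisson crossover model: the chance of no crossover across a genetic length d (in Morgans) is exp(-d), and the chance of at least one crossover in an end window of length w is 1 - exp(-w). Each segment is described by its window and interior lengths measured on the female map and on the male map. All names and toy numbers are hypothetical.

```python
import math
from typing import List, Tuple

# Genetic lengths in Morgans for one IBD segment under one sex-specific map:
# (left end window w0, segment interior, right end window w1).
SegmentLengths = Tuple[float, float, float]


def p_crossover(window_morgans: float) -> float:
    """P(at least one crossover in the window) under the assumed Poisson model."""
    return 1.0 - math.exp(-window_morgans)


def p_no_crossover(interior_morgans: float) -> float:
    """P(no crossover across the interior) under the assumed Poisson model."""
    return math.exp(-interior_morgans)


def log10_segment_likelihood(seg: SegmentLengths) -> float:
    """log10 of P(w0) * P(interior) * P(w1); windows must have positive length."""
    w0, interior, w1 = seg
    return (math.log10(p_crossover(w0))
            + math.log10(p_no_crossover(interior))
            + math.log10(p_crossover(w1)))


def lod(female: List[SegmentLengths], male: List[SegmentLengths]) -> float:
    """log10 of prod_i P(i|F) / prod_i P(i|M), computed as a sum of logs."""
    return sum(log10_segment_likelihood(f) - log10_segment_likelihood(m)
               for f, m in zip(female, male))


# Toy segments whose end windows are longer on the female map than the male map.
female_lengths = [(0.02, 0.30, 0.02), (0.03, 0.25, 0.02)]
male_lengths = [(0.01, 0.30, 0.01), (0.01, 0.25, 0.01)]
score = lod(female_lengths, male_lengths)
print("maternal" if score > 0 else "paternal", round(score, 3))
```

Summing log-likelihoods rather than multiplying raw probabilities avoids underflow when many segments are combined. A positive score favors a female (maternal) transmitting parent and a negative score a male (paternal) one, matching the use of LOD = 0 as the decision boundary.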
