Amy L. Williams Cornell University February 7, 2017 Family History - PowerPoint PPT Presentation

Inferring the genomes of mothers and fathers using genotype data from a set of siblings
Amy L. Williams
Cornell University
February 7, 2017
Family History Technology Workshop

Children inherit two chromosome copies: Mosaic of parents’ chromosomes
• Squares and circles: males and females, respectively
• Parents have line joining them and connected to children

Can infer parents’ chromosomes from siblings … with a catch
• Color coding shown is not built into data
• Can get “color” by comparing siblings’ genomes: identical regions from same chromosome → same “color”
• Example: can find dark / light green chromosomes and dark / light grey chromosomes
  – Works by stitching together identical regions

The catch: unclear which chromosome belongs to dad / mom
• Can infer a pair of chromosomes that belongs to one parent
• But nothing indicates which chromosome is from dad / mom
• In fact, each chromosome is independent
  – Not just 2 possibilities: $2^{22}$ > 4 million possibilities
  – Only true for autosomes: X and Y chromosomes easier
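The count of possibilities on this slide can be checked with a line of arithmetic. The sketch below assumes the 22 human autosomes, each contributing an independent two-way choice of which inferred chromosome pair belongs to the mother; the variable names are illustrative only.

```python
# Each autosome has two possible parent assignments, chosen independently.
AUTOSOME_COUNT = 22  # human autosomes (assumption stated in the lead-in)

configurations = 2 ** AUTOSOME_COUNT
print(f"{configurations:,} possible genome-wide assignments")
```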

Key insight: men / women produce different mosaic patterns
• Y-axis unit is cM: centiMorgan
• 1 Morgan: interval with average of 1 crossover per generation
• 1 M = 100 cM
Campbell et al. (2015)

Step 1: locate crossovers using only siblings
• Using hidden Markov model (HMM), can identify “colors” using only sibling data
  – Structured problem:
    • Four possible chromosomes
    • Two per parent
    • Each child inherits one from each parent at each position
• Get location of crossovers as small window in genome
  – Example: between A and B variants
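The crossover-window idea above can be illustrated with a small sketch. It does not implement the HMM; it assumes the "color" labels have already been inferred for one child and one parent, and only shows how a label switch between two neighboring variants yields a crossover window. The function name, positions, and labels are hypothetical.

```python
def crossover_windows(positions, labels):
    """Return (left, right) variant positions between which the label switches.

    positions: sorted variant positions along one chromosome
    labels:    which of one parent's two chromosomes (0 or 1) the child
               carries at each variant, e.g. as decoded by an HMM
    """
    if len(positions) != len(labels):
        raise ValueError("positions and labels must have the same length")
    windows = []
    for i in range(1, len(labels)):
        if labels[i] != labels[i - 1]:
            # The crossover happened somewhere between these two variants.
            windows.append((positions[i - 1], positions[i]))
    return windows


# Hypothetical child: two switches, so two crossover windows.
example = crossover_windows([100, 250, 400, 900, 1500], [0, 0, 1, 1, 0])
print(example)  # [(250, 400), (900, 1500)]
```

Denser genotyping narrows each window, since the two flanking variants sit closer together.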

Step 2: define model of data
• Two features in data:
  – Number of transmitted crossovers per child
  – Windows in which crossovers occurred
• Model for crossover number: $N \sim \mathrm{Pois}(T)$, $T$ = chromosome length in Morgans (male / female)
• Probability of crossover in window of length $l$ Morgans: $L \sim \mathrm{Exp}(1)$, $P(L \le l) = 1 - \exp(-l)$
  → In general, $l$ differs between males / females
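The two model components on this slide can be written as a pair of small functions. This is a minimal sketch using only the standard library; the function names and the example lengths (2.0 and 1.5 Morgans, 0.01 Morgans for a window) are made-up illustration values, not figures from the talk.

```python
import math


def poisson_pmf(count, length_morgans):
    """P(N = count) when N ~ Pois(T) and T is the chromosome length in Morgans."""
    return (math.exp(-length_morgans) * length_morgans ** count
            / math.factorial(count))


def window_crossover_prob(window_morgans):
    """P(L <= l) = 1 - exp(-l) for L ~ Exp(1) and a window of l Morgans."""
    return 1.0 - math.exp(-window_morgans)


# Hypothetical sex-specific lengths for the same chromosome.
for label, length in [("map A", 2.0), ("map B", 1.5)]:
    print(label, "P(3 crossovers) =", poisson_pmf(3, length))

print("P(crossover in 0.01 M window) =", window_crossover_prob(0.01))
```

Because both the chromosome length and the window length are measured on a sex-specific genetic map, the same observed crossovers receive different probabilities under the male and female maps.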

Step 3: infer male / female origin can treat each child independently
• Data are sets of crossovers inherited by $n$ children:
  $X_1 = \{X_{11}, X_{12}, \ldots, X_{1n}\}$
  $X_2 = \{X_{21}, X_{22}, \ldots, X_{2n}\}$
  $X_{pc} = \{w_{pc1}, w_{pc2}, \ldots\}$, $p \in \{1, 2\}$, $c$ child number
  $w_{pcj}$ indicates window in which crossover $j$ occurred
• Want to compute the following (and the opposite):
  $P(X_1, X_2 \mid S_1 = F, S_2 = M) = P(X_1 \mid S_1 = F)\, P(X_2 \mid S_2 = M)$
• Can break into terms for each child:
  $P(X_1 \mid S_1 = M) = \prod_{c=1}^{n} P(X_{1c} \mid S_1 = M)$

Step 3: probabilities for each child use number, locations of crossovers
• Can now apply model and get different probabilities of male / female origin for each crossover
  $P(X_{1c} \mid S_1 = M) = P(N_{S_1} = |X_{1c}|) \times \prod_{w_{1cj} \in X_{1c}} P(L \le \mathrm{Rec}(w_{1cj}, S_1))$
  $\mathrm{Rec}(w, S)$: probability of crossover in window $w$ in $S \in \{M, F\}$
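Putting the pieces of Step 3 together, the comparison of the two configurations might be sketched as below. Several things here are assumptions for illustration rather than details from the talk: the uniform genetic maps built by `uniform_map`, the equal prior on the two configurations used to turn likelihoods into a posterior, the made-up lengths (2.0 and 1.0 Morgans over 100 Mb), and all function names. Windows are assumed to have positive length.

```python
import math


def uniform_map(total_morgans, chrom_bp):
    """Hypothetical sex-specific map: constant rate along the chromosome."""
    rate = total_morgans / chrom_bp
    return total_morgans, (lambda w: (w[1] - w[0]) * rate)


def log_child_likelihood(windows, chrom_length, window_length):
    """Log of: Poisson term for the crossover count times one term per window."""
    n = len(windows)
    log_p = -chrom_length + n * math.log(chrom_length) - math.lgamma(n + 1)
    for w in windows:
        log_p += math.log(1.0 - math.exp(-window_length(w)))
    return log_p


def log_parent_likelihood(children, chrom_length, window_length):
    """Children are treated independently, so log-likelihoods add."""
    return sum(log_child_likelihood(w, chrom_length, window_length)
               for w in children)


def posterior_first_is_female(x1, x2, female, male):
    """Posterior of (parent 1 = F, parent 2 = M) assuming an equal prior."""
    a = log_parent_likelihood(x1, *female) + log_parent_likelihood(x2, *male)
    b = log_parent_likelihood(x1, *male) + log_parent_likelihood(x2, *female)
    top = max(a, b)  # subtract the max for numerical stability
    return math.exp(a - top) / (math.exp(a - top) + math.exp(b - top))


female = uniform_map(2.0, 100_000_000)
male = uniform_map(1.0, 100_000_000)

# Crossover windows (start, end) per child, for each inferred parent.
x1 = [[(10_000_000, 10_100_000), (50_000_000, 50_100_000),
       (80_000_000, 80_100_000)],
      [(20_000_000, 20_100_000), (60_000_000, 60_100_000)]]
x2 = [[(30_000_000, 30_100_000)], []]

print(posterior_first_is_female(x1, x2, female, male))
```

Working in log space keeps the product over many children and windows from underflowing, which matters once this is repeated across all chromosomes.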

Results
• Data: San Antonio Family Studies
  – Total: 2,490 genotyped samples, 80 pedigrees
  – Analyzed 69 families, 3 to 12 children
    • Include data for both parents to check accuracy
  – Genotypes from 888,748 SNPs (variants)
• In 1,518 chromosomes, posterior probabilities of correct configuration:

  Posterior   Full model   Crossover Poisson   Crossover windows
  > 0.5       1,515        1,099               1,513
  > 0.9       1,513        372                 1,511

One issue… currently finding crossovers with parent data
• These results based on finding crossovers with parent data
  – Is cheating, but will fix soon
• For > 8 children should generally do this well
  → Basically perfect results
• Fewer siblings: some portions of genome will be ambiguous
  – But substantial parts will not be
  → Will have accuracy results for only siblings in coming weeks

Applications: large datasets
• Used new method Attila to identify pedigrees in large cohorts
  – 152,095 samples
  [Figure labels: × 36, × 1]
• Why not get DNA from everyone in the world?
  1. Find siblings
  2. Infer parents’ genomes
  3. Repeat 1 & 2 for many generations

Acknowledgements
Ryan O’Hern
Sayantani Basu-Roy
Funding:
Cornell seed grant
Meinig Family Investigator Award
Postdoc and graduate student openings
