http://www.genomeinterpretation.org/ Organizers Steven E. Brenner , - PowerPoint PPT Presentation

CAGI@AIMM – Update on Community Experiment on Genome Interpretation Silvio Tosatto BioComputing UP, Department of Biology, University of Padova, Italy URL: http://protein.bio.unipd.it/

http://www.genomeinterpretation.org/

Organizers
• Steven E. Brenner, University of California, Berkeley
• John Moult, IBBR, University of Maryland
• Susanna Repo, University of California, Berkeley

Critical Assessment of Genome Interpretation: a CASP-like effort for human genome variation interpretation
[Figure: sequence variation interpreted at the molecular, cellular, and organismal levels]

Goals of the CAGI experiment • Determine the state of the art • Identify progress and innovations • Reveal bottlenecks and guide future effort • Highlight new challenges • Collaboratively develop new approaches

CAGI 2011 experiment

2011: 11 challenges, 114 submissions in total, ~160 registered on the website
2010: 6 challenges, 108 submissions in total, ~60 registered on the website

CAGI 2011 participating groups
CAGI 2011: 21 participating groups
CAGI 2010: 17 participating groups
[Figure: overlap of groups that participated in both 2011 and 2010]

Cystathionine β-Synthase (CBS) single amino acid mutations
Pathway: homocysteine + serine → cystathionine (CBS, cofactor PLP) → cysteine (cystathionase)
CBS variants are associated with homocystinuria; patients are treated with a high dose of vitamin B6.

Cystathionine β-Synthase (CBS) single amino acid mutations
Total of 84 mutations assessed experimentally
Substituted residue   Growth rate (400 ng/ml PLP)
D140N                 103 +/- 25
A207G                 0
N225S                 70 +/- 12
I264T                 109 +/- 14
W323G                 0
A357G                 104 +/- 19
Dataset provided by Jasper Rine, University of California, Berkeley
Assessed by Iddo Friedberg, Miami University

[Figure: probability of observing the experimental value, given the predictions; axes: experimental relative growth rate, predicted relative growth rate]

[Figure: experimental and predicted relative growth rate, with the probability of observing the experimental value]

CBS Challenge – Spearman’s rank correlation
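Spearman's rank correlation compares the ordering of predicted and experimental values rather than their magnitudes. The sketch below is a minimal illustration, not the assessor's actual procedure: the experimental growth rates are the six values listed on the CBS slide, while the `predicted` values and all function names are hypothetical.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Experimental relative growth rates for D140N, A207G, N225S, I264T, W323G, A357G.
experimental = [103, 0, 70, 109, 0, 104]
# Hypothetical predictions, invented only to exercise the function.
predicted = [90, 10, 60, 95, 20, 100]

rho = spearman(experimental, predicted)
print(f"Spearman rho = {rho:.3f}")
```

Because only ranks matter, a method that orders the mutations correctly scores well even if its predicted growth rates are on a different scale.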

p53 core domain mutations that restore activity of inactive p53
Baronio R et al. Nucl. Acids Res. 2010; nar.gkq571
14,668 variations to predict
p53 cancer mutation   Rescue mutant
G245S                 N239F
G245S                 F113L
G245S                 S240Y
G245S                 T123P
G245S                 N239Y
Dataset provided by Rick Lathrop and the p53 “cancer rescue” team, University of California, Irvine
Assessed by Gad Getz, Broad Institute

Comparing predictions to ground truth
Submissions:
1: Yana Bromberg Lab
2: Yana Bromberg
3: Yana Bromberg
4: SWITCH Lab, Greet De Baets
5: Rita Casadio Lab
6: George Shackelford Lab
7: Sean Mooney Lab
8: Sean Mooney Lab

ROC curves for submissions
[Figure: ROC curves for the eight submissions; one point is labelled M237I]
Submissions:
1: Yana Bromberg Lab
2: Yana Bromberg
3: Yana Bromberg
4: SWITCH Lab, Greet De Baets
5: Rita Casadio Lab
6: George Shackelford Lab
7: Sean Mooney Lab
8: Sean Mooney Lab
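A ROC curve is obtained by sweeping a threshold over the prediction scores and recording the false positive and true positive rates at each step; the area under it (AUC) summarizes the ranking quality. The following is a minimal sketch under assumed inputs: the labels and scores are hypothetical toy data, not CAGI results, and the function names are illustrative.

```python
def roc_points(labels, scores):
    """Return (false positive rate, true positive rate) points of a threshold sweep.

    labels: 1 for a true positive case (e.g. a real rescue mutant), 0 otherwise.
    scores: higher means the predictor is more confident the case is positive.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        j = i
        # Handle all cases sharing one score together, so ties yield one point.
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            if ranked[j][1]:
                tp += 1
            else:
                fp += 1
            j += 1
        points.append((fp / neg, tp / pos))
        i = j
    return points


def auc(points):
    """Area under the ROC points by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical toy data: three positives and three negatives.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

print(f"AUC = {auc(roc_points(labels, scores)):.3f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to a perfect separation of positives from negatives.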

Distinguish Crohn’s disease patients from healthy individuals
Exome sequences from 4 different groups, sequenced on different machines in different batches
Not a case/control study!
Dataset provided by Andre Franke, Christian-Albrechts-University Kiel
Assessed by Alexander Morgan, Stanford University

Challenge: Distinguish between exomes of Crohn’s disease patients and healthy individuals
Exomes of 56 individuals
Multifactorial or complex diseases
Who has Crohn’s disease?

Assessment: 42 / 56 have Crohn’s disease
[Figure labels: #119 (ySNAP?), #94 (UniPadova)]

Today • Personalized genetics has been upon us for some time • How good are we at actually identifying phenotype from whole genome?

Personal genome project (PGP): predict individuals’ phenotype
Numerical traits
33. Birth weight (in g)
34. HDL level (in mg/dL) *
35. LDL level (in mg/dL) *
36. Triglyceride level (in mg/dL) *
37. Fasting blood glucose level (in mg/dL)
38. Warfarin dose (in mg)
39. Age at Menarche
40. Annual income (in $)
Dataset provided by George Church, Harvard Medical School
Assessed by Sean Mooney, Buck Institute

The Submitters
• s122, s123: UniPadova (2 submissions), PI: Silvio Tosatto
  – ANNOVAR + literature + database + expert knowledge
  – random prediction
• s125: Netbiolab, PI: Insuk Lee
  – SIFT + database (for population frequency) + GWAS
• s126: KarchinLab, PI: Rachel Karchin
  – Bayes network + database (GWAS)
Late submission: Shamil Sunyaev’s Lab, Harvard University

The Probabilities in the 10
 #  Trait Name                        Frequency  PositiveNum  PGPCount
 1  Asthma                            0.25       2            8
 2  Crohn's disease                   0          0            8
 3  Ulcerative colitis                0          0            8
 4  Irritable bowel syndrome          0.111      1            9
 5  Rheumatoid arthritis              0          0            8
 6  Type II Diabetes                  0          0            8
 7  Coronary artery disease           0          0            8
 8  Long QT Syndrome                  0          0            8
 9  Hypertrophic cardiomyopathy       0          0            8
10  Glaucoma                          0.125      1            8
11  Color blindness                   0.125      1            8
12  Bipolar disorder                  0          0            8
13  Celiac disease                    0          0            8
14  Psoriasis                         0          0            8
15  Lupus                             0          0            8
16  Breast cancer                     0          0            8
17  Prostate cancer                   0          0            8
18  Migraine                          0          0            8
19  Lactose intolerance               0          0            7
20  Dyslexia                          0.125      1            8
21  Autism                            0          0            8
22  Osteoporosis                      0          0            7
23  Incontinence                      0          0            8
24  Kidney stones                     0          0            8
25  Varicose veins                    0          0            8
26  Sleep Apnea                       0.143      1            7
27  Tongue rolling (tube)             0.875      7            8
28  Phenylthiocarbamide tasting       1          4            4
29  Blood type - Has A antigen?       0.625      5            8
30  Blood type - Has B antigen?       0.143      1            7
31  Blood type - Is Rh(D) positive?   0.875      7            8
32  Absolute pitch                    0          0            6
• Mostly zero
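The Frequency column appears to be PositiveNum divided by PGPCount, rounded to three decimals. The short sketch below checks that reading against a few rows copied from the slide; the function name and the dictionary layout are illustrative assumptions.

```python
def trait_frequency(positive_num, pgp_count):
    """Fraction of PGP individuals with a reported answer who have the trait."""
    return round(positive_num / pgp_count, 3)


# (PositiveNum, PGPCount) pairs copied from the table for a few traits.
rows = {
    "Asthma": (2, 8),
    "Irritable bowel syndrome": (1, 9),
    "Sleep Apnea": (1, 7),
    "Tongue rolling (tube)": (7, 8),
    "Crohn's disease": (0, 8),
}

for trait, (positive, count) in rows.items():
    print(f"{trait}: {trait_frequency(positive, count)}")
```

With so few individuals, most disease traits have no positive case at all, so these frequencies are a very coarse estimate of any prior.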

The Binary Traits
Results by team: only the Karchin team is statistically significant
Submission   Total Traits  Predicted Traits  Precision  Recall  AUC    P
UniPadova    228           216               0.094      0.3     0.605  0.133
UniPadova    228           228               0.118      0.095   0.405  0.923
Netbiolab    228           220               0.024      0.214   0.225  1
KarchinLab   228           228               0.652      0.714   0.896  0
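Precision and recall for binary trait calls follow directly from the counts of true positives, false positives, and false negatives. This is a minimal sketch on hypothetical calls, not the assessor's scoring code; the function name and the example vectors are assumptions.

```python
def precision_recall(truth, predicted):
    """Return (precision, recall) for two equal-length lists of 0/1 calls."""
    tp = sum(1 for t, p in zip(truth, predicted) if t and p)
    fp = sum(1 for t, p in zip(truth, predicted) if not t and p)
    fn = sum(1 for t, p in zip(truth, predicted) if t and not p)
    # Guard against division by zero when nothing is predicted or nothing is true.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical calls for eight (individual, trait) pairs.
truth = [1, 0, 0, 1, 0, 0, 0, 1]
predicted = [1, 1, 0, 1, 0, 0, 0, 0]

precision, recall = precision_recall(truth, predicted)
print(f"precision = {precision:.3f}, recall = {recall:.3f}")
```

When positives are rare, as in the trait table where most frequencies are zero, a handful of false positives is enough to push precision down sharply.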

The Binary Traits: ROC
Only S126 (Karchin lab) is statistically significant
Submissions:
S122: UniPadova
S123: UniPadova (random)
S125: Netbiolab
S126: KarchinLab

Numerical traits: we are still in the “game” phase…

Extra Questions
Special questions:
(a) One of the PGP10 individuals has irritable bowel syndrome. Who is that? (Answer: PGP7)
(b) One of the PGP10 individuals is color-blind. Which one? (Answer: PGP10)
(c) One of the PGP10 individuals is not color-blind but she has a color-blind father and an affected son. Who is that? (Answer: PGP9)
Karchin Lab got all correct, UniPadova got one correct

Some conclusions
• Knowledge of the individual gene is important (CBS)
• Methods are highly significant (P-value) but of questionable clinical applicability (r² ~0.7)
• Different methods succeed at different challenges, and with different assessments
• Predictions on the Personal Genome Project panel improved, but largely by better modeling the prior
• Metapredictors are unlikely to yield huge improvements currently
• Unexpected success in predicting Crohn’s disease

CAGI 2012
• Challenges about to be released… (September 2012)
• Conference scheduled for mid-December 2012

Acknowledgements
Organizers: Steven E. Brenner, University of California, Berkeley; John Moult, IBBR, University of Maryland; Susanna Repo, University of California, Berkeley
Data Providers: Adam P. Arkin, UC Berkeley; George Church, Harvard Medical School; Andre Franke, Christian-Albrechts-University Kiel; Joe W. Gray, OHSU; Rick Lathrop, UC Irvine; John Moult, University of Maryland; Jasper Rine, UC Berkeley; Jeremy Sanford, UC Santa Cruz; Nicole Schmitt, University of Copenhagen; Jay Shendure, University of Washington; Michael Snyder, Stanford University; Sean Tavtigian, University of Utah
Assessors: Rui Chen, Stanford University; Gad Getz, Broad Institute; Iddo Friedberg, Miami University
Website Development and Administration, Data Analysis: Sean Mooney, Buck Institute; Maya Zuhl, IBBR, University of Maryland; Alexander A. Morgan, Stanford University; Artem Sokolov, University of California, Santa Cruz; Sri Jyothsna Yeleswarapu, Tata Consultancy Services; Josh Stuart, University of California, Santa Cruz; Gaurav Pandey, Mount Sinai School of Medicine; Sean Tavtigian, University of Utah
