3D folding of chromosomal domains in relation to gene expression


  1. 3D folding of chromosomal domains in relation to gene expression. Marc A. Marti-Renom, http://sgu.bioinfo.cipf.es. Structural Genomics Unit, Bioinformatics & Genomics Department, Prince Felipe Research Center (CIPF), Valencia, Spain. Thursday, November 25, 2010

  2. Aim: Can we relate structure and expression? Simple genomes; complex genomes.

  3. Resolution: Limited knowledge... [Figure: knowledge across scales. Axes: DNA length (10^0 to 10^9 nt), volume (10^-9 to 10^3 μm^3), time (10^-10 to 10^3 s), resolution (10^-3 to 10^-1 μ).] Adapted from: Langowski and Heermann. Semin Cell Dev Biol (2007) vol. 18 (5) pp. 659-67

  4. Integrative and iterative approach: Experiments, Computation

  5. Structure determination: Integrative Modeling Platform, http://www.integrativemodeling.org. Alber et al. Nature (2007) vol. 450 (7170) pp. 683-94. Biomolecular structure determination: 2D-NOESY data. Chromosome structure determination: 5C data.

  6. 5C technology: Detecting up to millions of interactions in parallel. http://my5C.umassmed.edu. Dostie et al. Genome Res (2006) vol. 16 (10) pp. 1299-309. 5C "copies" the 3C library into a 5C library containing only ligation junctions. Performed at high levels of multiplexing: 2,000 primers detect 1,000,000 unique interactions in 1 reaction.
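
The multiplexing figure on this slide can be checked with simple arithmetic. The following is a minimal sketch, assuming the 2,000 primers are split evenly into 1,000 forward and 1,000 reverse primers and that each forward-reverse pair reports one ligation junction. The even split and the function name `max_5c_interactions` are illustrative assumptions, not details given on the slide.

```python
def max_5c_interactions(n_forward, n_reverse):
    """Upper bound on distinct junctions when every forward primer
    can pair with every reverse primer (assumed pairing scheme)."""
    return n_forward * n_reverse


# Assumed even split of the 2,000 primers mentioned on the slide.
n_forward = 1000
n_reverse = 1000
total = max_5c_interactions(n_forward, n_reverse)
print(total)
```

Under these assumptions the product matches the 1,000,000 interactions quoted on the slide.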

  7. Human α-globin domain: ENm008 genomic structure and environment. ENCODE Consortium. Nature (2007) vol. 447 (7146) pp. 799-816. [Figure: chromosome 16 ideogram and a 500,000 bp map of the ENm008 region. Genes: LOC1001134368, RAB11FIP3, C16ORF35, SNRNP25, ARHGDIG, RHBDF1, MRPL28, POLR3K, LUC7L, ITFG3, RGS11, PDIA2, AXIN1, TMEM8, DECR2, MPG and five HB (globin) genes. Hypersensitive sites: HS48, HS46, HS40, HS33, HS10, HS8. Tracks: CTCF, differential RNA and H3K4me3 for GM12878 and K562; DNaseI for GM06990 and K562.] The ENCODE data for the ENm008 region were obtained from the UCSC Genome Browser tracks for: RefSeq annotated genes, Affymetrix/CSHL expression data (Gingeras Group at Cold Spring Harbor), Duke/NHGRI DNaseI Hypersensitivity data (Crawford Group at Duke University), and Histone Modifications by Broad Institute ChIP-seq (Bernstein Group at Broad Institute of Harvard and MIT).

  8. Human α-globin domain: ENm008 genomic structure and environment. ENCODE Consortium. Nature (2007) vol. 447 (7146) pp. 799-816. [Figure: the same 500,000 bp map of the ENm008 region, with the enhancer and the α-globin genes highlighted. Cell lines: GM12878 and K562.] K562 cells: α-globin genes active.

  9. Integrative Modeling, http://www.integrativemodeling.org. [Figure labels: P1, P2.]

  10. Representation. [Figure: chain of particles i, i+1, i+2, ..., i+n.] Restraints between particles i and j, with distance $d_{i,j}$ and target distance $d^{0}_{i,j}$:
      Harmonic: $H_{i,j} = k\,(d_{i,j} - d^{0}_{i,j})^{2}$
      Harmonic Lower Bound: $lbH_{i,j} = k\,(d_{i,j} - d^{0}_{i,j})^{2}$ if $d_{i,j} \le d^{0}_{i,j}$; $lbH_{i,j} = 0$ if $d_{i,j} > d^{0}_{i,j}$
      Harmonic Upper Bound: $ubH_{i,j} = k\,(d_{i,j} - d^{0}_{i,j})^{2}$ if $d_{i,j} \ge d^{0}_{i,j}$; $ubH_{i,j} = 0$ if $d_{i,j} < d^{0}_{i,j}$
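
The three restraint forms named on this slide can be written as plain functions. This is a minimal sketch, not the Integrative Modeling Platform implementation. The function names and the default force constant `k=1.0` are illustrative assumptions. `d` is the current distance between two particles and `d0` is the target distance.

```python
def harmonic(d, d0, k=1.0):
    """Penalize any deviation of d from d0, in either direction."""
    return k * (d - d0) ** 2


def harmonic_lower_bound(d, d0, k=1.0):
    """Penalize only when d is at or below d0 (particles too close)."""
    return k * (d - d0) ** 2 if d <= d0 else 0.0


def harmonic_upper_bound(d, d0, k=1.0):
    """Penalize only when d is at or above d0 (particles too far apart)."""
    return k * (d - d0) ** 2 if d >= d0 else 0.0


# Two particles 2.0 units apart, target distance 1.0 (made-up values).
print(harmonic(2.0, 1.0))
print(harmonic_lower_bound(2.0, 1.0))
print(harmonic_upper_bound(2.0, 1.0))
```

With these definitions, a lower-bound restraint keeps a pair from collapsing together and an upper-bound restraint keeps it from drifting apart. The plain harmonic form does both.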

  11. Scoring. GM12878: 70 fragments, 1,520 restraints. K562: 70 fragments, 1,049 restraints. Restraint types: Harmonic, Harmonic Upper Bound, Harmonic Lower Bound.

  12. Optimization. Flowchart: start, create particles, add restraints, simulated annealing Monte Carlo with local conjugate gradient (5 steps, 500 rounds), lowest objective function, end. [Figure: IMP objective function (0 to 7.00E+06) versus iteration (0 to 500).]
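
The optimization loop on this slide can be illustrated with a small simulated-annealing Monte Carlo run over harmonic restraints. This is a minimal sketch under stated assumptions. It is not the IMP protocol, and the local conjugate-gradient refinement named on the slide is omitted. The geometric cooling schedule, the step size, the temperatures and all function names are hypothetical choices made for the example.

```python
import math
import random


def score(coords, restraints):
    """Sum of harmonic penalties k * (d - d0)^2 over restrained pairs."""
    total = 0.0
    for i, j, d0, k in restraints:
        d = math.dist(coords[i], coords[j])
        total += k * (d - d0) ** 2
    return total


def anneal(coords, restraints, rounds=500, t_start=10.0, t_end=0.01,
           step=0.5, seed=0):
    """Metropolis Monte Carlo with a geometric cooling schedule (assumed)."""
    rng = random.Random(seed)
    current = [list(p) for p in coords]  # work on a copy of the input
    cur_score = score(current, restraints)
    best, best_score = [p[:] for p in current], cur_score
    for r in range(rounds):
        # Temperature decays geometrically from t_start to t_end.
        t = t_start * (t_end / t_start) ** (r / max(rounds - 1, 1))
        for idx in range(len(current)):
            old = current[idx][:]
            current[idx] = [x + rng.uniform(-step, step) for x in old]
            new_score = score(current, restraints)
            delta = new_score - cur_score
            # Accept downhill moves always, uphill moves with Boltzmann odds.
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                cur_score = new_score
                if cur_score < best_score:
                    best, best_score = [p[:] for p in current], cur_score
            else:
                current[idx] = old  # reject: restore the old position
    return best, best_score


# Three particles, each pair restrained to distance 1.0 (made-up data).
start = [[0.0, 0.0, 0.0], [5.0, 0.0, 0.0], [0.0, 5.0, 0.0]]
restraints = [(0, 1, 1.0, 1.0), (1, 2, 1.0, 1.0), (0, 2, 1.0, 1.0)]
initial_score = score(start, restraints)
best, best_score = anneal(start, restraints, rounds=200)
print(initial_score, best_score)
```

The function returns the lowest-scoring configuration seen during the run, which corresponds to the "lowest objective function" box in the flowchart.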

  13. Clustering

  14. Not just one solution: GM12878, K562

  15. Not just one solution, and we can de-convolute them!

  16. Consistency. GM12878: Cluster #1, 2780 models, 910,280 IMP OF. K562: Cluster #2, 314 models, 232,673 IMP OF. [Figure: consistency (%) per fragment at distance cutoffs of 50, 75, 100, 125 and 150 nm, for GM12878 and K562.]

  17. Regulatory elements. [Figure: relative abundance (0 to 2.5) of promoters, active genes, CTCF sites, DNaseI sites, H3K4me3 sites and non-active genes versus distance to center (<50 to <400 nm), for GM12878 and K562; distance (0 to 1,000 nm) for GM12878 and K562.]

  18. Compactness. [Figure: density (bp/nm, 40 to 110) per fragment, for GM12878 and K562.]

  19. Multi-loops. [Figure: loops of 20 Kb to 69 Kb in GM12878 and K562. Axes: path length (300 to 700 nm) and distance between anchoring points (73 to >=250 nm).]

  20. Expression. [Figure legend: increased in GM12878; increased in K562.]

  21. FISH validation. [Figure: distance (0 to 500 nm) for GM12878 and K562, measured by FISH and in the models (2D).]
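
The "Models (2D)" label on this slide suggests that model distances were compared with FISH after projection onto a plane. The sketch below shows the difference between a 3D distance and its 2D projection. Projecting by dropping the z coordinate is an assumption made for illustration, as are the function names and the coordinates. The slide does not say how the projection was done.

```python
import math


def distance_3d(a, b):
    """Euclidean distance between two 3D points."""
    return math.dist(a, b)


def projected_distance_2d(a, b):
    """Distance after projecting both points onto the x-y plane
    (assumed projection: the z coordinate is simply dropped)."""
    return math.dist(a[:2], b[:2])


# Two hypothetical loci in a model, coordinates in nm (made-up values).
locus_a = (0.0, 0.0, 0.0)
locus_b = (300.0, 400.0, 1200.0)
print(distance_3d(locus_a, locus_b))
print(projected_distance_2d(locus_a, locus_b))
```

A projected distance can never exceed the 3D distance, so comparing a flat microscopy image against unprojected model distances would bias the comparison.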

  22. Summary: 5C data result in comprehensive interaction matrices to build a consistent 3D model.

  23. Summary: Models allow for 5C data de-convolution.

  24. Summary: Models allow for 5C data de-convolution.

  25. Summary: Selected models reproduce known (and new) interactions.

  26. Summary: Large-scale changes in conformation correlate with gene expression of resident genes. [Figure: GM12878 and K562 models (scale bars 100 nm) above the ENm008 gene map, hypersensitive sites and differential RNA track. Labels: α-globin-enhancer looping only in K562; looping interaction in GM12878 cells.]

  27. Summary: The models have been partially validated by FISH. [Figure: GM12878 and K562 models, scale bars 100 nm.]

  28. Summary: "Chromatin Globule" model. [Figure panels a and b; labels: Factory, Eraf, HBB, PolII.] Münkel et al. JMB (1999); Osborne et al. Nat Genet (2004); Lieberman-Aiden et al. Science (2009); Phillips and Corces. Cell (2009).

  29. Acknowledgments. Davide Baù, Amartya Sanyal, Bryan Lajoie, Emidio Capriotti, Meg Byron (postdoctoral fellows, research associate, bioinformatician). Jeanne Lawrence: Department of Cell Biology, University of Massachusetts Medical School, Worcester, MA, USA. Job Dekker: Program in Gene Function and Expression, Department of Biochemistry and Molecular Pharmacology, University of Massachusetts Medical School, Worcester, MA, USA. Marc A. Marti-Renom: Structural Genomics Unit, Bioinformatics and Genomics Department, Centro de Investigación Príncipe Felipe, Valencia, Spain. D. Baù, A. Sanyal, B. Lajoie, E. Capriotti, M. Byron, J. Lawrence, J. Dekker, and M.A. Marti-Renom. Nature Structural & Molecular Biology (2010) in press (5th of December). http://sgu.bioinfo.cipf.es http://integrativemodeling.org
