Specificity of Protein-DNA recognition of a long DNA binding motif
Francisco Melo Ledermann
Molecular Genetics and Microbiology Department, School of Biological Sciences, Pontificia Universidad Católica de Chile, Santiago, Chile. http://melolab.org
EMBO Global Exchange Lecture Course: Structural and biophysical methods for biological macromolecules in solution
Pontificia Universidad Católica de Chile, Santiago, Chile, 16 October 2019
Our question(s)
The general problem we want to address: protein-DNA interactions
- Binding affinity
- Binding specificity
- Function upon binding
Can we identify / modulate the key specificity and affinity determinants that define a functional event?
Can we really "understand" how this process operates?
What determines protein-DNA binding specificity?
Adapted from:
Rohs et al. (2010) "Origins of Specificity in Protein-DNA Recognition" Annu. Rev. Biochem. 79, 233-269.
Jen-Jacobson et al. (2000) "Structural and Thermodynamic Strategies for Site-Specific DNA Binding Proteins" Structure 8, 1015-1023.
Our methodological approach: general scheme, part I
- Known protein-DNA complex structure (PDB)
- New DNA sequence (e.g. ACGTGGTAGAAACGTGAGCCT)
- 3D modeling: 3D protein-DNA complex model
- Energy scoring functions (knowledge-based potentials), PDIdb
- Optimization method: Metropolis Monte Carlo
- Low energy score sequence ensemble
- Experimental binding data (EMSA, NGS, UPBMs)
- Results: sequence patterns (e.g. AYYNCACAAWTTRRTTN, ATTNCRCRACTTAGTTW), HMMs, position weight matrix
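The optimization step named in the scheme can be illustrated with a minimal Metropolis Monte Carlo search over DNA sequences. This is only a sketch under stated assumptions: `toy_energy` is a hypothetical stand-in (mismatches to an arbitrary target) for the knowledge-based energy of a 3D model, and the move set, temperature `kT` and number of steps are illustrative choices, not values from the talk.

```python
import math
import random

BASES = "ACGT"

def toy_energy(seq, target="ACGTGGTAGA"):
    """Hypothetical stand-in score: number of mismatches to an arbitrary target.
    In the scheme above this role is played by a knowledge-based energy
    evaluated on a 3D protein-DNA complex model."""
    return sum(1 for a, b in zip(seq, target) if a != b)

def metropolis_search(start, energy, steps=2000, kT=0.5, seed=0):
    """Metropolis Monte Carlo over sequences using single-base mutation moves.
    Returns the best sequence, its energy, and the list of accepted states."""
    rng = random.Random(seed)
    current = list(start)
    e_current = energy(current)
    best, e_best = current[:], e_current
    ensemble = []
    for _ in range(steps):
        pos = rng.randrange(len(current))
        old = current[pos]
        current[pos] = rng.choice([b for b in BASES if b != old])
        e_new = energy(current)
        delta = e_new - e_current
        # Metropolis criterion: always accept downhill moves,
        # accept uphill moves with probability exp(-delta / kT).
        if delta <= 0 or rng.random() < math.exp(-delta / kT):
            e_current = e_new
            ensemble.append(("".join(current), e_current))
            if e_current < e_best:
                best, e_best = current[:], e_current
        else:
            current[pos] = old  # reject: restore the previous base
    return "".join(best), e_best, ensemble
```

Calling `metropolis_search("A" * 10, toy_energy)` returns the lowest-scoring sequence found together with the accepted states, which is the kind of low-energy sequence ensemble from which patterns or a position weight matrix could then be derived.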
Our methodological approach: general scheme, part II
- Sequence patterns (e.g. AYYNCACAAWTTRRTTN, ATTNCRCRACTTAGTTW), HMMs, position weight matrix
- Genome scanning of the complete genome
- Predicted binding sites
- Comparison with experimental data (✔ / ✖)
- Annotation tables
- Experimental validation of predictions:
  - Isothermal Titration Calorimetry
  - Surface Plasmon Resonance
  - Fluorescence Anisotropy
  - SAXS
  - in vitro transcription assays
  - DNA binding microarrays
  - One round SELEX
  - EMSA
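The genome scanning step can be sketched with a log-odds position weight matrix. This is an illustrative sketch only: the pseudocount, the uniform background of 0.25, the score threshold and the function names `build_pwm` and `scan` are assumptions, and only the forward strand is scanned.

```python
import math

def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Log-odds position weight matrix from aligned sites of equal length."""
    length = len(sites[0])
    pwm = []
    for i in range(length):
        counts = {b: pseudocount for b in "ACGT"}
        for site in sites:
            counts[site[i]] += 1
        total = sum(counts.values())
        pwm.append({b: math.log2((counts[b] / total) / background) for b in "ACGT"})
    return pwm

def scan(genome, pwm, threshold):
    """Return (position, window, score) for forward-strand windows
    whose summed log-odds score is at least `threshold`."""
    width = len(pwm)
    hits = []
    for i in range(len(genome) - width + 1):
        window = genome[i:i + width]
        if any(b not in "ACGT" for b in window):
            continue  # skip windows with ambiguous bases
        score = sum(column[b] for column, b in zip(pwm, window))
        if score >= threshold:
            hits.append((i, window, score))
    return hits
```

A real scan would also score the reverse complement of each window and choose the threshold from known sites; both are left out here to keep the sketch short.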
PDIdb Database
http://melolab.org/pdidb
Norambuena and Melo (2010) BMC Bioinformatics 11, 262, 1-12.
PDIdb: PDB complexes and interfaces definition
Knowledge-based potentials for protein-DNA interactions
Protein-DNA Interface database (PDIdb)
- Used to derive a non-redundant (NR) list of protein-DNA complexes
- NR list of complexes used to calculate knowledge-based potentials
Observations (distance d between protein and DNA) → relative frequency → knowledge-based potential, e.g. E_cc(d)
Inverse Boltzmann law:
$$\Delta E_{ij}(d) = -kT \ln\left[\frac{f_{ij}(d)}{f_{kk}(d)}\right]$$
Sippl, M. (1993) J. Comp. Aided Mol. Des.
Capriotti et al. (2011) Bioinformatics 27.
Norambuena et al. (unpublished work)
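As a rough illustration of the inverse Boltzmann law, the sketch below converts distance-binned observation counts into an energy per bin. It assumes a simple setup that is not taken from the talk: `pair_counts` are counts for one atom-pair type, `ref_counts` are counts for a reference state, and a small `pseudocount` avoids taking the logarithm of zero.

```python
import math

def inverse_boltzmann(pair_counts, ref_counts, kT=1.0, pseudocount=1e-6):
    """E(d) = -kT * ln(f_pair(d) / f_ref(d)) for each distance bin.
    Counts are converted to relative frequencies before taking the ratio."""
    n_pair = sum(pair_counts)
    n_ref = sum(ref_counts)
    energies = []
    for c_pair, c_ref in zip(pair_counts, ref_counts):
        f_pair = c_pair / n_pair + pseudocount
        f_ref = c_ref / n_ref + pseudocount
        energies.append(-kT * math.log(f_pair / f_ref))
    return energies
```

Bins where the pair is observed more often than in the reference state get a negative (favorable) energy, and under-represented bins get a positive one.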
Knowledge-based potentials for protein-DNA interactions
[Figure: potentials as a function of distance d. Panels: E_cc(d), E_db(d), E_ba(d), E_bd(d). Example residues: ARG, ASN.]
Comparative Modeling of DNA duplexes
Full atom comparative modeling of protein-DNA complexes
- Conformational degrees of freedom in DNA duplexes
- Strategy and geometrical restraints to model the 3D structure of DNA duplexes
- Example case (static or interactive display of the modeling process)
Full atom comparative three-dimensional modeling of DNA duplexes: Ibarra and Melo (unpublished work)
Full Atom Comparative Modeling of protein-DNA complexes
Flowchart: Start → Template selection → Template-target alignment → Model building → Model assessment → OK? (No / Yes) → End
Template and target duplex sequences shown:
5'-…ATG GAC CGTTTT…-3' / 5'-…TAC CTG GCAAAA…-3'
5'-…ATGCCACGTTTT…-3' / 5'-…TACGGTGCAAAA…-3'
Building a non-redundant database of duplex DNA
Selection of DNA structures (N = 86)
Multiple filters (pH, temperature, R-crys, resolution, sequence, structure clustering)
Non-redundant dataset (N = 34), e.g. CGCAAATTTGCG, CGCGAAAAAACG, AGGGGCCCCT, AGGGGCGGGGCT, ACCGACGTCGGT
Non-redundant database of full duplex DNA
Sequences (5'-3'), strand 1 / strand 2

A (N=12)
GCGGGCCCGC / GCGGGCCCGC
GAAGCTTC / GAAGCTTC
CATGGGCCCATG / CATGGGCCCATG
TCTGCGGTC / TGACCGCAG
CCCGGCCGGG / CCCGGCCGGG
CCGGGCCCGG / CCGGGCCCGG
AGGGGCGGGGCT / TAGCCCCGCCCC
GGCATGCC / GGCATGCC
GGTATACC / GGTATACC
CGCGGGTACCCGCG / CGCGGGTACCCGCG
GGGGCGCCCC / GGGGCGCCCC
AGGGGCCCCT / AGGGGCCCCT

B (N=17)
CGCGTTAACGCG / CGCGTTAACGCG
CCGGCGCCGG / CCGGCGCCGG
CGCAGAATTCGCG / CGCAGAATTCGCG
CGCAAATTTGCG / CGCAAATTTGCG
CGCGAAAAAACG / CGTTTTTTCGCG
ACCGAATTCGGT / ACCGAATTCGGT
GCTTAATTCG / CGAATTAAGC
CCATTAATGG / CCATTAATGG
CCGATATCGG / CCGATATCGG
CCTGCGCAGG / CCTGCGCAGG
CCGAGCTCGG / CCGAGCTCGG
CCTAATTAGG / CCTAATTAGG
CCGCTAGCGG / CCGCTAGCGG
CGCGATATCGCG / CGCGATATCGCG
ACCGACGTCGGT / ACCGACGTCGGT
ACCGGTACCGGT / ACCGGTACCGGT
CGCGAATTCGCG / CGCGAATTCGCG

Z (N=5)
CGCACG / CGTGCG
CACGCG / CGCGTG
CGCGCG / CGCGCG
CCCGGG / CCCGGG
CACACG / CGTGTG
Derivation of geometrical restraints for the 3D modeling of full duplex DNA
Flowchart: BEGIN → Non-redundant database of duplex DNA → Calculation of each geometrical restraint → Counting of occurrences → Histogram building → Fitting → END
Fitting: Gaussian → OK? Yes → END; No → Poly-Gaussian → OK? Yes → END; No → Spline → END
Fitting of continuous mathematical functions to discrete experimental data
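The histogram-then-fit idea can be sketched as follows for the single-Gaussian branch only. This is a simplified illustration: the Gaussian parameters are estimated from the sample mean and standard deviation rather than by least-squares fitting, the poly-Gaussian and spline branches are not shown, and `fit_error` is a hypothetical quality measure that could drive the "OK?" decision.

```python
import math

def histogram(values, lo, hi, nbins):
    """Normalized histogram (density) on [lo, hi). Returns bin centers and densities."""
    width = (hi - lo) / nbins
    counts = [0] * nbins
    for v in values:
        if lo <= v < hi:
            counts[min(int((v - lo) / width), nbins - 1)] += 1
    n = sum(counts)
    centers = [lo + (i + 0.5) * width for i in range(nbins)]
    densities = [c / (n * width) for c in counts]
    return centers, densities

def fit_gaussian(values):
    """Moment estimates (mean, standard deviation) of a single Gaussian."""
    mu = sum(values) / len(values)
    var = sum((v - mu) ** 2 for v in values) / len(values)
    return mu, math.sqrt(var)

def gaussian_pdf(x, mu, sigma):
    """Probability density of a normal distribution at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def fit_error(centers, densities, mu, sigma):
    """Root-mean-square deviation between the histogram and the fitted Gaussian."""
    sq = [(d - gaussian_pdf(c, mu, sigma)) ** 2 for c, d in zip(centers, densities)]
    return math.sqrt(sum(sq) / len(sq))
```

A distribution with more than one peak would give a large `fit_error` for the single Gaussian, which is the situation where a sum of Gaussians or a spline becomes the better choice.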
Conformational degrees of freedom in DNA strands
There are many degrees of freedom in a single RNA or DNA chain.
- For a single DNA strand, there are 12 rotatable bonds per nucleotide:
  - Sugar phosphate backbone: α, β, γ, δ, ε, ζ, ν0, ν1, ν2, ν3, ν4
  - Glycosidic bond: χ
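Each of the torsions listed above is a dihedral angle defined by four consecutive atoms. A minimal sketch of that calculation from Cartesian coordinates is given below; the function name `dihedral` and the tuple-based coordinate format are illustrative choices, and selecting the four atoms for each named torsion is left to the caller.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees, in (-180, 180], for four points in 3D."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = (b1[0] / norm, b1[1] / norm, b1[2] / norm)
    # Project b0 and b2 onto the plane perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * c for c in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * c for c in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))
```

For example, four points in a cis arrangement give 0 degrees and a trans arrangement gives 180 degrees.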
Conformational degrees of freedom in DNA duplexes
- Base pairing parameters: 3 distances, 3 angles
- Base pair step parameters: 3 distances, 3 angles
ProteinDNA and FreeDNA backbone dihedral angle restraints
ProteinDNA and FreeDNA ribose dihedral angle restraints
ProteinDNA and FreeDNA glycosidic dihedral angle restraints
Geometrical restraints to model 3D DNA duplex conformation
Distance restraints that define the geometry of a base pair
[Figure: AT base pair and CG base pair with H-bond donors and acceptors. Labeled atoms: O6, N4, O4, N6, N1, N3, O2, N2, C1'.]
Comparative modeling: detailed flowchart
Tools: DNAUtils, DNAModel, Modeller
Inputs: template structure and alignment
>P1;M1dsz
sequence:M1dsz:.:. :. : .: .:::
jelltjeeelltjel/jtlejjtttlejjtl*
>P1;1kb2
structureX:1kb2:.:. :. : .: .:::
jllttjejlellttj/leejjtjltleejjl*
Model building: generate topology; transfer template coordinates
Refinement: optimization of nucleotide stereochemistry; optimization of base pair geometry
Output: optimized model
Comparative modeling: model optimization
- CHARMM force field
- Dihedral restraints
- Non-bonded terms (from knowledge-based potential)
DNAviz PyMol Plugin
Calculation of Solvent Accessible Surface Area (SASA) and ΔSASA
ΔSASA(i), or buried surface area (BSA) = SASA(i) in the individual structures − SASA(i) in the complex structure
Interface atoms: DNA and protein atoms with ΔSASA > 0
Calculation of ΔSASA for DNA bases and backbone atoms
- SASA of DNA bases and DNA backbone in the individual structures
- SASA of DNA bases and DNA backbone in the complex
- Atoms with ΔSASA > 0: DNA bases, DNA backbone, protein
Ribeiro, J., Melo, F. and Schüller, A. (2015) "PDIviz: analysis and visualization of protein-DNA binding interfaces" Bioinformatics 31, 2751-2753.
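The ΔSASA bookkeeping can be sketched in a few lines, assuming per-atom SASA values have already been computed for the free and the complexed structures by some external tool. The dictionary layout keyed by `(chain, residue number, atom name)`, the function names, and the restriction to DNA heavy atoms are assumptions of this sketch, not the PDIviz implementation.

```python
# Standard PDB atom names of the DNA sugar-phosphate backbone (heavy atoms).
BACKBONE = {"P", "OP1", "OP2", "O5'", "C5'", "C4'", "O4'", "C3'", "O3'", "C2'", "C1'"}

def delta_sasa(sasa_free, sasa_complex):
    """Per-atom delta SASA: SASA in the individual structure minus SASA in the complex."""
    return {atom: sasa_free[atom] - sasa_complex[atom] for atom in sasa_free}

def interface_atoms(dsasa):
    """Atoms that lose accessible surface upon binding (delta SASA > 0)."""
    return sorted(atom for atom, value in dsasa.items() if value > 0)

def split_dna_bsa(dsasa):
    """Sum delta SASA of DNA atoms separately for backbone and base atoms.
    Keys are assumed to be (chain, residue_number, atom_name) tuples."""
    totals = {"backbone": 0.0, "base": 0.0}
    for (_chain, _resnum, name), value in dsasa.items():
        key = "backbone" if name in BACKBONE else "base"
        totals[key] += value
    return totals
```

Separating the two totals gives a quick indication of whether a protein contacts a site mainly through the bases or through the backbone.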
MarA and Rob: our experimental working models (AraC/XylS TF family)
MarA-mar complex (crystal structure 1BL0)
Rob-micF complex (crystal structure 1D5Y)
Why did we choose MarA as a working model?
1. Monomeric transcription factor
2. Single structural domain (2 HTH motifs)
3. Asymmetric (not palindromic) binding site
4. Long binding site, known as marbox (21 bp)
5. 28 experimentally known marboxes (+5 putative)
6. MarA is ambidextrous
7. Crystal structure of MarA-mar complex available
8. Several in vitro transcription assays with many promoters
9. Complete alanine-scanning and functional tests
10. DNA microarrays with MarA expressed constitutively in vivo
11. EMSA assays of MarA and several promoters
12. Restricted only to Enterobacteria (highly specific binding mode?)
13. Clinically relevant (antibiotic resistance, stress tolerance)
Marboxes known to be bound by MarA proteins with high/medium affinity
Known marboxes: fuzzy pattern
AYNGCACNNWNNRYYAAACN
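A fuzzy pattern written in IUPAC nucleotide codes can be searched for by translating it into a regular expression. The sketch below shows one way to do this; the function names are illustrative, only the forward strand is searched, and overlapping matches are reported through a lookahead.

```python
import re

# IUPAC nucleotide ambiguity codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "W": "[AT]", "S": "[CG]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

def iupac_to_regex(pattern):
    """Translate an IUPAC pattern into a regular expression string."""
    return "".join(IUPAC[code] for code in pattern.upper())

def find_matches(sequence, pattern):
    """Return (position, matched_sequence) for every forward-strand match,
    including overlapping ones."""
    regex = re.compile("(?=(" + iupac_to_regex(pattern) + "))")
    return [(m.start(), m.group(1)) for m in regex.finditer(sequence.upper())]
```

For instance, `find_matches(genome, "AYNGCACNNWNNRYYAAACN")` lists every position in `genome` that is compatible with the pattern above; searching the reverse complement as well would be needed for a complete scan.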
MarA and Rob proteins complexed to mar and micF DNA marboxes
MarA, Rob, RobDBD and Chimera protein constructs
Natural proteins (E. coli K12)
- MarA: MarA (128 AAs)
- Rob: Rob DBD (128 AAs) + Rob regulatory domain (170 AAs)
Artificial proteins
- RobDBD: Rob DBD (128 AAs)
- Chimera: MarA (128 AAs) + Rob regulatory domain (170 AAs)
Marbox DNA sequence logo and definitions of box A, spacer and box B
[Figure legend: DNA base interaction; DNA backbone interaction; interaction with base and backbone]
mar and micF duplex DNA used in EMSAs with MarA and Rob proteins
DNA sequences (35 bp); regions labeled on the slide: box A, box B, spacer (Sp)

Name          5' flank   Core                3' flank
mar           GACCGA     TGCCACGTTTTGCTAAA   TCGAGGTGTTAG
micF          GACCGA     CAGCACTGAATGTCAAA   ACGAGGTGTTAG
mut           GACCGA     CATTGTTTTTTGCACTC   AAGAGGTGTTAG
marmicF AB    GACCGA     CAGCACGTTTTGTCAAA   TCGAGGTGTTAG
micFmar AB    GACCGA     TGCCACTGAATGCTAAA   ACGAGGTGTTAG
marmut B      GACCGA     TGCCACGTTTTGCACTC   TCGAGGTGTTAG
micFmut B     GACCGA     CAGCACTGAATGCACTC   ACGAGGTGTTAG
marmut A      GACCGA     CATTGTGTTTTGCTAAA   TCGAGGTGTTAG
micFmut A     GACCGA     CATTGTTGAATGTCAAA   ACGAGGTGTTAG
marmicF sp    GACCGA     TGCCACTGAATGCTAAA   TCGAGGTGTTAG
micFmar sp    GACCGA     CAGCACGTTTTGTCAAA   ACGAGGTGTTAG
mar-bs        GACCGA     TGCTAAAGTTTTGCCAC   TCGAGGTGTTAG
micF-bs       GACCGA     TGTCAAATGAACAGCAC   ACGAGGTGTTAG