genetics - Do transcription factors bind to both strands of DNA?

Thursday, 26 September 2019

Do transcription factors (or proteins generally) bind to only a single strand of DNA, or to both strands? In theory a protein could form non-covalent bonds with both strands. I would like to know the mechanism. Any reference books, papers or links would be helpful.

Answer

The short summary is that typical TFs bind and read both strands together, as a sequence of basepairs. Some proteins instead recognise a site on the helix by its shape and flexibility. ssDNA-binding proteins bind one strand, but they generally do this in a non-specific manner. RNA-binding proteins recognise the sequence on a single strand by inserting planar residues between the bases. All of this binding is non-covalent.

Transcription factors recognise sites in dsDNA using their DNA-binding domains. The rest of the protein may partially surround the negatively charged outer surface of the double helix with a positively charged surface, which holds the protein on the DNA as it (perhaps) scans along its length.

DNA-binding domains: major groove

The DNA-binding domains found in many transcription factors all recognise both strands. More correctly, they recognise basepairs and their orientation. The first 5 pages of this lecture slideshow demonstrate that the chemical groups on the edges of the basepairs, accessible in the major groove, allow proteins to distinguish A:T, T:A, C:G and G:C by the order of hydrogen-bond donors, acceptors, and a methyl group.

Hence, TFs recognise a sequence of basepairs, oriented such that one strand reads (e.g.) pTpCpApG and the complementary strand reads pCpTpGpA. The bulk of the protein may 'sit' on one strand or the other, and a nearby gene may locally define one strand as the coding strand, but neither of these means that only that one strand is read.
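To make the orientation point concrete, here is a minimal Python sketch. It is not from any of the linked material: the helper names `reverse_complement` and `find_site`, and the test sequence, are made up for illustration. It shows that pTpCpApG on one strand and pCpTpGpA on the other describe the same stretch of basepairs, so a search for a site has to consider both written forms.

```python
# Watson-Crick partner of each base.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq):
    """Return the complementary strand, written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def find_site(top_strand, site):
    """Find a site in dsDNA given only its top strand (hypothetical helper).

    Returns (position, orientation) pairs, where position is a 0-based
    coordinate on the top strand and orientation is '+' if the site reads
    5'->3' on the top strand or '-' if it reads 5'->3' on the bottom strand.
    """
    top_strand = top_strand.upper()
    hits = []
    for label, pattern in (("+", site.upper()), ("-", reverse_complement(site))):
        start = top_strand.find(pattern)
        while start != -1:
            hits.append((start, label))
            start = top_strand.find(pattern, start + 1)
    return sorted(hits)


print(reverse_complement("TCAG"))            # CTGA
print(find_site("GGTCAGAACTGAGG", "TCAG"))   # [(2, '+'), (8, '-')]
```

Both hits are the same kind of double-stranded site; only the orientation of the basepairs along the helix differs.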

These are common domains that all recognise basepairs in the major groove through interactions with residues on a probing alpha-helix.

TATA-binding protein: minor groove

TATA-binding protein (TBP) is a different, interesting case. It binds the 'TATA-box' via the minor groove, where the exposed chemical groups only distinguish [A/T] from [C/G], but not their orientation. This means that the sequence on each strand cannot easily be read from the minor groove. TBP instead recognises the shape and flexibility of the double helix at the TATA-box, 'grips' it by the minor groove and bends the DNA, which aids the melting of the strands to form the transcription 'bubble'.
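The difference between the two grooves can be sketched in a few lines of Python. The pattern strings below are the simplified textbook encoding of the groups exposed on the edge of each basepair. They are an assumption added here, not taken from the slides linked above, and the function name is made up for illustration.

```python
# A = H-bond acceptor, D = H-bond donor, H = non-polar hydrogen, M = methyl.
# Each pattern is read across the groove for a basepair in a fixed orientation.
# Simplified textbook encoding, included here as an assumption.
MAJOR_GROOVE = {"A:T": "ADAM", "T:A": "MADA", "G:C": "AADH", "C:G": "HDAA"}
MINOR_GROOVE = {"A:T": "AHA", "T:A": "AHA", "G:C": "ADA", "C:G": "ADA"}


def distinguishable_classes(groove):
    """Group the basepairs that present the same pattern and so look alike."""
    classes = {}
    for basepair, pattern in groove.items():
        classes.setdefault(pattern, []).append(basepair)
    return sorted(classes.values())


print(distinguishable_classes(MAJOR_GROOVE))
# [['A:T'], ['C:G'], ['G:C'], ['T:A']]
print(distinguishable_classes(MINOR_GROOVE))
# [['A:T', 'T:A'], ['G:C', 'C:G']]
```

In this encoding the major groove gives four distinct patterns, one per oriented basepair, while the minor groove collapses to two classes, which is why a minor-groove binder such as TBP cannot simply read off the sequence.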

The TATA-box sequence is usually written pTpApTpApApA, on the coding strand upstream of the transcription start site. Quoting one strand is the convention when giving the sequence of a TF-binding site, but you couldn't say that TBP actually reads TATAAA: it doesn't!

Here is another, similar set of lecture slides.

Even better, here is the same material covered in a popular textbook.