## How do transcription factors find their binding sites?

In 1970, Riggs et al. noted that the rate at which the lac repressor found its target on DNA in vitro was about 1000-fold higher than seemed to be allowed by 3D diffusion. This conundrum has occupied quantitative biologists for a long time, and a number of elegant solutions have been formulated, foremost through the pioneering work of von Hippel and Berg. The problem of target location touches on many themes that we discussed over the last few weeks.

### Search in three dimensions and diffusion limited rates

How do molecules find each other by diffusion, i.e., what determines the rate of diffusion-limited reactions? In an enzymatic reaction, the total production rate per unit volume is

$$ \kappa [S][E] \quad \mathrm{with\, units}\quad \frac{stuff}{volume\times time}$$

where \(\kappa\) is the rate constant and [E] and [S] are the concentrations of enzyme and substrate. On what quantities will the rate constant \(\kappa\) depend? Obvious contenders are

- the relative diffusion constant \(D_{3D}\) with units \(length^2/time\)
- the size of the reactants \(r\) with units \(length\)

Hence we expect a scaling

$$ \kappa \sim D_{3D} r \quad [length^3/time] $$

A more careful calculation reveals that, for spherical particles with radii \(r_A\) and \(r_B\) and diffusion constants \(D_A\) and \(D_B\), the diffusion-limited reaction rate is

$$ \kappa = 4 \pi (D_A+D_B)(r_A + r_B) $$

One way to understand this relation is to think of a reaction area \(4\pi (r_A + r_B)^2\) and a diffusive flux through that area with magnitude \((D_A+D_B)/(r_A + r_B)\) per unit concentration, since the gradient of the concentration profile around an absorbing sphere is inversely proportional to its radius. This radius should not be interpreted as the radius of the molecules, but as the range of their interaction.

In practice, reactions require that the reactants are oriented at particular angles to each other, which reduces the rates below the diffusion limit. This effect is partly counterbalanced by the unspecific electrostatic attraction between the molecules.

Applying this reasoning to the problem of transcription factor/DNA association, we can ignore the diffusion of DNA (it is a large, slow molecule). Furthermore, the reaction radius should be of the same order as the distance between base pairs, since TFs recognize specific DNA sequences and hence have to be in register with the DNA to within about 0.3 nm. Together with an in vitro diffusion constant of a transcription factor of about \(100\mu m^2/s\), this results in the rate estimate

$$ \kappa_{D} = 4\pi \times 100 \times 3\times 10^{-4} \mu m^3/s \approx 0.4 \mu m^3/s \approx 2\times 10^8 M^{-1}s^{-1} $$
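The arithmetic and the unit conversion in this estimate can be checked with a few lines of Python. This is a minimal sketch: the function names are made up for illustration, and the only inputs are the values quoted above (\(D = 100\mu m^2/s\), reaction radius \(0.3\,nm\)) plus Avogadro's number.

```python
import math

AVOGADRO = 6.022e23      # molecules per mole
LITERS_PER_UM3 = 1e-15   # 1 um^3 = 1e-15 liters

def smoluchowski_rate(d_um2_per_s, radius_um):
    """Diffusion-limited rate constant 4*pi*D*R in um^3/s (hypothetical helper)."""
    return 4.0 * math.pi * d_um2_per_s * radius_um

def to_per_molar_per_second(kappa_um3_per_s):
    """Convert a rate constant from um^3/s to 1/(M s) (hypothetical helper)."""
    return kappa_um3_per_s * LITERS_PER_UM3 * AVOGADRO

# Values from the text: D = 100 um^2/s, reaction radius 0.3 nm = 3e-4 um
kappa = smoluchowski_rate(100.0, 3e-4)
print(f"kappa = {kappa:.2f} um^3/s")
print(f"kappa = {to_per_molar_per_second(kappa):.1e} 1/(M s)")
```

Because \(\kappa\) is linear in both \(D\) and the reaction radius, a tenfold larger interaction range would raise the estimate by the same factor.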

This might be a little high since we have not accounted for orientation, but the measured association rates are about 100-fold higher. Clearly, 3D diffusion alone cannot account for this association rate, and some additional mechanism has to be at play.

#### Excursus: perfectly absorbing sphere

To calculate the rate at which two particles hit each other, it is useful to note the following simplifications: the only quantity that matters is the separation \(\vec{r}\) between them, and this separation diffuses with the combined diffusion constant \(D=D_A+D_B\). Absorption on contact then imposes the boundary condition \(P(|\vec{r}|=R)=0\) with \(R=r_A+r_B\).

We are looking for an isotropic, stationary solution with constant flux through every spherical shell, which requires

$$ r^2 D \frac{\partial P(r)}{\partial r} = C $$

with solution \(P(r) = P_\infty(1-\frac{R}{r}) \), where \(P_\infty\) is the concentration far from the sphere. This confirms our reasoning above that the gradient of the concentration profile at the surface is \(\sim (r_A+r_B)^{-1}\).
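The step from this profile to the rate constant can be written out explicitly. In this sketch, \(P_\infty\) denotes the concentration far from the sphere (the prefactor of the solution above) and \(J\) the total flux into the sphere; both symbols are introduced here for illustration. Integrating the constant-flux condition and applying the boundary condition gives

$$ P(r) = P_\infty - \frac{C}{D r}, \qquad P(R) = 0 \;\Rightarrow\; C = D R P_\infty $$

and the total flux through the absorbing surface is

$$ J = 4\pi R^2 D \left.\frac{\partial P}{\partial r}\right|_{r=R} = 4\pi D R P_\infty $$

Dividing by \(P_\infty\) recovers \(\kappa = 4\pi (D_A+D_B)(r_A+r_B)\).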

### Search in one dimension

Early on, Riggs et al. suggested that the massive speed-up in the association rate might be due to sliding of the TF along the DNA until it finds the specific target site. Once the TF has found the DNA, it binds unspecifically and "scans" the DNA for the right target site. Since this scan is undirected and diffusive, the typical length of DNA that the TF has scanned grows as \(l \approx \sqrt{2D_{1D} t}\) with the time \(t\) since association.

This recapitulates what we discussed earlier: diffusive search in one dimension is fast on short time scales but ends up revisiting the same places as time goes on. 3D diffusion, on the other hand, is unlikely to revisit the same place because reassociation on a random DNA coil will typically transfer the TF to a completely new section of the genome.
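The redundancy of 1D search can be illustrated with a small simulation. The sketch below assumes the simplest possible model, an unbiased nearest-neighbor random walk on a lattice of base pairs; the function names and the choice of 200 walks are illustrative, not taken from any measurement.

```python
import random

def distinct_sites_visited(n_steps, rng):
    """Number of distinct lattice sites touched by an unbiased 1D walk."""
    position = 0
    lo = hi = 0  # in 1D the visited sites form the interval [lo, hi]
    for _ in range(n_steps):
        position += rng.choice((-1, 1))
        lo = min(lo, position)
        hi = max(hi, position)
    return hi - lo + 1

def mean_distinct_sites(n_steps, n_walks=200, seed=0):
    """Average number of distinct sites over several independent walks."""
    rng = random.Random(seed)
    total = sum(distinct_sites_visited(n_steps, rng) for _ in range(n_walks))
    return total / n_walks

# Quadrupling the number of steps should roughly double the scanned length
for n in (100, 400, 1600):
    print(n, mean_distinct_sites(n))
```

The number of new sites per step falls off as the walk proceeds, which is the sense in which long stretches of 1D sliding become wasteful.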

Diffusion of transcription factors on DNA has been measured using single molecule techniques (e.g., by fluorescently labeling the TF). van Oijen et al found a diffusion constant of

$$D_{1D} \approx 10^5-10^6 bp^2/s = 0.01-0.1\mu m^2/s $$

hence about a factor of \(10^3\)-\(10^4\) smaller than the free 3D diffusion constant of \(100\mu m^2/s\) used above. Nonetheless, searching in one dimension greatly accelerates the search.
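The unit conversion behind these numbers, and the resulting sliding length, can be sketched as follows. The rise of 0.34 nm per base pair is the standard B-DNA value and is an assumption here (the text above uses 0.3 nm as a round number); the 10 ms residence time is an arbitrary illustrative value, not a measurement.

```python
import math

NM_PER_BP = 0.34  # assumed rise per base pair of B-DNA, in nm

def bp2_to_um2(d_bp2_per_s):
    """Convert a 1D diffusion constant from bp^2/s to um^2/s."""
    um_per_bp = NM_PER_BP * 1e-3
    return d_bp2_per_s * um_per_bp ** 2

def sliding_length_bp(d_bp2_per_s, t_seconds):
    """Typical scanned length l = sqrt(2 * D_1D * t), in base pairs."""
    return math.sqrt(2.0 * d_bp2_per_s * t_seconds)

for d in (1e5, 1e6):
    print(f"D_1D = {d:.0e} bp^2/s = {bp2_to_um2(d):.3f} um^2/s, "
          f"l(10 ms) = {sliding_length_bp(d, 0.01):.0f} bp")
```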

### Optimal combined search

If combined 1D/3D diffusion was the mechanism by which TFs find their target, how should they be dividing their time between 3D and 1D search? Following Mirny et al, the total time until the target is found can be expressed as the sum over multiple rounds of 3D/1D search

$$ t_s = \sum_{i=1}^K (\tau_{1D,i} + \tau_{3D,i}) $$

and the typical number of rounds necessary would be \(\bar{K} = L/l \), where \(L\) is the length of the genome and \(l\) is the length searched in a single round. Since \(l\sim \sqrt{2D_{1D} \tau_{1D}}\), we obtain for the average search time

$$ t_s = \frac{L}{\sqrt{2D_{1D} \tau_{1D}} }(\tau_{1D}+\tau_{3D}) $$

The search time is minimal when

$$ \frac{d t_s}{d \tau_{1D}} = \frac{L}{2\sqrt{2D_{1D}}}(\tau_{1D}^{-1/2}-\tau_{3D}\tau_{1D}^{-3/2}) = 0$$

which requires \(\tau_{1D} = \tau_{3D}\), i.e., the TF should spend equal times on the DNA and in solution. The mean search time is therefore

$$ t_s = L\sqrt{\frac{2\tau_{3D}}{D_{1D}}} $$
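The optimum can also be checked numerically by scanning \(\tau_{1D}\) on a grid. In the sketch below, the parameter values are illustrative assumptions only: a genome of \(4.6\times 10^6\) bp (roughly the size of the E. coli genome), \(D_{1D} = 10^6\, bp^2/s\) from the range quoted above, and a hypothetical \(\tau_{3D}\) of 1 ms. The function names are likewise made up for this example.

```python
import math

def search_time(tau_1d, tau_3d, d_1d, genome_length):
    """Mean search time t_s = L / sqrt(2 D_1D tau_1D) * (tau_1D + tau_3D)."""
    n_rounds = genome_length / math.sqrt(2.0 * d_1d * tau_1d)
    return n_rounds * (tau_1d + tau_3d)

def best_tau_1d(tau_3d, d_1d, genome_length, n_grid=2001):
    """Grid search for the tau_1D that minimizes the search time."""
    # Log-spaced candidates from tau_3d / 100 to tau_3d * 100
    candidates = [tau_3d * 10 ** (-2 + 4 * i / (n_grid - 1)) for i in range(n_grid)]
    return min(candidates, key=lambda t: search_time(t, tau_3d, d_1d, genome_length))

# Illustrative, assumed parameters (not measurements)
L_BP, D_1D, TAU_3D = 4.6e6, 1e6, 1e-3

tau_opt = best_tau_1d(TAU_3D, D_1D, L_BP)
t_numeric = search_time(tau_opt, TAU_3D, D_1D, L_BP)
t_formula = L_BP * math.sqrt(2.0 * TAU_3D / D_1D)
print(f"optimal tau_1D / tau_3D = {tau_opt / TAU_3D:.3f}")
print(f"search time: numeric {t_numeric:.1f} s, formula {t_formula:.1f} s")
```

The minimum is shallow: because \(t_s \propto \tau_{1D}^{1/2} + \tau_{3D}\tau_{1D}^{-1/2}\), changing \(\tau_{1D}\) by a factor of two in either direction increases the search time by only about 6%.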

#### Why is there an optimum?

One-dimensional diffusion alone would be extremely inefficient since it takes a very long time to scan the entire genome. Very short \(\tau_{1D}\), on the other hand, corresponds to very limited scanning of bases and, in the limit \(\tau_{1D}\to 0\), to pure 3D search. Hence it is plausible that an optimum should exist.

### Intersegment transfer

The 1D diffusion mechanism suggests that transcription factors stay close to DNA by unspecific electrostatic interactions. In addition to 3D diffusion, there is another way by which the TF can explore DNA sequence space.

In the exercise session last week, you should have found that the probability for a diffusive walker in 3D to return to the origin after \(s\) steps is proportional to \(s^{-3/2}\). In other words, most contacts between two DNA sites in 3D involve sequences in close proximity along the DNA, but very long-range jumps exist.
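The origin of the exponent can be made explicit with the free-space solution of the diffusion equation. As a sketch, assume the DNA is an ideal random coil with segment length \(b\) (a symbol introduced here for illustration), so that the separation \(\vec{r}\) of two sites a contour distance \(s\) apart is Gaussian distributed:

$$ P(\vec{r}, s) = \left(\frac{3}{2\pi s b^2}\right)^{3/2} \exp\left(-\frac{3 r^2}{2 s b^2}\right) $$

The contact probability is the value at \(r \to 0\), where the exponential equals one and only the normalization \(\sim s^{-3/2}\) remains.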

## Assignments

- The efficiency of intersegment transfer depends on the distribution of long range contacts among sites on the genome. For a random coil in 3D, this is \(\sim s^{-3/2}\). In a bacterium, DNA is compressed and has a denser conformation than the typical random coil. Does this make intersegment transfer more or less efficient?
- How does the distribution of distances between self-contacts behave? Analyze it qualitatively using the properties of solutions to the diffusion equation in a closed compartment.