Mutations arise at random in the genome, but the rates of different mutations are not necessarily the same and the process can be reasonably modeled by a matrix of rates going from each state to any other state. Furthermore, mutations are filtered by selection. While providing the raw material for evolution. the evolutionary rate matrix is not necessarily equal to the mutation matrix. A common way to model sequence evolution is via a general time reversible model, by which one means the most general stochastic (probability conserving) matrix which is time reversible (known in physics as detailed balance condition). Let's denote the transition matrix by \(Q_{ij}\). It has \(n^2\) parameters, but we require probability conservation
$$ \sum_i Q_{ij} = 0 \quad \Rightarrow \quad Q_{ii} = - \sum_{k\neq i} Q_{ki}$$
and detailed balance
$$ Q{ij}\pi_j = Q_{ji}\pi_i$$
where \(\pi\) are the equilibrium frequencies given by the right eigenvector of \(Q_{ij}\pi_j\) with eigenvalue 0. Hence we have \(n(n-1)/2\) constraints on the matrix \(Q_{ij}\) and are left with \(n(n-1)/2\) free parameters. It is convienient to decompose \(Q_{ij}\) into a symmetric matrix \(W_{ij}\) and the equilbrium frequencies
$$Q_{ij} = \pi_i W_{ij} $$
which automatically satisfies the detailed balance condition. This is readily seen in this little numerical example
import numpy as np
W = np.array([[0, 1, 2],[1, 0, 1], [2, 1, 0]], dtype=float)
p = np.array([0.5, 0.3, 0.2])
tmpQ = np.diag(p).dot(W) #\delta_{ik}p_i W_{kj}
np.fill_diagonal(W, -np.sum(tmpQ, axis=0)/p)
Q = np.diag(p).dot(W)
# this is symmetric!
print(Q.dot(np.diag(p))-(Q.dot(np.diag(p)).T))
To solve for the time evolution, we need an eigen decomposition of the model. The largest eigenvalue of the matrix is zero and the corresponding right eigenvector is the vector of equilbrium concentrations.
rev, V = np.linalg.eig(Q)
lev, levec = np.linalg.eig(Q.T)
print(V[:,0]/np.sum(np.abs(V[:,0]))) #equals p
print(levec[:,0]/np.sum(np.abs(levec[:,0]))) #unit vector
v_inv = np.linalg.inv(V)
print(Q - revec.dot(np.diag(rev).dot(v_inv)))
Now the probability of transitioning from one sequence to another is given by
$$ P(a | b, t) = exp(Q_{ij}t){a,b} = \sum_k V$$}e^{\lambda_k t} V^{-1}_{kb
where \(U\) is the matrix of right eigenvectors. For large \(t\), the goes towards the stationary distribution corresponding to eigenvalue 0.
The diagonalized matrix can be written as
$$ Q_{ij} = \pi_i W_{ij} = \sum_k V_{ik} \lambda_k V^{-1}_{kj} $$
Dividing by \(\pi_i\) we find
$$ W_{ij} = \sum_k V_{ik}/\pi_i \lambda_k V^{-1}_{kj} $$
Similar, we can define the symmetric propagator
$$ G(a,b|t) = sum_k V_{ak}/\pi_ae^{\lambda_k t} V^{-1}_{kb} $$
that connects two points in a tree.