Richard Neher
Biozentrum, University of Basel

  • rely on host to replicate
  • little more than genome + capsid
  • genomes typically 5-200k bases (+exceptions)
  • most abundant organisms on earth $\sim 10^{31}$
Phylogenetic analysis
Evolution of HIV

  • Chimp → human transmission around 1900 gave rise to HIV-1 group M
  • ~100 million infected people since
  • subtypes differ at 10-20% of their genome
  • HIV-1 evolves ~0.1% per year
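
As a rough consistency check on the numbers above, assume a constant molecular clock and ignore saturation (both are simplifying assumptions, not claims from the slides). Two lineages that split a time $t$ ago each accumulate changes at rate $\mu$, so their expected divergence $d$ is about

$$d \approx 2\mu t = 2 \times 0.001\,\mathrm{yr}^{-1} \times 100\,\mathrm{yr} = 0.2$$

The symbols $\mu$, $t$, and $d$ are introduced here for illustration only. Taking $t \approx 100$ years as an upper bound (lineages cannot have split before the origin around 1900) gives at most ~20%, in line with the 10-20% difference between subtypes.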
HIV infection

  • $10^8$ cells are infected every day
  • the virus repeatedly escapes immune recognition
  • integrates into T-cells as latent provirus
HIV-1 evolution within one individual

HIV-1 sequencing before and after therapy

Population sequencing to track all mutations above 1%
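
A minimal sketch of what tracking mutations above 1% could look like in code. The input format (one string of observed bases per genome position) and the function name `variants_above_threshold` are hypothetical, chosen for illustration; this is not the pipeline used in the cited study, and real data also needs error modelling.

```python
from collections import Counter


def variants_above_threshold(columns, min_freq=0.01):
    """Return {position: {base: frequency}} for minority bases at or above min_freq.

    columns: iterable of strings, one per genome position, each holding the
    bases observed in the reads covering that position (hypothetical format).
    """
    result = {}
    for pos, column in enumerate(columns):
        counts = Counter(column)
        depth = sum(counts.values())
        if depth == 0:
            continue  # no coverage at this position
        majority = counts.most_common(1)[0][0]
        minor = {
            base: n / depth
            for base, n in counts.items()
            if base != majority and n / depth >= min_freq
        }
        if minor:
            result[pos] = minor
    return result


# Toy data: position 1 carries a 2% variant, position 2 a 0.5% variant.
columns = ["A" * 1000, "A" * 980 + "G" * 20, "C" * 995 + "T" * 5]
calls = variants_above_threshold(columns)
print(calls)
```

Only the 2% variant at position 1 passes the 1% cutoff; the 0.5% variant at position 2 is dropped.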

Zanini et al, eLife, 2015; antibody data from Richman et al, 2003

Evolution in different parts of the genome

  • envelope changes fastest, enzymes slowest
  • identical rate of synonymous evolution
  • diversity saturates where evolution is fast
  • synonymous mutations stay at low frequency
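
For reference, a synonymous mutation changes a codon without changing the encoded amino acid. The sketch below classifies a single codon change using the standard genetic code; the helper names `translate` and `is_synonymous` are illustrative, not taken from any tool mentioned in the talk.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(codon):
    """Translate one DNA codon to its one-letter amino acid ('*' marks stop)."""
    return CODON_TABLE[codon.upper()]


def is_synonymous(codon_before, codon_after):
    """True if both codons encode the same amino acid."""
    return translate(codon_before) == translate(codon_after)


print(is_synonymous("CTT", "CTC"))  # both codons encode leucine
print(is_synonymous("ATG", "ATA"))  # methionine versus isoleucine
```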
Does HIV evolve during therapy?

No evidence of ongoing replication


Influenza A/H3N2 virus evolution

Human seasonal influenza viruses

  • Influenza virus evolves to avoid human immunity
  • Vaccines need frequent updates


Fitness variation in rapidly adapting populations

strong selection

Bursts in a tree ↔ high fitness genotypes

Can we read fitness off a tree?

Predicting evolution

Given the branching pattern:

  • can we predict fitness?
  • pick the closest relative of the future?
Fitness inference from trees

$$P(\mathbf{x}|T) = \frac{1}{Z(T)} p_0(x_0) \prod_{i=0}^{n_{int}} g(x_{i_1}, t_{i_1}| x_i, t_i)g(x_{i_2}, t_{i_2}| x_i, t_i)$$
Validate on simulation data

  • simulate evolution
  • sample sequences
  • reconstruct trees
  • infer fitness
  • predict ancestor of future
  • compare to truth
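
The validation loop above can be illustrated with a toy Wright-Fisher simulation. This is a heavily simplified sketch under stated assumptions: there are no sequences, and the tree reconstruction and fitness inference steps are skipped, so the true fitness at sampling time stands in for an inferred one. All names and parameter values (`wright_fisher_step`, `simulate_and_score`, `mut_sd`, `burn_in`, `horizon`) are hypothetical.

```python
import math
import random


def wright_fisher_step(fitness, labels, rng, mut_sd=0.01):
    """One generation: parents are drawn with probability proportional to exp(fitness)."""
    n = len(fitness)
    weights = [math.exp(f) for f in fitness]
    parents = rng.choices(range(n), weights=weights, k=n)
    # Offspring inherit parental fitness plus a small random mutational effect.
    new_fitness = [fitness[p] + rng.gauss(0.0, mut_sd) for p in parents]
    new_labels = [labels[p] for p in parents]
    return new_fitness, new_labels


def simulate_and_score(n=200, burn_in=50, horizon=30, seed=1):
    """Simulate, 'sample', predict the fittest individual, and compare to the future."""
    rng = random.Random(seed)
    fitness = [0.0] * n
    labels = list(range(n))
    for _ in range(burn_in):  # build up fitness variation
        fitness, labels = wright_fisher_step(fitness, labels, rng)

    # Sampling time: record fitness and relabel each individual as its own lineage.
    sampled_fitness = fitness[:]
    labels = list(range(n))
    for _ in range(horizon):  # evolve into the "future"
        fitness, labels = wright_fisher_step(fitness, labels, rng)

    # Truth: fraction of the future population descending from each sampled individual.
    share = [0.0] * n
    for label in labels:
        share[label] += 1.0 / n

    predicted = max(range(n), key=lambda i: sampled_fitness[i])
    return share[predicted], 1.0 / n, share


share_pred, baseline, share = simulate_and_score()
print(f"future share of predicted individual: {share_pred:.3f}")
print(f"expected share of a random pick:      {baseline:.3f}")
```

Comparing the predicted individual's future share with the random-pick baseline, averaged over many seeds, is one simple way to score a predictor against the simulated truth.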
Validation on simulated data

Prediction of the dominating H3N2 influenza strain

  • no influenza specific input
  • how can the model be improved? (see model by Luksza & Laessig)
  • in what other contexts might this apply?
  • integrate data from many different sources
  • analyze those data in near real time
  • disseminate results in an intuitive yet informative way
  • provide actionable insights


