Virus evolution and the predictability of next year's flu
Richard Neher
Biozentrum, University of Basel
slides at
tobacco mosaic virus
(Thomas Splettstoesser, wikipedia)
bacteriophage
(adenosine, wikipedia)
influenza virus
human immunodeficiency virus
- rely on host to replicate
- little more than genome + capsid
- genomes typically 5-200k bases (+exceptions)
- most abundant organisms on earth $\sim 10^{31}$
Evolution of HIV
- Chimp → human transmission around 1900 gave rise to HIV-1 group M
- ~100 million infected people since
- subtypes differ at 10-20% of their genome
- HIV-1 evolves ~0.1% per year
image: Sharp and Hahn, CSH Persp. Med.
HIV infection
- $10^8$ cells are infected every day
- the virus repeatedly escapes immune recognition
- integrates into T-cells as latent provirus
image: wikipedia
HIV-1 evolution within one individual
silhouette:, Zanini et al, 2015. Collaboration with Jan Albert and his group
HIV acknowledgments
- Fabio Zanini
- Jan Albert
- Johanna Brodin
- Christa Lanz
- Göran Bratt
- Lina Thebo
- Vadim Puller
Population sequencing to track all mutations above 1%
- diverge at 0.1-1% per year
- almost whole genome coverage in 10 patients
- full data set at
Zanini et al, eLife, 2015; antibody data from Richman et al, 2003
Diversity and hitchhiking
- envelope changes fastest, enzymes slowest
- identical rate of synonymous evolution
- diversity saturates where evolution is fast
- synonymous mutations stay at low frequency
Zanini et al, eLife, 2015
Mutation rates and diversity at neutral sites
Zanini et al, Virus Evolution, 2017
Inference of fitness costs
- mutation away from preferred state at rate $\mu$
- selection against non-preferred state with strength $s$
- variant frequency dynamics: $\frac{d x}{dt} = \mu -s x $
- equilibrium frequency: $\bar{x} = \mu/s $
- fitness cost: $s = \mu/\bar{x}$
Zanini et al, Virus Evolution, 2017
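The mutation-selection balance above can be checked numerically. The sketch below uses the closed-form solution of $\frac{dx}{dt} = \mu - s x$, namely $x(t) = \mu/s + (x_0 - \mu/s)e^{-st}$, and then recovers $s$ from the late-time frequency. The numerical values of `mu` and `s` are illustrative assumptions, not estimates from the paper.

```python
import math

def frequency(t, x0, mu, s):
    """Closed-form solution of dx/dt = mu - s*x with x(0) = x0."""
    x_eq = mu / s  # equilibrium frequency, x_bar = mu / s
    return x_eq + (x0 - x_eq) * math.exp(-s * t)

def fitness_cost(mu, x_bar):
    """Invert the equilibrium relation: s = mu / x_bar."""
    return mu / x_bar

# Illustrative values (assumptions): rates per site per day
mu = 1e-5
s = 1e-2

# Start from a population without the variant and let it relax
x_late = frequency(t=2000.0, x0=0.0, mu=mu, s=s)
s_hat = fitness_cost(mu, x_late)

print(f"late-time frequency: {x_late:.3e}")
print(f"inferred fitness cost: {s_hat:.3e}")
```

The relaxation time is $1/s$, so the estimate is only reliable once the frequency has been observed for much longer than that.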
Inference of fitness costs
- Frequencies of costly mutations decorrelate fast $\frac{d x}{dt} = \mu -s x $
- $\Rightarrow$ average many samples to obtain accurate estimates
- Assumption: The global consensus is the preferred state
- Only use sites that initially agree with consensus
- Only use sites that don't change majority nucleotide
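A minimal sketch of the averaging and filtering steps listed above. The data layout (`freqs[k][i]` as a dict from nucleotide to frequency in sample `k` at site `i`), the function name, and the single mutation rate `mu` for all non-consensus states are simplifying assumptions for illustration. This is not the pipeline of the cited paper.

```python
def estimate_fitness_costs(freqs, consensus, mu):
    """Estimate s = mu / <x> per site from several samples.

    freqs[k][i]  : dict nucleotide -> frequency, sample k, site i (hypothetical layout)
    consensus[i] : global consensus nucleotide, assumed to be the preferred state
    mu           : assumed mutation rate away from the preferred state
    """
    costs = {}
    for i, preferred in enumerate(consensus):
        majorities = [max(sample[i], key=sample[i].get) for sample in freqs]
        # Keep sites that start at the consensus and never change majority
        # nucleotide, i.e. the majority is the consensus in every sample.
        if any(m != preferred for m in majorities):
            continue
        # Average the non-preferred frequency over all samples
        x_bar = sum(1.0 - sample[i][preferred] for sample in freqs) / len(freqs)
        costs[i] = mu / x_bar if x_bar > 0 else float("inf")
    return costs

# Toy data: 3 samples, 3 sites
toy_freqs = [
    [{"A": 0.99, "G": 0.01}, {"T": 0.90, "C": 0.10}, {"G": 0.95, "A": 0.05}],
    [{"A": 0.98, "G": 0.02}, {"T": 0.80, "C": 0.20}, {"G": 0.70, "A": 0.30}],
    [{"A": 0.99, "G": 0.01}, {"T": 0.85, "C": 0.15}, {"G": 0.40, "A": 0.60}],
]
toy_consensus = ["A", "C", "G"]
toy_costs = estimate_fitness_costs(toy_freqs, toy_consensus, mu=1e-5)
print(toy_costs)
```

In the toy data, site 1 is dropped because it does not start at the consensus, and site 2 is dropped because its majority nucleotide changes in the last sample.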
Fitness landscape of HIV-1
Zanini et al, Virus Evolution, 2017
Selection on RNA structures and regulatory sites
Zanini et al, Virus Evolution, 2017
Theoretical framework for virus evolution -- population genetics
evolutionary processes ↔ trees ↔ genetic diversity
Neutral models and beyond
Neutral models
- all individuals are identical → same offspring distribution
- Kingman coalescent and diffusion theory are dual descriptions
- everything is easy to calculate
- perturbations like background selection can be included
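As an illustration of how simple neutral calculations are: in the Kingman coalescent, $k$ lineages merge pairwise at total rate $k(k-1)/2$ (time in units of the coalescent time scale), so the expected time to the most recent common ancestor of $n$ samples is $2(1 - 1/n)$. The sketch below compares a simulation with this formula. The sample size, replicate count, and seed are arbitrary choices.

```python
import random

def kingman_tmrca(n, rng):
    """Draw one time to the most recent common ancestor for n lineages."""
    t = 0.0
    for k in range(n, 1, -1):
        rate = k * (k - 1) / 2      # number of pairs among k lineages
        t += rng.expovariate(rate)  # waiting time to the next pairwise merger
    return t

def expected_tmrca(n):
    """Analytic expectation: sum over k of 2 / (k (k - 1)) = 2 (1 - 1/n)."""
    return 2.0 * (1.0 - 1.0 / n)

rng = random.Random(1)  # fixed seed so the run is reproducible
n = 10
sims = [kingman_tmrca(n, rng) for _ in range(20000)]
mean_sim = sum(sims) / len(sims)

print(f"simulated mean T_MRCA: {mean_sim:.3f}")
print(f"analytic mean T_MRCA:  {expected_tmrca(n):.3f}")
```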
But: neutral models not suitable for RNA viruses!
Influenza and Theory acknowledgments
- Boris Shraiman
- Colin Russell
- Trevor Bedford
- Oskar Hallatschek
Clonal interference and traveling waves
RN, Annual Reviews, 2013; Desai & Fisher; Brunet & Derrida; Kessler & Levine
Neutral/Kingman coalescent
strong selection
Bolthausen-Sznitman Coalescent
RN, Hallatschek, PNAS, 2013; see also Brunet and Derrida, PRE, 2007
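To make the contrast with Kingman concrete: the Bolthausen-Sznitman coalescent allows multiple mergers. In its standard definition, a specific set of $k$ out of $b$ lineages merges at rate $\lambda_{b,k} = \frac{(k-2)!\,(b-k)!}{(b-1)!}$. The slides do not state these rates; the sketch below uses this textbook formula to tabulate how often a merger event involves $k$ lineages. Function names are illustrative.

```python
from math import comb, factorial

def bs_merger_rate(b, k):
    """Rate at which one specific set of k out of b lineages merges."""
    return factorial(k - 2) * factorial(b - k) / factorial(b - 1)

def merger_size_distribution(b):
    """Probability that the next merger among b lineages involves k of them."""
    totals = {k: comb(b, k) * bs_merger_rate(b, k) for k in range(2, b + 1)}
    total_rate = sum(totals.values())  # equals b - 1 for this coalescent
    dist = {k: r / total_rate for k, r in totals.items()}
    return dist, total_rate

dist, total_rate = merger_size_distribution(10)
print(f"total merger rate for 10 lineages: {total_rate:.3f}")
for k, p in dist.items():
    print(f"  merger of {k:2d} lineages: probability {p:.3f}")
```

In the Kingman coalescent all probability sits at $k = 2$. Here a sizeable fraction of events merge three or more lineages at once, which is the tree-level signature of the bursts associated with high fitness genotypes.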
Bursts in a tree ↔ high fitness genotypes
Can we read fitness off a tree?
- Influenza virus evolves to avoid human immunity
- Vaccines need frequent updates
Predicting evolution
Given the branching pattern:
- can we predict fitness?
- pick the closest relative of the future?
RN, Russell, Shraiman, eLife, 2014
Fitness inference from trees
$$P(\mathbf{x}|T) = \frac{1}{Z(T)} p_0(x_0) \prod_{i=0}^{n_{int}} g(x_{i_1}, t_{i_1}| x_i, t_i)g(x_{i_2}, t_{i_2}| x_i, t_i)$$
RN, Russell, Shraiman, eLife, 2014
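The factorization above is a product over internal nodes of the propagators to their two children. The sketch below evaluates the unnormalized log of that product for a given assignment of fitness values $x$ to the nodes of a small tree. The Gaussian propagator, the standard normal $p_0$, the diffusion constant `D`, and the dict-based tree layout are placeholder assumptions for illustration. They are not the propagator derived in the cited paper, and the normalization $Z(T)$ is omitted.

```python
import math

def log_gauss(x, mean, var):
    """Log density of a normal distribution."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)

def log_propagator(x_child, t_child, x_parent, t_parent, D=1.0):
    """Placeholder g: fitness diffuses along a branch (assumption)."""
    return log_gauss(x_child, x_parent, 2 * D * (t_child - t_parent))

def log_unnormalized_posterior(children, times, x, root):
    """log of p_0(x_0) * prod_i g(child 1 | i) * g(child 2 | i)."""
    logp = log_gauss(x[root], 0.0, 1.0)  # placeholder p_0 (assumption)
    for node, (c1, c2) in children.items():  # loop over internal nodes
        for c in (c1, c2):
            logp += log_propagator(x[c], times[c], x[node], times[node])
    return logp

# Toy tree: root -> (a, d), a -> (b, c); times increase toward the tips
children = {"root": ("a", "d"), "a": ("b", "c")}
times = {"root": 0.0, "a": 1.0, "b": 2.0, "c": 2.0, "d": 2.0}

x_flat = {"root": 0.0, "a": 0.0, "b": 0.0, "c": 0.0, "d": 0.0}
x_rough = {"root": 0.0, "a": 1.0, "b": -1.0, "c": 2.0, "d": 0.0}

lp_flat = log_unnormalized_posterior(children, times, x_flat, "root")
lp_rough = log_unnormalized_posterior(children, times, x_rough, "root")
print(lp_flat, lp_rough)
```

With this placeholder propagator, assignments in which fitness changes little along short branches score higher. An actual inference would marginalize over the unobserved node values rather than score a single assignment.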
Validate on simulation data
- simulate evolution
- sample sequences
- reconstruct trees
- infer fitness
- predict ancestor of future
- compare to truth
RN, Russell, Shraiman, eLife, 2014
Validation on simulated data
RN, Russell, Shraiman, eLife, 2014
Prediction of the dominating H3N2 influenza strain
- no influenza specific input
- how can the model be improved? (see model by Luksza & Laessig)
- in what other contexts might this apply?
RN, Russell, Shraiman, eLife, 2014
- Trevor Bedford
- Colin Megill
- Pavel Sagulenko
- Sidney Bell
- James Hadfield
- Wei Ding
- RNA virus evolution can be observed directly
- Rapidly adapting populations require new population genetic models
- Those models can be used to infer fit clades
- Future influenza populations can be anticipated
- Automated real-time analysis can help fight the spread of disease