Evolution of HIV
- Chimp → human transmission around 1900 gave rise to HIV-1 group M
- ~100 million infected people since
- subtypes differ at 10-20% of their genome
- HIV-1 evolves ~0.1% per year
image: Sharp and Hahn, CSH Persp. Med.
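A rough consistency check on the numbers above. This is a back-of-the-envelope sketch that assumes a constant rate $r$, ignores saturation and back-mutation, and takes roughly a century of divergence since the group M ancestor. Two lineages that split a time $T$ ago each accumulate changes independently, so their expected divergence $d$ is about

$$d \approx 2 r T \approx 2 \times 0.001\,\mathrm{yr}^{-1} \times 100\,\mathrm{yr} = 0.2$$

which sits at the upper end of the 10-20% difference between subtypes. Lineages that split more recently than the group M ancestor would give correspondingly smaller values.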
HIV infection
- $10^8$ cells are infected every day
- the virus repeatedly escapes immune recognition
- integrates into T-cells as latent provirus
image: wikipedia
HIV acknowledgments
- Fabio Zanini
- Jan Albert
- Johanna Brodin
- Christa Lanz
- Göran Bratt
- Lina Thebo
- Vadim Puller
HIV-1 evolution within one individual
silhouette: clipartfest.com, Zanini et al, 2015. Collaboration with Jan Albert and his group
Population sequencing to track all mutations above 1%
Zanini et al, eLife, 2015; antibody data from Richman et al, 2003
Inference of fitness costs
- mutation away from preferred state with rate $\mu$
- selection against non-preferred state with strength $s$
- variant frequency dynamics: $\frac{dx}{dt} = \mu - s x$
- equilibrium frequency: $\bar{x} = \mu/s $
- fitness cost: $s = \mu/\bar{x}$
- Split the genome into categories from high to low conservation
- Fit model of minor variation to each category
- $\Rightarrow$ harmonic average fitness cost in category
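The mutation-selection balance in the list above can be checked numerically. What follows is a minimal sketch, not the analysis code of Zanini et al: the function names and the values of `mu` and `s` are illustrative assumptions. It integrates $\frac{dx}{dt} = \mu - s x$ to its equilibrium, inverts $\bar{x} = \mu/s$, and shows why averaging minor-variant frequencies within a category yields the harmonic mean of the per-site costs.

```python
def equilibrium_frequency(mu, s, x0=0.0, t_max=5000.0, dt=0.1):
    """Euler-integrate dx/dt = mu - s*x from x0 and return x(t_max)."""
    x = x0
    for _ in range(int(round(t_max / dt))):
        x += dt * (mu - s * x)
    return x


def fitness_cost(mu, x_bar):
    """Invert the equilibrium relation x_bar = mu/s to get s = mu/x_bar."""
    return mu / x_bar


def category_fitness_cost(mu, frequencies):
    """Cost estimated from the mean minor-variant frequency of a category."""
    mean_frequency = sum(frequencies) / len(frequencies)
    return mu / mean_frequency


# Illustrative values only (assumptions, not taken from the slides).
mu = 1e-5  # mutation rate per site per day
s = 1e-2   # fitness cost per day

x_bar = equilibrium_frequency(mu, s)
print(x_bar, fitness_cost(mu, x_bar))

# Three hypothetical sites in one category, each at its own equilibrium mu/s_i.
site_costs = [1e-3, 1e-2, 1e-1]
site_freqs = [mu / c for c in site_costs]
harmonic_mean = len(site_costs) / sum(1.0 / c for c in site_costs)
print(category_fitness_cost(mu, site_freqs), harmonic_mean)
```

Because each site sits at $\bar{x}_i = \mu/s_i$, the mean frequency is $\mu$ times the mean of $1/s_i$, so $\mu$ divided by that mean is exactly the harmonic mean of the $s_i$. The estimate is therefore dominated by the least costly sites in a category.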
Fitness landscape of HIV-1
Zanini et al, Virus Evolution, 2017
Selection on RNA structures and regulatory sites
Zanini et al, Virus Evolution, 2017
Influenza A/H3N2 virus evolution
Joint work with...
- Boris Shraiman
- Colin Russell
- Trevor Bedford
- Influenza virus evolves to avoid human immunity
- Vaccines need frequent updates
Fitness variation in rapidly adapting populations
RN, Annual Reviews, 2013; Desai & Fisher; Brunet & Derrida; Kessler & Levine
strong selection
RN, Hallatschek, PNAS, 2013; see also Brunet and Derrida, PRE, 2007; Desai, Walczak, Fisher, Genetics, 2013
Bursts in a tree ↔ high fitness genotypes
Can we read fitness off a tree?
Predicting evolution
Given the branching pattern:
- can we predict fitness?
- can we pick the closest relative of the future population?
RN, Russell, Shraiman, eLife, 2014
Fitness inference from trees
$$P(\mathbf{x}|T) = \frac{1}{Z(T)} p_0(x_0) \prod_{i=0}^{n_{int}} g(x_{i_1}, t_{i_1}| x_i, t_i)g(x_{i_2}, t_{i_2}| x_i, t_i)$$
RN, Russell, Shraiman, eLife, 2014
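To make the structure of the formula concrete, here is a minimal sketch that evaluates the unnormalized weight $p_0(x_0) \prod_i g(\cdot)g(\cdot)$ for a fixed assignment of fitness values to the nodes of a small binary tree. Everything specific in it is an assumption for illustration: the Gaussian diffusion kernel merely stands in for the propagator $g$ (it is not the propagator used in the paper), and the tree, node names, times, and the `diffusion` parameter are hypothetical. The normalization $Z(T)$ is not computed.

```python
import math


def log_gaussian(x, mean, var):
    """Log density of a normal distribution with the given mean and variance."""
    return -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)


def log_propagator(x_child, t_child, x_parent, t_parent, diffusion=0.01):
    """Stand-in for log g(x_child, t_child | x_parent, t_parent).

    ASSUMPTION: fitness simply diffuses along a branch of positive length.
    """
    var = 2 * diffusion * (t_child - t_parent)
    return log_gaussian(x_child, x_parent, var)


def log_weight(children, times, fitness, root, log_p0, log_g=log_propagator):
    """log[p0(x_root) * prod over internal nodes i of g(child 1 | i) g(child 2 | i)]."""
    total = log_p0(fitness[root])
    for node, (child_1, child_2) in children.items():
        for child in (child_1, child_2):
            total += log_g(fitness[child], times[child],
                           fitness[node], times[node])
    return total


# Hypothetical tree: internal nodes map to their two children.
children = {"n0": ("n1", "A"), "n1": ("B", "C")}
times = {"n0": 0.0, "n1": 1.0, "A": 2.0, "B": 2.0, "C": 2.0}


def log_p0(x):
    """ASSUMPTION: standard normal prior on the root fitness."""
    return log_gaussian(x, 0.0, 1.0)


smooth = {"n0": 0.0, "n1": 0.0, "A": 0.0, "B": 0.0, "C": 0.0}
jumpy = {"n0": 0.0, "n1": 1.0, "A": -1.0, "B": 0.0, "C": 2.0}
print(log_weight(children, times, smooth, "n0", log_p0))
print(log_weight(children, times, jumpy, "n0", log_p0))
```

Under this placeholder kernel, fitness assignments that change slowly along branches receive a higher weight than assignments with large jumps. Inference proper would additionally require summing or maximizing over the unobserved node fitnesses.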
Validate on simulation data
- simulate evolution
- sample sequences
- reconstruct trees
- infer fitness
- predict ancestor of future
- compare to truth
RN, Russell, Shraiman, eLife, 2014
Validation on simulated data
RN, Russell, Shraiman, eLife, 2014
Prediction of the dominating H3N2 influenza strain
- no influenza specific input
- how can the model be improved? (see model by Luksza & Laessig)
- in what other contexts might this apply?
RN, Russell, Shraiman, eLife, 2014
nextstrain.org
- integrate data from many different sources
- analyze those data in near real time
- disseminate results in an intuitive yet informative way
- provide actionable insights
nextstrain.org
- Trevor Bedford
- Colin Megill
- Pavel Sagulenko
- Sidney Bell
- James Hadfield
- Wei Ding