Applied evolutionary biology: tracking and predicting the spread of disease

Richard Neher
Biozentrum, University of Basel

slides at

Sequences record the spread of pathogens

The resolution is limited by the number of mutations!
images by Trevor Bedford

Human seasonal influenza viruses

slide by Trevor Bedford

Influenza seasonality - USA

  • Influenza viruses evolve to avoid human immunity
  • Vaccines need frequent updates

joint work with Trevor Bedford & his lab

Beyond tracking: can we predict?

slide by Trevor Bedford

Clonal interference and traveling waves

RN, Annual Reviews, 2013; Desai & Fisher; Brunet & Derrida; Kessler & Levine

Influenza virus prediction by Luksza & Lässig

Matsuzaki et al, JVI, 2014
  • Epitope mutations: association with antigenic change
  • Non-epitope mutations: likely deleterious
  • Nonlinear component: synonymous mutations
$$W = \frac{x_i(t+1)}{x_i(t)} = e^{f_0 + \alpha f_{ep} + \beta f_{ne} + \gamma f_{nl}}$$
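A minimal sketch of how the growth factor $W$ above could be evaluated and used to project clade frequencies one season ahead. The function names, the default weights, and the explicit renormalisation step are illustrative assumptions and are not taken from Luksza & Lässig.

```python
import math

def predicted_growth(f_ep, f_ne, f_nl, f0=0.0, alpha=1.0, beta=1.0, gamma=1.0):
    """Return W = x_i(t+1) / x_i(t) = exp(f0 + alpha*f_ep + beta*f_ne + gamma*f_nl)."""
    return math.exp(f0 + alpha * f_ep + beta * f_ne + gamma * f_nl)

def project_frequencies(freqs, components, **weights):
    """Multiply each clade frequency by its W, then renormalise to sum to one.

    freqs:      current clade frequencies x_i(t)
    components: one (f_ep, f_ne, f_nl) tuple per clade (hypothetical inputs)
    """
    grown = [x * predicted_growth(*c, **weights) for x, c in zip(freqs, components)]
    total = sum(grown)  # assumption: normalise explicitly instead of tuning f0
    return [g / total for g in grown]

# Usage: two clades at equal frequency, the first with an epitope advantage.
projected = project_frequencies([0.5, 0.5], [(math.log(2), 0.0, 0.0), (0.0, 0.0, 0.0)])
```

In this toy example the first clade grows twice as fast as the second, so its projected frequency rises from one half to two thirds.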

Typical tree

Bolthausen-Sznitman Coalescent

RN, Hallatschek, PNAS, 2013; see also Brunet and Derrida, PRE, 2007
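For intuition, the Bolthausen-Sznitman coalescent can be simulated directly. With $b$ lineages, any particular set of $k$ lineages merges at rate $(k-2)!\,(b-k)!/(b-1)!$, so a merger of size $k$ occurs at total rate $b/(k(k-1))$ and the overall merger rate is $b-1$. The sketch below uses these rates; the function name and the returned event format are illustrative choices.

```python
import random

def simulate_bsc(n, seed=None):
    """Simulate the merger events of a Bolthausen-Sznitman coalescent for n samples.

    Returns a list of (time, merged_labels, new_label) tuples, most recent first.
    """
    rng = random.Random(seed)
    lineages = list(range(n))
    next_label = n
    t = 0.0
    events = []
    while len(lineages) > 1:
        b = len(lineages)
        # The total merger rate with b lineages is b - 1.
        t += rng.expovariate(b - 1)
        # A merger involves k lineages with probability proportional to 1 / (k (k - 1)).
        sizes = range(2, b + 1)
        k = rng.choices(sizes, weights=[1.0 / (j * (j - 1)) for j in sizes])[0]
        merged = rng.sample(lineages, k)
        lineages = [x for x in lineages if x not in merged] + [next_label]
        events.append((t, merged, next_label))
        next_label += 1
    return events

events = simulate_bsc(20, seed=1)
```

Unlike the Kingman coalescent, a single event can merge many lineages at once, which produces the multiple mergers and ladder-like trees seen in rapidly adapting populations.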

Predicting evolution

Given the branching pattern:

  • can we predict fitness?
  • pick the closest relative of the future?
RN, Russell, Shraiman, eLife, 2014

Fitness inference from trees

$$P(\mathbf{x}|T) = \frac{1}{Z(T)} p_0(x_0) \prod_{i=0}^{n_{int}} g(x_{i_1}, t_{i_1}| x_i, t_i)g(x_{i_2}, t_{i_2}| x_i, t_i)$$
RN, Russell, Shraiman, eLife, 2014
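The factorised structure of $P(\mathbf{x}|T)$ can be sketched as a sum of log terms over the internal nodes of a binary tree. The Gaussian propagator and Gaussian root prior below are placeholders chosen only to make the example runnable; they are assumptions and not the propagator $g$ or prior $p_0$ of the paper. The normalisation $Z(T)$ is omitted, and time is assumed to increase from root to tips.

```python
import math

def log_gauss(x, mean, var):
    """Log density of a normal distribution."""
    return -0.5 * math.log(2 * math.pi * var) - 0.5 * (x - mean) ** 2 / var

def log_propagator(x_child, t_child, x_parent, t_parent, diffusion=1.0):
    """Placeholder for log g(x_child, t_child | x_parent, t_parent).

    Assumption: fitness diffuses along a branch with variance diffusion * dt.
    """
    dt = t_child - t_parent
    return log_gauss(x_child, x_parent, diffusion * dt)

def unnormalised_log_prob(children, x, t, root=0, prior_var=1.0):
    """log[p_0(x_0) * prod_i g(child_1 | i) g(child_2 | i)], without log Z(T).

    children: dict mapping each internal node to its two child nodes
    x, t:     dicts mapping every node to its fitness and its time
    """
    logp = log_gauss(x[root], 0.0, prior_var)  # placeholder prior p_0
    for node, (c1, c2) in children.items():
        logp += log_propagator(x[c1], t[c1], x[node], t[node])
        logp += log_propagator(x[c2], t[c2], x[node], t[node])
    return logp

# Usage: a root (node 0) with two tips sampled one time unit later.
children = {0: (1, 2)}
fitness = {0: 0.0, 1: 0.0, 2: 0.0}
times = {0: 0.0, 1: 1.0, 2: 1.0}
value = unnormalised_log_prob(children, fitness, times)
```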

Validation on simulated data

RN, Russell, Shraiman, eLife, 2014

Prediction of the dominating H3N2 influenza strain

  • no influenza specific input
  • how can the model be improved? (see model by Luksza & Lässig)
  • in what other contexts might this apply?
RN, Russell, Shraiman, eLife, 2014

Hemagglutination Inhibition assays

Slide by Trevor Bedford

HI data sets

  • Long list of distances between sera and viruses
  • Tables are sparse: only close-by pairs are measured
  • Structure of space is not immediately clear
  • MDS in 2 or 3 dimensions
Smith et al, Science 2002
Slide by Trevor Bedford
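A rough sketch of how viruses and sera could be placed on a low-dimensional map from a sparse list of distances, by minimising the squared mismatch over measured pairs only. This is a simple gradient-descent stand-in and not the procedure of Smith et al; the function names, step size, and iteration count are assumptions.

```python
import math
import random

def stress(pos_v, pos_s, measurements):
    """Sum of squared differences between map distances and measured distances."""
    return sum((math.dist(pos_v[v], pos_s[s]) - d) ** 2 for v, s, d in measurements)

def fit_map(measurements, n_virus, n_serum, dim=2, steps=2000, lr=0.02, seed=0):
    """Place viruses and sera in `dim` dimensions using measured pairs only.

    measurements: list of (virus_index, serum_index, distance); unmeasured pairs are absent.
    Returns (virus_positions, serum_positions, initial_stress).
    """
    rng = random.Random(seed)
    pos_v = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_virus)]
    pos_s = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_serum)]
    initial = stress(pos_v, pos_s, measurements)
    for _ in range(steps):
        for v, s, d in measurements:
            r = math.dist(pos_v[v], pos_s[s]) or 1e-12  # guard against division by zero
            scale = lr * 2 * (r - d) / r
            for k in range(dim):
                # Move the pair together if too far apart, apart if too close.
                delta = scale * (pos_v[v][k] - pos_s[s][k])
                pos_v[v][k] -= delta
                pos_s[s][k] += delta
    return pos_v, pos_s, initial

# Usage: two viruses, two sera, and one unmeasured pair (virus 1, serum 0).
data = [(0, 0, 1.0), (0, 1, math.sqrt(2)), (1, 1, 1.0)]
viruses, sera, initial_stress = fit_map(data, n_virus=2, n_serum=2)
final_stress = stress(viruses, sera, data)
```

Each virus and serum gets $d$ coordinates, and the unmeasured pair simply does not enter the stress.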

Integrating antigenic and molecular evolution

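  • $H_{a\beta} = v_a + p_\beta + \sum_{i\in (a,b)} d_i$: titer drop of virus $a$ against serum $\beta$, with virus avidity $v_a$, serum potency $p_\beta$, and $b$ the reference virus used to raise serum $\beta$
  • each branch $i$ on the tree path between $a$ and $b$ contributes $d_i$ to antigenic distance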
  • sparse solution for $d_i$ through $l_1$ regularization
  • related model where $d_i$ are associated with substitutions
RN et al, PNAS, 2016
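The tree model above can be sketched in a few lines: the branches between two viruses are those that lie on exactly one of their two root paths, and the predicted titer drop adds the branch effects $d_i$ along that path to the virus and serum terms. The data layout (a child-to-parent dictionary, branches labelled by their child node) and the simplified objective with an $l_1$ penalty on $d_i$ only are assumptions made for illustration.

```python
def path_to_root(node, parent):
    """Branches from `node` up to the root, each labelled by its child node."""
    path = []
    while node in parent:  # the root has no parent entry
        path.append(node)
        node = parent[node]
    return path

def branches_between(a, b, parent):
    """Branches on the tree path between a and b: those on exactly one root path."""
    return set(path_to_root(a, parent)) ^ set(path_to_root(b, parent))

def predicted_titer_drop(a, b, serum, parent, d, avidity, potency):
    """H = v_a + p_serum + sum of d_i over the branches between a and reference virus b."""
    tree_distance = sum(d.get(i, 0.0) for i in branches_between(a, b, parent))
    return avidity.get(a, 0.0) + potency.get(serum, 0.0) + tree_distance

def l1_objective(measurements, parent, d, avidity, potency, lam=1.0):
    """Squared prediction error plus an l1 penalty that favours sparse d_i.

    measurements: list of (test_virus, reference_virus, serum, measured_drop)
    """
    error = sum(
        (h - predicted_titer_drop(a, b, s, parent, d, avidity, potency)) ** 2
        for a, b, s, h in measurements
    )
    return error + lam * sum(abs(x) for x in d.values())

# Usage on a toy tree: root R with children X and C; X has children A and B.
parent = {"A": "X", "B": "X", "X": "R", "C": "R"}
d = {"A": 1.0, "X": 2.0, "C": 0.5}
drop = predicted_titer_drop("A", "C", "serum_C", parent, d, {"A": 0.25}, {"serum_C": 0.5})
```

Minimising such an objective over $d_i$, $v_a$ and $p_\beta$ would require an optimiser that handles the non-smooth $l_1$ term; that step is left out here.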

Integrating antigenic and molecular evolution

  • MDS: $(d+1)$ parameters per virus
  • Tree model: $2$ parameters per virus
  • Sparse solution
    → identify branches or substitutions that cause titer drop
RN et al, PNAS, 2016

Are antigenic distances tree-like?

Rate of antigenic evolution

  • Cumulative antigenic evolution since the root: $\sum_i d_i$
  • A/H3N2 evolves faster antigenically
  • A/H3N2 has a more rapid population turn-over
  • Proportion of children is higher in B than in A/H3N2 infections
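A sketch of how the cumulative antigenic evolution $\sum_i d_i$ could be computed for each tip and turned into a rate. Estimating the rate as the least-squares slope of cumulative effect against sampling date is an assumption made here for illustration, as are the function names and the child-to-parent tree layout.

```python
def cumulative_effect(tip, parent, d):
    """Sum of branch effects d_i on the path from the root to `tip`."""
    total = 0.0
    node = tip
    while node in parent:  # the root has no parent entry
        total += d.get(node, 0.0)
        node = parent[node]
    return total

def slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

def antigenic_rate(tip_dates, parent, d):
    """Antigenic change per unit time, from the tips' cumulative effects and dates."""
    tips = sorted(tip_dates)
    dates = [tip_dates[tip] for tip in tips]
    effects = [cumulative_effect(tip, parent, d) for tip in tips]
    return slope(dates, effects)

# Usage on a ladder-like toy tree with one antigenic unit gained per year.
parent = {"n1": "root", "t1": "n1", "n2": "n1", "t2": "n2", "t3": "n2"}
d = {"n1": 1.0, "n2": 1.0, "t3": 1.0}
rate = antigenic_rate({"t1": 2000.0, "t2": 2001.0, "t3": 2002.0}, parent, d)
```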

How many sites are involved?

Substitution and inferred effect $d_i$:
K158N/N189K 3.64
K158R 2.31
K189N 2.18
S157L 1.29
V186G 1.25
S193F 1.2
K140I 1.1
F159Y 1.08
K144D 1.08
K145N 0.91
S159Y 0.89
I25V 0.88
Q1L 0.85
K145S 0.85
K144N 0.85
N145S 0.85
N8D 0.73
T212S 0.69
N188D 0.65

Predicting successful influenza clades

HI distances on the phylogenetic tree

joint work with Trevor Bedford & his lab

NextStrain architecture

Using treetime to rapidly compute timetrees


  • Evolutionary biology can help track and fight disease
  • Theory shows how to infer fit clades
  • Future influenza population can be anticipated
  • Automated real-time analysis can make up-to-date results available to everybody

Influenza and Theory acknowledgments

  • Boris Shraiman
  • Colin Russell
  • Trevor Bedford
  • Oskar Hallatschek

  • Trevor Bedford
  • Colin Megill
  • Pavel Sagulenko
  • Sidney Bell
  • James Hadfield
  • Wei Ding