Ernst Haeckel, 1879
A quantitative understanding of evolution?
- How repeatable is evolution?
- How gradual is evolution?
- How predictable is evolution?
- What are the relevant parameters?
(population size, mutation rates, etc.)
- How do the dynamics depend on parameters?
Challenges
- Most data are static snapshots.
- Genotype-phenotype map is largely unknown.
- Ecology and environments are complex and variable.
Experimental evolution -- Lenski experiment
Pitch drop experiment: started 1927, roughly one drop every 10 years. wikipedia.org
Rich Lenski, Ben Good, Michael Desai et al
Influenza A/H3N2
- Influenza viruses evolve to avoid human immunity
- Vaccines need frequent updates
Rapidly evolving RNA viruses -- HIV
silhouette: clipartfest.com, Richman et al, 2003.
Evolution of HIV
- Chimp → human transmission around 1900 gave rise to HIV-1 group M
- ~100 million infected people since
- subtypes differ at 10-20% of their genome
- HIV-1 evolves ~0.1% per year
image: Sharp and Hahn, CSH Persp. Med.
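The numbers above can be checked against each other with a short calculation. This is only a rough sketch: it assumes two lineages that split at the origin of group M and diverge independently, takes about 100 years since the transmission event, and ignores saturation from repeated changes at the same site. The function name is illustrative.

```python
# Back-of-the-envelope check using the rounded figures quoted on the slide.
rate_per_year = 0.001      # ~0.1% of sites change per year along one lineage
years_since_origin = 100   # assumption: roughly a century since the ~1900 transmission


def pairwise_divergence(rate, years):
    """Expected divergence between two lineages that split `years` ago.

    Each lineage accumulates changes independently, so the distance between
    them grows at twice the per-lineage rate (saturation is ignored).
    """
    return 2 * rate * years


divergence = pairwise_divergence(rate_per_year, years_since_origin)
print(f"expected divergence: {divergence:.0%}")
```

The result, about 20%, sits at the upper end of the observed 10-20% difference between subtypes, which is plausible since the subtypes split some time after the origin of group M.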
HIV infection
- $10^8$ cells are infected every day
- the virus repeatedly escapes immune recognition
- integrates into T-cells as latent provirus
image: wikipedia
HIV acknowledgments
- Fabio Zanini
- Jan Albert
- Johanna Brodin
- Christa Lanz
- Göran Bratt
- Lina Thebo
- Vadim Puller
HIV-1 evolution within one individual
silhouette: clipartfest.com, Zanini et al, 2015. Collaboration with Jan Albert and his group
Population sequencing to track all mutations above 1%
Zanini et al, eLife, 2015; antibody data from Richman et al, 2003
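As a rough illustration of what tracking mutations above 1% involves, the sketch below turns per-site nucleotide read counts into frequencies and reports minor variants above a threshold. It is not the pipeline used in the cited study. The function name, the `min_coverage` cutoff, and the example counts are all made up for illustration.

```python
ALPHABET = "ACGT"


def minor_variants(counts, threshold=0.01, min_coverage=100):
    """List (site, nucleotide, frequency) for non-majority variants above `threshold`.

    counts: one row per site, each row holding read counts for A, C, G, T.
    Sites with fewer than `min_coverage` reads are skipped, because a 1%
    frequency cannot be estimated reliably from a handful of reads.
    """
    hits = []
    for site, row in enumerate(counts):
        coverage = sum(row)
        if coverage < min_coverage:
            continue
        freqs = [c / coverage for c in row]
        major = freqs.index(max(freqs))  # index of the majority nucleotide
        for nuc, freq in enumerate(freqs):
            if nuc != major and freq > threshold:
                hits.append((site, ALPHABET[nuc], freq))
    return hits


# Hypothetical read counts (A, C, G, T) at four sites.
example_counts = [
    [980, 15, 3, 2],   # site 0: C at 1.5%
    [5, 990, 3, 2],    # site 1: no minor variant above 1%
    [600, 0, 400, 0],  # site 2: G at 40%
    [30, 1, 0, 0],     # site 3: coverage too low, skipped
]

for site, nuc, freq in minor_variants(example_counts):
    print(f"site {site}: {nuc} at {freq:.1%}")
```

In practice the usable threshold depends on sequencing error rates and on how many viral templates were actually sampled, neither of which this toy example models.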
Diversity and rates of change
- envelope changes fastest, enzymes slowest
- identical rate of synonymous evolution
- diversity saturates where evolution is fast
- synonymous mutations stay at low frequency
Zanini et al, eLife, 2015
Estimating the date of infection from diversity data
- diversity at 3rd positions increases almost linearly in time
- can be used to predict date of infection
- critical to estimate incidence
- Multiple founder viruses cause overestimation
- Degraded samples cause underestimation
Puller et al, PLoS Comp Bio, 2017
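If diversity grows approximately linearly with time, the time since infection can be read off by inverting that line. The sketch below is a minimal illustration under that assumption, not the published method. It fits a slope through the origin to calibration samples with known infection dates and then inverts it for a new sample. The calibration values and function names are invented.

```python
def fit_slope(times, diversities):
    """Least-squares slope of a line through the origin: diversity = slope * time."""
    numerator = sum(t * d for t, d in zip(times, diversities))
    denominator = sum(t * t for t in times)
    return numerator / denominator


def time_since_infection(diversity, slope):
    """Invert the linear relation to estimate time since infection."""
    return diversity / slope


# Made-up calibration data: years since infection and diversity at 3rd positions.
calibration_times = [1.0, 2.0, 4.0, 6.0]
calibration_diversity = [0.002, 0.004, 0.008, 0.012]

slope = fit_slope(calibration_times, calibration_diversity)
estimate = time_since_infection(0.006, slope)
print(f"slope: {slope:.4f} per year, estimated time since infection: {estimate:.1f} years")
```

The two failure modes listed above follow directly from this inversion. Extra diversity present at the start (multiple founders) inflates the estimate, and diversity lost in a degraded sample deflates it.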
Inference of fitness costs
- mutation away from preferred state with rate $\mu$
- selection against non-preferred state with strength $s$
- variant frequency dynamics: $\frac{d x}{dt} = \mu -s x $
- equilibrium frequency: $\bar{x} = \mu/s $
- fitness cost: $s = \mu/\bar{x}$
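The relations above can be checked numerically. The linear equation $\frac{dx}{dt} = \mu - s x$ has the closed-form solution $x(t) = \bar{x} + (x_0 - \bar{x}) e^{-st}$ with $\bar{x} = \mu/s$, so the frequency relaxes to equilibrium on a time scale of $1/s$. The sketch below evaluates that solution and recovers $s$ from the late-time frequency. The parameter values are illustrative assumptions, not estimates from the data.

```python
import math


def frequency(t, mu, s, x0=0.0):
    """Closed-form solution of dx/dt = mu - s*x with x(0) = x0."""
    x_eq = mu / s  # mutation-selection balance
    return x_eq + (x0 - x_eq) * math.exp(-s * t)


def fitness_cost(mu, x_mean):
    """Invert the equilibrium relation x_mean = mu / s."""
    return mu / x_mean


mu = 1e-5      # assumed mutation rate per site per day (illustrative)
s_true = 0.01  # assumed selection strength per day (illustrative)

x_late = frequency(t=2000, mu=mu, s=s_true)  # 20 relaxation times after t = 0
s_estimate = fitness_cost(mu, x_late)
print(f"equilibrium frequency: {x_late:.2e}, inferred cost: {s_estimate:.4f}")
```

Because equilibration takes a time of order $1/s$, sites under very weak selection may not have reached $\bar{x}$ within the sampled period, in which case $\mu/\bar{x}$ computed from the observed frequency would overestimate the cost.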
Fitness landscape of HIV-1
Zanini et al, Virus Evolution, 2017
Selection on RNA structures and regulatory sites
- red: mutations that don't change protein sequence
- blue: all mutations
Zanini et al, Virus Evolution, 2017
Influenza A/H3N2 virus evolution
Joint work with:
- Boris Shraiman
- Colin Russell
- Trevor Bedford
Prediction of the dominating H3N2 influenza strain
RN, Russell, Shraiman, eLife, 2014
nextstrain.org
- Trevor Bedford
- Colin Megill
- Pavel Sagulenko
- Sidney Bell
- James Hadfield
- Wei Ding
- Emma Hodcroft
nextstrain.org
- integrate data from many different sources
- analyze those data in near real time
- disseminate results in an intuitive yet informative way
- provide actionable insights
What about more complicated things than viruses?
nextTB: real-time molecular epidemiology of TB
Collaboration with Sebastian Gagneux and colleagues at the Swiss TPH
- Tens of thousands of MTB genomes have been sequenced
- Elucidate transmission routes at the local and global level
- Integrate with drug resistance data
Pan-genomes of bacteria
- much larger genomes
- vertical and horizontal transmission
- gene gain and loss
- genome rearrangements
- variation of divergence along the genome
Pan-genome statistics and filters
Species trees and gene trees
Genome assembly with short reads
- Tens of millions of short reads (<500 bp)
- Too short to bridge repetitive elements
- → assemblies are fragmented into hundreds of "contigs"
(really terrible example)
Images: illumina.com, github.com/rrwick
Acknowledgements
Group
- Fabio Zanini
- Pavel Sagulenko
- Vadim Puller
- Wei Ding
- Sanda Dejanic
- Emma Hodcroft
- Nicholas Noll
- Eric Ulrich
Karolinska Institute
- Jan Albert
- Lina Thebo
- Johanna Brodin
FHCRC
- Trevor Bedford
- James Hadfield
- Sidney Bell
- Colin Megill
Swiss TPH
- Sebastian Gagneux
- Chloé Loiseau
- Fabrizio Menardo
USB
BZ
- Leo Faletti
- sciCore
- workshops
- Boris Shraiman (UCSB)
- Colin Russell (AMC)
- Oskar Hallatschek (UCB)