Real-time phylogenetic analysis of emerging pathogens

Richard Neher
Biozentrum, University of Basel

slides at

Sequences record the spread of pathogens

images by Trevor Bedford

Human seasonal influenza A viruses

slide by Trevor Bedford

  • Influenza viruses evolve to avoid human immunity
  • Vaccines need frequent updates

GISRS and GISAID -- Influenza virus surveillance

  • comprehensive coverage of the world
  • timely sharing of data -- often within 2-3 weeks of sampling
  • hundreds of sequences per week (in peak months)
→ requires continuous analysis and easy dissemination
→ requires interpretable and intuitive visualization

joint work with Trevor Bedford & his lab

Nextstrain architecture

Using TreeTime to rapidly compute timetrees
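
As a rough illustration of the molecular-clock idea behind timetrees, the sketch below fits root-to-tip distance against sampling date by least squares. It is not TreeTime's API or its full inference, only the basic regression such tools commonly start from. The function name and the toy data are hypothetical.

```python
from statistics import mean


def root_to_tip_regression(dates, distances):
    """Least-squares fit of root-to-tip distance against sampling date.

    Returns (rate, root_date): the slope in substitutions per site per year
    and the date at which the fitted distance reaches zero.
    """
    t_bar, d_bar = mean(dates), mean(distances)
    cov = sum((t - t_bar) * (d - d_bar) for t, d in zip(dates, distances))
    var = sum((t - t_bar) ** 2 for t in dates)
    rate = cov / var
    intercept = d_bar - rate * t_bar
    return rate, -intercept / rate


# Hypothetical toy data: sampling dates (decimal years) and root-to-tip
# distances (substitutions per site). Not taken from any real data set.
dates = [2012.0, 2013.0, 2014.0, 2015.0, 2016.0]
distances = [0.010, 0.015, 0.020, 0.025, 0.030]

rate, root_date = root_to_tip_regression(dates, distances)
print(f"rate: {rate:.4f} subs/site/year, root date: {root_date:.1f}")
```

The slope estimates the evolutionary rate and the x-intercept estimates the date of the root. Real analyses also have to choose the root and account for the shared ancestry of the tips, which this sketch ignores.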

Enterovirus D68 -- with Jan Albert and Robert Dyrdak

  • Non-polio enterovirus
  • Large outbreak in 2014 with severe neurological symptoms in young children (acute flaccid myelitis)
  • Another outbreak in 2016
  • Outbreaks tend to start in late summer/fall
  • Several reports of EV-D68 outbreaks in the past 6 weeks
    (155 AFM cases in the US as of yesterday)

Whole genome deep sequencing

  • Geographic spread and phylogenetic patterns?
  • Within-host diversity?
  • Transmission bottlenecks/multiplicity of infection?

joint work with Jan Albert & his lab

Phylodynamic analysis

  • EV-D68 outbreaks come from distinct clades
  • The evolutionary rate is very high
    → a lot of power to study transmission chains
  • Most variation is synonymous
Dyrdak et al., bioRxiv
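
To see why a high rate gives power to resolve transmission chains, a back-of-the-envelope calculation helps: the expected number of substitutions along a lineage is rate × genome length × time. The sketch below uses an assumed placeholder rate and an approximate genome length. Neither number comes from the talk.

```python
from math import exp


def expected_substitutions(rate_per_site_per_year, genome_length, years):
    """Expected number of substitutions along one lineage over `years`."""
    return rate_per_site_per_year * genome_length * years


# Both values are assumptions for illustration, not estimates from the talk.
ASSUMED_RATE = 5e-3           # substitutions per site per year (placeholder)
ASSUMED_GENOME_LENGTH = 7400  # nucleotides, approximate EV-D68 genome size

per_year = expected_substitutions(ASSUMED_RATE, ASSUMED_GENOME_LENGTH, 1.0)
per_week = expected_substitutions(ASSUMED_RATE, ASSUMED_GENOME_LENGTH, 1.0 / 52)

# Under a Poisson model, the probability that a lineage is unchanged after a week.
p_unchanged_week = exp(-per_week)

print(f"expected substitutions per year: {per_year:.1f}")
print(f"expected substitutions per week: {per_week:.2f}")
print(f"P(no substitution in one week):  {p_unchanged_week:.2f}")
```

Under these assumed values, genomes sampled a few weeks apart would usually differ by at least one substitution, which is what makes whole-genome data informative about who infected whom.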

Whole genome deep sequencing of Enterovirus D68

  • Genome amplified in 4 overlapping segments
  • Sequenced on Illumina to high coverage
Dyrdak et al., bioRxiv

iSNV frequency accuracy and sequencing errors

  • iSNV frequencies reproducible above 1%
  • background error frequency around 1/1000 (0.1%)
Dyrdak et al., bioRxiv
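
The thresholding described above can be sketched as follows: compute minor-variant frequencies from per-position base counts and keep only those above a cutoff well clear of the background. This is a minimal sketch, not the pipeline used in the study. The function names, the pileup layout, and the `min_coverage` value are assumptions, and the 1% cutoff mirrors the reproducibility threshold on the slide.

```python
def minor_variant_frequencies(counts):
    """Return {base: frequency} for every non-consensus base with reads."""
    total = sum(counts.values())
    if total == 0:
        return {}
    consensus = max(counts, key=counts.get)
    return {
        base: n / total
        for base, n in counts.items()
        if base != consensus and n > 0
    }


def call_isnvs(pileup, min_freq=0.01, min_coverage=1000):
    """Call iSNVs as (position, base, frequency) above a frequency cutoff.

    `pileup` maps position -> {base: read count}. The min_coverage default
    is an assumed value chosen for illustration.
    """
    calls = []
    for pos, counts in sorted(pileup.items()):
        if sum(counts.values()) < min_coverage:
            continue  # too few reads to estimate a frequency reliably
        for base, freq in minor_variant_frequencies(counts).items():
            if freq >= min_freq:
                calls.append((pos, base, freq))
    return calls


# Hypothetical base counts at three genome positions.
pileup = {
    100: {"A": 9990, "C": 10, "G": 0, "T": 0},  # 0.1% minor: background level
    200: {"A": 9500, "C": 0, "G": 500, "T": 0},  # 5% minor: called
    300: {"A": 0, "C": 40, "G": 0, "T": 10},     # low coverage: skipped
}

calls = call_isnvs(pileup)
print(calls)
```

Only position 200 passes both filters. The variant at position 100 sits at the background level, and position 300 has too few reads.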

Within-host diversity

  • iSNVs above 0.5% frequency are biological, not sequencing artifacts
  • Most samples have few iSNVs, three had more than 20
Dyrdak et al., bioRxiv

Dual infections

  • A set of iSNVs at very similar frequencies in full linkage
  • Suggests infection with two related variants
  • 3 out of 50 samples: implies a high prevalence of dual infection
Dyrdak et al., bioRxiv
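
With only 3 events in 50 samples, the uncertainty on the prevalence is large. The sketch below puts a 95% Wilson score interval around the observed fraction. The choice of interval is an assumption made here for illustration, not an analysis from the study, and the function name is hypothetical.

```python
from math import sqrt


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for ~95%)."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


low, high = wilson_interval(3, 50)
print(f"observed: {3 / 50:.0%}, 95% interval: {low:.1%} to {high:.1%}")
```

The point estimate is 6%, with an interval of roughly 2% to 16%. Since dual infections with nearly identical variants would go undetected, the observed fraction is probably a lower bound.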

Summary

  • Pathogen sequence data contain information on spread and transmission
  • Timely sharing is key
  • Integration of sequence data with epidemiological data
  • Near real-time analysis
  • Dissemination of results in an intuitive yet informative way


  • Trevor Bedford
  • Colin Megill
  • Pavel Sagulenko
  • Sidney Bell
  • James Hadfield
  • Wei Ding
  • Emma Hodcroft
  • Sanda Dejanic


  • Robert Dyrdak
  • Jan Albert
  • Lina Thebo
  • Emma Hodcroft