Richard A. Neher and Trevor Bedford
bioRxiv, 286187, 2018
The rapid development of sequencing technologies has led to an explosion of pathogen sequence data that are increasingly collected as part of routine surveillance or clinical diagnostics. In public health, sequence data are used to reconstruct the evolution of pathogens, anticipate future spread, and target interventions. In clinical settings, whole genome sequences identify pathogens at the strain level, can be used to predict phenotypes such as drug resistance and virulence, and inform treatment by linking to closely related cases. However, the vast majority of sequence data are only used for specific narrow applications such as typing. Comprehensive analysis of these data could provide detailed insight into outbreak dynamics, but is not routinely done because fast, robust, and interpretable analysis workflows are not in place. Here, we review recent developments in real-time analysis of pathogen sequence data with a particular focus on visualization and integration of sequence and phenotypic data.