Within-host evolution of HIV and population genetics of rapid adaptation
				  
					Richard Neher  
					Biozentrum, University of Basel 
				
 
					slides at neherlab.org/201802_IST.html  
				
			 
	        
     
    Evolution of HIV 
    
    
    
        
            Chimp → human transmission around 1900 gave rise to HIV-1 group M 
            ~100 million infected people since 
            subtypes differ at 10-20% of their genome 
            HIV-1 evolves ~0.1% per year 
         
     
     
    image: Sharp and Hahn, CSH Persp. Med. 
 
     
    HIV infection 
    
    
    
    
    
        $10^8$ cells are infected every day 
        the virus repeatedly escapes immune recognition 
        integrates into T-cells as latent provirus  
       
    image: wikipedia  
 
     
    HIV-1 evolution within one individual 
    
    
    
    silhouette: clipartfest.com, Zanini et al, 2015. Collaboration with Jan Albert and his group
 
     
    Population sequencing to track all mutations above 1% 
    
        
    
    Zanini et al, eLife, 2015; antibody data from Richman et al, 2003 
 
     
    Diversity and hitchhiking 
    
         
        
        
            
                envelope changes fastest, enzymes slowest
                identical rate of synonymous evolution 
                diversity saturates where evolution is fast 
                synonymous mutations stay at low frequency 
             
         
     
    
         
    
    
     
    
    Zanini et al, eLife, 2015 
 
    Mutation rates and diversity at neutral sites 
         
    Zanini et al, Virus Evolution, 2017 
 
     
    Frequent reversion of previously beneficial mutations 
    
    
    
    HIV escapes immune systems 
    most mutations are costly 
    each human selects for different mutations
    compensation or reversion? 
     
     
     
    
    
    
    
     
    
    
     
    
    
     
    
    Zanini et al, eLife, 2015 
 
     
    Inference of fitness costs 
    
    
        
            mutation away from preferred state with rate $\mu$ 
            selection against non-preferred state with strength $s$ 
            variant frequency dynamics: $\frac{d x}{dt} = \mu -s x $ 
            equilibrium frequency:  $\bar{x} = \mu/s $ 
            fitness cost:  $s = \mu/\bar{x}$ 
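
The mutation-selection balance above can be checked numerically. The following is a minimal sketch, not the analysis code behind these slides: the values of `mu` and `s` are made-up illustrations, and `frequency_at` uses the closed-form solution of $\frac{dx}{dt} = \mu - s x$, namely $x(t) = \mu/s + (x_0 - \mu/s)e^{-st}$.

```python
import math


def frequency_at(t: float, x0: float, mu: float, s: float) -> float:
    """Closed-form solution of dx/dt = mu - s*x with x(0) = x0."""
    x_bar = mu / s  # equilibrium frequency
    return x_bar + (x0 - x_bar) * math.exp(-s * t)


def fitness_cost(mu: float, x_bar: float) -> float:
    """Invert the equilibrium relation x_bar = mu/s to get s."""
    return mu / x_bar


# Illustrative values only (assumptions, not estimates from the data).
mu = 1e-5  # mutation rate away from the preferred state, per site per day
s = 0.01   # selection against the non-preferred state, per day

# Start without the variant and wait many relaxation times (s*t = 20).
x_late = frequency_at(t=2000.0, x0=0.0, mu=mu, s=s)
s_hat = fitness_cost(mu, x_late)

print(f"late-time frequency: {x_late:.6g}")
print(f"recovered cost:      {s_hat:.6g}")
```

The frequency relaxes to $\mu/s$ on a time scale $1/s$, so the inversion is only meaningful once the sample was taken much later than $1/s$ after the site was last at the preferred state.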
         
     
     
    
 
     
    Inference of fitness costs 
    
    
        
            Frequencies of costly mutations decorrelate fast $\frac{d x}{dt} = \mu -s x $  
            $\Rightarrow$ average many samples to obtain accurate estimates
             
            Assumption: The global consensus is the preferred state 
            Only use sites that initially agree with consensus 
            Only use sites that don't change majority nucleotide
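
The averaging and the two site filters listed above might be sketched as follows. This is a simplified illustration under stated assumptions, not the published pipeline: the data layout (one dict of nucleotide frequencies per site and sample), the function name `estimate_costs`, and the use of a single mutation rate `mu` for all sites are hypothetical, and all non-preferred nucleotides are pooled into one frequency $x$.

```python
from statistics import mean


def estimate_costs(samples, consensus, mu):
    """Estimate s = mu / mean(x) per site.

    samples:   time-ordered list of samples; each sample is a list with one
               dict per site mapping nucleotide -> frequency (hypothetical layout)
    consensus: string giving the assumed preferred nucleotide at each site
    mu:        assumed mutation rate away from the preferred state
    """
    costs = {}
    for site, preferred in enumerate(consensus):
        majority = [max(sample[site], key=sample[site].get) for sample in samples]
        # Keep only sites that start at the consensus state and never change
        # their majority nucleotide, i.e. the majority is always `preferred`.
        if any(nuc != preferred for nuc in majority):
            continue
        # Average the non-preferred frequency over all samples.
        x_bar = mean(1.0 - sample[site].get(preferred, 0.0) for sample in samples)
        if x_bar > 0:
            costs[site] = mu / x_bar
    return costs


# Toy data, invented for illustration: site 1 switches majority and is dropped.
consensus = "AG"
samples = [
    [{"A": 0.99, "G": 0.01}, {"G": 0.6, "A": 0.4}],
    [{"A": 0.97, "G": 0.03}, {"G": 0.3, "A": 0.7}],
]
costs = estimate_costs(samples, consensus, mu=1e-5)
print(costs)
```

Sites where the variant is never observed are skipped here because $\mu/\bar{x}$ is undefined for $\bar{x} = 0$. In practice such sites would only give a lower bound on $s$, set by the detection limit.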
         
     
     
    
 
    Fitness landscape of HIV-1 
     
    
    Zanini et al, Virus Evolution, 2017 
 
    Selection on RNA structures and regulatory sites 
     
    
    Zanini et al, Virus Evolution, 2017 
 
    The distribution of fitness costs 
     
    
    Zanini et al, Virus Evolution, 2017 
 
     
    Fitness variation in rapidly adapting populations 
    
    RN, Annual Reviews, 2013; Desai & Fisher; Brunet & Derrida; Kessler & Levine
 
     
    
        
Neutral/Kingman coalescent 
    
    
        
strong selection 
    
    
    
    
        
Bolthausen-Sznitman Coalescent 
    
    
    
    RN, Hallatschek, PNAS, 2013; see also Brunet and Derrida, PRE, 2007; Desai, Walczak, Fisher, Genetics, 2013 
 
     
    Traveling waves and the Bolthausen-Sznitman coalescent
    
         
    
        
        Branching process approximation: $P(n_i, t|x_i)$ 
            
                Does a sample (blue dots) have a common ancestor $\tau$ generations ago?  
$\quad Q_b = \langle \sum_i \left(\frac{n_i}{\sum_j n_j}\right)^b\rangle \approx \frac{\tau-T_c}{T_c(b-1)} $
         
            
                All other merger rates are also consistent with the Bolthausen-Sznitman coalescent:  $\quad\lambda_{b,k} = \frac{(k-2)!(b-k)!}{T_c (b-1)!}$ 
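
The merger rates above can be evaluated exactly. The sketch below is an illustration with hypothetical function names and time measured in units of $T_c$. It interprets $\lambda_{b,k}$ in the standard way, as the rate at which one specific set of $k$ out of $b$ lineages merges, and checks two properties that follow from the formula: the sampling consistency $\lambda_{b,k} = \lambda_{b+1,k} + \lambda_{b+1,k+1}$, and a total merger rate of $(b-1)/T_c$ for $b$ lineages.

```python
from fractions import Fraction
from math import comb, factorial


def merger_rate(b: int, k: int, T_c: Fraction = Fraction(1)) -> Fraction:
    """lambda_{b,k} = (k-2)! (b-k)! / (T_c (b-1)!) for 2 <= k <= b."""
    if not 2 <= k <= b:
        raise ValueError("need 2 <= k <= b")
    return Fraction(factorial(k - 2) * factorial(b - k), factorial(b - 1)) / T_c


def is_consistent(b_max: int) -> bool:
    """Check lambda_{b,k} = lambda_{b+1,k} + lambda_{b+1,k+1} for all b < b_max."""
    return all(
        merger_rate(b, k) == merger_rate(b + 1, k) + merger_rate(b + 1, k + 1)
        for b in range(2, b_max)
        for k in range(2, b + 1)
    )


def total_merger_rate(b: int) -> Fraction:
    """Rate at which any merger happens among b lineages (T_c = 1)."""
    return sum(comb(b, k) * merger_rate(b, k) for k in range(2, b + 1))


print(merger_rate(3, 2), merger_rate(3, 3))
print(is_consistent(12))
print(total_merger_rate(5))
```

Exact rational arithmetic via `Fraction` is used so that the consistency check is an identity test rather than a floating-point comparison.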
         
     
     
    RN, Hallatschek, PNAS, 2013; see also Brunet and Derrida, PRE, 2007 
 
     
    U-shaped polarized site frequency spectra 
    
    
    
    RN, Hallatschek, PNAS, 2013 
 
     
    Universality -- adaptation and deleterious mutations 
    
    
    
    RN, Hallatschek, PNAS, 2013 
 
     
    Zanini et al, eLife, 2016 
 
     
    Extension to sexual populations 
    
    
        
            
                $T_{MRCA}$ determined by $\sigma_b$ 
                Block length $\zeta_b$ is determined by $T_{MRCA}$ 
                Fitness variation $\sigma_b$ is determined by block length 
             
            → self-consistent solution required
         
     
    RN, Kessinger, Shraiman PNAS, 2013 
 
     
    $T_{MRCA}$ and SFS 
    
    
    RN, Kessinger, Shraiman PNAS, 2013 
    
        
            
                Fitness diversity in block: $\sigma_b = \frac{\mu \langle s^2\rangle}{2\rho}$ 
                Qualitative change in behavior around $N\sigma_b$
                Total rate of adaptation: $\sim L\sqrt{\rho \mu \langle s^2\rangle \log N}$ 
             
         
     
 
    Bursts in a tree ↔ high fitness genotypes 
    Can we read fitness off a tree?
 
    
         
        
        
        
        
            
                Influenza virus evolves to avoid human immunity 
                Vaccines need frequent updates 
             
         
         
        
     
 
     
    Predicting evolution 
    
    
    
        Given the branching pattern: 
        
            can we predict fitness? 
            pick the closest relative of the future? 
         
     
     
    RN, Russell, Shraiman, eLife, 2014 
 
     
    Fitness inference from trees 
    
     
    
    
        $$P(\mathbf{x}|T) = \frac{1}{Z(T)} p_0(x_0) \prod_{i=0}^{n_{int}} g(x_{i_1}, t_{i_1}| x_i, t_i)g(x_{i_2}, t_{i_2}| x_i, t_i)$$ 
    
    RN, Russell, Shraiman, eLife, 2014 
 
     
    Validate on simulation data 
    
    
        
            simulate evolution 
            sample sequences 
            reconstruct trees 
            infer fitness 
            predict ancestor of future 
            compare to truth 
         
     
     
    
    RN, Russell, Shraiman, eLife, 2014 
 
     
    Validation on simulated data 
    
     
    
    RN, Russell, Shraiman, eLife, 2014 
 
    Prediction of the dominating H3N2 influenza strain  
    
     
    
        no influenza specific input 
        how can the model be improved? (see model by Luksza & Laessig) 
        in what other contexts might this apply?
      
    RN, Russell, Shraiman, eLife, 2014 
 
Summary 
    
        RNA virus evolution can be observed directly 
        Extensive reversion to preferred amino acid sequence 
        Rapidly adapting populations require new population genetic models
        Those models can be used to infer fit clades
        Future influenza populations can be anticipated
        Automated real-time analysis can help fight the spread of disease 
     
 
     
    HIV acknowledgments 
    
    
    
    
        Fabio Zanini 
        Jan Albert 
        Johanna Brodin 
        Christa Lanz 
        Göran Bratt 
        Lina Thebo 
        Vadim Puller 
      
     
    
 
     
    Influenza and Theory acknowledgments 
    
    
    
    
        Boris Shraiman 
        Colin Russell 
        Trevor Bedford 
        Oskar Hallatschek 
      
     
    
 
     
    nextstrain.org 
    
    
    
    
    Trevor Bedford 
    Colin Megill 
    Pavel Sagulenko 
    Sidney Bell 
    James Hadfield 
    Wei Ding