Endemic infection can shape epidemic exposure: using breakthroughs in statistical ecology to better understand co-infection patterns

Throughout our lives, we are exposed to and infected by a diverse community of pathogens, from viruses and bacteria to parasitic worms. In humans, which combination of pathogens you are infected by matters, as these organisms can interact with each other in remarkable ways that can alter the outcome of an infection. For example, people co-infected by HIV (human immunodeficiency virus) and tuberculosis (TB, a disease caused by Mycobacterium bacteria) experience heightened symptoms of each infection and are at a much higher risk of dying compared to people infected by just one of these pathogens. HIV interferes with the immune system, which not only allows TB to grow faster but also increases the chances of that individual transmitting the bacteria. This is an example of a positive or ‘facilitative interaction’ between pathogens in ecological speak. In contrast, pathogens can also compete (a negative interaction), and in some cases this can protect us from disease. For example, co-infection with certain parasitic worms can actually be protective against malaria (see Nacher, 2011 below). Further, we know that interactions between pathogens can depend on the order of infection (see Hoverman et al. for more on this). But how do we test for these specific interactions, particularly in wildlife? Humans and wildlife are exposed to and infected by a diverse range of organisms; how could we work out which ones to test? It is infeasible to test every combination in the lab and, even then, how would we know which combinations actually occur in the wild?

In this paper, we harnessed recent advances in ecological statistics and network theory to quantify associations between pathogens in a wild population of lions in the Serengeti in Tanzania. We label them associations as we can’t be 100% sure that they actually represent real interactions between pathogens (you’d need to do lab experiments for that, which are difficult to do for wildlife). Based on over 10 years of exposure and infection data from a wide variety of pathogens that infect lions, we were able to establish which pathogens were positively or negatively associated with others. As we have been monitoring these lions, often since birth, we were able to deduce the likely order of infection or exposure and work out whether a pathogen that a lion was exposed to early in life could affect which pathogens it was exposed to as an adult. These statistical methods are also useful as they can start to untangle whether these associations could simply be due to environmental factors (i.e. the lion got co-infected by two pathogens because both pathogens favor the same ecological conditions) rather than a potential biological mechanism.

The associations we found using these methods were often surprising but reflected what has been established in human lab-based studies, which is promising. For example, we found a strong negative association between Rift Valley Fever (RVF, a mosquito-borne virus that infects lions as well as cattle and sheep, sometimes leading to devastating economic loss) and the felid equivalent of HIV, feline immunodeficiency virus (FIV). FIV infects nearly 100% of lions as cubs, whereas RVF infection is more likely to occur later in life. Interestingly, RVF has similar molecular machinery to a group of viruses that are known to inhibit the growth of HIV, so it is possible that the same mechanism exists in lions as well. Similarly, we found a strong negative association between feline coronavirus (in the same virus family as the virus that causes severe acute respiratory syndrome, or SARS, in humans) and one type of FIV. Coronaviruses are considered possible candidate vaccines for HIV, so again laboratory work from human medicine provided some support for our findings.

We didn’t just find negative associations either; we also detected a strong positive association between the tick-borne Babesia protozoans and canine distemper virus (CDV). This co-infection pattern has been identified previously and is likely the underlying factor that caused this lion population to crash by over 33% in the 1990s. Lions may be able to withstand a CDV epidemic in isolation, but when CDV is combined with Babesia in a co-infection, this can lead to serious population declines for this species (see Munson et al. for some more details). Our study shows that it didn’t matter which species of Babesia was involved either: all of the species we included had these strong positive associations with CDV.

We can’t prove conclusively that these pathogens actually interact within a lion based on these statistical methods alone. However, we can provide a valuable ‘shortlist’ of possible interactions that occur in a wild population, which can then be tested using cell-level experiments in a lab (we obviously don’t want to actually test these hypotheses out on lions themselves). Given how common interactions between pathogens are, and how positive or negative their outcomes can be for the host, our approach coupled with lab work can provide important insights into pathogen dynamics in wild populations.

Nacher (2011): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3192711/
Hoverman et al. (2013): https://www.ncbi.nlm.nih.gov/pubmed/23754306?dopt=Abstract

Munson et al. (2008): https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0002545

A link to the paper is here: https://onlinelibrary.wiley.com/doi/full/10.1111/ele.13250

Co-occurrence modelling and parasites

It’s increasingly recognized that multiparasitism (being infected by multiple parasites at the same time) is commonplace, and which particular set of parasites you are infected with can have direct implications for health (these sets are also interesting in their own right). However, quantifying the complex interactions between co-occurring parasites is tricky. For example, is the co-occurrence of particular parasites just related to age, i.e. as you get older you simply accrue more infections? Or are the parasites (via the immune system) facilitating (or prohibiting) the invasion of others, or is it another reason entirely? Answering these questions is important, but choosing the appropriate analytical solution is a little daunting. Species co-occurrence patterns have been studied in other organisms for a long time, so there are many approaches.

So what are the options? Broadly, I recognize three distinct approaches: 1. Network-based models, 2. Probabilistic models and 3. Joint species distribution models. I will talk a little bit about each and briefly point out some of its pros and cons. See the resources below for links to some of the methods and to papers that use them.

Network-based models

Co-occurrence networks are networks of pathogens connected by edges (the connecting lines), which indicate that those particular infections were sampled together. These methods look at the network structure by, for example, examining how connected certain pathogens are (i.e. degree) or by assessing which pathogens in the network cluster together more often than expected by chance (i.e. how modular the network is). Pros: relatively straightforward to analyze, a good way to view co-infection patterns (igraph in R is great), not restricted to assessing just pairs of pathogens. Cons: difficult to overlay potentially confounding factors (e.g., age, but see the new and exciting MRFcov package from Nick Clark), hard to test for associations between pathogens across scales, and difficult to incorporate trait or phylogenetic information.
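
To make the idea of edges and degree concrete, here is a minimal sketch in plain Python (not igraph). The host records and the pathogen labels A to E are made up for illustration and are not the lion data; the function names are my own. An edge is weighted by the number of hosts in which both pathogens were detected, and degree counts how many other pathogens each one was found with.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy records: one set of detected pathogens per sampled host.
hosts = [
    {"A", "B", "C"},
    {"A", "B", "C"},
    {"A", "D"},
    {"A", "E"},
    {"B", "C"},
]


def cooccurrence_edges(records):
    """Count, for every pathogen pair, the hosts in which both were detected."""
    edges = Counter()
    for pathogens in records:
        # Sort so each pair is always stored in the same order, e.g. ("A", "B").
        for a, b in combinations(sorted(pathogens), 2):
            edges[(a, b)] += 1
    return edges


def degree(edges):
    """Degree = number of distinct pathogens each pathogen co-occurs with."""
    deg = Counter()
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return deg


edges = cooccurrence_edges(hosts)
deg = degree(edges)

for pair, weight in sorted(edges.items()):
    print(pair, weight)
for pathogen, d in sorted(deg.items()):
    print(pathogen, d)
```

In this toy data, pathogen A ends up with the highest degree simply because it is found in most hosts, which is exactly why the raw network needs some comparison against chance before reading much into it.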

Null and probabilistic models 

Basically, these methods ask whether two species co-occur more or less often than expected by chance. There are a large number of methods in this category and much debate about how robust they are (see Gotelli 2000), but the Veech 2013 method is my favorite as it’s distribution-free. Pros: easy to interpret, fast to run, with low error rates. Cons: can only assess pairs (exception: the screening approach of Vaumourin, but you have to have < 10 pathogens), can’t control for confounding effects or test for associations between pathogens across scales, and null models can have extreme Type I errors (see Harris, 2016).
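
As a rough illustration of the pairwise probabilistic idea, the sketch below computes how likely a given number of shared hosts is if two parasites are distributed independently of each other, using the hypergeometric distribution. The function name, the argument names and the choice of inclusive tails (at most / at least the observed count) are assumptions of this sketch, not necessarily the exact conventions of the Veech paper or of any R package.

```python
from math import comb


def cooccurrence_probabilities(n_hosts, n_a, n_b, observed):
    """Return (expected, p_le, p_ge) for the number of hosts shared by A and B.

    Under independence, the probability that exactly j hosts carry both is
    C(n_a, j) * C(n_hosts - n_a, n_b - j) / C(n_hosts, n_b).
    """
    lowest = max(0, n_a + n_b - n_hosts)  # fewest shared hosts possible
    highest = min(n_a, n_b)               # most shared hosts possible
    if not lowest <= observed <= highest:
        raise ValueError("observed co-occurrence count is impossible")

    total = comb(n_hosts, n_b)
    pmf = {
        j: comb(n_a, j) * comb(n_hosts - n_a, n_b - j) / total
        for j in range(lowest, highest + 1)
    }
    expected = n_a * n_b / n_hosts
    # Inclusive tails: a convention chosen for this sketch.
    p_le = sum(p for j, p in pmf.items() if j <= observed)
    p_ge = sum(p for j, p in pmf.items() if j >= observed)
    return expected, p_le, p_ge


# Hypothetical numbers: 10 hosts, each parasite found in 5, both found in all 5.
expected, p_le, p_ge = cooccurrence_probabilities(10, 5, 5, 5)
print(expected, p_le, p_ge)
```

With these made-up numbers, 2.5 shared hosts are expected by chance, and sharing all 5 has a probability of 1/252 under independence, so the pair would be flagged as a positive association. Note that nothing here accounts for host age or any other confounder, which is the main limitation listed above.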

Joint distribution modeling

The last category and the one I have used the most! Basically, this method relates the distribution of each parasite in your data to environmental (and host) variables using Bayesian hierarchical mixed modeling and then explores between-parasite relationships in the residual variation. There are nice packages in R to help you apply this approach (BORAL and HMSC are my favorites). Pros: enables you to assess co-occurrence patterns after controlling for confounding factors and to assess these patterns easily across scales, is flexible and can deal with parasite abundance data (i.e. more than just presence/absence of a parasite), and you get useful niche models as a bonus. Also can easily incorporate parasite phylogenetic and functional trait data. Cons: can only assess pairs, and it doesn’t provide coefficients for the strength of the co-occurrence patterns (just whether they are significantly different from zero).
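
The core intuition (look for associations in what is left over after accounting for host or environmental variables) can be shown with a deliberately crude stand-in. This is not what BORAL or HMSC do: there is no Bayesian hierarchical model here, just subtraction of age-class means followed by a correlation of the residuals. The data, the variable names and the two helper functions are all made up for illustration.

```python
from collections import defaultdict
from math import sqrt


def pearson(x, y):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def residuals_by_group(values, groups):
    """Subtract each group's mean, i.e. the residual from a model whose
    only predictor is the group label (here: host age class)."""
    totals = defaultdict(lambda: [0.0, 0])
    for v, g in zip(values, groups):
        totals[g][0] += v
        totals[g][1] += 1
    return [v - totals[g][0] / totals[g][1] for v, g in zip(values, groups)]


# Hypothetical hosts: young animals carry neither parasite, while in old
# animals the two parasites occur independently of each other.
age = ["young"] * 4 + ["old"] * 4
parasite_x = [0, 0, 0, 0, 1, 1, 0, 0]
parasite_y = [0, 0, 0, 0, 1, 0, 1, 0]

raw_r = pearson(parasite_x, parasite_y)
resid_r = pearson(
    residuals_by_group(parasite_x, age),
    residuals_by_group(parasite_y, age),
)
print(raw_r, resid_r)
```

In this toy example the raw correlation is 1/3, but it drops to 0 once age class is accounted for: the apparent association was just older hosts accruing more infections. A real joint species distribution model does the same job with proper error structures for presence/absence or abundance data.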


Elise Vaumourin has a nice review article: https://parasitesandvectors.biomedcentral.com/articles/10.1186/s13071-015-1167-9

Network approaches. Modularity algorithm: https://arxiv.org/abs/cond-mat/0408187. igraph: http://kateto.net/networks-r-igraph.

Interesting paper: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3973251/.

The Nick Clark MRFcov approach: https://github.com/nicholasjclark/MRFcov

Harris, 2016 for Markov networks: https://esajournals.onlinelibrary.wiley.com/doi/full/10.1002/ecy.1605

Probabilistic models. The Veech paper: http://ecology.wp.txstate.edu/files/2013/11/Veech_2013_GEB.pdf. Vaumourin et al. (2014): https://www.ncbi.nlm.nih.gov/pubmed/24860791

Joint distribution modeling

HMSC: https://onlinelibrary.wiley.com/doi/full/10.1111/ele.12757.

BORAL: https://besjournals.onlinelibrary.wiley.com/doi/abs/10.1111/2041-210X.12514

Cool papers using the approach: https://besjournals.onlinelibrary.wiley.com/doi/full/10.1111/1365-2656.12708