It was a great privilege to be highly commended for the Journal of Animal Ecology Elton Prize for outstanding papers by early career researchers. It also gave me an opportunity to write a blog about said paper which you can find here: https://journalofanimalecology.wordpress.com/2018/05/23/disease-ecology-the-lions-share/?platform=hootsuite
Adding animal silhouettes to figures seems to be increasingly on trend in ecology. I have no empirical evidence to back up this claim, but it seems like every article in a high impact journal has at least one figure that incorporates silhouettes of species. I too am guilty of adding them – I find them a useful visual tool, but in the past, I’ve had to create them myself using Photoshop. No more! PhyloPic (http://www.phylopic.org/about/) provides an easy-to-search collection of reusable silhouette images of organisms from beetles to dinosaurs.
Resources like this are truly great!
It’s increasingly recognized that multiparasitism (being infected by multiple parasites at the same time) is commonplace, and which particular set of parasites you are infected with can have direct implications for health (and is interesting in its own right). However, quantifying the complex interactions between co-occurring parasites is tricky. For example, is the co-occurrence of particular parasites just related to age, i.e. as you get older you simply accrue more infections? Or are the parasites (via the immune system) facilitating (or prohibiting) the invasion of others, or is it another reason entirely? Answering these questions is important, but choosing the appropriate analytical solution is a little daunting. Species co-occurrence patterns have been studied in other organisms for a long time, so there are many approaches.
So what are the options? Broadly, I recognize three distinct approaches: 1. Network-based models, 2. Null and probabilistic models and 3. Joint species distribution models. I will talk a little about each and briefly point out some pros and cons. See the resources below for links to the methods and to papers that use them.
Network-based models
Co-occurrence networks are networks of pathogens connected by edges (the connecting lines), which represent cases where those particular infections were sampled together. These methods look at the network structure by, for example, examining how connected certain pathogens are (i.e. degree) or by assessing which pathogens in the network cluster together more often than expected by chance (i.e. how modular the network is). Pros: relatively straightforward to analyze, a good way to view co-infection patterns (igraph in R is great), not restricted to assessing just pairs of pathogens. Cons: difficult to overlay potentially confounding factors (e.g., age, but see the new and exciting MRFcov package from Nick Clark), hard to test for associations between pathogens across scales and difficult to incorporate trait or phylogenetic information.
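As a rough illustration of the idea, here is a minimal Python sketch that builds a co-occurrence network from per-host infection records and computes each pathogen's degree. The host records and the function names are made up for this example, and it is not how igraph or MRFcov work internally.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical data: the set of pathogens detected in each sampled host.
hosts = [
    {"FIV", "CDV"},
    {"FIV", "CDV", "Babesia"},
    {"CDV"},
    {"FIV", "Babesia"},
    {"Parvovirus"},
]

def cooccurrence_network(samples):
    """Return {(a, b): count} for every pathogen pair found in the same host."""
    edges = defaultdict(int)
    for infections in samples:
        # Sort so each unordered pair is always stored under the same key.
        for a, b in combinations(sorted(infections), 2):
            edges[(a, b)] += 1
    return dict(edges)

def degree(edges, nodes):
    """Number of distinct pathogens that each pathogen co-occurs with."""
    deg = {node: 0 for node in nodes}
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return deg

edges = cooccurrence_network(hosts)
nodes = set().union(*hosts)
deg = degree(edges, nodes)

for pathogen in sorted(deg):
    print(pathogen, deg[pathogen])
```

Edge weights here are raw counts of shared hosts. Testing whether a cluster is tighter than expected by chance would need a null model on top of this.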
Null and probabilistic models
Basically, these methods ask whether two species co-occur more or less often than expected by chance. There is a large number of methods in this category and much debate as to how robust they are (see Gotelli 2000), but the Veech 2013 method is my favorite as it’s distribution-free. Pros: easy to interpret, fast to run with low error rates. Cons: can only assess pairs (exception: the screening approach of Vaumourin, but you have to have < 10 pathogens) and can’t control for confounding effects or test for associations between pathogens across scales; null models can have extreme Type I errors (see Harris, 2016).
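To make the pairwise logic concrete, here is a minimal Python sketch of a probabilistic co-occurrence test. It assumes the two parasites are distributed independently among hosts and uses the hypergeometric distribution for the number of shared hosts, which as far as I understand is equivalent to the combinatorial formula in Veech (2013). The function names and the example counts are invented for illustration.

```python
from math import comb

def cooccurrence_pmf(n_hosts, n_a, n_b, j):
    """P(exactly j hosts carry both A and B) if A and B are independent."""
    lo = max(0, n_a + n_b - n_hosts)
    hi = min(n_a, n_b)
    if j < lo or j > hi:
        return 0.0
    return comb(n_a, j) * comb(n_hosts - n_a, n_b - j) / comb(n_hosts, n_b)

def cooccurrence_test(n_hosts, n_a, n_b, observed):
    """Return (expected co-occurrences, P(X <= observed), P(X >= observed))."""
    lo = max(0, n_a + n_b - n_hosts)
    hi = min(n_a, n_b)
    p_less = sum(cooccurrence_pmf(n_hosts, n_a, n_b, j)
                 for j in range(lo, observed + 1))
    p_greater = sum(cooccurrence_pmf(n_hosts, n_a, n_b, j)
                    for j in range(observed, hi + 1))
    expected = n_a * n_b / n_hosts
    return expected, p_less, p_greater

# Hypothetical example: 20 hosts, parasite A in 8, parasite B in 6, both in 5.
expected, p_less, p_greater = cooccurrence_test(20, 8, 6, 5)
print(f"expected={expected:.2f}, P(<=obs)={p_less:.4f}, P(>=obs)={p_greater:.4f}")
```

A small P(X >= observed) suggests the pair co-occurs more often than chance alone would produce, and a small P(X <= observed) suggests less often. Neither says anything about why, which is where the confounding problem comes in.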
Joint distribution modeling
The last category and the one I have used the most! Basically, this method relates the distribution of each parasite in your data to environmental (and host) variables using Bayesian hierarchical mixed modeling and then explores between-parasite relationships in the residual variation. There are nice packages in R to help you apply this approach (BORAL and HMSC are my favorites). Pros: enables you to assess co-occurrence patterns after controlling for confounding factors and to assess these patterns easily across scales; they are flexible and can deal with parasite abundance data (i.e. more than just presence/absence of a parasite) and you get useful niche models as a bonus. Also can easily incorporate parasite phylogenetic and functional trait data. Cons: can only assess pairs, and it doesn’t provide coefficients for the strength of the co-occurrence patterns (just whether they are significantly different from zero).
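The core intuition (model the covariates first, then look at what is left over) can be shown with a deliberately crude stand-in. The Python sketch below uses ordinary least squares on a single covariate rather than the Bayesian hierarchical models that BORAL and HMSC fit, and the host ages and parasite loads are invented. It only illustrates why raw and residual associations can differ.

```python
from statistics import mean

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def ols_residuals(x, y):
    """Residuals of y after a least-squares straight-line fit on x."""
    mx, my = mean(x), mean(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    intercept = my - slope * mx
    return [b - (intercept + slope * a) for a, b in zip(x, y)]

# Hypothetical hosts: both parasite loads rise with age, but hosts carrying
# more of parasite A than expected for their age carry less of parasite B.
age = [1, 2, 3, 4, 5, 6, 7, 8]
noise = [0.5, -0.5, 0.3, -0.3, 0.4, -0.4, 0.2, -0.2]
load_a = [2 * a + e for a, e in zip(age, noise)]
load_b = [3 * a - e for a, e in zip(age, noise)]

raw_r = pearson(load_a, load_b)
resid_r = pearson(ols_residuals(age, load_a), ols_residuals(age, load_b))
print(f"raw correlation: {raw_r:.2f}, residual correlation: {resid_r:.2f}")
```

In this toy data the raw correlation is strongly positive purely because both parasites accumulate with age, while the residual correlation is negative. A real joint model estimates the covariate effects and the residual associations together, with uncertainty, rather than in two separate steps.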
Elise Vaumourin has a nice review article: https://parasitesandvectors.biomedcentral.com/articles/10.1186/s13071-015-1167-9
Interesting paper: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3973251/.
The Nick Clark MRFcov approach: https://github.com/nicholasjclark/MRFcov
Harris, 2016 for Markov networks: https://esajournals.onlinelibrary.wiley.com/doi/full/10.1002/ecy.1605
Probabilistic models – The Veech paper: http://ecology.wp.txstate.edu/files/2013/11/Veech_2013_GEB.pdf. Vaumourin et al (2014) https://www.ncbi.nlm.nih.gov/pubmed/24860791
Joint distribution modeling
Cool papers using the approach: https://besjournals.onlinelibrary.wiley.com/doi/full/10.1111/1365-2656.12708
The new Journal of Animal Ecology special issue, which focuses on animal host-microbe interactions (often in a disease context), looks like a must read. All the articles look interesting but there are a few which particularly stand out. Most I’ve seen in preprint form but it is nice to see them all together. In no particular order:
Mihaljevic et al on parasite metacommunities – this looks like an interesting technique!
Keiser et al on queen presence and disease – ants are always interesting.
Raulo et al on social behaviour and gut microbiota.
Becker et al on resource provisioning and host traits in determining host-parasite interactions.
Looking forward to reading these articles and the others in more detail!
I just got back from a really interesting meeting to link the disease work that Craig Packer, Meggan Craft and I have been doing as part of the Serengeti Lion Project to the Yellowstone Wolf Project. The meeting this time was in Yellowstone and it was a brilliant opportunity to see wolves in the wild (and the park in winter, see moose image below – my wolf photos weren’t great). I now know much more about wolf biology and the effort taken to understand this charismatic species. Really a tremendous experience.
These two systems represent the most intensively studied social carnivore systems in the world, and the opportunities to compare and contrast the disease ecology across systems are exciting. You wouldn’t necessarily think that the diseases infecting canids and felids would be similar, but in both systems one of the important pathogens is canine distemper virus (CDV). CDV can lead to serious reductions in numbers of both species, but, perhaps due to their social organization, lions and wolves recover numerically quite quickly. CDV exposure, particularly for lions, is nasty and individuals can experience severe neurological symptoms. It is unknown, however, if prides/packs impacted by CDV alter their interactions with other groups. This in turn could alter how diseases move around both landscapes by reducing inter-group interactions for the years following an outbreak. The hypothesis is that groups weakened by the disease are less likely to fight over territory and thus become more timid for a period post-epidemic. Our initial results show that this may be the case for lions at least, with the number of pride-pride contacts diminishing in the 4 year period after the epidemic. The lion population had largely recovered by then, but the effects of CDV epidemics look like they linger. The plan is now to see if the same pattern can be found with the wolves, and this could be the first compelling case for the power of epidemics to cause social disruption.
This collaboration was only really possible due to Pete Hudson, and I’m amazed that he was able to turn a conversation we had when he visited UMN last year into a collaboration that has the potential to understand sociality and disease in a new light.
Understanding how robust the molecular clock is is a critical step for many evolutionary analyses. Usually when I am given a set of aligned sequences I turn to TempEst to test the ‘clocklikeness’ of the data. However, after the release of ‘TIPDATINGBEAST’ (I’ll call it TDBEAST for short) I may turn to TempEst much less often. Whilst TempEst is useful for getting a qualitative idea of the temporal signal in the data, I find it annoying that it’s hard to drill down to find which sequences are leading to bias. Furthermore, TempEst is sensitive to the input tree, which can lead to problems, and I have had issues with guessDates as well, which is slightly irritating. TDBEAST solves lots of these issues and more, and provides a robust method to test how ‘clock-like’ your sequences are. The TDBEAST R package does 2 things using BEAST log files and .xml files:
1. Uses a date randomization test to measure temporal signal and provides a nice visualization to check results.
2. Uses a leave-one-out cross-validation to work out the likely culprits that could be skewing results. These sequences, for example, could have the wrong date assigned to them by mistake – this is a real problem when working on sequences from the field.
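The logic of both steps can be sketched without BEAST. The Python example below is a simplified stand-in: it uses the slope of a root-to-tip regression (the kind of quantity TempEst plots) as the rate estimate, whereas TDBEAST works from full BEAST runs. The dates and distances are invented, with one sequence deliberately given the wrong date, and all function names are hypothetical.

```python
import random

def slope_intercept(x, y):
    """Least-squares slope and intercept of y regressed on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

# Hypothetical sampling years and root-to-tip distances (substitutions/site).
# The sequence labelled 2006 has a distance that would fit a 2013 sample.
dates = [2000, 2002, 2004, 2006, 2008, 2010, 2012, 2014]
dists = [0.010, 0.012, 0.014, 0.023, 0.018, 0.020, 0.022, 0.024]

def date_randomization(dates, dists, n_reps=1000, seed=1):
    """Compare the observed rate with rates obtained after shuffling dates."""
    rng = random.Random(seed)
    observed = slope_intercept(dates, dists)[0]
    null_rates = []
    for _ in range(n_reps):
        shuffled = dates[:]
        rng.shuffle(shuffled)
        null_rates.append(slope_intercept(shuffled, dists)[0])
    # Proportion of shuffles giving a rate at least as high as the real one.
    p = (1 + sum(r >= observed for r in null_rates)) / (n_reps + 1)
    return observed, null_rates, p

def leave_one_out(dates, dists):
    """Predict each sequence's date from a fit to all the other sequences."""
    errors = []
    for i in range(len(dates)):
        slope, intercept = slope_intercept(dates[:i] + dates[i + 1:],
                                           dists[:i] + dists[i + 1:])
        predicted_date = (dists[i] - intercept) / slope
        errors.append(predicted_date - dates[i])
    return errors

observed, null_rates, p = date_randomization(dates, dists)
errors = leave_one_out(dates, dists)
suspect = max(range(len(errors)), key=lambda i: abs(errors[i]))
print(f"rate={observed:.5f}, p={p:.3f}, suspect sequence index={suspect}")
```

If the observed rate sits well outside the rates from shuffled dates, there is temporal signal. A sequence whose predicted date is far from its recorded date is a candidate for a mislabelled sample.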
Whilst I’m having a few teething issues with the code, overall the ‘how to’ guide is excellent and the code seems straightforward. TDBEAST definitely seems like a valuable addition to my phylogenetic toolbox.
Here is the link: http://onlinelibrary.wiley.com/doi/10.1111/1755-0998.12603/full
Population evolutionary ecologists are increasingly turning to integral projection models (IPMs) to understand how changes in performance (e.g., growth) influence population dynamics, but this type of modeling is rarely applied to understand host-parasite feedbacks. After being introduced to this modeling approach by Tim Coulson and Shelly Lachish, I’ve been thinking about how it could be applied to disease ecology. I’m not the first one to do so and there is a great review by Metcalf et al (see link below) on the topic. The technique appeals to me as it’s a quantitative, data-driven approach to understanding host-pathogen dynamics that can account for variation at a within-host, individual and population scale. Recently, Bayesian IPMs have been developed and these offer further advantages (see http://www.owlnet.rice.edu/~tm9/pdf/ElderdMiller2016.pdf), but may be more time-consuming to construct.
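For readers who have not met IPMs before, here is a minimal Python sketch of the generic recipe: write survival, growth and reproduction as functions of a continuous state x, combine them into a kernel K(y, x) = s(x) g(y | x) + f(x) c(y), discretize it with the midpoint rule and read off the long-run population growth rate. Every function and parameter value below is invented for illustration. In a host-parasite version the state x might be parasite load rather than body size, but that is an assumption of this sketch and not taken from any of the papers linked here.

```python
from math import exp, pi, sqrt

def normal_pdf(y, mu, sd):
    return exp(-0.5 * ((y - mu) / sd) ** 2) / (sd * sqrt(2 * pi))

def survival(x):
    """Probability of surviving one time step (made-up logistic curve)."""
    return 1.0 / (1.0 + exp(-(x - 3.0)))

def growth(y, x):
    """Density of state y next step given state x now (made-up linear growth)."""
    return normal_pdf(y, 1.0 + 0.8 * x, 0.7)

def offspring_state(y):
    """Density of the state y at which new individuals enter the population."""
    return normal_pdf(y, 1.5, 0.5)

def build_kernel(fecundity_per_unit, m=50, lower=0.0, upper=10.0):
    """Midpoint-rule discretization of K(y, x) on m mesh points."""
    h = (upper - lower) / m
    mesh = [lower + (i + 0.5) * h for i in range(m)]
    return [[h * (survival(x) * growth(y, x)
                  + fecundity_per_unit * x * offspring_state(y))
             for x in mesh]
            for y in mesh]

def growth_rate(kernel, iterations=500):
    """Dominant eigenvalue (lambda) of the kernel matrix by power iteration."""
    m = len(kernel)
    n = [1.0 / m] * m  # state distribution, kept normalized to sum to 1
    lam = 0.0
    for _ in range(iterations):
        projected = [sum(k * v for k, v in zip(row, n)) for row in kernel]
        lam = sum(projected)
        n = [v / lam for v in projected]
    return lam

lam_low = growth_rate(build_kernel(fecundity_per_unit=0.2))
lam_high = growth_rate(build_kernel(fecundity_per_unit=0.4))
print(f"lambda (low fecundity) = {lam_low:.3f}")
print(f"lambda (high fecundity) = {lam_high:.3f}")
```

Each of the vital-rate functions is where the data come in: in practice they would be regressions fitted to individual longitudinal records, which is exactly why the data demands are so high.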
It’s not surprising, however, that these models haven’t really taken off in the field yet. One obvious reason for this could be the high number of parameters necessary to run the model – although you can also use IPMs in a theoretical context. For most wildlife systems, detailed individual longitudinal data on things such as parasite load over the term of infection is near impossible to get. I wonder if new molecular tools (e.g., measuring viral load using RT PCR from feces) may help fill this data gap? Currently, it looks like you need extensive lab experiments before you can really use this approach (see Wilber et al below). Anyway, I’m looking forward to learning more the next time we meet with Tim and Shelly.
Metcalf et al (2015): http://onlinelibrary.wiley.com/doi/10.1111/1365-2656.12456/full
Wilber et al: http://onlinelibrary.wiley.com/doi/10.1111/ele.12814/full
From Metcalf et al (2015).
Curious about how phylogenetic community ecology can be applied to understand infectious disease? Our review is out now in Biological Reviews: http://onlinelibrary.wiley.com/doi/10.1111/brv.12380/full
This all started thanks to a small grant from the University of Minnesota Institute on the Environment enabling me to get a great group of people together to talk disease ecology and phylogenetics. It’s now great to see it out there! It was also great working with a graphic designer to help get the figures that bit more appealing. I can highly recommend Elissa (http://myvisualvoice.com/) – I learned a lot from her about getting the visual aspects of figures more refined. Figure 1 is below….
Fig. 1. Conceptual schema illustrating how an eco-phylogenetic framework can be applied to understand infectious disease dynamics. The example system used is the Ngorongoro Crater (Tanzania), across scales: (A) within host; (B) among hosts of the same species; (C) multi-host complex; and (D) landscape scale. Colour-coded and lettered symbols below each panel indicate what data (squares) and statistical tools (circles) could be used to address each challenge (see Section I.2 for model and other tool details). White ovals contain hypothetical parasite communities within a host and different parasite colours and shapes (nematodes or viruses) represent different parasite species or genotypes. PGLMM: phylogenetic generalised linear mixed model.
Are ecologists using fancy statistics in situations where good old t tests, ANOVAs or regressions would do? Is this statistical ‘machismo’ (sensu Brian McGill) making ecologists worse at designing experiments/observational studies? These questions have been topical lately, with a few interesting ecology blog articles (and associated comments – see the links below). Now don’t get me wrong, I have no problems at all with doing frequentist t tests etc when the data used fit the assumptions of the tests. However, even in my experimental work the data generated don’t come close to meeting the assumptions of these ‘simple’ tests. I wasn’t expecting them to either – I’m a community ecologist and species abundance data is always going to be zero-inflated etc.
Moreover, even if I could apply these simple tests, these tests alone may miss some really interesting parts of the ecological story. For example, the fact that my multiple response variables (species) are correlated with each other is actually really interesting: it helps us understand how this experimental community assembles, and it forms the basis of co-occurrence theory. Similarly, observational studies are often unavoidably hierarchical and suffer from a degree of spatial auto-correlation that should be accounted for. Statistics have come a long way since Fisher in 1918, and ecologists and evolutionary biologists have been amongst the leaders of the field for good reasons. With modern statistical modelling we can now ask ecological questions that were impossible to answer even 20 years ago. Similarly, evolutionary biology has also had an incredible rise in usage of more complex evolutionary models – but this is enabling us to understand evolution in a more realistic way. From a disease perspective, these advances in evolutionary models, for example, have enabled me to understand virus spread in pretty high resolution. Ideally, we should be celebrating these statistical advances rather than lamenting them.
Nonetheless I admit that there is a problem with papers becoming more difficult to read and this is linked to the rise of statistical ‘machismo’. I envisage a two-pronged approach to practically dealing with this. First, as mentioned in the comments sections of both the articles below, statistical education for ecologists would help. For me, I only really got an introduction to frequentist stats in the last year of my undergraduate degree, and limited formal training during my PhD. Ideally, statistics should be incorporated in the curriculum much earlier. In an ideal world basic frequentist statistics should be mandatory for the first year of an ecology degree, GLMMs/multivariate analysis introduced in second year and Bayesian/machine learning methods in the last years. Stating the obvious, statistical education should be more strongly linked to the statistics most commonly employed by ecologists.
Secondly, papers using complicated methods should be strongly encouraged to provide a few sentences at least in the methods outlining why the method employed is justifiable and, in basic terms, how the method works (and what its limitations are). This hopefully would limit people from using unnecessarily complex designs. In my experience, when I have employed complex methods reviewers have rightly demanded this of me. I think people applying a technique without understanding what is actually going on ‘under the hood’ and what to look out for is a separate problem – there is really no excuse for this. Email the person that created the R package if you need some assistance. These folks are usually pretty obliging and sometimes can catch inappropriate usage. Obviously it is in their best interest for the package to be employed and used appropriately.
After many years of work, our FIV bobcat paper is just out in early view in Molecular Ecology: http://onlinelibrary.wiley.com/doi/10.1111/mec.14375/full