Tip Dating in BEAST

Assessing how robust the molecular clock is forms a critical step in many evolutionary analyses. Usually, when I am given a set of aligned sequences, I turn to TempEst to test the ‘clocklikeness’ of the data. However, after the release of ‘TIPDATINGBEAST’ (I’ll call it TDBEAST for short) I may turn to TempEst much less often. Whilst TempEst is useful for getting a qualitative idea of the temporal signal in the data, I find it annoying that it’s hard to drill down and find which sequences are leading to bias. Furthermore, TempEst is sensitive to the input tree, which can lead to problems, and I have had issues with guessDates as well, which is slightly irritating. TDBEAST solves lots of these issues and more, and provides a robust method to test how ‘clock-like’ your sequences are. The TDBEAST R package does two things using BEAST log files and .xml files:

1. Uses a date randomization test to measure temporal signal and provides a nice visualization to check the results.

2. Uses leave-one-out cross-validation to work out the likely culprits that could be skewing results. These sequences could, for example, have the wrong date assigned to them by mistake, which is a real problem when working on sequences from the field.
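To make the logic of these two checks concrete, here is a minimal Python sketch. It is not the TDBEAST API: the function names are hypothetical, an equal-tailed interval stands in for the HPD interval, and the posterior samples are assumed to have already been read from the BEAST log files. It also assumes a commonly used criterion for the randomization test (the interval for the rate under the real dates should not overlap the interval from any date-randomized run) and, for the leave-one-out check, that a sequence is suspect when its recorded date falls outside the interval estimated for it.

```python
import random


def shuffle_dates(tip_dates, rng):
    """Return a copy of {taxon: date} with the dates permuted among taxa."""
    taxa = list(tip_dates)
    dates = [tip_dates[t] for t in taxa]
    rng.shuffle(dates)
    return dict(zip(taxa, dates))


def central_interval(samples, mass=0.95):
    """Equal-tailed interval of the samples (a simple stand-in for an HPD)."""
    ordered = sorted(samples)
    last = len(ordered) - 1
    lower = ordered[int((1 - mass) / 2 * last)]
    upper = ordered[int((1 + mass) / 2 * last)]
    return lower, upper


def has_temporal_signal(real_rates, randomized_runs):
    """True if the real-date rate interval overlaps no randomized-date interval."""
    real_lo, real_hi = central_interval(real_rates)
    for run in randomized_runs:
        lo, hi = central_interval(run)
        if lo <= real_hi and real_lo <= hi:  # the two intervals overlap
            return False
    return True


def suspect_sequences(known_dates, estimated_date_samples):
    """Taxa whose recorded date lies outside the interval estimated for it."""
    flagged = []
    for taxon, known in known_dates.items():
        lo, hi = central_interval(estimated_date_samples[taxon])
        if not lo <= known <= hi:
            flagged.append(taxon)
    return flagged
```

In practice each shuffled set of dates would be written into its own .xml file and run through BEAST; the sketch only covers the bookkeeping before and after those runs.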

Whilst I’m having a few teething issues with the code, overall the ‘how to’ guide is excellent and the code seems straightforward. TDBEAST definitely seems like a valuable addition to my phylogenetic toolbox.

Here are the links: http://onlinelibrary.wiley.com/doi/10.1111/1755-0998.12603/full

http://onlinelibrary.wiley.com/store/10.1111/1755-0998.12603/asset/supinfo/men12603-sup-0001-SupInfo.pdf?v=1&s=3c884538553c0742a65d37dfe6fdeceaf2326573

https://cran.r-project.org/web/packages/TipDatingBeast/TipDatingBeast.pdf

Integrating networks and phylogenies

Considering the broad similarities between networks and phylogenies, it is amazing that they have, until recently, been very separate approaches. In the world of epidemiology, transmission trees have been gaining momentum over the last 5 years (see the excellent review by Hall et al.: https://www.ncbi.nlm.nih.gov/pubmed/27217184) as they turn phylogenies into something that more or less equates to transmission. Now it appears that ecologists are doing the same thing, with this really interesting paper just out in Methods in Ecology and Evolution (see link below). The package attached to Schliep et al. looks really cool and I can imagine it will be of use to a broad array of disciplines. I’m looking forward to trying it out myself.

Here is the link: http://onlinelibrary.wiley.com/doi/10.1111/2041-210X.12760/full

Demystifying the BEAST (Part I)

If you are like me, when you first opened up the BEAST phylogenetics software, or more specifically the BEAUti GUI, you were immediately impressed but a little overwhelmed by the number of options you have for reconstruction and phylogenetic analysis more broadly. It’s an amazingly powerful and accessible free software package. Particularly with the extra utilities such as FigTree and SPREAD (I will talk about these in future posts), you really can’t beat BEAST. The tutorials (see below) provide step-by-step instructions, which basically means that anyone can run BEAST. Furthermore, if you have any technical problems, they are often quickly resolved via the Google group (https://groups.google.com/forum/#!forum/beast-users).

Despite this, understanding the numerous decisions that have to be made throughout the process, from sequence alignment to final product, is the real challenge here. These decisions can make real differences to the inferences you make, so they are critical to get right. One weakness of the tutorials is that they tell you how, but don’t give enough detail on why you’d make a particular decision (e.g., why one tree prior over another?). The aim of the following posts will be to demystify this process a little and direct you to useful resources.

The plan is to go through BEAUti tab by tab, and I will assume that you know how to import your data and set dates/traits (all of this can be learnt easily from the tutorials).

So,

Part 2 –  Sites

Part 3 – Clocks

Part 4 – Trees

Part 5 – States, Priors and Operators

Part 6 – Running the whole thing and model selection

In the meantime, if you haven’t already, download BEAST 1.8.4 (http://beast.bio.ed.ac.uk/tutorials) and go through the tutorials: http://beast.bio.ed.ac.uk/tutorials.

Dealing with the dilution effect

The idea that there is reduced disease exposure risk in systems with high biodiversity is certainly an attractive one. Basically, having more species in an ecosystem ‘dilutes’ the chances of a parasite coming across a good host. The argument follows that protecting biodiversity can have positive implications for human health. This idea, however, has sparked a massive amount of controversy and passionate debate; see an excellent summary of the debate here: https://parasiteecology.wordpress.com/2013/12/08/dilution-effect-debates-randolph-and-dobson-vs-ostfeld/

This idea, originating from a Conservation Biology paper by Ostfeld and Keesing (see the link below) on Lyme disease, has been labelled everything from ‘visionary’ to a ‘fantasy’. I have mixed feelings about the paper, but I think the idea at least is an interesting one, worthy of the attention that it gets. How to even conduct research to test this idea properly is contentious: in a meeting a few years back, people who have devoted their professional careers to this idea could not come to a consensus. Adding further complexity, there is still debate on how to measure biodiversity effectively in the first place (even though there have been huge advances; see the article by Chiarucci et al. below). As with most ecological research, there are also issues in working out at what scale to assess the effect, what species to use, etc.

Unsurprisingly (if you have read my previous articles), I think that a functional and phylogenetic community ecology approach has the potential to overcome at least some of these issues. What I feel is needed to answer this question appropriately is a study design with multiple sites across known diversity gradients at multiple scales (any ideas, anyone?). This design would be coupled with vertebrate surveys (rather than relying on coarse distribution data) and next-gen metagenomic sampling of symbionts/parasites from a sample of the hosts. Using functional diversity and composition alongside taxonomic diversity and composition metrics of the hosts (and maybe the parasites?) would be useful too, and may allow for broader applicability of the findings, i.e. even though the host species composition is different, the functional composition is the same.
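As a toy illustration of that last point, the sketch below compares two made-up sites that share no host species but contain the same functional groups. The species labels, the functional groups and the choice of Jaccard similarity are all assumptions made for the example, not taken from any of the papers linked here.

```python
def jaccard(a, b):
    """Jaccard similarity between two collections, treated as sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Hypothetical sites: {host species: functional group}
site_1 = {"species_a": "small seed-eating rodent",
          "species_b": "ground-foraging bird"}
site_2 = {"species_c": "small seed-eating rodent",
          "species_d": "ground-foraging bird"}

# Taxonomic composition compares the species themselves (the dict keys).
taxonomic_similarity = jaccard(site_1, site_2)

# Functional composition compares the groups those species belong to.
functional_similarity = jaccard(site_1.values(), site_2.values())

print(taxonomic_similarity, functional_similarity)
```

Here the two sites look completely different taxonomically but identical functionally, which is the situation where functional metrics could let findings transfer between sites.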

Anyway, if anyone has any ideas and would like to work on a similar project in the future, let me know!

The original Ostfeld and Keesing paper: http://onlinelibrary.wiley.com/doi/10.1046/j.1523-1739.2000.99014.x/abstract

Chiarucci et al diversity article: http://rstb.royalsocietypublishing.org/content/366/1576/2426