Have you ever had that feeling of jubilation (or something similar) when you read a paper that perfectly synthesizes ideas you’ve been thinking about for a long time? That’s exactly the emotion I experienced when reading the 2014 paper by Matthew Fitzpatrick and Stephen Keller that uses generalized dissimilarity modelling (GDM) and gradient forests (GF) to link genomic data to environmental gradients. How had I not seen this previously?
Just a qualifier: I haven’t used these methods yet, but I can see the gap that these analyses fill. I have always considered random forests to be an intuitive way to understand the ecological drivers of species distributions, and lamented the fact that I didn’t know how to do the same with genetic data. I also had the delusion that one day I would get around to thinking about the idea in more detail and putting something together. No need to now, I guess! Their solution was quite simple and, from what I can tell, consists of basically just performing GF on a SNP data set. I think the next step with this type of approach is to think about how to incorporate evolutionary models (e.g., Brownian motion).
GDM also looks like an interesting technique that I’d seen previously but never really grappled with. There is also a phylogenetic GDM, which could be useful (Rosauer et al. 2014, Ecography). I really like their idea of using distance-based MEMs (basically a version of spatial PCoA) to incorporate spatial autocorrelation into the GF model. I’ve done something similar with other modelling techniques, so it’s nice to see this method used in this context. Anyway, in short, I’m looking forward to testing and thinking more about these new and potentially really useful analysis approaches.
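To make the "spatial PCoA" description a bit more concrete, here is a rough sketch of how distance-based spatial eigenvectors can be built from site coordinates. It follows the classic PCNM-style recipe as I understand it (truncate the distance matrix at the longest edge of the minimum spanning tree, then run a PCoA and keep the axes with positive eigenvalues). It is not taken from the paper, the function names `mst_longest_edge` and `spatial_eigenvectors` are made up for illustration, and Python with NumPy is my own choice here; dedicated packages may differ in details such as scaling or which axes they retain.

```python
import numpy as np


def mst_longest_edge(dist):
    """Longest edge of the minimum spanning tree (Prim's algorithm)."""
    n = dist.shape[0]
    in_tree = np.zeros(n, dtype=bool)
    in_tree[0] = True
    best = dist[0].copy()  # cheapest known link from the tree to each site
    longest = 0.0
    for _ in range(n - 1):
        candidates = np.where(in_tree, np.inf, best)
        j = int(np.argmin(candidates))
        longest = max(longest, float(candidates[j]))
        in_tree[j] = True
        best = np.minimum(best, dist[j])
    return longest


def spatial_eigenvectors(coords, tol=1e-8):
    """PCNM-style spatial axes from site coordinates (illustrative sketch).

    Returns (axes, eigenvalues); each column of `axes` is one spatial
    variable that could be used as a predictor alongside environment.
    """
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    # Keep only "neighbour" distances; everything farther apart is set
    # to four times the threshold (the conventional PCNM choice).
    thresh = mst_longest_edge(dist)
    trunc = np.where(dist <= thresh, dist, 4.0 * thresh)

    # PCoA: double-centre -0.5 * D^2 and eigen-decompose.
    n = coords.shape[0]
    centre = np.eye(n) - np.ones((n, n)) / n
    gower = centre @ (-0.5 * trunc ** 2) @ centre
    evals, evecs = np.linalg.eigh(gower)
    order = np.argsort(evals)[::-1]
    evals, evecs = evals[order], evecs[:, order]

    # Retain axes with clearly positive eigenvalues, scaled PCoA-style.
    keep = evals > tol * evals.max()
    return evecs[:, keep] * np.sqrt(evals[keep]), evals[keep]
```

Calling `spatial_eigenvectors(coords)` on an array of site coordinates (one row per site) gives a matrix with one row per site, and the columns could then sit next to the environmental variables as extra predictors in whatever model is being fitted.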
Here is the link: http://onlinelibrary.wiley.com/doi/10.1111/ele.12376/full