I was just reading a few older posts from Dynamic Ecology and stumbled into this post on the use and abuse of AIC (https://dynamicecology.wordpress.com/2015/05/21/why-aic-appeals-to-ecologists-lowest-instincts/).
After reading Brian McGill's really cool article, I started to read the comments section (as suggested in the footnotes). Wow – this is one way to get an AIC education! Probably the longest but most insightful comments section I've ever read – absolutely worth the time. Lots of great ecology minds providing a really interesting debate. I agree with Mark Brewer that the comments section and the original article should be turned into a Methods paper.
Have you ever had that feeling of jubilation (or some similar feeling) when you read a paper that perfectly synthesizes ideas you've been thinking about for a long time? That's exactly the emotion I experienced when reading the 2014 paper by Matthew Fitzpatrick and Stephen Keller, which uses generalized dissimilarity modelling (GDM) and gradient forests (GF) to link genomic data to environmental gradients. How had I not seen this previously?
Just a qualifier: I haven't used these methods yet, but I can see the gap that these analyses fill. I have always considered random forests to be an intuitive way to understand the ecological drivers of species distributions, and lamented the fact that I didn't know how to do the same with genetic data. I also had the delusion that one day I would get around to thinking about the idea in more detail and putting something together. No need to now, I guess! Their solution was quite simple and, from what I can tell, consists of basically just performing GF on a SNP data set. I think the next step with this type of approach is to think about how to incorporate evolutionary models (e.g., Brownian motion).
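To make the "GF on a SNP data set" idea concrete, here is a minimal sketch of the core step that gradient forests builds on: fitting one random forest per locus, with environmental variables as predictors and allele frequencies as the response, then collecting predictor importances. This is not the gradientForest R package (which also aggregates split improvements along each gradient); it is an illustrative Python analogue using sklearn, on simulated data.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Simulated data: 100 populations, 3 environmental predictors, 5 SNPs
n_pops, n_env, n_snps = 100, 3, 5
env = rng.uniform(0, 1, size=(n_pops, n_env))

# SNP 0's allele frequency tracks the first environmental axis;
# the remaining SNPs are pure noise
snp_freqs = rng.uniform(0, 1, size=(n_pops, n_snps))
snp_freqs[:, 0] = np.clip(0.2 + 0.7 * env[:, 0] + rng.normal(0, 0.05, n_pops), 0, 1)

# Fit one random forest per SNP and record predictor importances,
# mimicking the per-locus fits that gradient forests aggregates
importances = np.zeros((n_snps, n_env))
for i in range(n_snps):
    rf = RandomForestRegressor(n_estimators=200, random_state=0)
    rf.fit(env, snp_freqs[:, i])
    importances[i] = rf.feature_importances_

# SNP 0 should load most heavily on the first environmental predictor
print(importances[0])
```

In the real method the per-locus importances are then combined into cumulative importance curves along each environmental gradient, which is what lets you map genomic turnover in environmental space.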
GDM also looks like an interesting technique that I'd seen previously but never really grappled with. There is also a phylogenetic GDM (Rosauer et al. 2014, Ecography), which could be useful too. I really like their idea of using distance-based MEMs (basically a version of spatial PCoA) to incorporate spatial autocorrelation into the GF model. I've done something similar with other modelling techniques, so it's nice to see this method used in this context. Anyway – in short, I'm looking forward to testing and thinking more about these new and potentially really useful analysis approaches.
Here is the link: http://onlinelibrary.wiley.com/doi/10.1111/ele.12376/full
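For readers unfamiliar with the "spatial PCoA" behind MEMs: the spatial eigenvectors come from eigendecomposing a double-centred matrix of between-site distances, and the eigenvectors with positive eigenvalues are then used as spatial predictors. The sketch below shows that PCoA step on hypothetical site coordinates in plain numpy; real distance-based MEMs additionally truncate the distance matrix at a neighbourhood threshold before the decomposition, which this simplified version skips.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical site coordinates: 30 sites in 2-D space
coords = rng.uniform(0, 10, size=(30, 2))

# Pairwise Euclidean distances between sites
d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)

# PCoA-style double centring of the squared distance matrix
n = d.shape[0]
j = np.eye(n) - np.ones((n, n)) / n      # centring matrix
g = -0.5 * j @ (d ** 2) @ j

# Eigenvectors with positive eigenvalues serve as spatial predictors
vals, vecs = np.linalg.eigh(g)
order = np.argsort(vals)[::-1]
vals, vecs = vals[order], vecs[:, order]
mems = vecs[:, vals > 1e-8]

# Columns of `mems` can now enter a GF (or any other) model as
# covariates that soak up spatial autocorrelation
print(mems.shape)
```

Because plain Euclidean distances on 2-D coordinates are used here, only two positive eigenvalues survive; the truncation step in genuine dbMEMs is what produces the larger family of fine- to broad-scale spatial eigenvectors.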
Ever since I was a bright-eyed, naive undergrad, I've been indoctrinated into the distance-metric-coupled-with-randomization-test community statistical paradigm of Legendre and Anderson (among others). That was all I knew, and I thought I could apply this plethora of techniques reasonably well. I knew model-based techniques for analysing community data sets existed, but largely ignored them. That was until I met Will Pearse (http://willpearse.com/), and my metric/randomization world has been slipping ever since.
As David Warton suggests (see the link to mvabund), metric/randomization techniques were great shortcuts, but these shortcuts are no longer necessary given increases in computing power. I think he slightly overstates the mean-variance problem, as most people who deal with community abundance data apply some type of transformation. Nonetheless, model-based approaches, such as mvabund and the eco-phylogenetic PGLMM, have numerous advantages in terms of power and flexibility (see Wang et al. 2012 and Ives and Helmus 2011). Interestingly, considering the advantages offered, neither package is commonly used (particularly PGLMM), with 39 citations for PGLMM and just over 100 for mvabund, compared to the thousands (I'm guessing) that have used metric/randomization approaches. I wonder when this will change? They require only marginally more effort to use, and there are plenty of well-written guides to ease the skeptical community ecologist into them (see Will's below). Maybe, for most people, it takes some convincing to convert away from techniques you learnt during your PhD…
I'm not saying that metric/randomization methods will become redundant, by the way. In fact, if you are currently interested in what landscape/biotic factors shape phylogenetic patterns, as I am, I don't know of any other set of methods that works better. If I'm missing something – let me know! Also, as Ives and Helmus suggest, metric/randomization methods will still be important, at a minimum, for exploring complex community data sets prior to conducting PGLMM or other similar analyses.
Wang et al. 2012: http://onlinelibrary.wiley.com/doi/10.1111/j.2041-210X.2012.00190.x/full
Ives and Helmus 2011: Ecological Monographs, doi: 10.1890/10-1264.1
Will’s PGLMM guide: https://cran.r-project.org/web/packages/pez/vignettes/pez-pglmm-overview.pdf