Random forests: identifying package conflicts in R

I just lost a morning of my life to a strange R problem. As a reader of my blog you may know my love for machine learning and Gradient Forests. It turns out that if you have the univariate version (‘Random Forests’) loaded in the same session, your beautiful Gradient Forest is no more: just a barren wasteland remains. Excuse the terrible metaphor. Basically, there is some conflict between the two packages that makes Gradient Forest produce the horrifying message: “The response has five or fewer unique values.  Are you sure you want to do regression?”. What made this worse is that Gradient Forests was working perfectly yesterday (i.e. I hadn’t loaded Random Forests). Today I found an inconsistency in the data and made some reasonably sized changes, got distracted and ran Random Forests on another piece of data, then came back to analyse my modified data from yesterday, and bang, I got the message above. Topping it all off was a second error: “The gradient forest is empty”.

The horror! Was it the changes I made to the data? Did I modify the code and forget (I thought my bookkeeping was pretty good…)? What’s going on? I then ran the same analysis on the old dataset and got the same error (phew), and eventually, by process of elimination, worked out that it was a package conflict. I wonder how many collective hours people in science spend diagnosing problems like this? Millions, I suspect. Anyway, I guess I learnt something (?) and will check for this type of issue more frequently.
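For anyone who wants to check for this kind of clash faster than by process of elimination, here is a minimal sketch in base R. It rests on an assumption that this post has not verified: that the conflict is a function of the same name (for example `randomForest()`) being attached by more than one package, so that whichever package was loaded last masks the other. The helper name `masked_names()` is made up for illustration; `search()`, `ls()`, `conflicts()` and `detach()` are base R.

```r
# Hypothetical helper: list every object name that is provided by more
# than one attached package, together with the packages that provide it.
masked_names <- function() {
  attached <- grep("^package:", search(), value = TRUE)
  exports <- lapply(attached, function(p) ls(as.environment(p)))
  names(exports) <- attached

  all_names <- unlist(exports, use.names = FALSE)
  shared <- unique(all_names[duplicated(all_names)])

  owners <- lapply(shared, function(nm) {
    attached[vapply(exports, function(e) nm %in% e, logical(1))]
  })
  names(owners) <- shared
  owners
}

# Which attached packages provide an object called "randomForest"?
# NULL means no clash on that name in the current session.
print(masked_names()[["randomForest"]])

# Base R's own report of masked objects, grouped by search-path entry.
clashes <- conflicts(detail = TRUE)

# One possible workaround (an assumption, not a confirmed fix): detach
# the package you are not currently using before re-running the analysis.
if ("package:randomForest" %in% search()) {
  detach("package:randomForest")
}
```

Running `masked_names()` right after loading both packages, and again after detaching one, should show whether a shared name is the culprit. If it is, restarting R and loading only the package you need is the blunt but reliable alternative.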

Gradient Forests: http://gradientforest.r-forge.r-project.org/biodiversity-survey.pdf
