Friday, June 07, 2013

Starting Again in R


I have twice tried to get comfortable and competent with the open-source statistical software package R, and I have twice been unsuccessful. This is not from lack of recognition of the many advantages of R over other statistical-analytical approaches (at this point, I can think of several people who would immediately start telling me how great R is, and I am trying to forestall that). It is a combination of interrelated factors: lack of success in the early stages, total unfamiliarity with the structure of R’s input/output language (and help files), and instructions and tutorials that seem targeted at users with completely different priorities than mine.

Priorities: I need to learn to use R for its two primary purposes – statistical analyses of my data and graphical representation of my data and my results. Obviously, this overall priority is the same as for most R users. What I don’t need to prioritize, in my opinion, is what the textbooks, help files, and most R-related websites put first: a list of commands and functions in R that are of broad general use. My problem with that approach is that I cannot easily think of a situation in which I would be (for example) selecting a set of columns in my dataset by number. Yet such functions are commonly presented in the opening chapters of any discussion of R. The first thing I want to do is put my dataset, which so far exists as an Excel worksheet, into R, and then have a look at it to confirm it loaded correctly. Picking out values in specific places – e.g. the third value in the fifth variable – is indeed a useful way to do some of that. But not before I’ve loaded my dataset!
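For what it's worth, here is a minimal sketch of the order of operations I have in mind: load first, look second, index third. It assumes the worksheet has been saved from Excel as a CSV file, which is my assumption about the easiest route and not something I'm taking from the book. The file contents, the column names, and the object name `mydata` are all made up for illustration.

```r
# Stand-in for a worksheet saved from Excel as CSV (hypothetical data,
# written to a temporary file so the example runs on its own).
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "site,plot,species,count,height_cm",
  "A,1,oak,4,12.5",
  "A,2,oak,7,15.0",
  "B,1,pine,3,9.8",
  "B,2,pine,5,11.2"
), csv_path)

# Step 1: load the dataset into R.
mydata <- read.csv(csv_path)

# Step 2: look at it to confirm it loaded correctly.
str(mydata)    # variable names and types
head(mydata)   # first few rows
dim(mydata)    # number of rows, number of columns

# Step 3: only now does picking out values by position become useful,
# e.g. the third value in the fifth variable.
third_of_fifth <- mydata[3, 5]
```

With a real dataset, `csv_path` would simply be the path to the exported file, and the three inspection calls would be the first check that nothing was mangled on the way in.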

This is why I am happily reading Getting Started with R: An Introduction for Biologists, by Andrew P. Beckerman and Owen L. Petchey (Amazon link to the Kindle edition; I don't know why it's not finding the paperback I bought). This book, unlike all the others I have met, starts with organizing your data and getting R up and running. I’ve read the first two chapters (of 6; it’s a small book) and my confidence has already improved. Getting demoralized by apparently deeply mysterious errors that just lead to more errors and confusion is a big part of why I’ve abandoned R in the past.

