A collection of computer systems and programming tips that you may find useful.
 
Brought to you by Craic Computing LLC, a bioinformatics consulting company.

Wednesday, November 16, 2011

Plotting a simple bar plot in R

Here is my cheat sheet for loading data from a CSV file into the R statistics package, plotting one column of data as a bar plot and saving it as a PNG image.

My input file is simple CSV file with two columns :
Position,Entropy
1,0.2237
2,0.4051
3,0.1312
4,0.1312
[...]

I want to load this into R as a data frame, then plot the values in the second column (Entropy) as a bar plot, using the values in the first column as the labels for the bars.

First step is to use read.csv (a shortcut version of read.table)
> d <- read.csv('<your path>/myfile.csv')

I can use barplot directly on the data frame (d)
> barplot(d[,'Entropy'])

But the default plot options are not great, so I can add custom options to the call, such as a main title and labels for X and Y axes. I set the lower and upper limits for the Y axis to be 0.0 and 1.0 and use the values in the first column of the data frame as the labels for the bars on the X axis
> barplot(d[,'Entropy'], main="Entropy Plot", xlab="Position",
  ylab="Entropy", ylim=c(0.0,1.0), names.arg=d[,'Position'])

The plot is displayed on my screen and looks the way I want it. To save it out to an image file, I specify the plotting device ('png') and the output filename, repeat the plot and then close/detach the plotting device.
> png('<your path>/myfile.png')
> barplot(d[,'Entropy'], main="Entropy Plot", xlab="Position",
  ylab="Entropy", ylim=c(0.0,1.0), names.arg=d[,'Position'])
> dev.off()

This produces the following image:

There are endless configuration options to play with but this works for a quick, simple plot.

Here are the steps without the prompts for you to cut and paste as needed:

d <- read.csv('<your path>/myfile.csv')
png('<your path>/myfile.png')
barplot(d[,'Entropy'], main="Entropy Plot", xlab="Position", ylab="Entropy", ylim=c(0.0,1.0), names.arg=d[,'Position'])
dev.off()

You could put these into a text file and run it from your system command line like this:

$ R CMD BATCH myfile.R

R is an incredibly useful system but as an occasional user I find the syntax and command names/options  hard to learn. Hopefully this simple example helps you with the learning curve.

... and always remember - arrays in R start at 1, not 0 ...


No comments:

Archive of Tips