A site to help Biochemists learn R.


Wednesday, 15 July 2015

A bar chart looking at the number of Athena SWAN awards...

Bar charts are commonly used and easy to understand. They allow us to present data in a visual way, and many people prefer pictures to numbers. I don't just analyse biological data, so here is an example of using ggplot to create a stacked bar chart showing the increase in Athena SWAN Awards over the last six years.

Here is the graph:

Here is the script:

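# activate the required packages
library(ggplot2)   # for the plotting functions
library(reshape2)  # for melt()
library(ggthemes)  # for theme_few()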

# here is the data - gathered from various awards booklets
# and the ECU press release
years <- c("2009","2010","2011","2012","2013", "2014")
bronze <- c(19,  13, 25, 66, 135, 152)
silver <- c(16, 16, 14, 26, 40, 43)
gold <- c(0,1,0,2,4,0)

# create the data frame required for ggplot
swan.df <- as.data.frame(years)
swan.df$Bronze <- bronze
swan.df$Silver <- silver
swan.df$Gold <- gold

# reshape the data from wide into long format
as.melt <- melt(swan.df, id.vars = "years", value.name = "number", variable.name = "Awards")

# make the first version of the plot
p <- ggplot(as.melt, aes(x=years, y=number, fill=Awards)) + 
     geom_bar(stat="identity") +  # this bit makes the barplot
     scale_fill_manual(values=c("brown", "grey", "yellow")) + # control the colours
     xlab("") + #no need for the "years" x-axis
     theme_few() # nice clean theme  

# I want to add a nice y-axis label
# and increase the size of the text
p <- p + ylab("Number of Awards") +
     theme(axis.title.y = element_text(size = 14 )) + 
     theme(axis.text = element_text(size = 12))

# make the legend a bit bigger and move it to top left 
# (ggplot2 3.5.0 and later prefer legend.position = "inside"
# together with legend.position.inside = c(0,1))
p <- p + theme(legend.position=c(0,1), # moves it to the top left
               legend.justification=c(0,1), # anchors the legend by its top left corner
               legend.text=element_text(size = 12), # increase the size of the labels
               legend.title=element_text(size = 12)) # and the title

# have a look at the bar chart
p

# save the graph....
ggsave("AthenaSWANawards09_14.pdf", plot = p)
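The reshaping step in the script above can be easier to follow if you see the long format written out by hand. Here is a minimal sketch in base R that uses the same numbers but needs no extra packages. The object names swan.long and totals are my own for this illustration and are not part of the original script.

```r
# same figures as in the script above
years  <- c("2009", "2010", "2011", "2012", "2013", "2014")
bronze <- c(19, 13, 25, 66, 135, 152)
silver <- c(16, 16, 14, 26, 40, 43)
gold   <- c(0, 1, 0, 2, 4, 0)

# build the long format directly: one row per year and award level
# (swan.long is a hypothetical name used only in this sketch)
swan.long <- data.frame(
  years  = rep(years, times = 3),
  Awards = factor(rep(c("Bronze", "Silver", "Gold"), each = length(years)),
                  levels = c("Bronze", "Silver", "Gold")),
  number = c(bronze, silver, gold)
)

# look at the first few rows to check the layout
head(swan.long)

# total awards per year, i.e. the height of each stacked bar
totals <- tapply(swan.long$number, swan.long$years, sum)
print(totals)
```

This data frame has the same three columns that melt() produces, so it should work with the same ggplot call. Keeping the factor levels in the order Bronze, Silver, Gold matters because scale_fill_manual assigns the colours in that order.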


Important resource: http://www.cookbook-r.com/Graphs/Bar_and_line_graphs_(ggplot2)/

Going beyond bar charts.

Bar charts are not very data rich. There is often a better way to show the data or a better way to do the experiment. As a student and post-doctoral fellow, I was encouraged to investigate relationships in more detail. For example, it's better to do a time course experiment or to investigate dose relationships - or both! These suggest a more detailed investigation of a biological system, and the results would be better plotted in other ways. For examples, see the protein assay graph and the LD50 graph.
