Showing posts with label assay. Show all posts
Showing posts with label assay. Show all posts

Monday, 7 March 2016

Illustrating some R Fundamentals for VizBi 2016

I have surveyed the participants for my R tutorial at VizBi2016. The range of experience is a bit of a challenge. Some participants that have used R once or twice and others with plenty of experience. It's going to be a bit difficult to make the morning useful for everybody but I will try....

For the beginners, I'm going to start with this script which is designed to illustrate three of the fundamentals of R:

  • functions
  • objects
  • packages

It's an expansion of the script to draw a graph of a protein assay with ggplot:


Here is the script:
# First script for VizBi2016
# written to illustrate the Fundamentals of R

## first 'Fundamental' is ** functions **
# functions do things!
# you know it's a function because it contains brackets

# c() is a function - sometimes called combine

## second 'Fundamental' is ** objects **
# objects contain data
# we make them with functions

# example:
# using the c() function to create the object prot
# Protein Concentrations
prot <- c(0.000, 0.016, 0.031, 0.063, 0.125, 0.250, 0.500, 1.000, 
          0.000, 0.016, 0.031, 0.063, 0.125, 0.250, 0.500, 1.000) 

# a function has arguments - always inside the brackets

# Absorbance from my protein assay
abs <- c(0.329, 0.352, 0.349, 0.379, 0.417, 0.491, 0.668, 0.956, 
         0.327, 0.341, 0.355, 0.383, 0.417, 0.446, 0.655, 0.905)

# these objects are called 'vectors' - key term

## now we are going ot play with some of these objects to 

#Calculate the line using the linear model function lm()
line <- lm(abs~prot)

# creates another kind of object - a list
# multiple parts with different type of data in each part

# too look at the object type line
line
summary(line)

# access particular parts of the object line
# using the $ dollar sign
# Equation of a line y = mx + c
# In our case abs = slope * prot + intercept
# ukn.prot = (abs - intercept)/slope
int <- summary(line)$coefficients[1]
slope <- summary(line)$coefficients[2]

# now calculate some unknown protein concs from absorbances
# put the unknowns into a vector
abs.ukns <- c(0.554, 0.568, 0.705)

# rearrange the equation of the line to ukn.prot = (abs - intercept)/slope
prot.ukns <- (abs.ukns - int)/slope

# CONCEPT ALERT - functions work on a vector of numbers


## graphing

# quick graph using base R
plot(prot, abs)
abline(line)

## BUT there is a better way!!

## third 'Fundamental' is ** packages **
# these are collections of functions that have been written by others
# these can be installed 
# install.packages("ggplot2")
# and then activated
library(ggplot2)

# Convert from one type of object to another
# another kind of object is a data.frame
# a bit like an Excel spreadsheet
# often when we import data we get a data frame
data <- as.data.frame(prot)
data$abs <- abs


# make a simple graph with ggplot
# step one - add the data and then the 'aesthetics' 
# key asthetics in this case x and y
p <- ggplot(data=data,          # specify the data frame with data
            aes(x=prot, y=abs)) # specify x and y for the graph

# creates a list and a blank plot

# add a type of graph 
p <- p + geom_point()

# show the graph
p

# add a line
p + stat_smooth(method = "lm")

# more detailed plot: 
p <- ggplot(data=data,          # specify the data frame with data
            aes(x=prot, y=abs)) +   # specify x and y for the graph
  geom_point() +          # make a scatter plot
  stat_smooth(method = "lm") +  # add a linear model line
  xlab("[Protein] (microg/ml)") +   # label x-axis
  ylab("Absorbance (570nm)") +    # label y-axis
  ggtitle("Protein Assay") +  # add a title
  theme_bw() +      # a simple theme
  expand_limits(y=c(0,1)) +    # customise the y-axis
  annotate(geom="text", x=0.85, y= 0.6, label="Abs         Prot",  color="red")

# put the answers on the graph
for (i in 1:length(abs.ukns)){
  p <- p + annotate(geom="text", x = 0.8, y = (0.6 - i/20), label=abs.ukns[i])
  p <- p + annotate(geom="text", x = 0.92, y = (0.6 - i/20), label=round(prot.ukns[i], 3))
}

p # show us the graph...

## To Learn from this script:
# run this script line by line yourself and see what happens.
# watch what happens

# make the plot object again and try some things...
p <- ggplot(data=data,          # specify the data frame with data
            aes(x=prot, y=abs)) # specify x and y for the graph

# try to change the colour of the points
p + geom_point(colour = "blue")

# try to change the size of the points
p + geom_point(size = 5, colour = "red")

# look at the documentation for geom_point
# http://docs.ggplot2.org/current/geom_point.html
# try some of the functions and see if you can make sense of them






Tuesday, 26 May 2015

R Markdown and html....

As part of developing the slides for R for Biochemists, I have been using R Markdown. This allows the inclusion and the execution of scripts within a Markdown document.

I have prepared some the Markdown file on Github and the converted html file is there too. 

Wednesday, 22 April 2015

Drawing a cell death curve and calculating an LD50

I'm going to show you the this analysis before the script.
We do a lot of drug testing in our laboratory at Cardiff and it's a valuable thing for biochemists and pharmacologists. To do the analysis, most of the lab usually use a different (expensive) statistical package but I like R.
The experimental set up, done by a student in the lab, is as follows:
  • using some cells and add various concentrations of a novel drug
  • leave for 48 hours
  • then measure how many cells are dead
  • the experiment was done four times with the doses improved each time
Here is the graph and calculated LD50 - a measure of how good the drug is:


Here is the script I used to generate this graph:

### START of SCRIPT 

###  this is the data   ###
# data from four experiments
conc <- c(5.00E-07, 1.00E-06, 1.00E-05, 
          5.00E-07, 1.00E-06, 5.00E-06, 1.00E-05, 2.00E-05, 
          5.00E-07, 1.00E-06, 2.50E-06, 5.00E-06, 1.00E-05, 
          5.00E-07, 1.00E-06, 2.50E-06, 5.00E-06, 1.00E-05)
dead.cells <- c(34.6, 47.7, 81.7, 
                37.6, 55.7, 89.1, 84.3, 85.2, 
                34.4, 46.1, 76.2, 84.3, 84.1, 
                24.5, 26.1, 60.6, 82.7, 87)

# transform the data to make it postive and put into a data frame for fitting 
data <- as.data.frame(conc)   # create the data frame
data$dead.cells <- dead.cells
data$nM <- data$conc * 1000000000
data$log.nM <- log10(data$nM) 

###  fit the data  ###
# make sure logconc remains positive, otherwise multiply to keep positive values
# (such as in this example where the conc is multiplied by 1000000

fit <- nls(dead.cells ~ bot+(top-bot)/(1+(log.nM/LD50)^slope),
             data = data,
             start=list(bot=20, top=95, LD50=3, slope=-12))

###  Plot the results  ###
#this lets you graph your calculated equations nice and pretty

x <- seq(0,6, length=100)
y <- (coef(fit)["bot"]+(coef(fit)["top"]-coef(fit)["bot"])/(1+(x/coef(fit)["LD50"])^coef(fit)["slope"]))
m <- coef(fit)

# plot the points first
plot(data$nM, data$dead.cells, 
     log="x", 
     main="Drug Dose Response and LD50", 
     xlab="Drug concentration (microM)", 
     ylab="Dead cells (% of cells)",
     xlim= c(500, 10000),
     ylim= c(20,100),
     xaxt = "n")   # suppresses the labels on the x-axis

# adds the x-axis labels they way I want it to look
axis(1, at=c(500, 1000, 2500, 5000, 10000),  lab=c("0.5","1","2.5","5","10"))

# adds the fitted non-linear line 
lines(10^x,y, lty="dotted", col="red", lwd =2)

# add the LD50 in the legend which allows nice positioning. 
rp = vector('expression',1)
rp[1] = substitute(expression(LD50(microM) == MYVALUE), 
                   list(MYVALUE = format((10^m[3])/1000,dig=3)))[2]
legend('topleft', legend = rp, bty = 'n')

# END OF SCRIPT

To prepare this script, I used the following sources:
Thanks to Mel for the data. 

Monday, 20 April 2015

Let's analyse a protein assay....

Hopefully, you've already downloaded R and R-Studio. R-Studio is not essential.

Here is a script to do a protein assay:

# Protein Concentrations
prot <- c(0.000, 0.016, 0.031, 0.063, 0.125, 0.250, 0.500, 1.000, 0.000, 0.016, 0.031, 0.063, 0.125, 0.250, 0.500, 1.000) 

# Absorbance from my protein assay
abs <- c(0.329, 0.352, 0.349, 0.379, 0.417, 0.491, 0.668, 0.956, 0.327, 0.341, 0.355, 0.383, 0.417, 0.446, 0.655, 0.905)

#Plot the data simply
plot(abs~prot)

#Calculate the line using the linear model function
line <- lm(abs~prot)

#Draw the line
abline(line)

#Improve the graph:
plot(abs~prot, 
     xlab = "[Protein] (microg/ml)",
     ylab = "Absorbance (570nm)",
     main = "Protein Assay 20th April 2015")
abline(line)

r2 <- round(summary(line)$r.squared, 3)
mylabel = bquote(italic(R)^2 == .(format(r2, digits = 3)))
text(x = 0.2, y = 0.9, labels = mylabel)


#Equation of a line y = mx + c
#In our case abs = slope * prot + intercept
# ukn.prot = (abs - intercept)/slope
int <- summary(line)$coefficients[1]
slope <- summary(line)$coefficients[2]
mylabel = bquote(y == .(format(slope, digits = 3))*x + .(format(int, digits = 3)))
text(x = 0.2, y = 0.8, labels = mylabel)

#now calculate some unknown protein concs from absorbances
#put the unknowns into a vector
abs.unknowns <- c(0.554, 0.568, 0.705)
#rearrange the equation of the line to ukn.prot = (abs - intercept)/slope
prot.unknowns <- (abs.unknowns - int)/slope

#put the answers on the graph
text(x = 0.8, y = (0.6), "Abs")
text(x = 0.92, y = (0.6), "Prot")
for (i in 1:length(abs.unknowns)){
  text(x = 0.8, y = (0.6 - i/20), abs.unknowns[i])
  text(x = 0.92, y = (0.6 - i/20), round(prot.unknowns[i], 3))
}

# END OF SCRIPT



This can be copy and pasted into the script window of R-Studio. 




Then each line can be run one by one by pressing the Run button:



This will generate the following output: