Showing posts with label LD50. Show all posts
Showing posts with label LD50. Show all posts

Wednesday, 17 June 2015

Using ggplot to draw the LD50 graph

UPDATE: As of ggplot 2.0.0, released in Dec 2015, to use the geom_smooth() ggplot function, there is a need to put the method arguments (method.args = list()) into a list as detailed below.
As of 19th of Jan 2016, it means that the scripts here that use the geom_smooth() don't work. I will repair them all but it'll take a few days. Thanks to the commentors for sorting this out.


As mentioned previously, we do a lot of drug testing in our laboratory.
Here is the same data plotted with ggplot.
The experimental set up, done by a student in the lab, is as follows:
  • using some cells and add various concentrations of a novel drug
  • leave for 48 hours
  • then measure how many cells are dead
  • the experiment was done four times with the doses improved each time
Here is the graph and calculated LD50 - a measure of how good the drug is:



Here is the script:

# START of SCRIPT
library(ggplot2)

###  this is the data   ### 
# data from four experiments
conc <- c(5.00E-07, 1.00E-06, 1.00E-05, 
          5.00E-07, 1.00E-06, 5.00E-06, 1.00E-05, 2.00E-05, 
          5.00E-07, 1.00E-06, 2.50E-06, 5.00E-06, 1.00E-05, 
          5.00E-07, 1.00E-06, 2.50E-06, 5.00E-06, 1.00E-05)
dead.cells <- c(34.6, 47.7, 81.7, 
                37.6, 55.7, 89.1, 84.3, 85.2, 
                34.4, 46.1, 76.2, 84.3, 84.1, 
                24.5, 26.1, 60.6, 82.7, 87)

# transform the data to make it postive and put into a data frame for fitting 
data <- as.data.frame(conc)   # create the data frame
data$dead.cells <- dead.cells
data$nM <- data$conc * 1000000000
data$log.nM <- log10(data$nM) 

###  fit the data  ###
# make sure logconc remains positive, otherwise multiply to keep positive values
# (such as in this example where the iconc is multiplied by 1000

fit <- nls(dead.cells ~ bot+(top-bot)/(1+(log.nM/LD50)^slope),
           data = data,
           start=list(bot=20, top=95, LD50=3, slope=-12))
m <- coef(fit)
val <- format((10^m[3]),dig=4)

###  ggplot the results  ###
p <- ggplot(data=data,          # specify the data frame with data
            aes(x=nM, y=dead.cells)) +   # specify x and y
  geom_point() +          # make a scatter plot
  scale_x_log10(breaks = c(500, 1000, 2500, 5000, 10000, 20000))+
  xlab("Drug concentration (nM)") +   # label x-axis
  ylab("Dead cells (% of cells)") +    # label y-axis
  ggtitle("Drug Dose Response and LD50") +  # add a title
  theme_bw() +      # a simple theme
  expand_limits(y=c(20,100))   # customise the y-axis

# Add the line to graph using methods.args (New: Jan 2016)
p <- p +  geom_smooth(method = "nls",
                      method.args = list(formula = y ~ bot+(top-bot)/(1+( x / LD50)^slope), 
                                    start=list(bot=20, top=95, LD50=3, slope=-12)),
                      se = FALSE)

# Add the text with the LD50 to the graph. 
p <- p+ annotate(geom="text", x=7000, y= 60, label="LD50(nM): ",  color="red") +
   annotate(geom="text", x=9800, y= 60, label=val,  color="red")

p # show the plot

#END OF SCRIPT

Tuesday, 26 May 2015

Starting simply...

Then, here are three possible starting points for biochemists:
Here are the plots that are produced:

A protein assay

Some enzymatic data

A cell death curve

Tuesday, 5 May 2015

Controlling our graphs...

The styles of graphs in journals vary a lot and can be quite different from the default style encoded by R or by most data analysis packages. An essential part of learning to control a piece of analysis software is expressing the graphs in the format that your discipline and your journals like.

One of the things that varies a lot is whether your graph has a box or not. Boxes are presented by default in R and many other programmes. Removing the boxes leave floating axis which is not in the style of the journal either.

Working through some of these things using the data from the LD50 graph previously:

Here is the default graph:



From the default code:

plot(data$nM, data$dead.cells, 
    )

Make the x-axis log tranformed, add limits and change symbols:


Here is the code:

# Add log axis, x and y limits and change the symbols
plot(data$nM, data$dead.cells, 
     log="x", # makes it a log plot
     pch=20, # gives small black dots instead of empty circles
     xlim= c(500, 10000),
     ylim= c(0,100),
    ) 

Key next step is to suppress everything except the points. It looks a very odd:
Here is the code:
plot(data$nM, data$dead.cells, 
     log="x", 
     pch=20,
     xlim= c(500, 10000),
     ylim= c(0,100),
     axes=FALSE, # suppresses the titles on the axes
     ann=FALSE, suppresses the axes and the box  
  ) 


Now,  add back axis first:

Code:
axis(1, at=c(200, 500, 1000, 2500, 5000, 10000), # x-axis 
     lab=c("","0.5","1","2.5","5","10"), 
     lwd=2, # thicker lines
     pos=0)  # puts the x-axis at zero rather than floating 
axis(2, at=c(0, 20, 40, 60, 80,100), # y-axis
     lab=c("0", "20", "40", "60", "80", "100"), 
     lwd=2, 
     las=1) # turns the numbers so easier to read

Next, add titles (I like the Greek symbol in the x-axis):


Code:
title("Figure 1A", 
      xlab= expression(paste("Drug concentration (", mu, "M)")), 
      # not the word "micro"
      ylab= "Dead cells (% of cells)")

And a nice black fitted line:



Code:
# adds the fitted non-linear line 
lines(10^x,y, lty=1, lwd =2)

Finally, add the LD50 again:


using this code:
# add the LD50 in the legend which allows nice positioning. 
rp = vector('expression',1)
rp[1] = substitute(expression(LD50(microM) == MYVALUE), 
                   list(MYVALUE = format((10^m[3])/1000,dig=3)))[2]
legend('topleft', legend = rp, bty = 'n')


The full code for this script and the others mentioned in this blog are available on Github.

I used these resources to develop this script:

Wednesday, 29 April 2015

Comparing two drugs side by side

You might prefer to show your graphs independently rather than overlay them.
One way to do this is to use the par() function to divide your plotting area into two pieces.
par() is discussed here in the R Function of the Day archive and is mentioned here on the Quick R pages.
As with all functions, the syntax is important.
To plot two graphs, side by side this is the code: par(mfrow=c(1,2))
To plot four graphs, arranged in 2 rows and 2 columns, this is the code: par(mfrow=c(2,2))
It's good to know how to reset the layout: par(mfrow=c(1,1)) # back to one graph
So here's the output:



Here is the script (again with repetition for understanding):

###  I have detailed lots of things separately so that it's obvious what I'm doing. 
### The code could be made shorter and less repetitive.
###  This is the data   ###
# Data from multiple experiments

conc <- c(1.00E-08, 1.00E-07, 5.00E-07, 1.00E-06, 1.00E-05, 
          1.00E-08, 1.00E-07, 5.00E-07, 1.00E-06, 1.00E-05, 
          1.00E-08, 1.00E-07, 5.00E-07, 1.00E-06, 1.00E-05, 
          1.00E-07, 5.00E-07, 1.00E-06, 5.00E-06, 1.00E-05, 2.00E-05, 
          1.00E-07, 5.00E-07, 1.00E-06, 5.00E-06, 1.00E-05, 2.00E-05, 
          1.00E-07, 5.00E-07, 1.00E-06, 5.00E-06, 1.00E-05, 2.00E-05, 
          1.00E-07, 5.00E-07, 1.00E-06, 2.50E-06, 5.00E-06, 1.00E-05, 
          1.00E-07, 5.00E-07, 1.00E-06, 2.50E-06, 5.00E-06, 1.00E-05, 
          1.00E-07, 5.00E-07, 1.00E-06, 2.50E-06, 5.00E-06, 1.00E-05, 
          1.00E-07, 5.00E-07, 1.00E-06, 2.50E-06, 5.00E-06, 1.00E-05) 

# First drug is called Drug 49
dead.cells49 <- c(28.6, 30.1, 34.6, 47.7, 81.7, 
                  27.1, 27.2, 35, 47.9, 80.6, 
                  27.8, 27.9, 34.2, 47.9, 82.4, 
                  31.1, 37.6, 55.7, 89.1, 84.3, 85.2, 
                  30.8, 35.6, 55.9, 90.5, 85.2, 86.3, 
                  32.6, 34.7, 54.1, 89.8, 84.2, 87.1, 
                  32.2, 34.4, 46.1, 76.2, 84.3, 84.1, 
                  31.4, 37.4, 50.5, 75, 85.3, 84.8, 
                  27.6, 37.3, 46.3, 74.1, 81.6, 82.7, 
                  26.7, 28.4, 27.9, 59.6, 87, 86.1)

# Second Drug is called Drug 11
dead.cells11 <- c(28.4, 28.9, 28.6, 30.1, 90, 
                  27.8, 28.5, 29.2, 29.3, 89.9, 
                  28.6, 26.6, 27.3, 29, 89.7, 
                  25.8, 28.7, 31, 80.7, 85, 85.2, 
                  25.3, 28.2, 30, 81.4, 83.5, 87.2, 
                  25.9, 25.4, 28.4, 81.2, 81.4, 85.6, 
                  36.4, 36.2, 33.4, 52.3, 87.2, 97, 
                  33.9, 29.5, 32, 44.8, 84.9, 97.2, 
                  30.1, 30.1, 31, 46.5, 83.5, 96.1, 
                  22.8, 23.7, 25.8, 37.9, 73.8, 86.7) 

# Put the data into a data frame and transform the data to make it postive for fitting
data49 <- data.frame(conc,dead.cells49)
data49$nM <- data49$conc * 1000000000
data49$log.nM <- log10(data49$nM) 

data11 <- data.frame(conc,dead.cells11)
data11$nM <- data11$conc * 1000000000
data11$log.nM <- log10(data11$nM) 

##################  fit the data  ##################
# make sure logconc remains positive, otherwise multiply to keep positive values
# (such as in this example where the iconc is multiplied by 1000

fit49 <- nls(dead.cells49 ~ bot+(top-bot)/(1+(log.nM/LD50)^slope),
           data = data49,
           start=list(bot=20, top=95, LD50=4, slope=-12))

fit11 <- nls(dead.cells11 ~ bot+(top-bot)/(1+(log.nM/LD50)^slope),
             data = data11,
             start=list(bot=20, top=95, LD50=4, slope=-12))


##################  Plot the results  ##################
#this lets you graph your calculated equations nice and pretty
x <- seq(0,6, length=100)

y49 <- (coef(fit49)["bot"]+(coef(fit49)["top"]-coef(fit49)["bot"])/(1+(x/coef(fit49)["LD50"])^coef(fit49)["slope"]))
m49 <- coef(fit49)

y11 <- (coef(fit11)["bot"]+(coef(fit11)["top"]-coef(fit11)["bot"])/(1+(x/coef(fit11)["LD50"])^coef(fit11)["slope"]))
m11 <- coef(fit11)

# This is the key bit of code that divides the plot area in two
par(mfrow=c(1,2))

# plot the first graph - goes on the left side
plot(data49$nM, data49$dead.cells49, 
     log="x", 
     main="Drug 49", 
     xlab="Drug concentration (microM)", 
     ylab="Dead cells (% of cells)",
     xlim= c(500, 10000),
     ylim= c(20,100),
     xaxt = "n")   # suppresses the labels on the x-axis

# order here is important
# you have to complete the plot before you start the next one
# adds the axis they way we want it to look
axis(1, at=c(500, 1000, 2500, 5000, 10000),  lab=c("0.5","1","2.5","5","10"))

# adds the non-linear fitted line to the first graph
lines(10^x,y49, lty="dotted", col="blue")

# add the LD50s in the legend which allows nice positioning. 
rp = vector('expression',1)
rp[1] = substitute(expression(LD50(microM) == MYVALUE), 
                   list(MYVALUE = format((10^m49[3])/1000,dig=3)))[2]
legend("topleft", legend = rp, bty = 'n', cex = 0.75)


# plot the second graph - goes on the right side
plot(data11$nM, data11$dead.cells11, 
     log="x", 
     main="Drug 11", 
     xlab="Drug concentration (microM)", 
     ylab="Dead cells (% of cells)",
     xlim= c(500, 10000),
     ylim= c(20,100),
     xaxt = "n")   # suppresses the labels on the x-axis

# adds the axis they way we want it to look
axis(1, at=c(500, 1000, 2500, 5000, 10000),  lab=c("0.5","1","2.5","5","10"))

# adds the non-linear fitted line to the second graph
lines(10^x,y11, lty="dotted", col="blue")

# add the LD50s in the legend which allows nice positioning. 
rp = vector('expression',1)
rp[1] = substitute(expression(LD50(microM) == MYVALUE), 
                   list(MYVALUE = format((10^m11[3])/1000,dig=3)))[2]
legend("topleft", legend = rp, bty = 'n', cex = 0.75)



Comparing two drugs on the same plot

I really like the range of options for visualisation that are available in R. Today, I want to describe one of the options to compare two drugs.

The experimental set up is the same as we used previously, with the exception that we now have data from two drugs not one:
  • add various concentrations of a two novels drug to the cells in culture
  • leave for 48 hours
  • then measure how many cells are dead
  • the experiment was done multiple times with the doses improved as it progresses
The objective today is to draw a graph and calculate LD50s that allow us to compare the two drugs.  

Here is an illustration of a graph that contains the data from both drugs:


Here is the script I used to generate this graph:

###  I have detailed the code separately for each set of data
###  to make it more obvious what I'm doing. 
### The code could be made shorter and less repetitive.

###  This is the data   ###
# Data from multiple experiments

conc <- c(1.00E-08, 1.00E-07, 5.00E-07, 1.00E-06, 1.00E-05, 
          1.00E-08, 1.00E-07, 5.00E-07, 1.00E-06, 1.00E-05, 
          1.00E-08, 1.00E-07, 5.00E-07, 1.00E-06, 1.00E-05, 
          1.00E-07, 5.00E-07, 1.00E-06, 5.00E-06, 1.00E-05, 2.00E-05, 
          1.00E-07, 5.00E-07, 1.00E-06, 5.00E-06, 1.00E-05, 2.00E-05, 
          1.00E-07, 5.00E-07, 1.00E-06, 5.00E-06, 1.00E-05, 2.00E-05, 
          1.00E-07, 5.00E-07, 1.00E-06, 2.50E-06, 5.00E-06, 1.00E-05, 
          1.00E-07, 5.00E-07, 1.00E-06, 2.50E-06, 5.00E-06, 1.00E-05, 
          1.00E-07, 5.00E-07, 1.00E-06, 2.50E-06, 5.00E-06, 1.00E-05, 
          1.00E-07, 5.00E-07, 1.00E-06, 2.50E-06, 5.00E-06, 1.00E-05) 

# First drug is called Drug 49
dead.cells49 <- c(28.6, 30.1, 34.6, 47.7, 81.7, 
                  27.1, 27.2, 35, 47.9, 80.6, 
                  27.8, 27.9, 34.2, 47.9, 82.4, 
                  31.1, 37.6, 55.7, 89.1, 84.3, 85.2, 
                  30.8, 35.6, 55.9, 90.5, 85.2, 86.3, 
                  32.6, 34.7, 54.1, 89.8, 84.2, 87.1, 
                  32.2, 34.4, 46.1, 76.2, 84.3, 84.1, 
                  31.4, 37.4, 50.5, 75, 85.3, 84.8, 
                  27.6, 37.3, 46.3, 74.1, 81.6, 82.7, 
                  26.7, 28.4, 27.9, 59.6, 87, 86.1)

# Second Drug is called Drug 11
dead.cells11 <- c(28.4, 28.9, 28.6, 30.1, 90, 
                  27.8, 28.5, 29.2, 29.3, 89.9, 
                  28.6, 26.6, 27.3, 29, 89.7, 
                  25.8, 28.7, 31, 80.7, 85, 85.2, 
                  25.3, 28.2, 30, 81.4, 83.5, 87.2, 
                  25.9, 25.4, 28.4, 81.2, 81.4, 85.6, 
                  36.4, 36.2, 33.4, 52.3, 87.2, 97, 
                  33.9, 29.5, 32, 44.8, 84.9, 97.2, 
                  30.1, 30.1, 31, 46.5, 83.5, 96.1, 
                  22.8, 23.7, 25.8, 37.9, 73.8, 86.7) 

# Put the data into data frames and 
# transform the data to make it postive for fitting
data49 <- data.frame(conc,dead.cells49)
data49$nM <- data49$conc * 1000000000
data49$log.nM <- log10(data49$nM) 

data11 <- data.frame(conc,dead.cells11)
data11$nM <- data11$conc * 1000000000
data11$log.nM <- log10(data11$nM) 

##################  fit the data  ##################
# make sure logconc remains positive, otherwise multiply to keep positive values
# (such as in this example where the iconc is multiplied by 1000

fit49 <- nls(dead.cells49 ~ bot+(top-bot)/(1+(log.nM/LD50)^slope),
           data = data49,
           start=list(bot=20, top=95, LD50=4, slope=-12))

fit11 <- nls(dead.cells11 ~ bot+(top-bot)/(1+(log.nM/LD50)^slope),
             data = data11,
             start=list(bot=20, top=95, LD50=4, slope=-12))


##################  Plot the results  ##################
#this lets you graph your calculated equations nice and pretty
x <- seq(0,6, length=100)

y49 <- (coef(fit49)["bot"]+(coef(fit49)["top"]-coef(fit49)["bot"])/(1+(x/coef(fit49)["LD50"])^coef(fit49)["slope"]))
m49 <- coef(fit49)  #this extracts the fitted LD50 for drug 49

y11 <- (coef(fit11)["bot"]+(coef(fit11)["top"]-coef(fit11)["bot"])/(1+(x/coef(fit11)["LD50"])^coef(fit11)["slope"]))
m11 <- coef(fit11)



# plot the first set of points
plot(data49$nM, data49$dead.cells49, 
     log="x", 
     main="Comparing LD50 of two drugs ", 
     xlab="Drug concentration (microM)", 
     ylab="Dead cells (% of cells)",
     xlim= c(500, 10000),
     ylim= c(20,100),
     xaxt = "n")   # suppresses the labels on the x-axis

# add the points for the second set of data
points(data11$nM, data11$dead.cells11, col = "red")

# add the axis they way we want it to look
axis(1, at=c(500, 1000, 2500, 5000, 10000),  lab=c("0.5","1","2.5","5","10"))

# adds the two non-linear fitted lines 
lines(10^x,y49, lty="dotted", col="black")
lines(10^x,y11, lty="dotted", col="red")

# adds the legend
legend("topleft", 
       inset=.02,
       c("Drug 49","Drug 11"),
       lty=c("dotted","dotted"),
       lwd=c(2.5,2.5),
       col=c("black","red"),
       cex = 0.75)

# add the LD50s in the legend which allows nice positioning. 
rp = vector('expression',2)
rp[1] = substitute(expression(Drug49-LD50(microM) == MYVALUE), 
                   list(MYVALUE = format((10^m49[3])/1000,dig=3)))[2]
rp[2] = substitute(expression(Drug11-LD50(microM) == MYVALUE), 
                   list(MYVALUE = format((10^m11[3])/1000,dig=3)))[2]
legend('bottomright', legend = rp, bty = 'n', cex = 0.75)


Wednesday, 22 April 2015

Drawing a cell death curve and calculating an LD50

I'm going to show you the this analysis before the script.
We do a lot of drug testing in our laboratory at Cardiff and it's a valuable thing for biochemists and pharmacologists. To do the analysis, most of the lab usually use a different (expensive) statistical package but I like R.
The experimental set up, done by a student in the lab, is as follows:
  • using some cells and add various concentrations of a novel drug
  • leave for 48 hours
  • then measure how many cells are dead
  • the experiment was done four times with the doses improved each time
Here is the graph and calculated LD50 - a measure of how good the drug is:


Here is the script I used to generate this graph:

### START of SCRIPT 

###  this is the data   ###
# data from four experiments
conc <- c(5.00E-07, 1.00E-06, 1.00E-05, 
          5.00E-07, 1.00E-06, 5.00E-06, 1.00E-05, 2.00E-05, 
          5.00E-07, 1.00E-06, 2.50E-06, 5.00E-06, 1.00E-05, 
          5.00E-07, 1.00E-06, 2.50E-06, 5.00E-06, 1.00E-05)
dead.cells <- c(34.6, 47.7, 81.7, 
                37.6, 55.7, 89.1, 84.3, 85.2, 
                34.4, 46.1, 76.2, 84.3, 84.1, 
                24.5, 26.1, 60.6, 82.7, 87)

# transform the data to make it postive and put into a data frame for fitting 
data <- as.data.frame(conc)   # create the data frame
data$dead.cells <- dead.cells
data$nM <- data$conc * 1000000000
data$log.nM <- log10(data$nM) 

###  fit the data  ###
# make sure logconc remains positive, otherwise multiply to keep positive values
# (such as in this example where the conc is multiplied by 1000000

fit <- nls(dead.cells ~ bot+(top-bot)/(1+(log.nM/LD50)^slope),
             data = data,
             start=list(bot=20, top=95, LD50=3, slope=-12))

###  Plot the results  ###
#this lets you graph your calculated equations nice and pretty

x <- seq(0,6, length=100)
y <- (coef(fit)["bot"]+(coef(fit)["top"]-coef(fit)["bot"])/(1+(x/coef(fit)["LD50"])^coef(fit)["slope"]))
m <- coef(fit)

# plot the points first
plot(data$nM, data$dead.cells, 
     log="x", 
     main="Drug Dose Response and LD50", 
     xlab="Drug concentration (microM)", 
     ylab="Dead cells (% of cells)",
     xlim= c(500, 10000),
     ylim= c(20,100),
     xaxt = "n")   # suppresses the labels on the x-axis

# adds the x-axis labels they way I want it to look
axis(1, at=c(500, 1000, 2500, 5000, 10000),  lab=c("0.5","1","2.5","5","10"))

# adds the fitted non-linear line 
lines(10^x,y, lty="dotted", col="red", lwd =2)

# add the LD50 in the legend which allows nice positioning. 
rp = vector('expression',1)
rp[1] = substitute(expression(LD50(microM) == MYVALUE), 
                   list(MYVALUE = format((10^m[3])/1000,dig=3)))[2]
legend('topleft', legend = rp, bty = 'n')

# END OF SCRIPT

To prepare this script, I used the following sources:
Thanks to Mel for the data.