Building on my visualization of SARS-CoV-2 spike proteins, here is an R script that draws a schematic of the coronavirus S1 spike protein alongside the UK variant (B.1.1.7), which has changes within the S1 protein.
Here is the visualisation and below is the code to make it.
START
# viz the changes of the UK variant in S1 spike protein....
library(drawProteins)
library(ggplot2)
library(tidyverse)
# download protein data from
# Uniprot link: https://www.uniprot.org/uniprot/P0DTC2
drawProteins::get_features("P0DTC2") -> spike_sars
drawProteins::feature_to_dataframe(spike_sars) -> spike_data
# pull out S1 chain... begins 13 ends: 685
spike_data %>%
  filter(begin > 12 & end < 686) -> s1_bot
# duplicate this and put order = 2
s1_top <- s1_bot
s1_top$order <- 2
# combine these two
s1_both <- rbind(s1_top, s1_bot)
# draw canvas, chains & regions
draw_canvas(s1_both) -> p
p <- draw_chains(p, s1_both, labels = c("S1 protein", "B.1.1.7 variant"))
p <- draw_regions(p, s1_both)
# here are the details of the changes...
uk_variant <- tribble(
  ~type, ~description, ~begin, ~end, ~length, ~accession, ~entryName, ~taxid, ~order,
  "B.1.1.7", "deletion", 69, 70, 1, "P0DTC2", "SPIKE_SARS2", 2697049, 1,
  "B.1.1.7", "deletion", 144, 144, 1, "P0DTC2", "SPIKE_SARS2", 2697049, 1,
  "B.1.1.7", "substitution", 501, 501, 1, "P0DTC2", "SPIKE_SARS2", 2697049, 1,
  "B.1.1.7", "substitution", 570, 570, 1, "P0DTC2", "SPIKE_SARS2", 2697049, 1,
  "B.1.1.7", "substitution", 681, 681, 1, "P0DTC2", "SPIKE_SARS2", 2697049, 1,
  "B.1.1.7", "substitution", 716, 716, 1, "P0DTC2", "SPIKE_SARS2", 2697049, 1,
  "B.1.1.7", "substitution", 982, 982, 1, "P0DTC2", "SPIKE_SARS2", 2697049, 1,
  "B.1.1.7", "substitution", 1118, 1118, 1, "P0DTC2", "SPIKE_SARS2", 2697049, 1
)
# overlay information about the variants (only changes within S1, i.e. before residue 686)
p <- p + geom_point(data = filter(uk_variant, begin < 686),
                    aes(x = begin,
                        y = order + 0.2,
                        shape = description),
                    size = 5)
# style the plot a bit...
p <- p + theme_bw(base_size = 14) + # white background
  theme(panel.grid.minor = element_blank(),
        panel.grid.major = element_blank()) +
  theme(axis.ticks = element_blank(),
        axis.text.y = element_blank()) +
  theme(panel.border = element_blank()) +
  theme(legend.position = "bottom")
p <- p + labs(title = "Schematic of SARS-CoV-2 S1 Protein and UK variant",
              subtitle = "Source: Uniprot (https://www.uniprot.org/uniprot/P0DTC2)")
p
END
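The uk_variant table lists eight changes across the whole spike protein, but the plot only shows those that fall inside the S1 chain (residues 13 to 685). As a rough check of which ones are kept and which are dropped, here is a minimal base R sketch that needs no packages or download. The names changes, s1_start, s1_end, in_s1 and outside_s1 are my own for illustration and are not part of drawProteins; the positions are copied from the table in the script.

```r
# Positions of the B.1.1.7 spike changes, copied from the uk_variant table
changes <- data.frame(
  description = c("deletion", "deletion", rep("substitution", 6)),
  begin = c(69, 144, 501, 570, 681, 716, 982, 1118),
  end   = c(70, 144, 501, 570, 681, 716, 982, 1118)
)

# S1 chain boundaries used in the script (begins 13, ends 685)
# (hypothetical variable names, for illustration only)
s1_start <- 13
s1_end <- 685

# Flag the changes that lie entirely within the S1 chain
changes$in_s1 <- changes$begin >= s1_start & changes$end <= s1_end

# Changes inside S1: these correspond to the points drawn on the schematic
in_s1 <- changes[changes$in_s1, ]

# Changes beyond residue 685: these are not drawn
outside_s1 <- changes[!changes$in_s1, ]

print(in_s1)
print(outside_s1)
```

With these positions, the two deletions and the substitutions at 501, 570 and 681 sit within S1, while those at 716, 982 and 1118 lie further along the spike protein and so do not appear on the S1 schematic.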
I feel this could, and maybe will, be better but I'm stopping for now :-)
Some Resources
For more help, bug reports or to suggest features
- drawProteins on GitHub
- Bioconductor forum for questions
- If you use drawProteins in a publication, please cite my paper.
Comments and suggestions are welcome.