The SARS-CoV-2 Spike protein and its variants are very important at the moment. I thought it would be interesting to showcase using my Bioconductor package drawProteins to programmatically draw a visualization of the Spike protein. This helped me understand a little more about the protein too.
The data for making the visualization is from Uniprot: https://www.uniprot.org/uniprot/P0DTC2
Here is the visualisation and below is the code to make it.
START
library(drawProteins)
library(ggplot2)
library(tidyverse)
# Uniprot link: https://www.uniprot.org/uniprot/P0DTC2
drawProteins::get_features("P0DTC2") -> spike_sars
drawProteins::feature_to_dataframe(spike_sars) -> spike_data
# From the Uniprot entry, it say that the Spike protein
# is made as a single protein and then processed into
# S1 and S2 protein.
# thus the Uniprot entry has multiple chains
# Processing Uniprot data to create different proteins
# pull out full length chain
spike_data %>%
filter(begin < 685 & end == 1273) -> spike_data_1
# want this at the top...
spike_data_1$order = 3
# pull out S1 chain... begins 13 ends: 685
spike_data %>%
filter(begin < 685 & end < 686) -> spike_data_2
# want this next...
spike_data_2$order = 2
# pull out S2 chain... begins 686; ends: 1273
spike_data %>%
filter(begin > 685 & end < 1274) -> spike_data_3
# want this at the bottom
spike_data_3$order = 1
# combine all back for plotting
spike_data_o <- rbind(spike_data_1, spike_data_2, spike_data_3)
# pull out names for chains
spike_data_o %>%
filter(type == "CHAIN") -> chains
chain_names <- c(chains$description[1:3], "")
# draw canvas and chains...
draw_canvas(spike_data_o) -> p
p <- draw_chains(p, spike_data_o,
labels = chain_names)
# add regions to S1 and S2
p <- draw_regions(p, spike_data_o)
p <- p + theme_bw(base_size = 14) + # white background
theme(panel.grid.minor=element_blank(),
panel.grid.major=element_blank()) +
theme(axis.ticks = element_blank(),
axis.text.y = element_blank()) +
theme(panel.border = element_blank()) +
theme(legend.position = "bottom")
p <- p + labs(title = "Schematic of SARS-CoV-2 Spike Protein",
subtitle = "Source: Uniprot (https://www.uniprot.org/uniprot/P0DTC2")
p
END
Some Resources
For more help, bug reports or to suggest features
- drawProteins on Github
- Bioconductor forum for questions
- If you use drawProteins in a publication, please cite my paper.
No comments:
Post a Comment
Comments and suggestions are welcome.