Showing posts with label geom_polygon. Show all posts
Showing posts with label geom_polygon. Show all posts

Monday, 18 May 2020

Volcano time line by Tidy Tuesday...

We've been exploring the Tidy Tuesday datasets at our CaRdiff UseR group. It is good to look at a different types of data and the inspiration on Twitter. I tried to create a time line of volcano eruptions but it was a complete mess. Then I found this data visualisation by @ijeamaka_a which inspired me.

––

She very helpfully shared her code through Github so I was able to have a play with her code and use it to make some plots for other parts of the world and time frames.

Here is an example of a plot I have made which has eruptions in Indonesia between 1250 and 2020 with a volcano explosion index greater than 2.


Using this script it is possible to explore other places that have lots of earthquakes like Japan. This is a nice map of the volcanos in Japan.


START
## Inspiration from https://twitter.com/ijeamaka_a/status/1260660823229214724/photo/1
## https://github.com/Ijeamakaanyene/data_visualizations/blob/master/scripts/2020_10_volcanos.Rmd

library(dplyr)
library(tidyr)
library(ggplot2)

# download the data
volcano = readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-05-12/volcano.csv')
eruptions = readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-05-12/eruptions.csv')

# exploring the data a bit
## check countries with most eruptions...
erupt_by_country = left_join(eruptions, select(volcano, volcano_number, country, subregion),
    by = c("volcano_number"="volcano_number")) %>%
    group_by(country) %>%
    count(sort = TRUE)  
head(erupt_by_country)

## what size eruptions are most common...
count_erupt_size = left_join(eruptions, select(volcano, volcano_number, country, subregion),
    by = c("volcano_number"="volcano_number")) %>%
    group_by(vei) %>%
    count(sort = TRUE)  
## vei not available for lots of eruptions - historical I guess...
count_erupt_size

# create the bound data_set
all_eruptions_merged = left_join(eruptions, select(volcano, volcano_number, 
    country, subregion),
    by = c("volcano_number"="volcano_number")) %>%
    mutate(combo_year = as.numeric(paste(start_year, start_month, sep = ".")))

# allow some customization
## add your country, start year and end year
## and size of eruption from 1 to 7 or zero to see all.
my_country <- "Indonesia"
my_start_year <- 1250
my_end_year <- 2020
size <- 2

# filter merged data set...
eruptions_merged = all_eruptions_merged %>%
    filter(country == my_country) %>%
    filter(is.na(vei) == FALSE) %>%
    filter(start_year > my_start_year) %>%
    filter(start_year < my_end_year) %>%
    filter(vei > size)

# Using for loop to create data needed to plot a geom_polygon
volcano_polygon_list = list()
years = unlist(eruptions_merged$combo_year)
volcano_ids = unlist(eruptions_merged$volcano_number)
veis = unlist(eruptions_merged$vei)

## There is most likely a better way of doing this.. but I cannot think of it!
## so for this to work on a wider than 20 year window we need to 
## change the value for making the pyramid
## 0.25 is approx 1% of the number of decades
pyr_size <- (my_end_year - my_start_year)/100
for(i in 1:length(years)){
    volcano_polygon_df = data.frame(
        x = c(years[i], years[i] + pyr_size, years[i] + pyr_size*2),
        y = c(0, veis[i], 0),
        t = rep(volcano_ids[i], 3)
    )
    volcano_polygon_list[[i]] = volcano_polygon_df
}

# Converting into df and adding subregion information
volcano_polygon_df = volcano_polygon_list %>%
    bind_rows() %>%
    left_join(., select(eruptions_merged, volcano_number, subregion),
        by = c("t" = "volcano_number"))

# create the plot
volcano_timeline = ggplot() +
    geom_polygon(data = volcano_polygon_df, aes(x = x, y = y, group = t, fill = subregion),
        alpha = 0.75, colour = "black") +
    geom_segment(aes(y = 0, yend = 0, x = my_start_year, xend = my_end_year), 
        size = 1,
        colour = "black",
        arrow = arrow()) +
    scale_x_continuous(limits = c(my_start_year, my_end_year),
        expand = c(0.005, 0.005)) +
    scale_y_continuous(expand = c(0, 0))
    

## add titles - based on the values put in...
volcano_timeline <-  volcano_timeline +
    labs(y = "Explosion Index", x = NULL, fill = NULL,
    title = paste0("Volcanic Activity Timeline (vei>", size, ") ",
        my_start_year, " to ", my_end_year, " within ", my_country),
        subtitle = "Each triangle's height represents the erruptions volcanic explosion index.",
        caption = paste0("Source: The Smithsonian Institution\n",
            "Visualization: Paul Brennan | @brennanpcardiff  adapted from Ijeamaka Anyene | @ijeamaka_a"))

## brown colour theme...
volcano_timeline + theme_bw() + 
    ## separate out the colour scale to allow a change to this 
    scale_fill_manual(values = rcartocolor::carto_pal(n = 7, name = "BrwnYl")) 

## Teal Green colour theme give the plot below...
volcano_timeline + theme_bw() +
    ## separate out the colour scale to allow a change to this 
    scale_fill_manual(values = rcartocolor::carto_pal(n = 7, name = "TealGrn")) 


END

Resources