The relationship between suicide terrorism and fertility rates
Description
The dataset supports an analysis of the potential correlation between suicide terrorism and fertility rates across countries. Suicide terrorism data come from the Database on Suicide Attacks, released on October 2, 2020, by the Chicago Project on Security and Threats (CPOST), which records detailed information on individual suicide attacks. Fertility rate data, giving the average number of children born per woman, come from the work of Hannah Ritchie, Max Roser, and Pablo Rosado, published on Our World in Data in 2023. The dataset also includes population data, cited to 'Adaptation and Natural Selection: A Critique of Some Current Evolutionary Thought' by George Christopher Williams, published by Princeton University Press in 2018; the reproduction script reads these figures from the Our World in Data historical population estimates. Population is used to express the number of suicide attacks per capita, so that countries of different sizes can be compared.
Steps to reproduce
# Code in R
library(tidyverse)
library(readxl)

# Load one Our World in Data CSV and reduce it to a single value per country:
# the geometric mean of `value_name` over the years year_start..year_end.
preprocess_data <- function(file_path, year_start, year_end, entity_name,
                            year_name, value_name, new_value_name) {
  # Use sym to convert string to symbol and !! to force evaluation
  entity_sym <- sym(entity_name)
  year_sym <- sym(year_name)
  value_sym <- sym(value_name)
  new_value_sym <- sym(new_value_name)

  read.csv(file_path) %>%
    select(!!entity_sym, !!year_sym, !!value_sym) %>%
    filter(!!year_sym >= year_start, !!year_sym <= year_end) %>%
    group_by(!!entity_sym) %>%
    # exp(mean(log(x))) is the geometric mean of x
    summarise(!!new_value_sym := exp(mean(log(!!value_sym), na.rm = TRUE)),
              .groups = "drop") %>%
    rename(country = !!entity_sym)
}

# Preprocess each dataset
year_start <- 1974
year_end <- 2019

# CPOST writes "Bosnia & Herzegovina". The original script renamed row 32 by
# position; matching by name is safer. The spelling "Bosnia and Herzegovina"
# is assumed to be the one used in the Our World in Data files.
fix_country_names <- function(df) {
  df$country[df$country == "Bosnia and Herzegovina"] <- "Bosnia & Herzegovina"
  df
}

# source: https://ourworldindata.org/grapher/population?tab=table&facet=entity
df_population <- preprocess_data("../datasets/population.csv",
                                 year_start, year_end, "Entity", "Year",
                                 "Population..historical.estimates.",
                                 "population") %>%
  fix_country_names()

# source: https://ourworldindata.org/grapher/children-per-woman-un?tab=table&time=1950..latest
df_fertility <- preprocess_data("../datasets/children-per-woman-un.csv",
                                year_start, year_end, "Entity", "Year",
                                "Fertility.rate...Sex..all...Age..all...Variant..estimates",
                                "fertility") %>%
  fix_country_names()

# source:
# https://cpost.uchicago.edu/research/suicide_attacks/database_on_suicide_attacks/
# Count the number of recorded attacks per country
df_terrorism <- read_excel("../datasets/dsat_dist_2020_10.xlsx",
                           sheet = "dsat_attacks") %>%
  count(country = admin0_txt)

# Merge datasets (inner joins keep only countries present in all three sources)
df <- df_terrorism %>%
  inner_join(df_population, by = "country") %>%
  mutate(logTerrorism = log(n / population)) %>%
  select(country, logTerrorism) %>%
  inner_join(df_fertility, by = "country") %>%
  mutate(logFertility = log(fertility)) %>%
  select(country, logTerrorism, logFertility)

# Free memory
rm(df_terrorism, df_fertility, df_population)

# write.csv(df, "../datasets/terrorism-fertility.csv", row.names = FALSE)

# Plotting (uncomment png() and dev.off() to save the figure to a file)
# png("fertility.png")
df %>%
  ggplot(aes(x = logFertility, y = logTerrorism)) +
  geom_point(alpha = 0.5) +
  geom_smooth(method = "lm", formula = y ~ x, color = "black") +
  xlab("Children per woman") +
  ylab("Suicide terrorist attacks per capita") +
  labs(title = "Relationship between fertility and suicide terrorism",
       subtitle = "log-log plot",
       caption = "Sources: \"Our World in Data\" and \n \"Chicago Project on Security and Threats\"") +
  theme_classic()
# dev.off()

# Linear modeling and summary
model <- lm(logTerrorism ~ logFertility, data = df)
summary(model)

# Residual diagnostics: normality test and index plot
shapiro.test(residuals(model))
plot(residuals(model))
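The merge step relies on inner joins, which silently drop any country whose name is spelled differently in the CPOST and Our World in Data files (the script already recodes one such name). Before joining, it may help to list the names that would be lost. The following is a minimal sketch in base R, not part of the original script: the toy data frames and their values are made up for illustration, and `unmatched_countries` is a hypothetical helper.

```r
# Toy stand-ins for the real tables; names and numbers are illustrative only.
attacks <- data.frame(
  country = c("Bosnia & Herzegovina", "Iraq", "Sri Lanka"),
  n = c(1, 10, 5),
  stringsAsFactors = FALSE
)
population <- data.frame(
  country = c("Bosnia and Herzegovina", "Iraq", "Sri Lanka"),
  population = c(3.5e6, 2.5e7, 1.9e7),
  stringsAsFactors = FALSE
)

# Hypothetical helper: country names present in `left` but absent from `right`.
# An inner join of `left` with `right` would drop exactly these rows.
unmatched_countries <- function(left, right) {
  setdiff(unique(left$country), unique(right$country))
}

missing_in_population <- unmatched_countries(attacks, population)
print(missing_in_population)
```

Running the same check on the real `df_terrorism` against `df_population` and `df_fertility` would show which countries need recoding before the merge.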