This R Markdown-notebook contains the final annotated analysis code for the manuscript: “Overall bias and sample sizes were unchanged in ICU trials over time: a meta-epidemiological study” by Carl Thomas Anthon, Anders Granholm, Jon Henrik Laake, Anders Perner and Morten Hylander Møller.

The annotated analysis code was written by Anders Granholm according to the pre-published protocol and statistical analysis plain (doi: 10.1111/aas.13258). For further details, please see the full article and its supplementary files, which also includes the CSV-file containing the final dataset and a data dictionary.

For correspondence related to the manuscript, contact the corresponding author Carl Thomas Anthon. Questions directly related to the analysis code can also be sent to Anders Granholm.

Figures displayed in this document are scaled differently than in the final paper and supplement.

1. Setup

Load the packages/functions required for the analyses.

library(tidyverse) # Load the Tidyverse packages, used for data processing and visualisation.
plot_grid <- cowplot::plot_grid # Load this function from the cowplot package; only this function will be used from this package, and the entire library will not be loaded as this is unnecessary and cowplot overrides the default ggplot2 theme, which then has to be reset.

Import the file containing data for all the included trials.

Note: This code block uses RStudio’s API to allow manual selection of the final dataset. This requires the script to be run in a recent version of RStudio; otherwise this has to be manually replaced with the appropriate file path.

included_trials <- read_csv(rstudioapi::selectFile(filter = "CSV files (*.csv)"), col_types = cols(trial_year = col_integer(), sample_size = col_integer()))

Define the output directory where all results will be saved.

Note: This block uses RStudio’s API to allow manual selection of the directory. This requires the script to be run in a recent version of RStudio; otherwise this has to be manually replaced with the appropriate path.

output_path <- rstudioapi::selectDirectory()

Define plot theme and image settings for saving images, to allow easy changing later.

# File settings
img_dpi <- 600 # DPI - not used for vectorized formats, e.g. PDFs, but argument is kept if this is changed.
img_format <- "pdf" # File format
# Widths and heights (in millimeters) for single run chart plots
img_width <- 150
img_height <- 80
# Widths and heights (in millimeters) for the triple run chart plots
img_width_triple <- 150
img_height_triple <- 150
# Width and height (in millimeters) for the risk of bias overview plot
img_width_bias <- 200
img_height_bias <- 80

# Set ggplot2 theme
theme_set(theme_linedraw())

2. Data inspection and preparation

2.1 Helper function

The following helper function filters a dataset so only years with a certain number of trials are included, and optionally prints the excluded years. Arguments:

  • .dta: the dataset to filter.
  • minimum: the minimum number of required trials per year, default is 5.
  • verbose: if TRUE (default), the years with too few trials are printed.
filter_years <- function(.dta, minimum = 5, verbose = TRUE){
  years <- .dta %>% 
    group_by(trial_year) %>% 
    summarise(n = n()) %>% 
    filter(n < minimum) %>% 
    pull(trial_year)
  
  if (verbose){
    print(str_glue("The following years contain less than {minimum} trials and have beeen filtered from this dataset: {str_c(years, collapse = ', ')}."))
  }
  
  .dta %>% 
    filter(!(trial_year %in% years))
}

2.2 Risk of bias

Create overall_risk_of_bias variable and convert all risk of bias domains to factors.

included_trials <- included_trials %>% 
  mutate(overall_risk_of_bias = case_when(is.na(random_sequence_generation) | is.na(allocation_concealment) | is.na(blinding_of_participants_and_personnel) | is.na(blinding_of_outcome_assessment) | is.na(incomplete_outcome_data) | is.na(selective_reporting) | is.na(other_sources_of_bias) ~ NA_character_, # If any risk of bias domain is missing, the overall risk of bias will be missing
                                          random_sequence_generation == "Low" & allocation_concealment == "Low" & blinding_of_participants_and_personnel == "Low" & blinding_of_outcome_assessment == "Low" & incomplete_outcome_data == "Low" & selective_reporting == "Low" & other_sources_of_bias == "Low" ~ "Low", # If all domains are low risk, the overall risk of bias will be low as well
                                          TRUE ~ "High" # If none of the above criteria apply, the overall risk of bias will be high
         )) %>% 
  mutate_at(vars(random_sequence_generation:overall_risk_of_bias), ~factor(., levels = c("Low", "Unclear", "High")))

Assess how many trials have missing data for the overall risk of bias domain (which will be missing if any of the other domains are missing) and create dataset that only includes trials with available risk of bias ratings.

included_trials %>% 
  filter(is.na(overall_risk_of_bias)) %>% 
  str_glue_data("In total, {nrow(.)} trials have missing data for one or more risk of bias domains and will be excluded from these analyses.")
In total, 2 trials have missing data for one or more risk of bias domains and will be excluded from these analyses.
trials_bias_available <- included_trials %>% 
  filter(!is.na(overall_risk_of_bias))

Create dataset that only contains trials from years with at least 5 trials with available risk of bias ratings (for the primary analyses).

trials_bias_available_5plus <- filter_years(trials_bias_available)
The following years contain less than 5 trials and have beeen filtered from this dataset: 1977, 1978, 1981, 1982, 1983, 1984, 1986, 2017, 2018.

For all trials with available risk of bias, calculate overall distributions for each domain.

trials_bias_available %>% 
  gather(key = "bias_domain", value = "rating", random_sequence_generation:overall_risk_of_bias, factor_key = TRUE) %>% 
  select(bias_domain, rating) %>% 
  group_by(bias_domain) %>% 
  summarise(low = sum(rating == "Low") / n() * 100,
            unclear = sum(rating == "Unclear") / n() * 100,
            high = sum(rating == "High") / n() * 100) %>% 
  mutate_if(is.numeric, ~paste0(round(., 1), "%"))

Create plot visualising distributions of risk of bias for each domain (only include trials where risk of bias ratings are available).

trials_bias_available %>% 
  gather(key = "bias_domain", value = "rating", random_sequence_generation:overall_risk_of_bias, factor_key = TRUE) %>% 
  select(bias_domain, rating) %>%
  mutate(rating = factor(rating, levels = c("High", "Unclear", "Low")),
         bias_domain = fct_rev(bias_domain)) %>%  # Reverse the order to ensure proper printing in the table
  ggplot(aes(x = bias_domain, fill = rating)) + # Create the plot
  geom_bar(position = "fill", width = 0.95) +
  coord_flip() +
  labs(x = NULL, y = NULL) +
  scale_fill_manual(values = c("#C00000", "#E0E000", "#00C000"), name = "Risk of bias rating:", guide = guide_legend(reverse = TRUE)) +
  scale_y_continuous(expand = c(0, 0), breaks = 0.1 * 0:10, minor_breaks = 0.05 * 0:20, labels = paste0(10 * 0:10, "%")) +
  scale_x_discrete(expand = c(0, 0), labels = rev(c( # Use rev() to ensure that order matches the domains above
    "Random sequence generation",
    "Allocation concealment",
    "Blinding of participants and personnel",
    "Blinding of outcome assessment",
    "Incomplete outcome data",
    "Selective reporting",
    "Other sources of bias",
    "Overall risk of bias"
  ))) +
  theme(plot.margin = margin (t = 5, l = 10, r = 15, b = 5, unit = "pt"), legend.position = "bottom") # Fix plot margins to ensure all labels are printed and change legend

ggsave(paste0(output_path, "/Risk of bias overview.", img_format), width = img_width_bias, height = img_height_bias, dpi = img_dpi, units = "mm") # Save file

2.3 Sample sizes

Assess how many trials have missing data for sample size and create dataset that only includes trials with available sample sizes.

included_trials %>% 
  filter(is.na(sample_size)) %>% 
  str_glue_data("In total, {nrow(.)} trials have missing data for sample sizes and will be excluded from these analyses.")
In total, 2 trials have missing data for sample sizes and will be excluded from these analyses.
trials_size_available <- included_trials %>% 
  filter(!is.na(sample_size))

Create dataset that only contains trials from years with at least 5 trials with available sample sizes (for the primary analyses).

trials_size_available_5plus <- filter_years(trials_size_available)
The following years contain less than 5 trials and have beeen filtered from this dataset: 1977, 1978, 1981, 1982, 1983, 1984, 1986, 2017, 2018.

For all trials with available sample sizes, calculate the overall distribution.

round(summary(trials_size_available$sample_size), 1)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
    3.0    42.0    82.5   224.0   181.2  7000.0 

3. Define run chart functions

Define the function that will produce the run charts and associated statistics.

This contains two helper functions, run_chart_stat and run_chart_caption which should only be called from the main function, run_chart, and not directly.

The run_chart function has the following arguments:

The function returns an annotated plot.

An additional, simplified helper function is present, triple_bias_run_chart, with the following arguments:

# Function that calculates the run chart statistics using the "val" column in the aggregated dataset from the main function that has already been sorted by year.
# This function should not be called directly
run_chart_stat <- function(val, fixed_median = NULL){
  n_total <- length(val)
  if (n_total < 1){
    stop("ERROR: run_chart_stat: number of observations (years) < 1.")
  }
  if (!(is.null(fixed_median) | is.numeric(fixed_median))) {
    stop("ERROR: run_chart_stat: fixed_median must be either NULL or a numeric value.")
  }
  med <- ifelse(is.null(fixed_median), median(val), fixed_median)
  n_useful <- sum(!near(val, med))
  
  run_longest <- 0
  run_current <- 0
  pos <- NA
  cross_n <- NA
  
  for (i in 1:n_total) {
    cur <- val[i]
    if (!near(cur, med)) {
      
      if (is.na(pos)){ # Will be true for the first value that is not equal to the median (and only once)
        pos <- !(cur > med) # Sets current position as the opposite of what it is, to start new run
        cross_n <- -1 # The first run will not count as a crossing
      }
      
      if ((cur > med ) == pos) { # If the current value is on the same side as the last one
        run_current <- run_current + 1
      } else { # If the current value is on the opposite side as the last one
        if (run_current > run_longest) {
          run_longest <- run_current
        }
        
        run_current <- 1
        cross_n <- cross_n +1
        pos <- cur > med
      }
      
    }
  }
  if (run_current > run_longest) {
    run_longest <- run_current
  }
  
  # Return results (upper and lower prediction limits calculated according to Anhøj's rules)
  list(n_total = n_total, n_useful = n_useful, median = med,
       run_UPL = round(log2(n_useful) + 3), run_longest = run_longest,
       cross_LPL = qbinom(0.05, n_useful - 1, 0.5), cross_n = cross_n, fixed_median = !is.null(fixed_median))
}

# Function that formats the caption used in each run chart
# This function should not be called directly
run_chart_caption <- function(res, percentage = NULL, dec = 1){
  res$base <- paste0(res$n_total, " observations (", res$n_useful, " useful). ", ifelse(res$fixed_median, "Fixed median: ", "Median: "), format(res$median, digits = dec, nsmall = dec), percentage, ". Longest run:")
  res$long <- paste0("[max. ", res$run_UPL, "].")
  res$cross <- paste0("[min. ", res$cross_LPL, "].")
  
  if (res$run_longest > res$run_UPL & res$cross_n < res$cross_LPL){
    # Both results are significant (non-random variation)
    res <- map(res, as.character)
    cap <- substitute(base~bold(run_longest)~long~"Crossings:"~bold(cross_n)~cross, res)
  } else if (res$run_longest > res$run_UPL) {
    # Only the longest run is significant (non-random variation)
    res <- map(res, as.character)
    cap <- substitute(base~bold(run_longest)~long~"Crossings:"~cross_n~cross, res)
  } else if (res$cross_n < res$cross_LPL) {
    # Only the number of crossings is significant (non-random variation)
    res <- map(res, as.character)
    cap <- substitute(base~run_longest~long~"Crossings:"~bold(cross_n)~cross, res)
  } else {
    # No significance (only random variation)
    res <- map(res, as.character)
    cap <- substitute(base~run_longest~long~"Crossings:"~cross_n~cross, res)
  }
  # Return caption
  cap
}

# The main function - see description above
run_chart <- function(.dta, variable, category = NULL, title = NULL, delete_titles = TRUE, fixed_median = NULL, num_agg_mean = FALSE){
  # Quote variable name and get the type of it
  var <- enquo(variable)
  var_pull <- pull(.dta, !!var)
  type <- case_when(is.factor(var_pull) ~ "factor",
                    is.integer(var_pull) ~ "integer",
                    TRUE ~ "other")
  
  # Input checks
  if (type == "factor") {
    if (is.null(category)) {
      stop("ERROR: run_chart: 'variable' is a factor (risk of bias domain?) and 'category' is NULL. 'category' must be either 'Low', 'Unclear' or 'High'.", call. = FALSE)
    } else if (!(category %in% c("Low", "Unclear", "High"))){
      stop(paste0("ERROR: run_chart: 'variable' is a factor (risk of bias domain?), 'category' is '", category, "', but must be either 'Low', 'Unclear' or 'High'."), call. = FALSE)
    }
  } else if (type == "integer") {
    if (!is.null(category)){
      warning(paste0("WARNING: run_chart: 'variable' is an integer (sample size?). 'category' is '", category, "' and will NOT be used."), call. = FALSE)
    }
  } else {
    stop(paste0("ERROR: run_chart: 'variable' must be either a factor (risk of bias category) or an integer (sample size), '", quo_name(var), "' is neither."))
  }
  
  # Remove missing data
  n_miss <- .dta %>% 
    filter(is.na(trial_year) | is.na(!!var)) %>% 
    nrow()
  if (n_miss > 0) {
    warning(paste0("WARNING: run_chart: ", n_miss, " rows with missing data in either trial_year or ", quo_name(var), " removed from the dataset prior to analyses."))
    .dta <- .dta %>% 
      filter(!is.na(trial_year) & !is.na(!!var))
  }
  
  # If delete_titles == TRUE, then set title to null
  if (delete_titles){
    title <- NULL
  }
  
  # Aggregate data
  if (type == "factor") {
    agg <- .dta %>% 
      group_by(trial_year) %>% 
      summarise(val = sum(!!var == category) / n() * 100) %>% 
      arrange(trial_year)
  } else if (type == "integer") {
    if (!num_agg_mean){
      agg <- .dta %>% 
        group_by(trial_year) %>% 
        summarise(val = median(!!var),
                  y_high = quantile(!!var, 0.75),
                  y_low = quantile(!!var, 0.25)) %>%
        arrange(trial_year)      
    } else if (num_agg_mean) {
      agg <- .dta %>% 
        group_by(trial_year) %>% 
        summarise(val = mean(!!var),
                  se_val = sd(!!var) / sqrt(n()),
                  y_high = val + 1.96 * se_val,
                  y_low_tmp = val - 1.96 * se_val,
                  y_low = ifelse(y_low_tmp < 0, 0, y_low_tmp)) %>%  # 95% confidence intervals (CIs) below 0 are impossible
        arrange(trial_year)
    }

  }
  
  # Calculate stats
  stats <- agg %>%
    pull(val) %>% 
    run_chart_stat(fixed_median = fixed_median)
  
  # General plot aspects
  gg <- ggplot(data = agg, aes(x = trial_year, y = val)) +
    labs(caption = run_chart_caption(stats, if_else(type == "factor", "%", "")), x = NULL, y = NULL, subtitle = title) + # The title argument provided is used for a ggplot2 subtitle, due to the default smaller font size
    theme(plot.caption = element_text(hjust = 0.5)) + # Align caption in the middle
    scale_x_continuous(limits = c(1975, 2020), breaks = 1975 + 5 * 0:8, minor_breaks = 1975:2020, expand = c(0, 0))
  
  # Add different aspects for risk of bias domains and sample sizes
  if (type == "factor") {
    # Risk of bias domains - different y-axis for overall risk of bias plots and subdomains
    if (quo_name(var) == "overall_risk_of_bias") {
      gg <- gg +
        scale_y_continuous(limits = c(0, 50), breaks = 10 * 0:5, minor_breaks = 5 * 0:10, labels = paste0(10 * 0:5, "%"), expand = c(0.015, 0))
    } else {
      gg <- gg +
        scale_y_continuous(limits = c(0, 100), breaks = 20 * 0:5, minor_breaks = 5 * 0:20, labels = paste0(20 * 0:5, "%"), expand = c(0.015, 0))
    }
  } else if (type == "integer") {
    # Sample sizes - different y-axis according to aggregation function
    gg <- gg +
      scale_y_continuous(breaks = 100 * 0:15, minor_breaks = 25 * 0:60, expand = c(0, 0)) + # Axis long enough for both types of plots - different limits for median/mean values are set in next line
      coord_cartesian(ylim = c(0, ifelse(!num_agg_mean, 600, 1400))) + # Limits are set here to avoid outliers for 95% CIs of yearly means to be removed
      geom_ribbon(aes(ymin = y_low, ymax = y_high), fill = "#B5B5B5", alpha = 0.80) # Add grey shaded area
  }
  
  # Add points and lines
  gg <- gg + geom_point(size = 2) +
    geom_line() +
    geom_hline(yintercept = stats$median, size = 1.25)
  
  # Return plot
  gg
}

# Simple helper function that creates a combined plot with all 3 risk of bias ratings
triple_bias_run_chart <- function(.dta, variable, title_add = NULL){
  variable <- enquo(variable)
  title_base <- str_replace_all(str_to_title(quo_name(variable)), "_", " ")
  plot_grid(
    run_chart(.dta, !!variable, category = "Low", title = paste0(title_base, title_add)) + ylab("Low"),
    run_chart(.dta, !!variable, category = "Unclear") + ylab("Unclear"),
    run_chart(.dta, !!variable, category = "High") + ylab("High"),
    nrow = 3, ncol = 1)
}

4. Primary analysis

For all primary analyses, only years with 5 or more trials are used.

Analysis 1.1: Percentage of overall low risk of bias trials per year

trials_bias_available_5plus %>% 
  run_chart(overall_risk_of_bias, "Low", title = "Overall low risk of bias")

ggsave(paste0(output_path, "/1.1 Primary analysis - overall low risk of bias.", img_format), width = img_width, height = img_height, dpi = img_dpi, units = "mm")

Analysis 1.2: Annual median sample sizes

trials_size_available_5plus %>% 
  run_chart(sample_size, title = "Median (IQR) sample sizes")

ggsave(paste0(output_path, "/1.2 Primary analysis - sample sizes.", img_format), width = img_width, height = img_height, dpi = img_dpi, units = "mm")

Analysis 1.2ph1: Post-hoc analysis of the annual mean sample sizes

trials_size_available_5plus %>% 
  run_chart(sample_size, title = "Mean (95% CI) sample sizes (post-hoc analysis)", num_agg_mean = TRUE)

ggsave(paste0(output_path, "/1.2ph1 Primary analysis - sample sizes, post-hoc analysis using mean values.", img_format), width = img_width, height = img_height, dpi = img_dpi, units = "mm")

Analysis 1.3: Yearly proportion of low, unclear and high risk of bias ratings in each of the seven domains

# Random sequence generation
trials_bias_available_5plus %>%
  triple_bias_run_chart(random_sequence_generation)
ggsave(paste0(output_path, "/1.3A Primary analysis - Random sequence generation.", img_format), width = img_width_triple, height = img_height_triple, dpi = img_dpi, units = "mm")


# Allocation concealment
trials_bias_available_5plus %>%
  triple_bias_run_chart(allocation_concealment)
ggsave(paste0(output_path, "/1.3B Primary analysis - Allocation concealment.", img_format), width = img_width_triple, height = img_height_triple, dpi = img_dpi, units = "mm")


# Blinding of participants and personnel
trials_bias_available_5plus %>%
  triple_bias_run_chart(blinding_of_participants_and_personnel)
ggsave(paste0(output_path, "/1.3C Primary analysis - Blinding of participants and personnel.", img_format), width = img_width_triple, height = img_height_triple, dpi = img_dpi, units = "mm")


# Blinding of outcome assessment
trials_bias_available_5plus %>%
  triple_bias_run_chart(blinding_of_outcome_assessment)
ggsave(paste0(output_path, "/1.3D Primary analysis - Blinding of outcome assessment.", img_format), width = img_width_triple, height = img_height_triple, dpi = img_dpi, units = "mm")


# Incomplete outcome data
trials_bias_available_5plus %>%
  triple_bias_run_chart(incomplete_outcome_data)
ggsave(paste0(output_path, "/1.3E Primary analysis - Incomplete outcome data.", img_format), width = img_width_triple, height = img_height_triple, dpi = img_dpi, units = "mm")


# Selective reporting
trials_bias_available_5plus %>%
  triple_bias_run_chart(selective_reporting)
ggsave(paste0(output_path, "/1.3F Primary analysis - Selective reporting.", img_format), width = img_width_triple, height = img_height_triple, dpi = img_dpi, units = "mm")


# Other sources of bias
trials_bias_available_5plus %>%
  triple_bias_run_chart(other_sources_of_bias)
ggsave(paste0(output_path, "/1.3G Primary analysis - Other sources of bias.", img_format), width = img_width_triple, height = img_height_triple, dpi = img_dpi, units = "mm")

5. Sensitivity analysis #1 - all trials from years with 5 or more trials, but separated according to type of intervention

As for the primary analyses, only years with 5 or more trials are used here.

Assess the distribution of intervention types.

included_trials %>% 
  group_by(review_intervention_category) %>% 
  summarise(n = n())

Rather few medical device and management trials are included. Thus, these analyses will contain few data points and should be interpreted with caution.

5.1 Sensitivity analysis #1A - drug trials

Create dataset for risk of bias analyses.

trials_bias_available_5plus_drug <- trials_bias_available %>% 
  filter(review_intervention_category == "Drug") %>% 
  filter_years()
The following years contain less than 5 trials and have beeen filtered from this dataset: 1977, 1978, 1981, 1982, 1983, 1984, 1986, 1988, 1989, 1996, 2017, 2018.
str_glue("In total {nrow(trials_bias_available_5plus_drug)} drug trials will be included in these sensitivity analyses of risk of bias.")
In total 302 drug trials will be included in these sensitivity analyses of risk of bias.

Create dataset for sample size analysis.

trials_size_available_5plus_drug <- trials_size_available %>% 
  filter(review_intervention_category == "Drug") %>% 
  filter_years()
The following years contain less than 5 trials and have beeen filtered from this dataset: 1977, 1978, 1981, 1982, 1983, 1984, 1986, 1988, 1989, 1996, 2017, 2018.
str_glue("In total {nrow(trials_size_available_5plus_drug)} drug trials will be included in this sensitivity analysis of sample sizes.")
In total 304 drug trials will be included in this sensitivity analysis of sample sizes.

Analysis 2A.1: Percentage of overall low risk of bias trials per year - drug trials

trials_bias_available_5plus_drug %>% 
  run_chart(overall_risk_of_bias, "Low", title = "Overall low risk of bias - drug RCTs")

ggsave(paste0(output_path, "/2A.1 Sensitivity analysis - overall low risk of bias - drug trials.", img_format), width = img_width, height = img_height, dpi = img_dpi, units = "mm")

Analysis 2A.2: Annual median sample sizes - drug trials

trials_size_available_5plus_drug %>% 
  run_chart(sample_size, title = "Median (IQR) sample sizes - drug RCTs")

ggsave(paste0(output_path, "/2A.2 Sensitivity analysis - sample sizes - drug trials.", img_format), width = img_width, height = img_height, dpi = img_dpi, units = "mm")

Analysis 2A.2ph1: Post-hoc: annual mean sample sizes - drug trials

trials_size_available_5plus_drug %>% 
  run_chart(sample_size, title = "Mean (95% CI) sample sizes - drug RCTs (post-hoc analysis)", num_agg_mean = TRUE)

ggsave(paste0(output_path, "/2A.2ph1 Sensitivity analysis - sample sizes - drug trials, post-hoc using means.", img_format), width = img_width, height = img_height, dpi = img_dpi, units = "mm")

Analysis 2A.3: Yearly proportion of low, unclear and high risk of bias ratings in each of the seven domains - drug trials

# Random sequence generation
trials_bias_available_5plus_drug %>%
  triple_bias_run_chart(random_sequence_generation, title_add = " - drug RCTs")
ggsave(paste0(output_path, "/2A.3A Sensitivity analysis - Random sequence generation - drug trials.", img_format), width = img_width_triple, height = img_height_triple, dpi = img_dpi, units = "mm")


# Allocation concealment
trials_bias_available_5plus_drug %>%
  triple_bias_run_chart(allocation_concealment, title_add = " - drug RCTs")
ggsave(paste0(output_path, "/2A.3B Sensitivity analysis - Allocation concealment - drug trials.", img_format), width = img_width_triple, height = img_height_triple, dpi = img_dpi, units = "mm")


# Blinding of participants and personnel
trials_bias_available_5plus_drug %>%
  triple_bias_run_chart(blinding_of_participants_and_personnel, title_add = " - drug RCTs")
ggsave(paste0(output_path, "/2A.3C Sensitivity analysis - Blinding of participants and personnel - drug trials.", img_format), width = img_width_triple, height = img_height_triple, dpi = img_dpi, units = "mm")


# Blinding of outcome assessment
trials_bias_available_5plus_drug %>%
  triple_bias_run_chart(blinding_of_outcome_assessment, title_add = " - drug RCTs")
ggsave(paste0(output_path, "/2A.3D Sensitivity analysis - Blinding of outcome assessment - drug trials.", img_format), width = img_width_triple, height = img_height_triple, dpi = img_dpi, units = "mm")


# Incomplete outcome data
trials_bias_available_5plus_drug %>%
  triple_bias_run_chart(incomplete_outcome_data, title_add = " - drug RCTs")
ggsave(paste0(output_path, "/2A.3E Sensitivity analysis - Incomplete outcome data - drug trials.", img_format), width = img_width_triple, height = img_height_triple, dpi = img_dpi, units = "mm")


# Selective reporting
trials_bias_available_5plus_drug %>%
  triple_bias_run_chart(selective_reporting, title_add = " - drug RCTs")
ggsave(paste0(output_path, "/2A.3F Sensitivity analysis - Selective reporting - drug trials.", img_format), width = img_width_triple, height = img_height_triple, dpi = img_dpi, units = "mm")


# Other sources of bias
trials_bias_available_5plus_drug %>%
  triple_bias_run_chart(other_sources_of_bias, title_add = " - drug RCTs")
ggsave(paste0(output_path, "/2A.3G Sensitivity analysis - Other sources of bias - drug trials.", img_format), width = img_width_triple, height = img_height_triple, dpi = img_dpi, units = "mm")

NOTE: The analysis of high risk of bias for allocation allocation concealment indicates non-variation, however the median yearly proportion is 0% and thus the median cannot be crossed and the runs cannot be broken in the run chart; this analysis must be interpreted with caution.

5.2 Sensitivity analysis #1B - management trials

Create dataset for risk of bias analyses.

trials_bias_available_5plus_management <- trials_bias_available %>% 
  filter(review_intervention_category == "Management") %>% 
  filter_years()
The following years contain less than 5 trials and have beeen filtered from this dataset: 1981, 1982, 1987, 1988, 1989, 1990, 1992, 1995, 1996, 1997, 1998, 1999, 2000, 2003, 2005, 2015, 2016.
str_glue("In total {nrow(trials_bias_available_5plus_management)} management trials will be included in these sensitivity analyses of risk of bias.")
In total 111 management trials will be included in these sensitivity analyses of risk of bias.

Create dataset for sample size analysis.

trials_size_available_5plus_management <- trials_size_available %>% 
  filter(review_intervention_category == "Management") %>% 
  filter_years()
The following years contain less than 5 trials and have beeen filtered from this dataset: 1981, 1982, 1987, 1988, 1989, 1990, 1992, 1995, 1996, 1997, 1998, 1999, 2000, 2003, 2005, 2015, 2016.
str_glue("In total {nrow(trials_size_available_5plus_management)} management trials will be included in this sensitivity analysis of sample sizes.")
In total 110 management trials will be included in this sensitivity analysis of sample sizes.

Analysis 2B.1: Percentage of overall low risk of bias trials per year - management trials

trials_bias_available_5plus_management %>% 
  run_chart(overall_risk_of_bias, "Low", title = "Overall low risk of bias - management RCTs")

ggsave(paste0(output_path, "/2B.1 Sensitivity analysis - overall low risk of bias - management trials.", img_format), width = img_width, height = img_height, dpi = img_dpi, units = "mm")

Analysis 2B.2: Annual median sample sizes - management trials

trials_size_available_5plus_management %>% 
  run_chart(sample_size, title = "Median (IQR) sample sizes - management RCTs")

ggsave(paste0(output_path, "/2B.2 Sensitivity analysis - sample sizes - management trials.", img_format), width = img_width, height = img_height, dpi = img_dpi, units = "mm")

Analysis 2B.2ph1: Post-hoc: annual mean sample sizes - management trials

trials_size_available_5plus_management %>% 
  run_chart(sample_size, title = "Mean (95% CI) sample sizes - management RCTs (post-hoc analysis)", num_agg_mean = TRUE)

ggsave(paste0(output_path, "/2B.2ph1 Sensitivity analysis - sample sizes - management trials, post-hoc using means.", img_format), width = img_width, height = img_height, dpi = img_dpi, units = "mm")

Analysis 2B.3: Yearly proportion of low, unclear and high risk of bias ratings in each of the seven domains - management trials

# Random sequence generation
trials_bias_available_5plus_management %>%
  triple_bias_run_chart(random_sequence_generation, title_add = " - management RCTs")
ggsave(paste0(output_path, "/2B.3A Sensitivity analysis - Random sequence generation - management trials.", img_format), width = img_width_triple, height = img_height_triple, dpi = img_dpi, units = "mm")


# Allocation concealment
trials_bias_available_5plus_management %>%
  triple_bias_run_chart(allocation_concealment, title_add = " - management RCTs")
ggsave(paste0(output_path, "/2B.3B Sensitivity analysis - Allocation concealment - management trials.", img_format), width = img_width_triple, height = img_height_triple, dpi = img_dpi, units = "mm")


# Blinding of participants and personnel
trials_bias_available_5plus_management %>%
  triple_bias_run_chart(blinding_of_participants_and_personnel, title_add = " - management RCTs")
ggsave(paste0(output_path, "/2B.3C Sensitivity analysis - Blinding of participants and personnel - management trials.", img_format), width = img_width_triple, height = img_height_triple, dpi = img_dpi, units = "mm")


# Blinding of outcome assessment
trials_bias_available_5plus_management %>%
  triple_bias_run_chart(blinding_of_outcome_assessment, title_add = " - management RCTs")
ggsave(paste0(output_path, "/2B.3D Sensitivity analysis - Blinding of outcome assessment - management trials.", img_format), width = img_width_triple, height = img_height_triple, dpi = img_dpi, units = "mm")


# Incomplete outcome data
trials_bias_available_5plus_management %>%
  triple_bias_run_chart(incomplete_outcome_data, title_add = " - management RCTs")
ggsave(paste0(output_path, "/2B.3E Sensitivity analysis - Incomplete outcome data - management trials.", img_format), width = img_width_triple, height = img_height_triple, dpi = img_dpi, units = "mm")


# Selective reporting
trials_bias_available_5plus_management %>%
  triple_bias_run_chart(selective_reporting, title_add = " - management RCTs")
ggsave(paste0(output_path, "/2B.3F Sensitivity analysis - Selective reporting - management trials.", img_format), width = img_width_triple, height = img_height_triple, dpi = img_dpi, units = "mm")


# Other sources of bias
trials_bias_available_5plus_management %>%
  triple_bias_run_chart(other_sources_of_bias, title_add = " - management RCTs")
ggsave(paste0(output_path, "/2B.3G Sensitivity analysis - Other sources of bias - management trials.", img_format), width = img_width_triple, height = img_height_triple, dpi = img_dpi, units = "mm")

5.3 Sensitivity analysis #1C - medical device trials

Create dataset for risk of bias analyses.

trials_bias_available_5plus_device <- trials_bias_available %>% 
  filter(review_intervention_category == "Device") %>% 
  filter_years()
The following years contain less than 5 trials and have beeen filtered from this dataset: 1988, 1989, 1990, 1991, 1992, 1993, 1994, 1995, 1996, 1997, 1999, 2000, 2003, 2007, 2008, 2010, 2015, 2016, 2017.
str_glue("In total {nrow(trials_bias_available_5plus_device)} medical device trials will be included in these sensitivity analyses of risk of bias.")
In total 77 medical device trials will be included in these sensitivity analyses of risk of bias.

Create dataset for sample size analysis.

trials_size_available_5plus_device <- trials_size_available %>% 
  filter(review_intervention_category == "Device") %>% 
  filter_years()
The following years contain less than 5 trials and have beeen filtered from this dataset: 1988, 1989, 1990, 1991, 1992, 1993, 1994, 1995, 1996, 1997, 1999, 2000, 2003, 2007, 2008, 2010, 2015, 2016, 2017.
str_glue("In total {nrow(trials_size_available_5plus_device)} medical device trials will be included in this sensitivity analysis of sample sizes.")
In total 77 medical device trials will be included in this sensitivity analysis of sample sizes.

Analysis 2C.1: Percentage of overall low risk of bias trials per year - medical device trials

trials_bias_available_5plus_device %>% 
  run_chart(overall_risk_of_bias, "Low", title = "Overall low risk of bias - medical device RCTs")

ggsave(paste0(output_path, "/2C.1 Sensitivity analysis - overall low risk of bias - medical device trials.", img_format), width = img_width, height = img_height, dpi = img_dpi, units = "mm")

Analysis 2C.2: Annual median sample sizes - medical device trials

trials_size_available_5plus_device %>% 
  run_chart(sample_size, title = "Median (IQR) sample sizes - medical device RCTs")

ggsave(paste0(output_path, "/2C.2 Sensitivity analysis - sample sizes - medical device trials.", img_format), width = img_width, height = img_height, dpi = img_dpi, units = "mm")

Analysis 2C.2ph1: Post-hoc: annual mean sample sizes - medical device trials

trials_size_available_5plus_device %>% 
  run_chart(sample_size, title = "Mean (95% CI) sample sizes - medical device RCTs (post-hoc analysis)", num_agg_mean = TRUE)

ggsave(paste0(output_path, "/2C.2ph1 Sensitivity analysis - sample sizes - medical device trials, post-hoc using means.", img_format), width = img_width, height = img_height, dpi = img_dpi, units = "mm")

Analysis 2C.3: Yearly proportion of low, unclear and high risk of bias ratings in each of the seven domains - medical device trials

# Random sequence generation
trials_bias_available_5plus_device %>%
  triple_bias_run_chart(random_sequence_generation, title_add = " - medical device RCTs")
ggsave(paste0(output_path, "/2C.3A Sensitivity analysis - Random sequence generation - medical device trials.", img_format), width = img_width_triple, height = img_height_triple, dpi = img_dpi, units = "mm")


# Allocation concealment
trials_bias_available_5plus_device %>%
  triple_bias_run_chart(allocation_concealment, title_add = "- medical device RCTs")
ggsave(paste0(output_path, "/2C.3B Sensitivity analysis - Allocation concealment - medical device trials.", img_format), width = img_width_triple, height = img_height_triple, dpi = img_dpi, units = "mm")


# Blinding of participants and personnel
trials_bias_available_5plus_device %>%
  triple_bias_run_chart(blinding_of_participants_and_personnel, title_add = " - medical device RCTs")
ggsave(paste0(output_path, "/2C.3C Sensitivity analysis - Blinding of participants and personnel - medical device trials.", img_format), width = img_width_triple, height = img_height_triple, dpi = img_dpi, units = "mm")


# Blinding of outcome assessment
trials_bias_available_5plus_device %>%
  triple_bias_run_chart(blinding_of_outcome_assessment, title_add = " - medical device RCTs")
ggsave(paste0(output_path, "/2C.3D Sensitivity analysis - Blinding of outcome assessment - medical device trials.", img_format), width = img_width_triple, height = img_height_triple, dpi = img_dpi, units = "mm")


# Incomplete outcome data
trials_bias_available_5plus_device %>%
  triple_bias_run_chart(incomplete_outcome_data, title_add = " - medical device RCTs")
ggsave(paste0(output_path, "/2C.3E Sensitivity analysis - Incomplete outcome data - medical device trials.", img_format), width = img_width_triple, height = img_height_triple, dpi = img_dpi, units = "mm")


# Selective reporting
trials_bias_available_5plus_device %>%
  triple_bias_run_chart(selective_reporting, title_add = " - medical device RCTs")
ggsave(paste0(output_path, "/2C.3F Sensitivity analysis - Selective reporting - medical device trials.", img_format), width = img_width_triple, height = img_height_triple, dpi = img_dpi, units = "mm")


# Other sources of bias
trials_bias_available_5plus_device %>%
  triple_bias_run_chart(other_sources_of_bias, title_add = " - medical device RCTs")
ggsave(paste0(output_path, "/2C.3G Sensitivity analysis - Other sources of bias - medical device trials.", img_format), width = img_width_triple, height = img_height_triple, dpi = img_dpi, units = "mm")

6. Sensitivity analysis #2 - all trials

For these sensitivity analyses, all trials will be included, including trials from years with < 5 trials.

This makes the analyses - especially of the risk of bias domains - “fragile” and results should be interpreted with caution.

Analysis 3.1: Percentage of overall low risk of bias trials per year

trials_bias_available %>% 
  run_chart(overall_risk_of_bias, "Low", title = "Overall low risk of bias - all RCTs") +
  scale_y_continuous(limits = c(0, 100), breaks = 10 * 0:10, minor_breaks = 5 * 0:20, labels = paste0(10 * 0:10, "%"), expand = c(0.015, 0))
Scale for 'y' is already present. Adding another scale for 'y', which will replace the existing scale.
ggsave(paste0(output_path, "/3.1 Sensitivity analysis - overall low risk of bias - all trials.", img_format), width = img_width, height = img_height, dpi = img_dpi, units = "mm")

NOTE: The y-axis goes from 0-100% in this plot, unlike the other plots of overall low risk of bias (0-50%) due to larger differences in yearly proportions caused by 1 single trial with overall low risk of bias in 2018.

These results indicate non-random variation, however, the median yearly proportion is 0%. Thus no values below the median are possible, and no crossings are possible in the run chart and runs cnanot be broken. This result should be interpreted with caution.

Analysis 3.2: Annual median sample sizes

trials_size_available %>% 
  run_chart(sample_size, title = "Median (IQR) sample sizes - all RCTs")

ggsave(paste0(output_path, "/3.2 Sensitivity analysis - sample sizes - all trials.", img_format), width = img_width, height = img_height, dpi = img_dpi, units = "mm")

Analysis 3.2ph1: Annual mean sample sizes

trials_size_available %>% 
  run_chart(sample_size, title = "Mean (95% CI) sample sizes - all RCTs (post-hoc analysis)", num_agg_mean = TRUE)

ggsave(paste0(output_path, "/3.2ph1 Sensitivity analysis - sample sizes - all trials, post-hoc using mean values.", img_format), width = img_width, height = img_height, dpi = img_dpi, units = "mm")

Note: for years with only 1 trial, the grey ribbon is missing as a confidence interval cannot be calculated.

Analysis 3.3: Yearly proportion of low, unclear and high risk of bias ratings in each of the seven domains - all trials

# Random sequence generation
trials_bias_available %>%
  triple_bias_run_chart(random_sequence_generation, title_add = " - all RCTs")
ggsave(paste0(output_path, "/3.3A Sensitivity analysis - Random sequence generation.", img_format), width = img_width_triple, height = img_height_triple, dpi = img_dpi, units = "mm")


# Allocation concealment
trials_bias_available %>%
  triple_bias_run_chart(allocation_concealment, title_add = " - all RCTs")
ggsave(paste0(output_path, "/3.3B Sensitivity analysis - Allocation concealment.", img_format), width = img_width_triple, height = img_height_triple, dpi = img_dpi, units = "mm")


# Blinding of participants and personnel
trials_bias_available %>%
  triple_bias_run_chart(blinding_of_participants_and_personnel, title_add = " - all RCTs")
ggsave(paste0(output_path, "/3.3C Sensitivity analysis - Blinding of participants and personnel.", img_format), width = img_width_triple, height = img_height_triple, dpi = img_dpi, units = "mm")


# Blinding of outcome assessment
trials_bias_available %>%
  triple_bias_run_chart(blinding_of_outcome_assessment, title_add = " - all RCTs")
ggsave(paste0(output_path, "/3.3D Sensitivity analysis - Blinding of outcome assessment.", img_format), width = img_width_triple, height = img_height_triple, dpi = img_dpi, units = "mm")


# Incomplete outcome data
trials_bias_available %>%
  triple_bias_run_chart(incomplete_outcome_data, title_add = " - all RCTs")
ggsave(paste0(output_path, "/3.3E Sensitivity analysis - Incomplete outcome data.", img_format), width = img_width_triple, height = img_height_triple, dpi = img_dpi, units = "mm")


# Selective reporting
trials_bias_available %>%
  triple_bias_run_chart(selective_reporting, title_add = " - all RCTs")
ggsave(paste0(output_path, "/3.3F Sensitivity analysis - Selective reporting.", img_format), width = img_width_triple, height = img_height_triple, dpi = img_dpi, units = "mm")


# Other sources of bias
trials_bias_available %>%
  triple_bias_run_chart(other_sources_of_bias, title_add = " - all RCTs")
ggsave(paste0(output_path, "/3.3G Sensitivity analysis - Other sources of bias.", img_format), width = img_width_triple, height = img_height_triple, dpi = img_dpi, units = "mm")

7. Overview of the number of included trials per year

Create a table of the number of included trials per year in each analysis (and in the full dataset).

# Simple helper function
n_years <- function(.dta){
  name <- quo_name(enquo(.dta))
  .dta %>%
    group_by(trial_year) %>% 
    summarise(!!name := n()) %>% 
    arrange(trial_year)
}

tbl <- n_years(included_trials) %>% # First row is the total number of trials included
  left_join(n_years(trials_bias_available_5plus)) %>% # Primary analysis - bias in >= 5 trials /year
  left_join(n_years(trials_size_available_5plus)) %>% # Primary analysis - size in >= 5 trials/year
  left_join(n_years(trials_bias_available_5plus_drug)) %>% # Sensitivity analysis #1A (drug trials) - bias in >= 5 trials/year
  left_join(n_years(trials_size_available_5plus_drug)) %>% # Sensitivity analysis #1A (drug trials) - size in >= 5 trials/year
  left_join(n_years(trials_bias_available_5plus_management)) %>% # Sensitivity analysis #1B (management trials) - bias in >= 5 trials/year
  left_join(n_years(trials_size_available_5plus_management)) %>% # Sensitivity analysis #1B (management trials) - size in >= 5 trials/year
  left_join(n_years(trials_bias_available_5plus_device)) %>% # Sensitivity analysis #1C (medical device trials) - bias in >= 5 trials/year
  left_join(n_years(trials_size_available_5plus_device)) %>% # Sensitivity analysis #1C (medical device trials) - size in >= 5 trials/year
  left_join(n_years(trials_bias_available)) %>% # Sensitivity analysis #2 - bias in all trials
  left_join(n_years(trials_size_available)) %>% # Sensitivity analysis #2 - size in all trials
  mutate_all(~{ifelse(is.na(.), 0, .)}) # Replace all NAs (missing) with 0

tbl_total <- tbl %>%
  summarise_all(sum) %>% 
  mutate(trial_year = "Total")

tbl %>% 
  mutate(trial_year = as.character(trial_year)) %>% # Convert to string to allow merging with the "Total" row
  bind_rows(tbl_total) %>% # Merge
  print() %>% # Print to RMarkdown document
  write_csv(paste0(output_path, "/trials_per_year_in_each_analysis.csv"))

8. Additional plots

Create two additional plots visualising the absolute numbers; these plots were not pre-planned, but added during the peer review process.

Absolute number of low and high overall risk of bias trials per year:

trials_bias_available %>% 
  group_by(trial_year) %>% 
  summarise(Low = sum(overall_risk_of_bias == "Low"),
            High = sum(overall_risk_of_bias == "High")) %>% 
  gather(key = "rob", value = "n", -trial_year) %>% 
  arrange(trial_year) %>% 
  mutate(rob = factor(rob, levels = c("Low", "High"))) %>% 
  ggplot(aes(x = trial_year, y = n, fill = rob)) +
  geom_col(position = "dodge") +
  scale_fill_manual(values = c("#00C000", "#C00000"), name = "Overall risk of bias:") +
  labs(x = NULL, y = "Number of RCTs") +
  scale_x_continuous(limits = c(1975, 2020), breaks = 1975 + 5 * 0:8, minor_breaks = 1975:2020, expand = c(0, 0)) +
  scale_y_continuous(limits = c(0, 50), breaks = 0:5 * 10, minor_breaks = 0:10 * 5, expand = c(0, 0)) + 
  theme(plot.margin = margin (t = 5, l = 10, r = 15, b = 5, unit = "pt"), legend.position = "bottom") # Fix plot margins to ensure all labels are printed and change legend position
ggsave(paste0(output_path, "/Additional plots - absolute number of trials by overall risk of bias per year.", img_format), width = img_width, height = img_height, dpi = img_dpi, units = "mm")

Absolute number of small and large trials per year:

Small trials are defined as trials including <= the median overall sample size participants per year; large trials are defined as trials including > the median overal sample size participants per year.

# Calculate median
med_sample_size <- trials_size_available %>%
  pull(sample_size) %>% 
  median()

# Categorise trials
trials_size_categorised <- trials_size_available %>% 
  mutate(size_cat = ifelse(sample_size <= med_sample_size, "Small", "Large"))
str_glue("The overall median sample size is {med_sample_size}.
         {sum(trials_size_categorised$size_cat == 'Small')} trials are small and {sum(trials_size_categorised$size_cat == 'Large')} trials are large.")
The overall median sample size is 82.5.
301 trials are small and 301 trials are large.
# Plot
trials_size_categorised %>% 
  group_by(trial_year) %>% 
  summarise(Small = sum(size_cat == "Small"),
            Large = sum(size_cat == "Large")) %>% 
  gather(key = "size", value = "n", -trial_year) %>% 
  arrange(trial_year) %>% 
  mutate(size = factor(size, levels = c("Small", "Large"))) %>% 
  ggplot(aes(x = trial_year, y = n, fill = size)) +
  geom_col(position = "dodge") +
  scale_fill_manual(values = c("#C00000", "#00C000"), name = "RCT sizes:") +
  labs(x = NULL, y = "Number of RCTs") +
  scale_x_continuous(limits = c(1975, 2020), breaks = 1975 + 5 * 0:8, minor_breaks = 1975:2020, expand = c(0, 0)) +
  scale_y_continuous(limits = c(0, 30), breaks = 0:6 * 5, expand = c(0, 0)) + 
  theme(plot.margin = margin (t = 5, l = 10, r = 15, b = 5, unit = "pt"), legend.position = "bottom") # Fix plot margins to ensure all labels are printed and change legend position


ggsave(paste0(output_path, "/Additional plots - absolute number of trials by overall size per year.", img_format), width = img_width, height = img_height, dpi = img_dpi, units = "mm")

9. Date and session info

Save the date and session info for when the script was run to the RMarkdown-HTML-file.

date()
[1] "Sat May 18 16:45:31 2019"
sessionInfo()
R version 3.5.3 (2019-03-11)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Mojave 10.14.4

Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] forcats_0.4.0   stringr_1.4.0   dplyr_0.8.0.1   purrr_0.3.2     readr_1.3.1     tidyr_0.8.3     tibble_2.1.1    ggplot2_3.1.1  
[9] tidyverse_1.2.1

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.1       cellranger_1.1.0 pillar_1.3.1     compiler_3.5.3   plyr_1.8.4       tools_3.5.3      digest_0.6.18   
 [8] jsonlite_1.6     lubridate_1.7.4  nlme_3.1-139     gtable_0.3.0     lattice_0.20-38  pkgconfig_2.0.2  rlang_0.3.4     
[15] cli_1.1.0        rstudioapi_0.10  haven_2.1.0      xfun_0.6         withr_2.1.2      xml2_1.2.0       httr_1.4.0      
[22] knitr_1.22       generics_0.0.2   hms_0.4.2        cowplot_0.9.4    grid_3.5.3       tidyselect_0.2.5 glue_1.3.1      
[29] R6_2.4.0         readxl_1.3.1     modelr_0.1.4     magrittr_1.5     backports_1.1.4  scales_1.0.0     rvest_0.3.3     
[36] assertthat_0.2.1 colorspace_1.4-1 labeling_0.3     stringi_1.4.3    lazyeval_0.2.2   munsell_0.5.0    broom_0.5.2     
[43] crayon_1.3.4    
---
title: "Overall bias and sample sizes were unchanged in ICU trials over time: a meta-epidemiological study - final analysis code"
output: html_notebook
---

This [R Markdown-notebook](https://bookdown.org/yihui/rmarkdown/notebook.html) contains the final annotated analysis code for the manuscript: ***"Overall bias and sample sizes were unchanged in ICU trials over time: a meta-epidemiological study"*** by **Carl Thomas Anthon, Anders Granholm, Jon Henrik Laake, Anders Perner and Morten Hylander Møller**.

The annotated analysis code was written by [Anders Granholm](mailto:andersgran@gmail.com) according to the pre-published protocol and statistical analysis plain ([doi: 10.1111/aas.13258](https://doi.org/10.1111/aas.13258)). For further details, please see the full article and its supplementary files, which also includes the CSV-file containing the final dataset and a data dictionary.

For correspondence related to the manuscript, contact the corresponding author [Carl Thomas Anthon](mailto:carl.anthon@gmail.com). Questions directly related to the analysis code can also be sent to [Anders Granholm](mailto:andersgran@gmail.com).

Figures displayed in this document are scaled differently than in the final paper and supplement.

### 1. Setup

Load the packages/functions required for the analyses.

```{r message=FALSE}
library(tidyverse) # Load the Tidyverse packages, used for data processing and visualisation.
plot_grid <- cowplot::plot_grid # Load this function from the cowplot package; only this function will be used from this package, and the entire library will not be loaded as this is unnecessary and cowplot overrides the default ggplot2 theme, which then has to be reset.
```

Import the file containing data for all the included trials.

**Note:** This code block uses RStudio's API to allow manual selection of the final dataset. This requires the script to be run in a recent version of RStudio; otherwise this has to be manually replaced with the appropriate file path.

```{r}
included_trials <- read_csv(rstudioapi::selectFile(filter = "CSV files (*.csv)"), col_types = cols(trial_year = col_integer(), sample_size = col_integer()))
```

Define the output directory where all results will be saved.

**Note:** This block uses RStudio's API to allow manual selection of the directory. This requires the script to be run in a recent version of RStudio; otherwise this has to be manually replaced with the appropriate path.

```{r}
output_path <- rstudioapi::selectDirectory()
```

Define plot theme and image settings for saving images, to allow easy changing later.

```{r}
# File settings
img_dpi <- 600 # DPI - not used for vectorized formats, e.g. PDFs, but argument is kept if this is changed.
img_format <- "pdf" # File format
# Widths and heights (in millimeters) for single run chart plots
img_width <- 150
img_height <- 80
# Widths and heights (in millimeters) for the triple run chart plots
img_width_triple <- 150
img_height_triple <- 150
# Width and height (in millimeters) for the risk of bias overview plot
img_width_bias <- 200
img_height_bias <- 80

# Set ggplot2 theme
theme_set(theme_linedraw())
```


### 2. Data inspection and preparation

#### 2.1 Helper function
The following helper function filters a dataset so only years with a certain number of trials are included, and optionally prints the excluded years.
Arguments:

- **.dta**: the dataset to filter.
- **minimum**: the minimum number of required trials per year, default is 5.
- **verbose**: if TRUE (default), the years with too few trials are printed.

```{r}
filter_years <- function(.dta, minimum = 5, verbose = TRUE){
  years <- .dta %>% 
    group_by(trial_year) %>% 
    summarise(n = n()) %>% 
    filter(n < minimum) %>% 
    pull(trial_year)
  
  if (verbose){
    print(str_glue("The following years contain less than {minimum} trials and have beeen filtered from this dataset: {str_c(years, collapse = ', ')}."))
  }
  
  .dta %>% 
    filter(!(trial_year %in% years))
}
```


#### 2.2 Risk of bias

Create **overall_risk_of_bias** variable and convert all risk of bias domains to factors.

```{r}
included_trials <- included_trials %>% 
  mutate(overall_risk_of_bias = case_when(is.na(random_sequence_generation) | is.na(allocation_concealment) | is.na(blinding_of_participants_and_personnel) | is.na(blinding_of_outcome_assessment) | is.na(incomplete_outcome_data) | is.na(selective_reporting) | is.na(other_sources_of_bias) ~ NA_character_, # If any risk of bias domain is missing, the overall risk of bias will be missing
                                          random_sequence_generation == "Low" & allocation_concealment == "Low" & blinding_of_participants_and_personnel == "Low" & blinding_of_outcome_assessment == "Low" & incomplete_outcome_data == "Low" & selective_reporting == "Low" & other_sources_of_bias == "Low" ~ "Low", # If all domains are low risk, the overall risk of bias will be low as well
                                          TRUE ~ "High" # If none of the above criteria apply, the overall risk of bias will be high
         )) %>% 
  mutate_at(vars(random_sequence_generation:overall_risk_of_bias), ~factor(., levels = c("Low", "Unclear", "High")))
```

Assess how many trials have missing data for the overall risk of bias domain (which will be missing if any of the other domains are missing) and create dataset that only includes trials with available risk of bias ratings.

```{r}
included_trials %>% 
  filter(is.na(overall_risk_of_bias)) %>% 
  str_glue_data("In total, {nrow(.)} trials have missing data for one or more risk of bias domains and will be excluded from these analyses.")

trials_bias_available <- included_trials %>% 
  filter(!is.na(overall_risk_of_bias))
```

Create dataset that only contains trials from years with at least 5 trials with available risk of bias ratings (for the primary analyses).

```{r}
trials_bias_available_5plus <- filter_years(trials_bias_available)
```

For all trials with available risk of bias, calculate overall distributions for each domain.

```{r}
trials_bias_available %>% 
  gather(key = "bias_domain", value = "rating", random_sequence_generation:overall_risk_of_bias, factor_key = TRUE) %>% 
  select(bias_domain, rating) %>% 
  group_by(bias_domain) %>% 
  summarise(low = sum(rating == "Low") / n() * 100,
            unclear = sum(rating == "Unclear") / n() * 100,
            high = sum(rating == "High") / n() * 100) %>% 
  mutate_if(is.numeric, ~paste0(round(., 1), "%"))
```

Create plot visualising distributions of risk of bias for each domain (only include trials where risk of bias ratings are available).

```{r}
trials_bias_available %>% 
  gather(key = "bias_domain", value = "rating", random_sequence_generation:overall_risk_of_bias, factor_key = TRUE) %>% 
  select(bias_domain, rating) %>%
  mutate(rating = factor(rating, levels = c("High", "Unclear", "Low")),
         bias_domain = fct_rev(bias_domain)) %>%  # Reverse the order to ensure proper printing in the table
  ggplot(aes(x = bias_domain, fill = rating)) + # Create the plot
  geom_bar(position = "fill", width = 0.95) +
  coord_flip() +
  labs(x = NULL, y = NULL) +
  scale_fill_manual(values = c("#C00000", "#E0E000", "#00C000"), name = "Risk of bias rating:", guide = guide_legend(reverse = TRUE)) +
  scale_y_continuous(expand = c(0, 0), breaks = 0.1 * 0:10, minor_breaks = 0.05 * 0:20, labels = paste0(10 * 0:10, "%")) +
  scale_x_discrete(expand = c(0, 0), labels = rev(c( # Use rev() to ensure that order matches the domains above
    "Random sequence generation",
    "Allocation concealment",
    "Blinding of participants and personnel",
    "Blinding of outcome assessment",
    "Incomplete outcome data",
    "Selective reporting",
    "Other sources of bias",
    "Overall risk of bias"
  ))) +
  theme(plot.margin = margin (t = 5, l = 10, r = 15, b = 5, unit = "pt"), legend.position = "bottom") # Fix plot margins to ensure all labels are printed and change legend

ggsave(paste0(output_path, "/Risk of bias overview.", img_format), width = img_width_bias, height = img_height_bias, dpi = img_dpi, units = "mm") # Save file
```

#### 2.3 Sample sizes

Assess how many trials have missing data for sample size and create dataset that only includes trials with available sample sizes.

```{r}
included_trials %>% 
  filter(is.na(sample_size)) %>% 
  str_glue_data("In total, {nrow(.)} trials have missing data for sample sizes and will be excluded from these analyses.")

trials_size_available <- included_trials %>% 
  filter(!is.na(sample_size))
```

Create dataset that only contains trials from years with at least 5 trials with available sample sizes (for the primary analyses).

```{r}
trials_size_available_5plus <- filter_years(trials_size_available)
```

For all trials with available sample sizes, calculate the overall distribution.

```{r}
round(summary(trials_size_available$sample_size), 1)
```

### 3. Define run chart functions

Define the function that will produce the run charts and associated statistics.

This contains two helper functions, *run_chart_stat* and *run_chart_caption* which should only be called from the main function, **run_chart**, and not directly.

The **run_chart** function has the following arguments:

- **.dta**: the dataset used.
- **variable**: the variable (e.g. a risk of bias domain or the sample sizes variable), which will be assessed.
- **category**: if the variable is a risk of bias domain, the rating (e.g. "Low") that will be assessed.
- **title**: the plot title, if any is wanted. See argument below.
- **delete_title**: this argument was added after the majority of the code was written. If TRUE (current default), no plot titles will be used regardless of the above argument.
- **fixed_median**: defaults to NULL, and then the median value will be calculated using the data provided. If a numeric value is provided, the median will be fixed to this value. This argument is currently not used in any analyses.
- **num_agg_mean**: defaults to FALSE. If FALSE, for numeric values (i.e. yearly samply sizes) the median value will be used in the run chart and the interquartile range will be displayed. If TRUE, mean values will be used and the 95% confidence intervals for the mean will be displayed instead. Ignored for risk of bias domains.

The function returns an annotated plot.

An additional, simplified helper function is present, **triple_bias_run_chart**, with the following arguments:

- **.dta**: the dataset used.
- **variable**: the risk of bias domain that will be assessed.
- **title_add**: extra text added as part of the title (added directly after the bias domain), if NULL (default), nothing will be added.

```{r}
# Function that calculates the run chart statistics using the "val" column in the aggregated dataset from the main function that has already been sorted by year.
# This function should not be called directly
run_chart_stat <- function(val, fixed_median = NULL){
  n_total <- length(val)
  if (n_total < 1){
    stop("ERROR: run_chart_stat: number of observations (years) < 1.")
  }
  if (!(is.null(fixed_median) | is.numeric(fixed_median))) {
    stop("ERROR: run_chart_stat: fixed_median must be either NULL or a numeric value.")
  }
  med <- ifelse(is.null(fixed_median), median(val), fixed_median)
  n_useful <- sum(!near(val, med))
  
  run_longest <- 0
  run_current <- 0
  pos <- NA
  cross_n <- NA
  
  for (i in 1:n_total) {
    cur <- val[i]
    if (!near(cur, med)) {
      
      if (is.na(pos)){ # Will be true for the first value that is not equal to the median (and only once)
        pos <- !(cur > med) # Sets current position as the opposite of what it is, to start new run
        cross_n <- -1 # The first run will not count as a crossing
      }
      
      if ((cur > med ) == pos) { # If the current value is on the same side as the last one
        run_current <- run_current + 1
      } else { # If the current value is on the opposite side as the last one
        if (run_current > run_longest) {
          run_longest <- run_current
        }
        
        run_current <- 1
        cross_n <- cross_n +1
        pos <- cur > med
      }
      
    }
  }
  if (run_current > run_longest) {
    run_longest <- run_current
  }
  
  # Return results (upper and lower prediction limits calculated according to Anhøj's rules)
  list(n_total = n_total, n_useful = n_useful, median = med,
       run_UPL = round(log2(n_useful) + 3), run_longest = run_longest,
       cross_LPL = qbinom(0.05, n_useful - 1, 0.5), cross_n = cross_n, fixed_median = !is.null(fixed_median))
}

# Function that formats the caption used in each run chart
# This function should not be called directly
run_chart_caption <- function(res, percentage = NULL, dec = 1){
  res$base <- paste0(res$n_total, " observations (", res$n_useful, " useful). ", ifelse(res$fixed_median, "Fixed median: ", "Median: "), format(res$median, digits = dec, nsmall = dec), percentage, ". Longest run:")
  res$long <- paste0("[max. ", res$run_UPL, "].")
  res$cross <- paste0("[min. ", res$cross_LPL, "].")
  
  if (res$run_longest > res$run_UPL & res$cross_n < res$cross_LPL){
    # Both results are significant (non-random variation)
    res <- map(res, as.character)
    cap <- substitute(base~bold(run_longest)~long~"Crossings:"~bold(cross_n)~cross, res)
  } else if (res$run_longest > res$run_UPL) {
    # Only the longest run is significant (non-random variation)
    res <- map(res, as.character)
    cap <- substitute(base~bold(run_longest)~long~"Crossings:"~cross_n~cross, res)
  } else if (res$cross_n < res$cross_LPL) {
    # Only the number of crossings is significant (non-random variation)
    res <- map(res, as.character)
    cap <- substitute(base~run_longest~long~"Crossings:"~bold(cross_n)~cross, res)
  } else {
    # No significance (only random variation)
    res <- map(res, as.character)
    cap <- substitute(base~run_longest~long~"Crossings:"~cross_n~cross, res)
  }
  # Return caption
  cap
}

# The main function - see description above
run_chart <- function(.dta, variable, category = NULL, title = NULL, delete_titles = TRUE, fixed_median = NULL, num_agg_mean = FALSE){
  # Quote variable name and get the type of it
  var <- enquo(variable)
  var_pull <- pull(.dta, !!var)
  type <- case_when(is.factor(var_pull) ~ "factor",
                    is.integer(var_pull) ~ "integer",
                    TRUE ~ "other")
  
  # Input checks
  if (type == "factor") {
    if (is.null(category)) {
      stop("ERROR: run_chart: 'variable' is a factor (risk of bias domain?) and 'category' is NULL. 'category' must be either 'Low', 'Unclear' or 'High'.", call. = FALSE)
    } else if (!(category %in% c("Low", "Unclear", "High"))){
      stop(paste0("ERROR: run_chart: 'variable' is a factor (risk of bias domain?), 'category' is '", category, "', but must be either 'Low', 'Unclear' or 'High'."), call. = FALSE)
    }
  } else if (type == "integer") {
    if (!is.null(category)){
      warning(paste0("WARNING: run_chart: 'variable' is an integer (sample size?). 'category' is '", category, "' and will NOT be used."), call. = FALSE)
    }
  } else {
    stop(paste0("ERROR: run_chart: 'variable' must be either a factor (risk of bias category) or an integer (sample size), '", quo_name(var), "' is neither."))
  }
  
  # Remove missing data
  n_miss <- .dta %>% 
    filter(is.na(trial_year) | is.na(!!var)) %>% 
    nrow()
  if (n_miss > 0) {
    warning(paste0("WARNING: run_chart: ", n_miss, " rows with missing data in either trial_year or ", quo_name(var), " removed from the dataset prior to analyses."))
    .dta <- .dta %>% 
      filter(!is.na(trial_year) & !is.na(!!var))
  }
  
  # If delete_titles == TRUE, then set title to null
  if (delete_titles){
    title <- NULL
  }
  
  # Aggregate data
  if (type == "factor") {
    agg <- .dta %>% 
      group_by(trial_year) %>% 
      summarise(val = sum(!!var == category) / n() * 100) %>% 
      arrange(trial_year)
  } else if (type == "integer") {
    if (!num_agg_mean){
      agg <- .dta %>% 
        group_by(trial_year) %>% 
        summarise(val = median(!!var),
                  y_high = quantile(!!var, 0.75),
                  y_low = quantile(!!var, 0.25)) %>%
        arrange(trial_year)      
    } else if (num_agg_mean) {
      agg <- .dta %>% 
        group_by(trial_year) %>% 
        summarise(val = mean(!!var),
                  se_val = sd(!!var) / sqrt(n()),
                  y_high = val + 1.96 * se_val,
                  y_low_tmp = val - 1.96 * se_val,
                  y_low = ifelse(y_low_tmp < 0, 0, y_low_tmp)) %>%  # 95% confidence intervals (CIs) below 0 are impossible
        arrange(trial_year)
    }

  }
  
  # Calculate stats
  stats <- agg %>%
    pull(val) %>% 
    run_chart_stat(fixed_median = fixed_median)
  
  # General plot aspects
  gg <- ggplot(data = agg, aes(x = trial_year, y = val)) +
    labs(caption = run_chart_caption(stats, if_else(type == "factor", "%", "")), x = NULL, y = NULL, subtitle = title) + # The title argument provided is used for a ggplot2 subtitle, due to the default smaller font size
    theme(plot.caption = element_text(hjust = 0.5)) + # Align caption in the middle
    scale_x_continuous(limits = c(1975, 2020), breaks = 1975 + 5 * 0:8, minor_breaks = 1975:2020, expand = c(0, 0))
  
  # Add different aspects for risk of bias domains and sample sizes
  if (type == "factor") {
    # Risk of bias domains - different y-axis for overall risk of bias plots and subdomains
    if (quo_name(var) == "overall_risk_of_bias") {
      gg <- gg +
        scale_y_continuous(limits = c(0, 50), breaks = 10 * 0:5, minor_breaks = 5 * 0:10, labels = paste0(10 * 0:5, "%"), expand = c(0.015, 0))
    } else {
      gg <- gg +
        scale_y_continuous(limits = c(0, 100), breaks = 20 * 0:5, minor_breaks = 5 * 0:20, labels = paste0(20 * 0:5, "%"), expand = c(0.015, 0))
    }
  } else if (type == "integer") {
    # Sample sizes - different y-axis according to aggregation function
    gg <- gg +
      scale_y_continuous(breaks = 100 * 0:15, minor_breaks = 25 * 0:60, expand = c(0, 0)) + # Axis long enough for both types of plots - different limits for median/mean values are set in next line
      coord_cartesian(ylim = c(0, ifelse(!num_agg_mean, 600, 1400))) + # Limits are set here to avoid outliers for 95% CIs of yearly means to be removed
      geom_ribbon(aes(ymin = y_low, ymax = y_high), fill = "#B5B5B5", alpha = 0.80) # Add grey shaded area
  }
  
  # Add points and lines
  gg <- gg + geom_point(size = 2) +
    geom_line() +
    geom_hline(yintercept = stats$median, size = 1.25)
  
  # Return plot
  gg
}

# Simple helper function that creates a combined plot with all 3 risk of bias ratings
triple_bias_run_chart <- function(.dta, variable, title_add = NULL){
  variable <- enquo(variable)
  title_base <- str_replace_all(str_to_title(quo_name(variable)), "_", " ")
  plot_grid(
    run_chart(.dta, !!variable, category = "Low", title = paste0(title_base, title_add)) + ylab("Low"),
    run_chart(.dta, !!variable, category = "Unclear") + ylab("Unclear"),
    run_chart(.dta, !!variable, category = "High") + ylab("High"),
    nrow = 3, ncol = 1)
}
```


### 4. Primary analysis

*For all primary analyses, only years with 5 or more trials are used*.

**Analysis 1.1: Percentage of overall low risk of bias trials per year**

```{r}
trials_bias_available_5plus %>% 
  run_chart(overall_risk_of_bias, "Low", title = "Overall low risk of bias")

ggsave(paste0(output_path, "/1.1 Primary analysis - overall low risk of bias.", img_format), width = img_width, height = img_height, dpi = img_dpi, units = "mm")
```

**Analysis 1.2: Annual median sample sizes**

```{r}
trials_size_available_5plus %>% 
  run_chart(sample_size, title = "Median (IQR) sample sizes")

ggsave(paste0(output_path, "/1.2 Primary analysis - sample sizes.", img_format), width = img_width, height = img_height, dpi = img_dpi, units = "mm")
```

**Analysis 1.2ph1: Post-hoc analysis of the annual mean sample sizes**

```{r}
trials_size_available_5plus %>% 
  run_chart(sample_size, title = "Mean (95% CI) sample sizes (post-hoc analysis)", num_agg_mean = TRUE)

ggsave(paste0(output_path, "/1.2ph1 Primary analysis - sample sizes, post-hoc analysis using mean values.", img_format), width = img_width, height = img_height, dpi = img_dpi, units = "mm")
```

**Analysis 1.3: Yearly proportion of low, unclear and high risk of bias ratings in each of the seven domains**

```{r}
# Random sequence generation
trials_bias_available_5plus %>%
  triple_bias_run_chart(random_sequence_generation)
ggsave(paste0(output_path, "/1.3A Primary analysis - Random sequence generation.", img_format), width = img_width_triple, height = img_height_triple, dpi = img_dpi, units = "mm")

# Allocation concealment
trials_bias_available_5plus %>%
  triple_bias_run_chart(allocation_concealment)
ggsave(paste0(output_path, "/1.3B Primary analysis - Allocation concealment.", img_format), width = img_width_triple, height = img_height_triple, dpi = img_dpi, units = "mm")

# Blinding of participants and personnel
trials_bias_available_5plus %>%
  triple_bias_run_chart(blinding_of_participants_and_personnel)
ggsave(paste0(output_path, "/1.3C Primary analysis - Blinding of participants and personnel.", img_format), width = img_width_triple, height = img_height_triple, dpi = img_dpi, units = "mm")

# Blinding of outcome assessment
trials_bias_available_5plus %>%
  triple_bias_run_chart(blinding_of_outcome_assessment)
ggsave(paste0(output_path, "/1.3D Primary analysis - Blinding of outcome assessment.", img_format), width = img_width_triple, height = img_height_triple, dpi = img_dpi, units = "mm")

# Incomplete outcome data
trials_bias_available_5plus %>%
  triple_bias_run_chart(incomplete_outcome_data)
ggsave(paste0(output_path, "/1.3E Primary analysis - Incomplete outcome data.", img_format), width = img_width_triple, height = img_height_triple, dpi = img_dpi, units = "mm")

# Selective reporting
trials_bias_available_5plus %>%
  triple_bias_run_chart(selective_reporting)
ggsave(paste0(output_path, "/1.3F Primary analysis - Selective reporting.", img_format), width = img_width_triple, height = img_height_triple, dpi = img_dpi, units = "mm")

# Other sources of bias
trials_bias_available_5plus %>%
  triple_bias_run_chart(other_sources_of_bias)
ggsave(paste0(output_path, "/1.3G Primary analysis - Other sources of bias.", img_format), width = img_width_triple, height = img_height_triple, dpi = img_dpi, units = "mm")
```

### 5. Sensitivity analysis #1 - all trials from years with 5 or more trials, but separated according to type of intervention

*As for the primary analyses, only years with 5 or more trials are used here*.

Assess the distribution of intervention types.

```{r}
included_trials %>% 
  group_by(review_intervention_category) %>% 
  summarise(n = n())
```

Rather few medical device and management trials are included. Thus, these analyses will contain few data points and should be interpreted with caution.

#### 5.1 Sensitivity analysis #1A - drug trials

Create dataset for risk of bias analyses.

```{r}
trials_bias_available_5plus_drug <- trials_bias_available %>% 
  filter(review_intervention_category == "Drug") %>% 
  filter_years()

str_glue("In total {nrow(trials_bias_available_5plus_drug)} drug trials will be included in these sensitivity analyses of risk of bias.")
```

Create dataset for sample size analysis.

```{r}
trials_size_available_5plus_drug <- trials_size_available %>% 
  filter(review_intervention_category == "Drug") %>% 
  filter_years()

str_glue("In total {nrow(trials_size_available_5plus_drug)} drug trials will be included in this sensitivity analysis of sample sizes.")
```

**Analysis 2A.1: Percentage of overall low risk of bias trials per year - drug trials**

```{r}
trials_bias_available_5plus_drug %>% 
  run_chart(overall_risk_of_bias, "Low", title = "Overall low risk of bias - drug RCTs")

ggsave(paste0(output_path, "/2A.1 Sensitivity analysis - overall low risk of bias - drug trials.", img_format), width = img_width, height = img_height, dpi = img_dpi, units = "mm")
```

**Analysis 2A.2: Annual median sample sizes - drug trials**

```{r}
trials_size_available_5plus_drug %>% 
  run_chart(sample_size, title = "Median (IQR) sample sizes - drug RCTs")

ggsave(paste0(output_path, "/2A.2 Sensitivity analysis - sample sizes - drug trials.", img_format), width = img_width, height = img_height, dpi = img_dpi, units = "mm")
```

**Analysis 2A.2ph1: Post-hoc: annual mean sample sizes - drug trials**

```{r}
trials_size_available_5plus_drug %>% 
  run_chart(sample_size, title = "Mean (95% CI) sample sizes - drug RCTs (post-hoc analysis)", num_agg_mean = TRUE)

ggsave(paste0(output_path, "/2A.2ph1 Sensitivity analysis - sample sizes - drug trials, post-hoc using means.", img_format), width = img_width, height = img_height, dpi = img_dpi, units = "mm")
```

**Analysis 2A.3: Yearly proportion of low, unclear and high risk of bias ratings in each of the seven domains - drug trials**

```{r}
# Random sequence generation
trials_bias_available_5plus_drug %>%
  triple_bias_run_chart(random_sequence_generation, title_add = " - drug RCTs")
ggsave(paste0(output_path, "/2A.3A Sensitivity analysis - Random sequence generation - drug trials.", img_format), width = img_width_triple, height = img_height_triple, dpi = img_dpi, units = "mm")

# Allocation concealment
trials_bias_available_5plus_drug %>%
  triple_bias_run_chart(allocation_concealment, title_add = " - drug RCTs")
ggsave(paste0(output_path, "/2A.3B Sensitivity analysis - Allocation concealment - drug trials.", img_format), width = img_width_triple, height = img_height_triple, dpi = img_dpi, units = "mm")

# Blinding of participants and personnel
trials_bias_available_5plus_drug %>%
  triple_bias_run_chart(blinding_of_participants_and_personnel, title_add = " - drug RCTs")
ggsave(paste0(output_path, "/2A.3C Sensitivity analysis - Blinding of participants and personnel - drug trials.", img_format), width = img_width_triple, height = img_height_triple, dpi = img_dpi, units = "mm")

# Blinding of outcome assessment
trials_bias_available_5plus_drug %>%
  triple_bias_run_chart(blinding_of_outcome_assessment, title_add = " - drug RCTs")
ggsave(paste0(output_path, "/2A.3D Sensitivity analysis - Blinding of outcome assessment - drug trials.", img_format), width = img_width_triple, height = img_height_triple, dpi = img_dpi, units = "mm")

# Incomplete outcome data
trials_bias_available_5plus_drug %>%
  triple_bias_run_chart(incomplete_outcome_data, title_add = " - drug RCTs")
ggsave(paste0(output_path, "/2A.3E Sensitivity analysis - Incomplete outcome data - drug trials.", img_format), width = img_width_triple, height = img_height_triple, dpi = img_dpi, units = "mm")

# Selective reporting
trials_bias_available_5plus_drug %>%
  triple_bias_run_chart(selective_reporting, title_add = " - drug RCTs")
ggsave(paste0(output_path, "/2A.3F Sensitivity analysis - Selective reporting - drug trials.", img_format), width = img_width_triple, height = img_height_triple, dpi = img_dpi, units = "mm")

# Other sources of bias
trials_bias_available_5plus_drug %>%
  triple_bias_run_chart(other_sources_of_bias, title_add = " - drug RCTs")
ggsave(paste0(output_path, "/2A.3G Sensitivity analysis - Other sources of bias - drug trials.", img_format), width = img_width_triple, height = img_height_triple, dpi = img_dpi, units = "mm")
```

**NOTE:** The analysis of high risk of bias for allocation allocation concealment indicates non-variation, however the median yearly proportion is 0% and thus the median cannot be crossed and the runs cannot be broken in the run chart; this analysis must be interpreted with caution.

#### 5.2 Sensitivity analysis #1B - management trials

Create dataset for risk of bias analyses.

```{r}
trials_bias_available_5plus_management <- trials_bias_available %>% 
  filter(review_intervention_category == "Management") %>% 
  filter_years()

str_glue("In total {nrow(trials_bias_available_5plus_management)} management trials will be included in these sensitivity analyses of risk of bias.")
```

Create dataset for sample size analysis.

```{r}
trials_size_available_5plus_management <- trials_size_available %>% 
  filter(review_intervention_category == "Management") %>% 
  filter_years()

str_glue("In total {nrow(trials_size_available_5plus_management)} management trials will be included in this sensitivity analysis of sample sizes.")
```

**Analysis 2B.1: Percentage of overall low risk of bias trials per year - management trials**

```{r}
trials_bias_available_5plus_management %>% 
  run_chart(overall_risk_of_bias, "Low", title = "Overall low risk of bias - management RCTs")

ggsave(paste0(output_path, "/2B.1 Sensitivity analysis - overall low risk of bias - management trials.", img_format), width = img_width, height = img_height, dpi = img_dpi, units = "mm")
```

**Analysis 2B.2: Annual median sample sizes - management trials**

```{r}
trials_size_available_5plus_management %>% 
  run_chart(sample_size, title = "Median (IQR) sample sizes - management RCTs")

ggsave(paste0(output_path, "/2B.2 Sensitivity analysis - sample sizes - management trials.", img_format), width = img_width, height = img_height, dpi = img_dpi, units = "mm")
```

**Analysis 2B.2ph1: Post-hoc: annual mean sample sizes - management trials**

```{r}
trials_size_available_5plus_management %>% 
  run_chart(sample_size, title = "Mean (95% CI) sample sizes - management RCTs (post-hoc analysis)", num_agg_mean = TRUE)

ggsave(paste0(output_path, "/2B.2ph1 Sensitivity analysis - sample sizes - management trials, post-hoc using means.", img_format), width = img_width, height = img_height, dpi = img_dpi, units = "mm")
```

**Analysis 2B.3: Yearly proportion of low, unclear and high risk of bias ratings in each of the seven domains - management trials**

```{r}
# Random sequence generation
trials_bias_available_5plus_management %>%
  triple_bias_run_chart(random_sequence_generation, title_add = " - management RCTs")
ggsave(paste0(output_path, "/2B.3A Sensitivity analysis - Random sequence generation - management trials.", img_format), width = img_width_triple, height = img_height_triple, dpi = img_dpi, units = "mm")

# Allocation concealment
trials_bias_available_5plus_management %>%
  triple_bias_run_chart(allocation_concealment, title_add = " - management RCTs")
ggsave(paste0(output_path, "/2B.3B Sensitivity analysis - Allocation concealment - management trials.", img_format), width = img_width_triple, height = img_height_triple, dpi = img_dpi, units = "mm")

# Blinding of participants and personnel
trials_bias_available_5plus_management %>%
  triple_bias_run_chart(blinding_of_participants_and_personnel, title_add = " - management RCTs")
ggsave(paste0(output_path, "/2B.3C Sensitivity analysis - Blinding of participants and personnel - management trials.", img_format), width = img_width_triple, height = img_height_triple, dpi = img_dpi, units = "mm")

# Blinding of outcome assessment
trials_bias_available_5plus_management %>%
  triple_bias_run_chart(blinding_of_outcome_assessment, title_add = " - management RCTs")
ggsave(paste0(output_path, "/2B.3D Sensitivity analysis - Blinding of outcome assessment - management trials.", img_format), width = img_width_triple, height = img_height_triple, dpi = img_dpi, units = "mm")

# Incomplete outcome data
trials_bias_available_5plus_management %>%
  triple_bias_run_chart(incomplete_outcome_data, title_add = " - management RCTs")
ggsave(paste0(output_path, "/2B.3E Sensitivity analysis - Incomplete outcome data - management trials.", img_format), width = img_width_triple, height = img_height_triple, dpi = img_dpi, units = "mm")

# Selective reporting
trials_bias_available_5plus_management %>%
  triple_bias_run_chart(selective_reporting, title_add = " - management RCTs")
ggsave(paste0(output_path, "/2B.3F Sensitivity analysis - Selective reporting - management trials.", img_format), width = img_width_triple, height = img_height_triple, dpi = img_dpi, units = "mm")

# Other sources of bias
trials_bias_available_5plus_management %>%
  triple_bias_run_chart(other_sources_of_bias, title_add = " - management RCTs")
ggsave(paste0(output_path, "/2B.3G Sensitivity analysis - Other sources of bias - management trials.", img_format), width = img_width_triple, height = img_height_triple, dpi = img_dpi, units = "mm")
```

#### 5.3 Sensitivity analysis #1C - medical device trials

Create dataset for risk of bias analyses.

```{r}
trials_bias_available_5plus_device <- trials_bias_available %>% 
  filter(review_intervention_category == "Device") %>% 
  filter_years()

str_glue("In total {nrow(trials_bias_available_5plus_device)} medical device trials will be included in these sensitivity analyses of risk of bias.")
```

Create dataset for sample size analysis.

```{r}
trials_size_available_5plus_device <- trials_size_available %>% 
  filter(review_intervention_category == "Device") %>% 
  filter_years()

str_glue("In total {nrow(trials_size_available_5plus_device)} medical device trials will be included in this sensitivity analysis of sample sizes.")
```

**Analysis 2C.1: Percentage of overall low risk of bias trials per year - medical device trials**

```{r}
trials_bias_available_5plus_device %>% 
  run_chart(overall_risk_of_bias, "Low", title = "Overall low risk of bias - medical device RCTs")

ggsave(paste0(output_path, "/2C.1 Sensitivity analysis - overall low risk of bias - medical device trials.", img_format), width = img_width, height = img_height, dpi = img_dpi, units = "mm")
```

**Analysis 2C.2: Annual median sample sizes - medical device trials**

```{r}
trials_size_available_5plus_device %>% 
  run_chart(sample_size, title = "Median (IQR) sample sizes - medical device RCTs")

ggsave(paste0(output_path, "/2C.2 Sensitivity analysis - sample sizes - medical device trials.", img_format), width = img_width, height = img_height, dpi = img_dpi, units = "mm")
```

**Analysis 2C.2ph1: Post-hoc: annual mean sample sizes - medical device trials**

```{r}
trials_size_available_5plus_device %>% 
  run_chart(sample_size, title = "Mean (95% CI) sample sizes - medical device RCTs (post-hoc analysis)", num_agg_mean = TRUE)

ggsave(paste0(output_path, "/2C.2ph1 Sensitivity analysis - sample sizes - medical device trials, post-hoc using means.", img_format), width = img_width, height = img_height, dpi = img_dpi, units = "mm")
```

**Analysis 2C.3: Yearly proportion of low, unclear and high risk of bias ratings in each of the seven domains - medical device trials**
```{r}
# Random sequence generation
trials_bias_available_5plus_device %>%
  triple_bias_run_chart(random_sequence_generation, title_add = " - medical device RCTs")
ggsave(paste0(output_path, "/2C.3A Sensitivity analysis - Random sequence generation - medical device trials.", img_format), width = img_width_triple, height = img_height_triple, dpi = img_dpi, units = "mm")

# Allocation concealment
trials_bias_available_5plus_device %>%
  triple_bias_run_chart(allocation_concealment, title_add = "- medical device RCTs")
ggsave(paste0(output_path, "/2C.3B Sensitivity analysis - Allocation concealment - medical device trials.", img_format), width = img_width_triple, height = img_height_triple, dpi = img_dpi, units = "mm")

# Blinding of participants and personnel
trials_bias_available_5plus_device %>%
  triple_bias_run_chart(blinding_of_participants_and_personnel, title_add = " - medical device RCTs")
ggsave(paste0(output_path, "/2C.3C Sensitivity analysis - Blinding of participants and personnel - medical device trials.", img_format), width = img_width_triple, height = img_height_triple, dpi = img_dpi, units = "mm")

# Blinding of outcome assessment
trials_bias_available_5plus_device %>%
  triple_bias_run_chart(blinding_of_outcome_assessment, title_add = " - medical device RCTs")
ggsave(paste0(output_path, "/2C.3D Sensitivity analysis - Blinding of outcome assessment - medical device trials.", img_format), width = img_width_triple, height = img_height_triple, dpi = img_dpi, units = "mm")

# Incomplete outcome data
trials_bias_available_5plus_device %>%
  triple_bias_run_chart(incomplete_outcome_data, title_add = " - medical device RCTs")
ggsave(paste0(output_path, "/2C.3E Sensitivity analysis - Incomplete outcome data - medical device trials.", img_format), width = img_width_triple, height = img_height_triple, dpi = img_dpi, units = "mm")

# Selective reporting
trials_bias_available_5plus_device %>%
  triple_bias_run_chart(selective_reporting, title_add = " - medical device RCTs")
ggsave(paste0(output_path, "/2C.3F Sensitivity analysis - Selective reporting - medical device trials.", img_format), width = img_width_triple, height = img_height_triple, dpi = img_dpi, units = "mm")

# Other sources of bias
trials_bias_available_5plus_device %>%
  triple_bias_run_chart(other_sources_of_bias, title_add = " - medical device RCTs")
ggsave(paste0(output_path, "/2C.3G Sensitivity analysis - Other sources of bias - medical device trials.", img_format), width = img_width_triple, height = img_height_triple, dpi = img_dpi, units = "mm")
```

### 6. Sensitivity analysis #2 - all trials

*For these sensitivity analyses, all trials will be included, including trials from years with < 5 trials*.

This makes the analyses - especially of the risk of bias domains - "fragile" and results should be interpreted with caution.

**Analysis 3.1: Percentage of overall low risk of bias trials per year**

```{r}
trials_bias_available %>% 
  run_chart(overall_risk_of_bias, "Low", title = "Overall low risk of bias - all RCTs") +
  scale_y_continuous(limits = c(0, 100), breaks = 10 * 0:10, minor_breaks = 5 * 0:20, labels = paste0(10 * 0:10, "%"), expand = c(0.015, 0))

ggsave(paste0(output_path, "/3.1 Sensitivity analysis - overall low risk of bias - all trials.", img_format), width = img_width, height = img_height, dpi = img_dpi, units = "mm")
```

**NOTE**: The **y-axis** goes from 0-100% in this plot, unlike the other plots of overall low risk of bias (0-50%) due to larger differences in yearly proportions caused by 1 single trial with overall low risk of bias in 2018.

These results indicate non-random variation, however, the median yearly proportion is 0%. Thus no values below the median are possible, and no crossings are possible in the run chart and runs cnanot be broken. This result should be interpreted with caution.

**Analysis 3.2: Annual median sample sizes**

```{r}
trials_size_available %>% 
  run_chart(sample_size, title = "Median (IQR) sample sizes - all RCTs")

ggsave(paste0(output_path, "/3.2 Sensitivity analysis - sample sizes - all trials.", img_format), width = img_width, height = img_height, dpi = img_dpi, units = "mm")
```

**Analysis 3.2ph1: Annual mean sample sizes**

```{r}
trials_size_available %>% 
  run_chart(sample_size, title = "Mean (95% CI) sample sizes - all RCTs (post-hoc analysis)", num_agg_mean = TRUE)

ggsave(paste0(output_path, "/3.2ph1 Sensitivity analysis - sample sizes - all trials, post-hoc using mean values.", img_format), width = img_width, height = img_height, dpi = img_dpi, units = "mm")
```

Note: for years with only 1 trial, the grey ribbon is missing as a confidence interval cannot be calculated.

**Analysis 3.3: Yearly proportion of low, unclear and high risk of bias ratings in each of the seven domains - all trials**

```{r}
# Random sequence generation
trials_bias_available %>%
  triple_bias_run_chart(random_sequence_generation, title_add = " - all RCTs")
ggsave(paste0(output_path, "/3.3A Sensitivity analysis - Random sequence generation.", img_format), width = img_width_triple, height = img_height_triple, dpi = img_dpi, units = "mm")

# Allocation concealment
trials_bias_available %>%
  triple_bias_run_chart(allocation_concealment, title_add = " - all RCTs")
ggsave(paste0(output_path, "/3.3B Sensitivity analysis - Allocation concealment.", img_format), width = img_width_triple, height = img_height_triple, dpi = img_dpi, units = "mm")

# Blinding of participants and personnel
trials_bias_available %>%
  triple_bias_run_chart(blinding_of_participants_and_personnel, title_add = " - all RCTs")
ggsave(paste0(output_path, "/3.3C Sensitivity analysis - Blinding of participants and personnel.", img_format), width = img_width_triple, height = img_height_triple, dpi = img_dpi, units = "mm")

# Blinding of outcome assessment
trials_bias_available %>%
  triple_bias_run_chart(blinding_of_outcome_assessment, title_add = " - all RCTs")
ggsave(paste0(output_path, "/3.3D Sensitivity analysis - Blinding of outcome assessment.", img_format), width = img_width_triple, height = img_height_triple, dpi = img_dpi, units = "mm")

# Incomplete outcome data
trials_bias_available %>%
  triple_bias_run_chart(incomplete_outcome_data, title_add = " - all RCTs")
ggsave(paste0(output_path, "/3.3E Sensitivity analysis - Incomplete outcome data.", img_format), width = img_width_triple, height = img_height_triple, dpi = img_dpi, units = "mm")

# Selective reporting
trials_bias_available %>%
  triple_bias_run_chart(selective_reporting, title_add = " - all RCTs")
ggsave(paste0(output_path, "/3.3F Sensitivity analysis - Selective reporting.", img_format), width = img_width_triple, height = img_height_triple, dpi = img_dpi, units = "mm")

# Other sources of bias
trials_bias_available %>%
  triple_bias_run_chart(other_sources_of_bias, title_add = " - all RCTs")
ggsave(paste0(output_path, "/3.3G Sensitivity analysis - Other sources of bias.", img_format), width = img_width_triple, height = img_height_triple, dpi = img_dpi, units = "mm")
```

### 7. Overview of the number of included trials per year

Create a table of the number of included trials per year in each analysis (and in the full dataset).

```{r message=FALSE}
# Simple helper function
n_years <- function(.dta){
  name <- quo_name(enquo(.dta))
  .dta %>%
    group_by(trial_year) %>% 
    summarise(!!name := n()) %>% 
    arrange(trial_year)
}

tbl <- n_years(included_trials) %>% # First row is the total number of trials included
  left_join(n_years(trials_bias_available_5plus)) %>% # Primary analysis - bias in >= 5 trials /year
  left_join(n_years(trials_size_available_5plus)) %>% # Primary analysis - size in >= 5 trials/year
  left_join(n_years(trials_bias_available_5plus_drug)) %>% # Sensitivity analysis #1A (drug trials) - bias in >= 5 trials/year
  left_join(n_years(trials_size_available_5plus_drug)) %>% # Sensitivity analysis #1A (drug trials) - size in >= 5 trials/year
  left_join(n_years(trials_bias_available_5plus_management)) %>% # Sensitivity analysis #1B (management trials) - bias in >= 5 trials/year
  left_join(n_years(trials_size_available_5plus_management)) %>% # Sensitivity analysis #1B (management trials) - size in >= 5 trials/year
  left_join(n_years(trials_bias_available_5plus_device)) %>% # Sensitivity analysis #1C (medical device trials) - bias in >= 5 trials/year
  left_join(n_years(trials_size_available_5plus_device)) %>% # Sensitivity analysis #1C (medical device trials) - size in >= 5 trials/year
  left_join(n_years(trials_bias_available)) %>% # Sensitivity analysis #2 - bias in all trials
  left_join(n_years(trials_size_available)) %>% # Sensitivity analysis #2 - size in all trials
  mutate_all(~{ifelse(is.na(.), 0, .)}) # Replace all NAs (missing) with 0

tbl_total <- tbl %>%
  summarise_all(sum) %>% 
  mutate(trial_year = "Total")

tbl %>% 
  mutate(trial_year = as.character(trial_year)) %>% # Convert to string to allow merging with the "Total" row
  bind_rows(tbl_total) %>% # Merge
  print() %>% # Print to RMarkdown document
  write_csv(paste0(output_path, "/trials_per_year_in_each_analysis.csv"))
```

### 8. Additional plots

Create two additional plots visualising the absolute numbers; these plots were not pre-planned, but added during the peer review process.

**Absolute number of low and high overall risk of bias trials per year:**

```{r}
trials_bias_available %>% 
  group_by(trial_year) %>% 
  summarise(Low = sum(overall_risk_of_bias == "Low"),
            High = sum(overall_risk_of_bias == "High")) %>% 
  gather(key = "rob", value = "n", -trial_year) %>% 
  arrange(trial_year) %>% 
  mutate(rob = factor(rob, levels = c("Low", "High"))) %>% 
  ggplot(aes(x = trial_year, y = n, fill = rob)) +
  geom_col(position = "dodge") +
  scale_fill_manual(values = c("#00C000", "#C00000"), name = "Overall risk of bias:") +
  labs(x = NULL, y = "Number of RCTs") +
  scale_x_continuous(limits = c(1975, 2020), breaks = 1975 + 5 * 0:8, minor_breaks = 1975:2020, expand = c(0, 0)) +
  scale_y_continuous(limits = c(0, 50), breaks = 0:5 * 10, minor_breaks = 0:10 * 5, expand = c(0, 0)) + 
  theme(plot.margin = margin (t = 5, l = 10, r = 15, b = 5, unit = "pt"), legend.position = "bottom") # Fix plot margins to ensure all labels are printed and change legend position
ggsave(paste0(output_path, "/Additional plots - absolute number of trials by overall risk of bias per year.", img_format), width = img_width, height = img_height, dpi = img_dpi, units = "mm")
```

**Absolute number of small and large trials per year:**

Small trials are defined as trials including <= the median overall sample size participants per year; large trials are defined as trials including > the median overal sample size participants per year.

```{r}
# Calculate median
med_sample_size <- trials_size_available %>%
  pull(sample_size) %>% 
  median()

# Categorise trials
trials_size_categorised <- trials_size_available %>% 
  mutate(size_cat = ifelse(sample_size <= med_sample_size, "Small", "Large"))
str_glue("The overall median sample size is {med_sample_size}.
         {sum(trials_size_categorised$size_cat == 'Small')} trials are small and {sum(trials_size_categorised$size_cat == 'Large')} trials are large.")

# Plot
trials_size_categorised %>% 
  group_by(trial_year) %>% 
  summarise(Small = sum(size_cat == "Small"),
            Large = sum(size_cat == "Large")) %>% 
  gather(key = "size", value = "n", -trial_year) %>% 
  arrange(trial_year) %>% 
  mutate(size = factor(size, levels = c("Small", "Large"))) %>% 
  ggplot(aes(x = trial_year, y = n, fill = size)) +
  geom_col(position = "dodge") +
  scale_fill_manual(values = c("#C00000", "#00C000"), name = "RCT sizes:") +
  labs(x = NULL, y = "Number of RCTs") +
  scale_x_continuous(limits = c(1975, 2020), breaks = 1975 + 5 * 0:8, minor_breaks = 1975:2020, expand = c(0, 0)) +
  scale_y_continuous(limits = c(0, 30), breaks = 0:6 * 5, expand = c(0, 0)) + 
  theme(plot.margin = margin (t = 5, l = 10, r = 15, b = 5, unit = "pt"), legend.position = "bottom") # Fix plot margins to ensure all labels are printed and change legend position


ggsave(paste0(output_path, "/Additional plots - absolute number of trials by overall size per year.", img_format), width = img_width, height = img_height, dpi = img_dpi, units = "mm")
```

### 9. Date and session info

Save the date and session info for when the script was run to the RMarkdown-HTML-file.

```{r}
date()
sessionInfo()
```