Durable Cell and Gene Therapy Potential and Financial Impact

Published: 10 December 2021 | Version 1 | DOI: 10.17632/djm65zrhmt.1
Colin Young


Supporting data for Young CM, Quinn C, Trusheim MR, "Durable Cell and Gene Therapy Potential Patient and Financial Impact: US Projections of Product Approvals, Patients Treated, and Product Revenues", Drug Discovery Today, September 17, 2021. Available from: https://doi.org/10.1016/j.drudis.2021.09.001

The data are described in Young CM, Quinn C, Trusheim MR, "Data for Modelling US Projections of Product Approvals, Patients Treated, and Product Revenues for Durable Cell and Gene Therapies", Data in Brief, submitted September 21, 2021.

Taking the pipeline of clinical trials for durable cellular and gene therapy products as of December 31, 2020 as a starting point, we use a Markov Chain Monte Carlo model to estimate a distribution for US FDA marketing approvals for those products. Further, using epidemiological data and estimated prices, we infer distributions for total patients treated and revenues generated.

The data presented include forecasting data (FD), parameter estimation data (PED), and supplementary data (SS).

Forecasting data comprise the set of clinical trials used in our forecasts (FD1), a set of preclinical trials (FD2) used for sensitivity analysis, estimates of the relative prevalence of genes or antigens associated with specific diseases (FD3), and estimates of the treatable incidence and prevalence for those diseases (FD4). FD4 also includes our assumptions regarding the potential market penetration and adoption rates for approved products.

Parameter estimation data comprise clinical trials data for specific products in four categories: chimeric antigen receptor T-cells (CAR-Ts) and T-cell receptors (TCRs) (PED1); other oncology (PED2); gene therapy (PED3); and cellular therapy (PED4). These data are used to estimate "time in phase" and "trial success rates", the stochastic parameters used in our Markov Chain model.

Supplementary File S1 contains a bibliography organized by disease and, within diseases, by antigen or gene.
Supplementary File S2 contains examples of code (mostly SQL) used to extract and process data from the Clinical Trials Transformation Initiative's database for Aggregate Analysis of ClinicalTrials.gov (AACT). (AACT is a dynamic entity and was substantially changed on November 15, 2021.) Supplementary File S3 contains participant inclusion and exclusion criteria for clinical trials in our sample. Supplementary File S4 is an Excel workbook containing code useful for extracting NCT identifiers from Pharmaprojects data.

Clinical trials data in FD1 were extracted from the NIH/FDA ClinicalTrials.gov database (http://www.clinicaltrials.gov).

Preclinical data in FD2 were identified using Informa PLC's Pharmaprojects, published 2021 (https://pharmaintelligence.informa.com/products-and-services/data-and-analysis/pharmaprojects).

Clinical trials in the parameter estimation datasets (PED1-4) were identified using Informa PLC's Pharmaprojects, published 2021. Data for the identified trials were extracted from AACT.
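
As a rough illustration of how "time in phase" and "trial success rates" can drive a distribution of approvals, the sketch below runs a simple forward Monte Carlo simulation over a chain of development phases. It is not the authors' model and may differ from it in structure. The phase names, success probabilities, mean durations, the exponential time-in-phase assumption, and the function names are all hypothetical placeholders, not values or methods taken from the dataset.

```python
import random

# Hypothetical phases and parameters, for illustration only.
# None of these numbers come from the PED1-4 or FD1-4 files.
PHASES = ["Phase 1", "Phase 2", "Phase 3", "Filing"]
SUCCESS_PROB = {"Phase 1": 0.6, "Phase 2": 0.4, "Phase 3": 0.6, "Filing": 0.9}
MEAN_YEARS_IN_PHASE = {"Phase 1": 2.0, "Phase 2": 3.0, "Phase 3": 3.0, "Filing": 1.0}


def simulate_product(start_phase, rng):
    """Walk one product through the remaining phases.

    Returns the years elapsed until approval, or None if the product fails.
    Time in each phase is drawn from an exponential distribution (an assumption).
    """
    elapsed = 0.0
    for phase in PHASES[PHASES.index(start_phase):]:
        elapsed += rng.expovariate(1.0 / MEAN_YEARS_IN_PHASE[phase])
        if rng.random() >= SUCCESS_PROB[phase]:
            return None  # product fails in this phase
    return elapsed


def simulate_approvals(pipeline, horizon_years, n_runs, seed=0):
    """Return one approval count per simulation run.

    `pipeline` is a list of current phases, one entry per product.
    """
    rng = random.Random(seed)
    counts = []
    for _ in range(n_runs):
        approved = 0
        for start_phase in pipeline:
            years = simulate_product(start_phase, rng)
            if years is not None and years <= horizon_years:
                approved += 1
        counts.append(approved)
    return counts


if __name__ == "__main__":
    # Toy pipeline: 5 products in Phase 1, 3 in Phase 2, 2 in Phase 3.
    toy_pipeline = ["Phase 1"] * 5 + ["Phase 2"] * 3 + ["Phase 3"] * 2
    runs = simulate_approvals(toy_pipeline, horizon_years=10, n_runs=1000)
    print("mean approvals within 10 years:", sum(runs) / len(runs))
```

The list returned by `simulate_approvals` is an empirical distribution of approval counts. In the same spirit, patient and revenue distributions could be derived by combining each simulated approval with incidence, penetration, and price assumptions such as those described for FD4.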



Massachusetts Institute of Technology


Disease Epidemiology, Health Economics, Markov Chain Monte Carlo