Example Stata syntax and data construction for negative binomial time series regression

Published: 31 August 2022 | Version 1 | DOI: 10.17632/3mj526hgzx.1
Sarah Price


We include Stata syntax that creates panel datasets for negative binomial time series regression analyses, as described in our paper "Examining methodology to identify patterns of consulting in primary care for different groups of patients before a diagnosis of cancer: an exemplar applied to oesophagogastric cancer". We also include a sample dataset for clarity.

The variables are defined as follows:

case: a binary variable for case or control status (0 for controls, 1 for cases).
patid: a unique patient identifier.
time_period: a count variable denoting the time period. In this example, 1 denotes 24 months before diagnosis with cancer, and 24 denotes the month of diagnosis with cancer.
ncons: the number of consultations per month.
period1 to period24: 24 unique inflection point variables (one for each month before diagnosis). These are used to test which aggregation period includes the inflection point.

In this example, we study consultation rates over time, but the method could be used to study any countable event, such as the number of prescriptions.
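To illustrate the variable layout described above, the following is a minimal Stata sketch that simulates a toy panel and fits one possible model. It is not the deposited syntax. The number of patients, the simulated counts, the coding of the period variables (here assumed to equal 1 from month k onwards and 0 before), the choice of period19 as the candidate inflection point, and the use of a random-effects xtnbreg model are all assumptions made for illustration.

```stata
* Hypothetical sketch only: simulated data, not the sample dataset
clear
set seed 12345
set obs 200                               // 200 hypothetical patients
gen long patid = _n                       // unique patient identifier
gen byte case = mod(_n, 2)                // alternate cases (1) and controls (0)

* One row per patient per month: 1 = 24 months before diagnosis,
* 24 = month of diagnosis
expand 24
bysort patid: gen byte time_period = _n

* Simulated overdispersed counts (gamma-Poisson mixture); assumption
gen ncons = rpoisson(rgamma(2, (1 + 0.5*case)/2))

* Assumed coding of the inflection point variables:
* period`k' = 1 from month k onwards, 0 before
forvalues k = 1/24 {
    gen byte period`k' = time_period >= `k'
}

* Declare the panel structure and fit one candidate model,
* testing month 19 as the inflection point (arbitrary choice)
xtset patid time_period
xtnbreg ncons i.case##i.period19 c.time_period, re irr
```

Under these assumptions, repeating the final command with each of period2 to period24 in turn and comparing model fit would be one way to examine which period contains the inflection point. The same layout would apply to other countable events by substituting the outcome variable.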



University of Exeter


Early Cancer Detection