Estimating the Causal Effect of Measured Endogenous Variables: Study 1 (Simulation Data)

Published: 08-08-2019 | Version 1 | DOI: 10.17632/5ph6b4nvpr.1
Gwendolin Sajons


This is a simulated data set (n = 10,000). It contains four variables q, t, e, and u, each drawn independently from a normal distribution with a mean of 0 and a standard deviation of 1. Further, there are two variables x and y, which depend on the other variables as follows:

    x = 0.5*q + 0.3*t + u
    y = 0.4*x + 0.6*q + e

For our analysis, y represents the dependent variable, x an endogenous explanatory variable, q an omitted variable, and t an exogenous instrumental variable. u and e are the equations’ error terms.
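The size of the endogeneity problem implied by this design can be worked out by hand. The following is a sketch, assuming that q, t, e, and u are mutually independent (as in the Stata code), that the instrument t enters the equation for x with coefficient 0.3, and that y is regressed on x alone. The estimator symbols are notation introduced here for illustration, not taken from the data set description.

```latex
\begin{align*}
\operatorname{Var}(x) &= 0.5^2 + 0.3^2 + 1 = 1.34 \\
\operatorname{Cov}(x,\; 0.6q + e) &= 0.6 \cdot 0.5 \cdot \operatorname{Var}(q) = 0.30 \\
\operatorname{plim}\,\hat{\beta}_{\mathrm{OLS}} &= 0.4 + \frac{0.30}{1.34} \approx 0.624 \\
\operatorname{plim}\,\hat{\beta}_{\mathrm{IV}} &= \frac{\operatorname{Cov}(t, y)}{\operatorname{Cov}(t, x)}
  = \frac{0.4 \cdot 0.3}{0.3} = 0.4
\end{align*}
```

Under these assumptions, omitting q pushes the OLS coefficient on x upward by roughly 0.22 in large samples, while instrumenting x with t recovers the true coefficient of 0.4.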


Steps to reproduce

Stata-code:

    set seed 123
    set obs 10000
    gen q=rnormal(0,1)
    gen t=rnormal(0,1)
    gen e=rnormal(0,1)
    gen u=rnormal(0,1)
    gen x=0.5*q+0.3*t+u
    gen y=0.4*x+0.6*q+e
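
Once the variables have been generated, the roles described above (y as outcome, x as endogenous regressor, q as omitted variable, t as instrument) can be explored with standard Stata estimation commands. The following is a minimal sketch, assuming the generated data are in memory. The choice of estimators and the order of the steps are illustrative and not necessarily the analysis the author ran.

```stata
* Naive OLS: q is omitted, so the coefficient on x is expected to be biased upward
regress y x

* Benchmark that is only possible in a simulation: control for q directly
regress y x q

* IV (2SLS): instrument the endogenous regressor x with t
ivregress 2sls y (x = t)

* First-stage diagnostics to check the strength of the instrument
estat firststage
```

Given the data-generating equations, the benchmark and IV estimates of the coefficient on x should lie near the true value of 0.4, up to sampling error, while the naive OLS estimate should be noticeably larger.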