Assessing the influence of policies and policy mixes on corporate green innovation: Insights from interpretable machine learning
Description
This study uses a sample of Chinese non-financial listed companies covering the period 2011 to 2021.
Steps to reproduce
The sample was constructed as follows:
1) Financial firms were excluded.
2) ST and *ST companies were excluded.
3) Missing values were imputed using random forest imputation; observations with missing values were therefore not dropped.
4) All continuous variables were winsorized at the 1st and 99th percentiles to reduce the impact of extreme values.
The resulting sample contains 32,559 firm-year observations from 4,439 Chinese listed companies. The data are mainly from the CSMAR database. Feature normalization was omitted in light of the winsorization and logarithmic transformations already applied. The analysis is performed at the firm-year level, and all features are lagged by one year to reduce reverse causality concerns. The dataset was split into a training set (70%) and a test set (30%).
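The winsorization, one-year lagging, and 70/30 split described above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' code. The field names (`firm`, `year`, `target`, `x`), the linear-interpolation quantile rule, and the use of a seeded random shuffle for the split are all assumptions; the description does not state how the split was drawn. Random forest imputation is not shown.

```python
import random
from collections import defaultdict


def winsorize(values, lower=0.01, upper=0.99):
    """Clip values to their empirical [lower, upper] quantiles.

    Quantiles use linear interpolation between order statistics
    (an assumption; the source does not specify the rule).
    """
    ordered = sorted(values)

    def quantile(q):
        pos = q * (len(ordered) - 1)
        lo = int(pos)
        hi = min(lo + 1, len(ordered) - 1)
        return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)

    lo_cut, hi_cut = quantile(lower), quantile(upper)
    return [min(max(v, lo_cut), hi_cut) for v in values]


def lag_features(rows, feature_keys):
    """Pair each firm-year outcome with the same firm's prior-year features."""
    by_firm = defaultdict(dict)
    for row in rows:
        by_firm[row["firm"]][row["year"]] = row

    lagged = []
    for firm, years in by_firm.items():
        for year, row in sorted(years.items()):
            prev = years.get(year - 1)
            if prev is None:
                continue  # no prior-year record, so nothing to lag from
            new_row = {"firm": firm, "year": year, "target": row["target"]}
            for key in feature_keys:
                new_row[key] = prev[key]
            lagged.append(new_row)
    return lagged


def split_train_test(rows, train_share=0.7, seed=0):
    """Shuffle with a fixed seed and split (random split is an assumption)."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(round(train_share * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


# Hypothetical toy panel: two firms, one feature "x", one outcome "target".
panel = [
    {"firm": "A", "year": 2011, "x": 1.0, "target": 0},
    {"firm": "A", "year": 2012, "x": 2.0, "target": 1},
    {"firm": "A", "year": 2013, "x": 3.0, "target": 1},
    {"firm": "B", "year": 2012, "x": 5.0, "target": 0},
]

# Winsorize the feature across the whole panel, then lag it by one year.
clipped = winsorize([row["x"] for row in panel])
for row, value in zip(panel, clipped):
    row["x"] = value

lagged_panel = lag_features(panel, ["x"])
train, test = split_train_test(lagged_panel)
```

Lagging drops each firm's first observed year, because no prior-year features exist for it. With panel data, splitting by year or by firm instead of by a random shuffle is a common alternative; which one applies here is not stated in the description.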