Prediction of Venous Thromboembolism in Diverse Populations Using Machine Learning and Electronic Health Records

Published: 25 October 2023 | Version 6 | DOI: 10.17632/tkwzysr4y6.6
Robert Chen


Resources needed to train, test, and analyze the performance of gradient boosting models used to predict venous thromboembolism (VTE) from electronic health record (EHR) data.

"Code for analyses" folder: Code we used for the analyses in our paper.
Prediction.ipynb: Code needed to run the trained models.
Small, Medium, and Large.xlsx: Excel templates for correctly formatting data before generating predictions.
The trained models are also included as a compressed archive, which is 0.4 GB once unzipped.
Analysis.ipynb: Code used to train the models.
Dependencies: Python 3.10.9; Pandas 1.5.1; LightGBM 3.3.2.
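Because the notebooks were developed against specific versions, it may help to compare the local environment with the listed dependencies before opening them. The following is a minimal sketch, not part of the dataset: the function name `check_environment` and the PyPI distribution names `pandas` and `lightgbm` are assumptions, and an exact version match is a conservative choice rather than a stated requirement.

```python
import sys
from importlib import metadata

# Versions copied from the dependency list in the description.
EXPECTED_PYTHON = "3.10.9"
# Assumed PyPI distribution names for Pandas and LightGBM.
EXPECTED_PACKAGES = {"pandas": "1.5.1", "lightgbm": "3.3.2"}


def check_environment(expected_python=EXPECTED_PYTHON,
                      expected_packages=EXPECTED_PACKAGES):
    """Return a list of mismatch messages (empty if everything matches)."""
    problems = []

    # Compare the running interpreter's major.minor.micro version.
    running = ".".join(str(part) for part in sys.version_info[:3])
    if running != expected_python:
        problems.append(f"Python {running} found, {expected_python} listed")

    # Compare each installed package version with the listed one.
    for name, wanted in expected_packages.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems.append(f"{name} is not installed ({wanted} listed)")
            continue
        if found != wanted:
            problems.append(f"{name} {found} found, {wanted} listed")

    return problems


if __name__ == "__main__":
    for line in check_environment() or ["Environment matches the listed versions."]:
        print(line)
```

A mismatch does not necessarily mean the notebooks will fail; the check only reports where the local setup differs from the versions listed above.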



Icahn School of Medicine at Mount Sinai


Machine Learning