Bankruptcy Prediction Using Deep Learning with Textual Information

Published: 8 February 2023 | Version 1 | DOI: 10.17632/stf3kg7fw3.1
Kaiyan Lin, Hyejin Ku, Mingfu Wang


The database consists of textual data and numerical data. The textual data comprises Risk Factors (Rifa) and Management’s Discussion and Analysis of Financial Condition and Results of Operations (MD&A), which are Item 1A and Item 7 of the annual 10-K filings, respectively, together with text data from Twitter. The numerical data (39FV_2021June(2).csv) contains financial ratios and accounting variables. The 10-K disclosure text and the tweet data are in the NUM10K and NUMtw files. The textual data can be used to train bankruptcy prediction models on its own or together with the numerical data. Our study aims to investigate methods of managing textual data from various sources and to develop a model that combines textual and numerical data to increase predictive power. Data sources: Compustat North America, the S.E.C. website, and Twitter.
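To illustrate what combining textual and numerical data can look like at the feature level, here is a minimal sketch in Python. It is not the authors' deep learning model: it substitutes a simple bag-of-words count for a learned text representation, and the vocabulary, sample sentence, ratio values, and function names are all hypothetical and do not reflect the actual columns or contents of the dataset files.

```python
from collections import Counter

def bag_of_words(text, vocabulary):
    """Count how often each vocabulary word occurs in the text (case-insensitive)."""
    counts = Counter(text.lower().split())
    return [float(counts[word]) for word in vocabulary]

def combine_features(text, numeric, vocabulary):
    """Concatenate the text feature vector with the numerical features."""
    return bag_of_words(text, vocabulary) + list(numeric)

# Hypothetical inputs, invented for illustration only.
vocabulary = ["risk", "debt", "growth"]
text = "Debt risk increased while risk of default grew"
numeric = [0.42, 1.7]  # stand-ins for two financial ratios

features = combine_features(text, numeric, vocabulary)
print(features)
```

The combined vector could then be passed to any classifier. In a deep learning setting, the bag-of-words step would typically be replaced by a learned text encoder whose output is concatenated with the numerical inputs in the same way.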



Bankruptcy, Machine Learning