SBERT-Enhanced Loan Approval Prediction Dataset with QNN, Ensemble, and Classical Machine Learning Models.

Published: 17 April 2026 | Version 2 | DOI: 10.17632/k9wszywskf.2
Contributors:
NEEL GANGAR

Description

This dataset contains 4,269 loan applicant records for loan approval prediction research. It includes financial, demographic, and engineered features such as income, loan amount, CIBIL score, asset values, and risk indicators. Categorical applicant profiles were transformed into 384-dimensional semantic embeddings using the Sentence-BERT (all-MiniLM-L6-v2) model. The final dataset combines the numerical financial features with these embeddings to form a high-dimensional feature matrix suitable for machine learning research. It also includes model outputs from Logistic Regression, Random Forest, Gradient Boosting, Decision Tree, Support Vector Machine, an Ensemble Voting Classifier, and a Quantum-Inspired Neural Network (QNN). The dataset is provided to support reproducible research in credit risk assessment, financial machine learning, and automated loan approval prediction systems.
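
As a rough illustration of how numerical features and 384-dimensional Sentence-BERT embeddings can be combined into one feature matrix, the sketch below may help. It is not the contributors' code: the sentence template in profile_to_text, the field names used in the usage note, and the standardization of numeric columns are all assumptions made for illustration. Only the model name (all-MiniLM-L6-v2) and the embedding size come from the description. The embed_profiles helper needs the sentence-transformers package and downloads the model on first use.

```python
import numpy as np

EMBEDDING_DIM = 384  # embedding size stated in the dataset description


def profile_to_text(row):
    """Join categorical fields into one sentence (hypothetical template)."""
    return ", ".join(f"{key}: {value}" for key, value in row.items())


def embed_profiles(texts):
    """Encode profile sentences with Sentence-BERT.

    Requires the sentence-transformers package; imported lazily so the
    rest of this sketch runs without it.
    """
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("all-MiniLM-L6-v2")
    return np.asarray(model.encode(texts))


def build_feature_matrix(numeric, embeddings):
    """Standardize numeric columns (an assumption) and append embeddings."""
    numeric = np.asarray(numeric, dtype=float)
    embeddings = np.asarray(embeddings, dtype=float)
    if numeric.shape[0] != embeddings.shape[0]:
        raise ValueError("numeric and embedding rows must align")
    std = numeric.std(axis=0)
    std[std == 0] = 1.0  # avoid dividing by zero for constant columns
    scaled = (numeric - numeric.mean(axis=0)) / std
    return np.hstack([scaled, embeddings])
```

For example, profile_to_text({"education": "Graduate", "self_employed": "No"}) yields one sentence per applicant, embed_profiles turns a list of such sentences into an array with 384 columns, and build_feature_matrix returns an array with one row per applicant and (number of numeric columns + 384) columns.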

Categories

Machine Learning