Simulated Credit Default data

Published: 22 May 2023 | Version 1 | DOI: 10.17632/fs4wm9hfxd.1
Jose Vicente Alonso Salgado


The dataset consists of 100k records simulating the good (GB=0) or bad (GB=1) credit behavior of personal loans, based on explanatory variables such as account balance, payment history, loan purpose, other available assets, credit duration, debtor's age, home type and others.


Steps to reproduce

The data are simulated from the German Credit dataset, using the version from The Pennsylvania State University. The Creditability target variable has been renamed GB (1 for default, 0 otherwise), and other explanatory variables have been given shorter names: Balance (Account Balance), PaymentHistory (Payment Status of Previous Credit), OtherAssets (Value Savings/Stocks), JobDuration (Length of current employment), CivilStatus (Sex & Marital Status), LiquidAssets (Most valuable available asset), HomeType (Type of apartment), OutsideCredits (Concurrent Credits), CreditDuration (Duration of Credit (month)).

The 100k records were simulated by Monte Carlo from a generalized linear model (GLM) fitted to the German Credit dataset with the following explanatory variables: Balance, PaymentHistory, Purpose, OtherAssets, JobDuration, CivilStatus, LiquidAssets, HomeType, OutsideCredits, CreditDuration, CreditAmount and Age.
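The simulation step can be illustrated with a minimal Python sketch. It is not the author's code and rests on several assumptions that the description does not confirm: that the GLM is binomial with a logit link, that covariates are obtained by resampling existing records, and that only two of the twelve variables are used. The coefficient values, the seed records and the function names `default_probability` and `simulate` are hypothetical and chosen purely for illustration.

```python
import math
import random

# Hypothetical coefficients, for illustration only. The real values come from
# the GLM fitted to the German Credit data and are not given in the description.
INTERCEPT = -1.0
COEFS = {
    "CreditDuration": 0.03,  # per month of credit duration
    "Age": -0.02,            # per year of debtor age
}


def default_probability(record):
    """Return P(GB=1) for one record.

    Assumes a binomial GLM with a logit link, so the probability is the
    inverse logit of the linear predictor.
    """
    eta = INTERCEPT + sum(coef * record[name] for name, coef in COEFS.items())
    return 1.0 / (1.0 + math.exp(-eta))


def simulate(seed_records, n, rng):
    """Draw n simulated records.

    Covariates are resampled with replacement from seed_records (an assumed
    scheme), then GB is drawn as a Bernoulli variable with the model probability.
    """
    simulated = []
    for _ in range(n):
        record = dict(rng.choice(seed_records))
        p = default_probability(record)
        record["GB"] = 1 if rng.random() < p else 0
        simulated.append(record)
    return simulated


# Made-up seed records standing in for rows of the original dataset.
seed_records = [
    {"CreditDuration": 12, "Age": 45},
    {"CreditDuration": 48, "Age": 23},
    {"CreditDuration": 24, "Age": 35},
]

rng = random.Random(0)  # fixed seed so the draw is reproducible
simulated = simulate(seed_records, 1000, rng)
default_rate = sum(r["GB"] for r in simulated) / len(simulated)
print(f"simulated default rate: {default_rate:.3f}")
```

With a fitted model, the same two steps apply: compute the predicted default probability for each record, then draw GB from a Bernoulli distribution with that probability.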