Machine Learning for Prediction of Glycemic Control in Diabetes Mellitus

Published: 24 January 2024 | Version 2 | DOI: 10.17632/rr4rzzrjfc.2
Contributors:
Kemal Hakan Gulkesen

Description

This study aims to predict glycemic control status three years after diagnosis and to document the most important factors for glycemic control in diabetes mellitus. The data belong to people living in Istanbul. We identified patients newly diagnosed with diabetes mellitus in Istanbul Province in 2017. A patient was considered to have diabetes given a diabetes diagnosis (ICD-10 codes E10-E14), a prescription of antidiabetic medication (except metformin), or an HbA1c over 6.5%. In addition, all patients were checked for an initial serum creatinine measurement, an initial lipid profile, and at least four HbA1c measurements (roughly one per year). The patients were divided into two groups according to their HbA1c profile: under control (the last two HbA1c values are under 7%) or uncontrolled.

The variables in the dataset are:

id: Patient id no
HbA1c: Serum level at diagnosis (%)
Hba1c_change: %, level at first year minus level at diagnosis
sex: 1: female, 2: male
age: years
LDL: mg/dL, level at diagnosis
Cholesterol: mg/dL, level at diagnosis
HDL: mg/dL, level at diagnosis
Creatinine: mg/dL, level at diagnosis
Triglyceride: mg/dL, level at diagnosis

infectious_diseases;Malign_neoplasms;Obesity;Thyroid_dis;neoplasms_unknown;anemia;vitamin_deficiency;lipoprotein_met_dis;hematologic_dis;endocrine_other;bipolar_affective_dis;depression;anxiety_dis;Other_mental_dis;neuropathies;diabetic_nueropathy;nervous_sys_dis;cataract;retinopathy;refraction_dis;impacted_cerumen;tinnitus_h93_1;eye_other;otitis_externa_h60;mastoid_h60_h95;hypertension_i10;ischemic_heart_dis;cardiomyopathies;cerebrovascular;other_circulatory;respiratory_sys;oral_dis;gastro_oes_reflux;dyspepsia;digestive_sys_dis;skin_dis;musculoskeletal_dis;nephropaties;kidney_failure;other_genitourinary;pregnancy;birth;ceserian_multiple;other_pregnancy: 1: present in the first year of diagnosis, 0: absent in the first year of diagnosis

digestive_drugs;antiobesity;other_digestive;hematologic_drugs;cardiovascular_drugs;lipid_modifying;dermatologic_drugs;gynecologic_drugs;sex_hormones;systemic_hormones;glucagon;calcium_homeostasis_drugs;antiinfectives;vaccines;antineoplastics;endocrin_drugs;immunostimulants;immunosupresants;musculosceletal_drugs;anesthetics;analgesics;antiepileptics;antiparkinson;antipsychotics;anxiolytics;pshycoanaleptics;other_nervous_drugs;antiparasitic;respiratory_sys_drugs;eye_ear_drugs;various_drugs: 1: prescribed in the first year of diagnosis, 0: not prescribed in the first year of diagnosis

akarboz;dapagliflozin;eksenatid;gliklazid;glimepirid;glipizid;linagliptin;nateglinid;pioglitazon_hcl;repaglinide;saksagliptin;sitagliptin;vildagliptin: prescribed amount in the first year after diagnosis (gram/year)

insulin_aspart;insulin_detemir;insulin_glarjin;insulin_glusilin;insulin_lispro;insulin_nph;insulin_reguler: prescribed amount in the first year after diagnosis (1000 IU/year)

metformin_hcl: prescribed amount in the first year after diagnosis (kg/year)

Glycemic_control: 0: Under control, 1: Poor control
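The inclusion criteria and the grouping rule described above can be illustrated with a minimal Python sketch. It is not the authors' extraction code, which ran inside the e-Nabız environment. The function names, the argument layout, and the assumption that HbA1c values arrive in chronological order are all hypothetical and serve only to make the rules concrete.

```python
from typing import Sequence

# ICD-10 code prefixes E10-E14, as listed in the description.
DIABETES_ICD10_PREFIXES = ("E10", "E11", "E12", "E13", "E14")
HBA1C_DIAGNOSIS_THRESHOLD = 6.5  # percent; "HbA1c over 6.5"
HBA1C_CONTROL_THRESHOLD = 7.0    # percent; "under 7"


def meets_diabetes_criteria(
    icd10_codes: Sequence[str],
    antidiabetic_drugs: Sequence[str],
    hba1c_values: Sequence[float],
) -> bool:
    """Return True if any one of the three stated criteria holds.

    Hypothetical helper: the drug names are assumed to be plain strings
    such as "gliklazid" or "metformin_hcl".
    """
    has_diagnosis = any(
        code.upper().startswith(DIABETES_ICD10_PREFIXES) for code in icd10_codes
    )
    # Metformin alone does not count as evidence of diabetes.
    has_drug = any(
        not drug.lower().startswith("metformin") for drug in antidiabetic_drugs
    )
    has_high_hba1c = any(v > HBA1C_DIAGNOSIS_THRESHOLD for v in hba1c_values)
    return has_diagnosis or has_drug or has_high_hba1c


def glycemic_control(hba1c_values: Sequence[float]) -> int:
    """Return 0 (under control) or 1 (poor control).

    Assumes the values are in chronological order and that at least four
    measurements exist, as required for inclusion in the dataset.
    """
    if len(hba1c_values) < 4:
        raise ValueError("at least four HbA1c measurements are required")
    last_two = hba1c_values[-2:]
    return 0 if all(v < HBA1C_CONTROL_THRESHOLD for v in last_two) else 1


def hba1c_change(at_diagnosis: float, at_first_year: float) -> float:
    """Hba1c_change: level at first year minus level at diagnosis."""
    return at_first_year - at_diagnosis
```

For example, a patient with HbA1c readings of 9.1, 7.5, 6.8 and 6.5 would be labelled 0 (under control) under this sketch, because both of the last two values are below 7.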

Steps to reproduce

In Turkey, a central, nationwide EHR system named e-Nabız (https://enabiz.gov.tr/) is used in routine healthcare services. Because data quality is better in Istanbul Province, Istanbul’s data was used for this dataset. The Akdeniz University Clinical Research Ethical Committee granted ethical approval. The e-Nabız system went live in 2015, and by 2017 the data was comparatively more comprehensive. By the end of 2017, 94.8% of the population was covered by the e-Nabız system, and 15,029,231 people were living in Istanbul.

Predefined HL7 v3 packages were used to transfer the original data from healthcare facilities to the e-Nabız database. The data was queried through a business intelligence platform, Turboard (v2020.07, E-Kalite Ltd., Ankara, Turkey), based on Apache Impala (v.3.2.0). The Turboard platform provides tools for summarizing and displaying data, but it does not allow specific patient data to be viewed. It does support the export of de-identified data as Microsoft Excel tables.

In Istanbul Province, we identified individuals newly diagnosed with diabetes mellitus in 2017. A patient was considered to have the disease given a diagnosis of diabetes (ICD-10 codes E10–E14), the prescription of an antidiabetic drug (apart from metformin), or an HbA1c above 6.5%. In addition, all people with diabetes were checked for an initial serum creatinine measurement, an initial lipid profile, at least four HbA1c measurements (roughly one per year), and being alive at the end of 2020. Based on their HbA1c profile, the patients were split into two groups: under control (the last two HbA1c values are under 7%) and poor control.

For every patient, 105 variables were extracted from the e-Nabız system and used as independent variables. These variables were age at diagnosis, sex, the first serum HbA1c result, the difference between the HbA1c around the 12th month and the first HbA1c result, other lab results (5 variables covering lipid profile and creatinine, first value at or after DM diagnosis), comorbidities (44 variables, binary, presence or absence of the disease in the first year of the disease), antidiabetics (21 variables, extracted as total dose in the year after diagnosis), and other drugs (31 variables, binary, presence of a prescription in the first year of DM). ICD-10 was used to classify comorbidities, and ATC classifications were used to classify medications.

The data was checked for noise and for extreme or irrational values. All doubtful data was deleted. Patients with missing data were not included in the final dataset.

The Glycemic_control variable is the dependent variable (0: Under control, 1: Poor control). All other variables (the independent variables, excluding patient id) may affect the dependent variable. The independent variables were obtained at diagnosis or in the first year after diagnosis. The dependent variable was obtained three years after diagnosis.
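As a quick consistency check on the variable counts given above (4 + 5 + 44 + 21 + 31 = 105), the following Python sketch groups the column names from the Description and separates the features from the target. The column names are copied from the Description. Whether the header of the actual data file matches them exactly is an assumption, and `split_row` is a hypothetical helper rather than part of the dataset.

```python
# Column names copied from the dataset description; the header of the
# actual data file is assumed (not verified) to match them exactly.
DEMOGRAPHIC_AND_HBA1C = "age;sex;HbA1c;Hba1c_change".split(";")
LABS = "LDL;Cholesterol;HDL;Creatinine;Triglyceride".split(";")
COMORBIDITIES = (
    "infectious_diseases;Malign_neoplasms;Obesity;Thyroid_dis;neoplasms_unknown;"
    "anemia;vitamin_deficiency;lipoprotein_met_dis;hematologic_dis;endocrine_other;"
    "bipolar_affective_dis;depression;anxiety_dis;Other_mental_dis;neuropathies;"
    "diabetic_nueropathy;nervous_sys_dis;cataract;retinopathy;refraction_dis;"
    "impacted_cerumen;tinnitus_h93_1;eye_other;otitis_externa_h60;mastoid_h60_h95;"
    "hypertension_i10;ischemic_heart_dis;cardiomyopathies;cerebrovascular;"
    "other_circulatory;respiratory_sys;oral_dis;gastro_oes_reflux;dyspepsia;"
    "digestive_sys_dis;skin_dis;musculoskeletal_dis;nephropaties;kidney_failure;"
    "other_genitourinary;pregnancy;birth;ceserian_multiple;other_pregnancy"
).split(";")
OTHER_DRUGS = (
    "digestive_drugs;antiobesity;other_digestive;hematologic_drugs;"
    "cardiovascular_drugs;lipid_modifying;dermatologic_drugs;gynecologic_drugs;"
    "sex_hormones;systemic_hormones;glucagon;calcium_homeostasis_drugs;"
    "antiinfectives;vaccines;antineoplastics;endocrin_drugs;immunostimulants;"
    "immunosupresants;musculosceletal_drugs;anesthetics;analgesics;antiepileptics;"
    "antiparkinson;antipsychotics;anxiolytics;pshycoanaleptics;other_nervous_drugs;"
    "antiparasitic;respiratory_sys_drugs;eye_ear_drugs;various_drugs"
).split(";")
ANTIDIABETICS = (
    "akarboz;dapagliflozin;eksenatid;gliklazid;glimepirid;glipizid;linagliptin;"
    "nateglinid;pioglitazon_hcl;repaglinide;saksagliptin;sitagliptin;vildagliptin;"
    "insulin_aspart;insulin_detemir;insulin_glarjin;insulin_glusilin;"
    "insulin_lispro;insulin_nph;insulin_reguler;metformin_hcl"
).split(";")

FEATURE_GROUPS = {
    "demographic_and_hba1c": DEMOGRAPHIC_AND_HBA1C,
    "labs": LABS,
    "comorbidities": COMORBIDITIES,
    "antidiabetics": ANTIDIABETICS,
    "other_drugs": OTHER_DRUGS,
}
ALL_FEATURES = [name for group in FEATURE_GROUPS.values() for name in group]
TARGET = "Glycemic_control"


def split_row(row):
    """Split one record (a dict keyed by column name) into features and target.

    The patient id is dropped because it is not an independent variable.
    """
    features = {name: row[name] for name in ALL_FEATURES}
    return features, int(row[TARGET])


if __name__ == "__main__":
    for group_name, names in FEATURE_GROUPS.items():
        print(group_name, len(names))
    print("total independent variables:", len(ALL_FEATURES))
```

Keeping the groups separate makes it straightforward to treat the binary comorbidity and drug flags differently from the continuous lab values and dose totals when preparing the data for a model.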

Institutions

Akdeniz Universitesi

Categories

Diabetes Mellitus
