Data for Machine Learning-aided Computational Fragment-based Design of Small Molecules for Hypertension Treatment

Published: 28 May 2024 | Version 1 | DOI: 10.17632/brgzpd5wj4.1


The study sought to develop a machine learning-aided computational drug discovery system to generate new lead drug molecules for hypertension treatment by targeting the renin-angiotensin-aldosterone system (RAAS). The main agents that act on the RAAS are commonly classified as Angiotensin-Converting Enzyme Inhibitors (ACEIs) or Angiotensin II Receptor Blockers (ARBs); the objective was therefore to generate new lead ACEIs and ARBs to treat hypertension through the RAAS. To this end, we developed a seven-phase computational fragment-based drug design system aided by machine learning, which guides the process of using existing hypertension molecules as the basis for discovering new hypertension lead (candidate) molecules. The output of this study is a dataset of newly generated lead ACEI and ARB molecules. The Input Data folder below contains all the files used to generate this dataset; the dataset itself is in the Output Data folder below.


Steps to reproduce

The code files containing the steps to reproduce this method are available in the accompanying code repository. The steps to reproduce the data are outlined below.

# 1. INSTALLATION

1.1. Install Anaconda
Install Anaconda on your computer, following the installation instructions for your operating system.

1.2. Install the FBDD Conda environment
- Open the Anaconda Prompt from the Start menu.
- Download the FBDD_environment.yml file and save it to your local drive.
- Run the following command in the Anaconda Prompt: conda env create -f FBDD_environment.yml

1.3. Open Jupyter Notebook
Open Jupyter Notebook by running "jupyter notebook" in the Anaconda Prompt or terminal, or by opening Anaconda Navigator and clicking the Jupyter Notebook icon.

# 2. INPUT DATA

Download the "Data" folder, which contains the data files required for the model, and save it to your local drive.
Example: The path used on our machine was "/Users/odilehasa/Hypertension/Final_Experiments/FINAL - October/Data"
Change this path to the location of the "Data" folder on your computer.

# 3. OUTPUT

Create an "Output" folder on your local drive, where all the output files will be stored.
Example: The path used on our machine was "/Users/odilehasa/Hypertension/Final_Experiments/FINAL - October/Output"
Change this path to the location of the "Output" folder on your computer.

# 4. EXECUTING THE CODE WORKBOOKS

Jupyter Notebook was used as the primary IDE for this model, so all the code files are Jupyter notebooks (.ipynb). They can also be executed on platforms other than Jupyter Notebook. The notebooks are numbered to indicate the order in which they should be executed. The outputs of each phase should be stored in the "Output" folder created above.

Order of execution:
1. Phase_1
2. Phase_2
3. Phase_3
4. Phase_4
5. Phase_5
6. Phase_6
7. Phase_7
8. Supp_0
9. Supp_1
10. Supp_2
11. Supp_3

### Important Note:
Change every file path referenced in the notebooks to the directory where your own folders are stored. Example: The path used on our machine was "/Users/odilehasa/Hypertension/Final_Experiments/FINAL - October/Output". Replace it with the location of the "Output" folder on your computer. Do the same for the "Data" folder.
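The path setup and execution order described in the steps above can be sketched in Python as follows. This is a minimal illustration, not code taken from the notebooks: the names `resolve_project_dirs` and `NOTEBOOK_ORDER` are hypothetical, and the sketch assumes that the "Data" and "Output" folders sit side by side under one base directory of your choosing.

```python
from pathlib import Path

# Execution order as listed in the steps above: Phase_1..Phase_7, then Supp_0..Supp_3.
# (Hypothetical constant; the notebooks themselves may not define it.)
NOTEBOOK_ORDER = [f"Phase_{i}" for i in range(1, 8)] + [f"Supp_{i}" for i in range(4)]


def resolve_project_dirs(base_dir):
    """Return (data_dir, output_dir) located under base_dir.

    Assumes the downloaded "Data" folder already exists under base_dir.
    The "Output" folder is created if it is missing.
    """
    base = Path(base_dir).expanduser()
    data_dir = base / "Data"
    output_dir = base / "Output"

    # Fail early with a clear message if the input data has not been downloaded.
    if not data_dir.is_dir():
        raise FileNotFoundError(f"Data folder not found: {data_dir}")

    # Create the output folder once so every notebook can write into it.
    output_dir.mkdir(parents=True, exist_ok=True)
    return data_dir, output_dir
```

Calling `resolve_project_dirs` once with your own base directory, and building every file path from the two returned values, would keep the machine-specific path in a single place instead of requiring an edit at each reference.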


University of Johannesburg


Drug Discovery, Machine Learning, Angiotensin Receptor, Angiotensin-Converting-Enzyme Inhibitor, Hypertension, Cheminformatics, Computational Bioinformatics, k-means Clustering


University of Johannesburg