AI-Solubility algorithm
Description
The model was developed in the Google Colaboratory environment, using the NumPy (version 1.25.2), Pandas (version 2.0.3) and Matplotlib (version 3.7.1) libraries for data processing and plotting. The 4615 available data points were split into 80% training data and 20% test data using the train_test_split function of the scikit-learn library with a fixed random seed (random_state=0). The data were then normalised using the StandardScaler method, which rescales each feature to zero mean and unit variance. A neural network was implemented with an input layer of 66 neurons, a hidden layer of 128 neurons with a ReLU activation function, and an output layer of one neuron with a ReLU activation function to predict solubility. The network was trained for 35 epochs with the Adam optimiser at a learning rate of 0.001 (learning_rate=0.001), using the Keras library of TensorFlow. To avoid overfitting, Dropout with a rate of 20% was used as a regularisation technique, randomly deactivating neurons during training. Several accuracy metrics were then evaluated to analyse the model performance.
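The workflow described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the original implementation: the data are random placeholders with the reported shape (4615 samples, 66 features), the scaler is assumed to be fitted on the training split only, the Dropout layer is assumed to follow the hidden layer, and mean squared error is assumed as the loss, since the text does not name one.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from tensorflow import keras

# Placeholder data with the reported shape (4615 samples, 66 features).
# The real descriptors and solubility values are not reproduced here.
rng = np.random.default_rng(0)
X = rng.normal(size=(4615, 66))
y = rng.uniform(0.0, 1.0, size=4615)

# 80/20 split with the stated seed.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Assumption: the scaler is fitted on the training split only,
# then applied unchanged to the test split.
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# 66 inputs -> Dense(128, ReLU) -> Dropout(0.2) -> Dense(1, ReLU).
# Assumption: dropout is placed after the hidden layer.
model = keras.Sequential([
    keras.Input(shape=(66,)),
    keras.layers.Dense(128, activation="relu"),
    keras.layers.Dropout(0.2),
    keras.layers.Dense(1, activation="relu"),
])

# Assumption: mean squared error loss with mean absolute error as a metric.
model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=0.001),
    loss="mse",
    metrics=["mae"],
)

# Train for 35 epochs, then evaluate on the held-out test split.
history = model.fit(X_train, y_train, epochs=35, verbose=0)
test_loss, test_mae = model.evaluate(X_test, y_test, verbose=0)
```

With these settings the split yields 3692 training and 923 test samples, and the network has 8705 trainable parameters (66 x 128 + 128 in the hidden layer, 128 + 1 in the output layer). Note that a ReLU output restricts predictions to non-negative values.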