TOX3_TOX4_TOX5 generated data

Published: 26-01-2021 | Version 1 | DOI: 10.17632/y5jyd3ycgf.1
Contributor:
Gergely Toth

Description

The TOX3, TOX4 and TOX5 datasets are QSAR datasets based on data published by Gramatica et al. In their paper, the experimental EC50 values (growth inhibition of the alga species Pseudokirchneriella subcapitata) of 35 triazole or benzo-triazole derivatives were used as the training set. In the original article, three linear QSAR models (models A, B and C) were developed using these 35 compounds. The models were then applied to a further 369 compounds, for which no experimental data were available.

In our calculation, three datasets were formed. Each set contains the merged set of cases, 404 molecules altogether, and the descriptor values of the corresponding model. The response values were calculated in a quasi-independent way: for TOX3, the responses are the averages of the values calculated with models B and C; for TOX4, the averages of models A and C; and for TOX5, the averages of models A and B. All 404 responses were calculated in this way. We are aware that this generation procedure regularized the responses of the TOX3-TOX5 datasets and probably eliminated the outliers.

The original data were downloaded from the qsardb repository.

P. Gramatica, S. Cassani, P.P. Roy, S. Kovarich, C.W. Yap, E. Papa, QSAR Modeling is not “Push a Button and Find a Correlation”: A Case Study of Toxicity of (Benzo-)triazoles on Algae, Mol. Inform. 31 (2012) 817-835. http://dx.doi.org/10.1002/minf.201200075

V. Ruusmann, S. Sild, U. Maran, QSAR DataBank repository: open and linked qualitative and quantitative structure–activity relationship models, J. Cheminf. 7 (2015) 32. https://doi.org/10.1186/s13321-015-0082-6, http://www.qsardb.org

TOX3: multivariate linear regression, 404 cases, 3 descriptors, 1 response, R2 on total set 0.75
TOX4: multivariate linear regression, 404 cases, 3 descriptors, 1 response, R2 on total set 0.80
TOX5: multivariate linear regression, 404 cases, 3 descriptors, 1 response, R2 on total set 0.83
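The averaging scheme used to build the responses can be illustrated with a minimal Python sketch. It assumes the predictions of models A, B and C are already available as equal-length lists, one value per molecule. The function name, the dictionary layout and the numbers in the example are hypothetical and are not taken from the dataset files.

```python
# Which two models supply the response of each dataset, as stated in the
# description: TOX3 <- B and C, TOX4 <- A and C, TOX5 <- A and B.
RESPONSE_SOURCES = {
    "TOX3": ("B", "C"),
    "TOX4": ("A", "C"),
    "TOX5": ("A", "B"),
}


def quasi_independent_responses(predictions):
    """Average the predictions of two models for every dataset.

    predictions: dict mapping a model name ("A", "B", "C") to a list of
    predicted values, one per molecule (hypothetical input layout).
    """
    lengths = {len(values) for values in predictions.values()}
    if len(lengths) != 1:
        raise ValueError("all models must cover the same molecules")

    responses = {}
    for dataset, (first, second) in RESPONSE_SOURCES.items():
        responses[dataset] = [
            (p1 + p2) / 2
            for p1, p2 in zip(predictions[first], predictions[second])
        ]
    return responses


# Made-up predictions for three molecules, for illustration only.
example_predictions = {
    "A": [1.0, 2.0, 3.0],
    "B": [2.0, 2.0, 5.0],
    "C": [4.0, 0.0, 1.0],
}
example_responses = quasi_independent_responses(example_predictions)
```

With real data, each list would hold 404 values, one for every molecule in the merged set.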

Files

Steps to reproduce

See description