PeriodicTableDataset
Description
This dataset contains comprehensive physicochemical properties of periodic table elements, compiled from authoritative sources including NIST Chemistry WebBook, CRC Handbook 2022–2023, and PubChem. The dataset is designed for educational purposes, research applications, and machine learning studies focused on chemical element clustering and property prediction.
Steps to reproduce
1. Download all files and unzip them into a working directory.
2. Install Python (>=3.9) with the following libraries: pandas, numpy, scikit-learn, matplotlib, seaborn, plotly, scipy.
3. Load the dataset located in /data (available in CSV and XLSX formats).
4. Optionally, open the documentation in /docs for the methodology (methodology.pdf) and the data dictionary (data_dictionary.md).
5. Use the scripts in /scripts to reproduce the analysis:
   - preprocessing.py → standardizes and prepares the dataset.
   - clustering/ → contains scripts for K-means, hierarchical clustering, and metrics evaluation.
   - visualization/ → generates raincloud/violin plots, PCA biplots, and a periodic table colored by clusters.
6. Figures in /figures are generated directly from the visualization scripts.
7. Results should match the analyses and plots reported in the associated article (DOI: 10.1016/j.rechem.2025.102517).
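For orientation, the standardize-then-cluster workflow described in the steps above can be sketched in a few lines of Python. This is a minimal illustration, not the code shipped in /scripts: the function name `cluster_elements`, the column names (`symbol`, `atomic_mass`, `electronegativity`), and the choice of two clusters are all assumptions made for the example. The small in-line table stands in for the real file in /data, whose exact file name and column names should be taken from data_dictionary.md.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler


def cluster_elements(df, feature_cols, n_clusters=2, random_state=0):
    """Standardize the chosen numeric columns and run K-means on them.

    Hypothetical helper for illustration; the repository's own scripts
    may preprocess and cluster differently.
    """
    # Keep only rows where every selected feature is present.
    features = df[feature_cols].dropna()
    # Rescale each feature to zero mean and unit variance.
    scaled = StandardScaler().fit_transform(features)
    model = KMeans(n_clusters=n_clusters, n_init=10, random_state=random_state)
    labels = model.fit_predict(scaled)
    result = df.loc[features.index].copy()
    result["cluster"] = labels
    # Silhouette score: one common internal metric for cluster quality.
    score = silhouette_score(scaled, labels)
    return result, score


# Stand-in table with assumed column names. With the real dataset, replace
# this with pd.read_csv(<path to the CSV file in /data>).
demo = pd.DataFrame(
    {
        "symbol": ["H", "Li", "Na", "K", "O", "F", "Cl", "Br"],
        "atomic_mass": [1.008, 6.94, 22.990, 39.098, 15.999, 18.998, 35.45, 79.904],
        "electronegativity": [2.20, 0.98, 0.93, 0.82, 3.44, 3.98, 3.16, 2.96],
    }
)

clustered, score = cluster_elements(
    demo, ["atomic_mass", "electronegativity"], n_clusters=2
)
print(clustered[["symbol", "cluster"]])
print(f"silhouette: {score:.3f}")
```

Standardizing before K-means matters here because properties such as atomic mass and electronegativity live on very different scales; without rescaling, the feature with the largest numeric range would dominate the distance computation.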
Institutions
- Universidad de Cartagena