Atomic IQA energies of some capped amino acids, oligopeptides, single water molecules and water pentamers
Description
This dataset comprises filtered atomic interacting quantum atoms (IQA) energies, stored as .csv files, for a diverse set of molecular systems: capped amino acids, oligopeptides, isolated water molecules, and water pentamers. The peptide-like subset includes glycine, alanine, phenylalanine, tryptophan, serine, proline, cysteine, carnosine, trialanine, and ATY. For each system, we provide the total electronic energy of each atom. These IQA energies were computed at the B3LYP/6-31+G(d,p) level of theory and subsequently filtered using an iterative z-score-based outlier removal procedure. In our recently proposed BMIQA protocol, the distribution of these atomic energies is used to identify distinct local chemical environments, or atom types, directly from conformational datasets. We find that neutral oligopeptides, regardless of their length and heterogeneity, can be reduced to a compact representation in which each atom is replaced by its type, drawn from a list of fewer than 20 types. This observation encourages our ongoing development of the first atom-typed IQA-based machine learning force field. Besides IQA energies, the accompanying .csv files also include structural descriptors generated using the atomic local frame (ALF) representation. Together, these ALF features and IQA energies were used to train FFLUX models for capped glycine. When deployed in canonical (NVT) molecular dynamics simulations, these models reproduced the same local environments learnt from the topological IQA energies.
Steps to reproduce
1. Conformational sampling through well-tempered metadynamics for the peptide-like systems, and through unbiased semi-empirical molecular dynamics for water and water clusters.
2. Topological calculations at the B3LYP/6-31+G(d,p) level of theory using the AIMAll19 program.
3. Data filtering using a Python script that processes all IQA energies and iteratively removes geometries in which at least one atom has an IQA energy lying more than 3σ from the mean IQA energy of that same atom.
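The filtering in step 3 can be illustrated with a minimal sketch. It is not the script used to produce this dataset. The function name `filter_geometries`, the input layout (one row per geometry, one column per atom), the use of the population standard deviation, and the stopping rule (repeat until a pass removes nothing) are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def filter_geometries(energies, n_sigma=3.0, max_iter=100):
    """Return the indices of geometries that survive iterative filtering.

    energies: list of rows; each row holds the IQA energy of every atom
              in one geometry (hypothetical layout, same atom order per row).
    n_sigma:  threshold in standard deviations (3 in the description above).
    max_iter: safety cap on the number of passes (an assumption).
    """
    kept = list(range(len(energies)))
    for _ in range(max_iter):
        if len(kept) < 2:
            break
        n_atoms = len(energies[kept[0]])
        # Per-atom mean and standard deviation over the geometries still kept.
        # Population standard deviation is an assumption of this sketch.
        mus = [mean(energies[i][a] for i in kept) for a in range(n_atoms)]
        sigmas = [pstdev([energies[i][a] for i in kept]) for a in range(n_atoms)]
        # Keep a geometry only if every atom lies within n_sigma of its mean.
        survivors = [
            i for i in kept
            if all(
                abs(energies[i][a] - mus[a]) <= n_sigma * sigmas[a]
                for a in range(n_atoms)
            )
        ]
        if len(survivors) == len(kept):
            break  # converged: the last pass removed nothing
        kept = survivors
    return kept
```

Because the mean and standard deviation are recomputed after each pass, a geometry that looked acceptable next to an extreme outlier can still be removed once that outlier is gone, which is the reason for iterating rather than filtering once.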
Institutions
- University of Manchester, Manchester, England