# On the Prebiotic Selection of Nucleotide Anomers: Computational Data

## Description

The data for this article were obtained from computational quantum chemistry. They consist of raw total molecular energies, zero-point energy-corrected molecular energies, and Gibbs energies, all obtained from Density Functional Theory (DFT) calculations with the commonly used B3LYP functional and the Pople-style 6-31G(d,p) basis set. Two sets of calculations were run in parallel and are contrasted: a vacuum (gas-phase) set and a corresponding set with implicit aqueous solvation, modelled through the self-consistent reaction field (SCRF) polarizable continuum model (PCM) in its integral equation formalism variant (IEFPCM) as implemented in the Gaussian 16 suite of programmes.

The calculations give the energy differences between the α-anomers and the corresponding β-anomers of the canonical nucleosides and nucleotides, and compare different hypothetical reaction paths that lead to the present-day nucleosides/nucleotides assembled from their three building blocks, namely the sugar, the phosphate, and the nitrogenous base. A comparison is also given of the relative energies of the uracil (U) and thymine (T) nucleosides/nucleotides in their canonical forms and in the forms not generally observed in biological systems, that is, with T linked to a ribose sugar and U linked to a deoxyribose sugar.

These data can form the nucleus of a much larger database that includes non-canonical bases, sugars, and even linkers (different sugars, or linkers other than sugars, e.g. peptide links) in an effort to pin down the thermodynamics that led to the present-day selection of the particular forms of these nucleic acid building blocks. In the future, machine learning may be applied to such an enlarged dataset to estimate the relative stabilities of synthetic nucleosides/nucleotides that could perhaps be used in molecular medicine applications.
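As an illustration of how the anomer energy differences could be derived from the tabulated energies, here is a minimal Python sketch. It applies the sign convention given under "Steps to reproduce" (energy of the β-anomer minus energy of the α-anomer) to each of the three energy types. The function names, the dictionary keys, and the numerical values are hypothetical placeholders and are not taken from the dataset; the sketch also assumes that the energies are stored in hartree, the unit in which Gaussian reports them.

```python
# Approximate conversion factor: 1 hartree = 627.5095 kcal/mol.
HARTREE_TO_KCAL_PER_MOL = 627.5095


def relative_energy(e_beta, e_alpha):
    """Return E(beta) - E(alpha) in kcal/mol for energies given in hartree.

    A negative result means the beta-anomer is lower in energy.
    """
    return (e_beta - e_alpha) * HARTREE_TO_KCAL_PER_MOL


def anomer_differences(beta, alpha):
    """Apply the beta-minus-alpha difference to every energy type in both dicts."""
    return {key: relative_energy(beta[key], alpha[key])
            for key in beta if key in alpha}


# Placeholder energies in hartree (illustrative only, NOT values from the dataset).
# Hypothetical keys: raw total energy, ZPE-corrected energy, Gibbs energy.
beta_anomer = {"E_total": -1.000, "E_zpe": -0.900, "G": -0.950}
alpha_anomer = {"E_total": -1.002, "E_zpe": -0.901, "G": -0.951}

for name, delta in anomer_differences(beta_anomer, alpha_anomer).items():
    print(f"{name}: {delta:+.3f} kcal/mol")
```

The same function can be applied separately to the vacuum set and to the IEFPCM set, so that the effect of solvation on each difference can be read off by comparing the two results.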

## Files

## Steps to reproduce

The structure of each nucleoside in its two anomeric forms (β and α) was subjected to a soft potential energy hypersurface scan with respect to the angle that governs the N-glycosidic bond, as follows. Z-matrices for the ribofuranose and 2'-deoxyribofuranose sugars, each in both the β and α configurations, were read into Granadarot [1,2], which then created 1,000 random conformers for each. These structures were optimized at the semiempirical PM7 level of theory using MOPAC2016 [3]. The most stable structures were then refined at the DFT-B3LYP/6-31G(d,p) level of theory [4-6]. Aqueous solvation was accounted for using the integral equation formalism variant of the “polarizable continuum model” (IEFPCM) [7-9]. All DFT calculations were performed using Gaussian 16 [10]. The five bases (A, G, C, T, U) were optimized at the same level of DFT theory. Finally, a mono-anionic dihydrogen phosphate group was optimized without constraints in vacuum and in solvent. The optimized phosphate was attached to the nucleosides and a soft scan was performed, again retaining the most stable form of each nucleotide.

In summary, the above procedure joins sugar + base to form the nucleoside, optimizes the nucleoside, and then adds the phosphate before the final optimization. Since an optimization follows every step, the order of these steps is crucial. Hence, we have also tested the sequence: sugar + phosphate --> 5'sugar-monophosphate --> optimization --> 5'sugar-monophosphate + base --> nucleotide --> optimization.

For a given pair of anomers, the relative energy is defined as the energy of the β-anomer minus the energy of the α-anomer. The calculated energy differences include differences in raw (uncorrected) total energies, differences in total energies corrected for zero-point vibrational energies, and differences in Gibbs energies at standard conditions. All calculations were repeated with and without solvation effects.

References

[1] Montero, L.A. http://karin.fq.uh.cu/mmh/ 2019.

[2] Montero, L.A., et al. J. Am. Chem. Soc. 1998, 120, 12023-12033.

[3] Stewart, J.J.P. MOPAC (http://openmopac.net/) 2019.

[4] Becke, A. J. Chem. Phys. 1993, 98, 5648-5652.

[5] Lee, C.; Yang, W.; Parr, R. Phys. Rev. B 1988, 37, 785-789.

[6] Hehre, W.J.; Radom, L.; Pople, J.A.; Schleyer, P.v.R. Ab Initio Molecular Orbital Theory; Wiley-Interscience: New York, 1986.

[7] Miertus, S.; Scrocco, E.; Tomasi, J. Chem. Phys. 1981, 55, 117-129.

[8] Tomasi, J.; Mennucci, B.; Cammi, R. Chem. Rev. 2005, 105, 2999-3093.

[9] Tomasi, J.; Cappelli, C.; Mennucci, B.; Cammi, R. In: Quantum Biochemistry: Electronic Structure and Biological Activity; Matta, C.F., Ed.; Wiley-VCH: Weinheim, 2010; pp 131-170.

[10] Frisch, M.J., et al. Gaussian 16 (C.01), Gaussian Inc., Wallingford, 2019.

[11] Castanedo, L.A.M.; Matta, C.F. Heliyon 2022, in press.