Assessing computational predictions of antimicrobial resistance phenotypes from microbial genomes

Published: 26 February 2024 | Version 2 | DOI: 10.17632/6vc2msmsxj.2
Kaixin Hu


This dataset contains the following materials:
1. ZIP file of txt files. PATRIC genome IDs and AMR phenotypes ('1' indicates resistance, '0' indicates susceptibility) for 78 single-species-antibiotic datasets.
2. ZIP file of JSON files. 223 sets of folds (78 random, 67 phylogeny-aware, and 78 homology-aware) for the 78 single-species-antibiotic datasets. Nine sets of homology-aware folds for nine multi-antibiotic datasets, each corresponding to one of nine species. One set of homology-aware folds for evaluating the Aytan-Aktug control multi-species model and cross-species model. Nine sets of homology-aware folds for evaluating the Aytan-Aktug leave-one-species-out cross-species model.
3. ZIP file of PDF files. Phylogenetic trees generated by Geno2Pheno for 10 species (covering 67 species-antibiotic combinations), annotated with phenotypes and with the cross-validation fold assignments of the phylogeny-aware and random folds.
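The description above does not specify the exact layout inside the txt and JSON files, so the following is only a minimal sketch of how the labels and folds might be combined for cross-validation. It assumes that each txt line holds a genome ID followed by a 0/1 phenotype separated by whitespace, and that each JSON file is an array of folds, each fold being an array of genome IDs. Both layouts, and the helper names `parse_phenotypes`, `parse_folds`, and `cv_splits`, are hypothetical; check them against the actual files before use.

```python
import json


def parse_phenotypes(text):
    """Parse lines of '<genome_id> <phenotype>' into a dict of str -> int.

    Assumed layout (not confirmed by the dataset description):
    whitespace-separated columns, phenotype in the last column.
    """
    labels = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip blank lines and any header row (last field is not 0 or 1).
        if len(parts) < 2 or parts[-1] not in ("0", "1"):
            continue
        # Keep IDs as strings: an ID such as "1280.10000" would lose its
        # trailing zeros if it were parsed as a float.
        labels[parts[0]] = int(parts[-1])
    return labels


def parse_folds(text):
    """Parse a JSON array of folds, each fold an array of genome IDs.

    Assumed layout (not confirmed by the dataset description).
    """
    return [[str(genome_id) for genome_id in fold] for fold in json.loads(text)]


def cv_splits(folds):
    """Yield (train_ids, test_ids) pairs, holding out one fold at a time."""
    for i, test in enumerate(folds):
        train = [g for j, fold in enumerate(folds) if j != i for g in fold]
        yield train, list(test)


if __name__ == "__main__":
    # Made-up example content; the IDs below are illustrative only.
    labels = parse_phenotypes("1280.10000 1\n1280.10001 0\n1280.10002 1\n")
    folds = parse_folds('[["1280.10000"], ["1280.10001"], ["1280.10002"]]')
    for train_ids, test_ids in cv_splits(folds):
        y_test = [labels[g] for g in test_ids]
        print(len(train_ids), "training genomes;", "test labels:", y_test)
```

Reusing the provided folds, rather than re-splitting the genomes, keeps an evaluation comparable across the random, phylogeny-aware, and homology-aware settings.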


Steps to reproduce


Genomics, Machine Learning, Computational Biology


Deutsche Forschungsgemeinschaft


Deutsche Forschungsgemeinschaft


Deutsches Zentrum für Infektionsforschung

TI 12.002