Non-Hemolytic and Non-Toxic AMPs from Cephalopods' Posterior Salivary Glands: Singular Set and Sets with Similarity to Characterised AMPs

Published: 6 October 2023 | Version 1 | DOI: 10.17632/8mttp4pvmc.1
Contributors:
Guillermin Agüero-Chapin, Dany Domínguez-Pérez, Yovani Marrero-Ponce, Kevin Castillo-Mendieta

Description

The datasets presented here are the result of a comprehensive comparison involving 68,694 non-hemolytic and non-toxic Antimicrobial Peptides (AMPs) extracted from the posterior salivary glands of Cephalopods (doi: 10.17632/htntmccyd4.1). The comparison was made against characterized AMPs registered in the StarPep database (http://mobiosd-hub.com/starpep/).

Dataset Structure: The provided datasets, available in FASTA file format, are structured as follows. Every set contains cephalopod AMPs that are non-hemolytic and non-toxic; the sets differ only in their sequence identity with members of the StarPep database.
1. Singular NoHemNoTox AMPs (5,466): less than 40% sequence identity.
2. NoHemNoTox AMPs_SimHigher40 (63,228): more than 40% sequence identity.
3. NoHemNoTox AMPs_Sim40-50 (26,744): 40-50% sequence identity.
4. NoHemNoTox AMPs_Sim50-60 (30,217): 50-60% sequence identity.
5. NoHemNoTox AMPs_Sim60-70 (5,716): 60-70% sequence identity.
6. NoHemNoTox AMPs_Sim70-80 (453): 70-80% sequence identity.
7. NoHemNoTox AMPs_SimHigher80 (98): more than 80% sequence identity.

Sets 3 to 7 partition set 2, and sets 1 and 2 together account for all 68,694 peptides. These datasets provide a valuable resource for screening as part of peptide drug discovery, especially considering their similarities to the known chemical space (characterized AMPs in the StarPep database).
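As a quick sanity check on the set sizes listed above, the following sketch verifies that the counts are mutually consistent. The constant and function names are illustrative and are not part of the dataset; only the numbers come from the description.

```python
# Counts copied from the dataset description; all names here are illustrative.
TOTAL = 68_694
SINGULAR = 5_466          # < 40% identity with StarPep members
SIM_HIGHER_40 = 63_228    # > 40% identity with StarPep members
BINS = {
    "Sim40-50": 26_744,
    "Sim50-60": 30_217,
    "Sim60-70": 5_716,
    "Sim70-80": 453,
    "SimHigher80": 98,
}


def check_counts() -> bool:
    """Return True if the singular and similar sets add up to the total,
    and the five identity bins add up to the SimHigher40 set."""
    return (
        SINGULAR + SIM_HIGHER_40 == TOTAL
        and sum(BINS.values()) == SIM_HIGHER_40
    )


def bin_fractions() -> dict:
    """Share of the full 68,694-peptide collection that falls in each bin."""
    return {name: count / TOTAL for name, count in BINS.items()}


if __name__ == "__main__":
    print("counts consistent:", check_counts())
    for name, frac in bin_fractions().items():
        print(f"{name}: {frac:.2%}")
```

Running the script prints the consistency flag and the share of each bin, which makes it easy to see that most similar peptides fall in the 40-60% identity range.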

Files

Steps to reproduce

These datasets were produced by comparing the 68,694 AMPs derived from cephalopods against the known chemical space registered in StarPepDB. The creation of the datasets involved two key steps:
1- The original StarPepDB, comprising 45,120 AMPs, underwent redundancy reduction at 98% sequence identity using CD-HIT. Additionally, only peptides with a length of 5-100 amino acids were retained. This procedure yielded a refined set of 32,863 characterised peptides, forming the basis for comparison.
2- CD-HIT-2D was applied at different identity cutoffs (0.40, 0.50, 0.60, 0.70, 0.80) to compare the 68,694 AMPs derived from cephalopods with the characterised chemical space registered in StarPepDB (32,863 AMPs).
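The two steps above can be sketched in Python as command builders around the CD-HIT tools. This is a minimal sketch under stated assumptions, not the authors' actual pipeline: the file names, the word sizes (taken from the ranges suggested in the CD-HIT user guide), the `-l 4` setting, and the derivation of identity bins as set differences between successive cutoffs are all assumptions. It relies on the documented behaviour that `cd-hit-2d` writes to its `-o` file the sequences from the second input that have no match in the first input at the chosen cutoff.

```python
from typing import Dict, Iterable, List, Set, Tuple

CUTOFFS = [0.40, 0.50, 0.60, 0.70, 0.80]  # identity cutoffs named in the text


def word_size(cutoff: float) -> int:
    """Word size (-n) suggested by the CD-HIT user guide for protein thresholds."""
    if cutoff >= 0.7:
        return 5
    if cutoff >= 0.6:
        return 4
    if cutoff >= 0.5:
        return 3
    return 2


def keep_by_length(records: Iterable[Tuple[str, str]],
                   min_len: int = 5, max_len: int = 100) -> List[Tuple[str, str]]:
    """Keep (header, sequence) pairs whose length lies within 5-100 residues."""
    return [(h, s) for h, s in records if min_len <= len(s) <= max_len]


def cd_hit_command(fasta_in: str, fasta_out: str) -> List[str]:
    """Step 1: redundancy reduction of the reference set at 98% identity.
    '-l 4' is an assumption: it lowers CD-HIT's short-sequence cutoff
    (default 10) so that 5-residue peptides are not discarded."""
    return ["cd-hit", "-i", fasta_in, "-o", fasta_out,
            "-c", "0.98", "-n", "5", "-l", "4"]


def cd_hit_2d_command(reference: str, query: str, cutoff: float) -> List[str]:
    """Step 2: compare the query peptides against the reference at one cutoff.
    The output file name is hypothetical."""
    out = f"unmatched_{int(round(cutoff * 100))}.fasta"
    return ["cd-hit-2d", "-i", reference, "-i2", query, "-o", out,
            "-c", f"{cutoff:.2f}", "-n", str(word_size(cutoff))]


def bins_from_unmatched(all_ids: Set[str],
                        unmatched: Dict[float, Set[str]]) -> Dict[str, Set[str]]:
    """Assumed binning: a peptide unmatched at the higher cutoff but matched
    at the lower one falls in the bin between the two cutoffs."""
    labels = ["Sim40-50", "Sim50-60", "Sim60-70", "Sim70-80"]
    bins = {"Singular": set(unmatched[0.40])}
    for label, lo, hi in zip(labels, CUTOFFS, CUTOFFS[1:]):
        bins[label] = set(unmatched[hi]) - set(unmatched[lo])
    bins["SimHigher80"] = set(all_ids) - set(unmatched[0.80])
    return bins


if __name__ == "__main__":
    # Print the commands instead of running them; pass each list to
    # subprocess.run(cmd, check=True) once CD-HIT is installed.
    print(" ".join(cd_hit_command("starpep_5_100.fasta", "starpep_nr98.fasta")))
    for c in CUTOFFS:
        print(" ".join(cd_hit_2d_command("starpep_nr98.fasta", "cephalopod_amps.fasta", c)))
```

Because CD-HIT clustering is greedy, the unmatched sets at successive cutoffs are not guaranteed to be strictly nested, so the set-difference binning should be read as an approximation of how the published bins could have been obtained.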

Institutions

Universidade do Porto Centro Interdisciplinar de Investigacao Marinha e Ambiental

Categories

Drug Discovery, Peptides, Cephalopoda, Antimicrobial

Licence