An Extensive and Non-Redundant Peptide Library Derived from Omics Data of Cephalopods' Posterior Salivary Glands

Published: 29 August 2023 | Version 1 | DOI: 10.17632/v67g7r8nf2.1
Guillermin Agüero-Chapin, Dany Domínguez-Pérez, Yovani Marrero-Ponce


This dataset is a consolidated peptide library obtained by merging the peptide libraries generated by 13 distinct in silico digestion protocols applied to omics data from cephalopods' posterior salivary glands. The initial peptide libraries are documented at doi:10.17632/6fjsdnvygb.1. The resulting dataset contains 9,216,442 non-redundant peptides and is intended to support the search for novel antimicrobial peptides, since the chemical space of the cephalopod salivary apparatus remains relatively unexplored in biodiscovery.


Steps to reproduce

The dataset was created in two steps:

1. The initial peptide libraries, available at doi:10.17632/6fjsdnvygb.1, were combined.
2. Sequence redundancy was removed with CD-HIT at a sequence identity threshold of 0.98.
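The two steps above can be sketched in Python as follows. This is a minimal illustration, not the authors' actual pipeline: it assumes the initial libraries are available as FASTA files, and the function names, the `pep<N>` record identifiers, and the file names are hypothetical. Only the 0.98 identity threshold comes from the description; every other CD-HIT option is left at its default here, which may differ from the settings used to build the dataset.

```python
import tempfile
from pathlib import Path
from typing import Iterable, Iterator, List, Tuple


def read_fasta(path: Path) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from a FASTA file."""
    header, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # A new record starts: emit the previous one, if any.
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            else:
                chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def combine_libraries(paths: Iterable[Path], out_path: Path) -> int:
    """Step 1: concatenate several libraries into one FASTA file.

    Records are renumbered (hypothetical 'pepN' ids) so that identifiers
    stay unique across the combined file. Returns the number of records.
    """
    count = 0
    with open(out_path, "w") as out:
        for path in paths:
            for _, seq in read_fasta(path):
                count += 1
                out.write(f">pep{count}\n{seq}\n")
    return count


def cd_hit_command(fasta_in: Path, fasta_out: Path,
                   identity: float = 0.98) -> List[str]:
    """Step 2: build a CD-HIT call using the 0.98 identity threshold.

    Only -i (input), -o (output) and -c (identity) are set; all other
    options are assumed to stay at CD-HIT defaults.
    """
    return ["cd-hit", "-i", str(fasta_in), "-o", str(fasta_out),
            "-c", str(identity)]
```

With CD-HIT installed, the command list returned by `cd_hit_command` can be passed to `subprocess.run(..., check=True)`; the output FASTA then holds one representative sequence per cluster.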


Universidade do Porto Centro Interdisciplinar de Investigacao Marinha e Ambiental


Biodiscovery, Cephalopoda, Omics, Peptide Library, Database