An Extensive and Non-Redundant Peptide Library Derived from Omics Data of Cephalopods' Posterior Salivary Glands
Description
This dataset is a consolidated peptide library built by merging the peptide libraries generated with 13 distinct in silico digestion protocols applied to omics data from cephalopod posterior salivary glands. The initial peptide libraries are documented at doi:10.17632/6fjsdnvygb.1. The resulting dataset contains 9,216,442 non-redundant peptides and is intended to support the search for novel antimicrobial peptides, since the chemical space of the cephalopod salivary apparatus remains largely unexplored in biodiscovery.
Steps to reproduce
The dataset was created in two steps:
1. The initial peptide libraries, available at doi:10.17632/6fjsdnvygb.1, were combined.
2. Sequence redundancy was removed with CD-HIT at a sequence identity threshold of 0.98.
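The two steps above can be sketched in Python as follows. This is a minimal illustration, not the authors' actual pipeline: it assumes the initial libraries are FASTA files, the function names and file paths are hypothetical, and every CD-HIT option other than the identity threshold (`-c 0.98`) is left at its default because the source does not state which settings were used.

```python
import shutil
import subprocess
from pathlib import Path


def merge_fasta(inputs, merged_path):
    """Step 1: concatenate the initial libraries (assumed to be FASTA files)."""
    with open(merged_path, "w") as out:
        for path in inputs:
            text = Path(path).read_text()
            # Guard against a missing trailing newline so records do not fuse.
            if text and not text.endswith("\n"):
                text += "\n"
            out.write(text)
    return merged_path


def cd_hit_command(merged_path, output_path, identity=0.98):
    """Step 2: build the CD-HIT call (-i input, -o output, -c identity threshold)."""
    return ["cd-hit", "-i", str(merged_path), "-o", str(output_path), "-c", str(identity)]


def build_library(inputs, merged_path, output_path):
    """Run both steps; requires the cd-hit executable to be installed."""
    merge_fasta(inputs, merged_path)
    if shutil.which("cd-hit") is None:
        raise RuntimeError("cd-hit executable not found on PATH")
    subprocess.run(cd_hit_command(merged_path, output_path), check=True)
    return output_path
```

A call might look like `build_library(["protocol_01.fasta", "protocol_02.fasta"], "merged.fasta", "nr98.fasta")`, where the file names are placeholders for the 13 downloaded libraries.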