Dataset of NGS-Phage display library of nanobodies derived from Indian desert camel (Camelus dromedarius L.)
Description
We constructed a phage display library (PDL) of nanobodies (Nbs) derived from an Indian desert camel immunized with E. coli lipopolysaccharide (LPS), and sequenced a fraction of that PDL by NGS to obtain a dataset of Nb reads. This is the first phage display library of Nbs derived from the Indian desert camel, and its preliminary characterization revealed functional Nbs against diverse antigens (Ags) from infectious agents. Several Nb clones were isolated by panning against E. coli endotoxin and S. aureus β-hemolysin (an exotoxin), and were deposited, along with the library, in the ICAR-Veterinary Type Culture Collection at Hisar (Haryana) (Accession no. VTCCMBA22 to VTCCMBA48). Some of these Nb clones were characterized for biochemical and functional features and were Sanger sequenced (GenBank accession no.: EU861212; KF990215; KF990216; KF990217; GU014816).
For NGS, the cryo-preserved library of transformants was revived to extract the pool of Nb-encoding VHH (insert)-pHEN4 (vector) DNA. This DNA sample was used as the template to amplify the VHH pool by PCR. The VHH amplicon band was gel-purified and sequenced on the Illumina® MiSeq™ platform. The ‘Nextera XT micro V2 Index’ kit was used for sequencing the Nb library DNA sample, with the index adaptors ‘i7’ (N706: TAGGCATG) and ‘i5’ (S517: GCGTAAGA). The raw NGS reads were submitted to the NCBI ‘Sequence Read Archive’ (SRA) repository.
The raw NGS data files ‘Read 1’ [Phage lib NGS seqs-ABT-28_S24_L001_R1_001.fastq (1)] and ‘Read 2’ [Phage lib NGS seqs-ABT-28_S24_L001_R2_001.fastq (2)] were generated by the Illumina® MiSeq™ system. Preliminary examination using CLC Genomics Workbench 12.0 revealed 91,073 sequence reads in each file. The proportion of bases with a quality score of at least ‘Q30’ (an error probability of 1 in 1,000 bases) at 2 × 150 bp was >80%, which is acceptable. The total read count was 182,146 (matched = 179,591; unmatched = 2,555), with an average read length of 130.33 bases and a total of 23.74 Mb. Of the 179,591 matched reads, 142,004 were paired reads and 37,587 were broken paired reads.
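The read statistics reported above can be recomputed from the FASTQ files themselves. The following is only a minimal sketch, not the CLC Genomics Workbench procedure used for this dataset: it assumes standard four-line FASTQ records with Phred+33 quality encoding, and the function name `fastq_stats` and the two demonstration records are hypothetical. It also checks that the reported counts are arithmetically consistent with one another.

```python
from io import StringIO


def fastq_stats(handle, phred_offset=33, threshold=30):
    """Return read count, total bases, mean length and the fraction of
    bases at or above `threshold` for a four-line-per-record FASTQ stream.
    Assumption: Phred+33 encoding, no line-wrapped sequences."""
    n_reads = n_bases = n_good = 0
    while True:
        header = handle.readline()
        if header == "":            # end of file
            break
        if header.strip() == "":    # tolerate blank lines between records
            continue
        seq = handle.readline().rstrip("\n")
        handle.readline()           # '+' separator line, ignored
        qual = handle.readline().rstrip("\n")
        if not header.startswith("@") or len(seq) != len(qual):
            raise ValueError("malformed FASTQ record: " + header.strip())
        n_reads += 1
        n_bases += len(seq)
        n_good += sum(1 for c in qual if ord(c) - phred_offset >= threshold)
    return {
        "reads": n_reads,
        "bases": n_bases,
        "mean_length": n_bases / n_reads if n_reads else 0.0,
        "q30_fraction": n_good / n_bases if n_bases else 0.0,
    }


# Hypothetical two-read example ('I' encodes Q40, '#' encodes Q2).
demo = StringIO("@r1\nACGT\n+\nIIII\n@r2\nACGTAC\n+\nII##II\n")
demo_stats = fastq_stats(demo)
print(demo_stats)

# Consistency of the counts reported in the text.
reads_per_file = 91_073
total_reads = 182_146
matched, unmatched = 179_591, 2_555
paired, broken = 142_004, 37_587
counts_consistent = (
    2 * reads_per_file == total_reads
    and matched + unmatched == total_reads
    and paired + broken == matched
)
total_mb = round(total_reads * 130.33 / 1e6, 2)  # reads x mean length, in Mb
print(counts_consistent, total_mb)
```

To apply it to a real file, open the FASTQ in text mode (for example `with open(path) as fh: fastq_stats(fh)`) once for the Read 1 file and once for the Read 2 file.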
These data can be accessed at SRA RunSelector: https://www.ncbi.nlm.nih.gov/Traces/study/?acc=PRJNA516512
Table 3 shows that contigs 1 and 2 contain CDR1, CDR2 and CDR3, whereas contig 3, being shorter, contains only CDR1 and CDR2. In addition, Table 3 shows that CDR1 and CDR2 of contig 2 are identical to those of the LPS-binder clones previously isolated from the library and Sanger sequenced. Its CDR3 sequence differs at only one amino acid position from four LPS-binder clones and at two amino acid positions from one LPS-binder clone. One clone picked at random from the library and Sanger sequenced (EU429319) showed CDR1 and CDR3 similar to those of the LPS-binder clones, but an entirely different CDR2. Table 3 also shows that the VHH hallmark amino acids are present in all the contigs as well as in the clones individually isolated from the library. A BLASTn search with the nucleotide sequence of KF990217.1 (Anti-lps Nb Cl26) revealed highly similar sequences in the SRA_SRX5282797 database (the NGS database of this library).
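The CDR comparison described above amounts to counting positional mismatches between aligned amino acid sequences. The sketch below illustrates that count under explicit assumptions: the CDR sequences are already extracted, are of equal length, and need no gaps. The function name `cdr_mismatches`, the clone labels, and all sequences shown are hypothetical placeholders, not the sequences in Table 3.

```python
def cdr_mismatches(reference, query):
    """List (position, reference_residue, query_residue) for every
    position where two equal-length CDR sequences differ (1-based)."""
    if len(reference) != len(query):
        raise ValueError("sequences must be pre-aligned to equal length")
    return [
        (pos, ref_aa, qry_aa)
        for pos, (ref_aa, qry_aa) in enumerate(zip(reference, query), start=1)
        if ref_aa != qry_aa
    ]


# Placeholder sequences for illustration only (not real data).
contig_cdr = "GRTFSSYAMG"
clone_cdrs = {
    "clone_A": "GRTFSSYAMG",  # identical
    "clone_B": "GRTFSNYAMG",  # one substitution
    "clone_C": "GRTYSNYAMG",  # two substitutions
}

mismatch_counts = {
    name: len(cdr_mismatches(contig_cdr, seq))
    for name, seq in clone_cdrs.items()
}
print(mismatch_counts)
```

CDRs of unequal length would need a pairwise alignment step first, which this sketch deliberately leaves out.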