Data on genetic diversity of circumsporozoite protein (csp) non-repeat regions from Plasmodium knowlesi clinical isolates of Sabah.

Published: 31 March 2022 | Version 1 | DOI: 10.17632/n96gfvksn5.1
Zarina Amin


This dataset presents an analysis of the genetic diversity of the circumsporozoite protein (csp) gene of Plasmodium knowlesi in Sabah. The circumsporozoite protein is one of the candidate targets for malaria vaccine development, and the study was conducted to evaluate the suitability of csp as a vaccine candidate in relation to its genetic diversity. The data were collected from 26 human blood spot samples that tested positive for malaria, obtained from Kudat and Kota Kinabalu hospitals in Sabah in 2012. Genomic DNA extraction, nested PCR, cloning and sequencing of the csp genes were carried out, and the phylogeny, sequence diversity and natural selection of the csp genes were analysed with the bioinformatic tools MEGAX and DnaSP ver. 5.10.00, which were used for phylogenetic tree construction, mutational analysis and tests of the neutral theory of evolution. Comparison against the P. knowlesi strain H csp reference sequence (GenBank accession XM_002258966.1) showed point mutations at 52 positions among the 237 sequences. The phylogenetic tree showed that the multiple haplotypes were scattered regardless of geographical location. The evolutionary history, inferred using the Neighbor-Joining method (Saitou and Nei, 1987), revealed no geographical clustering by country, with a total of 76 non-repeat region Pkcsp haplotypes, including one haplotype unique to this study (haplotype H12). These data could serve as auxiliary information or research data for other researchers in Sabah. They could also serve as a guide or reference for researchers outside Sabah who may be interested in carrying out similar research in other states.
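As an illustration of the comparison against the reference sequence, the sketch below counts the positions at which any aligned sequence differs from a reference, and tallies distinct haplotypes. It is a minimal example and not the procedure implemented in MEGAX or DnaSP; the function names and the toy sequences are hypothetical, and the input is assumed to be already aligned, equal-length nucleotide strings.

```python
from collections import Counter


def variable_positions(reference, sequences):
    """Return 0-based positions where at least one sequence differs from the reference.

    Assumes every sequence is aligned to the reference and has the same length.
    """
    for seq in sequences:
        if len(seq) != len(reference):
            raise ValueError("all sequences must have the same length as the reference")
    return [
        i
        for i, ref_base in enumerate(reference)
        if any(seq[i] != ref_base for seq in sequences)
    ]


def haplotype_counts(sequences):
    """Count how many times each distinct sequence (haplotype) occurs."""
    return Counter(sequences)


# Toy data only (hypothetical, not taken from the dataset).
toy_reference = "ACGTAC"
toy_sequences = ["ACGTAC", "ACGTTC", "GCGTTC", "ACGTTC"]

toy_positions = variable_positions(toy_reference, toy_sequences)
toy_haplotypes = haplotype_counts(toy_sequences)
```

With real data, the reference would be the strain H csp non-repeat region and the input would be the aligned isolate sequences exported from the alignment step.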


Steps to reproduce

In this study, blood samples of malaria patients were collected from Kudat Hospital (KDH) and Hospital Queen Elizabeth Kota Kinabalu (QEH) in 2012. This study was approved by the Medical Research Ethics Committee of the Ministry of Health Malaysia and the UMS Ethics Committee. A total of 26 human blood samples infected with P. knowlesi were collected from symptomatic malaria patients and spotted on chromatography paper. Genomic DNA extraction was then carried out, followed by nested PCR with specific primers to identify the genus and species of Plasmodium. Next, the csp genes were cloned and sequenced. Sequence alignment of the csp gene was carried out using the CLUSTAL-W tool in the Molecular Evolutionary Genetics Analysis X (MEGAX) software. The sequencing data were then analyzed for a 453 bp nucleotide sequence corresponding to the non-repeat regions: the N-terminal region (the first 195 bp of the coding sequence) and the C-terminal region (the last 261 bp of the coding sequence), with the last three nucleotides, which encode the stop codon, excluded (195 + 261 - 3 = 453 bp). The two non-repeat regions were combined and analyzed together. Phylogenetic, sequence diversity, and natural selection analyses of the csp gene were then carried out using MEGAX and DnaSP ver. 5.10.00.
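The extraction of the combined non-repeat region described above can be sketched as follows. This is a minimal illustration, assuming each input is a full-length, ungapped csp coding sequence that ends with its stop codon; the function name, constants, and the synthetic test sequence are hypothetical and are not part of the dataset.

```python
N_TERM_BP = 195  # first 195 bp of the coding sequence (N-terminal non-repeat region)
C_TERM_BP = 261  # last 261 bp of the coding sequence (C-terminal non-repeat region)
STOP_BP = 3      # final three nucleotides encoding the stop codon, excluded


def extract_non_repeat(cds):
    """Return the N-terminal and C-terminal non-repeat regions joined together,
    with the trailing stop codon removed."""
    cds = cds.upper()
    if len(cds) < N_TERM_BP + C_TERM_BP:
        raise ValueError("coding sequence is shorter than the two non-repeat regions")
    n_term = cds[:N_TERM_BP]
    # Take the last 261 bp, then drop the final 3 bp (stop codon).
    c_term = cds[-C_TERM_BP:-STOP_BP]
    return n_term + c_term


# Synthetic sequence only: 195 bp + a 300 bp stand-in for the central repeat
# region + 258 bp + a stop codon.
toy_cds = "a" * 195 + "G" * 300 + "c" * 258 + "TAA"
toy_region = extract_non_repeat(toy_cds)
```

The central repeat region is skipped simply by slicing from both ends, so the sketch does not depend on the length of the repeats in a given isolate.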


Universiti Malaysia Sabah


Infectious Disease