In this study, we analyzed the mutation and selection landscape of 516 unique, complete genomes of SARS-CoV-2 isolates from India over a 12-month span (January to December 2020) to understand how the virus is evolving in this geographical region. We identified 953 genome-wide loci displaying single-nucleotide polymorphisms (SNPs). Principal component analysis and mutation plots of the datasets indicate an increase in genetic variance over time. Only 42% of the polymorphic sites display substitutions in the third nucleotide position of codons, where changes are most often synonymous, which indicates that non-synonymous substitutions are more prevalent. These isolates displayed strong evidence of purifying selection in ORF1ab, spike, nucleocapsid, and membrane glycoprotein.
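The codon-position tally mentioned above can be illustrated with a minimal Python sketch. It is not taken from the study's scripts: the function names, the toy coordinates, and the single-frame assumption are all hypothetical, and it only shows how a 1-based genome coordinate might be mapped to a position within its codon once the start of the coding sequence is known.

```python
from collections import Counter


def codon_position(genome_pos, cds_start):
    """Return 1, 2, or 3: the position within its codon of a 1-based
    genome coordinate, given the 1-based start of the coding sequence.
    Assumes one uninterrupted reading frame (a simplification)."""
    if genome_pos < cds_start:
        raise ValueError("position lies upstream of the coding sequence")
    return (genome_pos - cds_start) % 3 + 1


def codon_position_fractions(snp_positions, cds_start):
    """Fraction of SNP sites falling at codon positions 1, 2, and 3."""
    counts = Counter(codon_position(p, cds_start) for p in snp_positions)
    total = sum(counts.values())
    return {k: counts.get(k, 0) / total for k in (1, 2, 3)}


# Toy input, made up for illustration only (not data from the study).
toy_cds_start = 101
toy_snps = [101, 103, 106, 108, 110]
fractions = codon_position_fractions(toy_snps, toy_cds_start)
print(fractions)
```

As a sanity check on the arithmetic, the well-known spike substitution A23403G (D614G) lies in a coding sequence that starts at coordinate 21563 in the NC_045512.2 reference, so `codon_position(23403, 21563)` gives the second codon position. A single-frame calculation like this would not hold across the ribosomal frameshift in ORF1ab, which would need separate handling.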
Steps to reproduce
To reproduce the data, use the scripts provided in the GitHub repository: https://github.com/vishalsnegi/SARS-CoV-2_Geo_India