SARS-CoV-2 Intra-host Mutational Landscape: A Curated Dataset of iSNVs
Published: 30 April 2024 | Version 2 | DOI: 10.17632/8nvgtrkzdm.2
Contributors:
Fatima Mostefai
Description
This dataset, derived from 128,423 high-quality SARS-CoV-2 NGS libraries, is a comprehensive and precise collection of intra-host single nucleotide variants (iSNVs) processed through a rigorous workflow to ensure accuracy and reliability. Key steps include stringent quality control, variant calling, and artifact removal using metrics such as Strand Bias Likelihood (S) and Alternative Allele Frequency (AAF). Refined to exclude sequencing artifacts, this iSNV dataset offers a valuable resource for understanding SARS-CoV-2 intra-host mutational dynamics. We also provide a file listing 477 genomic positions that we recommend masking for accurate iSNV calling; these positions are highly recurrent strand bias artifacts.
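As an illustration of how the masking file and an AAF cutoff might be applied downstream, here is a minimal Python sketch. It is not the workflow used to build this dataset: the `Variant` record, the function names, the toy positions, and the `min_aaf` value of 0.05 are all assumptions made for the example, and the Strand Bias Likelihood (S) filter is omitted because its definition is not given in this description. AAF is computed here as alternative-allele reads divided by total reads at the position.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """One candidate iSNV call (hypothetical record layout)."""
    sample: str
    position: int     # genomic coordinate, assumed 1-based
    ref: str
    alt: str
    alt_depth: int    # reads supporting the alternative allele
    total_depth: int  # all reads covering the position


def alt_allele_frequency(v: Variant) -> float:
    """AAF = alternative-allele reads / total reads (0.0 if uncovered)."""
    return v.alt_depth / v.total_depth if v.total_depth > 0 else 0.0


def filter_isnvs(variants, masked_positions, min_aaf=0.05):
    """Drop calls at masked positions or below the AAF cutoff.

    min_aaf is an illustrative placeholder, not a value from the dataset.
    """
    masked = set(masked_positions)
    return [
        v for v in variants
        if v.position not in masked and alt_allele_frequency(v) >= min_aaf
    ]


# Toy input: positions and depths are made up for the example.
calls = [
    Variant("s1", 100, "C", "T", alt_depth=30, total_depth=300),
    Variant("s1", 200, "G", "A", alt_depth=50, total_depth=100),
    Variant("s2", 300, "A", "G", alt_depth=2, total_depth=400),
]
kept = filter_isnvs(calls, masked_positions={200})
for v in kept:
    print(v.sample, v.position, v.ref, v.alt, round(alt_allele_frequency(v), 3))
```

In practice, `masked_positions` would be read from the provided masking file, and the cutoff would be chosen to match the analysis at hand.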
Files
Institutions
Institut De Cardiologie de Montreal, Universite de Montreal
Categories
Single Nucleotide Polymorphism, Severe Acute Respiratory Syndrome Coronavirus 2