Coronaviruses proximities through nucleotide sequences

Published: 20 September 2022 | Version 1 | DOI: 10.17632/swy5rzbmrd.1
Contributors:
Iskander Akhmetov

Description

1. Viruses_initial_ds.csv: initial data obtained from the SARS CORONAVIRUS ACCESSION dataset at https://www.kaggle.com/datasets/jamzing/sars-coronavirus-accession.
2. pairwise_compare_ds.csv: pairwise comparison of viruses by total nucleotide matches by position, largest match block length, and number of match blocks (shares).
3. total_matches.csv: square virus-by-virus matrix of total match length.
4. largest_block.csv: square virus-by-virus matrix of largest match block length.
5. shares.csv: square virus-by-virus matrix of the number of shared match blocks.
6. ttl_max_shares_3D.pkl: the data from items 3-5 reduced to three dimensions by PCA.
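
The three pairwise measures named for pairwise_compare_ds.csv can be illustrated with a minimal Python sketch. It is not the author's code, and the definitions it uses are assumptions: sequences are compared position by position over their common length, and a "match block" is taken to be a maximal run of consecutive matching positions. The function name compare_sequences and the example sequences are hypothetical.

```python
from itertools import groupby


def compare_sequences(a: str, b: str) -> dict:
    """Position-wise comparison of two nucleotide sequences (assumed definitions)."""
    # Compare only over the overlapping length. This is an assumption: the
    # dataset description does not say how unequal lengths are handled.
    matches = [x == y for x, y in zip(a, b)]

    # Lengths of maximal runs of consecutive matching positions ("match blocks").
    blocks = [sum(1 for _ in run) for is_match, run in groupby(matches) if is_match]

    return {
        "total_matches": sum(blocks),             # matching positions overall
        "largest_block": max(blocks, default=0),  # longest run of matches
        "shares": len(blocks),                    # number of match blocks
    }


# Hypothetical toy sequences, not taken from the dataset.
result = compare_sequences("ATGCGTAC", "ATGAGTTC")
print(result)  # {'total_matches': 6, 'largest_block': 3, 'shares': 3}
```

Applying such a function to every pair of viruses would give three square matrices analogous to items 3-5, under the same assumed definitions.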

Files

Categories

Bioinformatics, Computational Linguistics, Coronavirus

Licence