Global population analysis of HLA class I affinity toward proteasome-generated peptides from five main SARS-CoV-2 spike protein RBD variants

Published: 3 April 2024 | Version 1 | DOI: 10.17632/yc7ht4cgnc.1
Maxim Ri, Ivan Butenko


Global data on the relative abundance of SARS-CoV-2 S protein RBD peptides in samples generated by human constitutive (c20S) and immune (i20S) proteasomes, together with calculated HLA class I EL ranks, percentage of HLA class I allele coverage, average HLA class I haplotype protective index, and percentage of individuals (out of 28,104) with at least one positive HLA class I allele in their haplotype. List of 305 HLA class I alleles covering 98% of the haplotypes of 28,104 individuals deposited in the Allele Frequency Net Database. Cross-referencing of proteasome-generated N-terminally extended peptides to core HLA class I CD8+ T cell epitopes listed in the Immune Epitope Database (IEDB). Convoluted relative abundance of SARS-CoV-2 S protein RBD peptides in samples generated by c20S and i20S proteasomes. Spearman correlation between HLA class I haplotypes positive for two major SARS-CoV-2 S protein RBD epitopes in different populations and the COVID-19-related deaths/cases ratio. Analysis of the population frequency of HLA class I molecules that bind SARS-CoV-2 S protein RBD epitopes in different countries. Total numbers of COVID-19-related deaths/cases in different countries before and after 12/01/2021. Interactive plot of the relative abundance of SARS-CoV-2 S protein RBD peptides in samples generated by c20S and i20S proteasomes, indicating positive HLA class I alleles and their EL ranks. Interactive plot with a searchable interface showing the sequence distribution of proteasome-generated SARS-CoV-2 S protein RBD peptides, indicating positive HLA class I alleles and their EL ranks.


Steps to reproduce

Recombinant S protein RBD variants (1.5 µg) of SARS-CoV-2 Wuhan Hu-1 and its lineages Alpha B.1.1.7, Gamma P.1, Delta B.1.617.2, and Omicron B.1.1.529 were hydrolyzed by human constitutive (c20S) and immune (i20S) proteasomes (1 µg) from HeLa cells in 20 µL of buffer containing 20 mM Tris (pH 7.5), 5 mM MgCl2, and 1 mM DTT. The mixtures were incubated overnight at 37°C. In total, 50 samples were subjected to LC-MS/MS analysis on an Exploris 480 (Dataset ID PXD050265). Peptides that overlapped with the C-terminal AviTag-His or the N-terminal non-RBD leader sequence, or that were observed in samples without proteasomes, were excluded. The resulting dataset contained 821 peptides (Supplemental Table 1). Each peptide was normalized by dividing its ion current by the total ion current (TIC) across all observed RBD peptides. The final relative abundance of each peptide in RBD hydrolysates is the average of four independent replicates.
Peptides of 8 to 16 amino acids were included in the analysis of human leukocyte antigen class I (HLA class I) binding. Specifically, we used the netMHCpan-4.1 algorithm for 305 HLA class I alleles (Supplemental Table 2) covering 98% of the recruited population (28,104 individuals with a 4-digit HLA class I code or higher), as documented in the Allele Frequency Net Database. The database initially contained 82,130 samples, of which the 28,104 samples with HLA class I data in a 4-digit code or higher for the A, B, and C loci were used. Any alleles represented by a 6-digit code or higher were convoluted to a 4-digit code. Peptides 8 amino acids long were input directly into the algorithm, while peptides longer than 8 amino acids were truncated to lengths of 8, 9, and 10. For HLA class I alleles that occurred multiple times, the one with the lowest EL rank was selected. Only HLA variants with an EL rank lower than or equal to the 0.5 threshold were considered.
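The normalization and filtering steps above can be sketched in Python as follows. This is an illustration only, not the authors' bash and R scripts. All function names and input shapes are hypothetical. Two points are assumptions rather than statements from the text: a peptide missing from a replicate is counted as zero, and truncation keeps the C-terminus while trimming N-terminal residues. The EL ranks are taken as given input; the sketch does not call netMHCpan.

```python
from statistics import mean

EL_RANK_THRESHOLD = 0.5  # cut-off stated in the text


def relative_abundance(replicates):
    """Average TIC-normalized abundance over replicates.

    replicates: list of dicts mapping peptide -> ion current (one dict per replicate).
    Assumption: a peptide absent from a replicate contributes 0 for that replicate.
    """
    peptides = set().union(*replicates)
    normalized = []
    for rep in replicates:
        tic = sum(rep.values())  # total ion current over observed RBD peptides
        normalized.append({p: rep.get(p, 0.0) / tic for p in peptides})
    return {p: mean(rep[p] for rep in normalized) for p in peptides}


def to_four_digit(allele):
    """Collapse e.g. 'HLA-A*02:01:01' to the 4-digit (two-field) code 'HLA-A*02:01'."""
    gene, fields = allele.split("*")
    return gene + "*" + ":".join(fields.split(":")[:2])


def candidate_cores(peptide, lengths=(8, 9, 10)):
    """Truncated variants of a peptide submitted for binding prediction.

    Assumption: truncation removes N-terminal residues and keeps the C-terminus;
    the text only states that longer peptides were truncated to 8, 9 and 10.
    """
    return [peptide[-n:] for n in lengths if n <= len(peptide)]


def positive_alleles(predictions, threshold=EL_RANK_THRESHOLD):
    """Keep the lowest EL rank per allele, then apply the threshold.

    predictions: iterable of (allele, el_rank) pairs for one peptide,
    pooled over all of its truncated variants.
    """
    best = {}
    for allele, rank in predictions:
        if allele not in best or rank < best[allele]:
            best[allele] = rank
    return {a: r for a, r in best.items() if r <= threshold}
```

In this sketch an allele counts as positive for a peptide if any truncated variant reaches the threshold, which is one reading of "the one with the lowest EL rank was selected".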
For the 237 peptides positive for HLA class I binding, the average level of protection and the percentage of individuals with at least one positive HLA class I allele (1-6) in each country were calculated (Supplemental Table 3). CD8+ T cell epitope sequences within the RBD were extracted from the IEDB (Supplemental Table 5). The N-terminally extended proteasome-generated peptides were assigned to the core CD8+ T cell epitopes (Supplemental Table 6) and then manually supplemented with data on positive CD8+ T cell assays (Supplemental Table 7). The numbers of COVID-19 cases and related deaths were calculated from the file dated 20.12.2023 (Supplemental Table 9). Spearman correlation with the ratio of confirmed COVID-19-caused deaths to registered COVID-19 cases, before or after 12/01/21, was calculated for each peptide (Supplemental Table 7). The analysis was performed using custom bash and R scripts, available on request.
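As a rough illustration of the population-level step, the Python sketch below computes the percentage of individuals carrying at least one positive allele and a Spearman rank correlation. It is not the authors' code, and the function names and input shapes are hypothetical. The correlation is computed as Pearson correlation on average ranks, the standard tie-aware definition. Pairing per-country percentages with per-country deaths/cases ratios is an assumption about how the inputs were arranged.

```python
from math import sqrt


def percent_with_positive(individuals, positive):
    """Percentage of individuals with at least one allele in the positive set.

    individuals: list of allele lists (up to six HLA class I alleles per person).
    positive: set of alleles that bind the peptide of interest.
    """
    hits = sum(1 for alleles in individuals if positive.intersection(alleles))
    return 100.0 * hits / len(individuals)


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors.

    Undefined (division by zero) if either input is constant.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)
```

Under these assumptions, `x` would hold one `percent_with_positive` value per country for a given peptide and `y` the matching deaths/cases ratios for the chosen period.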


Proteasome, Severe Acute Respiratory Syndrome Coronavirus 2, COVID-19 Therapy, COVID-19 Mortality, COVID-19 Vaccine


Russian Science Foundation