Proteins with amino acid repeats constitute a rapidly evolvable and human-specific essentialome. Anjali K Singh et al.

Published: 19 June 2023 | Version 1 | DOI: 10.17632/v7nmy2296k.1

Description

Essential genes, and their protein products, are necessary for organismal survival. Owing to their functional indispensability, these proteins are conserved and perform fundamental biological processes. In contrast, proteins that contain stretches of identical amino acids (referred to as homorepeats) tend to evolve rapidly, as repetitive sequences are hypermutable. Strikingly, eukaryotic essentialomes are enriched for proteins with amino acid homorepeats. This presents a function-versus-evolution paradox: why are hypermutable homorepeats prevalent in essential proteins with conserved functionalities? In this study, we resolve this paradox through systems-level analyses in humans, assembling and integrating 60 publicly available large-scale datasets spanning sequence, structure, molecular networks, regulation, spatio-temporal expression, phenotype and evolution. We found that essential proteins with homorepeats (i) show molecular pleiotropy, i.e., they are multifunctional and bring about cross-talk between processes; (ii) participate in regulatory processes and bring about expansive regulation at the genomic and transcriptomic levels, whereas essential proteins without repeats are predominantly involved in housekeeping functions; (iii) evolve rapidly, with amino acid substitutions often affecting functionally important sites, such as post-translational modification sites, aiding rapid network rewiring; and (iv) are involved in dynamic and human-specific temporal regulation of human embryonic and brain development, facilitating the emergence and evolution of human-specific processes. The presence of repeats in essentialomes brings about a trade-off between robustness and evolvability, facilitating rapid adaptability. Here, we have uploaded all molecular interaction networks that were compiled and investigated in this study.
We have also added the source code to facilitate reproduction of this work and to address other fundamental systems- and molecular-level questions. Usage instructions for the code are provided in the individual scripts.
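
For readers unfamiliar with the term, the definition of a homorepeat given above (a stretch of identical amino acids) can be illustrated with a minimal Python sketch. This is not taken from the uploaded scripts: the function name `find_homorepeats`, the example sequence, and the minimum run length of 5 residues are all assumptions made for illustration, and the study's own threshold may differ.

```python
import re


def find_homorepeats(sequence, min_len=5):
    """Return (residue, start_index, run_length) for each run of identical residues.

    min_len is a hypothetical threshold chosen for illustration only;
    it is not the cutoff used in the study.
    """
    if min_len < 1:
        raise ValueError("min_len must be at least 1")
    # One residue followed by at least (min_len - 1) copies of itself.
    pattern = re.compile(r"(.)\1{%d,}" % (min_len - 1))
    return [
        (match.group(1), match.start(), match.end() - match.start())
        for match in pattern.finditer(sequence.upper())
    ]


if __name__ == "__main__":
    # Toy sequence (made up) with a run of six Q and a run of five A.
    for residue, start, length in find_homorepeats("MAQQQQQQKLAAAAAG"):
        print(residue, start, length)
```

Start positions are 0-based. Raising `min_len` makes the scan stricter; which cutoff is appropriate depends on the analysis and should be taken from the individual scripts rather than from this sketch.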

Categories

Evolutionary Biology, Developmental Biology, Systems Biology
