Viral proteins length distributions: A comparative analysis

M.M.F. de Lima, M.O. Costa, R. Silva, U.L. Fulco, J.I.N. Oliveira, M.S. Vasconcelos, D.H.A.L. Anselmo
DOI: https://doi.org/10.1016/j.physa.2023.129367
2024-01-01
Abstract: In this work, we perform a comparative analysis of the length distributions of proteins belonging to two virus families, namely Flaviviridae and Coronaviridae. Both families of viruses are highly contagious and represent severe threats to global health systems. To perform the comparative analysis, we retrieved data from the Virus Pathogen Database and Analysis Resource (ViPR) and the National Center for Biotechnology Information (NCBI) databases. In our investigation, we considered four distinct Cumulative Distribution Functions (CDFs) of protein length: the q-Exponential, q-Gaussian, q-Weibull, and κ-Maxwellian CDFs. Our results show that, of the four, the q-Weibull CDF is the least appropriate for describing the length distributions. In contrast, the q-Exponential, q-Gaussian, and κ-Maxwellian distribution functions fit the cumulative distributions of viral protein lengths with remarkable agreement. Furthermore, our findings suggest that the correlations for viral protein lengths are identical for both species, even though their average lengths differ.
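To make the fitting idea concrete, here is a minimal sketch of comparing a q-Exponential CDF against an empirical CDF of protein lengths. It is an illustration under stated assumptions, not the authors' procedure: the abstract does not give the functional forms used, so the code adopts one common parametrization, F(x) = 1 - [1 - (1 - q)λx]^((2 - q)/(1 - q)) for x ≥ 0 and q < 2, which reduces to 1 - exp(-λx) as q → 1. The function names, the coarse grid search, and the sample lengths are all hypothetical.

```python
import math


def q_exponential_cdf(x, q, lam):
    """q-Exponential CDF in one common parametrization (assumed, q < 2, lam > 0)."""
    if x <= 0:
        return 0.0
    if abs(q - 1.0) < 1e-12:
        # q -> 1 recovers the ordinary exponential distribution.
        return 1.0 - math.exp(-lam * x)
    base = 1.0 - (1.0 - q) * lam * x
    if base <= 0.0:
        # For q < 1 the support is bounded, so the CDF saturates at 1.
        return 1.0
    return 1.0 - base ** ((2.0 - q) / (1.0 - q))


def empirical_cdf(lengths):
    """Return sorted (length, cumulative fraction) pairs."""
    xs = sorted(lengths)
    n = len(xs)
    return [(x, (i + 1) / n) for i, x in enumerate(xs)]


def sse(lengths, q, lam):
    """Sum of squared differences between model CDF and empirical CDF."""
    return sum(
        (q_exponential_cdf(x, q, lam) - p) ** 2
        for x, p in empirical_cdf(lengths)
    )


def fit_grid(lengths, q_values, lam_values):
    """Coarse grid search (a stand-in for a proper optimizer)."""
    return min(
        ((q, lam) for q in q_values for lam in lam_values),
        key=lambda pair: sse(lengths, pair[0], pair[1]),
    )


if __name__ == "__main__":
    # Hypothetical protein lengths (amino acids); not data from the paper.
    sample = [75, 110, 130, 220, 250, 350, 495, 620, 900, 1270]
    q_grid = [1.0 + 0.05 * k for k in range(0, 19)]      # 1.00 .. 1.90
    lam_grid = [0.0005 * k for k in range(1, 41)]        # 0.0005 .. 0.02
    q_best, lam_best = fit_grid(sample, q_grid, lam_grid)
    print("best q:", q_best, "best lambda:", lam_best)
```

The same pattern would extend to the other three candidate CDFs by swapping in a different model function and comparing the resulting fit errors.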
physics, multidisciplinary