?:abstract
|
-
Since it was first identified in December 2019, COVID-19 has spread across the world within a few months, and cases are still surging rapidly in most countries. The causative agent, severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), adapts and evolves rapidly in nature. Of the 16,092 SARS-CoV-2 full genomes available in GISAID as of May 13th, we removed the poor-quality genomes and performed mutational profiling analysis on the remaining 11,183 viral genomes. Global analysis of all sequences identified all single nucleotide polymorphisms (SNPs) across the whole genome, as well as the critical high-frequency SNPs that contribute to a five-clade classification of global strains. A total of 119 SNPs were found, comprising 74 non-synonymous mutations, 43 synonymous mutations, and 2 mutations in intergenic regions. Analysis of the geographical pattern of mutational profiles across the whole genome reveals differences between continents. The C-to-T transition is the most common mutation type across the genome, suggesting rapid evolution and adaptation of the virus in the host. Amino acid (AA) deletions and insertions found across the genome result in changes in viral protein length and potential alterations in function. Mutational profiling of each gene shows that the nucleocapsid gene has the highest mutational frequency, followed by the Nsp2, Nsp3, and spike genes. We further focused on the distribution of non-synonymous mutations in four key viral proteins: spike with 75 mutations, RNA-dependent RNA polymerase with 41 mutations, 3C-like protease with 22 mutations, and papain-like protease with 10 mutations. The results show that non-synonymous mutations at critical sites of these four proteins pose a great challenge for the development of anti-viral drugs and other countermeasures.
Overall, this study improves understanding of the genetic diversity and variability of SARS-CoV-2 and offers insights for the development of anti-viral therapeutics.
|