Department of Microbiology, University of Dhaka, Dhaka, 1000, Bangladesh.
Department of Gynecology, Obstetrics and Reproductive Health, Bangabandhu Sheikh Mujibur Rahman Agricultural University, Gazipur, 1706, Bangladesh.
Sci Rep. 2020 Aug 19;10(1):14004. doi: 10.1038/s41598-020-70812-6.
Severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2), a novel, evolutionarily divergent RNA virus, is responsible for the present devastating COVID-19 pandemic. To explore its genomic signatures, we comprehensively analyzed 2,492 complete and/or near-complete genome sequences of SARS-CoV-2 strains reported from across the globe to the GISAID database up to 30 March 2020. Genome-wide annotations revealed 1,516 nucleotide-level variations at different positions throughout the entire SARS-CoV-2 genome. Moreover, nucleotide (nt) deletion analysis found twelve deletion sites in addition to the previously reported deletions in the coding sequences of the ORF8 (open reading frame), spike, and ORF7a proteins, specifically in polyprotein ORF1ab (n = 9), ORF10 (n = 1), and the 3′-UTR (n = 2). Systematic gene-level mutational and protein profile analyses revealed a large number of amino acid (aa) substitutions (n = 744), demonstrating that the viral proteins are heterogeneous. Notably, residues of the receptor-binding domain (RBD) showing crucial interactions with angiotensin-converting enzyme 2 (ACE2) and a cross-reacting neutralizing antibody were found to be conserved among the analyzed virus strains, except for the replacement of lysine with arginine at position 378 of the cryptic epitope in a Shanghai isolate, hCoV-19/Shanghai/SH0007/2020 (EPI_ISL_416320). Furthermore, our preliminary epidemiological data on SARS-CoV-2 infections revealed that the frequency of aa mutations was relatively higher in the SARS-CoV-2 genome sequences from Europe (43.07%), followed by Asia (38.09%) and North America (29.64%), while case fatality rates remained higher in the European temperate countries, such as Italy, Spain, Netherlands, France, England and Belgium.
Thus, the present method of genome annotation employed at this early pandemic stage could be a promising tool for monitoring and tracking the continuously evolving pandemic situation, the associated genetic variants, and their implications for the development of effective control and prophylaxis strategies.
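As an illustration of the kind of nucleotide-level comparison summarized above, the following is a minimal sketch, not the authors' pipeline. It assumes a sample sequence that has already been aligned to a gap-free reference of equal length, with "-" marking deleted bases. The function name `call_variants` and the toy sequences are hypothetical and are not taken from the study.

```python
from typing import List, Tuple

Substitution = Tuple[int, str, str]   # (1-based position, reference base, sample base)
Deletion = Tuple[int, int]            # (1-based start, 1-based end), inclusive


def call_variants(reference: str, sample: str) -> Tuple[List[Substitution], List[Deletion]]:
    """List substitutions and deletion runs in an aligned sample.

    Assumption: `reference` contains no gaps, so alignment columns
    coincide with reference coordinates.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")

    substitutions: List[Substitution] = []
    deletions: List[Deletion] = []
    run_start = None  # start of the deletion run currently being read, if any

    for pos, (ref_base, alt_base) in enumerate(zip(reference.upper(), sample.upper()), start=1):
        if alt_base == "-":
            # Extend or open a run of deleted bases.
            if run_start is None:
                run_start = pos
            continue
        if run_start is not None:
            # A non-gap base closes the open deletion run.
            deletions.append((run_start, pos - 1))
            run_start = None
        # Skip ambiguous calls (N) so they are not counted as variation.
        if alt_base != ref_base and alt_base != "N":
            substitutions.append((pos, ref_base, alt_base))

    if run_start is not None:
        deletions.append((run_start, len(sample)))

    return substitutions, deletions


# Toy example (hypothetical sequences, not SARS-CoV-2 data).
toy_reference = "ATGGCTAAGCTT"
toy_sample = "ATGACT---CTT"
toy_substitutions, toy_deletions = call_variants(toy_reference, toy_sample)
```

Applied to many genomes, the per-sample lists could be pooled to count distinct variant positions. A real analysis would also need to handle insertions, sequencing ambiguity codes, and the mapping of positions to annotated genes.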