Research Article Volume 7 Issue 5
1Natural and Medical Sciences Research Center, University of Nizwa, Oman
2Department of Botany, North Orissa University, India
Correspondence: Tapan Kumar Mohanta, Natural and Medical Sciences Research Center, University of Nizwa, Nizwa, 616, Oman
Received: August 25, 2020 | Published: October 8, 2020
Citation: Mohanta TK, Mohanta YK. Corona virus (CoVid19) genome: genomic and biochemical analysis revealed its possible synthetic origin. J Appl Biotechnol Bioeng 2020;7(5):200-213. DOI: 10.15406/jabb.2020.07.00235
The severe acute respiratory syndrome corona virus 2 (SARS-CoV-2) epidemic has become a global pandemic. It has emerged as a curse on human civilization, and at present most of the cities in the world are under lockdown. The first genome sequence data of SARS-CoV-2 (CoVid19), and the reports that followed, concluded that it is a member of the genus Betacoronavirus and has a bat reservoir. To understand its origin and evolution, we conducted a deep comparative study of the genomes of bat SARS CoV and other SARS CoVs (including the human SARS CoV German isolate). The results revealed that CoVid19 genomes from isolates of China, India, Italy, Nepal, and the United States of America have a sequence similarity of only 79-80% with the bat SARS CoV and approximately 60% with the human SARS CoV German isolate, whereas the sequence similarity among the CoVid19 genomes of these countries was 99-100%. If the SARS CoV infection had passed to humans through a SARS CoV of bat origin, the sequence similarity should have been more than 99%, which was not the case. Phylogenetic analysis revealed that the bat SARS CoV did not group with the SARS CoV isolates of China, India, Italy, Nepal, and the USA. Genome analysis revealed the presence of multiple microsatellite repeat sequences. Proteome analysis revealed that the melting temperature (Tm) of the membrane glycoprotein was less than 55 °C, suggesting that steam treatment could be an ideal preventive measure to destabilize CoVid19 and thus limit its spread.
Keywords: SARS, corona virus, SARS-CoV-2, CoVid19, MERS, epidemic, pandemic
Abbreviations: SARS, severe acute respiratory syndrome; Tm, melting temperature; CoV, corona virus
Severe acute respiratory syndrome (SARS) corona virus 2 belongs to the family Coronaviridae of the order Nidovirales. It contains a positive-sense, single-stranded RNA genome. The genome encodes the overlapping polyprotein ORF1ab, surface glycoprotein, ORF3a, envelope protein, membrane glycoprotein, ORF6, ORF7a, ORF8, nucleocapsid phosphoprotein, and ORF10. ORF1ab is processed into the viral polymerase. Corona viruses (CoVs) cause disease in a variety of wild and domestic animals as well as in humans. The α- and β-CoVs usually infect mammals, and the γ- and δ-CoVs infect birds.1 The β-CoVs, including Middle East respiratory syndrome (MERS)-CoV and severe acute respiratory syndrome (SARS)-CoV, have caused global pandemics since 2002-2003. SARS-CoV originated in China and created a global pandemic, infecting more than 8000 individuals with a mortality rate of 10%.2 Since then, CoV-related disease has been a dangerous threat to human civilization. Recently, SARS-CoV-2 (known as CoVid19) broke out from Wuhan city in China and spread all over the world, infecting more than 23.8 million people and causing more than 817000 deaths so far (25th August 2020). This has resulted in a severe global pandemic and has shaken the world economy. At present, the whole world is locked down to stop the human-to-human spread of SARS-CoV-2, and we do not know when this situation will be overcome. The current lack of a medicine or vaccine has left the disease uncontrolled. Although several research laboratories around the world are actively working on different strategies to control SARS-CoV-2, several misconceptions and misrepresentations about the genomics, mutation, and evolution of the SARS-CoV-2 genome circulate in the public domain.
A few of the general misconceptions regarding SARS-CoV-2 are as follows: SARS-CoV-2 evolved from a bat or pangolin corona virus;3 two CoV genomes merged to become SARS-CoV-2; SARS-CoV-2 was synthesized in the laboratory for use as a bio-weapon; SARS-CoV-2 is more effective in cold-weather countries; and garlic and hot water reduce the effectiveness of SARS-CoV-2. To address these highly important aspects, we conducted genomic, proteomic, and evolutionary studies to understand the mutation and biochemical features of SARS-CoV-2 genomes and proteomes.
SARS-CoV-2 genome does not have significant similarity with bat or pangolin SARS corona virus
The size of the SARS-CoV-2 genome varies from 27317 to 29903 nucleotides, with a GC content of 38 to 38.3% (Table 1). The SARS-CoV-2 genome encodes 10 to 12 coding sequences (CDS). The China Wuhan-Hu-1 isolate (accession NC_045512.2) contained 12 CDS, whereas the other SARS-CoV-2 genomes encode 10 CDS (Table 1). The human SARS corona virus reported in 2003 in Germany also encoded 12 CDS (Table 1). To understand the genomic and evolutionary aspects of the SARS-CoV-2 genome, we downloaded 24 whole genome sequences of SARS-CoV-2 originating from different countries of the world. These include 12 SARS-CoV-2 genomes from China, seven from the United States of America (USA), and one each from Canada, Germany, India, Italy, Nepal, and the United Kingdom. The human corona virus genome of Germany (accession number NC_002645.1) was also considered as a reference for the comparative study, as it was reported long ago, in 2003.4 We compared the sequence similarity of the recent SARS-CoV-2 genomes with the human corona virus isolate of German origin. The SARS-CoV-2 genomes had a sequence similarity of 60.13% (Nepal CoVid19) to 60.24% (China SARS-CoV-2 Wuhan-Hu-1) with the German-origin human corona virus (Table 2). Comparison of the SARS-CoV-2 genomes with bat SARS HKU3-1 showed similarity ranging from 79.44% (CoVid19 USA) to 79.78% (CoVid19 China Wuhan-Hu-1) (Table 2). Comparison with bat SARS WIV1 showed similarity ranging from 60.01% (SARS CoV Germany) to 80.10% (SARS-CoV-2 China Wuhan-Hu-1 and CoVid19 USA). MERS CoV showed sequence similarity ranging from 54.59% with SARS-CoV-2 India to 61.25% with SARS-CoV-2 China Wuhan-Hu-1. SARS-CoV-2 Wuhan-Hu-1 originated recently in Wuhan and was found in CoVid19 patients (Table 2). Therefore, we compared the SARS-CoV-2 Wuhan-Hu-1 isolate with the SARS-CoV-2 isolates of other countries.
We found that SARS-CoV-2 Wuhan-Hu-1 had 100% sequence similarity with the SARS-CoV-2 isolate of Nepal, followed by 99.99% (Italy and the USA), 99.98% (India), and 60.24% (Germany) (Table 2). To date (4th April 2020), 12 CoVid19 genome sequences of Chinese origin were available. Therefore, we made a comparative study by aligning the full-length genome sequences of all 12 Chinese SARS-CoV-2 genomes. The Chinese CoVid19 Wuhan-Hu-1 has a 5'-untranslated region (UTR) from nucleotide 1 to 265 and a 3'-UTR from nucleotide 29675 to 29903. The alignment showed slight differences in the 5'- and 3'-UTRs, and no mutation or substitution was found in the open reading frames (ORFs) of the SARS-CoV-2 genomes of the Chinese isolates (Supplementary Figure 1). Similarly, there were seven SARS-CoV-2 isolates originating from the United States of America (USA). We aligned all seven CoVid19 genomes of the USA isolates to find possible mutations or substitutions in them. The results showed that all seven genomes had 100% sequence similarity, and no mutation or substitution was found within them (Supplementary Figure 2). All the 5'- and 3'-UTRs were also found to be conserved (Supplementary Figure 2). Later, we aligned the recently reported SARS-CoV-2 sequences of China Wuhan-Hu-1, India, Italy, Nepal, and the USA. The Indian isolate had G instead of A at position 1671; the Italian isolate had T instead of A at position 2269; the Indian isolate had T instead of C at position 6481; India and the USA had T instead of C at positions 8762 and 8782, respectively; Italy had an unknown nucleotide N instead of G at position 11083; India had T instead of C at position 16857; Nepal had T instead of C at position 24019; India had T instead of C at position 24331; and Italy had T instead of G at position 26144 (Supplementary Figure 3).
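The position-by-position comparison described above can be illustrated with a minimal Python sketch. It assumes that the two genomes have already been aligned to equal length by an external alignment tool; the function name `list_substitutions` and the toy sequences are hypothetical and are not taken from the original analysis.

```python
from typing import List, Tuple


def list_substitutions(reference: str, query: str) -> List[Tuple[int, str, str]]:
    """Return (1-based alignment column, reference base, query base) per mismatch.

    Both sequences must already be aligned to the same length. Columns that
    contain a gap ('-') in either sequence are skipped, so only substitutions
    (including ambiguous bases such as N) are reported.
    """
    if len(reference) != len(query):
        raise ValueError("sequences must be aligned to equal length")
    differences = []
    for index, (ref_base, query_base) in enumerate(zip(reference.upper(), query.upper())):
        if "-" in (ref_base, query_base):
            continue
        if ref_base != query_base:
            differences.append((index + 1, ref_base, query_base))
    return differences


# Toy aligned fragments (hypothetical, not real SARS-CoV-2 sequence).
toy_reference = "ATGACCGTA"
toy_query = "ATGGCCNTA"
toy_differences = list_substitutions(toy_reference, toy_query)
```

Positions are reported as alignment columns, so they correspond to genome coordinates only when the reference row contains no gaps upstream of the reported column.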
SARS-CoV-2 strain | Accession number | Genome size (Mb) | GC content (%) | Number of proteins
China (Wuhan-Hu-1) | MN908947.3 | 0.029903 | 38 | 10
China (Wuhan-Hu-1) | NC_045512.2 | 0.029903 | 38 | 12
India | MT050493.1 | 0.029851 | 38 | 10
Germany | NC_002645.1 | 0.027317 | 38.3 | 12
Italy | MT066156.1 | 0.029867 | 38 | 10
Nepal | MT072688.1 | 0.029811 | 38 | 10
United States of America | MN985325.1 | 0.029882 | 38 | 10
Table 1 Genomic details of different isolates of SARS corona virus 2 from different countries of the world
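The genome size and GC content columns of Table 1 are simple functions of the nucleotide sequence. The following Python sketch shows one way such values could be computed; the function names are hypothetical, and ignoring ambiguous bases (such as N) in the GC calculation is an assumption rather than a documented choice of the original analysis.

```python
def gc_content_percent(sequence: str) -> float:
    """Percentage of G and C among unambiguous bases (A, C, G, T, U)."""
    counted = [base for base in sequence.upper() if base in "ACGTU"]
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    gc_count = sum(1 for base in counted if base in "GC")
    return 100.0 * gc_count / len(counted)


def length_in_megabases(n_nucleotides: int) -> float:
    """Convert a genome length in nucleotides to megabases (Mb)."""
    return n_nucleotides / 1_000_000


# The 29903-nucleotide Wuhan-Hu-1 genome corresponds to 0.029903 Mb in Table 1.
wuhan_size_mb = length_in_megabases(29903)
```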
Corona virus isolate (country) | Accession number | Similarity with human CoV, German isolate (%) | Similarity with bat SARS HKU3-1 (%) | Similarity with bat SARS WIV1 (%) | Similarity with MERS corona virus (%) | Similarity with SARS-CoV-2 Wuhan-Hu-1 (%)
China (Wuhan-Hu-1) | MN908947.3 | 60.24 | 79.78 | 80.1 | 61.25 | *****
India | MT050493.1 | 60.15 | 79.74 | 80.07 | 54.59 | 99.98
Germany | NC_002645.1 | **** | 59.77 | 60.01 | 60.24 | 60.24
Italy | MT066156.1 | 60.17 | 79.76 | 80.08 | 61.23 | 99.99
Nepal | MT072688.1 | 60.13 | 79.74 | 79.67 | 54.65 | 100
United States of America | MN985325.1 | 60.19 | 79.44 | 80.1 | 61.15 | 99.99
Table 2 Comparative genomic analysis of SARS-CoV-2 isolates from different countries with SARS corona virus of source organism
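A percentage similarity of the kind reported in Table 2 can be obtained from a pairwise alignment by counting matching columns. The sketch below is only an illustration: the function name `percent_identity` is hypothetical, and excluding gapped columns from the denominator is an assumption, since identity definitions differ between tools and the tool settings used for Table 2 are not stated here.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of identical residues over columns without gaps.

    The two strings must be rows of the same pairwise alignment. Columns
    with a gap ('-') in either row are excluded from the comparison.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to equal length")
    compared = 0
    matches = 0
    for base_a, base_b in zip(aligned_a.upper(), aligned_b.upper()):
        if base_a == "-" or base_b == "-":
            continue
        compared += 1
        if base_a == base_b:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared
```

For example, two toy ten-base rows that differ at a single column give an identity of 90%.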
SARS CoVid19 genome is closer to the Bat SARS corona virus genome
To understand the evolutionary linkage of human SARS-CoV-2 with bat SARS CoV and other SARS CoVs, we constructed a phylogenetic tree from the whole genome sequences of the SARS CoVs. The study included five SARS-CoV-2 (CoVid19) isolates from different countries whose genomes were reported recently. In addition, it included the genome sequences of bat CoV, beta SARS CoV of Canada, MERS CoV, United Kingdom beta CoV, bovine CoV, and human CoV 229E (German isolate). The bat SARS CoV HKU3-1 and bat SARS CoV WIV-1 were found close to the human SARS-CoV-2 but fell in a separate group (Figure 1). The bat SARS CoV genome did not group with SARS-CoV-2 (CoVid19). However, none of the other CoVs were found closer to the human SARS-CoV-2. The time tree analysis of SARS-CoV-2 genomes revealed their origin at 0.00 million years ago, suggesting a recent origin (Figure 2). Recombination analysis of SARS-CoV-2 with other SARS CoV genomes showed no recombination event within the SARS-CoV-2 genomes or between them and other SARS CoVs (Figure 3). To understand the nucleotide substitution, a maximum composite likelihood estimate of the pattern of nucleotide substitution was conducted. It showed a higher rate of transition than of transversion (Table 3). The substitution of T to C was 58.98 and the substitution of C to T was 34.06 (Table 3). Among the purines, the substitution of A to G was 1 and the substitution of G to A was 0.72. The substitution of A to C/T or G to C/T, and vice versa, was less than one (Table 3). For the SARS-CoV-2 genomes of the isolates from China Wuhan-Hu-1, India, Italy, Nepal, and the USA, the transition rate from C to T was 26.82, whereas the transition rate from T to C was 46.86 (Table 3). However, the transversion rates were found to be below 3 (Table 3).
Base | A | T | C | G
A | - | 0.85 | 0.49 | 0.72
T | 0.75 | - | 34.06 | 0.54
C | 0.75 | 58.98 | - | 0.54
G | 1 | 0.85 | 0.49 | -
Substitution of SARS-CoV-2 isolates of China, India, Italy, Nepal, and USA
Base | A | T | C | G
A | - | 2.77 | 1.58 | 3.6
T | 2.57 | - | 26.82 | 1.69
C | 2.57 | 46.86 | - | 1.69
G | 5.48 | 2.77 | 1.58 | -
Table 3 Maximum composite likelihood estimate of the pattern of nucleotide substitution of SARS CoV genomes
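The predominance of transitions over transversions in Table 3 can be checked by summing the corresponding cells. The Python sketch below copies the values of Table 3 in the printed A, T, C, G order; the variable and function names are hypothetical. Because both directions of each base pair are summed, the totals do not depend on whether rows or columns are read as the original base.

```python
BASES = ("A", "T", "C", "G")

# Values copied from Table 3; None marks the diagonal.
ALL_SARS_COV = [
    [None, 0.85, 0.49, 0.72],
    [0.75, None, 34.06, 0.54],
    [0.75, 58.98, None, 0.54],
    [1.00, 0.85, 0.49, None],
]
SARS_COV_2_ISOLATES = [
    [None, 2.77, 1.58, 3.60],
    [2.57, None, 26.82, 1.69],
    [2.57, 46.86, None, 1.69],
    [5.48, 2.77, 1.58, None],
]

# Transitions are purine<->purine (A, G) or pyrimidine<->pyrimidine (C, T).
TRANSITION_PAIRS = {frozenset("AG"), frozenset("CT")}


def transition_transversion_totals(matrix):
    """Return (sum of transition cells, sum of transversion cells)."""
    transitions = 0.0
    transversions = 0.0
    for i, row_base in enumerate(BASES):
        for j, column_base in enumerate(BASES):
            if i == j:
                continue
            if frozenset((row_base, column_base)) in TRANSITION_PAIRS:
                transitions += matrix[i][j]
            else:
                transversions += matrix[i][j]
    return transitions, transversions


all_ts, all_tv = transition_transversion_totals(ALL_SARS_COV)
cov2_ts, cov2_tv = transition_transversion_totals(SARS_COV_2_ISOLATES)
```

With the tabulated values, the transition cells sum to about 94.76 in the first matrix and 82.76 in the second, against about 5.26 and 17.22 for the transversion cells.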
SARS-CoV-2 genome contains microsatellite repeats
Microsatellites are repetitive DNA motifs ranging in length from one to six or more nucleotides. Analysis revealed the presence of at least 34 unique microsatellite repeat sequences in the SARS-CoV-2 genome (Supplementary Table 1). The microsatellite repeat sequences TGTGTG and ACACAC were each found 12 times, GTGTGT nine times, and ATATAT and CACACA eight times each (Supplementary Table 1). The microsatellite sequences were mapped to the CDS of the CoVid19 genome and were found in ORF1ab, surface glycoprotein, envelope protein, ORF3a, and nucleocapsid phosphoprotein. The microsatellite repeat sequence GTGTGTGTGT found at position 20486 did not map to a CDS, suggesting that it occurs in a non-coding region. ORF6, ORF7a, and ORF8 did not have any microsatellite repeats. Microsatellites present in coding regions might cause phenotypic change and disease.
Repeats in CoVid19 | Total No. of Repeats | Position | Mapped in ORF
TGTGTG | 12 | 84, 1489, 2327, 4438, 10844, 11546, 14827, 15442, 15728, 16483, 20486, 26359 | ORF1ab, Envelope protein
ACACAC | 12 | 298, 4571, 6188, 8954, 9116, 10999, 12917, 13162, 13661, 16213, 18111, 18553 | ORF1ab, Surface glycoprotein
TTCTTCTTC | 2 | 626, 22320 | ORF1ab, Surface glycoprotein
AAAAAA | 7 | 1813, 11990, 29870 | ORF1ab
GTGTGT | 9 | 2421, 5515, 17508, 19055, 20486, 21603, 24654, 27458, 29687 | ORF1ab, Surface glycoprotein
GAAGAAGAA | 2 | 3055, 3073 | ORF1ab
AAGAAGAAG | 2 | 3188, 29389 | ORF1ab
GATGATGAT | 1 | 3205 | ORF1ab
ATATAT | 8 | 4116, 7254, 11727, 13777, 13948, 19903, 22168, 29593 | ORF1ab, Surface glycoprotein, ORF10
TATATA | 6 | 4237, 16510, 22664, 25186, 26660, 29563 | ORF1ab, Surface glycoprotein, ORF10
TCTCTC | 5 | 4666, 7813, 18566, 22073, 25147 | ORF1ab, Surface glycoprotein
CTTCTTCTT | 2 | 4736, 14756 | ORF1ab
AGAGAG | 5 | 4850, 6121, 14270, 14484, 22954 | ORF1ab, Surface glycoprotein
GAGAGA | 3 | 4950, 7674, 22954 | ORF1ab, Surface glycoprotein
CACACA | 8 | 5170, 6538, 13162, 19151, 19317, 24858, 26130, 29545 | ORF1ab, Surface glycoprotein, ORF3a
TCTCTCTCTC | 1 | 7813 | ORF1ab
TTTTTT | 4 | 9627, 11074, 19983, 21101 | ORF1ab
TTTTTTTT | 1 | 11074 | ORF1ab
ATGATGATG | 1 | 11366 | ORF1ab
ATCATCATC | 1 | 11910 | ORF1ab
ACACACAC | 1 | 13162 | ORF1ab
TGATGATGA | 1 | 13895 | ORF1ab
CTCTCT | 4 | 7813, 15711, 17122, 22445 | ORF1ab, Surface glycoprotein
GTGTGTGTGT | 1 | 20486 | NA
GAGAGAGA | 1 | 22954 | Surface glycoprotein
AGTAGTAGT | 1 | 23088 | Surface glycoprotein
TGTTGTTGT | 1 | 25642 | ORF3a
AATAATAAT | 1 | 25757 | ORF3a
CGACGACGA | 1 | 26191 | ORF3a
GTGGTGGTG | 1 | 28556 | Nucleocapsid phosphoprotein
TGCTGCTGC | 1 | 28934 | Nucleocapsid phosphoprotein
CAACAACAACAA | 1 | 28987 | Nucleocapsid phosphoprotein
CTGCTGCTG | 1 | 29021 | Nucleocapsid phosphoprotein
AAAAAAAAAAAAAAAA | 1 | 29870 | NA
Supplementary Table 1 Microsatellite repeats of SARS-CoV-2 genome
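Repeat positions of the kind listed in Supplementary Table 1 can be located with a simple pattern scan. The following Python sketch is a hypothetical illustration (the function name and the toy sequence are not from the original analysis). It reports overlapping occurrences, which is an assumption, since the counting rule of the repeat-finding tool used for the table is not described here.

```python
import re
from typing import List


def find_motif_positions(sequence: str, motif: str) -> List[int]:
    """Return 1-based start positions of every occurrence of motif.

    A zero-width lookahead is used so that overlapping occurrences
    (for example TGTGTG inside TGTGTGTG) are all reported.
    """
    pattern = re.compile("(?=" + re.escape(motif.upper()) + ")")
    return [match.start() + 1 for match in pattern.finditer(sequence.upper())]


# Toy sequence (hypothetical, not real SARS-CoV-2 sequence).
toy_sequence = "AATGTGTGTGCC"
tg_positions = find_motif_positions(toy_sequence, "TGTGTG")
gt_positions = find_motif_positions(toy_sequence, "GTGTGT")
```

In the toy sequence the same stretch of TG repeats is reported under both TGTGTG and GTGTGT, which mirrors how one genomic position can appear under several motifs in the table.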
A few CoVid19 proteins have undergone amino acid substitution/mutation
Multiple sequence alignment revealed that a few SARS-CoV-2 proteins have undergone substitution/mutation. In the ORF1ab of the Indian isolate, amino acid P (proline) was replaced by L (leucine) at position 2079 and amino acid T (threonine) was replaced by I (isoleucine) at position 5538 of the protein sequence (Table 4) (Supplementary Figure 4). In the ORF1ab of the Italian isolate, amino acid L was replaced by X (unknown) at position 3606 (Table 4) (Supplementary Figure 4). In the surface glycoprotein of the Indian isolate, amino acid A (alanine) at position 929 was replaced by V (valine) (Supplementary Figure 5). In ORF3a, amino acid G (glycine) at position 251 was replaced by V in the Italian isolate (Table 4) (Supplementary Figure 6). In ORF8, amino acid L was replaced by S (serine) at position 84 in the Indian and USA isolates (Table 4) (Supplementary Figure 7). No mutation or substitution was observed in envelope protein, membrane glycoprotein, nucleocapsid phosphoprotein, ORF6, ORF7a, or ORF10.
Name of the protein | Substitution (position in the sequence) | Substituted amino acid | Isolate of the Country
Envelope protein | NA | NA | NA
Membrane glycoprotein | NA | NA | NA
Nucleocapsid phosphoprotein | NA | NA | NA
ORF1ab | 2079 | P > L | India
ORF1ab | 3606 | L > X | Italy
ORF1ab | 5538 | T > I | India
Surface glycoprotein | 929 | A > V | India
ORF3a | 251 | G > V | Italy
ORF6 | NA | NA | NA
ORF7a | NA | NA | NA
ORF8 | 84 | L > S | India
ORF8 | 84 | L > S | USA
ORF10 | NA | NA | NA
Table 4 Substitution of SARS corona virus SARS-CoV-2 proteins of isolates
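To illustrate how a single nucleotide substitution can give rise to the amino acid changes listed in Table 4, the following Python sketch translates a codon before and after a base change using the standard genetic code. The function name and the example codons are hypothetical; they were chosen only because they encode the relevant amino acids and are not the actual codons at the reported positions.

```python
# Standard genetic code in T, C, A, G order ('*' marks a stop codon).
CODON_BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    first + second + third: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, first in enumerate(CODON_BASES)
    for j, second in enumerate(CODON_BASES)
    for k, third in enumerate(CODON_BASES)
}


def amino_acid_change(codon: str, position_in_codon: int, new_base: str):
    """Return (original amino acid, new amino acid) after one base change.

    position_in_codon is 1, 2, or 3. Codons containing ambiguous bases
    (for example N) translate to 'X', matching the L > X entry style.
    """
    codon = codon.upper()
    if len(codon) != 3 or position_in_codon not in (1, 2, 3):
        raise ValueError("expected a 3-base codon and a position of 1, 2, or 3")
    index = position_in_codon - 1
    mutated = codon[:index] + new_base.upper() + codon[index + 1:]
    return CODON_TABLE.get(codon, "X"), CODON_TABLE.get(mutated, "X")


# Hypothetical codons: a C-to-T change at the second codon position.
proline_to_leucine = amino_acid_change("CCT", 2, "T")
threonine_to_isoleucine = amino_acid_change("ACA", 2, "T")
```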
The melting temperature (Tm) of membrane glycoprotein is less than 55 °C
We studied the Tm of all ten proteins encoded by the SARS-CoV-2 (CoVid19) genome. The analysis revealed that the Tm of the membrane glycoprotein was less than 55 °C. The Tm of ORF1ab, surface glycoprotein, ORF3a, envelope protein, and nucleocapsid phosphoprotein was 55-65 °C (Supplementary Figure 8). However, the Tm of ORF6, ORF7a, ORF8, and ORF10 was greater than 65 °C (Supplementary Table 2). The half-life of all the proteins was 30 hours in reticulocytes (in vitro) and more than 20 hours in vivo (Supplementary Table 3). All the proteins were also found to be stable, and the stability of the nucleocapsid phosphoprotein was the highest (instability index 55.09). The nucleocapsid phosphoprotein was followed by ORF7a, ORF8, membrane glycoprotein, envelope protein, ORF1ab, surface glycoprotein, ORF3a, ORF6, and ORF10 (Supplementary Table 3).
Protein ID | Protein Name | Tm (°C) | Tm Index
China
YP_009724389.1 | ORF1ab Polyprotein | 55-65 | 0.563
YP_009725295.1 | ORF1a Polyprotein | 55-65 | 0.461
YP_009724390.1 | Surface Glycoprotein | 55-65 | 0.464
YP_009724391.1 | ORF3a | 55-65 | 0.224
YP_009724392.1 | Envelope | 55-65 | 0.52
YP_009724393.1 | Membrane Glycoprotein | < 55 | -0.34
YP_009724394.1 | ORF6 | > 65 | 1.055
YP_009724395.1 | ORF7a | > 65 | 2.968
YP_009725318.1 | ORF7b | < 55 | -0.82
YP_009724396.1 | ORF8 | > 65 | 1.465
YP_009724397.2 | Nucleocapsid Phosphoprotein | 55-65 | 0.318
YP_009725255.1 | ORF10 Protein | > 65 | 1.637
India
QIA98582.1 | ORF1ab Polyprotein | 55-65 | 0.567
QIA98583.1 | Surface Glycoprotein | 55-65 | 0.461
QIA98584.1 | ORF3a | 55-65 | 0.224
QIA98585.1 | Envelope Protein | 55-65 | 0.52
QIA98586.1 | Membrane Glycoprotein | < 55 | -0.34
QIA98587.1 | ORF6 | > 65 | 1.055
QIA98588.1 | ORF7a | > 65 | 1.773
QIA98589.1 | ORF8 | > 65 | 1.465
QIA98590.1 | Nucleocapsid Phosphoprotein | 55-65 | 0.318
QIA98591.1 | ORF10 | > 65 | 1.637
Italy
QIA98553.1 | ORF1ab polyprotein | |
QIA98554.1 | surface glycoprotein | 55-65 | 0.464
QIA98555.1 | ORF3a | 55-65 | 0.328
QIA98556.1 | envelope protein | 55-65 | 0.52
QIA98557.1 | membrane glycoprotein | < 55 | -0.34
QIA98558.1 | ORF6 | > 65 | 1.055
QIA98559.1 | ORF7a | > 65 | 1.773
QIA98560.1 | ORF8 | > 65 | 1.465
QIA98561.1 | nucleocapsid phosphoprotein | 55-65 | 0.318
QIA98562.1 | ORF10 | > 65 | 1.637
Nepal
QIB84672.1 | ORF1ab polyprotein | 55-65 | 0.563
QIB84673.1 | surface glycoprotein | 55-65 | 0.464
QIB84674.1 | ORF3a | 55-65 | 0.224
QIB84675.1 | Envelope protein | 55-65 | 0.52
QIB84676.1 | membrane glycoprotein | < 55 | -0.34
QIB84677.1 | ORF6 | > 65 | 1.055
QIB84678.1 | ORF7a | > 65 | 1.773
QIB84679.1 | ORF8 | > 65 | 1.465
QIB84680.1 | nucleocapsid phosphoprotein | 55-65 | 0.318
QIB84681.1 | ORF10 | > 65 | 1.637
United States of America
QHO60603.1 | ORF1ab polyprotein | 55-65 | 0.563
QHO60594.1 | surface glycoprotein | 55-65 | 0.464
QHO60595.1 | ORF3a | 55-65 | 0.224
QHO60596.1 | envelope protein | 55-65 | 0.52
QHO60597.1 | membrane glycoprotein | < 55 | -0.34
QHO60598.1 | ORF6 | > 65 | 1.055
QHO60599.1 | ORF7a | > 65 | 1.773
QHO60600.1 | ORF8 | > 65 | 1.465
QHO60601.1 | nucleocapsid phosphoprotein | 55-65 | 0.318
QHO60602.1 | ORF10 | > 65 | 1.637
Supplementary Table 2 Predicted melting temperature (Tm) of SARS-CoV-2 proteins
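The Tm ranges in Supplementary Table 2 follow the Tm index in a regular way: negative indices are reported as < 55 °C, indices between 0 and 1 as 55-65 °C, and indices above 1 as > 65 °C. The Python sketch below encodes this mapping. The cut-offs of 0 and 1 are inferred from the tabulated values and should be treated as an assumption about the prediction tool; the function name is hypothetical.

```python
def tm_range_from_index(tm_index: float) -> str:
    """Map a Tm index to the Tm range label used in Supplementary Table 2.

    Assumed cut-offs (inferred from the table, not from tool documentation):
    index < 0 -> '< 55', 0 <= index <= 1 -> '55-65', index > 1 -> '> 65'.
    """
    if tm_index < 0:
        return "< 55"
    if tm_index <= 1:
        return "55-65"
    return "> 65"


# Values taken from the China rows of Supplementary Table 2.
membrane_range = tm_range_from_index(-0.34)
surface_range = tm_range_from_index(0.464)
orf6_range = tm_range_from_index(1.055)
```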
Proteins | Isolate Country | Molecular formula | Half-life in reticulocytes/in vitro (Hrs) | Half-life in vivo (Hrs) | Instability Index (II)
Envelope protein | China | C390H625N91O103S4 | 30 | > 20 | 38.68/Stable
Envelope protein | India | C390H625N91O103S4 | 30 | > 20 | 38.68/Stable
Envelope protein | Italy | C390H625N91O103S4 | 30 | > 20 | 38.68/Stable
Envelope protein | Nepal | C390H625N91O103S4 | 30 | > 20 | 38.68/Stable
Envelope protein | USA | C390H625N91O103S4 | 30 | > 20 | 38.68/Stable
Membrane glycoprotein | China | C1165H1823N303O301S8 | 30 | > 20 | 39.14/Stable
Membrane glycoprotein | India | C1165H1823N303O301S8 | 30 | > 20 | 39.14/Stable
Membrane glycoprotein | Italy | C1165H1823N303O301S8 | 30 | > 20 | 39.14/Stable
Membrane glycoprotein | Nepal | C1165H1823N303O301S8 | 30 | > 20 | 39.14/Stable
Membrane glycoprotein | USA | C1165H1823N303O301S8 | 30 | > 20 | 39.14/Stable
Nucleocapsid phosphoprotein | China | C1971H3137N607O629S7 | 30 | > 20 | 55.09/Stable
Nucleocapsid phosphoprotein | India | C1971H3137N607O629S7 | 30 | > 20 | 55.09/Stable
Nucleocapsid phosphoprotein | Italy | C1971H3137N607O629S7 | 30 | > 20 | 55.09/Stable
Nucleocapsid phosphoprotein | Nepal | C1971H3137N607O629S7 | 30 | > 20 | 55.09/Stable
Nucleocapsid phosphoprotein | USA | C1971H3137N607O629S7 | 30 | > 20 | 55.09/Stable
ORF1ab | China | C35644H55333N9253O10496S394 | 30 | > 20 | 33.31/Stable
ORF1ab | India | C35646H55339N9253O10495S394 | 30 | > 20 | 33.25/Stable
ORF1ab | Italy | C35638H55322N9252O10495S394 | 30 | > 20 | 33.36/Stable
ORF1ab | Nepal | C35644H55333N9253O10496S394 | 30 | > 20 | 33.31/Stable
ORF1ab | USA | C35644H55333N9253O10496S394 | 30 | > 20 | 33.31/Stable
ORF3a | China | C1440H2189N343O404S11 | 30 | > 20 | 32.96/Stable
ORF3a | India | C1440H2189N343O404S11 | 30 | > 20 | 32.96/Stable
ORF3a | Italy | C1443H2195N343O404S11 | 30 | > 20 | 32.96/Stable
ORF3a | Nepal | C1440H2189N343O404S11 | 30 | > 20 | 32.96/Stable
ORF3a | USA | C1440H2189N343O404S11 | 30 | > 20 | 32.96/Stable
ORF6 | China | C334H532N78O96S3 | 30 | > 20 | 31.16/Stable
ORF6 | India | C334H532N78O96S3 | 30 | > 20 | 31.16/Stable
ORF6 | Italy | C334H532N78O96S3 | 30 | > 20 | 31.16/Stable
ORF6 | Nepal | C334H532N78O96S3 | 30 | > 20 | 31.16/Stable
ORF6 | USA | C334H532N78O96S3 | 30 | > 20 | 31.16/Stable
ORF7a | China | C633H988N156O171S7 | 30 | > 20 | 48.66/Stable
ORF7a | India | C633H988N156O171S7 | 30 | > 20 | 48.66/Stable
ORF7a | Italy | C633H988N156O171S7 | 30 | > 20 | 48.66/Stable
ORF7a | Nepal | C633H988N156O171S7 | 30 | > 20 | 48.66/Stable
ORF7a | USA | C633H988N156O171S7 | 30 | > 20 | 48.66/Stable
ORF8 | China | C633H961N155O177S8 | 30 | > 20 | 45.79/Stable
ORF8 | India | C630H955N155O178S8 | 30 | > 20 | 46.24/Stable
ORF8 | Italy | C633H961N155O177S8 | 30 | > 20 | 45.79/Stable
ORF8 | Nepal | C633H961N155O177S8 | 30 | > 20 | 45.79/Stable
ORF8 | USA | C630H955N155O178S8 | 30 | > 20 | 46.24/Stable
ORF10 | China | C206H312N50O54S3 | 30 | > 20 | 16.06/Stable
ORF10 | India | C206H312N50O54S3 | 30 | > 20 | 16.06/Stable
ORF10 | Italy | C206H312N50O54S3 | 30 | > 20 | 16.06/Stable
ORF10 | Nepal | C206H312N50O54S3 | 30 | > 20 | 16.06/Stable
ORF10 | USA | C206H312N50O54S3 | 30 | > 20 | 16.06/Stable
Surface glycoprotein | China | C6336H9770N1656O1894S54 | 30 | > 20 | 33.01/Stable
Surface glycoprotein | India | C6338H9774N1656O1894S54 | 30 | > 20 | 33.01/Stable
Surface glycoprotein | Italy | C6336H9770N1656O1894S54 | 30 | > 20 | 33.01/Stable
Surface glycoprotein | Nepal | C6336H9770N1656O1894S54 | 30 | > 20 | 33.01/Stable
Surface glycoprotein | USA | C6336H9770N1656O1894S54 | 30 | > 20 | 33.01/Stable
Supplementary Table 3 Half-life period and instability index of SARS-CoV-2 proteins
Leu was the most abundant and Trp the least abundant amino acid in the CoVid19 proteome
To understand the amino acid composition, we analysed all the full-length protein sequences of the SARS-CoV-2 proteomes. We found that Leu (9.489%) was the most abundant and Trp (1.118%) the least abundant amino acid in the SARS-CoV-2 proteome (Supplementary Table 4). Leu was followed in abundance by Val (8.084%), Thr (7.428%), and Ser (6.785%) (Supplementary Table 4). Principal component analysis of the amino acid composition revealed the grouping of Asn, Tyr, Thr, Phe, and Ser; of Pro, Gly, Arg, and Cys; and of Trp, His, Gln, Asp, Lys, and Glu (Figure 4). ORF1ab encodes the highest number of amino acids (7096), whereas ORF10 encodes the lowest number (38) (Supplementary Figure 9).
Amino acid | China | India | Nepal | Italy | USA | Average (%)
Ala | 656 | 655 | 637 | 656 | 656 | 6.772
Cys | 294 | 294 | 290 | 294 | 294 | 3.045
Asp | 509 | 509 | 503 | 509 | 509 | 5.219
Glu | 439 | 439 | 432 | 439 | 439 | 4.608
Phe | 494 | 494 | 483 | 494 | 494 | 5.069
Gly | 576 | 576 | 562 | 575 | 576 | 5.934
His | 187 | 187 | 182 | 187 | 187 | 1.909
Ile | 508 | 508 | 488 | 508 | 508 | 5.196
Lys | 562 | 562 | 555 | 562 | 562 | 5.839
Leu | 919 | 919 | 884 | 918 | 918 | 9.489
Met | 205 | 205 | 201 | 205 | 205 | 2.139
Asn | 531 | 531 | 520 | 531 | 531 | 5.457
Pro | 394 | 393 | 391 | 394 | 394 | 4.035
Gln | 364 | 364 | 360 | 364 | 364 | 3.732
Arg | 350 | 350 | 336 | 350 | 350 | 3.54
Ser | 659 | 660 | 644 | 659 | 660 | 6.785
Thr | 717 | 716 | 704 | 717 | 717 | 7.428
Val | 780 | 782 | 768 | 781 | 780 | 8.084
Trp | 110 | 110 | 103 | 110 | 110 | 1.118
Tyr | 447 | 447 | 438 | 447 | 447 | 4.593
Xaa | | | | 1 | |
Supplementary Table 4 Amino acid composition of SARS-CoV-2 from different countries of the world (counts per isolate from the SARS-CoV-2 sequences of each country)
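Amino acid percentages of the kind summarised in Supplementary Table 4 can be derived by counting residues across all protein sequences of a proteome. The Python sketch below is a minimal illustration with a hypothetical function name and toy peptides; it is not the pipeline used for the original analysis.

```python
from collections import Counter
from typing import Dict, Iterable


def amino_acid_percentages(protein_sequences: Iterable[str]) -> Dict[str, float]:
    """Return the percentage of each one-letter residue across all sequences."""
    counts = Counter()
    for sequence in protein_sequences:
        counts.update(sequence.upper())
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no residues to count")
    return {residue: 100.0 * count / total for residue, count in counts.items()}


# Toy peptides (hypothetical): 8 residues in total, 3 of them leucine.
toy_composition = amino_acid_percentages(["MLLV", "LSTW"])
```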
Molecular weight ranged from 4.449 to 794.057 kDa and isoelectric point (pI) ranged from 4.495 to 9.487
The molecular weight of the CoVid19 proteins ranged from 4.449 kDa (ORF10) to 794.057 kDa (ORF1ab) (Supplementary Table 5). The molecular weights of the other SARS-CoV-2 proteins were 141.178 kDa (surface glycoprotein), 45.625 kDa (nucleocapsid phosphoprotein), 31.122 kDa (ORF3a), 25.146 kDa (membrane glycoprotein), 13.831 kDa (ORF8), 8.365 kDa (envelope protein), and 7.272 kDa (ORF6) (Supplementary Table 5). Except for ORF1ab and surface glycoprotein, all proteins were below 50 kDa. The pI of the SARS-CoV-2 proteome ranged from 4.495 (ORF6) to 9.487 (nucleocapsid phosphoprotein) (Supplementary Table 6). ORF1ab (5.982), surface glycoprotein (5.906), ORF3a (5.321), ORF8 (5.219), and ORF6 (4.495) were found to have a pI below seven (Supplementary Table 6). Analysis revealed the presence of palmitoylation sites in several SARS-CoV-2 proteins (Supplementary Table 7). Covalent attachment of palmitic acid occurs at cysteine residues of a protein and increases its hydrophobicity and membrane association (Supplementary Figure 10).
Proteins | China | India | Italy | USA | Nepal
ORF1ab | 794.0578 | 794.0719 | 793.9446 | 794.0578 | 794.0578
Surface glycoprotein (S) | 141.1785 | 141.2065 | 141.1785 | 141.1785 | 141.1785
ORF3a | 31.12294 | 31.12294 | 31.16502 | 31.12294 | 31.12294
Envelope protein (E) | 8.36504 | 8.36504 | 8.36504 | 8.36504 | 8.36504
Membrane glycoprotein (M) | 25.14662 | 25.14662 | 25.14662 | 25.14662 | 25.14662
ORF6 | 7.27254 | 7.27254 | 7.27254 | 7.27254 | 7.27254
ORF7a | 13.74417 | 13.74417 | 13.74417 | 13.74417 | 13.74417
ORF8 | 13.83101 | 13.80493 | 13.83101 | 13.80493 | 13.83101
Nucleocapsid (N) | 45.6257 | 45.6257 | 45.6257 | 45.6257 | 45.6257
ORF10 | 4.44923 | 4.44923 | 4.44923 | 4.44923 | 4.44923
Supplementary Table 5 Molecular weight (kDa) of SARS-CoV-2 proteins from different countries
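The molecular weights in Supplementary Table 5 are consistent with the molecular formulas in Supplementary Table 3. As a rough check, the Python sketch below sums standard average atomic masses over a formula; the function name is hypothetical, and the rounded atomic masses are an assumption that may differ slightly from the constants used by the original tool.

```python
import re

# Standard average atomic masses (rounded); the exact constants used by the
# original prediction tool are not known and may differ slightly.
AVERAGE_ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999, "S": 32.06}


def molecular_weight_kda(formula: str) -> float:
    """Compute the molecular weight in kDa from a formula such as C390H625N91O103S4."""
    total_daltons = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total_daltons += AVERAGE_ATOMIC_MASS[element] * (int(count) if count else 1)
    return total_daltons / 1000.0


# Envelope protein and ORF10 formulas from Supplementary Table 3.
envelope_kda = molecular_weight_kda("C390H625N91O103S4")
orf10_kda = molecular_weight_kda("C206H312N50O54S3")
```

Under these assumptions the envelope protein formula gives about 8.365 kDa and the ORF10 formula about 4.449 kDa, in line with the tabulated values.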
Proteins | China | India | Italy | USA | Nepal
ORF1ab | 5.982 | 5.982 | 5.982 | 5.982 | 5.982
Surface glycoprotein (S) | 5.906 | 5.906 | 5.906 | 5.906 | 5.906
ORF3a | 5.321 | 5.321 | 5.321 | 5.321 | 5.321
Envelope protein (E) | 7.761 | 7.761 | 7.761 | 7.761 | 7.761
Membrane glycoprotein (M) | 9.084 | 9.084 | 9.048 | 9.048 | 9.048
ORF6 | 4.495 | 4.495 | 4.495 | 4.495 | 4.495
ORF7a | 7.249 | 7.249 | 7.249 | 7.249 | 7.249
ORF8 | 5.219 | 5.219 | 5.219 | 5.219 | 5.219
Nucleocapsid (N) | 9.487 | 9.487 | 9.487 | 9.487 | 9.487
ORF10 | 8.302 | 8.302 | 8.302 | 8.302 | 8.302
Supplementary Table 6 Isoelectric point (pI) of SARS-CoV-2 proteins from different countries
Proteins | Palmitoylation peptide | Site (position) | Score
Envelope protein | ILTALRLCAYCCNIV | 40 | 7.195
Envelope protein | ALRLCAYCCNIVNVS | 43 | 18.349
Envelope protein | LRLCAYCCNIVNVSL | 44 | 7.773
Membrane glycoprotein | NA | NA | NA
Nucleocapsid phosphoprotein | NA | NA | NA
ORF1ab | ARAGKASCTLSEQLD | 213 | 15.061
ORF1ab | GHNLAKHCLHVVGPN | 1114 | 11.529
ORF1ab | NSQTSLRCGACIRRP | 5340 | 14.034
ORF3a | IIMRLWLCWKCRSKN | 130 | 4.168
ORF6 | NA | NA | NA
ORF7a | ALITLATCELYHYQE | 15 | 23.709
ORF7a | ELYHYQECVRGTTVL | 23 | 14.058
ORF8 | VAAFHQECSLQSCTQ | 20 | 12.122
ORF10 | TIYSLLLCRMNSRNY | 19 | 24.543
Surface glycoprotein | LPLVSSQCVNLTTRT | 15 | 23.1
Myristylation
NA | NA | NA | NA
Supplementary Table 7 Prediction of palmitoylation sites in SARS-CoV-2 proteins
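Each peptide in Supplementary Table 7 is a 15-residue window with the candidate cysteine at its centre. The Python sketch below enumerates such cysteine-centred windows; it only lists candidates and does not reproduce the scoring of the prediction tool. The function name is hypothetical, and the 19-residue fragment is assembled from the three overlapping envelope protein peptides in the table, with position 33 assumed for its first residue.

```python
from typing import List, Tuple


def cysteine_windows(sequence: str, flank: int = 7, first_position: int = 1) -> List[Tuple[int, str]]:
    """Return (position of cysteine, window) for each cysteine with full flanks.

    The window spans `flank` residues on each side of the cysteine.
    Cysteines too close to either end of the sequence are skipped.
    """
    sequence = sequence.upper()
    windows = []
    for index, residue in enumerate(sequence):
        if residue != "C":
            continue
        start, end = index - flank, index + flank + 1
        if start < 0 or end > len(sequence):
            continue
        windows.append((first_position + index, sequence[start:end]))
    return windows


# Envelope protein fragment reconstructed from the overlapping table peptides.
envelope_fragment = "ILTALRLCAYCCNIVNVSL"
envelope_candidates = cysteine_windows(envelope_fragment, first_position=33)
```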
Sequence analysis and similarity study of SARS-CoV-2 (CoVid19) genomes with bat SARS CoVs, MERS CoV, human CoV HKU1, and others revealed that bat SARS CoV and human SARS CoV (229E, German isolate) are not the direct and immediate contributors to the human SARS-CoV-2 (CoVid19) genome. If the genome had come from either bat SARS CoV or human SARS CoV 229E, there would be more than 99% sequence similarity with the direct donor. Nucleotide mutations are not frequent enough for SARS-CoV-2 (CoVid19) to have mutated, within a short time frame (a few months), to the extent that only 80% sequence similarity with bat SARS CoV or human SARS CoV 229E remains (Table 2). The mutation rate of the human genome is 2.5 × 10^-8 per nucleotide, or 175 mutations per diploid genome per generation.5 The mutation rate of RNA viral genomes ranges from 10^-6 to 10^-4 substitutions per nucleotide, and nucleotide substitutions are more common than insertions or deletions.6 The human SARS CoV 229E genome of the German isolate was reported long ago, in 2003, and it also showed only 60% sequence similarity with SARS-CoV-2 (CoVid19) (Table 2). However, when the sequence similarity study was conducted among the recently reported SARS-CoV-2 isolates from China, India, Nepal, and the USA, they showed 99% to 100% sequence similarity with each other (excluding the SARS CoV 229E German isolate) (Table 2). The phylogenetic tree also did not show any close grouping with the bat SARS CoV (Figure 1). The bat SARS CoV falls in a separate group in the phylogenetic tree; if the SARS-CoV-2 genome had come directly from bat SARS CoV, the two would certainly have grouped together (Figure 1). A classic example is the human SARS CoV 229E German isolate reported in 2003, which falls far apart. The recent isolates of SARS-CoV-2 from different countries have not undergone significant mutation. Instead, it was observed that the recent SARS-CoV-2 genomes have undergone some substitutions.
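The scale of the argument above can be illustrated with simple arithmetic. The Python sketch below multiplies the quoted per-nucleotide substitution rates by the Wuhan-Hu-1 genome length (Table 1) and converts the similarity percentages of Table 2 into approximate numbers of differing sites. It is a rough illustration only: the function names are hypothetical, the quoted rate is treated as a per-site probability per genome copy (an assumption, since its unit of time or replication is not stated above), and similarity is applied to the full genome length rather than to the actual alignment length.

```python
GENOME_LENGTH = 29903  # nucleotides, Wuhan-Hu-1 (Table 1)


def expected_substitutions(rate_per_nucleotide: float, genome_length: int = GENOME_LENGTH) -> float:
    """Expected substitutions per genome copy for a given per-site rate."""
    return rate_per_nucleotide * genome_length


def differing_sites(similarity_percent: float, genome_length: int = GENOME_LENGTH) -> int:
    """Approximate number of differing sites implied by a similarity percentage."""
    return round((100.0 - similarity_percent) / 100.0 * genome_length)


low_rate_expectation = expected_substitutions(1e-6)
high_rate_expectation = expected_substitutions(1e-4)
sites_vs_bat_wiv1 = differing_sites(80.10)   # Wuhan-Hu-1 vs bat SARS WIV1 (Table 2)
sites_vs_india = differing_sites(99.98)      # Wuhan-Hu-1 vs Indian isolate (Table 2)
```

Under these assumptions, the quoted rates correspond to roughly 0.03 to 3 substitutions per genome copy, whereas 80.10% similarity implies on the order of 6000 differing sites and 99.98% similarity implies about six.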
The substitution of G for A (Indian isolate), T for A (Italian isolate), T for C (Indian and USA isolates), N for G (Italian isolate), T for C (Indian and Nepal isolates), and T for G (Italian isolate) are classic examples of SARS-CoV-2 substitutions (Supplementary Figure 3). Maximum composite likelihood analysis of the pattern of nucleotide substitution showed a high rate of transition from T to C and a lower rate of transversion (Table 3). However, the transition rate of the genomes of the SARS-CoV-2 isolates of China, India, Italy, Nepal, and the USA was lower than the transition rate of the SARS CoVs together with bat CoV, MERS CoV, SARS CoV of Canada, bovine CoV, SARS CoV of Germany, and others. The time tree analysis also revealed the recent origin of SARS-CoV-2, which dates back to 0.00 million years ago, suggesting its evolution from a recent synthetic source (Figure 2). A previous study reported the shifting of SARS CoV from one host to another.7 Studies have also reported the recombination history of the bat SARS CoV of Kenya and the German isolate 229E.8,9 However, our analysis did not detect any recombination within the SARS CoV genomes or the SARS-CoV-2 genomes (Figure 3), suggesting their recent synthetic origin.
The substitution of nucleotides led to the substitution of amino acids in the CDS. ORF1ab, which encodes the viral RNA polymerase, was found to have a P>L substitution (Indian isolate), an L>X substitution (Italian isolate), and a T>I substitution (Indian isolate). The Indian isolate thus has two substitutions in ORF1ab. The substitution of amino acid P to L in human immunodeficiency virus (HIV) reverse transcriptase (RT) sensitizes RT 7 to 10 fold to the antiviral drug nevirapine.10 However, the substitution of amino acid T>I confers resistance to ganciclovir in human cytomegalovirus.10 The substitution of amino acid A>V was found in the surface glycoprotein of the Indian isolate. The substitution of amino acid A>V in the Zika virus NS2A protein affects viral RNA synthesis and attenuates the virus in vivo.11 A substitution of amino acid G>V was found in ORF3a of the Italian SARS-CoV-2 isolate. The substitution of amino acid G>V in Thermoplasma acidophilum citrate synthase interferes with the stability and activity of the protein. It also leads to temperature-sensitive altered drug resistance in the cytoplasmic loop of the P-glycoprotein.12,13 In addition, the substitution of amino acid G>V leads to delayed folding of the type-I procollagen protein.14 ORF8 has an amino acid L>S substitution in the Indian and USA CoVid19 isolates. The substitution of amino acid L>S induces mecillinam and quinolone resistance.15,16 The genomic and CDS sequences of the SARS-CoV-2 isolates contained short microsatellite repeats, and the presence of microsatellite repeats might favour substitution and polymorphism in the SARS-CoV-2 genome.17,18 Substitution and recombination in bat CoV were studied long ago, and the coexistence of different genotypes in the same bat was reported.19 However, no such different genotypes have been observed in human SARS-CoV-2 so far.
Lau et al. (2010) conducted a recombination study of the bat corona virus Ro-batCoV HKU9 genome and generated a recombinant bat CoV.19 However, they did not mention the possible objective and implication of the generated recombinant bat CoV. The lack of high sequence similarity of the SARS-CoV-2 genome with the bat CoV and other CoV genomes showed that the present SARS-CoV-2 genome did not come from the bat CoV directly. Instead, the skeleton was sourced from the bat CoV and some synthetic nucleotides were inserted into the bat CoV genome to generate a SARS-CoV-2 genome. Further, human SARS CoV 229E and Chinese SARS CoV (accession: NC_045512.2) had 12 CDS and Canada SARS CoV (accession: NC_004718.3) had 14 CDS, whereas SARS-CoV-2 contains only 10 CDS. It is not yet known why the previous SARS CoV genomes contained 12-14 CDS while the recent SARS-CoV-2 (CoVid19) genome contains only 10 CDS. In addition, the generation of a recombinant bat CoV genome by Lau et al.19 points directly towards the generation of a recombinant/synthetic CoV genome. This suggests that the recent CoVid19 genome might be synthetic in origin.
Proteomic analysis revealed that, out of ten SARS-CoV-2 proteins, six have a melting temperature (Tm) between 55 and 65 °C, whereas ORF6, ORF8, and ORF10 have a Tm greater than 65 °C. Only the membrane glycoprotein had a Tm below 55 °C (Supplementary Table 2). If the membrane glycoprotein of SARS-CoV-2 has a Tm of less than 55 °C, this protein is most likely highly temperature sensitive and can be targeted to destabilize SARS-CoV-2 through the application of high temperature. Application of steam through the airways (nose and mouth) has the potential to destabilize the CoVid19 surface glycoprotein, and if a person at the early stage of infection receives steam treatment, it can be very useful in reducing the impact of the virus. Chan et al.20 reported that SARS CoV lost viability by >3 log10 at 38 °C and a relative humidity of greater than 98%. Therefore, steam application can be a highly viable method to fight SARS-CoV-2, as it provides high temperature and humidity simultaneously. L-arginine is used to suppress protein aggregation.21 Therefore, application of saline drips with an L-arginine supplement to the SARS-CoV-2 patient may inhibit the aggregation of viral proteins inside the cell, thereby lowering the formation of more virus inside the cell. This might be a valuable step towards suppressing the formation of new SARS-CoV-2.
Sequence data
Genome sequences of various corona virus isolates were downloaded from the NCBI database. In total, 30 full-length corona virus genomes were downloaded. They were bat SARS CoV HKU3-1 (accession: DQ022305.2), bat SARS CoV WIV1 (accession: KF367457.1), bovine CoV (accession: NC_003045.1), beta CoV from Canada (accession: NC_004718.3), SARS CoVid19 (CoV2) from China (accession: NC_045512.2, MN938384.1, MN975262.1, MN988668.1, MN988669.1, MN996527.1, MN996528.1, MN996529.1, MN996530.1, MN996531.1, MT135041.1, MT135043.1, and MN908947.3), human SARS CoV from Germany (accession: NC_002645.1), SARS CoV2 from India (accession: MT050493.1), SARS CoV2 from Italy (accession: MT066156.1), MERS CoV (accession: NC_019843.3), SARS CoV2 from Nepal (accession: MT072688.1), beta CoV from the United Kingdom (accession: KC164505.2), and SARS CoV2 from the United States of America (accession: MN985325.1, MN988713.1, MN994467.1, MN994468.1, MN997409.1, MT027063.1, and MT027062.1). The term CoV2 was kept for the recently sequenced CoVid19 genomes that originated from CoVid19 patients.
Analysis of sequence similarity
To find the possible donor of human SARS-CoV-2 among bat CoVs, we aligned the full-length whole genome sequences of the SARS-CoV-2 isolates of China, India, Italy, Nepal, and USA with the human SARS CoV isolate of Germany, bat SARS CoV HKU3-1, bat SARS CoV WIV1, MERS CoV, and SARS CoV2 Wuhan-Hu-1. Sequence alignment was conducted using the MUSCLE program (https://www.ebi.ac.uk/Tools/msa/muscle/). We aligned the full-length SARS-CoV-2 genome sequences of the isolates of China, India, Italy, Nepal, and USA to understand the nucleotide similarity and variation among them. There were more than 12 SARS-CoV-2 isolates from China alone. We aligned all the Chinese SARS-CoV-2 isolates to find the variation within the Chinese population. Similarly, there were seven SARS-CoV-2 isolates from the USA. We also aligned all the full-length SARS-CoV-2 genomes of the USA isolates together. The full-length CDS and protein sequences were downloaded from the NCBI in FASTA format. Multiple sequence alignment of the CDS sequences was also conducted using the MUSCLE program. The protein sequences of the SARS-CoV-2 proteins were aligned using MultAlin software to find the substitutions in amino acids. The presence of microsatellite markers in the SARS-CoV-2 genome was analysed using the microsatellite repeat finder (http://insilico.ehu.es/mini_tools/microsatellites/). Default parameters were used to find the microsatellite repeats.
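As an illustration of the two computations described in this subsection, the following Python sketch calculates percent identity between two already-aligned sequences and scans a sequence for short tandem repeats. It is a minimal sketch under stated assumptions and does not reproduce the MUSCLE output or the web-based microsatellite repeat finder: the function names, the rule that a gap opposite a base counts as a mismatch, and the repeat thresholds (`min_unit`, `max_unit`, `min_repeats`) are all hypothetical choices, not the default parameters of those tools.

```python
import re


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two aligned sequences of equal length.

    Assumption: columns where both sequences have a gap are skipped,
    and a gap opposite a base is counted as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def find_microsatellites(seq, min_unit=2, max_unit=6, min_repeats=3):
    """Return (start, unit, repeat_count) for perfect tandem repeats.

    The thresholds are illustrative, not the defaults of any published tool.
    """
    # Capture a unit of min_unit..max_unit bases (shortest first), then
    # require at least min_repeats - 1 further copies of that same unit.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for m in pattern.finditer(seq.upper()):
        unit = m.group(1)
        hits.append((m.start(), unit, len(m.group(0)) // len(unit)))
    return hits


# Toy sequences for illustration only; they are not taken from any isolate.
print(percent_identity("ACGT", "ACGA"))
print(find_microsatellites("GGATATATATCC"))
```

Different tools define identity differently (for example, by excluding all gapped columns), so a value computed this way need not match the percentages reported by a given alignment program.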
Construction of the phylogenetic tree
To construct the phylogenetic tree, the CDS sequences and full-length whole genome sequences of the SARS-CoV-2 genomes were aligned using the MUSCLE multiple sequence alignment program. The aligned sequence files were converted to MEGA file format using MEGA6 software.22 Prior to the construction of the phylogenetic tree, model selection was conducted in MEGA6. The phylogenetic tree was constructed using the model with the lowest Bayesian information criterion (BIC) score in the model selection result, following the maximum likelihood approach. The statistical parameters used to construct the phylogenetic tree were: model/method, general time reversible model; substitution type, nucleotide; rates among sites, gamma distributed with invariant sites (G+I); number of discrete gamma categories, 5; and number of bootstrap replicates, 1000. The codon usage bias and the maximum likelihood estimate of substitution were studied using MEGA6 software. The settings used to analyse the maximum likelihood substitution were: substitution pattern estimation (ML); model/method, general time reversible model; rates among sites, gamma distributed with invariant sites; number of discrete gamma categories, 5. The time tree (RelTime-ML) analysis was conducted using MEGA6.22 The recombination events of CoVid19 with other SARS CoVs were analysed using IcyTree.23
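Selecting a substitution model by the lowest BIC score can be sketched as follows. This is a generic illustration of the criterion, BIC = k ln(n) - 2 ln(L), where k is the number of free parameters, n the number of sites, and L the maximised likelihood; it is not the MEGA6 implementation. The candidate models, log-likelihoods, and parameter counts below are made-up illustrative values and are not results from this study.

```python
import math


def bic(log_likelihood: float, n_params: int, n_sites: int) -> float:
    """Bayesian information criterion: k * ln(n) - 2 * ln(L)."""
    return n_params * math.log(n_sites) - 2.0 * log_likelihood


# Hypothetical candidates: model name -> (log-likelihood, free parameters).
candidates = {
    "JC": (-5200.0, 10),
    "HKY": (-5100.0, 14),
    "GTR+G+I": (-5050.0, 20),
}
n_sites = 1000  # hypothetical alignment length

scores = {
    name: bic(lnl, k, n_sites) for name, (lnl, k) in candidates.items()
}
best_model = min(scores, key=scores.get)

for name, score in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name}: BIC = {score:.2f}")
print("Selected model:", best_model)
```

The penalty term k ln(n) grows with the number of parameters, so a richer model is preferred only when its gain in likelihood outweighs that penalty.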
CoVid19 proteome analysis
The isoelectric point and molecular weight of the SARS-CoV-2 proteins of the isolates of China, India, Italy, Nepal, and USA were calculated using the IPC isoelectric point calculator on a Linux-based platform.24 The amino acid composition was also calculated using Linux-based code. The principal component analysis of the amino acid composition of the SARS-CoV-2 proteins was conducted using the statistical analysis software Past3 (https://folk.uio.no/ohammer/past/). The half-life of the SARS-CoV-2 proteins was calculated using the ProtParam tool (https://web.expasy.org/protparam/).19 The melting temperature (Tm) of the CoVid19 proteins was analysed using the protein Tm predictor (http://tm.life.nthu.edu.tw/). Palmitoylation of the SARS-CoV-2 proteins was analysed using CSS-Palm software.25
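The amino acid composition step can be illustrated with a short Python sketch. The code actually used in this study is not given in the text, so the function below is a hypothetical stand-in that assumes composition is expressed as the percentage of each of the 20 standard amino acids, with any other character ignored. The peptide in the example is a toy sequence and not a SARS-CoV-2 protein.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def amino_acid_composition(protein: str) -> dict:
    """Percentage of each standard amino acid in a protein sequence.

    Assumption: non-standard characters (X, *, gaps) are ignored.
    """
    residues = [aa for aa in protein.upper() if aa in AMINO_ACIDS]
    if not residues:
        raise ValueError("sequence contains no standard amino acids")
    counts = Counter(residues)
    total = len(residues)
    return {aa: 100.0 * counts[aa] / total for aa in AMINO_ACIDS}


# Toy peptide for illustration only.
composition = amino_acid_composition("MKKLA")
for aa, pct in composition.items():
    if pct > 0:
        print(f"{aa}: {pct:.1f}%")
```

A table of such percentages, one row per protein, is the kind of input on which a principal component analysis of amino acid composition can be run.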
The lack of significant sequence similarity between the bat SARS CoV genome and the SARS-CoV-2 genome indicated that SARS-CoV-2 originated from a source other than bat SARS CoV or human SARS CoV (German isolate). Most likely it was a synthetic genome (with bat CoV as a skeleton), as no recombination events were found within or between the SARS CoVs. The phylogenetic tree also supported an origin of SARS-CoV-2 other than bat SARS CoV. The time tree analysis also revealed the recent origin of SARS-CoV-2. The publication by Lau et al.19 supports the finding that a laboratory-based recombination study of bat SARS CoV was conducted in China to generate a recombinant bat SARS CoV. The lack of explanation by Lau et al.19 regarding the application of the recombinant bat SARS CoV casts doubt on the natural origin of SARS-CoV-2. Because of its low Tm, the CoVid19 surface glycoprotein might be destabilized by the application of high-temperature steam, stopping the viral activities.
TKM: Conceived the idea, analysed the data, drafted the manuscript. YKM: drafted and revised the manuscript.
There are no competing interests to declare.
None.
©2020 Mohanta, et al. This is an open access article distributed under the terms of the, which permits unrestricted use, distribution, and build upon your work non-commercially.