The Role of Replicase Polyprotein 1ab in SARS-CoV-2 and the Analysis of its Cleavage Sites



Abstract

The 2019 novel coronavirus (SARS-CoV-2) has caused a major pandemic affecting millions of people. The virus is a betacoronavirus and is closely related to MERS-CoV and the other SARS viruses. Replicase Polyprotein 1ab is vital for virus replication because it contains the proteinases responsible for cleaving the polyprotein. Once the polyprotein has been cleaved, the individual smaller proteins play important roles in the transcription of the viral RNA. The SARS-CoV-2 genome was retrieved and BLAST was then used to understand the components of the virus and the similarities between SARS-CoV-2 and other known viruses. BLAST (Basic Local Alignment Search Tool) is a programme that compares an amino acid or nucleotide sequence with sequences from other organisms that may contain similar regions. The cleavage sites and processes were identified, which led to the conclusion that two proteinases (the 3C-like proteinase and the papain-like proteinase) are responsible for the cleavage of the polyprotein. We also elucidated the specific role of each of the smaller proteins within the polyprotein and their importance in viral replication.


Introduction

The novel coronavirus (SARS-CoV-2) emerged in November/December 2019 in Wuhan, in the Chinese province of Hubei. The virus caused a worldwide pandemic that has affected almost every country. As of 30th August 2020 there had been 25,051,178 recorded cases of the virus and 843,641 recorded deaths as a result of the novel coronavirus [1]. This is a new Severe Acute Respiratory Syndrome virus and, according to current research and evidence, SARS-CoV-2 is mainly transmitted by respiratory droplets and contact pathways; there is also growing research to suggest that airborne transmission is possible [2].

SARS-CoV-2 is one of many coronaviruses; these viruses cause illnesses ranging from the common cold to more severe respiratory illness. The novel coronavirus is quite similar to MERS-CoV, the virus that causes Middle East Respiratory Syndrome. Over the past two decades there have been a significant number of coronavirus outbreaks; each virus has certain unique features, but there are a number of similarities as well. The first SARS virus outbreak was in 2002 and emerged in the Chinese province of Guangdong. This epidemic spread to about 26 countries and caused more than 8000 cases as well as 774 deaths. All three of the coronaviruses mentioned spread quickly to different countries, and the early phases of their outbreaks are similar [3].

SARS-CoV-2 is a single-stranded RNA betacoronavirus that is surrounded by a fatty outer layer called an envelope. It has a ‘crown’, otherwise known as a ‘corona’, of proteins projecting from this envelope [4]. Coronaviruses are zoonotic and hence can be transmitted from animals to humans. Research suggests that the virus originated in bats and then spread to humans in China [5]. SARS-CoV and SARS-CoV-2 are closely related in this sense, as both are thought to have originated in bats; this suggests that bats may be a reservoir host for these viruses [6].

Since the virus has been infecting humans for a relatively short period of time (the first case was recorded in December 2019), much about SARS-CoV-2 remains unknown. In this study we therefore aim to investigate the functions and properties of the polyproteins present in the virus and to identify the protein chains into which they are cleaved.


Methods

First, the translated form of the whole SARS-CoV-2 genome was obtained from NCBI GenBank [7], a database that contains nucleotide and protein sequences for more than 300,000 organisms, obtained from laboratories and research. Protein-protein BLAST [8] was then used to break the protein sequence of the genome (in FASTA format) down into its components. Using BLAST we can identify the different proteins present in the SARS-CoV-2 sequence and find other organisms that share these proteins. For this project, the focus is the biggest polyprotein present in SARS-CoV-2, which is Replicase Polyprotein 1ab. Using UniProt, a large protein database, this polyprotein was identified and researched in depth to look at its components as well as the specific functions of each protein. In addition, UniProt was used to identify the post-translational modifications and the processing events, which contain the key information needed to retrieve the protein sequences of SARS-CoV-2. These data are key to seeing the overall components of the coronavirus and how SARS-CoV-2 replicates with the help of Replicase Polyprotein 1ab.
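As a rough illustration of the first step, the sketch below parses FASTA-formatted text into (header, sequence) pairs using only the Python standard library. The record shown is a made-up toy example, not the real polyprotein sequence, and the function name `parse_fasta` is our own choice for illustration rather than part of any tool used in the study.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) tuples."""
    records = []
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header line closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:]
            chunks = []
        elif header is not None:
            # Sequence lines are concatenated until the next header.
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Hypothetical toy record (NOT the real SARS-CoV-2 sequence).
toy_fasta = """>toy_polyprotein example record
MAAAGGGSSS
TTTLLLKKKV
"""

records = parse_fasta(toy_fasta)
for name, seq in records:
    print(name, len(seq))
```

Applied to the real FASTA file downloaded from a sequence database, the same function would return the header and the full amino acid sequence, whose length can then be checked against the value reported by the database.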

Replicase Polyprotein 1ab

From the UniProt database, the components of this polyprotein have been identified and are listed below [9].

Table 1. Summary of the functions of the proteins present in Replicase Polyprotein 1ab

| Protein | Function | References (other than UniProt) |
| --- | --- | --- |
| Nsp1 (Host translation inhibitor) | Inhibits host translation and restricts host gene expression by promoting host mRNA degradation. This blocks the innate immune response of the host cell. | [10], [11] |
| Nsp2 | No confirmed function in replication. Interacts with two host proteins, prohibitin 1 and prohibitin 2. | [12], [13] |
| Nsp3 (Papain-like proteinase) | Multi-domain transmembrane protein with multiple functions. Responsible for the cleavages at the N-terminus of the replicase polyprotein. Also blocks the host innate immune response. | [14] |
| Nsp4 | Involved in the assembly of double-membrane vesicles (DMVs) necessary for viral replication. | [15] |
| Nsp5 (3C-like proteinase) | The main proteinase, which cleaves the C-terminal part of Replicase Polyprotein 1ab at 11 sites. | [16] |
| Nsp6 | Involved in the induction of autophagosomes, which engulf cytoplasmic components of the host cell for degradation. | |
| Nsp7 | Forms a hexadecameric complex with nsp8. May be involved in viral replication by acting as a primase. | [17] |
| Nsp8 | Forms a hexadecameric complex with nsp7. May be involved in viral replication by acting as a primase. | [17] |
| Nsp9 | RNA-binding protein. | [18] |
| Nsp10 | Stimulates the exoribonuclease activity of nsp14 and the methyltransferase activity of nsp16. | |
| Nsp12 (RNA-directed RNA polymerase) | Responsible for the replication and transcription of the viral RNA genome. | |
| Nsp13 (Helicase) | Catalyses the unwinding of duplex nucleic acid strands, supporting replication and transcription. | |
| Nsp14 (Proofreading exoribonuclease) | Important in proofreading the genome. Nsp14 plays a vital role in the prevention of nucleotide errors during genome replication. | [19] |
| Nsp15 (Uridylate-specific endoribonuclease) | Endoribonuclease that cleaves RNA at uridine residues. | [20] |
| Nsp16 (2’-O-methyltransferase) | Methyltransferase that helps with cap methylation of the viral mRNA. This is important for the virus to evade the immune system. | |

*nsp = non-structural protein


Figure 1. Structure of Replicase Polyprotein 1ab (from [21])

Cleavage of Replicase Polyprotein 1ab

Some proteins are synthesised as a large precursor polypeptide, known as a polyprotein, that needs proteolytic cleavage into smaller chains. Proteolytic cleavage is the hydrolysis of peptide bonds by protease enzymes, which breaks a polypeptide into smaller polypeptides or amino acids [22].

Polyproteins are very common in viruses. They consist of several proteins that, after synthesis, are cleaved to produce functionally distinct polypeptides. A polyprotein is translated from a single gene as a single protein, which is usually non-functional. However, once the polyprotein is cleaved at specific sites, the individual smaller proteins perform their own specific functions.

A polyprotein is cleaved by proteases at specific sites. SARS-CoV-2 encodes several proteins, among which there are two proteases that are vital for viral replication. The main protease of the coronavirus is the 3C-like protease, formally known as C30 endopeptidase. The other key protease is the papain-like protease. These two proteases cleave the two translated polyproteins, replicase polyprotein 1a and replicase polyprotein 1ab [23]. The enzymes cleave the polyproteins at specific sites, releasing each smaller protein so that it is active and able to perform its function.

The 3C-like protease is encoded within non-structural protein 5 (nsp5). It has a fold similar to that of the digestive enzyme chymotrypsin, a feature it shares with the 3C protease, a homologous protease found in picornaviruses (a group of related RNA viruses that can also infect vertebrates) [24]. The 3C-like protease has a key function in viral replication, as shown by its important role in the cleavage of replicase polyprotein 1ab.

At each cleavage site, one component of replicase polyprotein 1ab is separated from its neighbour. Because the polyprotein yields fifteen chains, there are fourteen cleavage sites between them. The 3C-like protease is responsible for the majority of these, eleven in total. The other three are cleaved by the papain-like protease, which is encoded within non-structural protein 3 (nsp3): it cleaves the nsp1/2, nsp2/3 and nsp3/4 boundaries, and the remaining sites are cleaved by the main protease [25].
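The counting behind this division of labour can be checked with a short sketch. Fifteen chains laid end to end have fourteen boundaries between them; assigning the three boundaries named above to the papain-like protease leaves eleven for the 3C-like protease. The variable names are our own, and the chain list simply follows the fifteen proteins in Table 1 (which contains no nsp11).

```python
# Mature chains of Replicase Polyprotein 1ab, in N- to C-terminal order.
# nsp11 is left out, matching the fifteen proteins listed in Table 1.
chains = [f"nsp{i}" for i in range(1, 17) if i != 11]

# Each cleavage site is the boundary between two neighbouring chains.
boundaries = list(zip(chains, chains[1:]))

# Per the text, the papain-like protease handles these three boundaries
# and the 3C-like protease handles all the others.
PLPRO_BOUNDARIES = {("nsp1", "nsp2"), ("nsp2", "nsp3"), ("nsp3", "nsp4")}

assignment = {
    boundary: ("papain-like" if boundary in PLPRO_BOUNDARIES else "3C-like")
    for boundary in boundaries
}

n_plpro = sum(1 for protease in assignment.values() if protease == "papain-like")
n_3cl = sum(1 for protease in assignment.values() if protease == "3C-like")

print(len(chains), "chains,", len(boundaries), "cleavage sites")
print("papain-like:", n_plpro, "| 3C-like:", n_3cl)
```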


Figure 2. Depiction of the cleavage sites of Replicase Polyprotein 1ab and which proteases cleave each specific site (from [26])

The FASTA sequence of Replicase Polyprotein 1ab used in this study contains 7096 amino acids, and from it we can see at which amino acids the proteases cleave the polyprotein. The processing section of the UniProt page for Replicase Polyprotein 1ab of SARS-CoV-2 gives the length of each protein within the larger polyprotein [27].


Figure 3. BLAST search results of the SARS-CoV-2 genome from the UniProt database (taken directly from [27])

The image above directly provides us with the cleavage sites of our polyprotein.
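To show how a list of chain end positions turns one long sequence into separate chains, here is a minimal sketch. The sequence and positions are hypothetical toy values chosen for readability, not the real coordinates of the polyprotein, and `split_polyprotein` is an illustrative helper name of our own.

```python
def split_polyprotein(sequence, chain_ends):
    """Split a sequence into chains, given the 1-based end position of each chain.

    chain_ends must be strictly increasing and finish at len(sequence).
    """
    if not chain_ends or chain_ends[-1] != len(sequence):
        raise ValueError("the last chain must end at the final residue")
    chains = []
    start = 0
    for end in chain_ends:
        if end <= start:
            raise ValueError("chain ends must be strictly increasing")
        # Python slices are 0-based and end-exclusive, so a 1-based
        # inclusive end position can be used directly as the slice end.
        chains.append(sequence[start:end])
        start = end
    return chains


# Hypothetical toy data: a 12-residue "polyprotein" cut after residues 3 and 8.
toy_sequence = "MAAGGGSSTTLK"
toy_chain_ends = [3, 8, 12]

toy_chains = split_polyprotein(toy_sequence, toy_chain_ends)
for chain in toy_chains:
    print(chain, len(chain))
```

With the real 7096-residue sequence and the chain end positions read from the processing section of the database entry, the same function would yield the fifteen chains and their lengths.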

The protein sequences for Replicase Polyprotein 1ab point to the importance of the proteinases in viral replication for SARS-CoV-2. From the research completed, the cleavage sites of Replicase Polyprotein 1ab have been identified, so we know where the papain-like protease and the main protease (3C-like protease) act on the polyprotein. Each of the proteins held within the polyprotein has a specific function, identified above, which shows the importance of each part of Replicase Polyprotein 1ab in the viral replication process.


Conclusion

The protein sequences for each of the smaller proteins within the polyprotein have been retrieved and identified as the different non-structural proteins (nsp) present within Replicase Polyprotein 1ab. The cleavage sites show where each protease acts on the polyprotein and splits the chain. The functions of each of the 15 proteins are shown above, and hence we know the importance of each protein in the viral replication process for SARS-CoV-2. The overall function of Replicase Polyprotein 1ab is the viral replication of SARS-CoV-2; in addition, some components of the polyprotein inhibit the host’s immune response, which helps viral replication to proceed. Because the host immune response is suppressed, the virus can replicate quickly and impact the host organism severely. The replication system is very similar to that of previous coronaviruses such as SARS-CoV and MERS-CoV.


Acknowledgements

Many thanks to Dr Preeti Choudhary, European Bioinformatics Institute (EBI), and Dr Clare Roper, Wimbledon High School, for their guidance during this research project.


References

  1. Coronavirus Resource Center. 2020.
  2. Modes of Transmission of Virus Causing Covid-19. 29 March 2020.
  3. Jaimes, Javier. “Phylogenetic Analysis and Structural Modeling of SARS-CoV-2 Spike Protein Reveals an Evolutionary Distinct and Proteolytically Sensitive Activation Loop.” Journal of Molecular Biology (2020) 3309-3311.
  4. Liverpool, Layal. Coronavirus. April 14 2020.
  5. Andersen, Kristian G. “The proximal origin of SARS-CoV-2.” Nature Medicine (March 17 2020) 450-452.
  6. Walls, Alexandra C. “Structure, Function, and Antigenicity of the SARS-CoV-2 Spike Glycoprotein.” Cell (2020) 281-292.
  7. Wu, F. Severe acute respiratory syndrome coronavirus 2 isolate Wuhan-Hu-1, complete genome. March 18 2020.
  8. BLAST. n.d.
  9. UniProtKB – P0DTD1 (R1AB_SARS2). April 22 2020.
  10. Kamitani, Wataru. “A two-pronged strategy to suppress host protein synthesis by SARS coronavirus Nsp1 protein.” Nature Structural & Molecular Biology (2009) 1134–1140.
  11. Huang, Cheng. “Alphacoronavirus Transmissible Gastroenteritis Virus nsp1 Protein Suppresses Protein Translation in Mammalian Cells and in Cell-Free HeLa Cell Extracts but Not in Rabbit Reticulocyte Lysate.” Journal of Virology (2010).
  12. Graham, Rachel L. “The nsp2 Replicase Proteins of Murine Hepatitis Virus and Severe Acute Respiratory Syndrome Coronavirus Are Dispensable for Viral Replication.” Journal of Virology (2005).
  13. Cornillez-Ty, Cromwell T. “Severe Acute Respiratory Syndrome Coronavirus Nonstructural Protein 2 Interacts with a Host Protein Complex Involved in Mitochondrial Biogenesis and Intracellular Signaling.” Journal of Virology (2009).
  14. Serrano, Pedro. “Nuclear magnetic resonance structure of the N-terminal domain of nonstructural protein 3 from the severe acute respiratory syndrome coronavirus.” Journal of Virology (2007).
  15. Clementz, Mark A. “Mutation in murine coronavirus replication protein nsp4 alters assembly of double membrane vesicles.” Virology (2008) 118-129.
  16. Lu, Y. “Identification and characterization of a serine-like proteinase of the murine coronavirus MHV-A59.” Journal of Virology (1995).
  17. Zhai, Yujia. “Insights into SARS-CoV transcription and replication from the structure of the nsp7–nsp8 hexadecamer.” Nature Structural & Molecular Biology (2005) 980–986.
  18. Egloff, Marie-Pierre. “The severe acute respiratory syndrome-coronavirus replicative protein nsp9 is a single-stranded RNA-binding subunit unique in the RNA virus world.” Proceedings of the National Academy of Sciences of the United States of America (March 16 2004).
  19. Eckerle, Lance D. “High Fidelity of Murine Hepatitis Virus Replication Is Decreased in nsp14 Exoribonuclease Mutants.” Journal of Virology (2007).
  20. Bhardwaj, Kanchan. “RNA Recognition and Cleavage by the SARS Coronavirus Endoribonuclease.” Journal of Molecular Biology (2006) Volume 362, Issue 2, 243-256.
  21. UniProtKB – P0DTD1 (R1AB_SARS2). April 22 2020.
  22. Creighton, Thomas E. Proteins: structures and molecular properties. New York: W.H. Freeman. 1993.
  23. Chen, Yu Wai. “Prediction of the SARS-CoV-2 (2019-nCoV) 3C-like protease (3CL pro) structure: virtual screening reveals velpatasvir, ledipasvir, and other drug repurposing candidates.” F1000Research (April 9 2020).
  24. Fan, Keqiang. “Biosynthesis, Purification, and Substrate Specificity of Severe Acute Respiratory Syndrome Coronavirus 3C-like Proteinase.” Journal of Biological Chemistry (2004).
  25. Fehr, Anthony R. Coronaviruses: An Overview of Their Replication and Pathogenesis. New York: Humana Press. 2015.
  26. Tong, Tommy. “Drug Targets in Severe Acute Respiratory Syndrome (SARS) Virus and other Coronavirus Infections.” Infectious Disorders Drug Targets (2009) 223-245.
  27. UniProtKB – P0DTD1 (R1AB_SARS2). April 22 2020.

About the Author

Rhea Sheth is currently studying in year 12 in London, UK. She is interested in Maths and the Sciences and enjoys exploring the connection between the subjects. Aiming to pursue further education in this field she is particularly interested in medicine and human physiology. She is also involved in many sports, swimming being her favourite.
