In silico study: molecular docking of SARS-Cov-2 endoribonuclease on active compounds of Gmelina arborea Roxb. bark

Infection by SARS-CoV-2 (severe acute respiratory syndrome coronavirus 2) triggers COVID-19 disease of the respiratory tract similar to pneumonia. The virus encodes four structural proteins and 16 non-structural proteins (nsp), one of which includes nsp15 or endoribonuclease (NendoU). NendoU plays an important role in viral replication and transcription and reduces the stimulation of immune cell responses. Active compounds in Gmelina arborea Roxb. bark have antioxidant properties that can inhibit the NendoU activity of SARS-CoV-2. This study aims to analyze the potential of compounds from Gmelina arborea Roxb. bark in inhibiting SARS-CoV-2 NendoU within in silico using the YASARA structure application. Balnophonin is the best test ligand based on binding ΔG value, dissociation constant (Kd), prediction of physicochemical characteristics, pharmacokinetics, and toxicity. Therefore, balanophonin can be developed as an effective alternative drug to inhibit SARS CoV-2 NendoU.


Introduction
SARS-CoV-2 (severe acute respiratory syndrome coronavirus 2) is a virus that causes respiratory infections similar to pneumonia, which is now called COVID-19 (coronavirus disease 2019).The disease was first identified in Wuhan, China, in December 2019 and spread rapidly throughout the world that in January 2020, the World Health Organization (WHO) declared COVID-19 as a public health emergency (Handayani et al., 2020;Aditia, 2021).SARS-CoV-2 is composed of four structural proteins, including spike protein (S), envelope (E), membrane (M), and nucleocapsid (N), as well as 16 non-structural proteins (nsp), one of which is nsp15 or endoribonuclease (NendoU) (Boophati et al., 2020).
Endoribonuclease (NendoU) is an essential protein and highly conserved in all coronaviruses (CoVs) and plays an important role in viral replication and transcription.The protein cleaves the polyuridylate sequences in negative-sense viral RNA, limiting its length and abundance.This mechanism reduces stimulation of the interferon (IFN) immune cell response, thus blocking or inhibiting the activation of the MDA5 sensor in host cells (Mandilara et al., 2021).The docking of drug compounds on these proteins has the potential to inhibit the SARS-CoV-2 replication process in host cells, especially at the catalytic site (Khan et al., 2021).According to Deng et al. (2019), loss of NendoU function will result in a protective immune response.Therefore, inhibition of endoribonuclease makes it an effective therapeutic agent for COVID-19.
Several active compounds from natural materials were reported to be effective in inhibiting SARS-CoV and MERS-CoV.These compounds can be tested against SARS-CoV-2 which has genetic similarities with the two viruses (Septiana, 2020).One of the natural ingredients with various benefits is white teak (Gmelina arborea Roxb.).According to Falah et al. (2008), one part of white teak that is widely used for traditional medicine is its bark.Active compounds in white teak bark are known to have high antioxidant activities.According to Fedoreyev et al. (2018), antivirals and antioxidants have a close relationship, as viral infections are often accompanied by oxidative stress, which is a key factor in viral pathogenesis.Therefore, this study aims to analyze the potential of compounds from the white teak bark identified by Falah et al. (2008) in inhibiting SARS-CoV-2 NendoU in silico.

Quantitative structure-activity relationship analysis
Quantitative structure-activity relationship (QSAR) analysis was conducted to predict ligand bioactivities, especially the antioxidant and free radical scavenging activities, using Way2drug Prediction of Activity Spectra for Substances (PASS) Online web server (www.way2drug.com/passonline) (Filimonov et al., 2014).

Prediction of ligand binding sites
The potential of a protein pocket to bind a ligand was predicted using the PrankWeb (P2Rank) webserver (prankweb.cz)(Krivak & Hoksza, 2018).Protein structure in PDB format was uploaded in the entry field and set to chain A. The output is a table with information such as name, pocket score, probability score, residues in the pocket, and pocket visualization.

Protein and ligand preparation
The 3D structure of the protein was prepared using YASARA Structure (Krieger et al., 2009), which includes the removal of water molecules and unnecessary residues, addition of hydrogen atoms, removal of aliphatic hydrogen bonds, and separation of protein structure and natural ligands.The results of protein and natural ligand preparation were saved in *.pdb and *.sdf format, respectively.Meanwhile, the 3D structures of the comparison and test ligands were prepared and optimized using YASARA Structure and saved in *.sdf format (Agistia et al., 2013).

Validation of molecular docking
Validation of the docking method was performed by re-docking the natural ligand uridine-5'monophosphate (U5P) on the protein with a grid box sized between 0.25-5.0Å and an interval of 0.25 Å.After obtaining the docking score, the root mean square deviation (RMSD) value was determined using YASARA Structure (Siagian et al., 2022).

Results and Discussion
Quantitative structure-activity relationship (QSAR) analysis QSAR analysis aims to determine the relationship between the structure and biological activity of a compound (Ishikawa et al., 2012).Based on the analysis, all test ligands showed antioxidant and free radical scavenging activity with a minimum Pa value of 0.3, except gmelinol which had a Pa value of less than 0.3 in antioxidant activity (Table 1).The minimum Pa value of 0.3 indicates that the biological activity possessed by the compound is moderate (Filimonov et al., 2014;Parikesit & Nurdiansyah, 2021;Daniel et al., 2023).However, compounds with low Pa values are not necessarily certain to have low activity because not many studies have been conducted on these compounds (Ivanov et al., 2018).These results were consistent with the research done by Falah et al. (2008) which states that Gmelina arborea Roxb.has antioxidant activity that can inhibit cell damage, mainly through free radical scavenging properties.

Ligand binding site prediction
Before molecular docking, prediction was done with PrankWeb (P2Rank) to determine the protein pocket position used during docking.PrankWeb (P2Rank) is a machine learning-based tool to predict ligand binding sites (Krivak & Hoksza, 2018).Predictions show that the 6WLC chain A has eight binding pockets (Figure 1a).The best pocket was determined based on pocket score, probability score, and amino acid residues found in the pocket.Pocket 1 has the highest pocket and probability scores with 12 of its 17 residues an active site of the protein (Table 2).According to Kim et al., 2021, the active site of the protein consists of catalytic and substrate binding sites.The catalytic site is composed of three residues: His250, Ser294, and Tyr343, while the substrate binding site consists of His235, Gln245, Gly248, Lys290, Val292, Cys293, Trp333, Glu340, Thr341, Pro344, Lys345, and Leu346.The similarity of the residues making up pocket 1 and the active site of the protein suggests they are in the same position (Figure 1).In addition, there is a natural ligand uridine-5'-monophosphate (U5P) docked in the pocket (Figure 1b).Therefore, molecular docking was done in pocket 1.

Structure preparation and grid box validation
Protein structure preparation was done by removing water molecules and unnecessary residues around the molecule.This aimed to reduce the potential of interference during the docking process with the ligand (Susanti et al., 2019).Ligand structures were prepared using the energy minimization method to stabilize the bond arrangement (Hanif et al., 2020) and optimize the molecular docking process.After preparation, grid box validation was done with RMSD value as the parameter.RMSD value illustrates the level of deviation of the experimental ligand docking results from the crystallographic ligand on the protein binding site.The higher the RMSD value, the greater the deviation of the docked ligand position from the crystallography results (Susanti et al., 2019).Based on the analysis results, the lowest RMSD was obtained with a 2.5Å grid box size of 0.0034Å.The RMSD value was considered as valid and can be used for molecular docking of the test and comparison ligands because it is less than 3Å (Susanti et al., 2019).In addition, based on the visualization results, the pose of the validated U5P ligand overlaps with the U5P ligand in the crystallographed protein (Figure 2).

Molecular docking
Molecular docking analysis is a method of designing drug molecules using computer-aided drug discovery (CADD) techniques that aim to determine the binding pose of ligands to protein binding sites and their physicochemical interactions (Ahmed et al., 2019).The docking was carried out against the natural ligand U5P, the comparison ligand tipiracil, and the test ligands identified in the white teak bark (Falah et al., 2008).Tipiracil is a uracil-derived compound used in colorectal cancer treatment, together with trifluridine, and has been recognized by the Food and Drug Administration (FDA) (Kish & Uppal, 2016).Meanwhile, because Nsp15 (NendoU) is a uridine-specific protein, tipiracil can be used as the comparison ligand (Kim et al., 2021).
The Gibbs binding free energy (ΔG) and dissociation constant (Kd) were used as parameters to analyze the molecular docking results.The binding ΔG value describes the stability of proteinligand complex formation (Du et al., 2016).Meanwhile, the dissociation constant is the point where all hydrogen bonds in the protein-ligand interaction are irreversibly broken (Rocheleau et al., 2016).According to Noviardi & Fachrurrazie (2015), the dissociation constant and ∆G value are directly related, so the more negative the ∆G, the smaller the Kd and the stronger and more stable the ligand-protein complex is.The docking results showed that balanophonin had the most negative ΔG and the smallest Kd compared to the other ligands (Table 3).Therefore, balanophonin is the best ligand because it inhibits the NendoU protein more potently than the natural ligand and its comparison ligands.In addition to the ΔG and Kd, the docking results can be analyzed by 2D visualization (Figure 3).Based on the analysis, the number of hydrogen and hydrophobic bonds formed was highly variable (Table 4).The difference in the number of hydrophobic interactions is because there are fewer amino acids with nonpolar side chains in the protein than polar side chains.Meanwhile, the difference in the number of hydrogen bonds is likely due to variations in the number of hydrogen bond donors and acceptors in each compound (Chen et al., 2016).
The most hydrogen bonds resulted from interactions with the natural and test ligand (L08) (- 3).This is because the protein used is a uridine-specific protein so the natural ligands have the most bonds with its active site (Kim et al., 2021), while the L08 ligand binds to many residues due to its large size.The large number of hydrogen bonds causes the ΔG to be more negative, thus strengthening the bond between the ligand and the protein (Tallei et al., 2020;Alimah et al., 2022).However, the most negative ΔG value is the test ligand balanophonin (L07).This may occur because the ΔG is also affected by the bond distance formed.According to Prasetiawati et al. (2021), a hydrogen bond is strong if the length is less than 2.8Å.Balanophonin and (-)-p-hydroxyphenylethyl[5′″-O-(3,4-dimethoxycinnamoyl)-b-d-apiofuranosyl (1′→ 6″)]-b-d-glucopyra-noside have hydrogen bonds with a distance of 2.8Å (Table 4).However, in balanophonin, the bond is formed on the residues that compose the protein catalytic site, making the bond more stable and resulting in more negative ΔG.Meanwhile, the most hydrophobic bonds result from interactions with gmelinol (Table 4).According to Arwansyah et al., (2014), the number of hydrophobic interactions plays a role in determining the stability of the ligand docked to the protein.

Ligand physicochemical characteristics prediction
The physicochemical characteristics of ligands can be analyzed based on Lipinksi's five rules: (1) molecular weight less than 500 Da, (2) log P value less than 5, (3) number of hydrogen bond donors less than 5, (4) number of hydrogen bond acceptors less than 10, and (5) molar refractivity ranging between 40 and 130 (Lipinski et al., 2001).Molecular weight affects the ability of compounds to passively diffuse through cell membranes because the greater the weight, the more difficult it is for molecules to pass through cell membranes.The log P value is the solubility coefficient of the compound in fat or water.The greater the logP value, the more hydrophobic the compound.The number of hydrogen bond acceptors and donors affects the amount of energy required in the absorption process (Syahputra et al., 2014).Meanwhile, the molar refractivity describes the steric properties of the compound against the protein (Marilia et al., 2021).A ligand is considered to fulfill Lipinski's rule if it does not violate more than two parameters (Petit et al., 2012).Based on the prediction results, only the ligand L08 did not pass Lipinski's rule (Table 5).Drugs that violate the Lipinski rule will cause problems in oral use and are often administered via intravenous injection (Amirian & Levy, 2020).

Ligand pharmacokinetic characteristics prediction
Absorption, distribution, metabolism, and excretion (ADME) prediction analyzes the pharmacokinetic characteristics profile of test compounds based on several parameters, including topological polar surface area (TPSA), %absorption, water solubility, GI absorption, log Kp, and bioavailability score (Nusantoro & Fadlan, 2020).TPSA estimates several ADME properties related to biological crossing barriers with acceptable values between 20 and 130Å 2 (Daina et al., 2017).Based on the analysis, only the natural ligand and L08 did not meet this parameter (Table 6).Another parameter, absorption, is used to predict the amount of compound absorbed through the human small intestine.The analysis showed that all ligands had an absorption rate of more than 40% (Table 6).This indicates that all ligands are well absorbed because they have absorption values >30% (Pires et al., 2015).
Water and gastrointestinal (GI) solubility is important since the injected drug candidate must be water and gastrointestinal-soluble (Daina et al., 2017).If the compound is not sufficiently soluble in the GI, it will inhibit its absorption through the portal vein system (Lagorce et al., 2017).The analysis showed that all ligands were categorized as soluble to very soluble in water.Meanwhile, in GI solubility, the natural ligand and test ligand L08 have low solubility (Table 6).The log Kp value is a multiple linear regression that predicts the compound permeability coefficient to the skin.Compounds with low skin permeability are characterized by log Kp > -2.5 (Pires et al., 2015).Based on the results obtained, all compounds have a good level of skin permeability (Table 6).The score on the bioavailability parameter describes the ability of the test compound to reach the target site of action where at least 10% or 0.10 of the compound is present at the target (Daina et al., 2017).The analysis showed that all ligands had bioavailability scores above 0.10 (Table 6).

Prediction of ligand toxicity characteristics
Toxicity analysis was done to determine the compound potential toxic effects that can cause damage to the body after going through metabolic processes (Noviardi et al., 2020).The parameters analyzed were hERG inhibition, carcinogenicity, and acute oral toxicity.Inhibition of hERG (human Ether-a-go-go-Related Gene) causes ventricular arrhythmia in the heart (Dickson et al., 2020).Based on the analysis, only the test ligand gmelinol (L06) can cause inhibition of hERG with a probability of more than 70% (Table 7).Carcinogenicity is defined as the nature or ability of a compound to trigger neoplasia or new tissue growth (Arief et al., 2018).The analysis showed that only the test ligand 2,6dimethoxy-p-benzoquinone (L04) was carcinogenic.Meanwhile, acute oral toxicity parameters can be determined using the LD50 value.Acute oral toxicity is exposure to a compound within 24 hours.This parameter can be categorized into 5 level groups based on the LD50 value: Ia (<5 mg/kg), Ib (5 -50 mg/kg), II (50 -500 mg/kg), III (500 -2000 mg/kg) and IV (>2000 mg/kg) (Pannindriya et al., 2021).Based on the analysis results, all test compounds fall into category III, which is a safe category (Table 7).

Conclusion
Balanophonin is the best ligand because it has the most negative binding ΔG and the smallest Kd.In addition, this ligand has good physicochemical characteristics, pharmacokinetics, and low ligand toxicity because it meets all test parameters.In conclusion, balanophonin can be developed as an effective alternative drug in inhibiting SARS-CoV-2 endoribonuclease.
is the probability of belonging to the class of "actives" Deskripsi: Pa merupakan probabilitas senyawa termasuk dalam kelompok senyawa aktif

Table 1 .
QSAR analysis of test ligands using the PASS Online server Tabel 1. Analisis QSAR dari ligan uji menggunakan PASS Online