Structure-based virtual screening of bioherbicide candidates for weeds in sugarcane plantation using in silico approaches

Weeds in sugarcane plantations have negatively affected sugar yield. Several approaches have been used to control them, including the synthetic herbicide diuron. However, long-term use of diuron is known to have negative effects, as it leads to the production of 3,4-Dichloroaniline, which is responsible for soil leaching and bioaccumulation. Therefore, this study aimed to find a potential natural herbicide. Mimicking diuron's mode of action, which inhibits photosynthesis by blocking the Photosystem II protein D1 (psbA) of the weeds, fourteen candidate bioherbicide compounds were virtually docked to the specific site using PyRx v.0.9.5 software. Three important weed species were chosen: Eleusine indica, Praxelis clematidea, and Momordica charantia. The binding affinity scores were then calculated and ranked to screen the top six compounds as bioherbicide candidates. The interactions of each complex and the biological activity predictions were analyzed using Discovery Studio software and the PASS server, respectively. Aurachin P, Aurachin A, and Cyanobacterin were among the top-ranked compounds, with high binding affinity scores of around -6 to -9 kcal mol⁻¹ toward psbA. The amino acids involved in the interactions of these complexes were 50-90% similar to those of the control complex of psbA and diuron. In addition, the biological activity predictions for Aurachin P, Aurachin A, and Cyanobacterin returned terms related to inhibition of the photosynthesis process via enzymatic pathways. Thus, the active compounds might inhibit photosynthesis and control weeds in sugarcane.


Introduction
Controlling weeds in sugarcane is a challenge that farmers must face. Weeds grow faster than sugarcane (Saccharum officinarum L.), mainly in the early stages of crop growth. Sugarcane yields have been reported to decrease by around 24 to 93% as a result of weeds competing with the crop for water, nutrients, and sunlight (Singh & Kumar, 2013). Singh et al. (2011) highlighted a significant increase of weeds in sugarcane plantations, as farmers have limited expertise and knowledge to improve weed management.
The use of herbicides is a common practice in weed management of crops. Previous research reported the high efficacy of herbicides (90-99%) in killing weeds (Wakabayashi & Boger, 2002; Délye et al., 2013). In the USA, herbicides have been widely used (95%) in cotton, soybean, maize, and sugar beet (Gianessi, 2005). According to the Sistem Informasi Pestisida database in 2020, herbicide use in Indonesia targets many commodities including acacia, orchid, grapes, apple, corn mill, onion, and shallot. However, chemical herbicide usage has been banned due to its negative effects on soil health, aquatic environments, and the atmosphere. These substances also have negative effects on human health and ecosystem sustainability (Morales et al., 2013; Huovinen et al., 2015; Velki et al., 2019).
Diuron is one of the most familiar synthetic herbicides used by farmers to control weeds in sugarcane plantations (Peng, 2012). The substance belongs to the phenyl amide family and acts as a photosynthesis inhibitor by preventing oxygen production (Wessels & Veen, 1956). In addition, it blocks electron transfer in photosystem II (PSII). The D1 protein is the center of the PSII reaction, and C-terminal processing of the precursor D1 protein is important (Teixeira & Elzbieta, 2013). Some reports mention its effectiveness in killing annual and perennial grassy weeds. It has been widely used in crops such as cotton, sugarcane, alfalfa, and wheat. However, the use of diuron has been reported to cause environmental problems. Diuron has been detected in 28% of river samples in the USA National Canal System. In addition, diuron degrades into intermediate substances, leading to the formation of 3,4-Dichloroaniline (3,4-DCA), which causes soil leaching and bioaccumulation (Giacomazzi & Cochet, 2004). In Indonesia, diuron is still allowed, mainly for sugarcane. Indonesian regulation permits diuron as the active compound in ten pesticide brands: Amrocon 80 WP, Bioron 80 WP, Gonzales 80 WP, Gulmaron 500 SC, Gulmaron 80 WP, Maron 80 WP, Ronindo 500 SC, Ronindo 80 WP, Sidaron 80 WP, and Viaron 500 SC (Sistem Informasi Pestisida, 2020). If this condition continues, it might affect soil health in the environment. Such findings have led to the classification of diuron as a harmful substance whose use is to be phased out within 20 years under Directive 2000/60/CE. To overcome this condition, researchers have tried to develop herbicides derived from the secondary metabolites of plant species and microbes, to minimize the environmental effects and create safer, non-toxic compounds (Nusrat et al., 2018; Radhakrishnan et al., 2018).
Dayan & Duke (2014) reviewed the varieties of next-generation herbicides from natural compounds and their detailed mechanisms of action. Their article was used as the source database of bioherbicides in this study. This study provides insight into finding bioherbicides virtually, using a structural bioinformatics approach that incorporates molecular docking to find bioherbicides that perform better and are safer than diuron. The analysis was based on diuron's mechanism of action, which blocks the photosynthesis process, applied here to sugarcane weeds. Targeting a protein and predicting binding affinity by molecular docking has been used for years in the drug discovery process (Meng et al., 2011; Pinzi & Rastelli, 2019) owing to its accuracy and effectiveness in screening candidates. The approach was adopted here to find bioherbicide candidates targeting a specific protein in plants.

Samples retrieval of bioherbicide candidates
The list of candidate bioherbicides was retrieved from the review paper of Dayan & Duke (2014). That study provided a list of bioherbicides whose mechanism of action (MOA) targets PSII electron transport, together with their sources. The compounds are classified as natural phytotoxins isolated from various organisms (Sorghum bicolor, Scytonema hofmanni, Fischerella muscicola, and Stigmatella aurantiaca). The 3D structures of the candidate compounds were obtained from the PubChem database (https://pubchem.ncbi.nlm.nih.gov) using their unique IDs (Table 1).

Protein sequences retrieval, modeling, and 3D structure analysis
Three weed species, Eleusine indica, Praxelis clematidea, and Momordica charantia, were chosen based on field observation. The sequences of Photosystem II protein D1 (psbA) from E. indica (ID K9MXP5), P. clematidea (ID W8RMZ2), and M. charantia (ID A0A2I6BZY2) were retrieved from the UniProt database (https://uniprot.org). The protein target was then modeled using I-TASSER software (https://zhanglab.ccmb.med.umich.edu/I-TASSER/) (Yang & Zhang, 2015), with a template having high similarity to the structure of the amino acid sequences. The following considerations were used to choose the proper protein model from I-TASSER: (1) the rank of the proteins, which is based on the TM-score of the structural alignment between the query structure and known structures in the PDB library; (2) the lowest Root Mean Square Deviation (RMSDa) score, which represents the smallest RMSD value between residues that are structurally aligned by TM-align; (3) the highest IDENa, which represents the highest percentage sequence identity in the structurally aligned region; (4) the highest Cov score, which represents the highest coverage of the alignment by TM-align and is equal to the number of structurally aligned residues divided by the length of the query protein.
To compare the psbA structure among the three weeds, the amino acid sequences were aligned using MUSCLE in BioEdit software (Hall, 2011). The RMSD between the 3D protein structures was calculated using PyMOL to reveal structural differences (The PyMOL Molecular Graphics System, Version 1.2r3pre, Schrödinger, LLC).
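The four selection criteria above can be sketched as a simple ranking. This is only an illustration: the class and function names are hypothetical, the field values must be supplied by the reader from their own I-TASSER output, and the lexicographic ordering (TM-score first, then RMSD, identity, and coverage) is an assumption about how ties between criteria might be resolved, not the procedure documented in this study.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class StructuralAnalog:
    """One structural analog row (hypothetical container, not an I-TASSER type)."""
    pdb_hit: str            # template identifier
    tm_score: float         # TM-score of the structural alignment
    rmsd: float             # RMSD over structurally aligned residues (angstrom)
    iden: float             # sequence identity in the aligned region (0..1)
    aligned_residues: int   # number of structurally aligned residues


def coverage(aligned_residues: int, query_length: int) -> float:
    """Cov = structurally aligned residues / length of the query protein."""
    if query_length <= 0:
        raise ValueError("query_length must be positive")
    return aligned_residues / query_length


def rank_analogs(analogs: List[StructuralAnalog], query_length: int) -> List[StructuralAnalog]:
    """Order analogs: highest TM-score, then lowest RMSD, highest IDEN, highest Cov.

    The tie-breaking order is an assumption made for this sketch.
    """
    return sorted(
        analogs,
        key=lambda a: (
            -a.tm_score,
            a.rmsd,
            -a.iden,
            -coverage(a.aligned_residues, query_length),
        ),
    )
```

Calling `rank_analogs(rows, query_length=353)` would return the rows with the preferred template first; 353 is the alignment length reported for psbA in this study.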
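As a minimal sketch of the quantity being compared, the RMSD between two sets of paired atomic coordinates can be computed as below. It assumes the two structures have already been superimposed and that atoms are paired one to one; PyMOL performs the superposition and pairing itself, so this function is an illustration of the formula, not a replacement for that tool.

```python
import math
from typing import Sequence, Tuple

Coordinate = Tuple[float, float, float]


def rmsd(coords_a: Sequence[Coordinate], coords_b: Sequence[Coordinate]) -> float:
    """Root mean square deviation between paired atoms.

    Assumes both coordinate lists are in the same frame (already
    superimposed) and that index i in one list pairs with index i
    in the other.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must have the same length")
    if not coords_a:
        raise ValueError("coordinate lists must not be empty")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))
```

A value near 0 means the paired atoms nearly coincide; larger values indicate greater structural deviation.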

Molecular docking analysis and visualization
To understand the molecular interaction and binding affinity between the bioherbicide candidate compounds and psbA, virtual screening by molecular docking was carried out using PyRx 0.9.5 software, which was developed by Dallakyan & Olson (2015) and has been widely used (de Sousa et al., 2020; Kulkarni et al., 2020; Venkateshan et al., 2020). The research design sought inhibitors of psbA, with the aim of finding a natural compound candidate better than diuron. The grid used for docking analysis was center X: 65368, Y: 81888, Z: 91545, dimensions (Å) X: 7168, Y: 7358, and Z: 10405. Docking was performed specifically at the binding site of diuron, which served as the control. The docking complexes and amino acid interactions were visualized using Discovery Studio R2017 (Dassault Systèmes BIOVIA, BIOVIA Visualizer, Release 2017).
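The screening step, ranking compounds by binding affinity and comparing them with the control, can be sketched as follows. The function names are hypothetical and this is not PyRx code; it only assumes that each compound has one affinity score in kcal mol⁻¹, where a more negative score means stronger predicted binding. The example values are the E. indica scores reported in the Results of this study; the control score is left as a parameter.

```python
from typing import Dict, List, Tuple


def rank_by_affinity(scores: Dict[str, float], top_n: int = 6) -> List[Tuple[str, float]]:
    """Return the top_n compounds, most negative (strongest) affinity first.

    Ties are broken alphabetically by compound name so the order is stable.
    """
    ranked = sorted(scores.items(), key=lambda item: (item[1], item[0]))
    return ranked[:top_n]


def better_than_control(scores: Dict[str, float], control_score: float) -> List[str]:
    """Names of compounds whose affinity is more negative than the control's."""
    return sorted(name for name, score in scores.items() if score < control_score)


# Affinity scores reported in this study for the psbA of E. indica (kcal/mol).
e_indica_scores = {"Cyanobacterin": -6.7, "Aurachin P": -6.2, "Aurachin A": -6.2}
print(rank_by_affinity(e_indica_scores, top_n=3))
# → [('Cyanobacterin', -6.7), ('Aurachin A', -6.2), ('Aurachin P', -6.2)]
```

Sorting on the raw score keeps the convention used in the text, where "higher affinity" corresponds to a more negative value.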

Biological activity prediction
To explore the biological activity of the candidate compounds, prediction analysis was carried out using the PASS server website (http://www.pharmaexpert.ru/passonline/index.php). This web server predicts biological activity based on structural similarity to compounds in its database. Prediction scores range from 0 to 1. A score above 0.7 indicates a high likelihood that the predicted activity would be confirmed in a laboratory test (Filimonov et al., 2018).
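The 0.7 cutoff described above amounts to a simple filter over the predicted activities. The sketch below assumes the predictions have been copied by hand into a mapping from activity term to score; the function name is hypothetical and nothing here queries the PASS server.

```python
from typing import Dict, List, Tuple


def confident_activities(
    predictions: Dict[str, float], threshold: float = 0.7
) -> List[Tuple[str, float]]:
    """Keep activity terms scoring above the threshold, highest score first.

    Scores are expected in the 0..1 range described in the text.
    """
    for term, score in predictions.items():
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"score for {term!r} is outside the 0..1 range")
    kept = [(term, score) for term, score in predictions.items() if score > threshold]
    return sorted(kept, key=lambda item: (-item[1], item[0]))
```

The threshold is exposed as a parameter so that a stricter or looser cutoff can be tried without changing the function.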

Results and Discussion
Amino acid sequence alignment, protein modeling, and RMSD analysis
The amino acid sequences were aligned using MUSCLE to identify possible structural variation of psbA among the three weeds. The alignment revealed differences among the 353 amino acids of psbA from E. indica, P. clematidea, and M. charantia (Figure 1). Compared to the E. indica protein sequence, three amino acid differences appeared in the psbA of P. clematidea, at residue pairs 11 (T → E), 346 (L → I), and 351 (L → T). In M. charantia, six differences were found, at residue pairs 11 (T → E), 238 (K → R), 346 (L → V), 348 (A → V), 349 (P → T), and 351 (L → I).
The protein structure was modeled to visualize the 3D structure of psbA from the three weeds. The template 4YUUA was selected by the I-TASSER algorithm from among thousands of proteins in the LOMETS database (Figure 2A-C). Superimposition analysis was carried out by calculating RMSD scores, which represent atomic distances, to identify structural differences (Figure 2D). The psbA of E. indica and P. clematidea had an RMSD score of 1.016, whilst the pairs E. indica and M. charantia, and M. charantia and P. clematidea, showed similar RMSD scores of around 0.9 (Table 2). The data indicated small structural variations among the three proteins.
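Listing substitutions such as "11 (T → E)" from a pairwise alignment can be sketched as below. The function name is hypothetical, positions are numbered along the reference sequence (an assumption, since the text does not state whether alignment columns or residue numbers were used), and columns containing a gap are skipped.

```python
from typing import List, Tuple


def list_substitutions(
    reference: str, other: str, gap: str = "-"
) -> List[Tuple[int, str, str]]:
    """Return (position, reference_residue, other_residue) for each mismatch.

    Both inputs must be rows of the same alignment (equal length).
    Positions are 1-based and count only non-gap reference residues.
    """
    if len(reference) != len(other):
        raise ValueError("aligned sequences must have the same length")
    substitutions = []
    ref_position = 0
    for ref_res, other_res in zip(reference, other):
        if ref_res != gap:
            ref_position += 1
        if ref_res != other_res and gap not in (ref_res, other_res):
            substitutions.append((ref_position, ref_res, other_res))
    return substitutions


def format_substitution(position: int, ref_res: str, other_res: str) -> str:
    """Format one substitution in the notation used in the text."""
    return f"{position} ({ref_res} → {other_res})"
```

With the E. indica row as `reference`, each returned tuple can be passed to `format_substitution` to reproduce the notation used in this section.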
The distance-based RMSD measurement between the D1 proteins of E. indica, P. clematidea, and M. charantia showed a deviation of ∼1 Å. Eyal et al. (2005) stated that the limit of accuracy of protein modeling is ∼8 Å, which leads to the conclusion that the D1 proteins have no meaningful differences in terms of structure and function. The RMSD value is calculated from pairs of atoms and the distance between the two atoms of each pair (Kufareva & Abagyan, 2012). Slight differences were spotted in the 3D structure of the D1 protein from each weed species when visualized by superimposition, which supports the RMSD values (Figure 2D, red circle).
The molecular interaction between the psbA of each weed and the candidate compounds was then examined. Fourteen bioherbicide candidates were selected based on Dayan & Duke (2014) and docked into psbA, with diuron as the control. The virtual screening approach examines a target and a subset of compounds in order to reduce the number of compounds to test in the laboratory. It predicts ligand binding modes computationally using specific algorithms. The scoring results represent the global minimum of the energy needed for the interaction (Salmaso & Moro, 2018).
The top six compounds of each complex were ranked based on the binding affinity score. Aurachin P, Aurachin A, and Cyanobacterin appeared in the top six for the psbA of every weed species. Cyanobacterin showed the highest potential as a candidate compound to block the D1 protein of E. indica, followed by Aurachin P and Aurachin A, with affinity scores of -6.7, -6.2, and -6.2 kcal mol⁻¹, respectively. In complex with the psbA of P. clematidea, Aurachin A exhibited the highest affinity, followed by Aurachin P and Cyanobacterin. Aurachin P ranked first in the interaction with the psbA of M. charantia, followed by Aurachin A and Cyanobacterin (Table 3). These three compounds exhibited a higher binding affinity than diuron, the control. The complex interactions were analysed to identify the amino acids involved (Figure 3). The complexes mostly showed van der Waals interactions between the molecules (Table 4, green bubble). The analysis showed that 50%-90% of the amino acids responsible for the interaction with Aurachin A, Aurachin P, and Cyanobacterin were the same as those in the complex of psbA and diuron for each weed species (Table 4). The high similarity of the binding sites suggests that the candidate compounds resemble diuron in function as D1 protein blockers. Some of the compounds bind to specific amino acids that have important functions. Based on the UniProt database, amino acid (AA) 161 functions as the tyrosine radical intermediate, while AA 170 and 333 play a role in the calcium-manganese-oxide cluster [Ca-4Mn-5O] (manganese 1 and 4).
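The similarity percentage between a candidate complex and the diuron control can be sketched as a set overlap of interacting residues. The function name and residue label format are hypothetical, and taking the control's residue set as the denominator is an assumption; the text does not state which set the 50%-90% figures were normalized against.

```python
from typing import Iterable


def shared_residue_percent(
    candidate_residues: Iterable[str], control_residues: Iterable[str]
) -> float:
    """Percentage of the control's interacting residues also contacted by the candidate.

    Residues are compared as plain labels (for example residue name plus
    number); the denominator is the control set, which is an assumption.
    """
    control = set(control_residues)
    if not control:
        raise ValueError("control_residues must not be empty")
    shared = control & set(candidate_residues)
    return 100.0 * len(shared) / len(control)
```

Using sets means a residue that forms several contacts with the same ligand is counted once.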

Potential biological activity of Aurachin P, Aurachin A, and Cyanobacterin
To explore the potential of Aurachin P, Aurachin A, and Cyanobacterin, biological activity prediction was conducted (Table 5). The predicted activities of Aurachin P and Aurachin A were similar, mostly related to inhibition of prenyl-diphosphatase, undecaprenyl-phosphate mannosyltransferase, and plastoquinol-plastocyanin reductase. These terms relate to the inhibition of enzymes and catalytic activities that play an important role in photosynthesis. In contrast, Cyanobacterin was predicted to act as a 1-Acylglycerol-3-phosphate O-acyltransferase inhibitor, related to the negative regulation of phosphatidic acid biosynthesis. The predicted activities of the candidate bioherbicides support the prediction that they inhibit the photosynthesis process in weeds.
The biological activity of Aurachin P and Aurachin A was predicted to be related to the inhibition of plastoquinol-plastocyanin reductase. This enzyme plays a role in the linear electron transfer chain that contributes to oxygenic photosynthesis in the chloroplast. The linear electron chain is responsible for oxidizing water into molecular oxygen and reducing NADP+ to NADPH. This creates a transmembrane proton gradient, which is converted by ATP synthase into chemical energy (ATP). The plastoquinol-plastocyanin reductase catalyzes the electron transfer between Photosystem II and I, which are the photosynthetic reaction centers of oxygenic photosynthesis (Gao et al., 2018). If this process is inhibited by Aurachin P and Aurachin A, photosynthesis would not proceed.
Cyanobacterin was predicted to have 1-Acylglycerol-3-phosphate O-acyltransferase inhibitor activity. This enzyme plays a role in phosphatidic acid biosynthesis. It may regulate neutral lipid accumulation and participate in the regulation of lipid turnover. Phospholipids are well known to play crucial roles in development and signal transduction. They also regulate homeostasis during growth and development stages under stress conditions. Phosphatidic acid has been shown to be key for thylakoid lipid biosynthesis in the chloroplast (Yao & Xue, 2018). If this process is inhibited by Cyanobacterin, the growth of weeds would be halted.

Conclusion
This study provides insight into bioherbicide candidate compounds that showed potentially better affinity than the synthetic herbicide diuron. The results indicate that Aurachin P, Aurachin A, and Cyanobacterin were the best candidate blockers of the Photosystem II D1 protein for inhibiting the growth of the selected sugarcane weeds. However, efficacy tests are required to confirm the potential effectiveness of the compounds identified in this research.