IN-SILICO DIVERSITY ANALYSIS OF DISEASE RESISTANT NBS-LRR PROTEIN VARIANTS IN ORYZA SATIVA

Objective: Oryza sativa, which contributes more than 2 million tons of staple food, plays a varied role in the agrarian economy of Pakistan. A variety of pathogens and insecticides cause serious losses in its yield. To develop immunity at the genetic level in O. sativa, nucleotide binding site-leucine rich repeat (NBS-LRR) proteins, which are associated with the plant response to pathogens, were targeted in the present study. Methodology: Eight nucleotide sequences of the NBS-LRR protein were retrieved from the Rice Genome Annotation Project (RGAP) database. Variants were analyzed with the ProtParam, CELLO2 and SOPMA tools and the SWISS-MODEL server. Results and conclusion: Isoelectric point (pI), instability index, aliphatic index and GRAVY showed only slight variation. Diversity was observed in sub-cellular localization and in secondary (2D) and tertiary (3D) structures. The significant heterogeneity found among NBS-LRR proteins in the present study might help to remove homogeneity in rice crops, leading to reduced epidemics and yield stability.


INTRODUCTION
The agri-food sector has been facing challenges due to the increase in human population over the last few decades. Growing food needs are mostly met by cereal production. Oryza sativa (rice) is one of the most cultivated cereals and a major agronomic species which feeds 50% of the world population. Its production is 25% of world cereals (FAO, 2020; Vinci et al., 2023). Asian countries including Indonesia, Pakistan, China and India produce 92% of the world's rice, which is a food source for approximately 4.6 billion consumers (Gnanamanickam and Gnanamanickam, 2009; Muthayya et al., 2014). In addition to Asian countries, rice is also a staple food in the U.S. In 2020, rice crops were harvested on an area of 2.987 million acres in the U.S. and valued at 2495 million dollars (Godara, 2022; NASS, 2018). In 2021-2022, milled rice production in the U.S. was estimated to be 511.7 million metric tons (Bagnall et al., 2021). Nutritionally, rice comprises 75-80% starch, 12% water, 7% protein (with 4% lysine), and minerals such as calcium, phosphorus, magnesium, copper, manganese, iron and zinc (Beaulieu et al., 2022; Guan et al., 2023; Verma and Srivastav, 2020).
These above-mentioned pathogens, with their diverse modes of action, have made rice crop security a serious concern. Managing these pathogens via proteome engineering is the most sustainable solution. Manipulating resistance proteins and cloning various isoforms into the same plant through breeding and genetic engineering techniques is an emerging biocontrol strategy (Pandit et al., 2022). To engineer proteins, genes associated with pathogen resistance pathways in the plant should be targeted. Resistance genes play a crucial role in the pathogen resistance mechanism of plants known as effector-triggered immunity (ETI). One such type of potent gene is the nucleotide binding site-leucine rich repeat (NBS-LRR) gene. In the ETI pathway, resistance genes encode transmembrane receptors which bind and form a complex with specific avirulence (Avr) proteins of pathogens. This complementary binding initiates a conformational change in the LRR and amino-terminal domains of the NBS-LRR protein (Chang et al., 2022; Yuan M. et al., 2021). This change stimulates the NBS domain of the protein to exchange ADP for ATP, thereby activating a mechanism that stops the spread of the pathogen through cell death (DeYoung and Innes, 2006).
NBS-LRR genes are disease resistance genes. The majority of the isoforms constituting this group are uncharacterized. They are constitutively expressed in plants. NBS-LRR genes fall into two major groups: one encodes a coiled-coil motif at the amino terminus (CC-NBS-LRR), and the other comprises an N-terminal domain homologous to the Toll/interleukin-1 receptor (TIR) domain of mammals (DeYoung and Innes, 2006).
In the present study we selected eight Avr effector-interacting variants of the NBS-LRR resistance protein which have been reported to be associated with the development of resistance against Magnaporthe oryzae in two rice cultivars, i.e. BR2655 and HR12 (Chandrakanth et al., 2020). The objective of the present project was to characterize these variants at the level of 2D and 3D configuration and physicochemical properties. This characterization might be helpful in isolating these resistance-imparting genes from different rice plants and engineering them into the same plant, thus producing rice crops with potential pathogen resistance.
Owing to rapid changes in environmental conditions and excessive use of pesticides, pathogens are developing into multiple races. To combat these pathogens and to attain environmental sustainability, the development of immunity at the level of genes is crucial.

METHODOLOGY
Retrieving the sequences of eight variants of NBS-LRR gene from database: Sequences of eight variants of the NBS-LRR gene in O. sativa were retrieved from the Rice Genome Annotation Project (Osa1) Release 7 Annotation (rice.uga.edu/amalyses_search_locus.shtml, accessed in July 2023) (Yuan Q. et al., 2005).

Phylogeny of NBS-LRR protein variants:
To analyze the evolutionary relationship of the NBS-LRR protein variants of O. sativa documented in the present study, both among themselves and with the same proteins of other plants such as Solanum lycopersicum (tomato), Arabidopsis thaliana (thale cress) and Glycine max (soybean), protein sequences were retrieved from the NCBI (National Center for Biotechnology Information) GenBank database (https://www.ncbi.nlm.nih.gov/nucleotide/, accessed in July 2023). Clustal Omega was used for multiple sequence alignment of the variants documented in the present study (Sievers and Higgins, 2014). The ungapped aligned sequences were then loaded into MEGA11 and a phylogenetic tree was constructed (Tamura et al., 2021). A neighbour-joining tree was constructed using a bootstrap value of 100 as the reliability index. To predict the diversity at different levels among these variants, the protein sequences were analyzed further. The results of these analyses are discussed in detail.
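The idea behind a bootstrap reliability index can be illustrated with a minimal sketch: alignment columns are resampled with replacement, and the analysis is repeated on each replicate to see how often a grouping recurs. The toy alignment, the sequence names and all function names below are hypothetical. The "closest pair" criterion is a crude stand-in for a clade, not the neighbour-joining procedure implemented in MEGA11, which rebuilds the full tree on every replicate.

```python
import random


def p_distance(a, b):
    """Proportion of differing sites between two aligned, ungapped sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def bootstrap_alignment(alignment, rng):
    """Build one bootstrap replicate by resampling columns with replacement."""
    length = len(next(iter(alignment.values())))
    cols = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[i] for i in cols) for name, seq in alignment.items()}


def closest_pair(alignment):
    """Return the pair of sequence names with the smallest p-distance."""
    names = sorted(alignment)
    pairs = [
        (p_distance(alignment[a], alignment[b]), a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
    ]
    return min(pairs)[1:]


def pair_support(alignment, pair, replicates=100, seed=0):
    """Percentage of replicates in which `pair` is still the closest pair."""
    rng = random.Random(seed)
    hits = sum(
        closest_pair(bootstrap_alignment(alignment, rng)) == pair
        for _ in range(replicates)
    )
    return 100 * hits / replicates


# Hypothetical toy alignment (not sequences from the present study).
TOY = {
    "seqA": "MKTLLVAGGS",
    "seqB": "MKTLLVAGGT",
    "seqC": "MRSLIVSGAT",
    "seqD": "MRSAIVSPAT",
}

print(closest_pair(TOY), pair_support(TOY, ("seqA", "seqB")))
```

A grouping that recurs in nearly all replicates is well supported by the alignment, whereas one that recurs in fewer than half of them should be interpreted with caution.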

Prediction of sub-cellular localization of variants:
The CELLO predictor showed diversity in the sub-cellular localization of the protein forms in the present study. NBS-LRR1, LRR2, LRR4, LRR6 and LRR8 were predicted to be localized in both cytoplasm and nucleus, with reliability scores of 1.388 & 1.138, 1.377 & 2.432, 1.290 & 1.618, 1.676 & 1.248 and 2.705 & 1.325, respectively. These values indicate that LRR2 is more likely to be localized in the nucleus and LRR8 in the cytoplasm. In the other three cases, the probabilities of localization in cytoplasm and nucleus are roughly equal. NBS-LRR3 and LRR7 were predicted to be nuclear, with scores of 3.145 and 3.671, respectively. The NBS-LRR5 variant was predicted to be localized in the cytoplasm with a reliability score of 3.145 (Figure 1).
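The comparison of reliability scores described above can be made explicit with a short sketch. The scores are those reported in the text. The dictionary layout, the function name and the use of a simple score difference as a "margin" are illustrative assumptions and are not part of the CELLO output.

```python
# CELLO reliability scores as reported in the text (compartment -> score).
CELLO_SCORES = {
    "NBS-LRR1": {"cytoplasm": 1.388, "nucleus": 1.138},
    "NBS-LRR2": {"cytoplasm": 1.377, "nucleus": 2.432},
    "NBS-LRR3": {"nucleus": 3.145},
    "NBS-LRR4": {"cytoplasm": 1.290, "nucleus": 1.618},
    "NBS-LRR5": {"cytoplasm": 3.145},
    "NBS-LRR6": {"cytoplasm": 1.676, "nucleus": 1.248},
    "NBS-LRR7": {"nucleus": 3.671},
    "NBS-LRR8": {"cytoplasm": 2.705, "nucleus": 1.325},
}


def top_compartment(scores):
    """Return (best compartment, margin over the runner-up or None)."""
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    best_name, best_score = ranked[0]
    # The margin is only defined when a second compartment was reported.
    margin = best_score - ranked[1][1] if len(ranked) > 1 else None
    return best_name, margin


SUMMARY = {variant: top_compartment(s) for variant, s in CELLO_SCORES.items()}

for variant, (compartment, margin) in SUMMARY.items():
    note = "single prediction" if margin is None else f"margin {margin:.3f}"
    print(f"{variant}: {compartment} ({note})")
```

Among the dual-localized variants, LRR8 and LRR2 have the largest margins between the two compartments, which matches the interpretation given in the text.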

Prediction of physicochemical properties of variants:
Differences among these protein variants at the level of physicochemical properties were analyzed using the ProtParam tool (Table 1). The analysis revealed that the variants comprised different numbers of amino acids and hence had different molecular weights. NBS-LRR1 and LRR3 were the shortest, with 2574 and 2709 amino acids, respectively, while NBS-LRR2 and LRR4 comprised the highest number of amino acids, i.e. 4479 and 4443, respectively. Little deviation was observed in pI, which ranged between 4.77 and 4.90. The highest extinction coefficients were observed for NBS-LRR2 (57875 M⁻¹ cm⁻¹) and LRR4 (57750 M⁻¹ cm⁻¹). The lowest value was observed for LRR1, i.e. 32625 M⁻¹ cm⁻¹. LRR3, LRR5, LRR6, LRR7 and LRR8 showed intermediate values, i.e. 34250, 36375, 40250, 50875 and 36250 M⁻¹ cm⁻¹, respectively.
Prediction of 2D structure of protein variants:
NBS-LRR protein variants were found to exhibit diversity in 2D structure in terms of extended strand and beta turn (Table 2 and Figure 2), while the deviation in alpha helix and random coil was smaller. Extended strand content ranged between 9.51% and 12.75%. LRR4, LRR5 and LRR8 were found to exhibit 9.93%, 9.97% and 9.51% extended strand content, respectively. LRR3 showed the highest content with 12.75%, and LRR1 exhibited the second highest, i.e. 11.20%. Beta turn content ranged between 1.88% and 5.65%, the lowest being observed for LRR2 and the highest for LRR7. No significant variation was observed in alpha helix and random coil content, which ranged between 50.68%-56.82% and 30.49%-36.39%, respectively.
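For readers who wish to see how some of these parameters are derived from a sequence, the following sketch implements the standard formulas used by ProtParam for aliphatic index, GRAVY and extinction coefficient at 280 nm. The function names and the example sequence are hypothetical, and the input is assumed to be an upper-case string containing only the 20 standard amino acids. It is an illustration of the formulas, not a replacement for the tool.

```python
# Kyte-Doolittle hydropathy values used for GRAVY.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def aliphatic_index(seq):
    """X(Ala) + 2.9*X(Val) + 3.9*(X(Ile) + X(Leu)), with X in mole percent."""
    n = len(seq)

    def mole_pct(aa):
        return 100.0 * seq.count(aa) / n

    return mole_pct("A") + 2.9 * mole_pct("V") + 3.9 * (mole_pct("I") + mole_pct("L"))


def gravy(seq):
    """Grand average of hydropathy: mean Kyte-Doolittle value per residue."""
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def extinction_coefficient(seq, cystines=True):
    """Molar extinction coefficient at 280 nm in M^-1 cm^-1.

    Tyr contributes 1490, Trp 5500 and each cystine (Cys pair) 125.
    With cystines=False, all Cys residues are assumed to be reduced.
    """
    value = 1490 * seq.count("Y") + 5500 * seq.count("W")
    if cystines:
        value += 125 * (seq.count("C") // 2)
    return value


# Hypothetical example sequence (not one of the variants in this study).
example = "MKVLAAGIWYCC"
print(aliphatic_index(example), gravy(example), extinction_coefficient(example))
```

Because the extinction coefficient depends only on the numbers of Trp, Tyr and Cys residues, longer variants with more aromatic residues are expected to show higher values.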
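The percentages reported for each conformational state can be reproduced from a per-residue state string, as in the sketch below. It assumes the four-state, one-letter coding commonly shown in SOPMA output (h = alpha helix, e = extended strand, t = beta turn, c = random coil). The function name and the example string are hypothetical.

```python
from collections import Counter

# Assumed one-letter state codes (four-state prediction).
STATE_NAMES = {
    "h": "alpha helix",
    "e": "extended strand",
    "t": "beta turn",
    "c": "random coil",
}


def secondary_structure_composition(states):
    """Return the percentage of residues assigned to each state."""
    states = states.lower()
    unknown = set(states) - set(STATE_NAMES)
    if unknown:
        raise ValueError(f"unexpected state codes: {sorted(unknown)}")
    counts = Counter(states)
    total = len(states)
    return {name: 100.0 * counts[code] / total for code, name in STATE_NAMES.items()}


# Hypothetical 10-residue prediction, for illustration only.
composition = secondary_structure_composition("hhhhheettc")
for name, pct in composition.items():
    print(f"{name}: {pct:.2f}%")
```

Applying the same function to each variant's prediction would give values directly comparable with those listed in Table 2.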

Prediction of 3D configuration of protein variants:
Considerable diversity was observed in the 3D configuration of NBS-LRR protein variants (Figure 3). LRR7 exhibited the simplest level of folding compared with the others, followed by LRR1, LRR2 and LRR4, while the remaining variants were found to have a complex level of folding.
Phylogeny analysis:
The neighbour-joining tree constructed via MEGA11 revealed the closest relation of NBS-LRR1 with O. sativa, as both originated from the same branch point with a bootstrap value of 100. The remaining seven isoforms were unrelated because they did not directly share a clade with O. sativa (Figure 4). These both shared a clade with Arabidopsis thaliana, but only with a bootstrap value of 27. LRR2 and LRR6 shared a clade, but with a bootstrap value of only 46. These two also shared a clade with LRR6. LRR5 and LRR8 were found to be phylogenetically closely related, as they originated from the same branch point with a bootstrap value of 100. These two also showed closeness to LRR3 and LRR4, but the reliability index was very small. According to this phylogenetic tree, LRR3, LRR4, LRR5 and LRR8 were evolutionarily closer to Glycine max than to O. sativa. This analysis showed considerable divergence among the NBS-LRR protein isoforms documented in the present study.
The values for aliphatic index were found in the range of 27.23 to 32.05, which is inconsistent with the range of 88.09 to 103.58 reported in previous work (Chandrakanth et al., 2020). The aliphatic index is a measure of the proportion of a protein made up of aliphatic amino acids and is directly related to the thermal stability of the protein (Karshikoff et al., 2015). The eight variants analyzed in the present work do not seem to be thermally stable, while the values reported previously indicated high thermal stability of the isoforms. This is the first study reporting instability index, extinction coefficient and GRAVY for these eight isoforms of the NBS-LRR protein, as these parameters have not previously been analyzed. The stability of a protein in a test tube is indicated by its instability index. A value below 40 predicts a stable protein, whereas a value above 40 predicts that the protein may be unstable (Gamage et al., 2019). In the present study, all the variants except NBS-LRR5 and LRR8 were predicted to be stable.
The variation in 2D configuration among the eight variants was prominent for extended strand and beta turn. Our values for the proportion of amino acids participating in alpha helix formation, i.e. 50.68 to 56.80%, were not consistent with the earlier literature, in which 25-46% of residues were found to be associated with alpha helix in these eight variants. Earlier work reports the involvement of 9-14% of residues in beta turn, while in the present study the values were lower, at 1.88 to 5.65%. Similarly, the amino acids participating in random coil formation in the present study were predicted to range from 30.49 to 36.39%. This finding is not consistent with the earlier literature, where this value ranged between 48 and 60% (Chandrakanth et al., 2020).
Our finding of a highly folded and complex 3D configuration of NBS-LRR protein variants is consistent with earlier findings, where different isoforms of this protein exhibited complex structures (Chandrakanth et al., 2020; Terensan et al., 2021).
Our phylogenetic analysis revealed that the isoforms of the NBS-LRR protein have evolved considerably and showed divergence when aligned with O. sativa. This finding is not consistent with an earlier study which reports the conserved nature of different isoforms of the protein addressed in the present study (Mizuno et al., 2020).
The marked diversity observed in the present study is in accordance with previous literature, as one study has reported high diversity in nucleotides and copy number of NBS-LRR gene loci in fourteen wild rice populations and twenty cultivars (Yang et al., 2008).

Conclusion:
Pyramiding of these isoforms of the NBS-LRR gene into a single rice plant genome through artificial breeding and genetic engineering strategies might induce broad-spectrum pathogen resistance. Phenotypic characteristics cannot help in the identification of plants with potential resistance genes. The characteristics of the variants analyzed in the present study might be used as markers for the selection of these genes, which can then be isolated and introduced simultaneously into a plant, thus developing strong resistance in rice crops. Coupling these resistant plants with chemical or cultural pathogen control strategies might considerably enhance crop yield and may be a more sustainable solution.

Statements and Declarations:
Funding: N/A

Figure 2: Prediction of secondary structure of eight variants of NBS-LRR protein documented in present study using SOPMA tool