Modeling of the Interaction of Flavanoids with GABA (A) Receptor Using PRECLAV (Property-Evaluation by Class Variables)

Quantitative structure-activity relationship (2D-QSAR) models for the binding affinity constants (log Ki) of 78 flavonoid ligands towards the benzodiazepine site of the GABA (A) receptor complex were estimated using the PRECLAV (Property-Evaluation by Class Variables) program. The best MLR equation with nine PRECLAV descriptors has R = 0.843 and C = 0.782. An attempt was also made to obtain a 2D-QSAR model using NCSS software. Comparison of the results indicates that the PRECLAV method is very efficient in detecting structure-activity correlations with good predictive power.


Introduction
During the last two decades, quantitative structure-activity relationship (2D-QSAR) models have gained extensive recognition in drug design [1]. The widespread use of 2D-QSAR models comes from the development of novel structural descriptors and statistical equations relating activity to chemical structure. The main hypothesis in the 2D-QSAR approach is that the biological activity of a chemical substance is statistically related to its molecular structure. The PRECLAV program uses the atoms in the common skeleton to compute bond and field (grid) descriptors [2,3]. The PRECLAV program computes five classes of structural descriptors: constitutional, topological indices, molecular graph invariants, geometrical, quantum bond indices and field (grid) descriptors [2][3][4].
All molecules are aligned by superimposing the common atoms before the multiple linear regression models are generated; PRECLAV performs a descriptor selection by discarding those descriptors that are poorly correlated with the investigated activity.
During the last decade more than 400 chemically unique flavonoids (phenyl-benzopyrans) have been isolated from vascular plants, and many of them are used as tranquilizers in folk medicine. Such compounds are important constituents of the human diet, being derived largely from fruits and vegetables, nuts, seeds, stems and flowers, and thus constitute one of the important classes of metabolites. Some compounds of the flavone family exhibit potent in vivo anxiolytic activity without unwanted side effects. As a result, several attempts have been made to generate synthetic flavone derivatives with higher affinities for the GABA (A) receptor [5][6][7][8][9][10]. Subsequently, attempts were also made to establish a 2D-QSAR model for inhibition of the GABA (A) receptor that could serve as a guide for the rational design of further potent and selective inhibitors having the flavone backbone [11][12][13]. One such attempt was recently made by Duchowicz and co-workers [14][15][16]. They proposed the best linear model for a set of 70 flavones and found that it involves four correlating descriptors, with statistical quality given by R² = 0.7174, Se = 0.580, R²(LOO) = 0.6757, S(LOO) = 0.622. The PRECLAV method is efficient in detecting structure-activity relationships with good predictive power. This has prompted us to use the PRECLAV program for investigating GABA (A) receptor binding and to compare the findings with those obtained using NCSS software.

Database and Modeling
The database used as input by PRECLAV consists of 78 flavonoids, presented in Table 1 together with their log Ki (μM) values [14]. The chemical structures were generated with HyperChem [22], geometry optimization was performed with MOPAC [23], and the QSAR models were computed with PRECLAV [2,3]. The MOPAC 7 output files are used by the PRECLAV program [2,3] to compute PRECLAV descriptors for generating multiple linear regression models. Before generating the models, PRECLAV performs a descriptor selection, discarding those descriptors that are poorly correlated with the investigated activity. The following descriptors were generated in the present case:

Notations of the Structural Descriptors Generated by PRECLAV
2) OXX: presence of oxygen; maximum charge for an O atom (at parabolic region). PRECLAV descriptor.
3) NGS: ratio of negatively charged surface area to molecular surface area (at parabolic region). PRECLAV descriptor.
4) HBA: capability to form.
These descriptors were chosen on the basis of their quality (Q) and were used to generate the best MLR (multiple linear regression) model.
Finally, the leave-one-out (LOO) cross-validation procedure is applied to every MLR equation in order to estimate the predictive power of the proposed QSAR equations. The predictive ability of a QSAR equation is estimated with the LOO Pearson and rank (Kendall) correlation coefficients. The equation with the highest predictive power is considered to be the one with the highest value of the product of these two coefficients. This QSAR model can then be used to predict the activity of novel, not yet tested compounds (drugs).
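The LOO procedure described above can be sketched as follows. This is a minimal illustration, not PRECLAV's own implementation: the function names (`solve`, `fit_mlr`, `loo_predictions`, `pearson`, `kendall_tau`) and the toy data are assumptions introduced here, and the Kendall coefficient is computed in its simplest (tau-a) form.

```python
from itertools import combinations
from math import sqrt

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x

def fit_mlr(X, y):
    """Least-squares coefficients [intercept, b1, ...] via the normal equations."""
    A = [[1.0] + list(row) for row in X]
    p = len(A[0])
    ata = [[sum(r[i] * r[j] for r in A) for j in range(p)] for i in range(p)]
    aty = [sum(r[i] * yi for r, yi in zip(A, y)) for i in range(p)]
    return solve(ata, aty)

def loo_predictions(X, y):
    """Predict each compound from a model fitted on all the other compounds."""
    preds = []
    for i in range(len(y)):
        coef = fit_mlr(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        preds.append(coef[0] + sum(c * v for c, v in zip(coef[1:], X[i])))
    return preds

def pearson(a, b):
    """Pearson correlation coefficient."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    return cov / sqrt(sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b))

def kendall_tau(a, b):
    """Kendall rank correlation (tau-a): concordant minus discordant pairs."""
    s = 0
    for i, j in combinations(range(len(a)), 2):
        d = (a[i] - a[j]) * (b[i] - b[j])
        s += (d > 0) - (d < 0)
    return s / (len(a) * (len(a) - 1) / 2)

# Toy data (hypothetical, not from Table 1): activity = 1 + 2*x1 - x2.
X = [[0, 1], [1, 0], [2, 3], [3, 1], [4, 4], [5, 2]]
y = [1 + 2 * x1 - x2 for x1, x2 in X]

preds = loo_predictions(X, y)
r_loo = pearson(y, preds)
tau_loo = kendall_tau(y, preds)
score = r_loo * tau_loo  # product used to rank candidate equations
```

Candidate equations would each be scored in this way, and the one with the largest product retained.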

In the present study, for modeling log Ki of the 78 compounds, we initially used 400 PRECLAV and 1457 DRAGON descriptors. Of these, 89 near-constant descriptors were excluded, while 174 descriptors were significant. Outliers were removed from the calibration set one by one until the final 2D-QSAR model was obtained.
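A descriptor pre-selection of this kind can be sketched as below. The thresholds (`max_share`, `min_abs_r`), the function names and the toy descriptor values are assumptions made for illustration; the criteria PRECLAV actually applies are not stated in the text.

```python
from math import sqrt

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    return cov / sqrt(sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b))

def is_near_constant(values, max_share=0.9):
    """True if a single value accounts for more than max_share of the entries."""
    most_common = max(values.count(v) for v in set(values))
    return most_common / len(values) > max_share

def select_descriptors(descriptors, activity, min_abs_r=0.3):
    """Drop near-constant descriptors, then keep those with |r| >= min_abs_r."""
    kept = {}
    for name, values in descriptors.items():
        if is_near_constant(values):
            continue  # carries almost no information about the compounds
        r = pearson(values, activity)
        if abs(r) >= min_abs_r:
            kept[name] = r
    return kept

# Hypothetical activities and descriptors for ten compounds.
activity = [0.1, 0.5, 0.9, 1.4, 2.0, 2.2, 2.9, 3.1, 3.8, 4.0]
descriptors = {
    "d_const": [1.0] * 10,                          # near-constant: excluded
    "d_good": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],      # strongly correlated: kept
    "d_noise": [1, -1, 1, -1, 1, -1, 1, -1, 1, -1], # weakly correlated: excluded
}
kept = select_descriptors(descriptors, activity)
```

Filtering before regression keeps the number of candidate MLR equations manageable when the starting pool contains well over a thousand descriptors.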

Results and Discussion
After computing the structural descriptors for the 78 flavones (Table 1), variable selection analysis gave a model with nine correlating parameters (Table 2). These parameters are the same as those used in the aforementioned PRECLAV QSAR modeling. However, unlike PRECLAV, the NCSS program clearly demonstrates the successive arrival at the 9-parametric model (Table 2). Among the regression results, the best one-, two-, three-, four-, five-, six-, seven-, eight- and nine-parametric models were selected and are given in Table 3.
The details of this model are given below:

In these models, the squared correlation coefficient, R², is a measure of the fit of the model. F, the Fisher test value, reflects the ratio of the variance explained by the model to the variance due to the error in the model. Higher values of the F-test indicate greater significance of the model.
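As a small worked illustration of these two statistics, the sketch below computes R² from observed and fitted values and then F from R². It assumes an ordinary least-squares model with an intercept, n compounds and p descriptors; the function names and the numbers are hypothetical and are not taken from Tables 2 or 3.

```python
def r_squared(y, y_hat):
    """R^2 = 1 - SS_res / SS_tot for observed y and fitted y_hat."""
    mean_y = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot

def f_statistic(r2, n, p):
    """F = (R^2 / p) / ((1 - R^2) / (n - p - 1)): explained vs. error variance."""
    return (r2 / p) / ((1.0 - r2) / (n - p - 1))

# Toy values: four compounds, one descriptor.
y_obs = [1.0, 2.0, 3.0, 4.0]
y_fit = [1.1, 1.9, 3.2, 3.8]

r2 = r_squared(y_obs, y_fit)
f_value = f_statistic(r2, n=len(y_obs), p=1)
```

Because F grows with R² but shrinks as descriptors are added, it penalises models that improve the fit only by adding parameters.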
We observed that the quality and predictive power of the earlier model improved considerably after deletion of the outliers. Furthermore, the physical significance of the parameters involved is the same as before.
We have also used PRECLAV descriptors to obtain the best 2D-QSAR model using NCSS software, applying variable selection for multiple regression analysis to identify the best regression model. A perusal of Table 3 shows that, with NCSS software, statistically allowed models begin to appear with two- and higher-parametric modeling. The regression parameters and quality of these models are given below.

In this model, in addition to the two parameters OXX and HTm, the third parameter, NGS, has a positive coefficient. This means that, in addition to the presence of oxygen (maximum charge for an O atom, at parabolic region) and the H total index weighted by atomic masses, the ratio of negatively charged surface area to molecular surface area (at parabolic region) is also favourable for the exhibition of the activity.

We observe that in this model, in addition to the aforementioned three parameters, a fourth parameter, B08[C-O], has a positive coefficient. This means that, besides the presence of oxygen (maximum charge for an O atom, at parabolic region), the H total index weighted by atomic masses, and the ratio of negatively charged surface area to molecular surface area (at parabolic region), the presence/absence of C-O at topological distance 08 (2D binary fingerprint) also favours the exhibition of the activity.
Seven-variable model log Ki = -

In this model, in addition to the positive coefficients of the aforementioned four parameters, the fifth parameter, B05[O-Br], also has a positive coefficient. This means that, besides the presence of oxygen (maximum charge for an O atom, at parabolic region), the H total index weighted by atomic masses, the ratio of negatively charged surface area to molecular surface area (at parabolic region), and the presence/absence of C-O at topological distance 08 (2D binary fingerprint), the presence/absence of O-Br at topological distance 05 (2D binary fingerprint) also favours the exhibition of the activity.
Eight-variable model log Ki = -
Here, we observe that in addition to the positive coefficients of the above-mentioned five parameters, the sixth parameter, VLS, also has a positive coefficient. This means that, besides the presence of oxygen (maximum charge for an O atom, at parabolic region), the H total index weighted by atomic masses, the ratio of negatively charged surface area to molecular surface area (at parabolic region), the presence/absence of [C-O] at topological distance 08 (2D binary fingerprint), and the presence/absence of [O-Br] at topological distance 05 (2D binary fingerprint), the volume of the circumscribed sphere (at parabolic region) also favours the exhibition of the activity.
Nine-variable model log Ki = -

We observe that in this 9-parametric model the aforementioned six correlating parameters have positive coefficients. This means that their physical significance in this model is the same as in the 8-parametric model discussed above.
The aforementioned 9-variable model is, therefore, the most appropriate model and was subjected to Ridge regression [25] to investigate whether any co-linearity defect exists. The Ridge parameters, namely VIF (variance inflation factor), T (tolerance) and CN (condition number), were calculated and are presented in Table 4. We observed that the VIF values are much smaller than the allowed limit of 10. Also, the condition numbers for the correlating parameters are all much lower than 100, and the tolerances are <1. These observations therefore suggest that no co-linearity defect is present in the proposed model.
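The VIF and tolerance diagnostics can be illustrated with a short sketch. It uses the standard definitions VIF_j = 1/(1 - R_j²) and T_j = 1/VIF_j, where R_j² comes from regressing descriptor j on the remaining descriptors; the function names and toy columns are assumptions, the condition number is not computed here, and this is not the NCSS routine itself.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x

def r2_of_regression(target, others):
    """R^2 from regressing `target` on the columns in `others` (with intercept)."""
    rows = [[1.0] + [col[i] for col in others] for i in range(len(target))]
    p = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    aty = [sum(r[i] * t for r, t in zip(rows, target)) for i in range(p)]
    coef = solve(ata, aty)
    fitted = [sum(c * v for c, v in zip(coef, row)) for row in rows]
    mean_t = sum(target) / len(target)
    ss_res = sum((t - f) ** 2 for t, f in zip(target, fitted))
    ss_tot = sum((t - mean_t) ** 2 for t in target)
    return 1.0 - ss_res / ss_tot

def vif_and_tolerance(columns):
    """Return a (VIF, tolerance) pair for every descriptor column."""
    out = []
    for j, col in enumerate(columns):
        r2 = r2_of_regression(col, columns[:j] + columns[j + 1:])
        vif = 1.0 / (1.0 - r2)
        out.append((vif, 1.0 / vif))
    return out

# Hypothetical descriptor columns (one list per descriptor).
uncorrelated = vif_and_tolerance([[1, 2, 3, 4], [1, -1, -1, 1]])
correlated = vif_and_tolerance([[1, 2, 3, 4], [1, 2, 3, 5]])
```

In the first toy set both VIF values are close to 1, while in the second they exceed 10, the limit quoted in the text for a co-linearity defect.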

Relative performance of PRECLAV and NCSS software
To investigate further the relative performance of PRECLAV and NCSS, we calculated (estimated) log Ki values with the 9-parametric models from both programs and compared them with the experimental log Ki values (Table 5). The comparison is shown in Figures 1 and 2, which indicate that the quality of the models obtained from PRECLAV and NCSS is more or less the same, although the estimated log Ki values are closer to the experimental values in the case of PRECLAV. From the present study we cannot say definitively which software is superior; both have their own merits and demerits. However, PRECLAV has more good points than NCSS, and overall it yields better statistics. The performance of the two programs is compared in the table "Comparison of results obtained using PRECLAV and NCSS software".
It is worth mentioning that one of the important features of PRECLAV is the analysis of virtual fragments. The software indicated that, for the set of 78 molecules analyzed here, 30 virtual fragments are present, of which 9 are significant. The most significant virtual fragments, identified by correlating the mass percent with the property values, are given in Table 6.
To confirm our findings, we compared the estimated activities (log Ki) with the experimental ones (Table 5); this is further demonstrated in Figures 1 and 2. We also obtained Ridge traces, shown in Figures 3 and 4. From Figures 1 and 2 as well as Table 5, we observe that the estimated activities are very close to the experimental activities. Similarly, Figures 3 and 4 indicate the absence of any co-linearity defect.

Conclusions
From the results and discussion above, we conclude that PRECLAV generates and proposes the overall best model directly, without the need to perform successive or stepwise regression; such regressions are needed in NCSS to obtain the best model. Furthermore, with PRECLAV there is no need to perform model validation separately. Finally, PRECLAV proposes virtual fragments that increase or decrease the biological activity. From the comparison made above, we conclude that, of the two programs, PRECLAV is the better choice for future 2D-QSAR studies.

No. | PRECLAV | NCSS
1) | Proposes the overall best model | Recommends obtaining the best model in succession, which needs to be confirmed by their means
2) | Predicts and removes the outliers one by one during the regression, so that the final model does not have any outlier | This is not possible
3) | Performs cross-validation | Not possible
4) | Selects the most significant descriptors by quality | Selection of descriptors is not based on quality
5) | Generates the most valuable descriptor set |
8) | Makes estimated and observed values in calibration set | Yes, it is also possible in NCSS
9) | Analysis of virtual fragments is possible | Not possible
10) | Standard deviations of the coefficients are not measured | The standard error of the coefficients of the correlating parameters can be estimated
11) | Ridge statistics are not possible | Ridge analysis can be performed to investigate co-linearity defects
12) | Estimates the quality Q of the model | Not possible

Figure 1. Correlation between observed and estimated log Ki using the 9-parametric model from both PRECLAV and NCSS software.

Table 3. Quality of regression and predictive potential for one- to nine-variable models.
In the two-variable model, the coefficients of both parameters, OXX and HTm, are positive, meaning that the presence of oxygen (maximum charge for an O atom, at parabolic region) and the H total index weighted by atomic masses are favourable for the exhibition of the activity.
Table 6 demonstrates that a large mass percent of CO, C9H4O4, C6H5O7, C6H4O and C8H5O4 increases the log Ki values, while a large mass percent of C8H4O, NO2, Br and C6H4 decreases the log Ki values. These observations should therefore be taken into account when synthesizing new flavones with better log Ki values.