Quantitative Structure Anti-Cancer Activity Relationship (QSAR) of a Series of Ruthenium Azopyridine Complexes by the Density Functional Theory (DFT) Method

A series of ruthenium azopyridine complexes has recently been investigated for its potential cytotoxic activity against renal cancer (A498), lung cancer (H226), ovarian cancer (IGROV), breast cancer (MCF-7) and colon cancer (WIDR). In order to predict the cytotoxic potential of these compounds, quantitative structure-activity relationship (QSAR) studies were carried out using the methods of quantum chemistry. Five QSAR models were obtained from the calculated quantum descriptors and the different activities. Across the five models, the statistical indicators lie in the following ranges: regression correlation coefficient R² = 0.905 to 0.986, standard deviation S = 0.153 to 0.516, Fisher test F = 14.220 to 106.718, cross-validation correlation coefficient Q²cv = 0.895 to 0.985, and R² − Q²cv = 0.001 to 0.010. The statistical characteristics of the established QSAR models satisfy the acceptance and external validation criteria, which supports their good performance. The models show that the variation of the free enthalpy of reaction ΔG°, the dipole moment μ and the charge of the ligand in the complex Q_L are the explanatory and predictive quantum descriptors correlated with the anti-cancer activity of the studied complexes. Moreover, the charge of the ligand is the priority descriptor for predicting the cytotoxicity of the compounds studied. The QSAR models developed are statistically significant and predictive, and could be used for the design and synthesis of new anti-cancer molecules.


Introduction
The success of cisplatin in the treatment of tumours stimulated research on transition metal complexes. Many organometallic compounds have been synthesized and tested [1] [2], and the results were promising for different metals. In particular, ruthenium appears to be an alternative to platinum, since its compounds are found to be more active and less toxic than those of platinum [3]. Interest in the anticancer activity of ruthenium complexes has therefore increased. The biological activities of ruthenium complexes are strongly influenced by the ligand structure. Indeed, several studies have shown that changing the aromatic ligand or the conformation of the complexes can have a significant influence on their anticancer activities [4]. The azopyridine ligands shown in Figure 1 and Table 1 are formed by the combination of a pyridine ring with an azo group. These bidentate ligands can bind to the Ru ion, supplied by the RuCl3·3H2O reagent, through the lone electron pairs of the nitrogen atoms of the pyridine ring and of the azo group, thereby forming a stable five-membered chelate ring. This chelation gives the metal complex excellent stability. In addition, the complexation of ruthenium (Reaction 1) by the asymmetric bidentate ligands leads to five isomers named α-Cl, β-Cl, γ-Cl, δ-Cl and ε-Cl [5]. They differ mainly in the positions of the two chlorine atoms (cis or trans configuration) and of the two azopyridine ligands, as shown in Figure 2.

The recent discovery of the anticancer activity of ruthenium azopyridine complexes [...] (IGROV), breast cancer (MCF-7) and colon cancer (WIDR) [8]. These molecules have shown promising anticancer activity. Two modes of binding of the complex in the cell lines have so far been proposed to characterize the process. In the first, both chloride ligands are hydrolysed, allowing covalent bonding between the ruthenium ion and the DNA. In the second, the bonding takes place between the azopyridine ligand and the DNA base pairs of the cell lines through a π-π stacking interaction [9]. Improving the cytotoxicity of the azopyridine complexes therefore requires knowledge of the physicochemical properties that govern it. This would help to efficiently orient the synthesis of ruthenium azopyridine complexes.
The Quantitative Structure Activity Relationship (QSAR) study is one of the most widely used methods for designing new therapeutic agents [10] [11] [12]. It uses a mathematical model to correlate quantitatively the structure or properties of compounds with their biological activities. It is increasingly used to reduce the number of experiments, which are sometimes long, expensive and harmful to the environment [13] [14]. The objective of this work is to carry out a descriptive and predictive study of the anticancer activity of a series of nine (9) isomers of ruthenium complexes. Using the methods of quantum chemistry, this work aims to model the observed anticancer activities. The molecular descriptors were calculated solely from the molecular structures of the compounds, with a view to predicting the anticancer activities of analogous molecules.

Material and Method

Calculation Level
DFT methods are generally known to generate a variety of molecular properties [18] [19] that, in QSAR studies, increase predictability and reduce the computation time and cost of designing new drugs [20] [21]. All calculations were therefore performed at the DFT level using Becke's three-parameter hybrid functional B3LYP [22] with the double-zeta LanL2DZ pseudopotential basis set [23]. The geometries of all molecules were first optimized in order to obtain the ground-state structures. Each stable configuration was then confirmed by a frequency analysis, which must reveal no imaginary frequency. Furthermore, the natural population analysis (NPA) was carried out at the same level of theory. All these calculations were carried out with the Gaussian 03 software [24]. The modelling was done by multiple linear regression as implemented in the Excel [25] and XLSTAT [26] spreadsheets.
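As an illustration of the multiple linear regression step, the following minimal sketch fits a linear model by ordinary least squares with NumPy. It is a stand-in for the Excel and XLSTAT workflow, not a reproduction of it: the function names `fit_mlr` and `predict_mlr` are hypothetical, and the demonstration data are synthetic placeholders rather than values from Table 2.

```python
import numpy as np


def fit_mlr(X, y):
    """Ordinary least-squares fit of y = b0 + b1*x1 + ... + bk*xk.

    X has one row per molecule and one column per descriptor.
    Returns the coefficient vector [b0, b1, ..., bk].
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Prepend a column of ones so that the first coefficient is the intercept.
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


def predict_mlr(coef, X):
    """Evaluate the fitted linear model on new descriptor rows."""
    X = np.asarray(X, dtype=float)
    return coef[0] + X @ coef[1:]


# Synthetic placeholder data (NOT the paper's descriptors or activities):
# six "molecules", two "descriptors", and an exactly linear response.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(6, 2))
y_demo = 1.5 - 2.0 * X_demo[:, 0] + 0.5 * X_demo[:, 1]
coef_demo = fit_mlr(X_demo, y_demo)
```

With a training set of six molecules, a model with k descriptors leaves only 6 − k − 1 degrees of freedom, which is why the number of descriptors per model has to stay small.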

Quantum Descriptors Used
For the development of the QSAR models, some theoretical descriptors related to conceptual DFT were determined, namely the variation of the free enthalpy of formation ΔG°, the natural charge of the ligand in the complex Q_L and the dipole moment μ. These descriptors are all determined from the optimized molecules. The variation of the free enthalpy of reaction indicates the spontaneity of the reaction. It was calculated according to Equation (2).
The ligand charge, which corresponds to the sum of the natural atomic charges of the ligand within the complex as obtained from the NPA calculation, reflects the electrophilic or nucleophilic character of this entity. The dipole moment (μ) indicates the stability of a molecule in water. Thus, a high dipole moment will result in poor solubility in organic solvents and high solubility in water. Moreover, the interdependence of the descriptors is evaluated by the linear correlation coefficient R between each pair of descriptors. Two descriptors are said to be independent when R < 0.95 [27] [28].
The cross-validation correlation coefficient Q²cv gives information on the predictive power of the model. This predictive power is called "internal" because it is calculated from the structures used to build the model.

Estimation of the Predictive Ability of a QSAR Model
The squared correlation coefficient R² measures the dispersion of the theoretical values around the experimental data. The quality of the modelling improves as the points come closer to the fitting line [31], and the fit of the points to this line is evaluated by this coefficient. R² is given by Equation (3), where y_i,exp is the experimental value of the anticancer activity. The closer R² is to 1, the better the theoretical and experimental values correlate.
In addition, the variance σ² was determined by relationship (4), where k is the number of independent descriptors and n is the number of molecules of the training set.
The Fisher test F was also used to measure the level of statistical significance of the model, i.e. the quality of the choice of descriptors constituting the model. It is defined by Equation (6).
The cross-validation correlation coefficient Q²cv was calculated to assess the accuracy of the prediction on the training set. According to Eriksson et al. [32], a model is satisfactory when Q²cv > 0.5 and excellent when Q²cv > 0.9. According to them, for a given training set, a model performs well if the acceptance criterion on R² − Q²cv is satisfied.
Moreover, the predictive power of a model can be assessed from the five criteria of Tropsha [12] [33] [34]. If at least three of these criteria are satisfied, the model is considered efficient in predicting the activity studied.
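The indicators discussed above can be computed along the following lines. This is a sketch under stated assumptions, not the authors' procedure: R², S and F use the conventional least-squares definitions with n − k − 1 degrees of freedom, and the cross-validated coefficient is computed by leave-one-out, which is one common choice and is assumed here because the exact expression for Q²cv is not reproduced in the text. The function names and the demonstration data are hypothetical.

```python
import numpy as np


def _with_intercept(X):
    """Return the design matrix with a leading column of ones."""
    X = np.asarray(X, dtype=float)
    return np.column_stack([np.ones(X.shape[0]), X])


def fit_statistics(X, y):
    """R2, standard deviation S and Fisher F of a linear fit with intercept."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, k = X.shape                      # n molecules, k descriptors
    design = _with_intercept(X)
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coef
    ss_res = float(np.sum(residuals ** 2))        # residual sum of squares
    ss_tot = float(np.sum((y - y.mean()) ** 2))   # total sum of squares
    dof = n - k - 1                               # degrees of freedom
    return {
        "R2": 1.0 - ss_res / ss_tot,
        "S": float(np.sqrt(ss_res / dof)),
        "F": ((ss_tot - ss_res) / k) / (ss_res / dof),
    }


def q2_leave_one_out(X, y):
    """Leave-one-out Q2 (an assumed definition): 1 - PRESS / SS_tot."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    design = _with_intercept(X)
    n = len(y)
    press = 0.0
    for i in range(n):
        keep = np.arange(n) != i        # refit without molecule i
        coef, *_ = np.linalg.lstsq(design[keep], y[keep], rcond=None)
        press += float((y[i] - design[i] @ coef) ** 2)
    return 1.0 - press / float(np.sum((y - y.mean()) ** 2))


# Synthetic placeholder data (NOT the paper's values): one descriptor,
# six points scattered around the line y = 2x + 1.
X_demo = np.array([[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]])
y_demo = np.array([1.1, 2.9, 5.2, 6.8, 9.1, 10.9])
stats_demo = fit_statistics(X_demo, y_demo)
q2_demo = q2_leave_one_out(X_demo, y_demo)
```

For a least-squares model the leave-one-out Q² never exceeds R², so the difference R² − Q²cv is non-negative and a small value indicates that the fit does not depend strongly on any single molecule.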

Results and Discussion
In this QSAR study, the training set of six molecules and the validation set formed by the three other ruthenium azopyridine complexes are presented in Table 2. The values of the bivariate linear correlation coefficients R between the descriptors are presented in Table 3.
The calculated linear correlation coefficients R between the descriptors are all less than 0.95 (R < 0.95). This demonstrates the independence of the descriptors used to develop the models.
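The independence check on the descriptors can be sketched as a pairwise correlation screen. The function name `dependent_pairs` and the demonstration columns are hypothetical, and comparing the absolute value of R with the 0.95 threshold is an assumption of this sketch (the text states the criterion simply as R < 0.95).

```python
import numpy as np


def dependent_pairs(descriptors, names, threshold=0.95):
    """List descriptor pairs whose |R| reaches the threshold.

    descriptors: one row per molecule, one column per descriptor.
    Returns (name_a, name_b, R) tuples for the flagged pairs.
    """
    data = np.asarray(descriptors, dtype=float)
    # rowvar=False: each column is treated as one variable (descriptor).
    corr = np.corrcoef(data, rowvar=False)
    flagged = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if abs(corr[i, j]) >= threshold:
                flagged.append((names[i], names[j], float(corr[i, j])))
    return flagged


# Synthetic placeholder columns (NOT the paper's descriptors):
# "b" is an exact linear function of "a", "c" is only weakly related.
a = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
b = 2.0 * a + 1.0
c = np.array([2.0, -1.0, 3.0, 0.0, 1.0, -2.0])
flagged_demo = dependent_pairs(np.column_stack([a, b, c]), ["a", "b", "c"])
```

An empty result for the real descriptor table would correspond to the situation reported in Table 3, where no pair reaches the threshold.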

QSAR Model and Contribution of Descriptors
The best QSAR models obtained for the various anti-cancer activities are given in Table 4. It should be noted that the sign of a descriptor's coefficient in the regression equation reflects the direction in which the biological activity varies with that descriptor. A negative sign indicates that the biological activity decreases when the value of the descriptor is high, while a positive sign indicates the opposite effect.
The negative sign of the coefficient of the free enthalpy variation or of the dipole moment indicates that the cytotoxic activity will be improved for a low value of the free enthalpy variation or of the dipole moment. Conversely, the positive sign of the coefficient of the ligand charge means that the cytotoxic activity will be improved for a high value of the ligand charge. Also, the significance of the models is reflected by the Fisher coefficient F, which lies between 14.22 and 106.718, and by the cross-validation correlation coefficient Q²cv.

Verification of Tropsha Criteria
The Tropsha criteria for the different models are presented in Table 5.
All values satisfy the Tropsha criteria, so these models are acceptable for predicting the cytotoxic activity of the ruthenium azopyridine complexes.
The relative contribution of each descriptor to the prediction of the cytotoxicity of the compounds was studied for each type of cancer cell. The different contributions are shown in Figure 4.
The charge of the ligand makes a larger contribution than the free enthalpy variation or the dipole moment. The charge of the ligand is thus revealed to be the priority descriptor in the prediction of the cytotoxic activity of the ruthenium azopyridine complexes studied.
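One common way to compare descriptor contributions is through standardized regression coefficients, sketched below. The text does not state how the contributions in Figure 4 were computed, so this is an assumed approach and not necessarily the one used; the function name `relative_contributions` and the demonstration data are hypothetical.

```python
import numpy as np


def relative_contributions(X, y):
    """Share of each descriptor, from standardized regression coefficients.

    Descriptors and activity are z-scored so that coefficients measured in
    different units (e.g. kcal, a.u., Debye) become comparable. The shares
    are the absolute standardized coefficients normalized to sum to 1.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    Xz = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    yz = (y - y.mean()) / y.std(ddof=1)
    # After centering, the intercept is zero and can be left out of the fit.
    beta, *_ = np.linalg.lstsq(Xz, yz, rcond=None)
    weights = np.abs(beta)
    return weights / weights.sum()


# Synthetic placeholder data (NOT the paper's values): two descriptors with
# equal spread, the first weighted 30 times more heavily than the second.
X_demo = np.array([[1.0, 2.0], [2.0, -1.0], [3.0, 3.0],
                   [4.0, 0.0], [5.0, 1.0], [6.0, -2.0]])
y_demo = 3.0 * X_demo[:, 0] + 0.1 * X_demo[:, 1]
contrib_demo = relative_contributions(X_demo, y_demo)
```

Standardizing first matters because raw coefficients depend on the units of each descriptor, so their magnitudes cannot be compared directly.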

Conclusion
In this work, the cytotoxic activities of six ruthenium azopyridine complexes against renal cancer (A498), lung cancer (H226), ovarian cancer (IGROV), breast cancer (MCF-7) and colon cancer (WIDR) cells were correlated with theoretical descriptors calculated by DFT methods. Three other ruthenium azopyridine complexes were selected to form the external validation set for the calculated models. Multiple Linear Regression (MLR) was used to quantify the relationships between the molecular descriptors and the cytotoxic activity of the azopyridine derivatives. A strong correlation was observed between the experimental and predicted values of the cytotoxic activity, indicating the validity and quality of the QSAR models obtained. The quantum descriptors of the optimized molecules, namely the free enthalpy variation of reaction, the dipole moment and the ligand charge, made it possible to predict the cytotoxicity of the ruthenium azopyridine complexes studied on cancer cells. Among all these descriptors, the ligand charge is the one that most influences the cytotoxic activity.

Figure 1. Skeleton of the azopyridine ligands with different substituents R and R1. The bidentate state of the ligand consists of the ligand binding to ruthenium, or the central metal, through N_py and N_2.

Figure 2. The five isomers of the RuCl2L2 complexes. L stands for any of the azopyridine ligands. The arc represents an azopyridine ligand, highlighting its bidentate state. All these isomers have C2 symmetry except the β-Cl isomer. The letters t and c denote trans and cis geometries respectively. The three-letter codes therefore give the geometry, in order, of the chlorides (Cl), the pyridine nitrogens (N_py) and the azo nitrogens (N_2).
The six molecules of the training set and the three molecules of the external validation set used in this study have IC50 values ranging from 0.045 to 74 μM. Here, IC50 is the experimentally determined concentration of a compound that inhibits 50% of the cancer cells in a population of cancer cells. This range of concentrations makes it possible to define a quantitative relationship between the anticancer activity and the theoretical descriptors. According to Aldrik et al., the cytotoxicity against the aforementioned human tumour cell lines was measured in vitro using the microculture sulforhodamine B (SRB) test for the estimation of cell viability [15]. Biological data are generally expressed as the negative base-10 logarithm of the activity, in order to obtain higher numerical values when the structures are biologically very efficient [16] [17]. The anticancer activity is thus expressed by the anticancer potential pIC50, defined in Equation (1).
A QSAR model is developed, and its quality judged, on the basis of several statistical indicators, including the correlation coefficient R², the standard deviation S, the cross-validation correlation coefficient Q²cv and the Fisher test F. R², S and F relate to the fit between the calculated and experimental values: they describe the predictive capacity within the limits of the model and allow the precision of the values calculated on the training set to be estimated [29] [30].
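The conversion from IC50 to the anticancer potential pIC50 can be sketched as follows. The sketch assumes the usual convention in which IC50, reported in μM, is first converted to mol/L before taking the negative base-10 logarithm; this unit convention is an assumption, since Equation (1) itself is not reproduced here, and the function name `pic50_from_micromolar` is hypothetical.

```python
import math


def pic50_from_micromolar(ic50_um):
    """Return pIC50 = -log10(IC50 in mol/L) for an IC50 given in micromolar.

    The micromolar-to-molar conversion (factor 1e-6) is an assumed
    convention, not a detail confirmed by the text.
    """
    if ic50_um <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_um * 1e-6)


# The two ends of the IC50 range quoted in the text (0.045 and 74 micromolar).
pic50_most_active = pic50_from_micromolar(0.045)
pic50_least_active = pic50_from_micromolar(74.0)
```

Under this convention a lower IC50 gives a higher pIC50, so the most active compound of the series has the largest value, which is the stated purpose of the logarithmic transformation.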

ŷ_i,theo: the theoretical value of the anticancer activity; ȳ_exp: the average of the experimental values of the anticancer activity.
The quantity n − k − 1 corresponds to the number of degrees of freedom, where n is the number of molecules of the training set and k the number of independent descriptors. The root mean square error s is another statistical indicator used. It allows the reliability and accuracy of a model to be evaluated, and is obtained from Equation (5).
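For reference, the conventional least-squares forms of these indicators, written with the symbols defined above, are given below. They are the standard textbook definitions and are assumed, not confirmed, to coincide with Equations (3) to (6) of this work.

```latex
\begin{align*}
R^2 &= 1 - \frac{\sum_{i=1}^{n} \left(y_{i,\mathrm{exp}} - \hat{y}_{i,\mathrm{theo}}\right)^2}
                {\sum_{i=1}^{n} \left(y_{i,\mathrm{exp}} - \bar{y}_{\mathrm{exp}}\right)^2} \\
\sigma^2 &= \frac{\sum_{i=1}^{n} \left(y_{i,\mathrm{exp}} - \hat{y}_{i,\mathrm{theo}}\right)^2}{n - k - 1} \\
s &= \sqrt{\sigma^2} \\
F &= \frac{\sum_{i=1}^{n} \left(\hat{y}_{i,\mathrm{theo}} - \bar{y}_{\mathrm{exp}}\right)^2 / k}
          {\sum_{i=1}^{n} \left(y_{i,\mathrm{exp}} - \hat{y}_{i,\mathrm{theo}}\right)^2 / (n - k - 1)}
\end{align*}
```

With n = 6 training molecules, a model with k descriptors has 5 − k degrees of freedom, so both s and F are sensitive to the number of descriptors retained.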

The cross-validation correlation coefficient Q²cv ranges from 0.895 to 0.985. These different models are acceptable because all the values of R² − Q²cv satisfy the acceptance criterion. External validation of these models was performed with the 1α, 1β and 1γ isomers. The regression lines between the experimental and theoretical cytotoxic activities of the training set (blue dots) and the validation set (red dots) for each cancer cell line are illustrated in Figure 3.

Figure 3. Regression lines of the different models.

Figure 4. Contribution of the different descriptors in the different models.

Table 2. Quantum descriptors and experimental anti-cancer activities of the training set and the validation set. ΔG°, Q_L and μ are expressed in kcal, a.u. and Debye respectively.

Table 3. Values of the bivariate linear correlation coefficients between the descriptors.

Table 4. The most significant QSAR models for the modelling of cytotoxic activities on A498, H226, IGROV, MCF-7 and WIDR cancer cells.

Table 4 (note). It should be emphasized that these models were established using the same training and validation sets, given in Table 2.

Table 5. Tropsha criteria for the different models.