Estimation of Distribution Algorithm with Multivariate T-Copulas for Multi-Objective Optimization

Estimation of distribution algorithms are a class of evolutionary optimization algorithms based on probability distribution model. In this article, a Pareto-based multi-objective estimation of distribution algorithm with multivariate Tcopulas is proposed. The algorithm employs Pareto-based approach and multivariate T-copulas to construct probability distribution model. To estimate joint distribution of the selected solutions, the correlation matrix of T-copula is firstly estimated by estimating Kendall’s tau and using the relationship of Kendall’s tau and correlation matrix. After the correlation matrix is estimated, the degree of freedom of T-copula is estimated by using the maximum likelihood method. Afterwards, the Monte Carte simulation is used to generate new individuals. An archive with maximum capacity is used to maintain the non-dominated solutions. The Pareto optimal solutions are selected from the archive on the basis of the diversity of the solutions, and the crowding-distance measure is used for the diversity measurement. The archive gets updated with the inclusion of the non-dominated solutions from the combined population and current archive, and the archive which exceeds the maximum capacity is cut using the diversity consideration. The proposed algorithm is applied to some well-known benchmark. The relative experimental results show that the algorithm has better performance and is effective.


Introduction
Optimization problems are widely encountered in various fields of science and technology.When an optimization problem involves only one objective function, the task of obtaining the optimal solution is called single objective optimization.When an optimization problem involves more than one objective function, the task of finding one or more optimum solutions is called multi-objective optimization.Many real-world optimization problems involve multiple objectives.The fundamental difference between single-objective and multi-objective optimization lies in the cardinality of the optimal set.In problems with two or more conflicting objectives, there is no single optimum solution.There exist a number of solutions which are optimal.Such solutions are called Pareto-optimal solutions.The solution to a multi-objective optimization problem consists of a solution set with multiple solutions that may produce tradeoffs between different objectives.These solutions are called non-dominated solutions and the corresponding solution set is called the Pareto-front [1].None of the solutions in the Pareto-front is best with respect to all objectives.In addition, no solution in the Pareto-front is better than any other solution in the front with respect to all objectives.Hence, without any additional problem-specific information about the priorities of various objectives, all the solutions in the Pareto-front are important.The main objective of multiobjective optimization is to find many such solutions which reflect the tradeoffs between the objectives.Multiobjective optimization algorithms, especially those based on evolutionary principles, have been widely used to solve problems with multiple objectives.In recent years, a considerable amount of interest has been shown in multi-objective evolutionary algorithm (MOEA) and a number of different MOEAs have been suggested, such as Strength Pareto Evolutionary Algorithm (SPEA2) [2], Non-dominated Sorting Genetic Algorithm (NSGA-II) [3], and Fast Pareto Genetic Algorithm (FastPGA) [4].
Estimation of distribution algorithms (EDAs) [5] are a class of evolutionary algorithms based on probability distribution model.They estimate a probability distribution from population of solutions, and sample it to generate the next population.It has been proven that EDA has some special characteristics of good convergence, high efficiency, concise concept and been successfully extended to multi-objective optimization problems.The performance of an EDA highly depends on how well it estimates and samples the probability distribution.A wide variety of EDAs using different techniques to estimate and sample the probability distribution have been proposed and are the subject of active research.Many EDAs use probabilistic graphical modeling techniques [6,7] for this purpose.However, EDA using probabilistic graphical modeling techniques generally spend too much time on the learning about the probability distribution of the promising individuals.Copula [8,9] is a recently developed mathematical theory and a power tool for multivariate probability analysis.According to copula theory, a joint probability distribution can be decomposed into n marginal probability distributions and a copula function.So, the joint probability distribution of multivariate can be constructed utilizing a copula function and the marginal probability distributions of every variable.Copulas have be applied to constitute the probabilistic model in the conventional EDAs [10,11].It is simpler and easier to model the probability distribution compared with other methods in EDAs.
In this paper, EDA is extended to multi-objective optimization problems by using a Pareto-based approach and T-copulas.The extended multi-objective EDA employs multivariate T-copulas to construct probability distribution model.T-copula parameters are firstly estimated, thus, joint distribution is estimated by estimating Kendall's tau and using the relationship of Kendall's tau and correlation matrix.After the correlation matrix is estimated, the degree of freedom of T-copula is estimated by using the maximum likelihood method.Afterwards, the Monte Carte simulation is used to generate new individuals.An archive with maximum capacity is used to maintain the non-dominated solutions.The Pareto optimal solutions are selected from the archive on the basis of the diversity of the solutions, and the crowding-distance measure is used for the diversity measurement.The archive gets updated with the inclusion of the non-dominated solutions from the combined population and current archive, and the archive which exceeds the maximum capacity is cut using the diversity consideration.The proposed algorithm is applied to some well-known benchmark.The relative experimental results show that the algorithm has better performance and is effective.

Multi-objective Optimization Problems
The general multi-objective optimization problem can be defined as follows: where is an n-dimensional deci-sion variable vector and X is the decision variable space.The constrains given by ( 1) and (2) define the feasible region Ω and any point in Ω defines a feasible solution.
is referred to as objective space.The k components of the vector are the criteria to be considered.The constrains represent the restrictions imposed on the decision variables.
When there are several objective functions, the concept of optimum changes, because in multi-objective optimization problems the purpose is to find "trade-off solutions rather than a single solution.The concept of optimum commonly adopted in multi-objective optimization is Pareto optimality.Pareto optimality is defined as: This definition says that is Pareto optimal if there exists no feasible vector which would decrease some criteria without causing a simultaneous increase in at least one other criterion.Other important definitions associated with Pareto optimality are Pareto dominance.

 
, , , , if and only if x is partially less than y, i.e., x y  and, at least for one i, i i x y . For a given multi-objective problem , the Pareto optimal set   For a given multi-objective problem and Pareto optimal set  the Pareto front is defined as: The set of all Pareto optimal solutions in the feasible region Ω is called Pareto optimal set and the corresponding set of objective vector is called Pareto optimal front.The illustrative example of a multi-objective minimization problem with two objectives, f 1 and f 2 , that are plotted in the objective space M mapped from the feasible region Ω is shown in Figure 1.The bold curve in the feasible region Ω indicates the Pareto optimal set  .The bold curve in the objective space M indicates the Pareto front  .PF

Multivariate T-Copulas
Estimation of distribution algorithms differ from traditional evolutionary algorithms.Instead of applying genetic operators like mutation and crossover to the parents, estimation of distribution algorithms estimate a probability distribution over the search space based on how the parent individuals are distributed in the search space, and then sample the offspring individuals from this distribu- tion.The major issues in estimation of distribution algorithms are how to build a probability distribution model and how to sample the new individuals according to the probability distribution model.The theory of copulas is known to provide a useful tool for modelling dependence in many applications.Copulas have attracted significant attention in the recent literature for modeling probability distribution of multivariate observations.An important feature of copulas is that they enable us to specify the univariate marginal distributions and their joint behavior separately.Archimedean Copula and Gaussian Copula have be applied to constitute the probabilistic model in EDAs [10,11].
The copula theory is briefly introduced in the following, the details can be found in [8,9].
An n-dimensional copula is a multivariate C.D.F., C, with uniformly distributed margins on [0, 1] (U(0,1)) and the following properties: 1 2) C is grounded and n-increasing; 3) C has margins C i which satisfy for all The following theorem is known as Sklar's Theorem.It is the most important theorem regarding to copula functions since it is used in many practical applications.
Theorem: Let F be an n-dimensional C.D.F. with continuous margins . Then F has the following unique copula representation (canonical decomposition): The theorem of Sklar is very important, because it provides a way to analyse the dependence structure of multivariate distributions without studying marginals distributions.From Sklar's theorem we see that, for continuous multivariate distribution functions, the univariate margins and the multivariate dependence structure can be separated.The dependence structure can be represented by an adequate copula function.Moreover, the following corollary is attained from (3).
Corollary: Let F be an n-dimensional C.D.F. with continuous margins F F  and copula C (satisfying (3)).Then, for any , , , ,  where i F is the generalized inverse of F i .The T-copula can be thought of as representing the dependence structure implicit in a multivariate T-distribution.For a symmetric and positive definite matrix   , 1,2, , , 1,2, , denote the standardized multivariate Student's T-distribution with correlation matrix R and degrees of freedom: The multivariate T-copula (MTC) is defined as: is the inverse of the univariate Student's t cumulative distribution function with v degrees of freedom.
The corresponding density is  ω (6) with .
A number of recent papers have shown that the empirical fit of the T-copula is generally superior to that of the so-called Gaussian copula, the dependence structure of the multivariate normal distribution.One reason for this is the ability of the T-copula to capture better the phenomenon of dependent extreme values.In the paper, we use T-copulas to model probability distribution in multi-objective estimation of distribution algorithm.

Multi-Objective EDA with T-Copulas
EDAs are undoubtedly a powerful search engine for solv-ing single objective optimization problems.However, the original scheme has to be modified for solving multiobjective optimization problems.As we saw in Section II, the solution set of a problem with multiple objectives does not consist of a single solution.Instead, in multiobjective optimization, we aim to find Pareto optimal set.Besides using multivariate T-copulas for constructing probability distribution model in the multi-objective EDA, we have used an archive with maximum capacity to maintain the non-dominated solutions and the crowdingdistance measure for the diversity measurement.The Pareto optimal solutions are selected from the archive on the basis of the diversity of the solutions.The archive gets updated with the inclusion of the non-dominated solutions from the combined population and current archive, and the archive which exceeds the maximum capacity is cut using the diversity consideration.The algorithm steps of multi-objective EDA with T-copulas for solving multiobjective optimization problems are as follows: The function initialization() is used to initialize a population S(0) with size N.The initial population S(0) is chosen randomly and should uniformly cover the entire solution space based on the consideration of the requirement of population diversity.The function Evaluate() is used to evaluate each individual in population.The archive A(0) has been initialized to contain the non-dominated solutions from S(0).The function Non_Domina() returns the non-dominated solutions from a population.
The function Estimate_Marginals_Distribution() is used to estimate the marginals distribution from population.The function Estimate_Correlation_Matrix() is used to estimate the correlation matrix R in T-copulas.
A simple method [9] based on Kendall's tau is applied for estimating the correlation matrix R. The method consists of constructing an empirical estimate of Kendall's tau for each bivariate margin of the copula and then using relationship ( 7) To infer an estimate of the relevant element of R.More specifically we estimate X X by calculating the standard sample c coefficient From the original data vectors 1 2 , and write the jth component of the ith vector as , i j X ; this yields an unbiased and consistent.An estimator of , . In order to obtain an estimator of the entire matrix we can collect all pairwise estimates and then construct the estimator Then, The function Estimate_Degrees_Freedom() is used to estimate the degrees of freedom v by maximum likelihood method.
The function Sample() is used to generate offspring population T R with correlation matrix and v degrees of freedom can be generated as follows [9].R R T 1) Find the Cholesky decomposition of , so that A  A R, with A lower triangular; 2) Generate a sample of n independent random variables from N(0, 1); 3) Generate an independent random variables s with Z from  distribution; , , , , , , , , , .
The function Selection() is used to evaluate each individual in the current population and the offspring population  

S t
 and carry out the non-dominated sorting and ranking selection, developed by Deb et al. in [3]  to select non-dominated individuals to constitute the next generation population   1 S t  .

A
The proposed algorithm maintains an archive t with maximum capacity M. At each iteration, the archive gets updated with the inclusion of the non-dominated solutions from the next generation population  and the archive 0,1 . If the size of the archive exceeds the maximum capacity M, it is cut using the diversity consideration.The function Cut_Archive() is used to cut the archive.

Simulation Results
In order to test efficiency of the Pareto-based multi-objective estimation of distribution algorithm with T-copu- las, the performance of the proposed algorithm is compared with that of some multi-objective optimization algorithms.The following some well-known benchmark functions [12] have been used to test.
1) ZDT1: 4) ZDT4: The two common metrics used to compare are convergence metric γ and divergence metric Δ.For these metrics we need to know the true Pareto front for a problem.In our experiments we use 1000 uniformly spaced Pareto optimal solutions as the approximation of the true Pareto front.A brief introduction of these metrics is given here: Convergence metric γ was proposed by Deb et al. [3], measures the distance between the obtained non-dominated front NF and optimal Pareto front PF.Mathematically, it may be defined as: where N is the number of non-dominated solutions found by the algorithm being analyzed and i is the minimum Euclidean distance (measured in the objective space) between the ith solution of NF and the solutions in PF.
Diversity metric Δ was proposed by Deb et al. [3].It measures the extent of spread achieved among the obtained solutions and is defined as: where s is the number of members in the set of nondominated solution found so far, and the parameter f d d and i are the Euclidean distances between the extreme solutions and the boundary solutions of the obtained non-dominated set.The parameter d is the average of all distances , assuming that there are s solutions on the best non-dominated front.The smaller the value of these metrics is, the better the performance of the algorithm is.The initial population was generated from a uniform distribution in the ranges specified below.Population size N = 80.All experiments were repeated for 30 runs.The maximum number of iterations is set to 1000 in each running.
Table 1 listed the mean and standard deviation(sd) values of convergence metric γ and diversity metric Δ obtained using SPEA2 [2], NSGA-II [3], FastPGA [4] and the proposed algorithm(EDATCMO) on the multiobjective benchmark functions ZDT1, ZDT2, ZDT3, ZDT4 and ZDT6.It can be known from   than those obtained by using SPEA2, NSGA-II, FastPGA.It indicates that EDATCMO outperforms SPEA2, NS-GA-II and FastPGA algorithms on the benchmark functions in both aspects of convergence and distribution of solutions.

Conclusion
We proposed a multi-objective estimation of distribution algorithm using T-copulas and Pareto-based approach.The proposed algorithm employs the multivariate T- copulas to construct probability distribution model, and the new individuals are generated according to the probability distribution model.An archive is used to maintain the non-dominated solutions.The Pareto optimal solutions are selected from the archive.The archive gets updated with the inclusion of the non-dominated solutions from the combined population and current archive.The algorithm is applied to some well-known benchmarks.The results show that the algorithm has better performance than SPEA2 [2], NSGA-II [3], FastPGA [4] on the multi-objective benchmark functions ZDT1, ZDT2, ZDT3, ZDT4 and ZDT6.

Table 1 . Mean and sd of the convergence and diversity met- rics for Benchmark functions.
, the mean values of convergence metric γ obtained by using EDATCMO is smaller than those obtained by using SPEA2, NSGA-II, FastPGA.And the mean values of diversity metric Δ obtained by using EDATCMO is smaller Y. GAO ET AL. 68