Probabilistic Fuzzy Regression Approach from the Point of View Risk

Fuzzy regression analysis is an important regression method for predicting uncertain information in the real world. In this paper, the input data are crisp with randomness and the output data are trapezoidal fuzzy numbers; three different risk preferences and a chaos optimization algorithm are introduced to establish the fuzzy regression model. On the basis of the principle of minimum total spread between the observed and the estimated values, risk-neutral, risk-averse, and risk-seeking fuzzy regression models are developed to obtain the parameters of the fuzzy linear regression model. The chaos optimization algorithm is used to determine the numerical characteristics of the random variables. The mean absolute percentage error and the variance of errors are adopted to compare the modeling results. A stock rating case is used to evaluate the fuzzy regression models. Comparisons with five existing methods show that the proposed method performs satisfactorily.

squares regression approach to modelling relationships in quality function deployment. Zhang [8] proposed a fuzzy linear regression analysis model based on the centroid method. Li and Zeng [9] introduced fuzzy regression models based on least absolute deviation.
The previous models address fuzziness, but some data exhibit randomness as well as fuzziness. Only a few previous studies have examined both fuzziness and randomness in empirical research. In this paper, appropriate numerical characteristics are selected according to the actual meaning of the random variables, and the chaos optimization algorithm (COA) is introduced to determine these numerical characteristics. Three mathematical programming models are proposed, called risk-neutral, risk-averse, and risk-seeking. Based on these, the fuzzy regression coefficients of the different risk models can be obtained, and the models are determined.

Mathematical Preliminaries
is a trapezoidal fuzzy number. If the fuzzy number is symmetric, $l = r$. The fuzzy number is a triangular fuzzy number when $a = b$.
The set of trapezoid fuzzy numbers is denoted by T R  , and T
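The conditions above ($l = r$ for symmetry, $a = b$ for a triangular number) can be illustrated with a small sketch. The parameterization is an assumption, since the defining equation is not reproduced here: $[a, b]$ is taken as the core with membership 1, and $l$ and $r$ as the left and right spreads. The class name `Trapezoid` is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trapezoid:
    """Trapezoidal fuzzy number (a, b, l, r); this layout is an assumption."""
    a: float  # left end of the core (membership 1)
    b: float  # right end of the core
    l: float  # left spread
    r: float  # right spread

    def membership(self, x: float) -> float:
        if self.a <= x <= self.b:
            return 1.0
        if self.a - self.l < x < self.a:   # rising edge, only reachable when l > 0
            return (x - (self.a - self.l)) / self.l
        if self.b < x < self.b + self.r:   # falling edge, only reachable when r > 0
            return ((self.b + self.r) - x) / self.r
        return 0.0

    def is_symmetric(self) -> bool:
        return self.l == self.r

    def is_triangular(self) -> bool:
        return self.a == self.b


y = Trapezoid(a=2.0, b=3.0, l=1.0, r=1.0)
print(y.membership(1.5), y.membership(2.5), y.membership(3.5))  # 0.5 1.0 0.5
print(y.is_symmetric(), y.is_triangular())                      # True False
```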

Probabilistic Fuzzy Regression
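The fuzzy linear regression model can be stated as
$$\tilde{y}_i = \tilde{A}_0 + \tilde{A}_1 x_{i1} + \cdots + \tilde{A}_k x_{ik}, \quad i = 1, \ldots, n, \quad (1)$$
where $\tilde{y}_i$ is the fuzzy output value of the $i$th observation, $x_{ij}$ is the crisp value of the $j$th independent variable in the $i$th observation, $j = 1, \ldots, k$, $k$ is the number of independent variables, and $\tilde{A}_0, \tilde{A}_1, \ldots, \tilde{A}_k$ are the fuzzy regression coefficients.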
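To make the structure of a fuzzy linear model with crisp inputs concrete, the following sketch evaluates one fuzzy output from trapezoidal coefficients. It rests on assumptions: each fuzzy number is stored as a tuple `(a, b, l, r)` (core ends and left/right spreads), and the helper names `scale`, `add`, and `predict` are hypothetical. The coefficient values are illustrative only.

```python
from typing import Sequence, Tuple

Trap = Tuple[float, float, float, float]  # assumed layout: (a, b, l, r)


def scale(t: Trap, x: float) -> Trap:
    """Multiply a trapezoidal fuzzy number by a crisp value x."""
    a, b, l, r = t
    if x >= 0:
        return (a * x, b * x, l * x, r * x)
    # A negative factor mirrors the number, so core ends and spreads swap.
    return (b * x, a * x, r * -x, l * -x)


def add(s: Trap, t: Trap) -> Trap:
    """Add two trapezoidal fuzzy numbers component by component."""
    return (s[0] + t[0], s[1] + t[1], s[2] + t[2], s[3] + t[3])


def predict(coeffs: Sequence[Trap], x: Sequence[float]) -> Trap:
    """coeffs[0] is the intercept term; coeffs[j] multiplies x[j - 1]."""
    out = coeffs[0]
    for coeff, xj in zip(coeffs[1:], x):
        out = add(out, scale(coeff, xj))
    return out


coeffs = [(1.0, 2.0, 0.5, 0.5), (0.5, 0.5, 0.25, 0.5)]  # illustrative values
print(predict(coeffs, [2.0]))   # (2.0, 3.0, 1.0, 1.5)
print(predict(coeffs, [-2.0]))  # (0.0, 1.0, 1.5, 1.0)
```

With $k$ independent variables the list `coeffs` holds $k + 1$ fuzzy numbers, one per term of the model.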

Introduction of Chaos Optimization Algorithm
As the characteristics of a sample directly reflect the overall population, the sample's numerical characteristics can be used to evaluate the overall population.
The common numerical characteristics are expectation, variance, etc. The appropriate numerical characteristic is selected according to the actual needs, and its value is then determined using COA.
N. N. Gao, Q. J. Lu Journal of Data Analysis and Information Processing
In the chaos optimization algorithm proposed by Li [11], chaos is introduced into the design variables of the optimization problem using a carrier-like method, and the ergodic range of the chaotic motion is extended to the value range of the design variables. The search is then carried out with the chaotic variables. COA employs chaotic dynamics to solve optimization problems, and it has been applied successfully in various areas such as function optimization and supply chain optimization [12]. Compared with conventional optimization methods, COA converges faster and can find better solutions [13]. The algorithm also has an improved capacity to seek the global optimal solution of an optimization problem and can escape from a local minimum. The characteristic of randomness ensures the capability for a large-scale search. Ergodicity allows COA to traverse all possible states without repetition and overcome the limitations caused by ergodic searching in general random methods. COA uses the carrier wave method to linearly map the selected chaos variables onto the space of the optimization variables and then searches for the optimal solutions based on the ergodicity of the chaos variables.
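The two ingredients described above, a chaotic sequence and a linear carrier mapping onto the search range, can be sketched as follows. This is a minimal illustration under assumptions: the logistic map is used with control parameter `mu = 4.0` (a common choice for fully chaotic behavior, not confirmed by the text here), and the function names and the range `[0.5, 2.5]` are hypothetical.

```python
from typing import List


def logistic_sequence(c0: float, steps: int, mu: float = 4.0) -> List[float]:
    """Iterate c <- mu * c * (1 - c), starting from c0 in (0, 1)."""
    seq = []
    c = c0
    for _ in range(steps):
        c = mu * c * (1.0 - c)
        seq.append(c)
    return seq


def carrier_map(c: float, low: float, high: float) -> float:
    """Linearly map a chaos variable in [0, 1] onto the range [low, high]."""
    return low + (high - low) * c


# The start value should avoid 0, 0.25, 0.5, 0.75 and 1, which lead to fixed points.
chaos = logistic_sequence(c0=0.3, steps=5)
candidates = [carrier_map(c, low=0.5, high=2.5) for c in chaos]
print(chaos)
print(candidates)
```

Each element of `candidates` is one trial value of an optimization variable; a search would evaluate the fitness at each trial value and keep the best one.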

Determination of Numerical Characteristic
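The processes of applying COA in this study are described as follows.
The chaos variables are generated and then transformed into optimization variables using Equation (2) and Equation (3), respectively.
Third, the mean absolute percentage error (MAPE) is defined as the average of the percentage errors; it is scale-independent and is a popular measure for evaluating accuracy [8]. Thus, MAPE was adopted in this study as the fitness function in COA. It is defined as follows:
$$\mathrm{MAPE} = \frac{1}{n} \sum_{i=1}^{n} \left| \frac{\hat{y}_i - y_i}{y_i} \right| \times 100\% \quad (4)$$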
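As a brief illustration of the fitness function, the sketch below computes MAPE as the average of the absolute percentage errors. It assumes that each observed and predicted output has already been reduced to a crisp, nonzero representative value; the sample numbers are made up for the example.

```python
from typing import Sequence


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(p - a) / abs(a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Percentage errors are 10%, 10% and 0%, so the mean is about 6.67%.
print(round(mape([100.0, 200.0, 400.0], [110.0, 180.0, 400.0]), 2))  # 6.67
```

Because every error is divided by the actual value, the measure does not depend on the scale of the output.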
The process is as follows:

Mathematical-Programming Model
Considering the random variable, the model in Equation (1) can be rewritten with each $x_{ij}$ replaced by $x'_{ij}$, where $x'_{ij}$ is a certain numerical characteristic of $x_{ij}$, $i = 1, \ldots, n$; $j = 1, \ldots, k$.

Risk in Model
In this section, three different risk preferences are introduced, called risk-neutral, risk-averse, and risk-seeking, to determine the fuzzy regression model.

Degree of Fitness from the Point of View Risk
Dubois and Prade [10] proposed the following equality indices to compare two fuzzy numbers.
where Pos and Nes are short for Possibility and Necessity. Observe that index Equation (11) , the risk-neutral, risk-averse, and risk-seeking degrees of fitness of the estimated model, denoted by , , From the properties of the two indices Equation (11) and Equation (12)
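For orientation, a possibility-type comparison of two fuzzy numbers can be approximated numerically. The sketch below uses the standard possibility of equality, the supremum over $x$ of the minimum of the two membership functions; whether this matches Equation (11) term for term is an assumption, and the necessity counterpart is not shown. The tuple layout `(a, b, l, r)` and the function names are hypothetical.

```python
from typing import Tuple

Trap = Tuple[float, float, float, float]  # assumed layout: (a, b, l, r)


def membership(x: float, t: Trap) -> float:
    """Membership of x in a trapezoidal fuzzy number with core [a, b]."""
    a, b, l, r = t
    if a <= x <= b:
        return 1.0
    if a - l < x < a:
        return (x - (a - l)) / l
    if b < x < b + r:
        return (b + r - x) / r
    return 0.0


def possibility_of_equality(s: Trap, t: Trap, points: int = 2001) -> float:
    """Grid approximation of sup_x min(mu_s(x), mu_t(x))."""
    lo = min(s[0] - s[2], t[0] - t[2])
    hi = max(s[1] + s[3], t[1] + t[3])
    best = 0.0
    for i in range(points):
        x = lo + (hi - lo) * i / (points - 1)
        best = max(best, min(membership(x, s), membership(x, t)))
    return best


observed = (2.0, 3.0, 1.0, 1.0)
estimated = (4.0, 5.0, 1.0, 1.0)
print(possibility_of_equality(observed, observed))   # identical numbers: 1.0
print(possibility_of_equality(observed, estimated))  # edges cross at height 0.5
```

The grid makes the result approximate in general; a finer grid or a closed-form intersection of the edges would tighten it.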

Parameter Estimation of the Model
The objective function of the model minimizes the difference between the total spread of the observed and the estimated values; it is denoted by $J$. Considering the above assumption, the problem is to obtain the fuzzy parameters of the risk-neutral, risk-averse, and risk-seeking models that minimize $J$. Consequently, the mathematical models lead to the following three quadratic programming problems:

Algorithm of Model
The algorithm of the proposed model is summarized below.
Step 1: The parameters are initialized, including the number of iterations, initial value of numerical characteristics, initialized chaos variables, and ranges of parameters.
Step 2: The structure of the proposed model is generated using Equation (8).
The number of terms in the model is $k + 1$, where $k$ is the number of independent variables.
Step 3: The iteration begins from $m = 1$. The chaos variables $c_m$ are generated based on the logistic model in Equation (2) and transformed into optimization variables $q_m$ using Equation (3).
Step 4: The interval of each random variable is defined based on the experimental data, and the corresponding numerical characteristic is selected. The numerical characteristic values of the random variables are then generated based on the values of $q_m$. The random variables are replaced by their corresponding numerical characteristics, and the probabilistic terms of the proposed models are generated.
Step 5: The fuzzy coefficient of each term of the proposed model is determined by solving the quadratic programming problems shown in Equations (14) to (19).
Step 6: The predicted output $\hat{y}_i$ is calculated with the developed models. The MAPE between the estimated value $\hat{y}_i$ and the actual value $y_i$ over all data sets is then obtained using Equation (4) and serves as the fitness value of iteration $m$.
There are many factors that affect stock rating. However, based on the three principles of comprehensiveness, comparability, and feasibility, we choose the following indicators to form an evaluation index system: profitability ($x_1$), operational capacity ($x_2$), short-term debt-paying ability ($x_3$), and volatility of the stock ($x_4$).
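Steps 1 to 6 of the algorithm can be sketched as a search loop. This is a simplified illustration under loudly stated assumptions, not the authors' implementation: it uses a single random input per observation, crisp outputs, and ordinary least squares as a stand-in for the quadratic programming problems in Equations (14) to (19). The intervals, outputs, seeds, and function names are all hypothetical.

```python
from typing import List, Sequence, Tuple


def fit_line(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Closed-form least squares for y = a0 + a1 * x (stand-in for Step 5)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    a1 = sxy / sxx if sxx > 0 else 0.0
    return mean_y - a1 * mean_x, a1


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    total = sum(abs(p - a) / abs(a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def coa_search(intervals: Sequence[Tuple[float, float]], y: Sequence[float],
               iterations: int = 200, mu: float = 4.0):
    # Step 1: one initial chaos variable per observation (hypothetical seeds).
    c = [((i + 1) * 0.137) % 1.0 for i in range(len(intervals))]
    best_fitness, best_q, best_coeffs = float("inf"), [], (0.0, 0.0)
    history: List[float] = []
    for _ in range(iterations):
        # Step 3: logistic map, then linear mapping into each interval.
        c = [mu * ci * (1.0 - ci) for ci in c]
        q = [lo + (hi - lo) * ci for (lo, hi), ci in zip(intervals, c)]
        # Steps 4-5: use q as the numerical characteristic and fit the model.
        a0, a1 = fit_line(q, y)
        # Step 6: predictions and the MAPE fitness of this iteration.
        y_hat = [a0 + a1 * qi for qi in q]
        fitness = mape(y, y_hat)
        history.append(fitness)
        if fitness < best_fitness:
            best_fitness, best_q, best_coeffs = fitness, q, (a0, a1)
    return best_fitness, best_q, best_coeffs, history


intervals = [(1.0, 2.0), (1.5, 2.5), (2.0, 3.5), (3.0, 4.0)]  # made-up intervals
y = [10.0, 12.0, 15.0, 18.0]                                  # made-up outputs
fitness, q, coeffs, _ = coa_search(intervals, y)
print(round(fitness, 3), [round(v, 3) for v in q], coeffs)
```

Keeping the best candidate across iterations mirrors the role of the fitness value: each iteration proposes new numerical characteristic values, refits the model, and is retained only if it lowers MAPE.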

Empirical Studies
This case studies the relationship between stock rating ($y$) and profitability ($x_1$), operational capacity ($x_2$), short-term debt-paying ability ($x_3$), and volatility of the stock ($x_4$). In the stock rating forecast, the input is a crisp value and the output is a trapezoidal fuzzy number. The value of stock price volatility is the standard deviation of the closing price of the stock from September 18 to October 18, 2017.
The sample data collected are shown in Table 1 and Table 2; in particular, $x_4$ in Table 1 is the standard deviation of the sample. The confidence interval of the stock volatility is then calculated with MATLAB and is shown in Table 3.
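The volatility measure described above, the standard deviation of closing prices over a period, can be computed as in the following sketch. The prices are made-up values, not data from Table 1, and the use of the sample standard deviation (divisor $n - 1$) is an assumption. The confidence interval step is not reproduced here.

```python
import statistics
from typing import Sequence


def price_volatility(closing_prices: Sequence[float]) -> float:
    """Sample standard deviation of closing prices (divisor n - 1)."""
    return statistics.stdev(closing_prices)


closing_prices = [10.20, 10.35, 10.10, 10.50, 10.42, 10.28]  # hypothetical prices
print(round(price_volatility(closing_prices), 4))
```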

The Construction of the Model
After the data collection, the fuzzy coefficients under the different risk preferences can be obtained from Equations (14) to (19), based on the initialized stock volatility.
New stock volatility values are then generated by COA, with Equation (4) taken as the objective function. Finally, the stock volatility values obtained under the different risk preferences are shown in Table 4 and Table 5. The results of the proposed models were compared with those of statistical regression (SR), Tanaka et al. [14] (denoted by TH), Diamond [1] (denoted by DM), Chiang Kao et al. [15] (denoted by KC), and Junhong Li et al. [9] (denoted by JL) to evaluate the effectiveness of the proposed method. MAPE and the variance of errors (VoE), defined in Equations (4) and (20), respectively, were adopted to compare the modeling results of these approaches.
The same survey data were used to develop models based on SR, TH, DM, KC, and JL. For SR, the centroid of each fuzzy number is used to represent the fuzzy number.
The models produced by the different methods are shown in Table 6.
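The two comparison measures can be computed together as sketched below. The form of VoE is an assumption, since Equation (20) is not reproduced here: it is taken as the sample variance of the absolute percentage errors around MAPE. Outputs are assumed to be crisp representative values, and the numbers are illustrative only, not results from Table 6.

```python
from typing import List, Sequence, Tuple


def percentage_errors(actual: Sequence[float], predicted: Sequence[float]) -> List[float]:
    """Absolute percentage error of each observation, in percent."""
    return [100.0 * abs(p - a) / abs(a) for a, p in zip(actual, predicted)]


def mape_and_voe(actual: Sequence[float], predicted: Sequence[float]) -> Tuple[float, float]:
    """Return (MAPE, VoE); VoE is the sample variance of the errors (assumed form)."""
    errors = percentage_errors(actual, predicted)
    n = len(errors)
    if n < 2:
        raise ValueError("at least two observations are needed for VoE")
    mean_error = sum(errors) / n
    variance = sum((e - mean_error) ** 2 for e in errors) / (n - 1)
    return mean_error, variance


# Errors are 10%, 10% and 0%: the mean is about 6.67 and the variance about 33.33.
m, v = mape_and_voe([100.0, 200.0, 400.0], [110.0, 180.0, 400.0])
print(round(m, 2), round(v, 2))  # 6.67 33.33
```

A lower MAPE indicates smaller average error, while a lower VoE indicates that the errors are spread more evenly across the observations.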

Conclusion
In this paper, we developed three mathematical programming models, called risk-neutral, risk-averse, and risk-seeking, to study fuzzy linear regression