Risk Evaluation Based on Variable Fuzzy Sets and Information Diffusion Method
1. Introduction
With global climate change and rapid population growth, many cities frequently suffer extreme natural hazards [1]. Areas with uncontrolled development and unplanned land-use change are highly sensitive, and natural hazards there cause devastating economic and social losses [2] [3]. These impacts make it important to assess the risk of natural hazards, and a growing economy requires improved risk assessment methods. Risk evaluation is an effective way to reduce the negative impacts and losses caused by natural hazards, and it plays an increasingly important role in emergency management.
Risk evaluation should consider both the likelihood that a specific event occurs and the severity of its outcome. The result combines the probabilities of all relevant hazards with the severity of the loss scenarios. Evaluating the risk of any hazard therefore requires a reasonable and effective assessment system. Generally, risk evaluation methods can be divided into qualitative and quantitative assessment methods. Qualitative assessment methods often rely on software such as geographic information systems (GIS). For example, Chen et al. [4] provided detailed information for flood risk management by combining the analytic hierarchy process with GIS in flood risk assessment; the method can be easily applied to most areas where the required data sets are readily available. Aye et al. [5] provided a prototype of an interactive web-GIS tool for risk evaluation and management evaluation in the Central East Moldavian Region, considering the occurrence of floods and earthquakes.
However, GIS-based risk evaluation often cannot be quantified, and its results are limited by insufficient risk-related data. In order to quantify risk, Huggins et al. [6] analyzed cascading disaster risk by using longitudinal data. Ribeiro et al. [7] proposed a probabilistic model, based on a bivariate copula approach using elliptical and Archimedean copulas, to estimate the probability of loss. Huang [8] put forward the concept of probabilistic risk, which can be quantified as an expected value to predict future risk from historical data. However, long time series of disaster-related data often do not exist, and the information contained in existing data sets is typically incomplete. Another method is therefore needed to assess the risk of hazards quantitatively, especially when the recorded data sets are incomplete. Many studies have focused on the information diffusion method (IDM), which belongs to fuzzy sets theory and makes it easier to quantify probabilistic risk when the data are insufficient. Li [9] proposed a flood risk assessment model based on IDM to deal with small sample sizes and gave an example to demonstrate that the model is successful. Xu et al. [10] developed another method that uses IDM to quantitatively assess the risk of multiple hazards. Huang [11] defined integrated probability risk and assessed the risk of annual loss by using IDM. Although some papers have defined and assessed integrated hazards risk [12] [13], the randomness and fuzziness that determine the reliability of the evaluation result have been ignored.
Because integrated risk evaluation involves fuzzy concepts with multiple indicators and classes, the variable fuzzy sets theory introduced by Chen [14], which uses membership degrees and relative membership functions to evaluate fuzzy concepts, has been successfully applied to risk evaluation. For example, Carreno et al. [15] applied fuzzy sets theory to seismic risk evaluation when the data required to assess risk are unavailable or insufficient. Li et al. [16] used the fuzzy comprehensive evaluation method to analyze flood risk; the method proved effective in handling fuzzy boundaries and in controlling the effect of monitoring errors on assessment results. Wang et al. [17] proposed an integrated variable fuzzy evaluation model for the assessment of river water quality, which overcomes a limitation of traditional evaluations that use a point value instead of an interval for grading standards.
There are many approaches to assessing risk, including uncertainty theory and incomplete sets. Many papers have shown that these methods can be integrated to improve the accuracy of the assessment result [18] [19]. There are also many improved models; for example, the projection pursuit method optimized by an immune evolutionary algorithm has been combined with the information diffusion method to assess the risk of drought [20]. However, there are still obstacles to assessing integrated hazard risk dynamically, where "dynamically" means that the time dimension is considered in the risk evaluation.
In our study, we first define the improved probability risk and then use VFS to eliminate both randomness and fuzziness in integrated risk evaluation. Based on the processed data, we take the time dimension into consideration and use IDM to extract as much useful underlying information as possible from the samples, in order to estimate the relationships behind the incomplete data sets. In this way the integrated risk is assessed dynamically and the accuracy of the evaluation result is improved. This paper is organized as follows. Section 2 gives some definitions of risk evaluation and describes the basic concepts and principles of the evaluation model. Section 3 describes in detail how the model is built and how the integrated risk is evaluated dynamically. Finally, Section 4 discusses the importance of the new model and outlines implications for further work.
2. Basic Concepts
2.1. Improved Probabilistic Risk
According to Huang [11], risk can be classified into four categories: pseudo risk, probability risk, fuzzy risk, and uncertainty risk. In the following, probability risk and fuzzy risk are introduced first, and the improved probability risk is then defined.
The probability risk is a future scenario associated with some specified adverse incident that can be statistically predicted by using a probability model. This kind of risk is difficult to determine when the corresponding events have fuzzy boundaries and the predictive information is incomplete. The fuzzy risk is a future scenario associated with some specified adverse incident that we can approximately infer by using fuzzy logic and incomplete information. IDM is used to solve this type of risk. Based on the two preceding definitions, the improved probability risk is a future scenario associated with an integrated adverse incident characterized by fuzzy sets of variables, where VFS-IDM is used to dynamically assess the integrated risk when the data sample is incomplete. The equation for improved probabilistic risk can therefore be defined as follows.
If we can estimate the probability distribution $p(x)$ of the occurrence of hazards with respect to a factor x and the relationship $f(x)$ between the factor and the losses, the risk can be quantified as the expected value of hazards, as shown in Equation (1):
$$R = \int p(x)\, f(x)\, \mathrm{d}x \tag{1}$$
2.2. Variable Fuzzy Sets Theory
For a variable fuzzy set (VFS) U and a random element $u \in U$, we can find two relative membership degree functions (RMDF), denoted by $\mu_A(u)$ and $\mu_{A^c}(u)$, which express the extent of acceptability (A) and repellency ($A^c$) respectively. That is, $\mu_A(u) + \mu_{A^c}(u) = 1$, and when the RMDF $\mu_A(u)$ is larger than $\mu_{A^c}(u)$, the major property of u is acceptability and the minor property is repellency. This means a ratio can be used to represent this relationship in the interval, and the expression of the RMDFs can be derived [14].
We define the interval $X_0 = [a, b]$ as the attracting set of VFS U, and $X = [c, d]$ as an extended interval on the real axis that includes $X_0$. Here r stands for the assessment object set. If M is the balance boundary of the interval $X_0$, then Figure 1 shows the possible locations of x when M is fixed.
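Following the research on how to define the balance boundaries of VFS [21], the important parameter M is obtained from the following equation:
$$M = \frac{L - l}{L - 1}\, a + \frac{l - 1}{L - 1}\, b \tag{2}$$
where l denotes the assessment indicator set, $l = 1, 2, \ldots, L$. The position of M decides the expression of the RMDF and satisfies the following suppositions: 1) when $l = 1$, then $M = a$; 2) when $l = L$, then $M = b$; 3) when $l = (L + 1)/2$, then $M = (a + b)/2$. To figure out the expression of the RMDFs, define the relative difference degree of u to A:
$$D_A(u) = \mu_A(u) - \mu_{A^c}(u) \tag{3}$$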
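Figure 1. Locations of a point x relative to the fixed point M and the intervals [a, b] and [c, d].
Given the point $x \in X$, if x is located to the right of M, the relative difference degree can be replaced by a ratio:
$$D_A(u) = \begin{cases} \left(\dfrac{x - b}{M - b}\right)^{p}, & x \in [M, b] \\ -\left(\dfrac{x - b}{d - b}\right)^{p}, & x \in [b, d] \end{cases} \tag{4}$$
Combined with the property $\mu_A(u) + \mu_{A^c}(u) = 1$ and Equations (3) and (4), the RMDF is given by Equation (5).
$$\mu_A(u) = \begin{cases} 0.5\left[1 + \left(\dfrac{x - b}{M - b}\right)^{p}\right], & x \in [M, b] \\ 0.5\left[1 - \left(\dfrac{x - b}{d - b}\right)^{p}\right], & x \in [b, d] \end{cases} \tag{5}$$
The case where x lies to the left of M is obtained in the same way, with a and c in place of b and d.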
From the above equations, it can be seen that the RMDF is affected by the hyper-parameter p and by the position of the random point x relative to the parameters a, b, c, d, and M. The RMDF should satisfy the following conditions: 1) when $x = a$ or $x = b$, $\mu_A(u) = 0.5$; 2) when $x = M$, $\mu_A(u) = 1$; 3) when $x = c$ or $x = d$, $\mu_A(u) = 0$. In general, we take $p = 1$ so that Equation (5) becomes a linear function. Depending on the location of the random point x with respect to the object interval, the calculation of the relative membership degree (RMD) falls into two types. When x is located at the lowest or highest point of the standard interval, the sum of the RMDs of this level and its adjacent level is equal to 1. When x is located in the interval of a mid-level, the sum of the RMDs of its adjacent levels is 0.5. The detailed process can be found in Fang [22], and we will apply this result to find the RMD matrix in the case study.
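The piecewise behaviour described above can be sketched in code. The following is a minimal illustration, assuming the piecewise form commonly used in the variable fuzzy sets literature, which may differ in detail from the paper's Equation (5). The function name `relative_membership` and the numeric intervals are hypothetical and chosen only for illustration.

```python
def relative_membership(x, a, b, c, d, m, p=1.0):
    """Relative membership degree of x for one level (assumed piecewise form).

    [a, b] is the attracting interval, [c, d] the extended interval that
    contains it, m the balance boundary with a <= m <= b, p the exponent.
    """
    if not (c <= a <= m <= b <= d):
        raise ValueError("expected c <= a <= m <= b <= d")
    if x == m:
        return 1.0  # full acceptability at the balance boundary
    if a <= x < m:
        return 0.5 * (1.0 + ((x - a) / (m - a)) ** p)
    if m < x <= b:
        return 0.5 * (1.0 + ((x - b) / (m - b)) ** p)
    if c < x < a:
        return 0.5 * (1.0 - ((x - a) / (c - a)) ** p)
    if b < x < d:
        return 0.5 * (1.0 - ((x - b) / (d - b)) ** p)
    return 0.0  # at or outside the extended interval [c, d]


# Hypothetical level: attracting interval [2, 6], extended interval [0, 10],
# balance boundary M = 4, linear case p = 1.
degrees = [relative_membership(x, 2.0, 6.0, 0.0, 10.0, 4.0) for x in (0, 2, 4, 6, 10)]
```

With p = 1 each branch is linear in x, and the boundary conditions stated in the text (0.5 at a and b, 1 at M, 0 at c and d) hold by construction.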
2.3. VFS-IEM to Evaluate the Comprehensive Degree Value
For each sample x, the measured values of the rth indicator with respect to the lth class can be expressed as a relative membership degree matrix by using the RMDFs. Because determining index weights and assessment standards is subjective and entails fuzzy sets, results are often incompatible and can even lead to unreliable conclusions. This study proposes the VFS-IEM model to make the result more reliable.
The degree value of the rth object with respect to the lth indicator, called the measured value, can then be defined. The information entropy method (IEM) is used in the fuzzy comprehensive evaluation [23] to calculate the weight of the indicators at each monitoring point over the whole assessment standard. For each element in U, the entropy of the lth indicator is obtained from the regularized membership vector:
(6)
The entropy coefficient of the lth indicator is then defined as
(7)
To assess the variable fuzzy sets more reliably, the relative membership degree of u to the lth indicator, which remains to be determined, should be found in a reasonable way. Following the study of Wang [21], the fuzzy set can be assessed by the VFS-IEM comprehensive evaluation model, given by Equation (8), which converts each sample from multi-dimensional indicators into a one-dimensional degree value H.
(8)
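The entropy-based weighting step can be illustrated with a short sketch. It assumes the standard Shannon entropy weight method (normalize each indicator column, compute its entropy, and weight indicators by one minus the entropy), which may differ in detail from the paper's Equations (6) and (7). The function name `entropy_weights` and the sample matrix are hypothetical.

```python
import math


def entropy_weights(matrix):
    """Indicator weights from the standard entropy weight method (assumed form).

    matrix[i][j] is the non-negative value of sample i on indicator j.
    """
    n = len(matrix)
    k = len(matrix[0])
    if n < 2:
        raise ValueError("at least two samples are required")
    diversity = []
    for j in range(k):
        column = [row[j] for row in matrix]
        total = sum(column)
        if total <= 0:
            raise ValueError("each indicator column needs a positive sum")
        shares = [value / total for value in column]
        # Shannon entropy scaled to [0, 1]; terms with a zero share contribute 0.
        entropy = -sum(s * math.log(s) for s in shares if s > 0) / math.log(n)
        diversity.append(1.0 - entropy)
    denominator = sum(diversity)
    if denominator == 0:
        return [1.0 / k] * k  # all indicators equally uninformative
    return [value / denominator for value in diversity]


# Hypothetical membership values: 3 samples, 2 indicators.
weights = entropy_weights([[0.2, 0.9], [0.3, 0.1], [0.25, 0.5]])
```

An indicator whose values are nearly identical across samples has entropy close to 1 and therefore receives a weight close to 0, which is the property that makes the weighting less subjective.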
2.4. Information Diffusion Method
To assess the risk of hazards quantitatively, especially when the recorded data sets are incomplete, IDM, which belongs to fuzzy sets theory, is used to extract as much useful underlying information as possible from the samples and to estimate the relationships behind the incomplete data. The diffused information given by the samples can be used to estimate the probability density of the samples, or the relationship between sample data, without knowing which distribution the samples come from.
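The research by Huang [24] [25] has given many results about IDM, which transforms a sample data point into a fuzzy set with a membership function so as to improve the precision of estimators. Let $X = \{x_1, x_2, \ldots, x_n\}$ be a one-dimensional random sample and let U be the universal field of monitoring points $u_i$. For all $x_j \in X$ and $u_i \in U$, the normal diffusion function that diffuses the information of sample $x_j$ to the monitoring point $u_i$ is calculated by
$$f_j(u_i) = \frac{1}{h\sqrt{2\pi}} \exp\left[-\frac{(x_j - u_i)^2}{2h^2}\right] \tag{9}$$
where h is the diffusion coefficient, which is calculated by the following formula:
(10)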
Furthermore, we have the m-dimensional diffusion function of X on V,
(11)
The principle of information diffusion has proved that the normal diffusion function is useful for estimating the probability density of samples, or the relationship between sample data, without knowing the sample distribution [24]. For example, the probability density can be estimated from the diffusion values, each of which expresses the information that an observation gives to a monitoring point. According to Equation (9), let
(12)
The probability density function can then be denoted by the matrix q:
(13)
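The estimation procedure described above can be sketched as follows. This is a minimal illustration that assumes the usual normal (Gaussian) diffusion function, a per-sample normalization so that every observation contributes one unit of information, and a final normalization into a discrete distribution. The function name `diffusion_probabilities`, the sample values, and the diffusion coefficient `h = 0.8` are hypothetical; in practice h would come from the formula for the diffusion coefficient.

```python
import math


def diffusion_probabilities(samples, points, h):
    """Discrete probability estimate on `points` by normal information diffusion.

    samples: observed values; points: monitoring points; h: diffusion coefficient.
    """
    if h <= 0:
        raise ValueError("h must be positive")
    q = [0.0] * len(points)
    for x in samples:
        # Information that observation x diffuses to each monitoring point.
        f = [
            math.exp(-((x - u) ** 2) / (2.0 * h * h)) / (h * math.sqrt(2.0 * math.pi))
            for u in points
        ]
        c = sum(f)
        if c == 0.0:
            raise ValueError("sample too far from all monitoring points for this h")
        # Normalize so that each observation carries total information 1.
        for i, value in enumerate(f):
            q[i] += value / c
    total = sum(q)  # equals the number of samples up to rounding
    return [value / total for value in q]


# Hypothetical small sample and an evenly spaced set of monitoring points.
samples = [1.0, 2.0, 2.5, 4.0]
points = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
probabilities = diffusion_probabilities(samples, points, h=0.8)
```

Because each observation is spread over neighbouring monitoring points instead of being counted in a single bin, the estimate stays smooth even for very small samples.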
To estimate the relationship between samples, consider sample points that each consist of an input and an output, together with the universal field of monitoring points and the normal diffusion function. Each observation diffuses its information to the monitoring points, which yields a fuzzy relation defined as:
(14)
Then, based on Equation (14), we have the model
(15)
It has been shown that the max-min fuzzy composition rule, i.e.,
(16)
can make more accurate inferences when the samples are incomplete [25].
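The max-min fuzzy composition rule mentioned above can be illustrated with a short sketch. It shows only the generic composition of an input fuzzy set with a fuzzy relation matrix; how the relation is built from the diffused samples is not reproduced here. The function name `max_min_compose` and all numeric values are hypothetical.

```python
def max_min_compose(input_set, relation):
    """Max-min composition of a fuzzy set with a fuzzy relation.

    input_set[j] is the membership of the input in input point j;
    relation[j][k] relates input point j to output point k.
    Returns the membership of the inferred output in each output point k.
    """
    if len(input_set) != len(relation):
        raise ValueError("input_set and relation must have the same number of rows")
    columns = len(relation[0])
    return [
        max(min(input_set[j], relation[j][k]) for j in range(len(input_set)))
        for k in range(columns)
    ]


# Hypothetical fuzzy input over 3 input points and relation to 2 output points.
fuzzy_input = [0.2, 1.0, 0.5]
fuzzy_relation = [
    [1.0, 0.3],
    [0.4, 0.9],
    [0.6, 0.6],
]
fuzzy_output = max_min_compose(fuzzy_input, fuzzy_relation)
```

The resulting output is itself a fuzzy set, so a defuzzification step such as the center of gravity method is still needed to obtain a single value.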
3. Dynamic Integrated Probabilistic Risk Evaluation Model
Risk evaluation is an effective way to reduce the impacts of hazards, and it plays an increasingly important role in emergency management. However, the fuzziness of integrated hazards and the time dimension are often ignored in risk evaluation. Based on the definition of improved probability risk, this study proposes a combined model, built on variable fuzzy sets and the information diffusion method, to assess integrated hazards risk dynamically.
By defining the interval criterion matrix and the variable interval matrix, the balance boundaries M and the RMD matrix can be calculated. According to the VFS-IEM comprehensive evaluation model, the integrated hazards level value of every sample is given by Equation (8). In this way the fuzziness in the sample data sets is eliminated, and the processed samples reflect the level value of integrated hazards.
To evaluate the integrated risk dynamically, the time dimension should be considered. This study uses the processed samples to estimate the conditional probability distribution and the vulnerability curve by applying the information diffusion method. Equations (9) and (13) give the discrete probability distribution of hazard occurrence:
(17)
The conditional probability distribution with respect to the time dimension is given by Equation (18),
(18)
and the vulnerability curve is obtained by Equation (16) and the center of gravity method [26]:
(19)
The dynamic risk of integrated hazards can then be quantified as the expected value of the vulnerability curve under the conditional probability distribution, as shown in Equation (20).
(20)
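The final step can be sketched numerically. The example below assumes that the conditional probability distribution is available as one discrete distribution per time step over the hazard level points, and that the vulnerability curve is a loss value per level point obtained by center of gravity defuzzification. Both function names and all numbers are hypothetical and are not taken from the paper.

```python
def center_of_gravity(memberships, values):
    """Defuzzify a fuzzy output: membership-weighted mean of the values."""
    total = sum(memberships)
    if total == 0:
        raise ValueError("memberships must not all be zero")
    return sum(m * v for m, v in zip(memberships, values)) / total


def dynamic_risk(conditional_probabilities, vulnerability):
    """Expected loss per time step.

    conditional_probabilities[t][i]: probability of hazard level i at time t.
    vulnerability[i]: loss associated with hazard level i.
    """
    risks = []
    for row in conditional_probabilities:
        if len(row) != len(vulnerability):
            raise ValueError("each distribution must match the vulnerability curve")
        risks.append(sum(p * f for p, f in zip(row, vulnerability)))
    return risks


# Hypothetical loss for one hazard level from a fuzzy output over loss values.
loss_at_level = center_of_gravity([0.0, 1.0, 1.0], [10.0, 20.0, 30.0])

# Hypothetical conditional distributions for two time steps over three levels.
conditional = [
    [0.5, 0.5, 0.0],
    [0.0, 0.5, 0.5],
]
curve = [0.0, 2.0, 4.0]
risk_by_time = dynamic_risk(conditional, curve)
```

Keeping one distribution per time step is what makes the estimate dynamic: a shift of probability mass toward higher hazard levels over time shows up directly as a rising expected loss.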
4. Conclusion and Discussion
Risk evaluation is a very important issue in emergency management, but few papers discuss the uncertainty of integrated hazards or consider dynamic risk along the time dimension when the data set is incomplete. In our study, the definition of probabilistic risk is modified to take the time dimension into consideration, which introduces the concept of dynamic probabilistic risk. The study then employs variable fuzzy sets (VFS) theory to calculate the relative membership degree and applies the information entropy method (IEM) to obtain the weights of the criteria indicators for integrated hazards evaluation. Based on the results of the VFS-IEM model, the information diffusion method (IDM) is applied to estimate the conditional probability distribution and the vulnerability curve from the time data and the integrated hazards losses. The dynamic risk can then be calculated by using the normal information diffusion estimator, which improves the accuracy of the risk evaluation results. The proposed model has two highlights. First, integrated hazards are processed by VFS theory combined with IEM, which makes the integrated hazards level more reliable. Second, converting the sample points into fuzzy sets addresses the problem of limited information in dynamic risk and improves the accuracy of the estimates obtained from sample data. Further work will focus on a case study, for which data sets will be collected to test our methods.
Acknowledgements
The authors gratefully acknowledge the funding from the National Natural Science Foundation of China, Project No. 71771113 and the National Key R&D Program of China, 2018YFC0807000.