In this paper, survival data analysis is carried out by applying Generalized Entropy Optimization Methods (GEOM). It is known that all statistical distributions can be obtained as MaxEnt distributions by choosing the corresponding moment functions. However, Generalized Entropy Optimization Distributions (GEOD) in the form of MinMaxEnt and MaxMaxEnt distributions, which are obtained on the basis of the Shannon measure and a supplementary optimization with respect to the characterizing moment functions, represent the given statistical data more exactly. For this reason, survival data analysis by GEOD acquires a new significance. In this research, the life table for engine failure data (1980) is examined. The performance of GEOD is assessed by the Chi-Square criterion, the Root Mean Square Error (RMSE) criterion, the Shannon entropy measure, and the Kullback-Leibler measure. Comparison of GEOD with each other in these different senses shows that, among these distributions, (MinMaxEnt)_{4} is better in the sense of the Shannon measure and of the Kullback-Leibler measure. It is shown that (MinMaxEnt)_{3} ((MaxMaxEnt)_{4}) is more suitable for the statistical data among (MinMaxEnt)_{m}, m=1,2,3,4 ((MaxMaxEnt)_{m}, m=1,2,3,4). Moreover, (MinMaxEnt)_{3} is better for the statistical data than (MaxMaxEnt)_{4} in the sense of the RMSE criterion. According to the obtained distribution (MinMaxEnt)_{3} ((MaxMaxEnt)_{4}), estimators of the Probability Density Function f̂(t), the Cumulative Distribution Function F̂(t), the Survival Function Ŝ(t), and the Hazard Rate ĥ(t) are evaluated and graphically illustrated. The results are obtained using the statistical software MATLAB.

Entropy Optimization Methods (EOM) have important applications, especially in statistics, economics, and engineering. The literature contains several examples in which known statistical distributions do not conform to the statistical data, whereas entropy optimization distributions conform well. Generalized Entropy Optimization Methods (GEOM) suggest distributions in the form of MinMaxEnt, which is the closest to the statistical data, and MaxMaxEnt, which is the furthest from the data in the sense of information theory [

Different aspects and methods of investigations of survival data analysis are considered in [

In particular in the paper [

In order to represent the results of our investigations, we give some auxiliary concepts and facts first.

Survival time can be defined broadly as the time to the occurrence of a given event. This event can be the development of a disease, response to a treatment, relapse or death [

Censoring: The techniques for reducing experimental time are known as censoring. In survival analysis, the observations are lifetimes, which can be indefinitely long. Quite often, therefore, the experiment is designed so that the time required for collecting the data is reduced to manageable levels.

Let

The probabilistic properties of the random variable are studied through its cumulative distribution function

Cumulative Distribution Function:

Survival Function: This function is denoted by

Probability Density Function: Like any other continuous random variable, the survival time

Hazard Rate: This function is defined as the probability of failure during a very small time interval, assuming that the individual has survived to the beginning of the interval, or as the limit of the probability that an individual fails in a very short interval,

Entropy Optimization Problem (EOP) [

EOP: Let

GEOP: Let

The method of solving GEOP is called GEOM.

The problem of maximizing entropy function

subject to constraints

where

has solution

where

If (3) is substituted into (1), the maximum entropy value is obtained:

If distribution

Let

Consequently,

Distributions

Let

Solving the

In the present research, the data of the life table for engine failure data (1980) given in

In our investigation, the experiment is planned for 200 individuals surviving at the beginning of the first interval; however, because of censoring, 97 of the planned individuals remain outside the experiment. This situation is taken into account in

It should be noted that the presence of censoring in the survival times leads to a situation where the sum of the observed probabilities is less than 1 for the

Survival Time (year) | Working at the beginning of interval | Failed during the interval | Censored during the interval |
---|---|---|---|
0 - 1 | 200 | 5 | 0 |
1 - 2 | 195 | 10 | 1 |
2 - 3 | 184 | 12 | 5 |
3 - 4 | 167 | 8 | 2 |
4 - 5 | 157 | 10 | 0 |
5 - 6 | 147 | 15 | 6 |
6 - 7 | 126 | 9 | 3 |
7 - 8 | 114 | 8 | 1 |
8 - 9 | 105 | 4 | 0 |
9 - 10 | 101 | 3 | 1 |

Survival Time (year) | Working at the beginning of interval | Failed during the interval | Censored during the interval | Observed probabilities | Corrected probabilities |
---|---|---|---|---|---|
0 - 1 | 200 | 5 | 0 | 0.0485 | 0.0485 |
1 - 2 | 195 | 10 | 1 | 0.0971 | 0.1068 |
2 - 3 | 184 | 12 | 5 | 0.1165 | 0.1650 |
3 - 4 | 167 | 8 | 2 | 0.0777 | 0.0971 |
4 - 5 | 157 | 10 | 0 | 0.0971 | 0.0971 |
5 - 6 | 147 | 15 | 6 | 0.1456 | 0.2039 |
6 - 7 | 126 | 9 | 3 | 0.0874 | 0.1165 |
7 - 8 | 114 | 8 | 1 | 0.0777 | 0.0874 |
8 - 9 | 105 | 4 | 0 | 0.0388 | 0.0388 |
9 - 10 | 101 | 3 | 1 | 0.0291 | 0.0388 |
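The probability columns of the table above can be reproduced from the counts. The following is a minimal sketch, assuming (this is inferred from the tabulated numbers, not stated explicitly in the text) that the observed probabilities are the failure counts divided by the total number of failed and censored units, 103, and that the corrected probabilities add the censored counts to the numerator. The names `LIFE_TABLE` and `interval_probabilities` are illustrative only.

```python
# Life table rows: (working at start of interval, failed, censored), one per year.
LIFE_TABLE = [
    (200, 5, 0), (195, 10, 1), (184, 12, 5), (167, 8, 2), (157, 10, 0),
    (147, 15, 6), (126, 9, 3), (114, 8, 1), (105, 4, 0), (101, 3, 1),
]


def interval_probabilities(table):
    """Return (total, observed, corrected) probabilities per interval.

    Assumption: the common denominator is the number of units that either
    failed or were censored during the study (here 84 + 19 = 103).
    """
    failed = [f for _, f, _ in table]
    censored = [c for _, _, c in table]
    total = sum(failed) + sum(censored)
    observed = [f / total for f in failed]
    corrected = [(f + c) / total for f, c in zip(failed, censored)]
    return total, observed, corrected


total, observed, corrected = interval_probabilities(LIFE_TABLE)
for (start, _, _), p_obs, p_cor in zip(LIFE_TABLE, observed, corrected):
    print(f"{start:4d}  {p_obs:.4f}  {p_cor:.4f}")
print("sum observed :", round(sum(observed), 4))
print("sum corrected:", round(sum(corrected), 4))
```

Under this assumption the observed probabilities sum to 84/103, which is less than 1, while the corrected probabilities sum to 1. The 97 units not accounted for are those still working after the last interval (200 - 103).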

survival data. For this reason, in solving many problems, it is required to supplement the sum of observation probabilities up to 1. Since the sum of observed probabilities

As we noted above,

Consequently,

gives the least value to

gives the greatest value to

The

In order to assess the performance of the mentioned distributions, we use various criteria, such as the Root Mean Square Error (RMSE), Chi-Square, and the entropy values of the distributions. The results are shown in

All

In the sense of RMSE criteria each

3.3084 | 3.2854 | 3.3219 | 3.3040 | 3.3204 | |

0.1229 0.1171 0.1116 0.1064 0.1015 0.0967 0.0922 0.0879 0.0838 0.0799 | 0.1269 0.1249 0.1210 0.1153 0.1081 0.0997 0.0905 0.0809 0.0711 0.0615 | 0.0989 0.0995 0.0998 0.1000 0.1001 0.1002 0.1003 0.1004 0.1004 0.1005 | 0.1202 0.1238 0.1161 0.1083 0.1014 0.0954 0.0901 0.0855 0.0814 0.0777 | 0.1093 0.1058 0.1030 0.1010 0.0994 0.0981 0.0971 0.0962 0.0954 0.0947 |

3.2042 | 3.2140 | 3.2921 | 3.2106 | 3.2041 | |

0.0619 0.0921 0.1218 0.1434 0.1502 0.1400 0.1160 0.0856 0.0562 0.0328 | 0.0429 0.1137 0.1451 0.1490 0.1377 0.1194 0.0994 0.0803 0.0634 0.0492 | 0.0863 0.1449 0.1299 0.1126 0.0999 0.0914 0.0861 0.0833 0.0824 0.0832 | 0.0543 0.0939 0.1351 0.1529 0.1480 0.1290 0.1046 0.0804 0.0593 0.0424 | 0.0483 0.1036 0.1363 0.1493 0.1456 0.1296 0.1068 0.0820 0.0589 0.0397 | |

3.2408 | 3.2057 | 3.2305 | 3.2500 | 3.2237 | |

0.1101 0.0825 0.1058 0.1285 0.1396 0.1344 0.1150 0.0877 0.0598 0.0366 | 0.0579 0.0928 0.1281 0.1484 0.1499 0.1353 0.1108 0.0830 0.0573 0.0365 | 0.0400 0.1294 0.1481 0.1403 0.1251 0.1091 0.0944 0.0816 0.0706 0.0613 | 0.0423 0.1425 0.1432 0.1278 0.1134 0.1018 0.0924 0.0848 0.0785 0.0733 | 0.0403 0.1205 0.1478 0.1454 0.1310 0.1133 0.0961 0.0809 0.0679 0.0570 |

3.2024 | 3.2000 | 3.2042 | 3.2083 | 3.2100 | |

0.0537 0.0972 0.1291 0.1471 0.1488 0.1355 0.1118 0.0838 0.0573 0.0357 | 0.0489 0.1034 0.1314 0.1458 0.1466 0.1342 0.1117 0.0844 0.0578 0.0358 | 0.0625 0.0920 0.1211 0.1427 0.1501 0.1405 0.1167 0.0859 0.0561 0.0324 | 0.0515 0.0974 0.1359 0.1519 0.1472 0.1290 0.1051 0.0808 0.0593 0.0420 | 0.0503 0.0988 0.1382 0.1526 0.1458 0.1268 0.1033 0.0802 0.0601 0.0438 | |

3.2104 | 3.2035 | 3.2039 | 3.2050 | 3.2155 | |

0.0522 0.0964 0.1369 0.1530 0.1468 0.1277 0.1038 0.0802 0.0598 0.0433 | 0.0518 0.0987 0.1322 0.1487 0.1480 0.1330 0.1093 0.0827 0.0579 0.0376 | 0.0502 0.1007 0.1343 0.1493 0.1469 0.1313 0.1079 0.0822 0.0584 0.0388 | 0.0533 0.0968 0.1324 0.1497 0.1483 0.1325 0.1085 0.0822 0.0580 0.0383 | 0.0501 0.0986 0.1416 0.1543 0.1436 0.1227 0.0999 0.0792 0.0619 0.0480 |

3.1937 | 3.1935 | 3.1936 | 3.1932 | 3.1934 | |

0.0477 0.1198 0.1221 0.1306 0.1397 0.1395 0.1230 0.0921 0.0570 0.0284 | 0.0476 0.1203 0.1216 0.1302 0.1402 0.1400 0.1230 0.0917 0.0568 0.0287 | 0.0477 0.1201 0.1218 0.1303 0.1400 0.1398 0.1230 0.0919 0.0568 0.0286 | 0.0475 0.1217 0.1192 0.1292 0.1422 0.1421 0.1225 0.0898 0.0560 0.0298 | 0.0476 0.1209 0.1207 0.1298 0.1408 0.1407 0.1229 0.0911 0.0565 0.0290 |

Interval | Failed during the interval | Censored during the interval | Corrected probabilities | (MinMaxEnt)_{1} | (MinMaxEnt)_{2} | (MinMaxEnt)_{3} | (MinMaxEnt)_{4} |
---|---|---|---|---|---|---|---|
1 | 5 | 0 | 0.0485 | 0.1269 | 0.0483 | 0.0489 | 0.0475 |
2 | 10 | 1 | 0.1068 | 0.1249 | 0.1036 | 0.1034 | 0.1217 |
3 | 12 | 5 | 0.1650 | 0.1210 | 0.1363 | 0.1314 | 0.1192 |
4 | 8 | 2 | 0.0971 | 0.1153 | 0.1493 | 0.1458 | 0.1292 |
5 | 10 | 0 | 0.0971 | 0.1081 | 0.1456 | 0.1466 | 0.1422 |
6 | 15 | 6 | 0.2039 | 0.0997 | 0.1296 | 0.1342 | 0.1421 |
7 | 9 | 3 | 0.1165 | 0.0905 | 0.1068 | 0.1117 | 0.1225 |
8 | 8 | 1 | 0.0874 | 0.0809 | 0.0820 | 0.0844 | 0.0898 |
9 | 4 | 0 | 0.0388 | 0.0711 | 0.0589 | 0.0578 | 0.0560 |
10 | 3 | 1 | 0.0388 | 0.0615 | 0.0397 | 0.0358 | 0.0298 |

Interval | Failed during the interval | Censored during the interval | Corrected probabilities | (MaxMaxEnt)_{1} | (MaxMaxEnt)_{2} | (MaxMaxEnt)_{3} | (MaxMaxEnt)_{4} |
---|---|---|---|---|---|---|---|
1 | 5 | 0 | 0.0485 | 0.0989 | 0.0863 | 0.0501 | 0.0477 |
2 | 10 | 1 | 0.1068 | 0.0995 | 0.1449 | 0.0986 | 0.1198 |
3 | 12 | 5 | 0.1650 | 0.0998 | 0.1299 | 0.1416 | 0.1221 |
4 | 8 | 2 | 0.0971 | 0.1000 | 0.1126 | 0.1543 | 0.1306 |
5 | 10 | 0 | 0.0971 | 0.1001 | 0.0999 | 0.1436 | 0.1397 |
6 | 15 | 6 | 0.2039 | 0.1002 | 0.0914 | 0.1227 | 0.1395 |
7 | 9 | 3 | 0.1165 | 0.1003 | 0.0861 | 0.0999 | 0.1230 |
8 | 8 | 1 | 0.0874 | 0.1004 | 0.0833 | 0.0792 | 0.0921 |
9 | 4 | 0 | 0.0388 | 0.1004 | 0.0824 | 0.0619 | 0.0570 |
10 | 3 | 1 | 0.0388 | 0.1005 | 0.0832 | 0.0480 | 0.0284 |

Although the distribution with the largest number of moment functions tends to fit better, it should be noted that in some cases a set of moment functions with fewer elements is more informative than a different set of moment functions with more elements.
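To make the role of a single moment constraint concrete, the following minimal sketch computes a maximum entropy distribution on the interval indices 1, ..., 10 subject to a prescribed mean. The choice of moment function g(x) = x, the use of the mean of the corrected probabilities as the moment value, and the bisection solver are all assumptions made for illustration; they are not taken from the text. Entropy is reported in bits.

```python
import math


def maxent_with_mean(support, target_mean, lo=-5.0, hi=5.0, iters=200):
    """MaxEnt distribution on `support` with a prescribed mean.

    The solution has the exponential form p_i = exp(-lam * x_i) / Z.
    The mean is strictly decreasing in lam, so lam is found by bisection.
    """
    def dist(lam):
        weights = [math.exp(-lam * x) for x in support]
        z = sum(weights)
        return [w / z for w in weights]

    def mean(lam):
        return sum(x * p for x, p in zip(support, dist(lam)))

    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mean(mid) > target_mean:
            lo = mid  # mean too large: lam must increase
        else:
            hi = mid
    return dist(0.5 * (lo + hi))


def entropy_bits(p):
    """Shannon entropy with base-2 logarithm."""
    return -sum(v * math.log2(v) for v in p if v > 0)


support = list(range(1, 11))
# Failed + censored counts per interval, taken from the life table.
counts = [5, 11, 17, 10, 10, 21, 12, 9, 4, 4]
target_mean = sum(x * c for x, c in zip(support, counts)) / sum(counts)

p = maxent_with_mean(support, target_mean)
print([round(v, 4) for v in p])
print("entropy (bits):", round(entropy_bits(p), 4))
```

Under these assumptions the result is a geometric-type distribution whose values come out close to the tabulated five-candidate distribution that starts at 0.1229 and has entropy 3.3084. Adding further moment constraints can only lower the maximum entropy value, because the feasible set shrinks.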

Distribution | Entropy value | Calculated value of Chi-Square | Probability of Chi-Square value | RMSE |
---|---|---|---|---|
(MinMaxEnt)_{1} | 3.2854 | 4.4310 | 0.8163 | 0.3158 |
(MinMaxEnt)_{2} | 3.2041 | 0.6512 | 0.9987 | 0.1873 |
(MinMaxEnt)_{3} | 3.2000 | 1.7787 | 0.9389 | 0.1799 |
(MinMaxEnt)_{4} | 3.1932 | 1.6161 | 0.8993 | 0.1830 |

Distribution | Entropy value | Calculated value of Chi-Square | Probability of Chi-Square value | RMSE |
---|---|---|---|---|
(MaxMaxEnt)_{1} | 3.3219 | 5.3820 | 0.7161 | 0.3492 |
(MaxMaxEnt)_{2} | 3.2921 | 4.9233 | 0.6693 | 0.3492 |
(MaxMaxEnt)_{3} | 3.2155 | 2.2804 | 0.8922 | 0.2104 |
(MaxMaxEnt)_{4} | 3.1937 | 1.6383 | 0.8966 | 0.1888 |

In order to assess the suitability of GEOD for survival data in the sense of the Shannon measure, it is necessary to consider the entropy values of GEOD.
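As a quick check of how the entropy values relate to the tabulated distributions, the sketch below evaluates the Shannon entropy of two of the ten-point distributions listed in the tables. The use of the base-2 logarithm is an assumption inferred from the numbers (the near-uniform distribution has entropy 3.3219, which is approximately log2 of 10). The dictionary keys are the entropy values reported next to each distribution.

```python
import math


def shannon_entropy_bits(p):
    """Shannon entropy H(p) = -sum_i p_i * log2(p_i), skipping zero entries."""
    return -sum(v * math.log2(v) for v in p if v > 0)


# Probability vectors copied from the tables, keyed by their reported entropy.
REPORTED = {
    3.2000: [0.0489, 0.1034, 0.1314, 0.1458, 0.1466,
             0.1342, 0.1117, 0.0844, 0.0578, 0.0358],
    3.1932: [0.0475, 0.1217, 0.1192, 0.1292, 0.1422,
             0.1421, 0.1225, 0.0898, 0.0560, 0.0298],
}

computed = {h: shannon_entropy_bits(p) for h, p in REPORTED.items()}
for reported, value in computed.items():
    print(f"reported {reported:.4f}  computed {value:.4f}")
```

A smaller entropy value means that the distribution is further from the uniform distribution on the ten intervals, so it encodes more of the structure imposed by the moment constraints.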

From

From

From

From

and

Comparison of GEOD with each other in the sense of the Shannon measure shows that among these distributions

The results of our investigation according to using known characterizing moment vector functions from

Corollary 1. If by

is fulfilled, when

Moreover for any

takes place.

Now, we calculate the distance between observed distribution

It is known that the Kullback-Leibler distance between distributions

Starting from this formula, the Kullback-Leibler measures for the distance between the observed distribution

From

The results of our investigation according to using known characterizing moment vector functions from

Corollary 2. If

is fulfilled, when

Moreover for any

takes place.

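Distribution | Kullback-Leibler measure |
---|---|
(MinMaxEnt)_{1} | 0.3938 |
(MinMaxEnt)_{2} | 0.3348 |
(MinMaxEnt)_{3} | 0.3300 |
(MinMaxEnt)_{4} | 0.3193 |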

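Distribution | Kullback-Leibler measure |
---|---|
(MaxMaxEnt)_{1} | 0.4441 |
(MaxMaxEnt)_{2} | 0.4009 |
(MaxMaxEnt)_{3} | 0.3457 |
(MaxMaxEnt)_{4} | 0.3198 |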
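For readers who want to experiment with this kind of comparison, the following is a minimal sketch of the standard Kullback-Leibler divergence between two discrete distributions. The exact formula, logarithm base, and normalization used for the values tabulated above are not visible in this text, so the standard definition D(p || q) = sum of p_i log(p_i / q_i) is an assumption, and no attempt is made here to reproduce the tabulated values. The function name and the `base` parameter are illustrative.

```python
import math


def kl_divergence(p, q, base=2.0):
    """Standard Kullback-Leibler divergence D(p || q) for discrete distributions.

    Terms with p_i == 0 contribute nothing; if q_i == 0 while p_i > 0 the
    divergence is infinite.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0:
            continue
        if qi == 0:
            return math.inf
        total += pi * math.log(pi / qi, base)
    return total


# Corrected probabilities from the life table and one fitted distribution
# copied from the tables (the one with the smallest reported entropy).
corrected = [0.0485, 0.1068, 0.1650, 0.0971, 0.0971,
             0.2039, 0.1165, 0.0874, 0.0388, 0.0388]
fitted = [0.0475, 0.1217, 0.1192, 0.1292, 0.1422,
          0.1421, 0.1225, 0.0898, 0.0560, 0.0298]

d_pq = kl_divergence(corrected, fitted)
d_qp = kl_divergence(fitted, corrected)
print("D(corrected || fitted):", d_pq)
print("D(fitted || corrected):", d_qp)
```

The divergence is zero only when the two distributions coincide, and it is not symmetric, so the order of the arguments matters when ranking candidate distributions.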

In this section survival data analysis is conducted by

On the basis of the results given in

Interval | Working at the beginning of interval | Failed during the interval | Censored during the interval | f̂(t) for (MinMaxEnt)_{3} | F̂(t) | Ŝ(t) | ĥ(t) |
---|---|---|---|---|---|---|---|
1 | 200 | 5 | 0 | 0.0489 | 0.0489 | 0.9511 | 0.0514 |
2 | 195 | 10 | 1 | 0.1034 | 0.1523 | 0.8477 | 0.1220 |
3 | 184 | 12 | 5 | 0.1314 | 0.2837 | 0.7163 | 0.1834 |
4 | 167 | 8 | 2 | 0.1458 | 0.4295 | 0.5705 | 0.2556 |
5 | 157 | 10 | 0 | 0.1466 | 0.5761 | 0.4239 | 0.3458 |
6 | 147 | 15 | 6 | 0.1342 | 0.7103 | 0.2897 | 0.4632 |
7 | 126 | 9 | 3 | 0.1117 | 0.8220 | 0.1780 | 0.6275 |
8 | 114 | 8 | 1 | 0.0844 | 0.9064 | 0.0936 | 0.9017 |
9 | 105 | 4 | 0 | 0.0578 | 0.9642 | 0.0358 | 1.6145 |
10 | 101 | 3 | 1 | 0.0358 | 1.0000 | 0.0000 | -- |

Interval | Working at the beginning of interval | Failed during the interval | Censored during the interval | f̂(t) for (MaxMaxEnt)_{4} | F̂(t) | Ŝ(t) | ĥ(t) |
---|---|---|---|---|---|---|---|
1 | 200 | 5 | 0 | 0.0477 | 0.0477 | 0.9523 | 0.0501 |
2 | 195 | 10 | 1 | 0.1198 | 0.1675 | 0.8325 | 0.1439 |
3 | 184 | 12 | 5 | 0.1221 | 0.2896 | 0.7104 | 0.1719 |
4 | 167 | 8 | 2 | 0.1306 | 0.4202 | 0.5798 | 0.2253 |
5 | 157 | 10 | 0 | 0.1397 | 0.5599 | 0.4401 | 0.3174 |
6 | 147 | 15 | 6 | 0.1395 | 0.6994 | 0.3006 | 0.4641 |
7 | 126 | 9 | 3 | 0.1230 | 0.8224 | 0.1776 | 0.6926 |
8 | 114 | 8 | 1 | 0.0921 | 0.9145 | 0.0855 | 1.0772 |
9 | 105 | 4 | 0 | 0.0570 | 0.9715 | 0.0285 | 2.0000 |
10 | 101 | 3 | 1 | 0.0284 | 0.9999 | 0.0001 | -- |
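The last four columns of the two tables above can be recomputed from the fitted interval probabilities alone. The sketch below assumes the convention that appears to be used in the tables (inferred from the numbers, not stated in the text): F̂ is the running sum of f̂, Ŝ = 1 - F̂ at the end of each interval, and ĥ = f̂ / Ŝ. The hazard is left undefined (`None`) when Ŝ is numerically zero, which corresponds to the "--" entry in the last row; the cutoff `eps` is an illustrative choice.

```python
from itertools import accumulate


def survival_estimates(f, eps=1e-3):
    """Return (F, S, h) for interval probabilities f.

    F: cumulative distribution at the end of each interval.
    S: survival function, S = 1 - F.
    h: hazard rate, h = f / S, or None where S is below `eps`.
    """
    F = list(accumulate(f))
    S = [1.0 - c for c in F]
    h = [fi / si if si > eps else None for fi, si in zip(f, S)]
    return F, S, h


# Fitted probabilities taken from the first of the two tables above.
f_hat = [0.0489, 0.1034, 0.1314, 0.1458, 0.1466,
         0.1342, 0.1117, 0.0844, 0.0578, 0.0358]

F_hat, S_hat, h_hat = survival_estimates(f_hat)
for i, (fi, Fi, Si, hi) in enumerate(zip(f_hat, F_hat, S_hat, h_hat), start=1):
    h_text = "--" if hi is None else f"{hi:.4f}"
    print(f"{i:2d}  {fi:.4f}  {Fi:.4f}  {abs(Si):.4f}  {h_text}")
```

Because Ŝ is evaluated at the end of the interval, the hazard grows quickly in the last intervals, where few units remain.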

In this study, survival data analysis is carried out by applying Generalized Entropy Optimization Methods (GEOM). Generalized Entropy Optimization Distributions (GEOD) in the form of

is shown that,

Shamilov, A., Kalathilparmbil, C. and Ozdemir, S. (2017) An Application of Generalized Entropy Optimization Methods in Survival Data Analysis. Journal of Modern Physics, 8, 349-364. https://doi.org/10.4236/jmp.2017.83024