Parametrization of Survival Measures, Part I: Consequences of Self-Organizing

Lifetime analyses frequently apply a parametric functional description to the Kaplan-Meier non-parametric estimate (KM) of the survival probability obtained from measured data. The cumulative Weibull distribution function (WF) is the primary choice to parametrize the KM, but some others (e.g. Gompertz, logistic functions) are also widely applied. We show that the cumulative two-parametric Weibull function meets all requirements. The Weibull function is the consequence of the general self-organizing behavior of the survival, and consequently shows a self-similar death-rate as a function of time. The ontogenic universality as well as the universality of tumor-growth fit the WF. WF parametrization needs two independent parameters, which can be obtained from the median and mean values of the KM estimate, allowing an easy parametric approximation of the KM plot. The entropy of the distribution and the other entropy descriptions support the validity of the parametrization well. The goal is to find the most appropriate way of mining the inherent information in KM plots. The two-parameter WF fits the non-parametric KM survival curve in a real study of 1180 cancer patients, offering a satisfactory description of the clinical results. Two of the three characteristic parameters of the KM plot (namely the points of median, mean or inflection) are enough to reconstruct the parametric fit, which supports the comparison of survival curves of different patient groups.


Introduction
The driving force of the overall spontaneous progressions in nature is the at-agnosis as well [23]. Dynamical interactions have a spatiotemporal fluctuation which also has a scaling behavior. Homeostatic time-fluctuation is the so-called pink noise [24], that characterizes the noise of homeostasis.
The above complex biological processes connect to biological allometry, scaling, non-equilibrium, and non-linear thermodynamics. Special self-similarity characterizes the mass-allometry by universal scaling, and it appears in a large category of living structures and processes [25], which rigorously optimizes the metabolic power in a universal frame [26]. Scaling is a simple power function, $P(x) = a x^{b}$, where a and b are constants; therefore the form of $P(x)$ remains the same under any magnification of x. This scaling condition characterizes biomaterials, which are indeed scaled universally over a very wide range of magnifications, from the subcellular energy consumption through mitochondria and respiratory complexes to the largest animals, with the scaling exponent α = 3/4 [27]. The fingerprint of complexity can be found in various fields of biology, showing unified principles of self-organization [28]. Note that mitochondria probably have a key role in this complex behavior of living objects, because the non-mitochondrial respiration scaling factor is lower (α = 2/3), characterizing the simple surface-volume ratio in these processes [29]; however, the robust category of living systems is scaled in a complex manner [30]. Different conditions modify the power function [31], forming various universality classes by self-similarity. Self-organized processes are widely investigated in solid-state reactions (precipitations, phase transitions, aggregations, nucleation, growth, etc.). The theory of phase transition involving simultaneous random nucleation and growth was pioneered by Kolmogorov [32], Johnson and Mehl [33], and Avrami [34]. It is called the Johnson-Mehl-Avrami-Kolmogorov (JMAK) model, and was revised later by others [35] [36]. It describes the kinetics of phase transformation when nucleation is spatially random.
The JMAK theory and one of its formulations, called the Avrami function (AF), were introduced for solids, and later served as mathematical models of different biological processes [37] [38], and even of the DNA replication process [39]. Experimental data [40] [41] [42] [43] prove a certain universality of the Avrami equation in describing real processes, which could be a useful tool for further research [44]. It is generally useful for studying different processes with no known special system parameters, similarly to the critical phenomena of the physical laws near the phase transition [45]. The AF (A(t)) [46] in its most applicable form is:
$$A(t) = 1 - e^{-\kappa t^{n}}$$
where t is the elapsed time of the process, and κ depends linearly on the nucleation rate and on the growth rate by the power of three. The so-called "Avrami constant" (n) was introduced in a simple model with n = 4, and so originally, in solids, it was considered an integer [47].
O. Szasz, A. Szasz
It is interesting that the space-fractal dimension depends on AF [48]. Here the value of n is not necessarily an integer and depends strongly on the processes that are described by it. The fractal dimension and the power law of self-similarity are tightly connected [49]. Experimental data show that the progression of many reactions in biology also follows the AF A(t) with various, non-integer characteristic constants [40]. It was observed universally in different processes from a wide range of structural and dynamical situations of living systems [44] [50] [51] [52].
The non-equilibrium thermodynamical formalism can be applied to a self-organized system of malignancy in space and time [53]. Cancer breaks the network of normal cells, while the cooperative tissue harmony, changed to non-cooperative competitiveness, forms a new complex structure non-linearly far from the thermodynamic equilibrium. Cancer can be described as a dynamical phase transition from healthy to cancerous [54], in clear analogy with phase transitions in lifeless nature. Starting with an avascular situation and forming a dormant microscopic cluster [55], it continues to develop new angiogenetic formations by epithelial-mesenchymal cell transition, induced by bioelectromagnetic forces [56]. The tumor leaves the dormant state by an allometric transformation [57], and the previously almost undetectable phase becomes traceable. An Avrami-like function in time describes its development [58]. This idea was used to show the validity of the Avrami description [59] and was extended to metastases while studying the transition from the avascular appearance of tumorous clusters [60] to the vascular phase, which is the basis of the dissemination of malignant cells [61]. Metastases develop by a first-order phase transition of cells from non-cancerous to metastatic ones [62]. The development of this new phase needs a great amount of energy. The energy dissipates in the system, producing a high rate of entropy development.
The general transport structure (blood-vessel network) of the tissues forms fractals by allometric scaling, including the angiogenetic processes in tumor formation [57]. In oncological applications, the available metabolic transport and the fractal dimensions of the angiogenetic network determine the average survival of a tumor. The average survival of the tumor cells shortens with the growing fractal dimension of the transport network and is modified by some kind of alimentation of the tumor [63]. The tumor growth follows the universal law of scaling [64], which can be used in cancer research [65].
The dynamics of the evolution of cancer produces various phases of the growing structures due to genetic instability, leading to phase transitions [66]. Tumor development operates near the threshold of phase transition, destabilizing the actual structure, making it highly heterogeneous [67], and producing a large variety of random mutations [68] in search of the optimal conditions for further proliferation. Their development is based on competition, a "fight" for individual survival. The optimal strategy is well known in game theory [69], where the mixed strategy forms a Nash equilibrium in the non-cooperative game with random variation behind it [70]. This situation is typical for topological phase transitions [71], where cooperation emerges despite the selfish, non-cooperative individual participating cells [72].
Our objective in this article is to find a parametric description of overall survival which fits the self-organized processes and is able to show the inherent information of survival measurements of cancer patients.

Method
Most of the survival analyses in medical evaluations use the Kaplan-Meier (KM) non-parametric estimator [73] [74], which was developed for incomplete observations. KM is useful to examine the probability of lifetime and the effectiveness of the chosen treatment for lethal diseases like cancer. The probability of an event at a definite point of time $t_i$ is computed as the number of events at $t_i$ divided by the number of patients living at the start of the observation at $t_i$:
$$q_i = \frac{d_i}{n_i}$$
The KM estimator is defined by multiplying the complementary successive survival probabilities of all earlier points of time, obtaining the final estimate:
$$\hat{S}(t) = \prod_{t_i \le t} \left(1 - \frac{d_i}{n_i}\right)$$
where $d_i$ is the number of deaths at the time $t_i$; $t_i$ is a time when at least one death happened in the examined cohort; and $n_i$ is the number of individuals known to survive (not censored, still in the study) up to time $t_i$. Some modifications concern the tails: a pessimistic (short-tailed) approach [75] and an optimistic (fat-tailed) approach [76] are in use, which differ in the survival at the end of the trial.
The best method for mining data could be to parametrize the non-parametric KM survival plot. The description of survival curves by a parametric distribution function is a long-standing effort [77], allowing the optimization of the information from the measured dataset. For the correct parametrization, we have to take an overview of the scientific facts that we can use in the search for the optimal parametrization. The most important result available is the parametric solution that is connected to the spatiotemporal self-organization and the …
Thus, the survival probability distribution (survival function) can be defined by the probability of the lifetime T being higher than t, which can be expressed with the lifetime distribution function $p_L(t)$ in the form of
$$S(t) = P(T > t) = 1 - p_L(t)$$
The density function of the lifetime distribution function is the probability density, $p(t) = dp_L(t)/dt = -dS(t)/dt$; therefore, the average lifetime is:
$$\langle T \rangle = \int_0^\infty t\,p(t)\,dt = \int_0^\infty S(t)\,dt$$
We introduce the death rate $h(t)$ (the "hazard function") so that $h(t)\Delta t$ is the probability that, after surviving for a time t, death occurs within the interval $(t, t + \Delta t)$. Therefore:
$$h(t)\,\Delta t = \frac{S(t) - S(t + \Delta t)}{S(t)}$$
From this:
$$h(t) = -\frac{1}{S(t)}\frac{dS(t)}{dt} = -\frac{d \ln S(t)}{dt}$$
Its cumulative form is
$$S(t) = \exp\left(-\int_0^t h(\tau)\,d\tau\right)$$
Biological systems are strictly self-organized [78]. The inherent property of living objects is self-organization and the consequent self-similarity of the living structures [11], which could be the basis of the proper parametrization of survival.
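The product-limit construction described above can be sketched in a few lines. This is only a minimal illustration under stated assumptions, not the evaluation code of the study: the function name `kaplan_meier`, the (time, event flag) input format and the toy data are hypothetical, and patients censored at the same time as a death are counted as still at risk at that time (the usual convention).

```python
from typing import List, Tuple


def kaplan_meier(times: List[float], events: List[int]) -> List[Tuple[float, float]]:
    """Return (t_i, S(t_i)) at every distinct time with at least one death.

    events[j] is 1 for a death and 0 for a censored observation.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)          # n_i: individuals still in the study
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0                 # d_i: deaths at time t_i
        removed = 0                # deaths + censored leaving at time t_i
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= removed
    return curve


# Hypothetical toy cohort: 7 patients, 4 deaths, 3 censored.
km_curve = kaplan_meier([2, 3, 3, 5, 8, 8, 12], [1, 1, 0, 1, 1, 0, 0])
for t, s in km_curve:
    print(f"t = {t:>4}: S = {s:.4f}")
```

The curve only steps down at times with a death; censored patients merely reduce the number at risk for later steps.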
Taking the self-similarity into consideration, the death rate (failure rate in (8)) must be a self-similar time function [44], mirrored by a scaling like:
$$h(t) = c\,t^{\,n-1}$$
with constants c and n. Its self-similarity is obvious because it gives the same function under magnification m:
$$h(mt) = c\,(mt)^{n-1} = m^{\,n-1}\,h(t)$$
The survival probability distribution function from (9) and (10) is:
$$S(t) = \exp\left(-\frac{c}{n}\,t^{\,n}\right)$$
With the choice $c = n/t_0^{\,n}$, the self-similar death rate (hazard function) is:
$$h(t) = \frac{n}{t_0}\left(\frac{t}{t_0}\right)^{n-1}$$
Substituting (14) with survival (13), we get:
$$-\frac{d \ln S(t)}{dt} = \frac{n}{t_0}\left(\frac{t}{t_0}\right)^{n-1}$$
Hence:
$$S(t) = e^{-\left(t/t_0\right)^{n}}$$
which has two parameters for one curve: $t_0$ is the scale parameter, which is the natural scale of the time-function variation, and n is the shape parameter. Consequently, the lifetime distribution function $p_L(t)$, by (3) and (4), is the well-known AF (A(t)), or the cumulative form of the two-parametric Weibull distribution (W(t)):
$$W(t) = 1 - e^{-\left(t/t_0\right)^{n}}$$
with the additional conditions $t \ge 0$, $t_0 > 0$, $n > 0$. The inverse function, when the time t is calculated from a given probability p, is:
$$t = t_0\left[-\ln(1 - p)\right]^{1/n}$$
There are various parameters characterizing the WF. The shape parameter of WF is usually $n > 1$, giving a sigmoid curve, a form which is also that of a psychometric function [79]. In cases when $n \le 1$ the survival curve has no inflection: it is a simple exponential function at $n = 1$, and its initial decrease becomes more rapid as n decreases.
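The self-similarity of the Weibull death rate and the inverse of the cumulative function can be checked numerically. The following is a minimal sketch, assuming the standard two-parameter forms $S(t) = e^{-(t/t_0)^n}$ and $h(t) = (n/t_0)(t/t_0)^{n-1}$; the function names and the parameter values (n = 1.8, t0 = 20, magnification m = 3) are hypothetical and chosen only for illustration.

```python
import math


def weibull_survival(t: float, n: float, t0: float) -> float:
    """Survival function S(t) = exp(-(t/t0)^n)."""
    return math.exp(-((t / t0) ** n))


def weibull_hazard(t: float, n: float, t0: float) -> float:
    """Death rate h(t) = (n/t0) * (t/t0)^(n-1)."""
    return (n / t0) * (t / t0) ** (n - 1.0)


def weibull_time_at(p: float, n: float, t0: float) -> float:
    """Inverse of W(t) = 1 - S(t): the time at which W reaches p."""
    return t0 * (-math.log(1.0 - p)) ** (1.0 / n)


n, t0, m = 1.8, 20.0, 3.0  # hypothetical values

# Magnifying time by m rescales the hazard by m^(n-1) at every t.
ratio_early = weibull_hazard(m * 2.0, n, t0) / weibull_hazard(2.0, n, t0)
ratio_late = weibull_hazard(m * 30.0, n, t0) / weibull_hazard(30.0, n, t0)

# The median is the time where W(t) = S(t) = 0.5.
median = weibull_time_at(0.5, n, t0)

print(ratio_early, ratio_late, m ** (n - 1.0))
print(median, weibull_survival(median, n, t0))
```

The two hazard ratios agree with each other and with $m^{n-1}$, which is the scale invariance used in the derivation.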
The cumulative Weibull distribution (Weibull function, WF) is highly universal and represents all the features described in the introduction above. The formal identity of WF with the AF in JMAK inherently involves the phase transition approach, and the mechanics follow the tumor kinetics, [59].
The AF and WF have been used for a long time for survival/reliability description. Originally Weibull's statistics was developed to describe the fracture of brittle materials [80], [81] and to calculate the probability of the damage-free survival of the given material. It can be derived from geometric scale invariance (fractal organized structures) by physical principles, [82] in mechanical mills. It is frequently applied in the study of mechanical fatigue and failure [83].
The fit of WF to the non-parametric KM is completely rigorous when a strictly homogeneous cohort of patients is investigated, with unified equivalence of the participating individuals followed until decease or censoring. This grouping selection apparently limits the applicability of WF. The parametrization of aging and natural death has no such grouping selection; it is related to every human being and their survival. The epidemiological studies in gerontology refer to the Gompertz distribution [84]. The Gompertz function (GF) is a function of time. When G(t) represents the number of individuals in the given period of time t, and $G_0$ is the number of subjects at the start of the counting time, then GF is:
$$G(t) = G_0 \exp\left[-b\left(e^{at} - 1\right)\right]$$
The parameters a and b are positive; a is connected to the growth, while b is connected to the displacement in the variable t. GF is also a two-parameter function, similarly to WF with its n and $t_0$.
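To make the behavior of the Gompertz form concrete, the sketch below evaluates a Gompertz survival curve and its death rate. It assumes the common Gompertz law with exponentially growing hazard, $h(t) = ab\,e^{at}$ and $G(t) = G_0 \exp[-b(e^{at} - 1)]$; this particular parametrization, the function names and the numerical values are assumptions made for illustration and may differ from the exact form used in the cited works.

```python
import math


def gompertz_survival(t: float, a: float, b: float, g0: float = 1.0) -> float:
    """Gompertz survival G(t) = G0 * exp(-b * (exp(a t) - 1)) (assumed form)."""
    return g0 * math.exp(-b * (math.exp(a * t) - 1.0))


def gompertz_hazard(t: float, a: float, b: float) -> float:
    """Death rate of the assumed form: h(t) = a * b * exp(a t)."""
    return a * b * math.exp(a * t)


a, b, m = 0.1, 0.05, 2.0  # hypothetical values

# Unlike a power-law hazard, the effect of magnifying time by m
# depends on where on the time axis the magnification is applied.
ratio_at_5 = gompertz_hazard(m * 5.0, a, b) / gompertz_hazard(5.0, a, b)
ratio_at_20 = gompertz_hazard(m * 20.0, a, b) / gompertz_hazard(20.0, a, b)

print(gompertz_survival(0.0, a, b), ratio_at_5, ratio_at_20)
```

With these values the ratio $h(mt)/h(t) = e^{a(m-1)t}$ grows with t, so the exponential hazard has no single scaling exponent.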
Historically, WF was first used to characterize the aging of non-living components and machinery (reliability), while GF was initially developed for the aging of living objects [85]. With the development of statistical methods, both the Weibull and Gompertz distributions soon came to be applied to the description of tumor development and cancer death.
The comparison of the two distributions shows that the best fit of GF is (…) and the best fit of WF is (…), where SE is the standard error of the regression estimate, minimizing the sum of squares of the differences between the measured and estimated data pairs. Due to their applicability, the Gompertz and Weibull distributions are both commonly used in biological and engineering reliability investigations [86] [87].
The study of the Gompertzian distribution for tumors supports the hypothesis that the fractal structure weakens and, in the end, disappears with the growth of the tumor [88]. In general, tumor growth follows a universality [64] [89], which favors the use of the WF. The clear fitting of allometric scaling by the fractal structure of the tumor [64] shows the validity of the allometry not only in tumor growth but also in the growth of the axillary lymph node involvement in breast cancer [90]. In consequence, we choose to use the WF for modelling the KM plot of the overall survival.
The Gompertz distribution can be obtained by the reduction of the generalized exponential Weibull distribution [91], which is formulated in a more general form that allows both distributions to be derived from a single one [92], and it is applied to survival data with good results.
The GF does not satisfy the self-similarity (formulated in (11)), and therefore it is not in harmony with self-organizing biological dynamics, which is a definite character of harmonized biological development [2]. This might be the reason why the WF describes the intrinsic causes of age-related mortality better (following the homeostasis in the healthy aging process), while the Gompertz distribution reflects the extrinsic factors [93]. Due to the self-similarity of WF, we expect that the self-organized biological development of tumors, which develop intrinsically in the healthy environment from which they derive, prefers the WF for an accurate description of the KM in malignant diseases. A further support for the primary importance of the Weibull distribution is that it is derived from the ontological law, and so it is directly connected to the self-organized structure of living matter [28]. The self-similarity, as the basic fingerprint of self-organization, is not valid in the Gompertz distribution. The "mystery" of the Gompertz function is probably the equilibrium between the predictable and unpredictable (chaotic) dynamisms [94]. Contrary to the exponential origin of GF, the self-similar (power-function) origin of WF suggests some parallels with the opposite pictures of fractal-like organizations and general scale-free (small-world [95]) large networks (exponential function). Despite the structural preference of WF, GF also fits well to allometry, represented by a power function [96], as shown in the development of rats [97]. Although WF fits very well to the growth function of the general ontogenic model, using the data for rat [98] ($r^2 = 0.99965$, SE = …), the fit of GF shows the same result ($r^2 = 0.99967$, SE = …) for the same allometric curve. The difference is negligible in this regime of development. In the case of animals with larger masses, the difference is also not significant. It is subtle, and only slightly favors the WF as the best regression fit to the allometric scaling result, using the available data from [98]. (The best WF and GF fits to allometry for cow are ($r^2 = 0.99978$, SE = 1.021) and ($r^2 = 0.99972$, SE = …), respectively.) WF is successfully applied to living processes as the psychological function [99], describing the sensing processes well in connection with the Weber-Fechner law [100], establishing psychometry [101]. Lifetime estimations are frequently approached by WF [86], and WF is also successfully used for clustering gene expression [102].
WF describes the non-parametric KM plot with appropriate accuracy in gerontology (…) to describe the natural death at the end of life. Cancer death was also described by WF with a time-dependent shape factor, using a similarity between the fracture survival of brittle materials and the specific survival characteristics of a cohort of cancer patients [107] [108]. In this model the shape factor depends linearly on time and gives a surprisingly accurate fit to the data from the cancer registers.
Due to their self-similar behavior, fractals can be used for modeling cancer [109], and the KM survival plot divided significantly by fractal dimension shows the prognostic value of the fractal analysis well [110]. Consequently, it is possible to evaluate the various images in oncology by the fractal structure, and these images can be characterized by the Weibull distribution as well [111].
International Journal of Clinical Medicine
Due to the self-similarity, the parametric distribution generally fits well with the KM plot, and so it is successfully used in oncology [112] [113]. The application of the parametric WF approximating the survival curve is a standard approach for the evaluation of clinical trial data, and so it is established theoretically and practically [34] [86] [99] [114]. Comparing various parametric fits to the KM survival plot, the WF was the most accurate [115]. The model was used to analyze the prognostic factors of the survival of cancer patients, and it was proved in a large retrospective analysis with n = 746 gastric cancer cases [116].
Summarizing the above, self-organization and self-similarity are universal laws fingerprinted in the fractal description and can be described by the cumulative WF.

Results
The characterization of WF has four special points: the value at $t_0$, the mean, … The death rate (9) is constant when n = 1 (or β = 0, which means the parameter has no effect on the hazard); it is increasing when n > 1 (meaning the event becomes more likely to occur) and decreasing when n < 1 (meaning the event becomes less likely to occur). The limit … The data at the particular points vs. n are shown in Figure 5. The mode changes rapidly in the interval $1 < n < 2$, so reading it accurately is difficult; therefore the median and mean are proposed to reestablish the entire WF.
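The special points of a two-parameter Weibull curve have standard closed forms, which the following minimal sketch evaluates for a few shape parameters. It assumes $W(t) = 1 - e^{-(t/t_0)^n}$; the mode of the density is taken as the inflection point of the cumulative curve, and the function names and the listed n values are hypothetical choices for illustration.

```python
import math


def weibull_mean(n: float, t0: float) -> float:
    """Mean lifetime: t0 * Gamma(1 + 1/n)."""
    return t0 * math.gamma(1.0 + 1.0 / n)


def weibull_median(n: float, t0: float) -> float:
    """Median lifetime: t0 * (ln 2)^(1/n)."""
    return t0 * math.log(2.0) ** (1.0 / n)


def weibull_mode(n: float, t0: float) -> float:
    """Mode of the density (inflection of W); it sits at t = 0 for n <= 1."""
    return t0 * ((n - 1.0) / n) ** (1.0 / n) if n > 1.0 else 0.0


# W(t0) does not depend on n: W(t0) = 1 - 1/e.
VALUE_AT_T0 = 1.0 - math.exp(-1.0)

t0 = 1.0
for n in (1.2, 2.0, 3.35, 5.0):
    print(
        f"n = {n:4.2f}: mode = {weibull_mode(n, t0):.4f}, "
        f"median = {weibull_median(n, t0):.4f}, mean = {weibull_mean(n, t0):.4f}"
    )
print(f"W(t0) = {VALUE_AT_T0:.4f}")
```

Printing the table for several n values makes it easy to see how quickly the mode moves for shape parameters just above 1 compared with the median and the mean.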
However, at $n \approx 3.35$ and $t_0 = 1$ the values of mode, mean and median are practically identical, so the WF can be characterized with a single parameter. Increasing $t_0$ does not lead to a significant change of the situation, so in virtually every case we may approach WF with only one parameter above $n \approx 3.35$. In conclusion from the above, the parametric regression of KM is universally determined by two parameters (the shape parameter (n) and the scale parameter ($t_0$) of WF), due to the basic behaviors of living processes: their self-organization and self-similarity, which are characterized well by their spatio-temporal fractal structure. When a clinician tries to describe the main information of the KM survival curves, the median value of survival is taken into account as a significant parameter characterizing the actual survival result. This is, in fact, an automatic characterization of the non-parametric estimation by a single parameter. However, the median alone cannot characterize the long tail of the KM plot; it does not consider the history of the patients in the remaining second half of the cohort, which could be essential for measuring the "cured" fraction [119]. Studying the median alone disregards the real measurable success at the end of the study.
To correct this "mistake", the average (mean) of the KM non-parametric distribution is considered. The mean is affected more by the "tail" of the distribution, so it gives a more accurate idea of the cure rate. The median carries more of the information about how rapidly patients are lost, while the mean carries more of the information about the length of the effect in the high-success patients, Figure 6.
Sometimes the inflection of KM is studied too, as the death rate in the study is highest at that point. All three are important for characterization, but only two of them are independent, and the third can be calculated from the chosen two. The distribution curve must be characterized by at least two parameters.
Two of the three noteworthy points (median, mean, inflection) of the KM may parametrize the non-parametric plot. Measuring or guessing these characteristic points (mainly the median and mean) is a standard comparison of the KM-plots and usually accepted as the result of the actual study. These points really characterize the non-parametric distribution and give the possibility to parametrize, so, in fact, this is a "hidden" parameterization of the KM plot by WF.
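How two characteristic points fix the curve can be sketched as follows: for a two-parameter Weibull distribution the ratio mean/median depends on n only, so n can be found by bisection and $t_0$ then follows from the median. This is a minimal sketch under stated assumptions, not the authors' procedure: the function names are hypothetical, and the search is restricted to the bracket 0.2 ≤ n ≤ 6, where the ratio decreases monotonically with n (outside it the solution need not be unique).

```python
import math


def mean_over_median(n: float) -> float:
    """Ratio mean/median of a Weibull distribution; independent of t0."""
    return math.gamma(1.0 + 1.0 / n) / math.log(2.0) ** (1.0 / n)


def weibull_from_median_mean(median: float, mean: float,
                             n_lo: float = 0.2, n_hi: float = 6.0):
    """Recover (n, t0) from the median and mean by bisection on n."""
    target = mean / median
    if not (mean_over_median(n_hi) <= target <= mean_over_median(n_lo)):
        raise ValueError("mean/median ratio is outside the bracketed range")
    for _ in range(200):
        n_mid = 0.5 * (n_lo + n_hi)
        if mean_over_median(n_mid) > target:
            n_lo = n_mid   # ratio still too large: the solution has larger n
        else:
            n_hi = n_mid
    n = 0.5 * (n_lo + n_hi)
    t0 = median / math.log(2.0) ** (1.0 / n)
    return n, t0


# Round trip with hypothetical parameters n = 1.5, t0 = 10.
true_n, true_t0 = 1.5, 10.0
median = true_t0 * math.log(2.0) ** (1.0 / true_n)
mean = true_t0 * math.gamma(1.0 + 1.0 / true_n)
n_est, t0_est = weibull_from_median_mean(median, mean)
print(n_est, t0_est)
```

In practice the median and mean read from a KM plot carry measurement error (and the KM mean is truncated at the end of follow-up), so the recovered parameters are only a starting approximation.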
The regression is shown in Figure 8. Note that this approach is less precise than the direct function fit, because the double logarithm suppresses the accuracy in the real KM fit.
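The double-logarithmic regression can be sketched as an ordinary least-squares line. For a two-parameter Weibull survival, $\ln[-\ln S(t)] = n \ln t - n \ln t_0$, so the slope gives n and the intercept gives $t_0$. The sketch below is an assumption-laden illustration: the function name is hypothetical, the data are synthetic, and whether this linearization coincides exactly with the paper's equation (22) is assumed rather than confirmed.

```python
import math
from typing import List, Tuple


def fit_weibull_loglog(times: List[float], survival: List[float]) -> Tuple[float, float]:
    """Least-squares line through (ln t, ln(-ln S)); returns (n, t0)."""
    xs, ys = [], []
    for t, s in zip(times, survival):
        if t > 0.0 and 0.0 < s < 1.0:      # the double log needs 0 < S < 1
            xs.append(math.log(t))
            ys.append(math.log(-math.log(s)))
    k = len(xs)
    mean_x = sum(xs) / k
    mean_y = sum(ys) / k
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    n = slope
    t0 = math.exp(-intercept / n)
    return n, t0


# Synthetic, noise-free survival values with hypothetical n = 1.7, t0 = 12.
times = [float(t) for t in range(1, 31)]
survival = [math.exp(-((t / 12.0) ** 1.7)) for t in times]
n_fit, t0_fit = fit_weibull_loglog(times, survival)
print(n_fit, t0_fit)
```

On noisy KM data the transform stretches the early and late parts of the curve, so the points near S = 1 and S = 0 dominate the line; this is one way to understand why the direct fit is more precise.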
However, the obvious deviation of the regressions from the measured OS is in … Even when extra care is taken to have a homogeneous cohort, the time limit of the study at least forms a group of patients who had no event (or are not censored). The "remaining" patients in the given treatment study have the highest benefit from the performed treatment, or they were in a definitely different condition when they were selected into the cohort. We call this group the "remained group" (RG) due to the lack of proof of complete recovery. However, this group is sometimes regarded (incorrectly) as a cured fraction (according to the endpoints of the study). In a rigorous approach the disease-free survival (DFS) has to be compared with the matched healthy control group, and the cure rate must be decided on this comparison [120]. An alternative way to determine the group of "cured" patients and the connected value of the "cure" time is to find when the hazard rate of the studied group corresponds to the hazard in the general population [121]. When it fits, we may talk about the real cure rate, which does not mean that an event cannot happen due to reasons independent of the investigated disease.
The KM curve in an RG situation obviously does not fit the strict WF, which must decrease to a zero cumulative probability. When the ratio of the remaining individuals is $c_{RG} = n_{RG}/N$, the KM plot can be approximated with reasonable accuracy by the weighted sum of two WFs. In the RG fraction, the time parameter is longer than in the fraction of patients having an event or censored.
In this case, the time parameter of the long-survival WF component is practically infinite (compared to the time length of the study):
$$t_0^{(RG)} \to \infty, \qquad e^{-\left(t/t_0^{(RG)}\right)^{n}} \to 1$$
In this case, the correction by the surviving fraction of the patients is constant.
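A constant surviving fraction can be sketched as a plateau added to a single Weibull survival curve. In the sketch below the symbol c for the constant fraction, the (1 - c) weight (chosen so that the curve starts at 1) and the numerical values are assumptions for illustration; the function name is hypothetical.

```python
import math


def survival_with_remained_group(t: float, n: float, t0: float, c: float) -> float:
    """Weibull survival with a constant remained-group fraction c.

    S(t) = (1 - c) * exp(-(t/t0)^n) + c, so S(0) = 1 and S(t) -> c for large t.
    """
    return (1.0 - c) * math.exp(-((t / t0) ** n)) + c


n, t0, c = 1.3, 15.0, 0.12  # hypothetical values
curve = [survival_with_remained_group(float(t), n, t0, c) for t in range(0, 201, 20)]
for t, s in zip(range(0, 201, 20), curve):
    print(f"t = {t:>3}: S = {s:.4f}")
```

Varying c between 0 and 1 moves the plateau of the curve without changing the shape of the decaying part.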
Denoting the constant correction c, the plot will be composed by this:
$$S(t) = (1 - c)\,e^{-\left(t/t_0\right)^{n}} + c$$
The variation of c shows different fitting functions, Figure 9.
The curative effect of the treatment, obtained by making a WF fit to the non-parametric KM survival, can be characterized with the Shannon entropy. Entropy measures the information carried by the probability density function (pdf), $p(t, n, t_0)$. The quantity of information is $I(t, n, t_0) = -\ln p(t, n, t_0)$, which is realized with $p(t, n, t_0)$, so the complete information from the system, the classical Shannon entropy [122], is:
$$S_{Sh} = -\int_0^\infty p(t, n, t_0)\,\ln p(t, n, t_0)\,dt$$
A higher entropy shows less information (more uncertainty). When an event has a lower probability to occur, it carries more information, so its Shannon entropy is lower than that of frequently occurring events. The expectation of a random variable is characterized by this entropy, so in this sense it is a direct analog of the entropy definition in physics (statistical thermodynamics). When the informational entropy decreases (its change becomes negative), it means that the probability distribution differs from the uniform distribution, concentrating on some data.
The entropy growth in physics usually happens when the system approaches equilibrium, while for a pdf the increase of entropy shows a lack of information, when the average rate of information produced by the stochastic source of the data decreases.
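The Shannon entropy (28) measures the diversity of the probability density function (pdf) behind WF (in fact the derivative of WF). It is a sum of the n and $t_0$ dependent parts:
$$S_{Sh} = \left[1 + \gamma\left(1 - \frac{1}{n}\right) - \ln n\right] + \ln t_0$$
and γ is the Euler-Mascheroni constant:
$$\gamma = \lim_{k \to \infty}\left(\sum_{j=1}^{k}\frac{1}{j} - \ln k\right) \approx 0.5772$$
The special points of this entropy function are: …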
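The split of the entropy into an n-dependent and a $t_0$-dependent part can be checked against a direct numerical integration of $-\int p \ln p\,dt$. The sketch below uses the standard closed form of the differential entropy of a two-parameter Weibull density; the function names, the midpoint-rule integrator and the parameter values are assumptions for illustration, and the numerical check is only meant for n > 1, where the density is bounded.

```python
import math

EULER_GAMMA = 0.5772156649015329


def weibull_entropy(n: float, t0: float) -> float:
    """Closed form: [1 + gamma*(1 - 1/n) - ln n] + ln t0."""
    return 1.0 + EULER_GAMMA * (1.0 - 1.0 / n) - math.log(n) + math.log(t0)


def weibull_entropy_numeric(n: float, t0: float, steps: int = 100000) -> float:
    """Midpoint-rule estimate of -integral(p ln p) up to where S = exp(-40)."""
    t_max = t0 * 40.0 ** (1.0 / n)
    dt = t_max / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * dt
        x = (t / t0) ** n
        p = (n / t) * x * math.exp(-x)   # Weibull density at t
        if p > 0.0:
            total -= p * math.log(p) * dt
    return total


n, t0 = 1.5, 10.0  # hypothetical values
closed = weibull_entropy(n, t0)
numeric = weibull_entropy_numeric(n, t0)
print(closed, numeric)
```

For n = 1 the closed form reduces to $1 + \ln t_0$, the known entropy of an exponential distribution with mean $t_0$, which is a quick sanity check of the formula.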

Discussion
To demonstrate the parametrization, we use a large number of patients (1180 individuals) with various tumors treated by numerous standard therapies, but having one thing in common: they were treated by complementary modulated electro-hyperthermia (mEHT) when the standard treatments failed to deliver the desired results [123] [124]; Figure 10. WF fits with an acceptable accuracy; the largest deviation is less than 0.007 (0.7%). Note that the result differs depending on whether we fit by minimizing the deviation of the curves (SE) or by maximizing the square of the Pearson correlation between the measured values x and the fitted values y,
$$r^2 = \frac{\left(\langle xy \rangle - \langle x \rangle \langle y \rangle\right)^2}{\left(\langle x^2 \rangle - \langle x \rangle^2\right)\left(\langle y^2 \rangle - \langle y \rangle^2\right)}$$
(where the bracket means the mean of the variable). The obvious difference is due to the different meaning of fit. Minimizing SE minimizes the difference between the curves, while maximizing $r^2$ minimizes the shape difference (maximizes the similarity) of the curves. A comparison with the Shannon entropy shows more certainty (less uncertainty), by about 6%, in the regression minimizing SE than in the one maximizing $r^2$. In the following, unless we note otherwise, we use the minimal SE regression.
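The different meaning of the two fit criteria can be illustrated with a small numerical example. The sketch below takes SE as the root-mean-square residual (an assumption; the degrees-of-freedom convention of the study is not stated) and computes $r^2$ from means of products; the function names and the data are hypothetical.

```python
import math
from typing import List


def standard_error(observed: List[float], fitted: List[float]) -> float:
    """Root-mean-square difference between the two curves (assumed SE)."""
    k = len(observed)
    return math.sqrt(sum((o - f) ** 2 for o, f in zip(observed, fitted)) / k)


def pearson_r2(observed: List[float], fitted: List[float]) -> float:
    """Squared Pearson correlation written with means <.> of the variables."""
    k = len(observed)
    mx = sum(observed) / k
    my = sum(fitted) / k
    mxy = sum(o * f for o, f in zip(observed, fitted)) / k
    mxx = sum(o * o for o in observed) / k
    myy = sum(f * f for f in fitted) / k
    return (mxy - mx * my) ** 2 / ((mxx - mx * mx) * (myy - my * my))


# Hypothetical survival values and a "fit" that has the right shape
# but is shifted down by a constant 0.05.
observed = [1.00, 0.80, 0.50, 0.20]
fitted = [o - 0.05 for o in observed]

se = standard_error(observed, fitted)
r2 = pearson_r2(observed, fitted)
print(se, r2)
```

The shifted curve has a perfect shape agreement ($r^2$ of 1 up to rounding) while SE stays at 0.05, so a fit optimized for one criterion is not necessarily optimal for the other.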
The fit is accurate, with no more than 1% difference at any compared point of the curves, but it is less accurate at the end of the observed time, due to the RG group of the patients. The deviation could be reduced by applying the RG principle of (26), Figure 13. The entropy $S_{Sh} = 4.543$, which is 2.5% higher, mirrors the RG part of the patient distribution.
The parametric decomposition gives a better fit by two WFs according to (24), Figure 14. For an easier calculation of the WF fractions (components) of the KM plot, we may use the logarithmic evaluation of the survivals, which modifies the grouping more than the above decomposition. A linear fit to the logarithmic form of the KM plot of Figure 10, according to (22), shows rather large deviations at the start and at the end of the curve, Figure 15.
The original WF fit shown in Figure 12(a) and the linear fit from the logarithmic approach of Figure 15 differ from each other, Figure 16. The deviation of the logarithmic fit is more than double in some intervals, so the direct fit of WF to KM is more accurate. Despite the inaccuracy of the logarithmic evaluation, it has the great advantage of suggesting the subgroups of the patients by an optimal decomposition of the KM plot. The logarithmic curve in Figure 15 shows three well-distinguishable parts, for which the linear fit is accurate, and divides the original KM into three subgroups, Figure 17.
The WF fits to the three parts of the KM in Figure 17 are shown in Figure 18.
The logarithmic fit by (22) shows different results than the direct fit. The reason is simply that the logarithmic fit considers only a part of the whole curve and fits to that; consequently, the accurate fit to that part of the KM will not fit the other parts at all when the logarithmic curve is approached in separate parts.
The observed KM, of course, considers all the patients. The overlapping fits from the logarithmic approach modify the KM plot. Consequently, only the fit to the original KM plot is relevant.
However, the logarithmic analysis is very useful for detecting the subgroups of the patients. It became clear that the survival contains three subgroups, Figure 17. Consequently, three partitions of the KM curve (Figure 10) would give a more accurate fit than the RG (Figure 13) or the two-group decomposition (Figure 14) allowed. This fit to KM is very accurate; the deviation remains under 0.0005 in the complete fit, Figure 19.

Conclusions
We have shown the applicability of the two-parameter cumulative Weibull distribution for approximating the non-parametric Kaplan-Meier plot with high accuracy. We have shown the universality of the Weibull approach based on the general behaviors of living organisms, including cancer-tissue development. Self-organization and self-similarity, with their consequences, determine the strict connection of the parametric approach with the experimental non-parametric observations. Informational entropy allows the subgroups in a general set of patients to be distinguished by their overall survival. We have demonstrated that applying the two-parameter WF provides a sufficient fit to the non-parametric KM survival curve in a real case of 1180 patients suffering from various malignant diseases. Two of the three characteristic parameters of the KM plot (namely the points of median, mean or inflection) are enough to reconstruct the parametric fit.
In summary, Weibull parametric distribution with satisfactory refinement can accurately approximate a KM survival plot with surviving individuals at the endpoint of the study.