Lifetime analyses frequently fit a parametric function to the measured Kaplan-Meier (KM) non-parametric estimate of the survival probability. The cumulative Weibull distribution function (WF) is the primary choice to parametrize the KM, but some others (e.g. Gompertz, logistic functions) are also widely applied. We show that the cumulative two-parameter Weibull function meets all requirements. The Weibull function is a consequence of the general self-organizing behavior of survival, and consequently shows a self-similar death rate as a function of time. The ontogenic universality as well as the universality of tumor growth fits the WF. WF parametrization needs two independent parameters, which can be obtained from the median and mean values of the KM estimate, giving an easy parametric approximation of the KM plot. The entropy of the distribution and the other entropy descriptions support the validity of the parametrization well. The goal is to find the most appropriate way of mining the inherent information in KM plots. The two-parameter WF fits the non-parametric KM survival curve in a real study of 1180 cancer patients, offering a satisfactory description of the clinical results. Two of the three characteristic parameters of the KM plot (namely the points of median, mean or inflection) are enough to reconstruct the parametric fit, which supports the comparison of survival curves of different patient groups.

The driving force of the overall spontaneous progressions in nature is the attempt to minimize the actual energy and maximize the entropy in the actual processes. In this sense, life follows the basic thermodynamic laws: the living process continuously “burns” the incoming “nutrition”. Only the energy-pump of the incoming sun-energy makes the difference: creates original gradients which are later divided into other inhomogeneities by spontaneous processes.

The life process tries to diminish the working energy of the sunlight by increasing the overall entropy of the environment. The living process lowers the electron energy by oxidation, producing outgoing (waste) final “products”. The gradual loss of electron energy of the “nutrition” molecules is the energy that sustains life. Simply speaking, the living process is a dissipative entropy producer. As the Nobel laureate physiologist A. Szent-Györgyi states, “Life is nothing but an electron looking for a place to rest” [

Living objects are open systems among various environmental surroundings, adapting themselves to the conditions around, forming self-organized structures [

The invariance of magnification (scale invariance, when the up or down magnification shows similar structures) is the form of self-similarity, which is a typical consequence of the self-organizing processes, [

Random stationary, stochastic, self-organizing processes form dynamic behaviors [

· life is complexly organized in a wide range of magnification and different levels of interactions,

· life is self-regulated with various feedback processes,

· the living systems are open, dissipative objects with multilevel interactions with the environment,

· the activity of life processes has intensive cross-talks of different levels of its organization,

· the specific forms and properties are complexly environment dependent

These points are important for the universality of life, for the dynamic fluctuations and scaling too [

The above complex biological processes connect to the biological allometry, scaling, non-equilibrium, and non-linear thermodynamics. Special self-similarity characterizes the mass-allometry by universal scaling, and it appears in a large category of living structures and processes [

Self-organized processes are widely investigated in solid-state reactions (precipitations, phase-transitions, aggregations, nucleation, growth, etc.). The theory of phase-transition involving simultaneous random nucleation and growth was pioneered by Kolmogorov [

The AF (A(t)) [

$$\ln\bigl(-\ln(1 - A(t))\bigr) = n \ln(t) + \ln\kappa, \qquad A(t) = 1 - \exp(-\kappa t^{n}) \tag{1}$$

where t is the elapsed time of the process, and κ depends linearly on the nucleation rate and on the third power of the growth rate. The so-called “Avrami constant” (n) was n = 4 in the original simple model, and so in solids it was originally considered an integer [

The non-equilibrium thermodynamical formalism could be applied to a self-organized system of malignancy in space and time [

The general transport structure (blood-vessel network) of the tissues forms fractals by allometric scaling, including the angiogenetic processes in tumor formation [

The dynamics of the evolution of cancer produces various phases of the growing structures due to the genetic instability, leading to phase transitions [

Our objective in this article is to find a parametric description of overall survival that fits the self-organized processes and is able to show the inherent information in the survival measurements of cancer patients.

Most of the survival analyses in medical evaluations use the Kaplan-Meier (KM) non-parametric estimator [

$$(\text{Probability at the actual time of observation}) = \frac{(\text{Number of participants living at the start of observation}) - (\text{Number of participants who died or were censored during the observation})}{(\text{Number of participants living at the start of observation})}$$

KM estimator is defined by multiplying the above described successive probabilities by any earlier point of time obtaining the final estimate:

$$KM(t) = \prod_{t_i \le t} \left(1 - \frac{d_i}{n_i}\right) \tag{2}$$

where $d_i$ is the number of deaths at the time $t_i$; $t_i$ is a time when at least one death happened in the examined cohort, and $n_i$ is the number of individuals known to survive (not censored, still in the study) at time $t_i$. Some modifications were made in the tails (pessimistic approach when short-tailed) [
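The product-limit formula (2) can be sketched in a few lines. The following is a minimal illustration in Python, not the procedure used in the study; the function name `kaplan_meier`, the toy data, and the convention that individuals censored at a given time still count as being at risk at that time are assumptions made for the example.

```python
def kaplan_meier(times, events):
    """Return [(t_i, KM(t_i))] at every time with at least one death.

    times  : observed time of each individual (death or censoring)
    events : 1 if the individual died at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)          # n_i: individuals still in the study
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0                 # d_i: deaths at time t_i
        leaving = 0                # deaths plus censorings at time t_i
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= leaving
    return curve


# Hypothetical toy cohort of five individuals (not data from the study).
times = [1, 2, 2, 3, 4]
events = [1, 1, 0, 1, 0]
curve = kaplan_meier(times, events)
```

Censored individuals never produce a step in the curve; they only reduce the number at risk for later event times.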

The best method for mining data could be when the non-parametric KM survival plot can be parameterized. The description of survival curves by parametric distribution function is a long-term effort [

The parametrization of survival measures that we use relies on the universality of life and considers its self-organized self-similarity. The progression of life involves non-linear and non-equilibrium thermodynamical consequences, including the fractal description and processes similar to the phase transitions in non-living systems. To calculate the survival time, let T (the lifetime) be the stochastic variable defined on the set of individuals. The lifetime distribution function is the probability of the lifetime being less than or equal to t, namely

$$p_L(t) = P\{T \le t\} \tag{3}$$

Thus, the survival probability distribution (survival function) can be defined by the probability of the T lifetime being higher than t, that can be expressed in the form of

$$p_S(t) = 1 - p_L(t) = P\{T > t\} \tag{4}$$

The density function of the lifetime distribution function is the probability density

$$f(t) = \frac{d p_L(t)}{dt} \tag{5}$$

therefore, the average lifetime is:

$$\langle T \rangle = \int_0^\infty t f(t)\,dt = \int_0^\infty p_S(t)\,dt \tag{6}$$

The death rate h(t) (also called the “hazard function”) is introduced so that h(t)Δt is the probability that, given survival up to time t, death occurs by (t + Δt). This probability is

$$h(t)\,\Delta t = 1 - \frac{p_S(t + \Delta t)}{p_S(t)} = -\frac{\frac{d p_S(t)}{dt}}{p_S(t)}\,\Delta t = -\frac{\frac{d[1 - p_L(t)]}{dt}}{p_S(t)}\,\Delta t = \frac{f(t)}{p_S(t)}\,\Delta t \tag{7}$$

From this:

$$h(t) = -\frac{\frac{d p_S(t)}{dt}}{p_S(t)} = \frac{f(t)}{p_S(t)} \tag{8}$$

Its cumulative form is

$$H(t) = \int_0^t h(\tau)\,d\tau = -\ln(p_S(t)) \tag{9}$$

or

$$p_S(t) = e^{-H(t)} \tag{10}$$

Biological systems are strictly self-organized [

Taking the self-similarity into consideration, death-rate (failure rate in (8)) must be a self-similar time function [

$$h(t) = \alpha t^{\beta} \tag{11}$$

Its self-similarity is evident because magnifying the time by a factor m reproduces the same function up to a constant factor:

$$h(mt) = \alpha (mt)^{\beta} = m^{\beta} \alpha t^{\beta} = m^{\beta} h(t) \tag{12}$$

The survival probability distribution function from (9) and (10) is:

$$p_S(t) = e^{-\int_0^t h(\tau)\,d\tau} \tag{13}$$

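The cumulative hazard of the self-similar death rate (11) is:

$$H(t) = \int_0^t \alpha \tau^{\beta}\,d\tau = \frac{\alpha}{\beta + 1}\, t^{\beta + 1} \tag{14}$$

Substituting (14) into the survival function (13), we get:

$$p_S(t) = e^{-\int_0^t \alpha \tau^{\beta}\,d\tau} = e^{-\frac{\alpha}{\beta + 1} t^{\beta + 1}} \tag{15}$$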

Introducing

$$t_0 = \left(\frac{n}{\alpha}\right)^{1/n} \quad \text{and} \quad n = \beta + 1 \tag{16}$$

Hence:

$$p_S(t) = e^{-\left(\frac{t}{t_0}\right)^{n}} \tag{17}$$

which has two parameters for one curve: $t_0$ is the scale parameter, the natural scale of the time-function variation, and n is the shape parameter. Consequently, the lifetime distribution function $p_L(t)$, by (3) and (4), is the well-known AF (A(t)), or the cumulative form of the two-parameter Weibull distribution (W(t)):

$$p_L(t) = A(t) = W(t) = 1 - e^{-\left(\frac{t}{t_0}\right)^{n}} = 1 - p_S(t) \tag{18}$$

with the additional conditions $t \ge 0$, and $A(t) = W(t) = 0$ when $t < 0$. The inverse function, which gives the time t for a given probability p, is:

$$t = W_{inv}(p) = t_0 \left(-\ln(1 - p)\right)^{1/n} \tag{19}$$
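The chain from the power-law hazard (11) to the Weibull form (17), and the inverse (19), can be checked numerically. The sketch below is only an illustration: the parameter values `alpha = 0.05` and `beta = 0.5` are arbitrary assumptions, and the midpoint-rule integration is a choice made for the example.

```python
import math


def hazard(t, alpha, beta):
    """Self-similar death rate h(t) = alpha * t**beta, Eq. (11)."""
    return alpha * t ** beta


def survival_from_hazard(t, alpha, beta, steps=20000):
    """p_S(t) = exp(-integral of h), Eq. (13), by the midpoint rule."""
    dt = t / steps
    cumulative = sum(hazard((k + 0.5) * dt, alpha, beta) for k in range(steps)) * dt
    return math.exp(-cumulative)


def weibull_survival(t, t0, n):
    """Closed form p_S(t) = exp(-(t/t0)**n), Eq. (17)."""
    return math.exp(-((t / t0) ** n))


def weibull_inverse(p, t0, n):
    """Time at which the lifetime distribution W(t) reaches p, Eq. (19)."""
    return t0 * (-math.log(1.0 - p)) ** (1.0 / n)


# Illustrative (assumed) hazard parameters, converted with Eq. (16).
alpha, beta = 0.05, 0.5
n = beta + 1.0
t0 = (n / alpha) ** (1.0 / n)

numeric = survival_from_hazard(10.0, alpha, beta)
closed = weibull_survival(10.0, t0, n)
round_trip = weibull_inverse(1.0 - closed, t0, n)   # should return 10.0
```

The round trip through `weibull_inverse` recovers the original time because (19) is the exact inverse of (18).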

There are various parameters characterizing the WF from the time of development independently. The shape parameter of WF is usually n > 1 , following a sigmoid curve, which form is a psychometric function [

The cumulative Weibull distribution (Weibull function, WF) is highly universal and represents all the features described in the introduction above. The formal identity of WF with the AF in JMAK inherently involves the phase transition approach, and the mechanics follow the tumor kinetics, [

The AF and WF have been used for a long time for survival/reliability description. Originally Weibull’s statistics was developed to describe the fracture of brittle materials [

The fit of WF to the non-parametric KM is completely rigorous when a strictly homogeneous cohort of patients is investigated, with unified equivalence of the participating individuals followed until the decease or censoring. This grouping selection apparently limits the applicability of WF. The parametrization of the aging and natural death has no such grouping selection, it is related to every human being and their survival. The epidemiological studies in gerontology refer to the Gompertz-distribution, [

$$G(t) = G_0 \exp\bigl(-a \cdot (\exp(b \cdot t) - 1)\bigr) \tag{20}$$

The parameters a and b are positive; a is connected to the growth, while b is connected to the displacement in the variable t. GF is also a two-parameter function, similarly to WF with n and $t_0$.

During the historical development of WF, it has started to characterize the aging of the non-living components and machineries (reliability) while the GF was initially developed for the ageing of living objects [

The study of Gompertzian distribution for tumors supports a hypothesis that the fractal structure weakens and, in the end, it disappears by the growth of the tumor [

The Gompertz distribution could be obtained by the reduction of the generalized exponential Weibull distribution [

The GF does not satisfy the self-similarity (formulated in (11)), and therefore, it is not in harmony with the self-organizing biological dynamics, which is a definite characteristic of harmonized biological development, [

WF is successfully applied to the living processes as the psychological function [

WF describes the non-parametric KM plot with appropriate accuracy in gerontology [

Due to its self-similar behavior, fractals could be used for modeling cancer [

Due to the self-similarity, the parametric distribution generally fits well with the KM plot, and so it is successfully used in oncology [

Summarizing the above, self-organization and self-similarity are universal laws fingerprinted in the fractal description and can be described by the cumulative Weibull distribution. This universality of WF is applied to parametrize the KM plot. Due to the universality, the WF parametric regression fits the KM plot with sufficient accuracy and so determines the KM curve by two parameters ($t_0$ and n). The regression could be improved considerably by smoothing the KM with the hazard data (patients at risk), [

The WF is characterized by four special points: the value at $t_0$, the mean, the median and the inflection point. The median, the mean and the mode (the maximum point of the density function is an inflection point of the cumulative curve) are calculable from the parametric formulas, (see

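$$\begin{aligned} \mathrm{median}[p_S(t)] &= t_0 \left[\ln(2)\right]^{1/n} \\ \mathrm{mean}[p_S(t)] &= t_0 \int_0^\infty e^{-x} x^{1/n}\,dx = t_0\, \Gamma\!\left(1 + \frac{1}{n}\right) \\ \mathrm{mode}[p_S(t)] &= t_0 \left[\frac{n - 1}{n}\right]^{1/n} \end{aligned} \tag{21}$$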

The corresponding survival probabilities when $t_0 = 1$ and n = 2 are 0.5, 0.607 and 0.456 for the median, mode and mean, respectively. The value of the lifetime distribution at $t = t_0$ is $1 - 1/e \approx 0.632$, and it is independent of the value of n. The limit $\lim_{n \to 0} p_S(t) = 0$ is reached through a step function at t = 0, while $\lim_{n \to \infty} p_S(t)$ is a step function at $t = t_0$, (
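The quoted values can be reproduced directly from (17) and (21). The snippet below is a small sketch using only Python's standard library; the function names are chosen for this example, and setting the mode to zero for n ≤ 1 (where the density has no interior maximum) is an assumption added here.

```python
import math


def weibull_survival(t, t0, n):
    """p_S(t) = exp(-(t/t0)**n), Eq. (17)."""
    return math.exp(-((t / t0) ** n))


def weibull_points(t0, n):
    """Median, mean and mode of the Weibull distribution, Eq. (21)."""
    median = t0 * math.log(2.0) ** (1.0 / n)
    mean = t0 * math.gamma(1.0 + 1.0 / n)
    # For n <= 1 the density is maximal at t = 0 (assumption of this sketch).
    mode = t0 * ((n - 1.0) / n) ** (1.0 / n) if n > 1.0 else 0.0
    return {"median": median, "mean": mean, "mode": mode}


points = weibull_points(t0=1.0, n=2.0)
probabilities = {
    name: round(weibull_survival(t, 1.0, 2.0), 3) for name, t in points.items()
}
# Value of the lifetime distribution W at t = t0, independent of n.
w_at_t0 = 1.0 - weibull_survival(1.0, 1.0, 2.0)
```

With these inputs `probabilities` holds 0.5 for the median, 0.456 for the mean and 0.607 for the mode, matching the values in the text.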

The various parameter-pairs of WF are shown in

The inflection point in the WF (cumulative Weibull distribution) is the mode of the probability distribution function. It is the most likely appearing value in the Weibull probability distribution function. The inflection in the WF of survival divides the speed of developing death, which reaches its maximum at this point and the transfer of inflection is slowed by the elapsed time.

Programming calculates the result or makes it graphical (

The data at the particular points vs. n are shown in (

In conclusion from the above, the parametric regression of KM is universally determined by two parameters (the shape parameter (n) and the scale parameter ($t_0$) of WF), due to the basic behaviors of living processes: their self-organization and self-similarity, which are characterized well by their spatio-temporal fractal structure. When a clinician tries to describe the main information in the KM survival curves, the median survival is taken as a significant parameter characterizing the actual survival result. This is, in fact, an automatic characterization of the non-parametric estimate by a single parameter. However, the median alone cannot characterize the long tail of the KM plot; it does not consider the history of the patients in the remaining second half of the cohort, which could be essential for measuring the “cured” [

Sometimes the inflection of KM is studied too, as the death rate in the study is highest at that point. All are important for characterization, but only two of them are independent, and the third can be calculated from the chosen two. The distribution curve must be characterized by at least two parameters.

Two of the three noteworthy points (median, mean, inflection) of the KM may parametrize the non-parametric plot. Measuring or guessing these characteristic points (mainly the median and mean) is a standard comparison of the KM-plots and usually accepted as the result of the actual study. These points really characterize the non-parametric distribution and give the possibility to parametrize, so, in fact, this is a “hidden” parameterization of the KM plot by WF.
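As an illustration of this "hidden" parametrization, the sketch below recovers n and $t_0$ from a given median and mean by inverting the formulas in (21). It is a hypothetical helper, not the authors' procedure: the bisection bracket 0.2 ≤ n ≤ 3.4 is an assumption that restricts the search to curves whose mean exceeds the median, where the mean-to-median ratio decreases monotonically with n.

```python
import math


def mean_over_median(n):
    """Ratio mean/median from Eq. (21); it does not depend on t0."""
    return math.gamma(1.0 + 1.0 / n) / math.log(2.0) ** (1.0 / n)


def weibull_from_median_mean(median, mean, lo=0.2, hi=3.4, iterations=200):
    """Solve Eq. (21) for (n, t0) by bisection on the shape parameter."""
    target = mean / median
    if not (mean_over_median(hi) <= target <= mean_over_median(lo)):
        raise ValueError("mean/median ratio outside the assumed bracket")
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if mean_over_median(mid) > target:
            lo = mid      # ratio still too large, so n must be larger
        else:
            hi = mid
    n = 0.5 * (lo + hi)
    t0 = median / math.log(2.0) ** (1.0 / n)
    return n, t0


# Round-trip check with illustrative parameters.
n_true, t0_true = 0.9, 43.0
median = t0_true * math.log(2.0) ** (1.0 / n_true)
mean = t0_true * math.gamma(1.0 + 1.0 / n_true)
n_est, t0_est = weibull_from_median_mean(median, mean)
```

On a real KM plot the mean is only available if the curve reaches zero or is extrapolated, so the result of such an inversion is an approximation at best.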

A simple Weibull fit can be made on the KM plot from its derivative at the $t_0$ reference point, which is proportional to −n. (With W(t) taken in its survival form $e^{-(t/t_0)^{n}}$, the derivative there is exactly $\frac{dW(t_0)}{dt} = -\left(\frac{1}{e}\right)\frac{n}{t_0} \cong -0.368\,\frac{n}{t_0}$.) Therefore, the parametric evaluation can be checked well at the $t = t_0$ point, and the complete parametrization can be established approximately from the value of the $t_0$ point and the slope there.

The regression can be reduced to a linear one by the double-logarithmic approach (with W(t) taken in its survival form $e^{-(t/t_0)^{n}}$):

$$\ln\left[-\ln(W(t))\right] = n \ln\left(\frac{t}{t_0}\right) = n \ln(t) - n \ln(t_0) \tag{22}$$
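Equation (22) turns the fit into an ordinary straight-line regression of ln[−ln(W)] on ln(t): the slope gives n and the intercept gives −n ln($t_0$). The following sketch assumes that survival values strictly between 0 and 1 are available at positive times; the function name and the synthetic data are hypothetical, and unweighted least squares is a choice made for the example.

```python
import math


def weibull_loglog_fit(times, survival):
    """Estimate (n, t0) from Eq. (22) by unweighted least squares.

    times    : positive time points
    survival : survival fractions at those times, each in (0, 1)
    """
    xs = [math.log(t) for t in times]
    ys = [math.log(-math.log(s)) for s in survival]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    n = slope                              # slope of the line is n
    t0 = math.exp(-intercept / slope)      # intercept is -n * ln(t0)
    return n, t0


# Synthetic survival values generated from an exact Weibull curve.
times = [5.0, 10.0, 20.0, 40.0, 80.0]
survival = [math.exp(-((t / 43.0) ** 0.9)) for t in times]
n_fit, t0_fit = weibull_loglog_fit(times, survival)
```

Because the double logarithm stretches the early and late parts of the curve, points near W = 1 or W = 0 dominate such a fit, which is one reason it can differ from a direct fit.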

The regression is shown in

However, the obvious deviation of the regressions from the measured OS is in the tail of the KM, which neither function follows. The universal WF idea offers a regression fit to the KM for a group of patients who have had an event or have been censored before the end of the study. This is, of course, limited in real trials. We consider any chosen cohort inhomogeneous because of the huge variability of living conditions. A homogeneous group of identical individuals can never be selected. However, it is possible to divide the cohort into subgroups of very similar patients and fit a WF to each independently, while the measured KM is, of course, the sum of the results of all the subgroups. With M subgroups in the complete cohort of N patients, containing $k_1, k_2, \cdots, k_M$ patients respectively, the WF for the actual measured non-parametric KM will be:

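$$W^{(KM)}(t) = \frac{k_1}{N} e^{-\left(\frac{t}{t_0^{(1)}}\right)^{n^{(1)}}} + \frac{k_2}{N} e^{-\left(\frac{t}{t_0^{(2)}}\right)^{n^{(2)}}} + \cdots + \frac{k_M}{N} e^{-\left(\frac{t}{t_0^{(M)}}\right)^{n^{(M)}}} \quad \text{or} \quad W^{(KM)}(t) = \sum_{i=1}^{M} \frac{k_i}{N} e^{-\left(\frac{t}{t_0^{(i)}}\right)^{n^{(i)}}} \quad \text{and} \quad \sum_{i=1}^{M} k_i = N \tag{23}$$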

Even when extra care is taken to have a homogeneous cohort, the time limit of the study at least forms a group of patients who had no event (and were not censored). The “remaining” patients in the given treatment study have the highest benefit from the performed treatment, or they were in a definitely different condition when they were selected into the cohort. We call this group the “remained group” (RG) due to the lack of proof of complete recovery. However, this group is sometimes regarded (incorrectly) as a cured fraction (according to the endpoints of the study). In a rigorous approach, the disease-free survival (DFS) has to be compared with the matched healthy control group, and the cure rate must be decided on this comparison [

The KM curve in an RG situation obviously does not fit the strict WF, which must decrease to zero cumulative probability. When the ratio of the remaining individuals is $c_{RG} = n_{RG}/N$, the KM plot can be approximated with reasonable accuracy by the weighted sum of two WFs. In the RG fraction, the time parameter is longer than in the fraction of patients who had an event or were censored.

$$W^{(c)}(t) = (1 - c_{RG})\, e^{-\left(\frac{t}{t_0}\right)^{n}} + c_{RG}\, e^{-\left(\frac{t}{t_0^{(RG)}}\right)^{n^{(RG)}}} \tag{24}$$

In this case, the time parameter of the long-survival WF component is practically infinite (compared to the time-length of the study):

$$W^{(RG)}(t) = e^{-\left(\frac{t}{t_0^{(RG)}}\right)^{n^{(RG)}}} \cong 1 \tag{25}$$

In this case, the correction by a survived fraction of the patients is constant. Denoting the constant correction c, the plot will be composed by this:

$$W^{(c)}(t) = (1 - c)\, e^{-\left(\frac{t}{t_0^{(c)}}\right)^{n^{(c)}}} + c \tag{26}$$
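The relation between the general mixture (23), the two-component form (24) and the constant-plateau form (26) can be illustrated with a short sketch. All parameter values below are invented for the example and are not results from the study; the function names are likewise hypothetical.

```python
import math


def mixture_survival(t, components):
    """Weighted sum of Weibull survival curves, Eq. (23).

    components : list of (weight, t0, n) with the weights summing to 1
    """
    return sum(w * math.exp(-((t / t0) ** n)) for w, t0, n in components)


def plateau_survival(t, c, t0, n):
    """Single Weibull curve plus a constant remaining fraction c, Eq. (26)."""
    return (1.0 - c) * math.exp(-((t / t0) ** n)) + c


# Hypothetical values: 20% remaining group with a practically infinite
# time parameter compared to a 60-month observation window, Eq. (25).
c_rg = 0.2
two_component = mixture_survival(
    60.0, [(1.0 - c_rg, 30.0, 1.1), (c_rg, 1.0e9, 1.0)]
)
with_plateau = plateau_survival(60.0, c_rg, 30.0, 1.1)
late_value = plateau_survival(1.0e4, c_rg, 30.0, 1.1)   # approaches c
```

Within the observation window the two forms are numerically indistinguishable, which is why the plateau form needs only one extra parameter, c, instead of two.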

The variation of c shows different fitting functions,

The curative effect of the treatment can be characterized with the Shannon entropy of a WF fit to the non-parametric KM survival. Entropy measures the information carried by the probability density function (pdf, $p(t, n, t_0)$) behind the WF ($W(t, n, t_0)$). It measures the probability of realization of an event or censoring

$$p(t, n, t_0) = \frac{dW(t, n, t_0)}{dt} = \frac{n}{t_0} \left(\frac{t}{t_0}\right)^{n - 1} \exp\left(-\left(\frac{t}{t_0}\right)^{n}\right); \quad \int_0^\infty p(t, n, t_0)\,dt = 1 \tag{27}$$

The quantity of information is $I(t, n, t_0) = -\ln(p(t, n, t_0))$, which is realized by $p(t, n, t_0)$, so the complete information from the system is the classical Shannon entropy, [

$$S_{Sh}(n, t_0) = -\int_0^\infty p(t, n, t_0) \ln\left(p(t, n, t_0)\right) dt \tag{28}$$

A higher entropy shows less information (more uncertainty). When an event has a lower probability to occur, it carries more information, so its Shannon entropy is lower than that of frequently occurring events. The expectation of a random variable is characterized by this entropy, so in this sense it is a direct analog of the entropy definition in physics (statistical thermodynamics). When the informational entropy decreases (its change becomes negative), it means that the probability distribution differs from the uniform distribution, concentrating around some data.

The entropy growth in physics usually happens when the system approaches equilibrium, while in pdf the increase of entropy shows a lack of information when the average rate of information produced by the stochastic source of the data decreases.

The Shannon entropy (28) measures the diversity of the probability density function (pdf) behind the WF (in fact, the derivative of the WF). It is a sum of an n-dependent and a $t_0$-dependent part:

$$S_{Sh}(n, t_0) = \gamma\left(1 - \frac{1}{n}\right) + \ln\left(\frac{t_0}{n}\right) + 1 = S_{Sh1}(n) + S_{Sh2}(t_0), \quad \text{where} \quad S_{Sh1}(n) = \gamma\left(1 - \frac{1}{n}\right) - \ln(n) + 1; \quad S_{Sh2}(t_0) = \ln(t_0) \tag{29}$$

and γ is the Euler-Mascheroni constant: γ ≅ 0.577. The special points of this entropy function are:

$$\begin{aligned} & S_{Sh2}(1) = 0; \quad \max(S_{Sh1}(n)) = S_{Sh1}(\gamma) \approx 1.127 \\ & S_{Sh1}(0.173) \cong 0; \quad S_{Sh1}(4.223) \cong 0; \quad S_{Sh1}(0.363) \cong 1; \quad S_{Sh1}(1) = 1 \\ & \lim_{n \to \infty,\, t_0 \to \infty} S_{Sh}(n, t_0) = -(1 + \gamma) \end{aligned} \tag{30}$$

The entropy (diversity) grows monotonically with $t_0$ in a logarithmic way, while it grows rapidly with n, reaching its maximum at n = γ (when $t_0 = 1$), and decreases from that point, reaching zero at n = 4.223 (when $t_0 = 1$) and building information (decreasing) from that point on, so the step-function character of WF (definite step) starts to dominate. Dividing the entropy into a shape-dependent and a scale (time) dependent part makes it possible to define the role of these parameters. While the scale (time) parameter increases the Shannon entropy monotonically, the shape parameter (n), after a maximum at γ, decreases the entropy, showing an increasing amount of information about the death (decreasing information about being alive) of the participants in the cohort. A shape factor n growing above the value γ definitely worsens the survival, while the growth of the scale (time) factor gives longer survival expectations.
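The closed form (29) and the special points in (30) can be checked with a few lines of code. The sketch below evaluates the shape and scale parts separately and compares the closed form with a direct midpoint-rule integration of (28); the integration cut-off and step count are assumptions made for the example.

```python
import math

EULER_GAMMA = 0.5772156649015329


def shape_entropy(n):
    """S_Sh1(n) = gamma * (1 - 1/n) - ln(n) + 1, Eq. (29)."""
    return EULER_GAMMA * (1.0 - 1.0 / n) - math.log(n) + 1.0


def scale_entropy(t0):
    """S_Sh2(t0) = ln(t0), Eq. (29)."""
    return math.log(t0)


def shannon_entropy(n, t0):
    """Closed-form Shannon entropy of the Weibull pdf, Eq. (29)."""
    return shape_entropy(n) + scale_entropy(t0)


def numeric_entropy(n, t0, steps=20000):
    """Midpoint-rule estimate of Eq. (28), truncated where p_S = exp(-40)."""
    upper = t0 * 40.0 ** (1.0 / n)
    dt = upper / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * dt
        p = (n / t0) * (t / t0) ** (n - 1.0) * math.exp(-((t / t0) ** n))
        if p > 0.0:
            total -= p * math.log(p) * dt
    return total


maximum = shape_entropy(EULER_GAMMA)      # about 1.127
zero_crossing = shape_entropy(4.223)      # about 0
closed_form = shannon_entropy(2.0, 1.0)
integrated = numeric_entropy(2.0, 1.0)
```

The agreement between `closed_form` and `integrated` for n = 2 and $t_0$ = 1 is only a spot check, not a proof of (29).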

The Shannon entropy can be calculated in real time t ($S_{Sh}(n, t_0)$) and also relative to the $t_0$ time, meaning that the time is measured in $t_0$ units ($S_{Sh0}(n) = S_{Sh}(n, 1)$), estimating the self-time. A higher entropy value means a higher uncertainty of death (therefore, a lower certainty of being alive). We expect the Shannon entropy of the parametric probability distribution function to grow in cases of better treatment results.

To demonstrate the parametrization, we use a large number of patients (1180 individuals) with various tumors treated by numerous standard therapies, but having one thing in common: they are treated by complementary modulated electro-hyperthermia (mEHT) when the standard treatment fails to deliver the desired results, [

Using the approximate parametrization by the evaluation of this KM plot with the slope at $t_0$, we get $t_0 \approx 43$ and $n \approx 0.9$; median ≈ 28,

The fit of single parametric WF curve to the KM plot, (

WF fits with an acceptable accuracy; the largest deviation is less than 0.007, (0.7%).

Note that there is a difference between fitting by minimizing the deviation of the curves,

$$SE = \min\left\{\sum_i \left(x_i^{KM} - x_i^{WF}\right)^2\right\}$$

and fitting by maximizing the square of the Pearson correlation,

$$r^2 = \left(\frac{\sum_i \left(x_i^{KM} - \langle x^{KM} \rangle\right)\left(x_i^{WF} - \langle x^{WF} \rangle\right)}{\sqrt{\sum_i \left(x_i^{KM} - \langle x^{KM} \rangle\right)^2}\,\sqrt{\sum_i \left(x_i^{WF} - \langle x^{WF} \rangle\right)^2}}\right)^2$$

(where the ⟨ ⟩ bracket means the mean of the variable). The obvious difference is due to the different meaning of fit. Minimizing SE minimizes the difference between the curves, while maximizing $r^2$ minimizes the shape difference (maximizes the similarity) of the curves. A comparison with Shannon entropy shows more certainty (less uncertainty) by about 6% in the regression by minimizing SE than by maximizing $r^2$. In the following, unless noted otherwise, we use the minimal SE regression.
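The difference between the two criteria can be seen on a toy example: a fitted curve that is shifted by a constant keeps a perfect $r^2$ but has a non-zero SE. The sketch below is illustrative only; the function names and the sample values are hypothetical and do not come from the study.

```python
def squared_error(km, wf):
    """Sum of squared differences between the two curves (SE)."""
    return sum((a - b) ** 2 for a, b in zip(km, wf))


def pearson_r2(km, wf):
    """Square of the Pearson correlation between the two curves."""
    mean_km = sum(km) / len(km)
    mean_wf = sum(wf) / len(wf)
    covariance = sum((a - mean_km) * (b - mean_wf) for a, b in zip(km, wf))
    var_km = sum((a - mean_km) ** 2 for a in km)
    var_wf = sum((b - mean_wf) ** 2 for b in wf)
    return covariance ** 2 / (var_km * var_wf)


# Hypothetical survival fractions and a fit that is uniformly 0.1 too low.
km_values = [0.9, 0.7, 0.5, 0.3]
wf_shifted = [x - 0.1 for x in km_values]

se_shifted = squared_error(km_values, wf_shifted)   # about 0.04
r2_shifted = pearson_r2(km_values, wf_shifted)      # about 1.0
```

This is why a fit chosen by maximal $r^2$ can reproduce the shape of the KM plot well and still sit measurably above or below it.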

The fit is accurate, having no more difference in any compared points of the curves than 1%, but it is not accurate enough at the end of the observed time, due to the RG group of the patients. The deviation could be less with applying the RG principle of (26),

The parametric decomposition gives better fit by two WFs according to (24), ^{2} has reduced drastically. The result shows the responding group (response rate (RR) 48%) and the non-responding one (52%). Note, that the less-responding group could be regarded as a non-responding control-arm.

The long-survival part of the KM plot has a higher entropy and shows more uncertainty of death in both approaches. A better fit can be achieved when we account for the RG. In most actual cases, the RG is obtained from the remaining survival fraction: patients who had no event and were not censored earlier, and who therefore survive measurably longer than the follow-up of the study. RG is a part of the “censored” patients at the end of the study.

For an easier calculation of the WF fractions (components) of the KM plot, we may use the logarithmic evaluation of the survivals, which modifies the grouping more than the above decomposition. A linear fit of the function $\ln[-\ln(W(t))]$ versus $\ln(t)$ of KM is shown in

The original WF fit shown in

Despite the inaccuracy of the logarithmic evaluation, it has a great advantage of guessing the subgroups of the patients by an optimal decomposing of the KM plot. The logarithmic curve on

The WF fit to

The logarithmic fit by (22) gives different results than the direct fit. The reason is simply that the logarithmic fit considers only a part of the whole curve and fits to that; consequently, an accurate fit to that part of the KM will not fit the other parts at all if the logarithmic curve was approached in different parts. The observed KM, of course, considers all the patients. The overlapping fits from the logarithmic approach modify the KM plot. Consequently, only the fit to the original KM plot is relevant.

However, the logarithmic analysis is very useful for detecting the subgroups of the patients. It became clear that the survival contains three subgroups,

more accurate fit, than the RG (

We have shown the applicability of the two-parameter cumulative Weibull distribution for approximating the non-parametric Kaplan-Meier plot with high accuracy. We have shown the universality of the Weibull approach based on the general behaviors of living organisms, including cancer-tissue development. Self-organization and self-similarity, with their consequences, determine the strict connection of the parametric approach with the experimental non-parametric observations. Informational entropy allows subgroups in a general set of patients to be distinguished by their overall survival.

We have demonstrated that applying the two-parameter WF provides a sufficient fit to the non-parametric KM survival curve in a real case of 1180 patients suffering from various malignant diseases. Two of the three characteristic parameters of the KM plot (namely the points of median, mean or inflection) are enough to reconstruct the parametric fit.

In summary, Weibull parametric distribution with satisfactory refinement can accurately approximate a KM survival plot with surviving individuals at the end-point of the study.

This work was supported by the Hungarian Competitiveness and Excellence Programme grant (NVKP_16-1-2016-0042).

The authors declare no conflicts of interest regarding the publication of this paper.

Szasz, O. and Szasz, A. (2020) Parametrization of Survival Measures, Part I: Consequences of Self- Organizing. International Journal of Clinical Medicine, 11, 316-347. https://doi.org/10.4236/ijcm.2020.115031