Parametrization of Survival Measures (Part II): Single Arm Studies

In some clinical applications in oncology, randomized, double-arm, double-blind trials are not possible. In device applications, double-blinded conditions are unrealistic, and randomization is often complicated as well: in high-line treatments a reference cohort is not available, and the active “arm” has mainly palliative intent. Sometimes highly personalized therapies prevent the collection of a homogeneous group and limit its double-arm randomization. Our objective is to discuss the situations of single-arm evaluation and to give methods for mining information from such studies to increase the level of evidence of the measured data set. The basic idea of the data separation is the appropriate parameterization of the non-parametric Kaplan-Meier survival pattern by a poly-Weibull fit.


Introduction
Survival studies most frequently use the Kaplan-Meier (KM) non-parametric estimate. The KM estimator is fixed by the duration of participation in the observation [...] appropriate cohort is a complex issue. Cohort forming sometimes uses forced conditions, such as reaching a definite toxicity predefined by the protocol (as in high-dose chemotherapy [8]), expecting the same (unified) reaction from the stage-selected patients. The randomized controlled trial (RCT) approach is devoted to applying the most appropriate treatment update, and the reference control is taken from the same cohort (called the control arm). The new therapy (active arm) must show its superiority over the control in comparison. Equipoise selection into both arms is mandatory, but the two treatments can be compared not only by their positive efficacy but by their side effects as well, which may adversely affect the treatment [9].
Sometimes, in cancer treatment, a misleading (or at least incomplete) evaluation is practiced by measuring the local control of the tumor instead of the systemic development of the malignancy in the whole body. Overall control of the system is complicated to assess and is not even possible with imaging, because of micro-metastases and adverse effects that cause comorbidities for the patient. Therefore, parametrization is effective only if the endpoint of the study is overall survival combined with quality of life.
Before deciding on an RCT, both sides of the balance, the measured efficacy and the adverse effects, must be taken into account. In serious diseases or terminal cases, no curative treatment is available, or further curative therapy is simply not possible because of comorbidities such as organ failure or low blood count. Note that some conditions limit the RCT evaluation even in the double-arm construction: false inclusion and exclusion criteria (sometimes "cherry picking"); missing normal distributions; or changing time series that have the same statistical moments but differ in their time fluctuations. The data set in the last case is outside the applicability of the usual analysis of variance (ANOVA).
Furthermore, ethical selection issues oppose randomization, so the trial must be carried out in a simple non-randomized single-arm design when no other treatment is available except for the newly tried one. In these cases, the best supportive care (BSC) can be applied [10] as a control group when an active curative or palliative therapy is under investigation. Results are also frequently compared, retrospectively, to a historical data set of the same hospital or to other large databases. There are some situations where no suitable historical control is available because of the completely new approach of the therapy [11], or because the disease is so rare that no comparison can be found [12].
Of course, a single arm without a reference cannot directly give information about the changes achieved by the therapy. However, it is also obvious that the data on these changes are contained in the single-arm spectrum as well, but they remain well hidden without an orientation for measuring the changes.
The single-arm design is popular in Phase I, when safety data are collected. The goal in this phase is to determine the toxicity, the side effects, and the dose through a dose-escalation process. The investigation of efficacy is not included in Phase I trials. Phase II studies concentrate on the efficacy of the applied safe process [13]. When the hypothesis to be proved is clearly defined and the "null hypothesis" can be the zero response, the minimum clinically relevant response should define the size of the trial. Despite the simple design, the evaluation of the data can be rather complicated because the reference for comparison is missing, and comparison is hard anyway because of the natural biological variability. The interpretation of single-arm results must distinguish the placebo effect and the spontaneous natural history of the disease from the actual treatment efficacy. However, single-arm trials may be the option when placebos are unethical and the opportunities for a controlled trial are limited by the vast variation among patients. For example, advanced diseases in oncology are frequent topics of single-arm trials, due to the massive, exhausting, and highly variable protocols of failed pretreatments. The reason for the failure is usually a progressive and refractory disease, or limitations in applying the conventionally proven methods due to organ failure or a dangerous level of blood damage. In these cases, forming appropriate cohorts is very difficult or even impossible. When a single-arm study is chosen because of these drawbacks of the RCT, we mostly apply a palliative BSC in addition to the active treatment. One of the most important conditions of such single-arm treatments is that they must not worsen the results of BSC; their worst outcome must be ineffectiveness. The best indicator of this condition is the combination of overall survival time and quality of life.

Methods
A. Szasz et al. International Journal of Clinical Medicine
Lifetime studies show a surprising universality through self-organization [14] [15] and, consequently, through the self-similarity of the morphological structures and dynamic processes in living objects. Self-similarity has a morphological consequence, showing the spatiotemporal fractal structure in biological objects [16] [17]. These ideas form the basis of the similarities of species [18], which leads directly to the expected lifetime universality of well-selected cohorts. The general allometry covers the full range of mass, from respiratory complexes, through the mitochondria, to the animals with the largest mass [19].
Due to self-similarity, most biological structures and processes can be described by a simple power function, $P(x) = a x^{\alpha}$, where $a$ and $\alpha$ are constants, so the form of $P(x)$ is only multiplied by a constant under any magnification $m$ of $x$: $P(mx) = a (mx)^{\alpha} = m^{\alpha} P(x)$. This magnification process (scaling [20]) can be followed over a few orders of magnitude (scale-free behavior) in biosystems.
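The scaling property of a power function can be checked numerically. The following minimal sketch assumes illustrative constants (a = 2 and alpha = 0.75 are not taken from the source) and shows that the ratio P(mx)/P(x) depends only on the magnification m and not on the position x.

```python
import math

def power_law(x, a=2.0, alpha=0.75):
    """Simple power function P(x) = a * x**alpha (illustrative constants)."""
    return a * x ** alpha

def scaling_ratio(x, m, a=2.0, alpha=0.75):
    """Ratio P(m*x) / P(x); for a power function it equals m**alpha for every x."""
    return power_law(m * x, a, alpha) / power_law(x, a, alpha)

# The same magnification m = 10 is applied at positions spanning six decades.
ratios = [scaling_ratio(x, m=10.0) for x in (0.01, 1.0, 100.0, 1e4)]
expected = 10.0 ** 0.75
```

All entries of `ratios` agree with `expected` up to rounding, which is the scale-free behavior described in the text.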
As a consequence of this widely applicable universality, the general ontogenic growth [21] allows the deduction of the Weibull distribution [22], which can be used to describe the non-parametric Kaplan-Meier estimate for tumors analytically. Self-similarity drives tumor development, which shows the universal law of growth [23] [24]. This lays the foundation of our attempt to find the reason behind the universal parametric regression for the lifetime of patients, which is supported by the universal law of growth of solid tumors [25]. The extension of the Weibull model allows us to estimate the tumor latency too [26]. We have shown the self-similarity of bioprocesses in general [27], leading to well-defined mathematical formulas such as the Avrami equation, which has a complete formal correspondence with the cumulative Weibull distribution function (WF) [28]. The two-parameter WF is a good candidate for the parametrization of the KM plot [27]. It is both theoretically and practically established for clinical applications [29].
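As an illustration of how a two-parameter WF can parametrize a KM plot, the sketch below computes the KM estimate from follow-up data and fits the survival form W(t) = exp(-(t/t0)^n) through the linearization ln(-ln W) = n ln t - n ln t0. The function names, the follow-up data, and the straight-line least-squares fit are assumptions made for this example; they stand in for, and are not, the regression procedure used by the authors.

```python
import math

def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.
    times: follow-up durations; events: 1 = death observed, 0 = censored."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Collect every patient whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            surv *= 1.0 - deaths / at_risk
            curve.append((t, surv))
        at_risk -= removed
    return curve

def fit_weibull(curve):
    """Least-squares line through (ln t, ln(-ln S)): slope = n, intercept = -n ln t0."""
    pts = [(math.log(t), math.log(-math.log(s))) for t, s in curve if 0.0 < s < 1.0]
    k = len(pts)
    mx = sum(x for x, _ in pts) / k
    my = sum(y for _, y in pts) / k
    sxx = sum((x - mx) ** 2 for x, _ in pts)
    sxy = sum((x - mx) * (y - my) for x, y in pts)
    n = sxy / sxx
    t0 = math.exp(-(my - n * mx) / n)
    return n, t0

# Hypothetical follow-up data: months of observation and event flags.
times = [2, 3, 3, 5, 8]
events = [1, 1, 0, 1, 0]
km_curve = kaplan_meier(times, events)

# Noise-free check: points taken from a known WF are recovered by the fit.
exact_curve = [(t, math.exp(-((t / 12.0) ** 1.5))) for t in (1, 2, 5, 10, 20, 30)]
n_fit, t0_fit = fit_weibull(exact_curve)
```

On real KM data the unweighted line is sensitive to the first and last steps of the curve, so a weighted or maximum-likelihood fit may be preferable in practice.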
The real challenge is how to reveal the hidden data in the single active arm when the randomization that forms the reference in double-arm studies is missing. We have limited possibilities for mining the available information without a reference set, even though we know that the information is in the data. The general self-similar behavior of various tumors has different parametrizations, so they can be distinguished from each other. Consequently, fitting to survival curves gives hints on how to extract information from the single arm alone.
The Avrami equation fits empirical data in biology well, as has been widely investigated and proven for solid-state reactions (precipitation, phase transitions, aggregation, nucleation, growth, and others) [30] [31] [32] [33] [34]. Indeed, experimental data show that many biological reactions follow the Avrami equation. It is applied universally to different processes regardless of the structure and dynamics of the system. Avrami functions are self-similar, and various comparative functions characterize the exponents [35]. The considerations of the Avrami function explain the parametric approximations of the non-parametric Kaplan-Meier survival distribution (KM) [27].
Mortality can be approached by fitting different distributions [36] in epidemiologic modeling. The most popular descriptions are the Gompertz, Weibull, and logistic distributions [37]. These methods are usually used for gerontologic, aging mortality, modeling the statistics of the ages at death, and do not consider any particular disease or clinical therapy [38] [39] [40] [41]. A generalized Weibull-Gompertz distribution can derive various distributions [42]. In demographic aging, the Gompertz and Weibull functions describe different biological causes [39]. The Gompertz model involves a multiplicative aging mortality, while it is additive in the Weibull description. The multiplicativity reflects the extrinsic causes, while the additivity reflects the intrinsic causes in older ages. Our present modeling does not deal with aging mortality and the connected epidemiologic consequences. Our considerations concern cancer survival, which is strongly disease and therapy dependent, so it covers the intrinsic causes in the actual parametrization of the probability of survival. This non-aging survival discussion prefers the Weibull distribution to the Gompertz, since it describes the intrinsic self-organizing behavior of the living human organism.
In advanced situations, when the malignancy is double refractory, the WF provides the best fit to the KM [11]. Cancer incidences significantly fit the Weibull distribution in 18 types of malignancies [43], so the WF is justified for describing the driver events of the tumor-building process. Extending this idea, we expect that the best-fit parametrization of the survival curve can lead to information about the facts hidden in the actual non-parametric KM plot. In real cases, the approximation of the non-parametric KM survival curve with a single WF is not precise enough. The missing precision apparently contradicts the self-organized basis of the WF. When survival is self-organized in the same way as we observe in all biological processes, the fit to the non-parametric KM has to show self-similarity, because it follows rigorously from the universality of the lifetime of living systems and the growth dynamics of tumors. The contradiction arises because the self-similar WF fits only strictly homogeneous patient cohorts. The WF parameters characterize a group of equally participating individuals, which is, of course, not a realistic assumption. The KM represents a cohort of patients with the equipoise of individuals made as ideal as possible by choosing explicit inclusion and exclusion criteria. Nevertheless, the selection criteria cannot be fixed well when we are not able to apply an RCT. The only inclusion criterion is the failure of conventional curative treatments, and the only exclusion criterion is a terminal stage in which any extra intervention could be fatal.
Due to the enormous variability of the living conditions (like social, diet, habits, etc.) and bio-variability of the individuals (like genetic variability, immune-variability, sensing-variability, etc.), any chosen cohort has inhomogeneities. However, it is possible to divide the cohort into more homogeneous subgroups than the full set of individuals, expecting that the fitting of the self-similar WF will be better by the growing homogeneity of the subgroup to which it is applied.
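The effect of inhomogeneity on the fit can be made visible on the Weibull plot, where ln(-ln W) is drawn against ln t. A single WF gives a straight line with slope n, while a mixed cohort does not. The sketch below is only an illustration under assumed values: the 50/50 mixture, the shape 1.5, and the scales 6, 12, and 60 are hypothetical and not taken from the source.

```python
import math

def weibull_survival(t, n, t0):
    """Survival form of the WF: W(t) = exp(-(t/t0)**n)."""
    return math.exp(-((t / t0) ** n))

def weibull_plot_slope(surv, t, rel_step=1e-4):
    """Local slope d ln(-ln S) / d ln t, estimated by a central difference."""
    lo, hi = t * (1.0 - rel_step), t * (1.0 + rel_step)
    y_lo = math.log(-math.log(surv(lo)))
    y_hi = math.log(-math.log(surv(hi)))
    return (y_hi - y_lo) / (math.log(hi) - math.log(lo))

def homogeneous(t):
    """One homogeneous cohort (hypothetical parameters)."""
    return weibull_survival(t, 1.5, 12.0)

def mixed(t):
    """Two equally weighted subgroups with the same shape but different scales."""
    return 0.5 * weibull_survival(t, 1.5, 6.0) + 0.5 * weibull_survival(t, 1.5, 60.0)

times = (1.0, 5.0, 20.0, 60.0)
slopes_homogeneous = [weibull_plot_slope(homogeneous, t) for t in times]
slopes_mixed = [weibull_plot_slope(mixed, t) for t in times]
```

The slopes of the homogeneous cohort stay at 1.5, while those of the mixed cohort vary with time, so no single pair of shape and scale parameters can describe the mixed curve. This is the motivation for fitting the subgroups separately.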

Usually, the groups of local responses (complete response (CR), partial response (PR), no change (NC), or progression of the disease (PD)) come into the center of attention automatically at the end of the study. We can make a similar subgrouping in systemic (lifetime, survival) measurements and fit a WF to each subgroup individually. The measured data are the summary of the complete cohort, with overlapping data in the experimental non-parametric KM estimate, which contains the data of all the subgroups. For simplicity, using the same subgrouping as in the local response, we introduce the subgroup of patients who can be regarded as "cured" (CP), the subgroup of those whom the treatment helped, the responding patients (RP), and the patients who had no benefit from the therapy, the non-responding patients (NP). The KM in the real experiment measures only the sum of these (in the same way as in the analysis of the local response). A WF is fitted to each subgroup, and their weighted sum is fitted to the complete KM:

$W(t) = c^{(CP)} W^{(CP)}(t) + c^{(RP)} W^{(RP)}(t) + c^{(NP)} W^{(NP)}(t)$    (1)

A simpler and more robust WF regression is obtained when the fit is divided into only two different functions [44]. Here we define two sub-cohorts composed linearly [45] [46] [47], one on which the treatment had no or minor influence (NP) and one where the treatment was effective (RP):

$W(t) = c^{(RP)} e^{-\left(t/t_0^{(RP)}\right)^{n^{(RP)}}} + c^{(NP)} e^{-\left(t/t_0^{(NP)}\right)^{n^{(NP)}}}$    (2)

where the Weibull parameters are denoted by (RP) and (NP) superscripts according to their sub-cohorts. Because the two sub-cohorts make up the complete set of patients, $c^{(RP)} + c^{(NP)} = 1$. A regression with a division into only two subgroups by temperature-development criteria was used by others [48], where the patients included in the hyperthermia cohort were divided into "heatable" and "non-heatable" subgroups, and the end of the study was determined by the time when the last patient was proved to be unaffected by hyperthermia. Two (responding and non-responding) or more subgroups (including stabilization, treatment of a chronic disease, or others) can be introduced in this way as well.
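The linear composition of a responding and a non-responding sub-cohort can be sketched as a weighted sum of two WF curves whose weights add up to 1. In the example below the weight name `c_rp`, all parameter values, and the helper `estimate_weight` are hypothetical. The helper is a deliberate simplification: it estimates only the weight by least squares while the four Weibull parameters are held fixed, and it is not the full regression described in the text.

```python
import math

def weibull_survival(t, n, t0):
    """Survival form of the WF: W(t) = exp(-(t/t0)**n)."""
    return math.exp(-((t / t0) ** n))

def poly_weibull(t, c_rp, n_rp, t0_rp, n_np, t0_np):
    """Two-component sum; the weights c_rp and (1 - c_rp) add up to 1."""
    return (c_rp * weibull_survival(t, n_rp, t0_rp)
            + (1.0 - c_rp) * weibull_survival(t, n_np, t0_np))

def estimate_weight(points, n_rp, t0_rp, n_np, t0_np):
    """Least-squares estimate of c_rp with the four Weibull parameters held fixed.
    Uses S - B = c_rp * (A - B), where A and B are the two component curves."""
    num = 0.0
    den = 0.0
    for t, s in points:
        a = weibull_survival(t, n_rp, t0_rp)
        b = weibull_survival(t, n_np, t0_np)
        num += (s - b) * (a - b)
        den += (a - b) ** 2
    return num / den

# Hypothetical cohort: 30% responders with a long scale, 70% non-responders.
true_params = dict(c_rp=0.3, n_rp=1.2, t0_rp=60.0, n_np=1.5, t0_np=8.0)
grid = [1.0, 3.0, 6.0, 12.0, 24.0, 48.0]
points = [(t, poly_weibull(t, **true_params)) for t in grid]
values = [s for _, s in points]
c_hat = estimate_weight(points, 1.2, 60.0, 1.5, 8.0)
```

With noise-free points the weight is recovered exactly. With real KM data, all five parameters would have to be fitted together, for instance by a nonlinear least-squares routine.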
The two-subgroup division has five parameters to fit. Looking for the only [...] In the special case when the RP subgroup is cured, meaning that no disease-specific death happens in the whole observation period (including the available follow-up time), $W^{(RP)}(t) \cong 1$, so the WF-like curve has the following form:

$W(t) \cong c^{(RP)} + c^{(NP)} e^{-\left(t/t_0^{(NP)}\right)^{n^{(NP)}}}$    (5)

where $c^{(RP)}$ and $c^{(NP)} = 1 - c^{(RP)}$ are the weights of the two sub-cohorts. The modification of (5) can be interpreted as a change of $t_0$, the scale factor of the Weibull function: [...] Consequently [...]
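When the responding subgroup is cured, the curve approaches a plateau equal to the weight of that subgroup instead of falling to zero. The sketch below uses hypothetical values (a cured fraction of 0.25, shape 1.4, scale 9) and shows that, once the plateau is removed and the remainder is rescaled, the non-responding part behaves again as a single WF on the Weibull plot.

```python
import math

def cured_mixture(t, c_rp, n_np, t0_np):
    """WF-like curve when the fraction c_rp has no disease-specific death."""
    return c_rp + (1.0 - c_rp) * math.exp(-((t / t0_np) ** n_np))

def rescaled_np_part(s, c_rp):
    """Remove the plateau so that the non-responding part runs from 1 to 0 again."""
    return (s - c_rp) / (1.0 - c_rp)

c_rp, n_np, t0_np = 0.25, 1.4, 9.0  # illustrative values only

# Long after the scale time the curve sits on the plateau c_rp.
late_value = cured_mixture(200.0, c_rp, n_np, t0_np)

# After removing the plateau, ln(-ln S) is linear in ln t with slope n_np.
t1, t2 = 3.0, 12.0
y1 = math.log(-math.log(rescaled_np_part(cured_mixture(t1, c_rp, n_np, t0_np), c_rp)))
y2 = math.log(-math.log(rescaled_np_part(cured_mixture(t2, c_rp, n_np, t0_np), c_rp)))
slope = (y2 - y1) / (math.log(t2) - math.log(t1))
```

In practice the plateau height is itself an unknown and has to be estimated from the tail of the KM curve, which is only reliable when the follow-up is long compared with the scale of the non-responding subgroup.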

Results
Using the hypothesis that the self-similar WF follows the real bioprocesses in survival, the effect of the malignancy staging at the first diagnosis could be followed [...] Supposing a cluster contains 30 cells (~3 cells in diameter) and that it takes 100 days to double its size, the tumor will be in the preclinical (latent) state for approximately 8 years, without the existing malignant tumor being observable; we assume self-organized growth during this period too.
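The latency estimate is simple doubling arithmetic, sketched below. The starting size of 30 cells, the doubling time of 100 days, and the latency of about 8 years come from the text; the function names and the year length of 365.25 days are assumptions, and the detection size is not stated in the source, so it is derived here from the other three figures rather than assumed.

```python
import math

def latency_years(n_start, n_end, doubling_days, days_per_year=365.25):
    """Years needed to grow from n_start to n_end cells at a fixed doubling time."""
    doublings = math.log2(n_end / n_start)
    return doublings * doubling_days / days_per_year

def cells_after(n_start, years, doubling_days, days_per_year=365.25):
    """Cell count reached after the given time at a fixed doubling time."""
    return n_start * 2.0 ** (years * days_per_year / doubling_days)

# Figures quoted in the text: 30 cells, 100 days per doubling, about 8 years.
implied_doublings = 8 * 365.25 / 100
implied_cells = cells_after(30, 8, 100)
```

Under these figures the latent period corresponds to roughly 29 doublings, that is, a cluster of the order of 10^10 cells at the end of the 8 years. A different assumed detection size would change the latency only logarithmically.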
Considering the basic survival curve from the start of the malignant behavior, even from a single "renegade cell" [50], the WF describes the tumor development, including the dormant period, until all the patients are deceased or censored; we obtain (7): [...] Following the staging of the tumor status with the WF when the diagnosis is based on the development of the malignant lesion, related to (5): [...] Hence, according to (6), the measured [...] Let us denote the time when the tumor is first observed, as in carcinoma in situ, by $T_0$. Due to the supposed continuity of the tumor growth from the latent to the observable stage, the WF fit to the KM non-parametric estimate can follow a triple parametrization. In this case, a location parameter is added to the shape and scale parameters: [...] This gives a "truncation" possibility for this basic (Equation (7), hypothetical) overall survival plot (Figure 4).
Following the complete survival until the last event (or censoring) in the studied group of patients, the start of the study will be at the shifted time, which determines the truncations of the basic WF into its parts (Figure 5). Considering $T_i$, the shift for the studies in subsequent stages, we get: [...] $T_0$ is the start of the observational period: optimally the immediate treatment, or at least watchful waiting (the watch-and-wait, WAW, period) when the treatment cannot be decided yet. For simplicity, we consider the studies as time-to-event (TTE) data, where time is measured from a starting point to a certain event, such as death. When the end of the study is fixed differently, we must use the fit shown in (2). Every study starts as a new one; of course, there is no knowledge about the unmeasured early treatments. Consequently, the survival probability at the start of the treatment is 1, irrespective of when it started. We show the later starting points in the timeline of the disease in Figure 6.
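One way to express that every study starts with a survival probability of 1, irrespective of its start time on the disease timeline, is to condition the basic WF on survival up to the start. The sketch below assumes this standard left-truncation form, W(T_i + t) / W(T_i); it is an assumption of this example and not necessarily the exact expression used by the authors, and the parameter values are illustrative.

```python
import math

def weibull_survival(t, n, t0):
    """Survival form of the WF: W(t) = exp(-(t/t0)**n)."""
    return math.exp(-((t / t0) ** n))

def late_start_survival(t, start, n, t0):
    """Survival a time t after a study starting at disease time `start`,
    conditioned on being alive at `start`: W(start + t) / W(start)."""
    return weibull_survival(start + t, n, t0) / weibull_survival(start, n, t0)

n, t0 = 1.5, 10.0  # illustrative shape and scale (years), not from the source
starts = [0.0, 2.0, 4.0, 6.0]

# Survival probability one year after the start, for progressively later starts.
one_year = [late_start_survival(1.0, s, n, t0) for s in starts]
at_start = [late_start_survival(0.0, s, n, t0) for s in starts]
```

Every truncated curve begins at 1, and for a shape parameter above 1 the one-year survival decreases as the start is shifted later, in line with the expectation that a later start worsens the prognosis. For n = 1 the start time has no effect, because the exponential distribution is memoryless.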
We start counting the elapsed time from $T_i$, by the time-shift in (12) [...] logarithmic dependence on the late start time $T_i$ in Figure 8. In reality, the real KM curve can be decomposed into at least two components, as shown in (2). An example is shown in Figure 9, where the disease is characterized by the same shape factor, and only the scale factor changes from 1 y (non-responding) to 10 y (responding). When the later start of the study changes linearly, we assume linearity of the decomposition factor too.
The form of Figure 9 shows [...] curve. The hypothetical curve fit to the KM is a WF when the study goal is TTE, so it would be continued to the complete end (all patients deceased or censored, no patients at risk). The finish times ($F_i$) define the PFRs at the actual points when $N$ patients were involved in the study: [...] Early-finished studies, in which a certain number of patients remain at risk, are shown by an example in Figure 10.
Studies finishing early show a slight shift in $t_0$ when they are elongated and the number of patients at risk decreases (Figure 11).

Discussion
Both independent Weibull parameters change with the staging inclusion criteria. Both the shape and the scale factors decrease when the treatment starts later, which is natural. With an unchanged shape factor $n$, the decrease of the scale factor is smaller than with a changing $n$.
Using (9) we get: [...] Expression (16) allows approximating the metabolic rate from the change of [...], where $SUV_0$ is the FDG uptake of the neighboring healthy tissue. The metabolic ratio, calculated by [...] In this way we could also approximate the basic survival curve when the PET is sensitive enough to measure cancer in situ lesions, supposing the time when the tumor starts to form in a microscopic region and its clusters are still undetectable with our present diagnostic methods.
The treatment of the chosen patient cohort is expected to change the KM of the active arm compared to the control arm, which is not treated with the same protocol and is formed from the same cohort. The changes of the KM in the active arm modify the WF fit, too. The change of the metabolic rate measured by SUV indicates the effect of the actual treatment. When the malignant tissue shows a lower metabolic rate (lower SUV ratio), the treatment is regarded as effective. A lower SUV corresponds to a longer scale parameter ($t_0$) according to (17). In the case of a successful treatment, the shape parameter ($n$) decreases, which "smooths" the probability of the event with a longer, heavier tail. The question is how the situation changes with the treatments in the study. The WF changes, of course, and the evaluation uses this change to compare it to the reference (control arm) WF. There are different parametric estimations for the result. The first attempt is always the median survival, which can leave the efficacy of the treatment undecided. This single parameter is not nearly enough to see the complete picture. It is possible that the treatment is effective without a change in the median of the KM while the distribution has a long tail, so that patients surviving beyond the median live longer (for example, Figure 13). This can happen when the mortality of the disease is very rapid and the development of the resistance induced by the treatment needs a longer time than the median survival.
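That the median alone can hide a treatment effect follows directly from the WF, whose median is t0 (ln 2)^(1/n). The sketch below builds two curves with the same median but different shape parameters. The median of 12 months and the shapes 2.0 and 0.8 are hypothetical values chosen only to illustrate the point.

```python
import math

def weibull_survival(t, n, t0):
    """Survival form of the WF: W(t) = exp(-(t/t0)**n)."""
    return math.exp(-((t / t0) ** n))

def scale_for_median(median, n):
    """Scale t0 that yields the requested median: median = t0 * (ln 2)**(1/n)."""
    return median / math.log(2.0) ** (1.0 / n)

median = 12.0  # months, illustrative
steep = (2.0, scale_for_median(median, 2.0))   # larger n: sharp drop after the median
tailed = (0.8, scale_for_median(median, 0.8))  # smaller n: long, heavy tail

at_median = [weibull_survival(median, n, t0) for n, t0 in (steep, tailed)]
at_three_medians = [weibull_survival(3 * median, n, t0) for n, t0 in (steep, tailed)]
```

Both curves pass through 0.5 at 12 months, yet at 36 months the survival is about 0.2% for the steep curve and about 19% for the long-tailed one, so a comparison of medians would miss the difference entirely.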
To decide on the efficacy, we must use an information parameter from the WF, an important parameter of a probability distribution: the Shannon entropy ($S_{Sh}$) [54], as discussed in the first part of this series [27]. The Shannon entropy measures the diversity of the probability density function (pdf), which in the case of the Weibull distribution is:

$S_{Sh} = \gamma \left(1 - \frac{1}{n}\right) + \ln\left(\frac{t_0}{n}\right) + 1$

where $\gamma \approx 0.5772$ is the Euler-Mascheroni constant. The information of $S_{Sh}$ is produced by a stochastic data source, such as the probability distribution of the survival time. In the simple formulation, it refers to the amount of uncertainty about an event associated with a given probability distribution. For the probability of survival, this directly means that decreasing entropy shows an increasing probability of death. The easiest way to decide the advantage of a treatment that changes the parameters of the WF is with this parameter, because the survival is better when $S_{Sh}$ is higher. This follows from the meaning of entropy: a larger entropy means less information and a higher uncertainty of death. Visualized on the pdf, the peak becomes more localized when $n$ grows, and its width shrinks with decreasing $t_0$; both make death more definite. A growing $n$ and a decreasing $t_0$ both decrease the entropy, making the certainty of death higher. In the case of [...] The entropy evaluation for the case shown in Figure 7 is presented in Figure 14. The lower chance of survival is shown well by the decrease of the entropy with late start times ($T_i$). This corresponds completely with the expectations: a later cancer diagnosis decreases the prognosed survival.
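The dependence of the entropy on the two Weibull parameters can be checked with a short calculation. The sketch below uses the standard closed form of the differential entropy of a Weibull density with shape n and scale t0, gamma (1 - 1/n) + ln(t0/n) + 1, and compares it with a direct numerical estimate. The function names and parameter values are assumptions for illustration, and the closed form may be written differently in the original equation.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant

def weibull_entropy(n, t0):
    """Closed-form differential entropy of a Weibull density (shape n, scale t0)."""
    return EULER_GAMMA * (1.0 - 1.0 / n) + math.log(t0 / n) + 1.0

def numerical_entropy(n, t0, steps=100_000):
    """Midpoint-rule estimate of -E[ln f(T)].
    Uses the quantile substitution x = (T/t0)**n = -ln(1 - p), p uniform on (0, 1)."""
    total = 0.0
    for k in range(steps):
        p = (k + 0.5) / steps
        x = -math.log1p(-p)
        # ln f = ln(n/t0) + (n - 1) ln(T/t0) - x, with ln(T/t0) = ln(x) / n
        total += math.log(n / t0) + (n - 1.0) / n * math.log(x) - x
    return -total / steps

closed = weibull_entropy(1.5, 12.0)
numeric = numerical_entropy(1.5, 12.0)
longer_scale = weibull_entropy(1.5, 24.0)
larger_shape = weibull_entropy(2.5, 12.0)
```

The closed form and the numerical estimate agree, the entropy grows with the scale t0, and it falls when the shape n grows. The derivative of the closed form with respect to n is (gamma - n)/n^2, so the decrease with growing n holds for n above about 0.58, which covers the usual range of fitted shape parameters.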
Interestingly, despite the more moderate decrease of the scale factor when the shape factor decreases in the optimal fit, the Shannon entropy shows an advantage for these optimal WF sets compared to the constantly fixed shape. The reason is that the patients with longer survival times are fit for a later start of the treatment and were selected by their other conditions, which are less hazardous than those of the others.
The Shannon entropy can be evaluated for late-start treatments (treatments in various stages of the tumor), such as those shown in Figure 9. The Shannon entropy for non-responding patients (group A) and for responding ones (group B) is shown in Figure 15. The decrease of the entropy shows the increasing certainty of events well.
The Shannon entropy decreases linearly with the number of patients at risk, due to the increasing certainty of death (Figure 16).
We assume that no extra comorbidity developed (or at least that it is controlled) over the elapsed time; consequently, we kept the original two parameters (shape and scale) unchanged, regarding the same cohort of participating patients; only their study started at different $F_i$ times. When we account for developing comorbidities, both parameters of the WF change in a direction such that $S_{Sh}$ decreases, indicating a higher certainty of the event.

Conclusion
We discussed a method of data mining from the single-arm clinical study with- [...] for the KM curves in Figure 9). [...] changes of the two independent parameters of the cumulative Weibull distribution with the study design, namely their dependence on the inclusion criteria (staging) and the intended endpoint (finishing). We have shown that various studies with different inclusion and exclusion criteria and different endpoints can be well described by the decomposition method. The fit of these results to real studies in clinical applications will be shown in the next part of this series of articles.