We start by analyzing stochastic dependence in the classic bivariate normal density framework. We focus on the way the conditional density of one of the random variables depends on realizations of the other. In the bivariate normal case this dependence takes the form of a parameter (here the “expected value”) of one probability density depending continuously (here linearly) on realizations of the other random variable. The point is that such a pattern need not be restricted to the classical bivariate normal case. We show that this paradigm can be generalized and viewed in ways that allow one to extend it far beyond the class of bivariate or multivariate normal probability distributions.

This paper can be viewed as an extension of our previous work (Filus and Filus [

It is well known that, among existing multivariate probability distributions, only a few classes are widely and successfully applied in practical stochastic modeling. Typically, the underlying random variables are assumed either to be independent or to have an “approximately Gaussian” bivariate or multivariate distribution. Normality is often assumed even when the corresponding data hardly agree with that mathematical model (showing asymmetry, for example). On the other hand, of all the multivariate distributions used in applications, the normal seems to be “the best”. The reason is that Gaussian models capture the stochastic relationship between their marginal random variables (mainly through a regression function) in the most natural way. We first analyze and interpret the specific way the multivariate normal density of the random vector _{j} of the density of X_{j} to be any continuous function _{j}, given

Pursuing this method successively for _{j}, σ_{j}) density of X_{j} into the conditional density by setting a new value _{j}” to be the following (linear regression) function:_{j} but also any other parameter of any probability density to define the corresponding conditional distributions. This is the essence of the so-called parameter dependence method. Specifically, our task in this paper is to show more closely how this method relates to the construction of the multivariate Gaussian model.

In Section 2, we analyze the stochastic dependences between the marginal random variables of the bivariate normal in order to point out the original version of the parameter dependence pattern, which is next extended to other constructed bivariate probability densities. The explanation as well as the example of applications of the bivariate normal is different from that in Filus and Filus [

In Section 7, we point out that the “method of parameter dependence” is also used in other areas of reliability theory, in situations different from those we consider. One such area is accelerated life testing theory, where the dependence of a life-time distribution’s parameter on a given (high) stress is investigated.

Another (fairly new) area is the “load optimization theory” sometimes associated with the load sharing phenomena analysis that we sketch in Subsection 7.2. The differences between these approaches and our theory are pointed out in 7.2.

We start with the following situation. Suppose the normally distributed random variable X_{2} describes an attribute of a physical or biological object, say u. Consider the (stochastic) behavior of the object u in two distinct “physical” situations. In the first situation, u is exposed to some random stress whose magnitude is described by a normally distributed random variable X_{1}. In the second situation we assume that no such stress is present or that the stress takes on a fixed predetermined value. The usual task here is to determine the joint distribution of X_{1}, X_{2}. Let the densities of X_{1}, X_{2} be normal, i.e., _{2} – kσ_{2}) is positive for at least k = 3, in order to assure approximate positivity of the normal life-time X_{2}].

Imagine the following fictitious experiment whose goal is to establish the possible stochastic impact of a medication’s dose change on some cancer treatment results. Suppose a person of a certain fixed age was diagnosed with a kind of cancer. Assume that one of the significant characteristics of that kind of cancer is a tumor with a size X_{2}. During a given time period T after the patient was diagnosed, a specific medication was administered. Also suppose that this medication was routinely administered in the past, and that the average dose is estimated (or fixed) to be m_{1} milligrams per kilogram of body weight daily. Assume that, originally, the known (either measured or estimated) average size of the tumor is, say, m_{2} millimeters and after the period T of treatment the tumor size X_{2} is measured again and its negative or positive increment _{1} of the medication administered.

We assume that the goal of the underlying experiment is to make a prediction on effect _{1} may be justified when only “historical data” are analyzed and then extrapolated to a larger population of cases not yet recorded. In the case of extrapolation of historical data to a larger population, we assume that the only information one possesses on the dose X_{1} applied in the past is its probability distribution, which is the Gaussian _{1}, σ_{1}. Also, for _{2} (after the treatment) is assumed to be a random variable having a normal _{1}. For any other applied dose X_{1} = x_{1}, the value (x_{1} – m_{1}) associated with a single patient statistically affects the change in the tumor size X_{2} – m_{2}, i.e., the treatment result. The word “statistically” here means that the impact of a nonzero quantity (x_{1} – m_{1}) (“the dose is not the standard one”) on the (former) probability density N(m_{2}, σ_{2}) of the tumor size X_{2} is realized by changing the value of the mean m_{2} rather than by directly changing the numerical value x_{2} of X_{2}.

If we are interested in finding the joint probability distribution of X_{1}, X_{2}, it is enough to determine the conditional density g_{2}(x_{2}|x_{1}) of X_{2}|X_{1}, since the marginal density of X_{1} does not change.

In accordance with the “linear regression rule”, the dependence of the (new) expected value m_{2}(x_{1}) of X_{2} on the event X_{1} = x_{1} is determined by the familiar functional relationship:

$$ m_2 \;\longrightarrow\; m_2(x_1) = m_2 + a\,(x_1 - m_1), \qquad (1) $$

where a = r(σ_{2}/σ_{1}) and r is the (linear) correlation coefficient of the variables X_{1}, X_{2}.

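This approach leads directly to the conditional density of the random variable X_{2} given any realization X_{1} = x_{1}. It is a well-known fact that the conditional density g_{2}(x_{2}|x_{1}) is, again, normal and

$$ g_2(x_2 \mid x_1) = \frac{1}{\sigma_2\sqrt{1-r^2}\,\sqrt{2\pi}} \exp\left\{ -\frac{\left[x_2 - m_2 - a\,(x_1 - m_1)\right]^2}{2\sigma_2^2\,(1-r^2)} \right\}, \qquad (2) $$

i.e., the normal density in x_{2} with mean m_{2} + a(x_{1} – m_{1}) and standard deviation σ_{2}√(1 – r^{2}).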

The joint density g(x_{1},x_{2}) of the random variables X_{1}, X_{2} is given by the usual arithmetic product g_{2}(x_{2}|x_{1})g_{1}(x_{1}) .
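The product rule above can be checked numerically. The following minimal Python sketch multiplies the normal marginal of X_{1} by the normal conditional of X_{2} (mean shifted linearly with slope a = r(σ_{2}/σ_{1}), standard deviation σ_{2}√(1 – r²)) and compares the result with the textbook closed form of the bivariate normal density. The function names and the numerical parameter values are illustrative assumptions, not taken from the paper.

```python
import math

def normal_pdf(x, mean, sd):
    """Density of N(mean, sd), where sd is the standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def joint_by_product(x1, x2, m1, s1, m2, s2, r):
    """g(x1, x2) = g2(x2 | x1) * g1(x1) under the linear regression rule."""
    a = r * s2 / s1                          # regression slope
    cond_mean = m2 + a * (x1 - m1)           # mean of X2 given X1 = x1
    cond_sd = s2 * math.sqrt(1.0 - r * r)    # sd of X2 given X1 = x1
    return normal_pdf(x2, cond_mean, cond_sd) * normal_pdf(x1, m1, s1)

def bivariate_normal_pdf(x1, x2, m1, s1, m2, s2, r):
    """Textbook closed form of the bivariate normal density."""
    u = (x1 - m1) / s1
    v = (x2 - m2) / s2
    q = (u * u - 2.0 * r * u * v + v * v) / (1.0 - r * r)
    return math.exp(-0.5 * q) / (2.0 * math.pi * s1 * s2 * math.sqrt(1.0 - r * r))

# Made-up values: dose (mean 2.0, sd 0.5), tumor size (mean 30, sd 4), r = -0.6.
params = dict(m1=2.0, s1=0.5, m2=30.0, s2=4.0, r=-0.6)
p_product = joint_by_product(2.3, 28.0, **params)
p_closed = bivariate_normal_pdf(2.3, 28.0, **params)
```

The two values agree up to floating-point rounding, which is the sense in which the conditional density fully encodes the dependence of X_{2} on X_{1}.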

In the example above, one can reinterpret the “response random variable” X_{2} to be for example the patient’s “residual life-time”, or blood pressure, or level of some important chemical in the blood (such as cholesterol). In such cases the mathematics of the problem would remain the same.

Note the obvious fact that the tumor size X_{2} does not have a physical influence on the medication dose X_{1} so that the original marginal pdf g_{1}(x_{1}) remains the same. However, the stochastic dependence between X_{1}, X_{2} is mutual, since, in general, g_{1}(x_{1}|x_{2}) ≠ g_{1}(x_{1}).

It is well known that the actual problem with the bivariate normal density construction is to get to the conditional density (2), which fully represents the underlying stochastic dependence of random variable X_{2} on X_{1}.

Our claim is that the above paradigm for the stochastic dependence (characteristic for the bivariate Gaussians) can be extended to other classes of bivariate and multivariate distributions (see Filus and Filus [

Historically, people relied on the nice symmetry in the stochastic dependence of X_{1} and X_{2} when using their joint bivariate normal distribution. This kind of symmetry (i.e., both marginal and both conditional distributions are normal and both regression functions are linear) can only be achieved with the linear regression functions as described above (1). However, are linear regression functions really the only functions that one can successfully apply within this framework? Assuming that the function m_{2}(X_{1}) is any continuous function in X_{1}, one obtains a wide and interesting extension of the class of bivariate normal densities. We called this class FF-normal (previously named “pseudonormal”, see [_{2} can as well become a continuous function of the stress X_{1} (X_{1} may have a “stress” interpretation in a very wide sense). This stress may change the parameter σ_{2} of the (normal) density of, say, the “life-time” X_{2} into another value _{1} of the random variable X_{1}. The price for such a wide generalization is the loss of the symmetry mentioned above (the marginal of X_{2} ceases to be normal), but the gains are considerable. In any case, the bivariate normal remains a special case of the FF-normals.

With the bivariate FF-normal densities of (X_{1}, X_{2}) we can use general continuous m_{2}(x_{1}) and σ_{2}(x_{1}) functions, and, performing similar calculations as above, we find, rather surprisingly, that g_{2}(x_{2}|x_{1}) is once more a regular normal density in x_{2}.
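As a hedged illustration of this construction, the sketch below builds an FF-normal joint density from a normal marginal of X_{1} and a normal conditional of X_{2} whose mean and standard deviation are continuous functions of x_{1}. The particular functions `m2_of` and `s2_of` and all numerical values are hypothetical choices made for the example only. Integrating x_{2} out numerically recovers the unchanged marginal of X_{1}, as expected from the product form.

```python
import math

def normal_pdf(x, mean, sd):
    """Density of N(mean, sd), where sd is the standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

# Baseline parameters (illustrative values only).
M1, S1 = 0.0, 1.0
M2, S2 = 10.0, 2.0

def m2_of(x1):
    """Hypothetical nonlinear regression function for the mean of X2."""
    return M2 + 0.8 * (x1 - M1) + 0.5 * (x1 - M1) ** 2

def s2_of(x1):
    """Hypothetical positive, continuous function for the sd of X2."""
    return S2 * math.exp(0.3 * (x1 - M1))

def ff_normal_pdf(x1, x2):
    """Joint density g(x1, x2) = g1(x1) * g2(x2 | x1)."""
    return normal_pdf(x1, M1, S1) * normal_pdf(x2, m2_of(x1), s2_of(x1))

def integrate(f, lo, hi, n=4000):
    """Composite trapezoidal rule on [lo, hi] with n subintervals."""
    h = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, n):
        total += f(lo + i * h)
    return total * h

def marginal_x1(x1):
    """Integrate x2 out of the joint density over mean +/- 10 sd."""
    c, s = m2_of(x1), s2_of(x1)
    return integrate(lambda x2: ff_normal_pdf(x1, x2), c - 10 * s, c + 10 * s)

check_points = [-1.5, 0.0, 2.0]
recovered = [marginal_x1(x) for x in check_points]
expected = [normal_pdf(x, M1, S1) for x in check_points]
```

For each fixed x_{1} the conditional factor is an ordinary normal density in x_{2}; only the way its parameters move with x_{1} departs from the classical linear rule.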

Consider now the following situation with a bivariate FF-normal distribution in which the “physical” interpretation of the underlying random variables can now be more general than above. Let u_{1}, u_{2} be two objects (or phenomena) which are characterized by the random variables X_{1}, X_{2} respectively. If the objects are physically separated then the random variables X_{1}, X_{2} are assumed to be independent, having normal pdfs _{1} physically impacts u_{2}), then the corresponding joint FF-normal density g(x_{1}, x_{2}) of the random vector (X_{1}, X_{2}) is given by the usual product formula:

For the conditional density of X_{2}|x_{1} we have:

The functions _{2}, σ_{2} of X_{2}’s density.

More explicitly, one obtains the bivariate FF-normal pdf in the form:

where

In particular, one may consider the “nonlinear regression function”

with arbitrary real parameters

Realize that in the case A = 0 and

Generally speaking, the essence of the construction method is that for any pair of (“initially independent”) random variables X_{1} and X_{2} with given probability densities _{1}. This means that when X_{1} = x_{1} we may assume that _{1}” distribution of the random variable X_{2} (given the event X_{1} = x_{1} occurred, with probability density g_{1}(x_{1}) ).

Then the joint density of the pair (X_{1}, X_{2}) is always the product g(x_{1}, x_{2}) = g_{1}(x_{1}) g_{2}(x_{2}|x_{1}).

This situation is especially natural if we consider X_{2} to be the life-time of an object and X_{1} is the stress put on it.

Roughly, one can say that the method presented above for constructing bivariate distributions is an extension of the method used in the construction of the bivariate normal.

Consider a 2-component (say u_{1}, u_{2}) parallel system reliability setting in which X_{1}, X_{2} represent the components’ life-times (see Barlow and Proschan [_{1}(x_{1}) and g_{2}(x_{2}) respectively), are stochastically independent. When the two components are installed into the system, they start to interact. Assume that during that interaction some irregularities in the work of component u_{1} cause corresponding changes in u_{2}’s inner physical structure. This increases the hazard rate of that physically affected component u_{2}. Such physical phenomena are then “responsible” for the occurrence of stochastic dependence in the “in-system” component life-times X_{1} and X_{2}.

One can also imagine this situation as follows. During the two components’ “in-system” performance, component u_{1} creates a situation in which component u_{2} is “constantly bombarded” by a string of harmful “micro-shocks” (see Filus and Filus [_{2}’s physical constitution. We also assume that these micro-damages in component u_{2}’s inner physical structure “cause” some corresponding “micro-changes in the original (baseline) failure rate” (and, in parallel, in the corresponding probability distribution) of its life-time X_{2}. After a, possibly long, time period X_{1} of such interaction all these micro-damages cumulate their effects. As a result of this accumulation, the overall change in the corresponding “hazard rate function” will become significant. To describe formally the change in the hazard rate function we have chosen to consider corresponding changes in its parameter(s). In what follows we present a particular bivariate model for a 2-component system reliability which we called FF-Weibullian (formerly “pseudo-Weibullian” in Filus and Filus [

Suppose that, in “laboratory conditions”, the lifetimes X_{1} and X_{2} of the components u_{1} and u_{2} are independent random variables distributed according to Weibull densities.

Let _{k} (k = 1, 2).

Here, for k = 1, 2, we have the “vector parameter”

Next consider the components u_{1} and u_{2} as acting within the system. Let the resulting (changed) values_{2}(x_{2}; l_{2}, a_{2}) be determined by the following continuous functions of x_{1}:

One then obtains the wide class of bivariate FF-Weibullian densities:

where, for ease of computation, we recommend applying as a “sub-model” the following family of “parameter functions”:

In particular, s may depend on x_{1}.

Another analytically interesting “sub-model” is given by:

Note that both factors g_{1}(x_{1}) and g_{2}(x_{2}|x_{1}) of the joint density g(x_{1}, x_{2}) given by (3) are Weibullian densities. In particular, g_{2}(x_{2}|x_{1}) is Weibullian with respect to the argument x_{2} alone.
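A minimal simulation sketch of this two-factor structure follows. It assumes the Weibull parametrization g(x; λ, α) = λ α x^{α–1} exp(–λ x^{α}) for x > 0 and a hypothetical parameter function `lam2_of` in which the scale parameter of X_{2} grows linearly with x_{1}. Neither the parametrization nor the parameter function is taken from the paper's formulas; both are labeled assumptions for illustration.

```python
import math
import random

def weibull_pdf(x, lam, alpha):
    """Assumed form: lam * alpha * x**(alpha - 1) * exp(-lam * x**alpha), x > 0."""
    if x <= 0.0:
        return 0.0
    return lam * alpha * x ** (alpha - 1.0) * math.exp(-lam * x ** alpha)

def weibull_draw(rng, lam, alpha):
    """Inverse-transform draw: solve exp(-lam * x**alpha) = u for x."""
    u = 1.0 - rng.random()           # u lies in (0, 1], so log(u) is finite
    return (-math.log(u) / lam) ** (1.0 / alpha)

# Baseline ("laboratory") parameters, illustrative values only.
LAM1, ALPHA1 = 1.0, 2.0
LAM2, ALPHA2 = 0.5, 1.5

def lam2_of(x1, c=0.2):
    """Hypothetical parameter function: longer interaction x1 raises the hazard of u2."""
    return LAM2 * (1.0 + c * x1)

def ff_weibull_pdf(x1, x2):
    """Joint density g(x1, x2) = g1(x1) * g2(x2 | x1), both factors Weibull."""
    return weibull_pdf(x1, LAM1, ALPHA1) * weibull_pdf(x2, lam2_of(x1), ALPHA2)

def sample_pair(rng):
    """Draw X1 from its marginal, then X2 from the conditional given X1 = x1."""
    x1 = weibull_draw(rng, LAM1, ALPHA1)
    x2 = weibull_draw(rng, lam2_of(x1), ALPHA2)
    return x1, x2

rng = random.Random(0)               # fixed seed for reproducibility
pairs = [sample_pair(rng) for _ in range(5)]
```

Sampling in this order (marginal first, then conditional) mirrors the way the joint density is defined, so no rejection step or numerical inversion of the joint distribution is needed.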

For the simpler FF-exponential example, see [

A parallel and basically independent path of investigation, which also has its roots in the bivariate normal distribution’s dependence paradigm, is present in the literature under the key word “conditioning”.

This method, used in the construction of numerous multivariate probability distributions, was extensively developed mostly since around 1987. See, for example, Arnold, Castillo and Sarabia [

The underlying method (called the “conditioning method” by numerous authors) relies on imposing the conditional structures X|Y and Y|X on “baseline” probability densities f(x; A) and g(y; B), given in advance, of some (“initially independent”) random variables X and Y respectively, where A and B are scalar or vector parameters. The two conditional densities are defined as we did above, i.e.

g(y|x) = g(y; B(x)) and f(x|y) = f(x; A(y))

where A(y) and B(x) are continuous functions of realizations of the random variables Y, X respectively.

In this case the task is to find two proper (unknown) marginal densities for the bivariate probability distribution of (X, Y) which are, as a rule, not unique and sometimes do not exist.

Despite the similarities, this method differs essentially from ours. In our case, instead of the two conditional densities g(y|x) and f(x|y), we define only one, say g(y|x), together with the marginal f(x).

Proceeding this way, we always directly obtain a unique model, simply as the product of the two (known) densities.

In this way, we have obtained a wide class of bivariate densities which is essentially disjoint from the class obtained by the alternative method. The physical interpretation of the conditional densities so defined also differs between the two approaches. However, both approaches serve the same purpose, which is to extend the paradigm of the bivariate normal in stochastic modeling. Nevertheless, with the conditioning method it is very difficult to construct multivariate distributions of dimension higher than two.

In practice that method reduces to the bivariate case, while the method we present makes it remarkably easy to construct probability distributions of arbitrary finite dimension. Namely, there is a recurrence procedure that allows one to construct any j-dimensional pdf based on the corresponding (j – 1)-dimensional pdf.

The next section is devoted to the construction of multivariate distributions for any arbitrary finite dimension.

For the construction mentioned in the title, we successively use the simple recurrence method that yields the j-dimensional probability density, given the (j – 1)-dimensional one. Note that (for j = 3) we have already defined the 2-dimensional densities g^{2}(x_{1}, x_{2}) by means of the products g_{1}(x_{1})g_{2}(x_{2}|x_{1}), where each underlying conditional density was given by g_{2}(x_{2}|x_{1}) = g_{2}(x_{2}; θ_{2}(x_{1})).

So the “first step” is already done. Suppose now that we have at our disposal the (j – 1)-dimensional (j ≥ 3) pdf, say _{j} is a constant parameter, we define the conditional pdf

Now, the “new” value

The j-dimensional pdf of the random vector

The latter pdf becomes the basis for identical construction of the (j + 1)-dimensional pdf and so on.
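The recurrence can be sketched in a few lines of Python: each step multiplies the (j – 1)-dimensional joint density by a conditional density of X_{j} whose parameters depend on the earlier coordinates. The helper `extend` and the conditional densities `cond2` and `cond3` are hypothetical names and choices introduced only to illustrate the pattern.

```python
import math

def normal_pdf(x, mean, sd):
    """Density of N(mean, sd), where sd is the standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def extend(joint_prev, cond_pdf):
    """Build the j-dimensional density from the (j-1)-dimensional one.

    cond_pdf(xj, previous) is the conditional density of X_j given the
    realizations `previous` of X_1, ..., X_{j-1}.
    """
    def joint(xs):
        *previous, xj = xs
        return joint_prev(previous) * cond_pdf(xj, previous)
    return joint

def g1(xs):
    """One-dimensional starting density of X_1."""
    return normal_pdf(xs[0], 0.0, 1.0)

def cond2(x2, prev):
    """Hypothetical conditional: the mean of X_2 depends linearly on x_1."""
    return normal_pdf(x2, 1.0 + 0.5 * prev[0], 1.0)

def cond3(x3, prev):
    """Hypothetical conditional: mean and sd of X_3 depend on (x_1, x_2)."""
    return normal_pdf(x3, prev[0] * prev[1], 1.0 + 0.1 * abs(prev[0]))

g2 = extend(g1, cond2)     # 2-dimensional density
g3 = extend(g2, cond3)     # 3-dimensional density
value = g3([0.2, 1.0, -0.3])
```

Because every step only multiplies by one more known conditional density, the dimension can be increased indefinitely without solving for any unknown marginal.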

We then stop the procedure once j + 1 = m, where m is the total dimension of the considered (maximal) random vector, say,

Since the analogy with the construction of each j-dimensional normal pdf

where the random variables ^{j}. The symbol^{T} denotes the usual matrix transpose. Recall that every matrix A can be decomposed as

where B is a lower triangular and M is an orthogonal matrix. From (5) we obtain that any nonsingular lower triangular matrix B can be represented as the product:

where A is an arbitrary nonsingular matrix. If we replace representation (4) of the random normal vector X by the following representation

then we replace the arbitrary random vector X by an arbitrary “triangular” random vector Y related to X by:

where M^{T} is an arbitrary orthogonal transformation.

Since the two zero-expectation random vectors X – µ, Y – µ are obtained one from another by an isometry (here, rotation) M^{T} in the Euclidean space R^{j}, they may be considered as representing the same “stochastic data” expressed in two different (but still rectangular) coordinate systems. So from a stochastic viewpoint the “difference” between the random vectors X and Y is inessential and we can consider the random vector Y as an “arbitrary normal” (“with accuracy to the rotation” M^{T}).
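The rotation argument can be illustrated in two dimensions. The sketch below assumes the decomposition is taken in the form A = M B, with the orthogonal factor M on the left and B lower triangular, so that the centered vectors satisfy Y – µ = M^{T}(X – µ). This ordering is an assumption made here to match the isometry described above. The helper names and the matrix entries are hypothetical.

```python
import math

def matmul(p, q):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(p):
    """Transpose of a 2x2 matrix."""
    return [[p[j][i] for j in range(2)] for i in range(2)]

def matvec(p, v):
    """Product of a 2x2 matrix and a 2-vector."""
    return [p[0][0] * v[0] + p[0][1] * v[1], p[1][0] * v[0] + p[1][1] * v[1]]

def rotate_to_lower_triangular(a):
    """Return (M, B) with A = M B, M a rotation, B lower triangular (2x2 only)."""
    norm = math.hypot(a[0][1], a[1][1])      # nonzero when A is nonsingular
    c, s = a[1][1] / norm, -a[0][1] / norm
    mt = [[c, s], [-s, c]]                   # M^T, chosen so (M^T A)[0][1] = 0
    b = matmul(mt, a)                        # B = M^T A
    return transpose(mt), b

A = [[2.0, 1.0], [0.5, 3.0]]                 # arbitrary nonsingular matrix
M, B = rotate_to_lower_triangular(A)

z = [0.3, -1.2]                              # one realization of the standard vector Z
x_centered = matvec(A, z)                    # X - mu = A Z
y_centered = matvec(B, z)                    # Y - mu = B Z
rotated = matvec(transpose(M), x_centered)   # M^T (X - mu)
```

The vectors `rotated` and `y_centered` coincide, and both have the same Euclidean length as `x_centered`, which is the sense in which X and Y carry the same stochastic data in two rectangular coordinate systems.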

Collecting all the above, we will consider the normal random vector Y, given by (7), where matrix B is any lower triangular matrix and Z is the standard normal j-vector. Write (7) in the form:

where m ≥ j is the actual dimension of the constructed (final) random vector, say

Considering the first –1 lines in (7*) as a system of linear equations, one obtains all

Realize that transformation (9) is easily reversible.

Assuming that realizations

where from the above assumed nonsingularity we have c_{kk} ≠ 0. From (10) it follows that the conditional density of each Y_{k}, given the values

while for the (constant) conditional variance we obtain

To adapt the above procedure to our concept of “baseline” T_{j} versus “in system” Y_{j} random variables, replace in (9) the independent standard random variables _{k} has the (“baseline”) normal N(µ_{k}, s_{k}) pdf

Replace transformation (7) by

where

This yields the conditional pdf of

Finally, the general pattern of “creation” of any successive j-variate normal pdf can be explained as follows.

Given are the first j – 1 lines of transformation (11) in the form:

for some_{1}, |c_{11}|s_{1}) density of the variable Y_{1}.)

We may assume that the next baseline random variable T_{j}, originally having the N(µ_{j}, s_{j}) pdf, is incorporated into the “system” by transforming

This transformation is thought of as adding to (12) the following j-th line:

[“Physically” this could mean that the variables _{j} obtained from T_{j} that (originally) was independent from these stresses].

From (13) one can determine the conditional pdf of Y_{j}, given any realization _{j}:

Thus, when the j-th “object” (originally independent of the “system” and characterized by the random quantity T_{j}) is “put into the system”, the quantity T_{j} turns into the quantity Y_{j} and, in parallel, the parameters µ_{j} and s_{j} of its normal density are turned into

Clearly, the new value

of realizations _{jj}. This can be “made up” if we allow the number c_{jj} in (13) to be dependent on

As we have shown, also in the multivariate case the origin of the “parameter dependence method for the construction” lies in the construction of the multivariate normal distributions. Recall that having defined the conditional pdf ^{j}^{ }) multivariate normal.

Preserving the general spirit of the multivariate normal pdf derivation, let us extend all the Equations (13) for _{j}_{,}_{j} by any continuous function of the same variables. Now, for any

where Φ_{j}() and Ψ_{j}() are arbitrary continuous functions and

From (13*) we obtain its inverse:

and then for each observation

It is clear that the sequence of the densities (14) _{1}(y_{1}) of Y_{1} uniquely determines the m-variate FF-normal pdf

However, the marginal pdfs of

The main conclusion that follows from the considerations in Sections 6.2 - 6.4 may be stated as follows: there is a generic relationship that associates the construction method of the parameter dependence with the stochastic dependence structure present within the multivariate normal distribution of any dimension.

As an example of this relationship realize that the transformations (13) and (13*), when applied to the independent normal random variables_{j} is dropped.

Let now

Applying to the random vector

where the latter is the two parameter exponential density with respect to y_{j} for

Another interesting case of the m-variate FF-Weibullian pdf can be obtained by applying transformations (13*) to m independent Weibullian random variables. One obtains an even more general class of FF-Weibullians using the pseudopower transformations (see Filus and Filus [

All these distributions (including the m-variate normal) can also be obtained by direct use of the “parameter dependence pattern”, which produces more m-variate models than the transformations considered above. On the other hand, the existence of the defining transformations facilitates the underlying statistical analysis and simulations.

Some paradigms applied in the reliability literature are exactly those of the “parameter dependence” that we describe in this paper. However, in most cases they are not directly related to the problem of constructing multivariate probability distributions (and so they also differ from the “conditioning” procedures in [

When testing the life times of some high-reliability products, the stresses usually encountered, such as temperature, humidity, or voltage, are sometimes kept at significantly higher than usual levels in order to make the life times shorter than they are under normal conditions. The data so obtained (a “sample”) are then extrapolated to the (hypothetical) life times that would be obtained under the regular values of the stresses. Rules that associate the products’ life times with the values of the stresses applied are necessary for performing proper extrapolations. Several such rules, typically known as the Arrhenius or Eyring (see, Meeker and Escobar [

Unfortunately, with this method simplicity often comes at the cost of inaccurate predictions. Other methods apply the “Proportional Hazards Relationships”, also known as the Cox model (see Cox [

More recently ([

In what follows we discuss the differences.

1) The generality of the “parameter dependence theory” we build in this paper is significantly higher than that of the very special case applied in accelerated life testing theory. There are three reasons for that.

Firstly, in our approach the subject of constructing conditional probability distributions is not limited to the life testing, and not even to the “stress-life time” pattern only. The range of applications of our theory is very wide, including many biomedical (see, Collett [

Secondly, in the paradigm we consider, the relation between a parameter and stress (or any other random quantity) is given by an arbitrary continuous function, while the number of such functions applied in accelerated life testing is very limited. Actually, the functions are restricted to a few “models” such as the Arrhenius, Eyring, inverse power law, log-linear, and not many more (see, for example, the Eyring-Weibull model in [

Our idea is to omit the complicated physical or chemical phenomena, which are often poorly understood, and to apply a two-step, purely empirical approach.

Speaking roughly, the first step is an “educated guess” (for the choice of a proper function) and the second is a statistical verification of this guess.

Thirdly, in our theory we may consider an arbitrary parameter of an arbitrary probability distribution as stress dependent, while, to our knowledge (see, for example, [

2) Besides the generality (of the constructed conditional distributions), our concept also differs with regard to its purpose. Namely, independently of the construction of conditional distributions, we also have the construction of bivariate and multivariate probability distributions such as the FF-normal, FF-exponential, FF-Weibullian, FF-gamma and others (for comparison with similar “conditioning methods” of construction present in the literature, see Section 5).

The construction of high dimension multivariate distributions based on parameter dependence can easily be extended to Markovian and non-Markovian (still simple!) stochastic processes (see Filus and Filus [

1) Besides accelerated life testing, another area where the “parameter dependence paradigm” is applied is a set of problems centered around the notion of “load optimization” (see Filus [

2) A similar application of the parameter dependence pattern also occurs when “load sharing phenomena” take place. Suppose that we have a parallel system supporting a load, such as a multi-engine aircraft or two electric power lines. Failure of any of the system’s components may cause the total load to be redistributed among fewer components, so that the load on each of them increases by some predictable value. Now we may encounter either the load optimization problem [

Remark. As a final remark, let us mention the relationship between the parameter dependence presented in this paper and the stochastic dependence based on models initiated in 1961 by Freund [

In contrast, in the models we introduce, the component interactions take place only while the components work. Any failure of a system component stops its influence on the remaining components’ life times. Therefore, the two paradigms, “Freund’s load sharing” and our “parameter dependence”, are “disjoint” and in a sense “complementary”. In reality, both (physical) phenomena may take place at the same time, and it seems quite possible that stochastic models (i.e., multivariate probability distributions) obeying both paradigms will be constructed in the future.

Nevertheless, we stress the generic relation of all the multivariate probability distributions based on the parameter dependence with the multivariate Gaussians.