Testing the equality of the means of two normally distributed random variables when their variances are unequal is known in the statistical literature as the “Behrens-Fisher problem”. It is well known that the posterior distributions of the parameters of interest are the primitives of Bayesian statistical inference. For routine implementation of statistical procedures based on posterior distributions, simple and efficient approaches are required. Since computing the exact posterior distribution in the Behrens-Fisher problem requires numerical integration, several approximations are discussed and compared. Tests and Bayesian highest posterior density (H.P.D.) intervals based upon these approximations are discussed. We extend the proposed approximations to tests of parallelism in simple linear regression models.

Suppose that $x_1$ and $x_2$ are two independent normal random variables with means $\mu_1$ and $\mu_2$, and variances $\sigma_1^2$ and $\sigma_2^2$, respectively. Samples of sizes $n_1$ and $n_2$ drawn from the corresponding populations are denoted by $x_{ij}$ ($i = 1, 2$ and $j = 1, 2, \cdots, n_i$). It is desired to test the hypothesis $H_0: \mu_1 = \mu_2$ when it cannot be taken as known that the variance ratio $\lambda = \sigma_1^2 / \sigma_2^2$ is one. This problem, known as the Behrens-Fisher (BF) problem, has been investigated by many authors, each of whom has proposed a particular solution. It is not the purpose of this paper to list or survey these solutions. Rather, the Bayesian approach to this problem, viewed as one of the most fascinating approaches to statistical inference on the means of heterogeneous normal populations, will be the main focus of this paper. We shall also give special attention to the problem of testing parallelism of two linear regression lines when the variances of the error terms are not equal.

The exact Bayesian solution to the BF problem was given by Jeffreys [

The paper has two chief objectives. First, we present a comparison among several approximations to the tails of the posterior distribution of the variate $U = \mu_1 - \mu_2$, where $\mu_j$ is the mean of the $j$th population. Second, we extend the methodologies to address the question of parallelism, or equality of slopes, when the variances of the error terms of the two regression lines are not equal. Recommendations will follow the examples in the last sections.

Based on the sample outcomes, the usual sufficient statistics are defined as follows:

$$\bar{x}_i = \sum_{j=1}^{n_i} x_{ij} / n_i \quad \text{and} \quad \Sigma_i = (n_i - 1) s_i^2 = \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2, \quad i = 1, 2.$$

The likelihood function of the combined data from two samples drawn independently from two normal populations with means $\mu_1$ and $\mu_2$, and variances $\sigma_1^2$ and $\sigma_2^2$, respectively, is proportional to:

$$\left(1/\sigma_1\right)^{n_1} \left(1/\sigma_2\right)^{n_2} \exp\left\{-\frac{1}{2}\left[\frac{\Sigma_1 + n_1(\bar{x}_1 - \mu_1)^2}{\sigma_1^2} + \frac{\Sigma_2 + n_2(\bar{x}_2 - \mu_2)^2}{\sigma_2^2}\right]\right\}. \tag{2.1}$$

Following Lee [

$$\Pi(\mu_1, \mu_2, \sigma_1, \sigma_2)\, d\mu_1\, d\mu_2\, d\sigma_1\, d\sigma_2 \propto \frac{d\mu_1\, d\mu_2\, d\sigma_1\, d\sigma_2}{\sigma_1 \sigma_2} \tag{2.2}$$

From Box & Tiao [

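$$\Pi(\mu_1, \mu_2, \sigma_1, \sigma_2 \mid x) \propto \sigma_1^{-(n_1 + 1)} \sigma_2^{-(n_2 + 1)} \exp\left\{-\frac{1}{2}\left[\frac{\Sigma_1 + n_1(\bar{x}_1 - \mu_1)^2}{\sigma_1^2} + \frac{\Sigma_2 + n_2(\bar{x}_2 - \mu_2)^2}{\sigma_2^2}\right]\right\} \tag{2.3}$$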

Integrating $\sigma_1$ and $\sigma_2$ out of (2.3), the marginal posterior density of $U = \mu_1 - \mu_2$ is shown to be

$$\Pi(u \mid x) \propto \int_{-\infty}^{\infty} \left[1 + \frac{n_1 (y + u - \bar{x}_1)^2}{\Sigma_1}\right]^{-\frac{n_1}{2}} \left[1 + \frac{n_2 (y - \bar{x}_2)^2}{\Sigma_2}\right]^{-\frac{n_2}{2}} dy = \int_{-\infty}^{\infty} \Pi(u \mid y, x) \cdot \Pi(y \mid x)\, dy \tag{2.4}$$

The conditioning in Equation (2.4) is on the data vector $x$.

Equation (2.4) was obtained by Jeffreys [

1) The conditional posterior pdf for $U$, given $\mu_2$, is the univariate Student's $t(n_1 - 1)$, with mean $\bar{x}_1 - \mu_2$ and variance $\Sigma_1 / (n_1 (n_1 - 3))$, i.e.,

$$\Pi(u \mid \mu_2, x) \propto \left[1 + \frac{n_1 (u - (\bar{x}_1 - \mu_2))^2}{\Sigma_1}\right]^{-\frac{n_1}{2}}. \tag{2.5}$$

2) The marginal posterior pdf for the variance ratio $\lambda$ is such that $(s_2^2 / s_1^2)\lambda$ has the well-known Snedecor's F-distribution with $(n_2 - 1)$ and $(n_1 - 1)$ degrees of freedom, i.e.,

$$\Pi(\lambda \mid x) \propto \lambda^{\frac{n_2 - 3}{2}} \left(1 + \frac{\Sigma_2}{\Sigma_1}\lambda\right)^{-\left(\frac{n_1 + n_2 - 1}{2}\right)} \tag{2.6}$$

3) The conditional posterior pdf for $U$, given $\lambda$, is such that

$$t = \Pi(u \mid \lambda, x) \propto \left[1 + a(\lambda)(u - \bar{x}_1 + \bar{x}_2)^2\right]^{-\left(\frac{n_1 + n_2 - 2}{2}\right)} \tag{2.7}$$

where $a(\lambda) = \dfrac{n_1 n_2 \lambda}{(n_1 + n_2 \lambda)(\Sigma_1 + \lambda \Sigma_2)}$, i.e., it has Student's t density with $(n_1 + n_2 - 2)$ degrees of freedom. Hence, as was indicated by Barnard [, when $n_1$ and $n_2$ are nearly equal and moderately large, and $s_1^2$ and $s_2^2$ do not differ much, the conditional distribution in (2.7) will remain nearly constant over a wide range of $\lambda$ values. In the next section we derive different approximations to deal with the situation when the range of $\lambda$ is wide enough to have a considerable effect on $t$.

We shall write the posterior density of $U$ as follows:

$$\Pi(u \mid x) = \int_0^{\infty} \Pi(u \mid \lambda, x) \cdot \Pi(\lambda \mid x)\, d\lambda \tag{3.1a}$$

In (3.1a) we face the same problem as with (2.4); however, this equation will be the tool of discussion in the remainder of this section. Evaluating the posterior density of $U$ amounts to evaluating the integral in (3.1a), which can be written as an expectation with respect to the posterior distribution of $\lambda$:

$$E\left[\Pi(u \mid \lambda, x)\right] = \int_0^{\infty} \Pi(u \mid \lambda, x) \cdot \Pi(\lambda \mid x)\, d\lambda \tag{3.1b}$$

Referring to Robert and Casella [

$$\Pi(u \mid \lambda, x)^{*} = \frac{1}{n} \sum_{i=1}^{n} \Pi(u \mid \lambda_i, x) \tag{3.2}$$
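The estimate (3.2) can be sketched in R as follows. This is a minimal illustration that assumes the summary statistics of Example 1 and the posterior results (2.6) and (2.7); the function name `mc_density`, the grid, the seed and the number of draws are illustrative choices, not part of the original analysis.

```r
# Summary statistics (values taken from Example 1 of this paper)
n1 <- 40; xbar1 <- 11.55; s1sq <- 18.3
n2 <- 37; xbar2 <- 34.57; s2sq <- 171.25
Sig1 <- (n1 - 1) * s1sq
Sig2 <- (n2 - 1) * s2sq

set.seed(1)   # arbitrary seed, for reproducibility only
N <- 5000     # number of Monte Carlo draws (an assumption of this sketch)

# Result 2): (s2^2 / s1^2) * lambda follows F(n2 - 1, n1 - 1)
lambda <- (s1sq / s2sq) * rf(N, n2 - 1, n1 - 1)

# Result 3): given lambda, sqrt(nu * a(lambda)) * (u - (xbar1 - xbar2)) ~ t(nu)
nu <- n1 + n2 - 2
a_lambda <- n1 * n2 * lambda / ((n1 + n2 * lambda) * (Sig1 + lambda * Sig2))
t_scale <- sqrt(nu * a_lambda)

# Equation (3.2): average the normalized conditional densities over the draws
mc_density <- function(u) {
  mean(t_scale * dt(t_scale * (u - (xbar1 - xbar2)), df = nu))
}

u_grid <- seq(-35, -11, by = 0.05)
dens <- vapply(u_grid, mc_density, numeric(1))
area <- sum(dens) * 0.05   # Riemann sum of the estimated density over the grid
```

Because each conditional density is normalized before averaging, the estimated density should integrate to approximately one over a grid that covers the bulk of the posterior, which gives a simple sanity check on the simulation.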

The second approximation to be considered is by the method of moments. While Patil [

$$E(u^r \mid x) = \int_{-\infty}^{\infty} \int_0^{\infty} u^r \cdot \Pi(u \mid \lambda, x) \cdot \Pi(\lambda \mid x)\, d\lambda\, du. \tag{3.4}$$

Denoting the $r$th central moment of $U$ by $m_r$, and its fourth cumulant by $l_4$, we can show from Equation (3.4) that

$$m_2 = \frac{\Sigma_1}{n_1 (n_1 - 3)} + \frac{\Sigma_2}{n_2 (n_2 - 3)},$$

$$m_4 = 3\left[\frac{\Sigma_1^2}{n_1^2 (n_1 - 3)(n_1 - 5)} + \frac{2 \Sigma_1 \Sigma_2}{n_1 n_2 (n_1 - 3)(n_2 - 3)} + \frac{\Sigma_2^2}{n_2^2 (n_2 - 3)(n_2 - 5)}\right]$$

The fourth cumulant $l_4$ is:

$$l_4 = m_4 - 3 m_2^2 \tag{3.5}$$

Following Patil [

$$\delta \sim a\, t(b) \quad \text{where} \quad \delta = U - \bar{x}_1 + \bar{x}_2. \tag{3.6}$$

Equating $m_2$ and $m_4$ to the second and fourth central moments of the R.H.S. of $\sim$ in (3.6) we have:

$$a = \left[m_2 m_4 (m_4 + l_4)^{-1}\right]^{\frac{1}{2}} \quad \text{and} \quad b = \left[2\left(1 + \frac{m_4}{l_4}\right)\right].$$

In the above equation $[s]$ means the smallest integer larger than $s$. Thus one may use tables of Student's t-distribution with $[b]$ degrees of freedom to make tests and construct approximate H.P.D. intervals for $U$. In other words:

$$1 - \alpha = \Pr\left(t_{\alpha/2} < T_b < t_{1 - \alpha/2}\right) = \Pr\left(a\, t_{b, \frac{\alpha}{2}} < \delta < a\, t_{b, 1 - \frac{\alpha}{2}}\right)$$
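The moment-matching steps above can be sketched in R using the quantile function `qt` in place of tables. The data values are the summary statistics of Example 1; the variable names are illustrative, and the interval is computed in closed form rather than by simulation.

```r
# Summary statistics (values taken from Example 1 of this paper)
n1 <- 40; xbar1 <- 11.55; s1sq <- 18.3
n2 <- 37; xbar2 <- 34.57; s2sq <- 171.25
Sig1 <- (n1 - 1) * s1sq
Sig2 <- (n2 - 1) * s2sq

# Second and fourth central moments of U, and the fourth cumulant (3.5)
m2 <- Sig1 / (n1 * (n1 - 3)) + Sig2 / (n2 * (n2 - 3))
m4 <- 3 * (Sig1^2 / (n1^2 * (n1 - 3) * (n1 - 5)) +
           2 * Sig1 * Sig2 / (n1 * n2 * (n1 - 3) * (n2 - 3)) +
           Sig2^2 / (n2^2 * (n2 - 3) * (n2 - 5)))
l4 <- m4 - 3 * m2^2

# Scale factor a and integer degrees of freedom b of the approximation (3.6)
a <- sqrt(m2 * m4 / (m4 + l4))
b <- ceiling(2 * (1 + m4 / l4))

# Approximate 95% H.P.D. interval for U = mu1 - mu2
alpha <- 0.05
interval_scale <- (xbar1 - xbar2) + a * qt(c(alpha / 2, 1 - alpha / 2), df = b)
```

Since $b$ is rounded up to an integer, the variance of $a\,t(b)$ matches $m_2$ only approximately; the discrepancy is small when $b$ is large.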

Barnard [^{*}, performing the numerical integration in (3.1a) will be nearly equivalent to assigning the modal value to $\lambda$ in the conditional density $\Pi(u \mid \lambda, x)$, so that:

$$\Pi(u \mid \lambda, x) \doteq \Pi(u \mid \lambda^{*}, x). \tag{3.7}$$

Now, since $\lambda^{*} = \dfrac{(n_2 - 3)\Sigma_1}{(n_1 + 1)\Sigma_2}$, substituting in (3.7) we have:

$$\Pi(u \mid x) \propto \left[1 + a(\lambda^{*})(u - \bar{x}_1 + \bar{x}_2)^2\right]^{-\left(\frac{n_1 + n_2 - 1}{2}\right)}, \tag{3.8}$$

where

$$a(\lambda^{*}) = \left[(n_1 + n_2 - 2)\left(\frac{\Sigma_1}{(n_1 + 1) n_1} + \frac{\Sigma_2}{n_2 (n_2 - 3)}\right)\right]^{-1}.$$

Therefore, as a modal approximation to the posterior distribution of $U$ we take:

$$t^{*} = \frac{u - (\bar{x}_1 - \bar{x}_2)}{\sqrt{\dfrac{\Sigma_1}{n_1 (n_1 + 1)} + \dfrac{\Sigma_2}{n_2 (n_2 - 3)}}} \sim t(n_1 + n_2 - 2). \tag{3.9}$$

The following lemma due to Feller is quite appealing and may be used to approximate the integral (3.1a):

Lemma: Feller ( [

$$E(g(\cdot)) = \int_{-\infty}^{\infty} g(y)\, \phi_n(y)\, dy \to g(\Theta). \tag{3.10}$$

Since $\bar{\lambda} = E(\lambda \mid x) \equiv \Theta = \dfrac{(n_1 - 1) s_1^2}{(n_1 - 3) s_2^2}$ and $\operatorname{var}(\lambda \mid x) = \sigma_n^2(\Theta) = 2\Theta^2 \dfrac{n_1 + n_2 - 4}{(n_2 - 1)(n_1 - 5)}$, with $\sigma_n^2(\Theta) \to 0$ as $(n_1, n_2) \to \infty$, then by taking $\Pi(u \mid \lambda, x)$ as $g(\cdot)$ and $\Pi(\lambda \mid x)$ as $\phi_n(y)$ the right-hand integral of (3.1a) can be approximated by:

$$\Pi(u \mid x) \propto \left[1 + a(\bar{\lambda})(u - \bar{x}_1 + \bar{x}_2)^2\right]^{-\left(\frac{n_1 + n_2 - 1}{2}\right)},$$

where

$$a(\bar{\lambda}) = \left\{(n_1 + n_2 - 4)\left[\frac{s_2^2}{n_2} + \frac{(n_1 - 1) s_1^2}{n_1 (n_1 - 3)}\right]\right\}^{-1}.$$

Accordingly, one may take

$$E = \frac{U - (\bar{x}_1 - \bar{x}_2)}{\sqrt{\left(\dfrac{n_1 + n_2 - 4}{n_1 + n_2 - 2}\right)\left[\dfrac{\Sigma_1}{n_1 (n_1 - 3)} + \dfrac{\Sigma_2}{n_2 (n_2 - 1)}\right]}} \sim t(n_1 + n_2 - 2) \tag{3.11}$$

as an approximating distribution to the posterior distribution of $U$.
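The modal approximation (3.9) and the averaging approximation (3.11) both replace $\lambda$ in the conditional density by a single value. The R sketch below, using the summary statistics of Example 1, computes both intervals in closed form and checks that each squared scale agrees with $1 / (\nu\, a(\lambda))$ evaluated at the corresponding value of $\lambda$. The helper `a_fun` and the variable names are illustrative.

```r
# Summary statistics (values taken from Example 1 of this paper)
n1 <- 40; xbar1 <- 11.55; s1sq <- 18.3
n2 <- 37; xbar2 <- 34.57; s2sq <- 171.25
Sig1 <- (n1 - 1) * s1sq
Sig2 <- (n2 - 1) * s2sq
nu <- n1 + n2 - 2

# a(lambda) from result 3), used here only to check the closed forms
a_fun <- function(lambda) {
  n1 * n2 * lambda / ((n1 + n2 * lambda) * (Sig1 + lambda * Sig2))
}

# Modal value and posterior mean of lambda
lambda_star <- (n2 - 3) * Sig1 / ((n1 + 1) * Sig2)
lambda_bar  <- (n1 - 1) * s1sq / ((n1 - 3) * s2sq)

# Squared scales of the approximating t distributions in (3.9) and (3.11)
scale_sq_modal <- Sig1 / (n1 * (n1 + 1)) + Sig2 / (n2 * (n2 - 3))
scale_sq_avg <- ((n1 + n2 - 4) / nu) *
  (Sig1 / (n1 * (n1 - 3)) + Sig2 / (n2 * (n2 - 1)))

# Consistency checks against the general expression a(lambda)
check_modal <- all.equal(scale_sq_modal, 1 / (nu * a_fun(lambda_star)))
check_avg   <- all.equal(scale_sq_avg, 1 / (nu * a_fun(lambda_bar)))

# Approximate 95% H.P.D. intervals for U = mu1 - mu2
q <- qt(c(0.025, 0.975), df = nu)
interval_modal <- (xbar1 - xbar2) + sqrt(scale_sq_modal) * q
interval_avg   <- (xbar1 - xbar2) + sqrt(scale_sq_avg) * q
```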

The Edgeworth expansion has been used extensively by many authors to approximate the density function of a statistic $\nu_n$. We refer to the paper by Barndorff-Nielsen and Cox [

density function of any statistic $\nu_n = \dfrac{v_n - E(v_n)}{\sqrt{\operatorname{Var}(v_n)}}$ at a point $c$ is given by the general formula:

$$\psi(c) \doteq \left[1 + \frac{\theta_1}{6}(c^3 - 3c) + \frac{\theta_2 - 3}{24}(c^4 - 6c^2 + 3) + \frac{\theta_1^2}{72}(c^6 - 15c^4 + 45c^2 - 15)\right] \cdot \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{1}{2}c^2\right),$$

where $\theta_1$ and $(\theta_2 - 3)$ are respectively the coefficients of skewness and kurtosis for the density of $\nu_n$. As can be seen, when $\theta_1 = 0$ the terms of order $n^{-1/2}$ disappear, and the modified Edgeworth expansion of order $n^{-1}$ for the density function of the statistic $\nu_n = \dfrac{U - \bar{x}_1 + \bar{x}_2}{\sqrt{\tau_2}}$ at $c$ is given by:

$$\psi(c) \doteq \left[1 + \frac{\kappa_4}{24 \tau_2^2}(c^4 - 6c^2 + 3)\right] \cdot \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{1}{2}c^2\right). \tag{3.12}$$

Here $\tau_2 = m_2$ and $\kappa_4 = l_4$ are the second central moment and the fourth cumulant of $U$. The Edgeworth expansion for the density of $U$ can easily be obtained from Equation (3.12) by using the linear transformation $U = \bar{x}_1 - \bar{x}_2 + \sqrt{m_2}\, c$.
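As a sketch of how (3.12) can be used for tail areas, the R code below evaluates the Edgeworth density with base R only, integrates it numerically to obtain quantiles, and maps them to the scale of $U$. It assumes the summary statistics of Example 1; the function names `edgeworth_pdf`, `edgeworth_cdf` and `edgeworth_quantile` are illustrative, and numerical root finding is one possible route among several.

```r
# Summary statistics (values taken from Example 1 of this paper)
n1 <- 40; xbar1 <- 11.55; s1sq <- 18.3
n2 <- 37; xbar2 <- 34.57; s2sq <- 171.25
Sig1 <- (n1 - 1) * s1sq
Sig2 <- (n2 - 1) * s2sq

# Second central moment and fourth cumulant of U (Equations (3.4)-(3.5))
m2 <- Sig1 / (n1 * (n1 - 3)) + Sig2 / (n2 * (n2 - 3))
m4 <- 3 * (Sig1^2 / (n1^2 * (n1 - 3) * (n1 - 5)) +
           2 * Sig1 * Sig2 / (n1 * n2 * (n1 - 3) * (n2 - 3)) +
           Sig2^2 / (n2^2 * (n2 - 3) * (n2 - 5)))
l4 <- m4 - 3 * m2^2

# Equation (3.12): standard normal density times the kurtosis correction
aa <- l4 / (24 * m2^2)
edgeworth_pdf <- function(c) (1 + aa * (c^4 - 6 * c^2 + 3)) * dnorm(c)

# The correction term integrates to zero, so the total mass should be one
total_mass <- integrate(edgeworth_pdf, -Inf, Inf)$value

# Quantiles by numerical integration and root finding
edgeworth_cdf <- function(q) integrate(edgeworth_pdf, -Inf, q)$value
edgeworth_quantile <- function(p) {
  uniroot(function(q) edgeworth_cdf(q) - p, interval = c(-10, 10))$root
}

# Approximate 95% interval for U via U = xbar1 - xbar2 + sqrt(m2) * c
c_lo <- edgeworth_quantile(0.025)
c_hi <- edgeworth_quantile(0.975)
interval_edgeworth <- (xbar1 - xbar2) + sqrt(m2) * c(c_lo, c_hi)
```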

Although we are not approximating the distribution function directly, in practice these approximations may be used for calculating tail areas. Thus it is of interest to see how these approximations for the posterior distributions of U = μ 1 − μ 2 perform at the tail.

Example 1: “Mean tumor recurrence scores in breast cancer patients stratified by tumor grades.”

Oncotype DX is a commercial assay used for making decisions regarding the treatment of breast cancer. The results are reported as a tumor recurrence score ranging from 0 to 100, Klein et al. [

$$n_1 = 40, \quad \bar{x}_1 = 11.55, \quad s_1^2 = 18.3$$

$$n_2 = 37, \quad \bar{x}_2 = 34.57, \quad s_2^2 = 171.25$$

The approximations to the posterior density of $U = \mu_1 - \mu_2$ discussed in Section 3 are applied to the above data. In addition, we consider the usual asymptotic normal approximation which, for large $n_1$ and $n_2$, is given as suggested by Welch [

$$z = \frac{U - (\bar{x}_1 - \bar{x}_2)}{\sqrt{\tau_2}} \xrightarrow{p} N(0, 1),$$

The results of the data analyses based on the proposed approximations are presented in the table of 95% HPD interval estimates for Example 1.

As we can see, all approximations give limits close to the exact limits, probably because the sample sizes are moderately large. We provide the R code for the calculation of these limits in the Appendix, within the application to linear regression.

Linear and nonlinear regression models are ubiquitous in medical and biological research. Testing equality of slopes of two linear regression lines is of special interest. This is illustrated in the following application which is a continuation of example 1.

Method | 95% HPD lower limit (0.025) | 95% HPD upper limit (0.975) |
---|---|---|
Monte-Carlo integration (exact) | −27.70 | −18.55 |
Scale Approximation | −27.48 | −18.36 |
Modal Approximation | −27.52 | −18.41 |
Averaging Approximation | −27.46 | −18.61 |
Edgeworth Expansion | −27.61 | −18.44 |
Normal Approximation (Welch) | −27.69 | −18.40 |

Example 2: continuation of example 1.

Ki67 is a commonly used marker of cancer cell proliferation, and has significant prognostic value for tumor recurrence in breast cancer. In this illustration, which continues Example 1, we use a sub-sample of women who had Oncotype DX testing performed and for whom Ki67 indices were available; these indices correlated with tumor grade (1 versus 3). The literature documents that Ki67 scores contribute significantly to models that predict the risk of recurrence in breast cancer; see, for example, Cuzick et al. [

We shall use the following notation. We denote the independent variable by $x$, which is used to predict values of the dependent variable, denoted by $y$.

Suppose that we have two linear regression lines. Conditional on $x_{ij}$, we assume that

$$y_{ij} \sim N\left(\alpha_j + \beta_j (x_{ij} - \bar{x}_j),\ \phi_j\right), \quad i = 1, 2, \cdots, n_j; \quad j = 1, 2.$$

In general we assume that we have two conditions, and we would like to estimate $\delta = \beta_1 - \beta_2$.

In our Bayesian analysis we shall take a reference prior that is independently uniform in $\alpha_j$, $\beta_j$ and $\log \phi_j$, such that:

$$\pi(\alpha_j, \beta_j, \phi_j) \propto 1 / \phi_j, \quad j = 1, 2$$

Let us define the following quantities:

$$S_{jee} = S_{jyy} - S_{jxy}^2 / S_{jxx}, \quad a_j = \bar{y}_j, \quad b_j = S_{jxy} / S_{jxx}, \quad e_{j0} = \bar{y}_j - b_j \bar{x}_j$$

where

$$S_{jxy} = \sum_{i=1}^{n_j} (x_{ij} - \bar{x}_j)(y_{ij} - \bar{y}_j), \quad S_{jyy} = \sum_{i=1}^{n_j} (y_{ij} - \bar{y}_j)^2, \quad S_{jxx} = \sum_{i=1}^{n_j} (x_{ij} - \bar{x}_j)^2,$$

and

$$\Sigma_j = (n_j - 2) S_{jee} / S_{jxx}, \quad j = 1, 2.$$
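The quantities just defined can be computed from raw data for one group as in the following R sketch. The data here are simulated and purely hypothetical, and the helper name `reg_summaries` is an illustrative choice; the comparison with `lm` only confirms that $b_j$, $e_{j0}$ and $S_{jee}$ coincide with the least-squares slope, intercept and residual sum of squares.

```r
set.seed(42)                              # arbitrary seed
x <- runif(30, 0, 50)                     # hypothetical covariate values
y <- 10 + 0.3 * x + rnorm(30, sd = 4)     # hypothetical responses

# Sufficient statistics for one regression line, in the notation of the text
reg_summaries <- function(x, y) {
  n   <- length(x)
  Sxx <- sum((x - mean(x))^2)
  Syy <- sum((y - mean(y))^2)
  Sxy <- sum((x - mean(x)) * (y - mean(y)))
  b   <- Sxy / Sxx                        # slope estimate b_j
  See <- Syy - Sxy^2 / Sxx                # residual sum of squares S_jee
  list(n = n, a = mean(y), b = b, e0 = mean(y) - b * mean(x),
       Sxx = Sxx, See = See, Sigma = (n - 2) * See / Sxx)
}

s <- reg_summaries(x, y)
fit <- lm(y ~ x)                          # least-squares fit for comparison
```

Applying `reg_summaries` to each group separately yields the inputs $(n_j, b_j, S_{jxx}, S_{jee})$ needed by the approximations that follow.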

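The joint posterior distribution of the model parameters $(\alpha_j, \beta_j, \phi_j)$ is proportional to

$$f(\alpha_j, \beta_j, \phi_j \mid x_{ij}, y_{ij}) \propto \pi_j = \phi_j^{-(n_j + 2)/2} \exp\left[-\frac{1}{2}\left\{S_{jee} + n_j (\alpha_j - a_j)^2 + S_{jxx} (\beta_j - b_j)^2\right\} / \phi_j\right] \tag{4.1}$$

For the two regression lines, the joint posterior of $(\alpha_1, \alpha_2, \beta_1, \beta_2, \phi_1, \phi_2)$ is $\pi_1 \pi_2$, or

$$\pi(\alpha_1, \alpha_2, \beta_1, \beta_2, \phi_1, \phi_2 \mid x, y) \propto \phi_1^{-\left(\frac{n_1 + 2}{2}\right)} \phi_2^{-\left(\frac{n_2 + 2}{2}\right)} \prod_{j=1}^{2} \exp\left[-\frac{1}{2}\left\{S_{jee} + n_j (\alpha_j - a_j)^2 + S_{jxx} (\beta_j - b_j)^2\right\} / \phi_j\right] \tag{4.2}$$

Integrating out $\alpha_1$ and $\alpha_2$, the joint posterior of $(\beta_1, \beta_2, \phi_1, \phi_2)$ is thus given as:

$$\propto \phi_1^{-\left(\frac{n_1 + 1}{2}\right)} \phi_2^{-\left(\frac{n_2 + 1}{2}\right)} \prod_{j=1}^{2} \exp\left[-\frac{1}{2}\left\{\frac{S_{jee} + S_{jxx} (\beta_j - b_j)^2}{\phi_j}\right\}\right]$$

Integrating out $\phi_1$ and $\phi_2$ we get:

$$\pi(\beta_1, \beta_2 \mid x, y) \propto \left[1 + \frac{n_1 - 2}{\Sigma_1}(\beta_1 - b_1)^2\right]^{-\left(\frac{n_1 - 1}{2}\right)} \left[1 + \frac{n_2 - 2}{\Sigma_2}(\beta_2 - b_2)^2\right]^{-\left(\frac{n_2 - 1}{2}\right)} \tag{4.3}$$

Under the transformation $\delta = \beta_1 - \beta_2$, one can show that the posterior density of $\delta$ is given by:

$$\pi(\delta \mid x, y) \propto \int_{-\infty}^{\infty} \left[1 + \frac{r_1}{\Sigma_1}(\delta + \beta_2 - b_1)^2\right]^{-\left(\frac{r_1 + 1}{2}\right)} \left[1 + \frac{r_2}{\Sigma_2}(\beta_2 - b_2)^2\right]^{-\left(\frac{r_2 + 1}{2}\right)} d\beta_2 = \int_{-\infty}^{\infty} \pi(\delta \mid \beta_2, x, y)\, \pi(\beta_2 \mid x, y)\, d\beta_2 \tag{4.4}$$

where $r_j = n_j - 2$.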

The Bayesian inferences on $\delta$ are completely determined by Equation (4.4), which cannot easily be manipulated to conduct statistical inference on $\delta$ in a routine fashion. Moreover, it is clear from (4.4) that the exact marginal posterior distribution of $(\beta_j - b_j)\, r_j / \sqrt{\Sigma_j}$ is a Student t-distribution with $r_j$ degrees of freedom.

As with testing the equality of the means of two normal populations, shown in the first part of the paper, we use the suggested approximations to the integral in (4.4) to find the marginal posterior distribution of $\delta$. However, it is helpful to examine not only the posterior density of $\delta$ but also the components of the joint posterior density given in (4.4). For this purpose, we state the following results without proof, since they can easily be obtained by applications of the calculus of probability.

1) The conditional posterior pdf of $\delta$, given $\beta_2$, is the univariate Student-t with $(n_1 - 2)$ degrees of freedom, with conditional mean $b_1 - \beta_2$ and conditional variance $\dfrac{\Sigma_1}{(n_1 - 2)(n_1 - 4)}$.

The unconditional posterior mean, variance and fourth central moment of $\delta$ are:

$$E(\delta \mid x, y) = b_1 - b_2$$

$$\mu_2(\delta) = \operatorname{var}(\delta \mid x, y) = \frac{\Sigma_1}{(n_1 - 2)(n_1 - 4)} + \frac{\Sigma_2}{(n_2 - 2)(n_2 - 4)}$$

$$\mu_4(\delta) = \frac{3 \Sigma_1^2}{(n_1 - 2)(n_1 - 2)(n_1 - 4)(n_1 - 6)} + \frac{3 \Sigma_2^2}{(n_2 - 2)(n_2 - 2)(n_2 - 4)(n_2 - 6)} + \frac{6 \Sigma_1 \Sigma_2}{(n_1 - 2)(n_2 - 2)(n_1 - 4)(n_2 - 4)}$$

2) The marginal posterior pdf of the variance ratio $\lambda$ is a scale multiple of the F-distribution. That is:

$$\lambda = \phi_1 / \phi_2 = \frac{S_{1ee} / (n_1 - 2)}{S_{2ee} / (n_2 - 2)}\, F_{n_2 - 2,\, n_1 - 2}$$

Therefore, the posterior moments of $\lambda$ are:

$$E[\lambda \mid x, y] = \frac{S_{1ee}}{S_{2ee}} \cdot \frac{n_2 - 2}{n_1 - 4}$$

$$\operatorname{Mode}[\lambda \mid x, y] = \frac{S_{1ee}}{S_{2ee}} \cdot \frac{n_2 - 4}{n_1}$$

$$\operatorname{var}[\lambda \mid x, y] = \left(S_{1ee} / S_{2ee}\right)^2 \left[\frac{2 (n_2 - 2)(n_1 + n_2 - 6)}{(n_1 - 4)^2 (n_1 - 6)}\right]$$

Using the inverse moments of the F-distribution we can show that:

$$E\left(\frac{1}{\lambda} \,\Big|\, x, y\right) = \left(\frac{S_{2ee}}{S_{1ee}}\right)\left(\frac{n_1 - 2}{n_2 - 4}\right) \tag{4.5}$$

and

$$E\left(\frac{1}{\lambda^2} \,\Big|\, x, y\right) = \left(\frac{S_{2ee}}{S_{1ee}}\right)^2 \left[\frac{n_1 (n_1 - 2)}{(n_2 - 4)(n_2 - 6)}\right]$$

3) The conditional posterior pdf of $\delta$, given $\lambda$, is such that

$$\pi(\delta \mid \lambda, x, y) \propto \left[1 + \frac{A B}{A + B}\left(\delta - (b_1 - b_2)\right)^2\right]^{-(\nu - 1/2)} \tag{4.6}$$

$$E(\delta \mid \lambda, x, y) = b_1 - b_2$$

$$\operatorname{var}(\delta \mid \lambda, x, y) = \frac{1}{n_1 + n_2 - 6}\left[\frac{1}{A} + \frac{1}{B}\right] = \frac{1}{n_1 + n_2 - 6}\left[\frac{S_{1ee} + \lambda S_{2ee}}{S_{1xx}} + \frac{S_{1ee} + \lambda S_{2ee}}{\lambda S_{2xx}}\right]$$

These results are derived from the fact that, conditional on $\lambda$, the posterior distribution of

$$t = \frac{(\delta - D)\sqrt{n_1 + n_2 - 4}}{\sqrt{A^{-1} + B^{-1}}}, \quad D = b_1 - b_2,$$

is that of a t-distribution with $(n_1 + n_2 - 4)$ degrees of freedom.
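The posterior moments of $\lambda$ stated in result 2) can be checked by simulation. The R sketch below draws from the scaled F-distribution using the residual sums of squares listed in the Appendix and compares simulated and closed-form values; the seed and the number of draws are arbitrary choices for this illustration.

```r
# Summary data (values taken from the Appendix of this paper)
n1 <- 40; n2 <- 37
s1ee <- 700.13; s2ee <- 3931.172

set.seed(2018)   # arbitrary seed
N <- 200000      # number of draws (an assumption of this sketch)

# Result 2): lambda = [S1ee/(n1-2)] / [S2ee/(n2-2)] * F(n2 - 2, n1 - 2)
lambda <- (s1ee / (n1 - 2)) / (s2ee / (n2 - 2)) * rf(N, n2 - 2, n1 - 2)

# Closed-form posterior moments stated in the text
mean_exact <- (s1ee / s2ee) * (n2 - 2) / (n1 - 4)
var_exact  <- (s1ee / s2ee)^2 *
  2 * (n2 - 2) * (n1 + n2 - 6) / ((n1 - 4)^2 * (n1 - 6))
inv_mean_exact <- (s2ee / s1ee) * (n1 - 2) / (n2 - 4)   # Equation (4.5)

# Side-by-side comparison of simulated and exact values
comparison <- rbind(
  simulated = c(mean = mean(lambda), var = var(lambda), inv_mean = mean(1 / lambda)),
  exact     = c(mean = mean_exact, var = var_exact, inv_mean = inv_mean_exact)
)
print(comparison)
```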

As a first approximation to the posterior distribution of $\Delta = \delta - (b_1 - b_2)$, we assume that

$$\Delta \overset{d}{=} a\, t(\nu) \tag{5.1}$$

Equating $\mu_2(\Delta)$ and $\mu_4(\Delta)$ to the second and fourth central moments of the R.H.S. of $\overset{d}{=}$ in (5.1) we get:

$$a = \left[\frac{\mu_2(\Delta)\, \mu_4(\Delta)}{\mu_4(\Delta) + \kappa_4(\Delta)}\right]^{1/2}$$

$$\nu = \left[2\left(1 + \frac{\mu_4(\Delta)}{\kappa_4(\Delta)}\right)\right]$$

Here, $\kappa_4(\Delta) = \mu_4(\Delta) - 3\mu_2^2(\Delta)$ and $[\xi]$ means the smallest integer larger than $\xi$. Thus one may use tables of Student's t-distribution with $[\nu]$ degrees of freedom to construct H.P.D. intervals on $\delta$. As can be seen, this result is identical to the moment, or scale, approximation of the posterior distribution of the difference between means.

As suggested by Box and Tiao [

$$\pi(\delta \mid x, y) \doteq \pi(\delta \mid \lambda^{*}, x, y) \tag{5.2}$$

Hence we take

$$T^{*} = \frac{\left(\delta - (b_1 - b_2)\right)\sqrt{n_1 + n_2 - 4}}{\sqrt{(A + B) / A B}} \tag{5.3}$$

as a t-statistic with $(n_1 + n_2 - 4)$ degrees of freedom, where $A = S_{1xx} / (S_{1ee} + \lambda^{*} S_{2ee})$, $B = \lambda^{*} S_{2xx} / (S_{1ee} + \lambda^{*} S_{2ee})$, and $\lambda^{*} = \dfrac{S_{1ee}}{S_{2ee}} \cdot \dfrac{n_2 - 4}{n_1}$ is the modal value of $\lambda$.
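Because (5.3) is a t-statistic, the corresponding interval for $\delta$ can be obtained directly from t quantiles rather than by simulation. The R sketch below does this with the summary data listed in the Appendix; the variable names are illustrative.

```r
# Summary data (values taken from the Appendix of this paper)
n1 <- 40; n2 <- 37
s1xx <- 1019.6; s2xx <- 11167.57
s1ee <- 700.13; s2ee <- 3931.172
b1 <- 0.124; b2 <- 0.447

# Modal value of lambda and the quantities A and B evaluated at it
lambda_star <- (s1ee / s2ee) * (n2 - 4) / n1
denom <- s1ee + lambda_star * s2ee
A <- s1xx / denom
B <- lambda_star * s2xx / denom

# Invert (5.3): delta = (b1 - b2) + t * sqrt((A + B) / (A * B)) / sqrt(nu)
nu <- n1 + n2 - 4
t_scale <- sqrt((A + B) / (A * B) / nu)
interval_modal <- (b1 - b2) + t_scale * qt(c(0.025, 0.975), df = nu)
```

An interval that contains zero indicates that equality of the two slopes cannot be ruled out at the chosen level under this approximation.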

Here we find that the conditions of Feller’s [

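By taking $\pi(\delta \mid \lambda, x, y)$ as $g(\cdot)$ and $\pi(\lambda \mid x, y)$ as $\psi_n(\xi)$, the R.H.S. of (5.2) can be approximated by

$$\pi(\delta \mid x, y) \propto \left[1 + a(\bar{\lambda})\left(\delta - (b_1 - b_2)\right)^2\right]^{-(\nu - 1/2)}$$

where

$$a(\bar{\lambda}) = \left(\frac{1}{\bar{A}} + \frac{1}{\bar{B}}\right)^{-1} = \left[\frac{S_{1ee}}{S_{1xx}} + \bar{\lambda}\frac{S_{2ee}}{S_{1xx}} + \frac{S_{2ee}}{S_{2xx}} + \frac{1}{\bar{\lambda}}\frac{S_{1ee}}{S_{2xx}}\right]^{-1}$$

Hence, we take

$$\bar{T} = \frac{\delta - (b_1 - b_2)}{\sqrt{\bar{\Sigma}_1 + \bar{\Sigma}_2}} \sim t(n_1 + n_2 - 4)$$

where

$$\bar{\Sigma}_1 = \frac{n_1 + n_2 - 6}{(n_1 - 2)(n_1 - 4)}\, \Sigma_1$$

$$\bar{\Sigma}_2 = \frac{n_1 + n_2 - 6}{(n_2 - 2)^2}\, \Sigma_2$$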

We now analyze the data in this example. We take the Ki67 to be the explanatory variable (x), while the recurrence score is the dependent variable (y). The tumor grades 1 and 3 form the two groups whose slopes are to be compared. The summary data are:

Tumor grade 1 Tumor grade 3

n 1 = 40

The scatter plots are given for tumor grade 1 (

As we can see, for tumor grade 1, the association between Ki67 and tumor recurrence is quite weak, but the association is stronger for tumor grade 3.

Remarks:

For all the proposed methods, our data analysis approach is simulation-based, using 10,000 replications for each method (see the R code in the Appendix). For the Monte-Carlo integration, which we consider to be exact, we monitored the simulation. As we can see in

We should note that the red line bands in

Method | 95% HPD lower limit (0.025) | 95% HPD upper limit (0.975) |
---|---|---|
Monte-Carlo integration (exact) | −0.673 | 0.028 |
Scale Approximation | −0.663 | 0.024 |
Modal Approximation | −0.660 | −0.002 |
Averaging Approximation | −0.664 | 0.005 |
Edgeworth Expansion | −0.665 | 0.018 |
Normal Approximation (Welch) | −0.661 | 0.021 |

In the data analytic part, one may be interested in the shape of the density of the approximating distributions and how they deviate from the exact density. We did not discuss this issue since most of the time we are interested in the tail area of the distribution in order to construct confidence intervals on the mean difference or the difference of slopes. From

None declared by all authors.

Shoukri, M. and Al-Mohanna, F. (2018) Extending the Behrens-Fisher Problem to Testing Equality of Slopes in Linear Regression: The Bayesian Approach. Open Journal of Statistics, 8, 284-301. https://doi.org/10.4236/ojs.2018.82018

R-CODES.

#Summary data

n1=40; n2=37; s1xx=1019.6; s2xx=11167.57; s1ee=700.13; s2ee=3931.172

b1=.124; b2=.447

#Monte Carlo integration

sig1=(n1-2)*s1ee/s1xx

sig2=(n2-2)*s2ee/s2xx

T2=(sig1/((n1-2)*(n1-4)))+(sig2/((n2-2)*(n2-4)))

ll=rf(10000,n2-2,n1-2) #simulating from the F-distribution

l=ll*(s1ee/s2ee)*((n2-2)/(n1-2))

A=s1xx/(s1ee+l*s2ee)

B=l*s2xx/(s1ee+l*s2ee)

nn=n1+n2-4

beta1=b1+rt(10000,n1-2)*sqrt(sig1/((n1-2)*(n1-4)))

beta2=b2+rt(10000,n2-2)*sqrt(sig2/((n2-2)*(n2-4)))

delta=(beta1-beta2)

q1=quantile(delta,prob=c(0.025,0.975))

q1

#Monitoring the simulation

h=function(x){(1/(1+((A*B)/(A+B))*(x-b1+b2)^2)^(((nn+1)/2)))}

x=h(l)

estint=cumsum(x)/(1:10^4)

mean(estint)

esterr=sqrt(cumsum((x-estint)^2))/(1:10^4)

mean(esterr)

plot(estint,xlab="Mean and error range",lwd=2,

ylim=mean(x)+20*c(-esterr[10^4],esterr[10^4]),ylab="")

lines(estint+2*esterr,col="red",lwd=2)

lines(estint-2*esterr,col="red",lwd=2)

#Moments (Scale) Approximation

sig1=(n1-2)*s1ee/s1xx

sig2=(n2-2)*s2ee/s2xx

T2=(sig1/((n1-2)*(n1-4)))+(sig2/((n2-2)*(n2-4)))

T41=3*sig1^2/((n1-2)^2*(n1-4)*(n1-6))

T42=3*sig2^2/((n2-2)^2*(n2-4)*(n2-6))

T412=6*sig1*sig2/((n1-2)*(n1-4)*(n2-2)*(n2-4))

T4=T41+T42+T412

K4=T4-3*T2^2

a=sqrt(T2*T4/(T4+K4))

b=ceiling(2*(1+(T4/K4))) #integer degrees of freedom

d=rt(10000,b)

delta.m=(b1-b2)+a*d

q2=quantile(delta.m,prob=c(.025,.975))

q2

#Welch Normal Approximation

zz=rnorm(10000)

u=(b1-b2)+zz*sqrt(T2)

mean.u=mean(u)

sd.u=sd(u)

qqnorm(u)

q3=quantile(u,prob=c(0.025,0.975))

q3

#Edgeworth expansion

library(distr)

aa=K4/(24*T2^2)

p <- function(x) dnorm(x)*(1+aa*(x^4-6*x^2+3)) # Equation (3.12): normal density times the kurtosis correction

# probability density function

dist <-AbscontDistribution(d=p) # signature for a distribution with pdf ~ p

rdist <- r(dist) # function to create random variates from p

set.seed(1)

XX <- rdist(10000) # sample from X ~ p

x <- seq(-10,10, .01)

hist(XX, freq=F, breaks=50, xlim=c(-5,5))

lines(x,p(x),lty=2, col="red")

mean(XX)

sd(XX)

edgeworth=b1-b2+sqrt(T2)*XX

hist(edgeworth)

q4=quantile(edgeworth,prob=c(0.025,.975))

q4

#Averaging approximation based on Feller’s lemma

nu=n1+n2-4

f=rf(10000,n2-2,n1-2)

t=rt(10000,nu)

l=(s1ee/s2ee)*((n2-2)/(n1-2))*f #scaled F draws for lambda, as in the Monte Carlo step

mean.l=mean(l)

denom=s1ee+mean.l*s2ee

A=s1xx/denom

B=mean.l*s2xx/denom

d.mean=(b1-b2)+ (sqrt((A+B)/(A*B))*t)/sqrt(nu)

q5=quantile(d.mean,prob=c(.025,.975))

q5

#Modal approximation

ff=rf(10000,n2-2,n1-2)

f=ff*(s1ee/s2ee)*((n2-2)/(n1-2)) #scaled F draws for lambda (not used below; the mode m is available in closed form)

m=(s1ee/s2ee)*(n2-4)/n1

nu=n1+n2-4

t=rt(10000,nu)

denom=s1ee+m*s2ee

AA=s1xx/denom

BB=m*s2xx/denom

d.mod=(b1-b2)+ (sqrt((AA+BB)/(AA*BB))*t)/sqrt(nu)

q6=quantile(d.mod,prob=c(.025,.975))

q6