Inferences on the Difference of Two Proportions: A Bayesian Approach

Thu Pham-Gia*, Nguyen Van Thin, Phan Phuc Doan

Faculty of Mathematics and Computer Science, Hochiminh University of Science, Ho Chi Minh, Vietnam
Department of Mathematics and Statistics, Université de Moncton, New Brunswick, Canada
* E-mail: thu.pham-gia@umoncton.ca (TP)

Open Journal of Statistics (ISSN 2161-718X), Scientific Research Publishing, 2017, 7, 1-15. DOI: 10.4236/ojs.2017.71001. Article ID: OJS-74044. Section: Physics & Mathematics.

Received 13 July 2016; accepted 6 February 2017; published 9 February 2017.

© Copyright 2014 by authors and Scientific Research Publishing Inc. This work is licensed under the Creative Commons Attribution International License (CC BY). http://creativecommons.org/licenses/by/4.0/

Let $\pi = \pi_1 - \pi_2$ be the difference of two independent proportions related to two populations. We study the test $H_0: \pi \ge 0$ against different alternatives, in the Bayesian context. The various Bayesian approaches use standard beta distributions, and are simple to derive and compute. But the more general test $H_0: \pi \ge \eta$, with $\eta > 0$, requires more advanced mathematical tools to carry out the computations. These tools, which include the density of the difference of two general beta variables, are presented in the article, with numerical examples to illustrate the results.

Keywords: Proportion, Convolution, Normal, Beta, Bayesian, Critical Value, Appell's Function
1. Introduction

For two independent proportions $\pi_1$ and $\pi_2$, their difference is frequently encountered in the frequentist statistical literature, where tests or confidence intervals for $\pi_1 - \pi_2$ are well-accepted notions in theory and in practice, although the case most frequently studied is the equality, or inequality, of these proportions. For the Bayesian approach, Pham-Gia and Turkkan have considered the cases of independent and of dependent proportions for inferences, and also the context of sample size determination.

But testing $\pi_1 = \pi_2$ is only a special case of testing $H_0: \pi_1 - \pi_2 \le \eta$, with $\eta$ being a positive constant value, which is much less frequently dealt with. In Section 2 we recall the unconditional approaches to testing $H_0$ based on the maximum likelihood estimators of the two proportions and normal approximations. A new exact approach not using normal approximation has been developed by our group and will be presented elsewhere. Fisher's exact test is also recalled here, for comparison purposes. The Bayesian approach to testing the equality of two proportions and the computation of credible intervals are given in Section 3. The Bayesian approach using the general beta distributions is given in Section 4. All related problems are completely solved, thanks to some closed form formulas that we have established in earlier papers.

2. Testing the Equality of Two Proportions

2.1. Test Using Normal Approximation

As stated before, taking $\eta = 0$ we have a test for equality between two proportions. Several well-known methods are presented in the literature. For example, the conditional test is usually called Fisher's exact test, and is based on the hypergeometric distribution. It is used when the sample size is small. Pearson's Chi-square test with Yates correction is usually used for intermediate sample sizes, while Pearson's Chi-square test is used for large samples. Their appropriateness is discussed in D'Agostino et al. Normal approximation methods are based on formulas using estimated values of the mean and the variance of the two populations. For example, we have

$$T_1 = \frac{X_1/n_1 - X_2/n_2}{\left[ (X_1/n_1)(1 - X_1/n_1)/n_1 + (X_2/n_2)(1 - X_2/n_2)/n_2 \right]^{1/2}},$$

and the pooled version

$$T_2 = \frac{X_1/n_1 - X_2/n_2}{\left[ \dfrac{X_1 + X_2}{n_1 + n_2} \left( 1 - \dfrac{X_1 + X_2}{n_1 + n_2} \right) \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right) \right]^{1/2}},$$

both being approximately $N(0, 1)$ under $H_0: \pi_1 \le \pi_2$. Cressie gives conditions under which $T_2$ is better than $T_1$, in terms of power. Previously, Eberhardt and Fligner studied the same problem for a bilateral test.

Numerical Example 1

To investigate its proportions of customers in two separate geographic areas of the country, a company picks a random sample of 25 shoppers in area A, of whom 17 are found to be its customers. A similar random sample of 20 shoppers in area B gives 8 customers. We wish to test $H_0: \pi_1 \le \pi_2$ against $H_1: \pi_1 > \pi_2$.

We have here the observed values $T_1 = 1.9459$ and $T_2 = 1.8783$, which lead, in both cases, to the rejection of $H_0$ at significance level 5% (the critical value is 1.64) in favor of $H_1: \pi_1 > \pi_2$.
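The two statistics of Numerical Example 1 can be reproduced with a few lines of Python. This is only a minimal sketch: the function names `two_proportion_z` and `upper_tail_p` are illustrative and do not come from the paper, and the one-sided p-value is obtained from the standard normal tail through `math.erfc`.

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """Return (T1, T2): unpooled and pooled normal-approximation statistics."""
    p1, p2 = x1 / n1, x2 / n2
    diff = p1 - p2
    # T1 uses separate variance estimates for the two samples.
    se_unpooled = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    # T2 uses the pooled proportion (x1 + x2) / (n1 + n2).
    p_pool = (x1 + x2) / (n1 + n2)
    se_pooled = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    return diff / se_unpooled, diff / se_pooled

def upper_tail_p(z):
    """One-sided p-value P(Z > z) for a standard normal Z."""
    return 0.5 * math.erfc(z / math.sqrt(2))

t1, t2 = two_proportion_z(17, 25, 8, 20)
print(f"T1 = {t1:.4f}, T2 = {t2:.4f}")
print(f"one-sided p-values: {upper_tail_p(t1):.4f}, {upper_tail_p(t2):.4f}")
```

Both statistics exceed the 5% critical value quoted above, so the sketch leads to the same decision as the text.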

2.2. Fisher’s Exact Test

Under $H_0$ the number of successes coming from population 1 has the $\mathrm{Hyp}(n_1 + n_2,\ t = x_1 + x_2,\ n_1,\ x)$ distribution. The argument is that, in the combined sample of size $n_1 + n_2$, with $x_1$ successes from population 1 out of the total number of successes $t = x_1 + x_2$, the number $x$ of successes coming from population 1 is a hypergeometric variable.

To compute the significance of the observation we have to compute several tables corresponding to more extreme results than the observed table. It is known that the conditional test is less powerful than the unconditional one.

Numerical Example 2

We use the same data as in Numerical Example 1 to test $H_0: \pi_A = \pi_B$ vs $H_1: \pi_A > \pi_B$, i.e. whether the proportion of customers in area A is significantly higher than the one in area B. The data are given in Table 1. We consider the observed table ($x_B = 8$) and also the more extreme cases, that is, $x_B = 0, 1, 2, \ldots, 7$. The p-value of the test is hence

$$p\text{-value} = \sum_{x_B = 0}^{8} \frac{\binom{25}{25 - x_B} \binom{20}{x_B}}{\binom{45}{25}} = 0.0572.$$

Although technically not significant at the 5% level, this result shows that the proportion of customers in area B can practically be considered as lower than the one in area A, in agreement with the frequentist test.
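The hypergeometric sum of Numerical Example 2 can be evaluated exactly with integer binomial coefficients. The sketch below assumes only the standard library; `fisher_lower_tail` is a hypothetical helper name, and its arguments mirror the sum above (area B successes $x_B$, sample sizes 25 and 20, and 25 successes in total).

```python
from math import comb

def fisher_lower_tail(x_b, n_a, n_b, t):
    """P(X_B <= x_b) when t successes are split at random between a group
    of size n_a and a group of size n_b (hypergeometric model)."""
    total = comb(n_a + n_b, t)
    # comb(n, k) returns 0 when k > n, so impossible tables contribute nothing.
    favourable = sum(comb(n_a, t - k) * comb(n_b, k) for k in range(x_b + 1))
    return favourable / total

p_value = fisher_lower_tail(x_b=8, n_a=25, n_b=20, t=25)
print(f"one-sided p-value = {p_value:.4f}")
```

Because the computation uses exact integers until the final division, it avoids the rounding issues of tabulating each table separately.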

REMARK: The problem is often associated with a 2 × 2 table, where there are three possibilities: column sums and row sums both constant, one set constant and the other variable, or both variable. Other measures can then be introduced (e.g. Santner and Snell). A Bayesian approach has been carried out by several authors, e.g. Howard and also Pham-Gia and Turkkan, who computed the credible intervals for several of these measures.

3. The Bayesian Approach

In the estimation of the difference of two proportions the Bayesian approach certainly plays an important role. Agresti and Coull provide some interesting remarks on various approaches.

Again, let $\pi = \pi_1 - \pi_2$. The Bayesian approach encounters serious computational difficulties if we do not have a closed form expression for the density of the difference of two independently beta distributed random variables. Such an expression has been obtained by the first author some time ago and is recalled below.

3.1. Bayesian Test on the Equality of Two Proportions

Let us recall first the following theorem:

Theorem 1: Let $\pi_i \sim \mathrm{beta}(\alpha_i, \beta_i)$, for $i = 1, 2$, be two independent beta distributed random variables with parameters $(\alpha_1, \beta_1)$ and $(\alpha_2, \beta_2)$, respectively. Then the difference $\pi = \pi_1 - \pi_2$ has density defined on $(-1, 1)$ as follows:

$$p_\pi(x) = \begin{cases} B(\alpha_2, \beta_1)\, x^{\beta_1 + \beta_2 - 1} (1 - x)^{\alpha_2 + \beta_1 - 1}\, F_1(\beta_1,\ \alpha_1 + \alpha_2 + \beta_1 + \beta_2 - 2,\ 1 - \alpha_1;\ \beta_1 + \alpha_2;\ 1 - x,\ 1 - x^2) / A, & 0 < x \le 1, \\[1ex] B(\alpha_1 + \alpha_2 - 1,\ \beta_1 + \beta_2 - 1) / A, & x = 0, \text{ if } \alpha_1 + \alpha_2 > 1,\ \beta_1 + \beta_2 > 1, \\[1ex] B(\alpha_1, \beta_2)\, (-x)^{\beta_1 + \beta_2 - 1} (1 + x)^{\alpha_1 + \beta_2 - 1}\, F_1(\beta_2,\ 1 - \alpha_2,\ \alpha_1 + \alpha_2 + \beta_1 + \beta_2 - 2;\ \alpha_1 + \beta_2;\ 1 - x^2,\ 1 + x) / A, & -1 \le x < 0, \end{cases} \tag{1}$$

where $A = B(\alpha_1, \beta_1) B(\alpha_2, \beta_2)$.

$F_1(\cdot)$ is Appell's first hypergeometric function, which is defined as

$$F_1(a, b_1, b_2; c; x_1, x_2) = \sum_{i=0}^{\infty} \sum_{j=0}^{\infty} \frac{a^{[i+j]}\, b_1^{[i]}\, b_2^{[j]}}{c^{[i+j]}} \frac{x_1^i\, x_2^j}{i!\, j!} \tag{2}$$

where $a^{[b]} = a(a+1) \cdots (a+b-1)$. This infinite series is convergent for $|x_1| < 1$ and $|x_2| < 1$, where, as shown by Euler, it can also be expressed as a convergent integral:

$$\frac{\Gamma(c)}{\Gamma(a)\Gamma(c-a)} \int_0^1 u^{a-1} (1-u)^{c-a-1} (1 - u x_1)^{-b_1} (1 - u x_2)^{-b_2}\, du \tag{3}$$

which converges for $c - a > 0$, $a > 0$. In fact, Pham-Gia and Turkkan established the expression of the density of the difference using (3) directly and not the series. Hence, the infinite series (2) can be extended outside the two circles of convergence, by analytic continuation, where it is also denoted by $F_1(\cdot)$.
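To make the link between the series (2) and the integral (3) concrete, the following sketch evaluates both at a point inside the region of convergence and compares them. It is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the series is simply truncated, the integral uses a composite Simpson rule, and the integral routine assumes $a \ge 1$ and $c - a \ge 1$ so that the integrand stays finite at the end points.

```python
import math

def appell_f1_series(a, b1, b2, c, x1, x2, terms=60):
    """Truncated double series (2); only meaningful for |x1| < 1 and |x2| < 1."""
    total = 0.0
    row = 1.0  # term with j = 0 for the current i
    for i in range(terms):
        term = row
        for j in range(terms):
            total += term
            # ratio term(i, j + 1) / term(i, j)
            term *= (a + i + j) * (b2 + j) / ((c + i + j) * (j + 1)) * x2
        # ratio term(i + 1, 0) / term(i, 0)
        row *= (a + i) * (b1 + i) / ((c + i) * (i + 1)) * x1
    return total

def appell_f1_euler(a, b1, b2, c, x1, x2, n=2000):
    """Integral (3) by composite Simpson; assumes a >= 1 and c - a >= 1."""
    def g(u):
        return (u ** (a - 1) * (1 - u) ** (c - a - 1)
                * (1 - u * x1) ** (-b1) * (1 - u * x2) ** (-b2))
    h = 1.0 / n
    s = g(0.0) + g(1.0)
    for k in range(1, n):
        s += (4 if k % 2 else 2) * g(k * h)
    const = math.gamma(c) / (math.gamma(a) * math.gamma(c - a))
    return const * s * h / 3

series_val = appell_f1_series(2, 0.5, 1.5, 4, 0.3, -0.4)
euler_val = appell_f1_euler(2, 0.5, 1.5, 4, 0.3, -0.4)
print(series_val, euler_val)
```

Unlike the series, the integral form remains usable when an argument is negative and large in absolute value, which is the situation met in the densities of Section 4.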

Here, we denote the above density (1) by $\pi \sim \psi(\alpha_1, \beta_1, \alpha_2, \beta_2)$.

Proof: See Pham-Gia and Turkkan.

The prior distribution of $\pi$ is hence $\psi(\alpha_1, \beta_1, \alpha_2, \beta_2)$, obtained from the two beta priors. Various approaches in Bayesian testing are given below.

Bayesian Testing Using a Significance Level

While frequentist statistics frequently does not test $H_0: \pi \le \eta$ vs. $H_1: \pi > \eta$ for $\eta > 0$, and limits itself to the case $\eta = 0$, Bayesian statistics can easily do it.

a) One-sided test:

Proposition 1: To perform the above test at the 0.05 significance level, using the two independent samples $\{X_{1,i}\}_{i=1}^{n_1}$ and $\{X_{2,i}\}_{i=1}^{n_2}$, we compute $p_{\pi_1 - \pi_2}(\pi_1 - \pi_2 \mid \alpha_1^*, \beta_1^*, \alpha_2^*, \beta_2^*)$, where $\alpha_i^* = \alpha_i + x_i$ and $\beta_i^* = \beta_i + n_i - x_i$, $i = 1, 2$. This expression of the posterior density of $\pi$, obtained by the conjugacy of binomial sampling with the beta prior, will allow us to compute $P(\pi > \eta)$ and compare it with the significance level $\alpha$.

For example, as in the frequentist example of Section 2.1, we consider $n_1 = 25$, $x_1 = 17$, $n_2 = 20$, $x_2 = 8$ and use two non-informative beta priors, that is, $\mathrm{Beta}(0.5, 0.5)$.

We note first that $\hat{\pi}_1 = 17/25 = 0.68$, $\hat{\pi}_2 = 8/20 = 0.40$, giving $\hat{\pi} = 0.28$.

We obtain the prior and posterior distributions of $\pi_1$ and $\pi_2$ (Figure 1). We wish to test:

$$H_0: \pi \le 0.35 \quad \text{vs} \quad H_1: \pi > 0.35 \tag{4}$$

We have $\alpha_1^* = 17.5$, $\beta_1^* = 8.5$, $\alpha_2^* = 8.5$, $\beta_2^* = 12.5$: $H_1$ has posterior probability $\Pr(\pi > 0.35) = \int_{0.35}^{1} \psi(x; 17.5, 8.5, 8.5, 12.5)\, dx = 0.2855$, and we fail to reject $H_0$ at the 0.05 level. This means that the data combined with our judgment are not enough to make us accept that the difference of these proportions exceeds 0.35. Naturally, different informative, or non-informative, priors can be considered for $\pi_1$ and $\pi_2$ separately, and the test can be carried out in the same way.
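The posterior probability in Proposition 1 can be approximated without the closed form (1) by simulating from the two beta posteriors. The sketch below is such a Monte Carlo stand-in, not the exact integration used in the text: the helper names, the number of draws and the seed are arbitrary choices, and `random.betavariate` comes from the Python standard library.

```python
import random

def posterior_params(alpha, beta, x, n):
    """Conjugate update: Beta(alpha, beta) prior, x successes in n trials."""
    return alpha + x, beta + n - x

def prob_difference_exceeds(eta, post1, post2, draws=200_000, seed=1):
    """Monte Carlo estimate of Pr(pi1 - pi2 > eta) under independent beta posteriors."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(draws):
        if rng.betavariate(*post1) - rng.betavariate(*post2) > eta:
            hits += 1
    return hits / draws

post1 = posterior_params(0.5, 0.5, 17, 25)  # posterior of pi1
post2 = posterior_params(0.5, 0.5, 8, 20)   # posterior of pi2
p_h1 = prob_difference_exceeds(0.35, post1, post2)
print(post1, post2, round(p_h1, 3))
```

With 200,000 draws the simulation error is of the order of 0.001, so the estimate should agree with the exact value to about two decimal places.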

b) Point-null hypothesis:

The point null hypothesis $H_0: \pi = \eta$ vs. $H_1: \pi \ne \eta$, to be tested at the significance level $\alpha$ in Bayesian statistics, has been a subject of study and discussion in the literature. Several difficulties still remain concerning this case, especially on the prior probability assigned to the value $\eta$ (see Berger). We use here Lindley's compromise (Lee), which consists of computing the $(1 - \alpha)100\%$ highest posterior density (hpd) interval and accepting or rejecting $H_0$ depending on whether or not $\eta$ belongs to that interval. Here, for the same example, if $\eta = 0.35$, using Pham-Gia and Turkkan's algorithm, the 95% hpd interval for $\pi$ is $(-0.0079;\ 0.5381)$, which leads us to technically accept $H_0$ (see Figure 2), although the lower bound of the hpd interval can be considered as zero and we can practically reject $H_0$.

We can see that the above conclusions on π are consistent with each other.
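Lindley's compromise only needs an hpd interval, which can also be approximated from posterior draws. The sketch below is a sample-based substitute for the algorithm cited in the text, not a reimplementation of it: it takes the shortest interval that contains 95% of the sorted draws, which approximates the hpd interval when the posterior is unimodal. The function name, seed and number of draws are illustrative.

```python
import math
import random

def hpd_interval(samples, mass=0.95):
    """Shortest interval containing `mass` of the samples (unimodal case)."""
    xs = sorted(samples)
    n = len(xs)
    k = math.ceil(mass * n)  # number of points the interval must cover
    # Slide a window of k consecutive order statistics and keep the shortest.
    best = min(range(n - k + 1), key=lambda i: xs[i + k - 1] - xs[i])
    return xs[best], xs[best + k - 1]

rng = random.Random(7)
draws = [rng.betavariate(17.5, 8.5) - rng.betavariate(8.5, 12.5)
         for _ in range(200_000)]
lo, hi = hpd_interval(draws)
eta = 0.35
print(f"approximate 95% hpd interval: ({lo:.3f}, {hi:.3f})")
print("eta inside interval:", lo <= eta <= hi)
```

The end points fluctuate from one seed to another in the second or third decimal place, so a value of $\eta$ very close to a bound should be checked with the exact density.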

3.2. Bayesian Testing Using the Bayes Factor

Bayesian hypothesis testing can also be carried out using the Bayes factor B, which would give the relative weight of the null hypothesis w.r.t. the alternative one, when data is taken into consideration. This factor is defined as the ratio of the posterior odds over the prior odds. With the above expression of the difference of two betas given by (1) we can now accurately compute the Bayes factor associated with the difference of two proportions. We consider two cases:

a) Simple hypothesis: $H_0: \pi = a$ vs $H_1: \pi = b$. Then

$$B = \frac{p_\pi(\pi \mid a)}{p_\pi(\pi \mid b)},$$

which corresponds to the value of the posterior density of $\pi$ at $a$, divided by the value of the posterior density of $\pi$ at $b$. As an application, let us consider the following hypotheses (different from the previous numerical example): $H_0: \pi = 0.35$ vs. $H_1: \pi = 0.25$, where we have uniform priors for both $\pi_1$ and $\pi_2$, and where we consider the sampling results from Table 1. We obtain the posterior parameters $\alpha_1^* = 18$, $\beta_1^* = 9$, $\alpha_2^* = 9$, $\beta_2^* = 13$. Using the density of the difference (1), we calculate the Bayes factor,

$$B = \frac{\psi(0.35 \mid \alpha_1^*, \beta_1^*, \alpha_2^*, \beta_2^*)}{\psi(0.25 \mid \alpha_1^*, \beta_1^*, \alpha_2^*, \beta_2^*)} = 0.8416.$$

This value indicates that the data slightly favor $H_1$ over $H_0$, which is a logical conclusion since $\hat{\pi} = 0.28$.

Table 1. Data on customers in area A and B.

Response    Area A    Area B    Combined
Yes           17         8         25
No             8        12         20
Totals        25        20         45

b) Composite hypothesis: As an application, let us consider the hypotheses (4), that is, $H_0: \pi \le 0.35$ vs. $H_1: \pi > 0.35$.

In general, $H_0: \pi \in \Theta_0$ vs. $H_1: \pi \in \Theta_1$, where $\Theta_0 \cup \Theta_1 = \mathbb{R}$. We have $p_0 = \Pr(\pi \in \Theta_0 \mid \text{posterior})$ and $p_1 = \Pr(\pi \in \Theta_1 \mid \text{posterior})$ (or $p_1 = 1 - p_0$) as posterior probabilities. Consequently, we define the posterior odds on $H_0$ against $H_1$ as $p_0 / p_1$. Similarly, we have the prior odds on $H_0$ against $H_1$, which we define here as $z_0 / z_1$. The Bayes factor is

$$B = \frac{p_0 z_1}{p_1 z_0}.$$

Again, we use the sampling results from Table 1, yielding the prior and posterior distributions presented in Figure 1, with a $\mathrm{Beta}(0.5, 0.5)$ prior separately for each proportion.

Now, for the hypotheses (4), with $\pi \sim \psi(\alpha_1^*, \beta_1^*, \alpha_2^*, \beta_2^*)$, we can determine the required prior and posterior probabilities. For example,

$$p_0 = \int_{-1}^{0.35} \psi(t \mid \alpha_1^*, \beta_1^*, \alpha_2^*, \beta_2^*)\, dt$$

gives $p_0 = 0.7145$. In the same way, we obtain $z_0 = 0.745$, using the prior $\psi(1/2, 1/2, 1/2, 1/2)$. Since $p_1 = 1 - p_0$ and $z_1 = 1 - z_0$, we have $p_1 = 0.2855$ and $z_1 = 0.255$. Finally, the Bayes factor is $B = 0.8566$, which is a mild argument in favor of $H_1$.
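The arithmetic of the composite case is short enough to script. In the sketch below the Bayes factor is computed from the values $p_0$ and $z_0$ reported in the text, and the prior probability $z_0$ is additionally approximated by simulation from the two $\mathrm{Beta}(0.5, 0.5)$ priors. The function names, seed and number of draws are assumptions made for illustration.

```python
import random

def bayes_factor(p0, z0):
    """Posterior odds on H0 divided by prior odds on H0."""
    p1, z1 = 1.0 - p0, 1.0 - z0
    return (p0 / p1) / (z0 / z1)

def prob_difference_at_most(eta, params1, params2, draws=200_000, seed=3):
    """Monte Carlo estimate of Pr(pi1 - pi2 <= eta) for independent betas."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(draws)
               if rng.betavariate(*params1) - rng.betavariate(*params2) <= eta)
    return hits / draws

B = bayes_factor(p0=0.7145, z0=0.745)
z0_mc = prob_difference_at_most(0.35, (0.5, 0.5), (0.5, 0.5))
print(f"B = {B:.4f}, simulated z0 = {z0_mc:.3f}")
```

Because $B$ is a ratio of odds, small errors in $p_0$ or $z_0$ are amplified, so the simulated $z_0$ is best used as a sanity check rather than as a replacement for the exact integral.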

4. Prior and Posterior Densities of π − η

The testing above can be seen to be quite straightforward, and is limited to some numerical values of the function $\psi(\cdot)$ that can be numerically computed. But to make an in-depth study of the Bayesian approach to the difference $\pi - \eta = \pi_1 - (\pi_2 + \eta)$, we need to consider the analytic expressions of the prior and posterior distributions of this variable, which can be obtained only from the general beta distribution. Naturally, the related mathematical formulas become more complicated. But Pham-Gia and Turkkan have also established the expression of the density of $X_1 + X_2$, where both have general beta distributions.

4.1. The Difference of Two General Betas

The general beta (or GB), defined on a finite interval, say $(c, d)$, has a density:

$$f_{gb}(x; \alpha, \beta; c, d) = \frac{(x - c)^{\alpha - 1} (d - x)^{\beta - 1}}{(d - c)^{\alpha + \beta - 1} B(\alpha, \beta)}, \quad \alpha, \beta > 0,\ c \le x \le d \tag{5}$$

and is denoted by $X \sim GB(\alpha, \beta; c, d)$. It reduces to the standard beta above when $c = 0$ and $d = 1$. Conversely, a standard beta can be transformed into a general beta by addition of, and/or multiplication by, a constant.

Theorem 2: Let $X \sim GB(\alpha, \beta; a, b)$ and let $\theta, \lambda$ be any two scalars. Then

1) $X + \theta \sim GB(\alpha, \beta; a + \theta, b + \theta)$,

2) $\lambda X \sim GB(\alpha, \beta; \lambda a, \lambda b)$ when $\lambda > 0$, and $\lambda X \sim GB(\beta, \alpha; \lambda b, \lambda a)$ when $\lambda < 0$.

Proof:

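1) We have

$$\begin{aligned} f_{X + \theta}(y) = f_X(y - \theta) &= \frac{((y - \theta) - a)^{\alpha - 1} (b - (y - \theta))^{\beta - 1}}{(b - a)^{\alpha + \beta - 1} B(\alpha, \beta)}, && a \le y - \theta \le b \\ &= \frac{(y - (a + \theta))^{\alpha - 1} ((b + \theta) - y)^{\beta - 1}}{((b + \theta) - (a + \theta))^{\alpha + \beta - 1} B(\alpha, \beta)}, && a + \theta \le y \le b + \theta. \end{aligned}$$

2) For $\lambda > 0$,

$$\begin{aligned} f_{\lambda X}(y) = \frac{1}{\lambda} f_X(y / \lambda) &= \frac{1}{\lambda} \frac{(y/\lambda - a)^{\alpha - 1} (b - y/\lambda)^{\beta - 1}}{(b - a)^{\alpha + \beta - 1} B(\alpha, \beta)}, && a \le y/\lambda \le b \\ &= \frac{(y - \lambda a)^{\alpha - 1} (\lambda b - y)^{\beta - 1}}{(\lambda b - \lambda a)^{\alpha + \beta - 1} B(\alpha, \beta)}, && \lambda a \le y \le \lambda b. \end{aligned}$$

When $\lambda < 0$,

$$\begin{aligned} f_{\lambda X}(y) = -\frac{1}{\lambda} f_X(y / \lambda) &= -\frac{1}{\lambda} \frac{(y/\lambda - a)^{\alpha - 1} (b - y/\lambda)^{\beta - 1}}{(b - a)^{\alpha + \beta - 1} B(\alpha, \beta)}, && a \le y/\lambda \le b \\ &= \frac{(y - \lambda b)^{\beta - 1} (\lambda a - y)^{\alpha - 1}}{(\lambda a - \lambda b)^{\alpha + \beta - 1} B(\alpha, \beta)}, && \lambda b \le y \le \lambda a. \end{aligned}$$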

Q.E.D.
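Theorem 2 can be checked numerically by comparing the general beta density with the parameters it predicts against the usual change-of-variables formula. The sketch below is only an illustration; `gb_pdf` and `pdf_of_affine` are hypothetical helper names, and the parameter values are arbitrary.

```python
import math

def beta_fn(a, b):
    """Beta function B(a, b) computed through log-gamma."""
    return math.exp(math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b))

def gb_pdf(x, alpha, beta, c, d):
    """General beta density (5) on (c, d); zero outside the support."""
    if x <= c or x >= d:
        return 0.0
    return ((x - c) ** (alpha - 1) * (d - x) ** (beta - 1)
            / ((d - c) ** (alpha + beta - 1) * beta_fn(alpha, beta)))

def pdf_of_affine(y, alpha, beta, a, b, lam, theta):
    """Density of lam * X + theta at y, for X ~ GB(alpha, beta; a, b), lam != 0."""
    return gb_pdf((y - theta) / lam, alpha, beta, a, b) / abs(lam)

alpha, beta, a, b = 2.5, 4.0, 1.0, 3.0

# Part 1: a shift by theta moves both end points.
shifted = pdf_of_affine(2.2, alpha, beta, a, b, lam=1.0, theta=0.7)
print(shifted, gb_pdf(2.2, alpha, beta, a + 0.7, b + 0.7))

# Part 2, lam < 0: the shape parameters and the end points swap roles.
scaled = pdf_of_affine(-3.5, alpha, beta, a, b, lam=-2.0, theta=0.0)
print(scaled, gb_pdf(-3.5, beta, alpha, -2.0 * b, -2.0 * a))
```

The negative-scaling case is the one used later to write $-X_2$ as a general beta before applying the result on sums.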

Pham-Gia and Turkkan gave the expression of the density of $X_1 + X_2$, where $X_1$ and $X_2$ are independent general beta variables. The density of $X_1 - X_2$, which is only mentioned there, is explicitly given below.

Proposition 2:

Let $X_1 \sim GB(\alpha, \beta; c, d)$ and $X_2 \sim GB(\gamma, \delta; e, f)$. For the difference $X_1 - X_2$ defined on $(c - f, d - e)$, there are two different cases to consider, depending on the relative values of $c - e$ and $d - f$, since $X_1$ and $X_2$ do not have symmetrical roles.

Case 1:

$$c - f \le d - f \le c - e \le d - e \tag{6}$$

Case 2:

$$c - f \le c - e \le d - f \le d - e \tag{7}$$

Theorem 3: Let $X_1$ and $X_2$ be two independent general betas with their supports satisfying (6). Then $Y = X_1 - X_2$ has its density defined as follows:

For $c - f \le y \le d - f$,

$$f(y) = \frac{(y - (c - f))^{\alpha + \delta - 1} (d - f - y)^{\beta - 1} B(\delta, \alpha)}{(d - c)^{\alpha + \beta - 1} (f - e)^{\delta} B(\delta, \gamma) B(\alpha, \beta)}\, F_1\!\left( \delta, 1 - \beta, 1 - \gamma; \alpha + \delta; \frac{(c - f) - y}{(d - f) - y}, \frac{y - (c - f)}{f - e} \right) \tag{8}$$

For $d - f \le y \le c - e$,

$$f(y) = \frac{(y - (d - f))^{\delta - 1} (d - e - y)^{\gamma - 1}}{(f - e)^{\delta + \gamma - 1} B(\delta, \gamma)}\, F_1\!\left( \beta, 1 - \delta, 1 - \gamma; \alpha + \beta; \frac{c - d}{y - (d - f)}, \frac{d - c}{d - e - y} \right) \tag{9}$$

and for $c - e \le y \le d - e$,

$$f(y) = \frac{((d - e) - y)^{\beta + \gamma - 1} (y - (d - f))^{\delta - 1} B(\beta, \gamma)}{(d - c)^{\beta} (f - e)^{\delta + \gamma - 1} B(\delta, \gamma) B(\alpha, \beta)}\, F_1\!\left( \beta, 1 - \alpha, 1 - \delta; \beta + \gamma; \frac{(d - e) - y}{d - c}, \frac{y - (d - e)}{y - (d - f)} \right) \tag{10}$$

where $F_1(\cdot)$ is Appell's first hypergeometric function already discussed.

Proof:

The argument first uses part 2) of Theorem 2 to obtain that $-X_2 \sim GB(\delta, \gamma; -f, -e)$. Then, it uses the exact expression of the density of the sum of two general betas (see Theorem 2 in the article of T. Pham-Gia and N. Turkkan).

Q.E.D.

We denote the above density given by (8), (9) and (10) by $\varphi_\pi(\alpha_1, \beta_1, \alpha_2, \beta_2; c, d, e, f)$.

Note: The corresponding case 2, when relation (7) is satisfied, is given in Appendix 1 (Theorem 3a).

To study the density of $\pi - \eta = \pi_1 - (\pi_2 + \eta)$, a particular case that will be used in our study here is the difference between $X_1 \sim GB(\alpha_1, \beta_1; 0, 1)$ and $X_2 \sim GB(\alpha_2, \beta_2; \eta, \eta + 1)$, $-1 \le \eta \le 1$, with $\eta$ being a positive constant.

In this case both Theorem 3 and Theorem 3a apply, since $c - e = d - f$, and the middle section in the definition of $\varphi_\pi(\alpha_1, \beta_1, \alpha_2, \beta_2; c, d, e, f)$ disappears.

Theorem 4: Let $X_1 \sim GB(\alpha_1, \beta_1; 0, 1)$ and $X_2 \sim GB(\alpha_2, \beta_2; \eta, \eta + 1)$ be two independent general beta distributed random variables. Then the density of $Y = X_1 - X_2$, defined on $[-(\eta + 1), 1 - \eta]$, is:

1) for $-\eta - 1 \le y \le -\eta$,

$$f(y) = \frac{(y + (\eta + 1))^{\alpha_1 + \beta_2 - 1} (-\eta - y)^{\beta_1 - 1} B(\alpha_1, \beta_2)}{B(\alpha_1, \beta_1) B(\alpha_2, \beta_2)}\, F_1\!\left( \beta_2, 1 - \beta_1, 1 - \alpha_2; \alpha_1 + \beta_2; \frac{(\eta + 1) + y}{\eta + y}, y + (\eta + 1) \right)$$

2) for $-\eta \le y \le 1 - \eta$,

$$f(y) = \frac{((1 - \eta) - y)^{\alpha_2 + \beta_1 - 1} (y + \eta)^{\beta_2 - 1} B(\alpha_2, \beta_1)}{B(\alpha_1, \beta_1) B(\alpha_2, \beta_2)}\, F_1\!\left( \beta_1, 1 - \alpha_1, 1 - \beta_2; \alpha_2 + \beta_1; (1 - \eta) - y, \frac{y - (1 - \eta)}{y + \eta} \right)$$

and we denote this distribution by

$$Y \sim \xi_\eta(\alpha_1, \beta_1, \alpha_2, \beta_2; \eta). \tag{11}$$

Proof:

This is a special case of Theorem 3.

Q.E.D.

An equivalent form using Theorem 3a leads to a slightly different expression, which gives, however, the same numerical values for the density of $\pi - \eta$ (see Theorem 4a in Appendix 1).

4.2. Prior and Posterior Distributions of π − η

Let $\pi_i$, $i = 1, 2$, be two independent beta distributed random variables, the first being a regular beta, $\pi_1 \sim \mathrm{beta}(\alpha_1, \beta_1)$, and the second being a general beta, $\pi_2 \sim GB(\alpha_2, \beta_2; \eta, 1 + \eta)$.

Binomial sampling, with these two different beta priors, leads to the following

Proposition 3: The prior distribution of $\pi - \eta = \pi_1 - (\pi_2 + \eta)$ is $\xi_\eta(\alpha_1, \beta_1, \alpha_2, \beta_2; \eta)$, given by (11), and its posterior distribution is $\xi_\eta(\alpha_1^*, \beta_1^*, \alpha_2^*, \beta_2^*; \eta)$ with $\alpha_i^* = \alpha_i + x_i$ and $\beta_i^* = \beta_i + n_i - x_i$, $i = 1, 2$.

Proof:

$\pi_1 - (\pi_2 + \eta)$ is the difference of two random variables with respective distributions $\mathrm{beta}(\alpha_1, \beta_1)$ and $GB(\alpha_2, \beta_2; \eta, \eta + 1)$. The prior distribution of $\pi - \eta$ is hence $\xi_\eta(\alpha_1, \beta_1, \alpha_2, \beta_2; \eta)$, as given by (11).

Binomial sampling affects these two distributions in different ways. For the first, the posterior is $\mathrm{beta}(\alpha_1 + x_1, \beta_1 + n_1 - x_1)$, while the posterior distribution of the second is $GB(\alpha_2 + x_2, \beta_2 + n_2 - x_2; \eta, \eta + 1)$ (see Proposition 3a in Appendix 2). Figure 3 shows the prior and the posterior of $\pi_2 + 0.35$.

From Theorem 4, we obtain the expression of the posterior density $\xi_{0.35}(17.5, 8.5, 8.5, 12.5; 0.35)$ of $\pi - \eta$ as follows:

$$f(x) = \begin{cases} \dfrac{(x + 1.35)^{29} (-0.35 - x)^{7.5} B(17.5, 12.5)}{B(17.5, 8.5) B(8.5, 12.5)}\, F_1\!\left( 12.5, -7.5, -7.5; 30; \dfrac{1.35 + x}{0.35 + x}, x + 1.35 \right), & -1.35 \le x < -0.35, \\[2ex] \dfrac{(0.65 - x)^{16} (x + 0.35)^{11.5} B(8.5, 8.5)}{B(17.5, 8.5) B(8.5, 12.5)}\, F_1\!\left( 8.5, -16.5, -11.5; 17; 0.65 - x, \dfrac{x - 0.65}{x + 0.35} \right), & -0.35 \le x < 0.65. \end{cases} \tag{12}$$

Figure 4 shows the above density.
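The posterior density of $\pi - \eta$ can be evaluated directly from the two branches of Theorem 4 once $F_1$ is available. The sketch below computes $F_1$ through the Euler integral (3) with a Simpson rule, which is valid here because the first parameter and the difference between the fourth and first parameters both exceed 1 for these posterior values. It then integrates the density to check that the total mass is close to 1 and that the mass above 0 recovers $\Pr(\pi > 0.35)$. All function names and grid sizes are illustrative assumptions.

```python
import math

def beta_fn(a, b):
    return math.exp(math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b))

def simpson(g, lo, hi, n=200):
    h = (hi - lo) / n
    s = g(lo) + g(hi)
    for k in range(1, n):
        s += (4 if k % 2 else 2) * g(lo + k * h)
    return s * h / 3

def appell_f1(a, b1, b2, c, x1, x2):
    """Appell F1 through the Euler integral (3); assumes a >= 1 and c - a >= 1."""
    g = lambda u: (u ** (a - 1) * (1 - u) ** (c - a - 1)
                   * (1 - u * x1) ** (-b1) * (1 - u * x2) ** (-b2))
    return simpson(g, 0.0, 1.0) / beta_fn(a, c - a)

def xi_density(y, a1, b1, a2, b2, eta):
    """Density of pi1 - (pi2 + eta), following the two branches of Theorem 4."""
    norm = beta_fn(a1, b1) * beta_fn(a2, b2)
    if -eta - 1 < y < -eta:
        pref = (y + eta + 1) ** (a1 + b2 - 1) * (-eta - y) ** (b1 - 1) * beta_fn(a1, b2)
        f1 = appell_f1(b2, 1 - b1, 1 - a2, a1 + b2, (eta + 1 + y) / (eta + y), y + eta + 1)
    elif -eta < y < 1 - eta:
        pref = (1 - eta - y) ** (a2 + b1 - 1) * (y + eta) ** (b2 - 1) * beta_fn(a2, b1)
        f1 = appell_f1(b1, 1 - a1, 1 - b2, a2 + b1, 1 - eta - y, (y - (1 - eta)) / (y + eta))
    else:
        return 0.0
    return pref * f1 / norm

def integrate_midpoint(f, lo, hi, n=200):
    """Midpoint rule, so the branch boundaries are never evaluated."""
    h = (hi - lo) / n
    return h * sum(f(lo + (k + 0.5) * h) for k in range(n))

post, eta = (17.5, 8.5, 8.5, 12.5), 0.35
f = lambda y: xi_density(y, *post, eta)
total = integrate_midpoint(f, -1 - eta, -eta) + integrate_midpoint(f, -eta, 1 - eta)
upper = integrate_midpoint(f, 0.0, 1 - eta)  # Pr(pi - eta > 0) = Pr(pi > 0.35)
print(f"total mass = {total:.4f}, Pr(pi > 0.35) = {upper:.4f}")
```

The midpoint rule is used for the outer integral because the first argument of $F_1$ in branch 1 is unbounded as $y$ approaches $-\eta$, even though the density itself stays finite there.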

5. Conclusion

The Bayesian approach to testing the difference of two independent proportions leads to interesting results which agree with frequentist results when non-informative priors are considered. Undoubtedly, all preceding results can be generalized to other measures frequently used in a 2 × 2 table.

Acknowledgements

Research partially supported by NSERC grant 9249 (Canada). The authors wish to thank the Université de Moncton Faculty of Graduate Studies and Research for the assistance provided while conducting this work.

Cite this paper

Pham-Gia, T., Thin, N.V. and Doan, P.P. (2017) Inferences on the Difference of Two Proportions: A Bayesian Approach. Open Journal of Statistics, 7, 1-15. https://doi.org/10.4236/ojs.2017.71001

Appendix 1

Below is the expression of the density of $Y = X_1 - X_2$ when (7) is satisfied, instead of (6). This expression, together with the one given in Theorem 3, covers all cases.

Theorem 3a: Let $X_1$ and $X_2$ be two independent general betas with their supports satisfying (7). Then $Y = X_1 - X_2$ has its density defined as follows:

For $c - f \le y \le c - e$,

$$f(y) = \frac{(y - (c - f))^{\alpha + \delta - 1} (c - e - y)^{\gamma - 1} B(\alpha, \delta)}{(f - e)^{\delta + \gamma - 1} (d - c)^{\alpha} B(\alpha, \beta) B(\delta, \gamma)}\, F_1\!\left( \alpha, 1 - \gamma, 1 - \beta; \alpha + \delta; \frac{(c - f) - y}{(c - e) - y}, \frac{y - (c - f)}{d - c} \right) \tag{13}$$

For $c - e \le y \le d - f$,

$$f(y) = \frac{(y - (c - e))^{\alpha - 1} (d - e - y)^{\beta - 1}}{(d - c)^{\alpha + \beta - 1} B(\alpha, \beta)}\, F_1\!\left( \gamma, 1 - \alpha, 1 - \beta; \delta + \gamma; \frac{e - f}{y - (c - e)}, \frac{f - e}{d - e - y} \right) \tag{14}$$

For $d - f \le y \le d - e$,

$$f(y) = \frac{((d - e) - y)^{\beta + \gamma - 1} (y - (c - e))^{\alpha - 1} B(\beta, \gamma)}{(f - e)^{\gamma} (d - c)^{\alpha + \beta - 1} B(\delta, \gamma) B(\alpha, \beta)}\, F_1\!\left( \gamma, 1 - \delta, 1 - \alpha; \beta + \gamma; \frac{(d - e) - y}{f - e}, \frac{y - (d - e)}{y - (c - e)} \right) \tag{15}$$

Proof:

By rewriting Y = ( − X 2 ) − ( − X 1 ) , we can apply the above Theorem 2 and Theorem 3.

Q.E.D

A parallel, and equivalent, result to Theorem 4 is given below:

Theorem 4a: The density of $X_1 - X_2 - \eta$ is:

For $-\eta - 1 \le y \le -\eta$,

$$f(y) = \frac{(y + (\eta + 1))^{\alpha_1 + \beta_2 - 1} (-\eta - y)^{\alpha_2 - 1} B(\alpha_1, \beta_2)}{B(\alpha_1, \beta_1) B(\alpha_2, \beta_2)}\, F_1\!\left( \alpha_1, 1 - \alpha_2, 1 - \beta_1; \alpha_1 + \beta_2; \frac{(\eta + 1) + y}{\eta + y}, y + (\eta + 1) \right)$$

For $-\eta \le y \le 1 - \eta$,

$$f(y) = \frac{((1 - \eta) - y)^{\alpha_2 + \beta_1 - 1} (y + \eta)^{\alpha_1 - 1} B(\alpha_2, \beta_1)}{B(\alpha_1, \beta_1) B(\alpha_2, \beta_2)}\, F_1\!\left( \alpha_2, 1 - \beta_2, 1 - \alpha_1; \alpha_2 + \beta_1; (1 - \eta) - y, \frac{y - (1 - \eta)}{y + \eta} \right)$$

and we denote $Y \sim \xi_\eta^*(\alpha_1, \beta_1, \alpha_2, \beta_2; \eta)$.

Proof:

Similar to the proof of Theorem 4.

Q.E.D

Appendix 2

Proposition 3a:

Suppose that $X_2 \sim \mathrm{Bin}(n_2, \pi_2)$ and $\pi_2$ has the prior distribution $\mathrm{beta}(\alpha_2, \beta_2)$. Then the posterior distribution of $\pi_2 + \eta$ is $GB(\alpha_2 + x_2, \beta_2 + n_2 - x_2; \eta, \eta + 1)$.

Proof:

The prior distribution of $\theta = \pi_2 + \eta$ is $GB(\alpha_2, \beta_2; \eta, \eta + 1)$ (see Theorem 2), with the pdf

$$f_{\pi_2 + \eta}(\theta) = [B(\alpha_2, \beta_2)]^{-1} (\theta - \eta)^{\alpha_2 - 1} (1 + \eta - \theta)^{\beta_2 - 1}, \quad \eta \le \theta \le \eta + 1.$$

The likelihood function is

$$f_{X_2 \mid \pi_2 + \eta}(x_2 \mid \theta) = f_{X_2 \mid \pi_2}(x_2 \mid \pi_2) = \binom{n_2}{x_2} \pi_2^{x_2} (1 - \pi_2)^{n_2 - x_2}, \quad x_2 = 0, 1, \ldots, n_2.$$

Thus the marginal distribution of $X_2$, the number of successes, with $\pi_2 = \theta - \eta$, has density:

$$\begin{aligned} K(x_2 \mid \alpha_2, \beta_2, n_2) &= \frac{\binom{n_2}{x_2}}{B(\alpha_2, \beta_2)} \int_\eta^{\eta + 1} (\theta - \eta)^{\alpha_2 - 1} (1 + \eta - \theta)^{\beta_2 - 1} \pi_2^{x_2} (1 - \pi_2)^{n_2 - x_2}\, d\theta \\ &= \frac{\binom{n_2}{x_2}}{B(\alpha_2, \beta_2)} \int_\eta^{\eta + 1} (\theta - \eta)^{\alpha_2 - 1} (1 + \eta - \theta)^{\beta_2 - 1} (\theta - \eta)^{x_2} (1 + \eta - \theta)^{n_2 - x_2}\, d\theta \\ &= \frac{\binom{n_2}{x_2}}{B(\alpha_2, \beta_2)} \int_\eta^{\eta + 1} (\theta - \eta)^{\alpha_2 + x_2 - 1} (1 + \eta - \theta)^{\beta_2 + n_2 - x_2 - 1}\, d\theta \\ &= \frac{\binom{n_2}{x_2}}{B(\alpha_2, \beta_2)} B(\alpha_2 + x_2, \beta_2 + n_2 - x_2). \end{aligned}$$

Therefore, the posterior distribution of $\theta$ given $X_2 = x_2$ is

$$\begin{aligned} f_{\pi_2 + \eta \mid X_2}(\theta \mid x_2) &= \frac{f_{\pi_2 + \eta}(\theta)\, f_{X_2 \mid \pi_2 + \eta}(x_2 \mid \theta)}{K(x_2 \mid \alpha_2, \beta_2, n_2)} \\ &= \frac{[B(\alpha_2, \beta_2)]^{-1} (\theta - \eta)^{\alpha_2 - 1} (1 + \eta - \theta)^{\beta_2 - 1} \binom{n_2}{x_2} \pi_2^{x_2} (1 - \pi_2)^{n_2 - x_2}}{\dfrac{\binom{n_2}{x_2}}{B(\alpha_2, \beta_2)} B(\alpha_2 + x_2, \beta_2 + n_2 - x_2)}, \quad \text{with } \pi_2 = \theta - \eta,\ \eta \le \theta \le \eta + 1 \\ &= \frac{(\theta - \eta)^{\alpha_2 + x_2 - 1} (1 + \eta - \theta)^{\beta_2 + n_2 - x_2 - 1}}{B(\alpha_2 + x_2, \beta_2 + n_2 - x_2)}, \quad \eta \le \theta \le \eta + 1. \end{aligned}$$

This is the p.d.f. of $GB(\alpha_2 + x_2, \beta_2 + n_2 - x_2; \eta, \eta + 1)$.

Q. E. D.
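Proposition 3a can be verified numerically by normalising the product of the shifted prior and the binomial likelihood, and comparing the result with the general beta density it predicts. The sketch below is an illustration only: the helper names are hypothetical, the normalising constant is obtained by a midpoint rule (which avoids the end points, where the $\mathrm{Beta}(0.5, 0.5)$ prior is unbounded), and the data are those of area B.

```python
import math

def beta_fn(a, b):
    return math.exp(math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b))

def gb_pdf(x, alpha, beta, c, d):
    """General beta density on (c, d); zero outside the support."""
    if x <= c or x >= d:
        return 0.0
    return ((x - c) ** (alpha - 1) * (d - x) ** (beta - 1)
            / ((d - c) ** (alpha + beta - 1) * beta_fn(alpha, beta)))

def posterior_numeric(theta, alpha, beta, eta, x, n, grid=2000):
    """Prior(theta) times likelihood(x | pi2 = theta - eta), normalised numerically."""
    def unnorm(t):
        p = t - eta  # pi2 = theta - eta
        # The binomial coefficient is omitted because it cancels on normalising.
        return gb_pdf(t, alpha, beta, eta, eta + 1) * p ** x * (1 - p) ** (n - x)
    h = 1.0 / grid
    norm = h * sum(unnorm(eta + (k + 0.5) * h) for k in range(grid))
    return unnorm(theta) / norm

alpha2, beta2, eta, x2, n2 = 0.5, 0.5, 0.35, 8, 20
theta = 0.75
numeric = posterior_numeric(theta, alpha2, beta2, eta, x2, n2)
closed = gb_pdf(theta, alpha2 + x2, beta2 + n2 - x2, eta, eta + 1)
print(numeric, closed)
```

The agreement of the two values illustrates that shifting the parameter by $\eta$ changes only the support, not the conjugate update of the shape parameters.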


References

Pham-Gia, T. and Turkkan, N. (1993) Bayesian Analysis of the Difference of Two Proportions. Communications in Statistics - Theory and Methods, 22, 1755-1771. https://doi.org/10.1080/03610929308831114

Pham-Gia, T. and Turkkan, N. (2008) Bayesian Analysis of a 2 × 2 Contingency Table with Dependent Proportions and Exact Sample Sizes. Statistics, 42, 127-147. https://doi.org/10.1080/02331880701600380

Pham-Gia, T. and Turkkan, N. (2003) Determination of the Exact Sample Sizes in the Bayesian Estimation of the Difference between Two Proportions. Journal of the Royal Statistical Society, 52, 131-150. https://doi.org/10.1111/1467-9884.00347

D'Agostino, R., Chase, W. and Belanger, A. (1988) The Appropriateness of Some Common Procedures for Testing the Equality of Two Independent Binomial Populations. The American Statistician, 42, 198-202.

Cressie, N. (1978) Testing the Equality of Two Binomial Proportions. Annals of the Institute of Statistical Mathematics, 30, 421-427. https://doi.org/10.1007/BF02480232

Eberhardt, K.R. and Fligner, M.A. (1977) A Comparison of Two Tests for Equality of Two Proportions. The American Statistician, 21, 151-155. https://doi.org/10.1080/00031305.1977.10479225

Santner, T.J. and Snell, M.K. (1975) Small Sample Confidence Intervals for $p_1 - p_2$ and $p_1/p_2$ in 2 × 2 Contingency Tables. JASA, 75, 386-394.

Howard, J.V. (1998) The 2 × 2 Table: A Discussion from a Bayesian Viewpoint. Statistical Science, 13, 351-367. https://doi.org/10.1214/ss/1028905830

Agresti, A. and Coull, B. (1998) Approximate Is Better than Exact for Interval Estimation of Binomial Proportions. The American Statistician, 52, 2.

Berger, J. (1999) Bayes Factor. In: Kotz, S., Read, C.B. and Banks, D.L., Eds., Encyclopedia of Statistics, Update 3, Wiley, NY, 20-29.

Lee, P.M. (2004) Bayesian Statistics. An Introduction. 3rd Edition, Hodder Arnold, London.

Pham-Gia, T. and Turkkan, N. (1993) Computation of the Highest Posterior Density Interval in Bayesian Analysis. Journal of Statistical Computation and Simulation, 44, 243-250. https://doi.org/10.1080/00949659308811461

Pham-Gia, T. and Turkkan, N. (1998) Distribution of the Linear Combination of Two General Beta Variables and Applications. Communications in Statistics - Theory and Methods, 27, 1851-1869. https://doi.org/10.1080/03610929808832194

Pham-Gia, T. and Turkkan, N. (1994) Reliability of a Standby System with Beta-Distributed Component Lives. IEEE Transactions on Reliability, R43, 71-75. https://doi.org/10.1109/24.285114