On Sample Size Determination When Comparing Two Independent Spearman or Kendall Coefficients

Abstract

One of the most commonly used statistical methods is bivariate correlation analysis. However, it is usually the case that little or no attention is given to power and sample size considerations when planning a study in which correlation will be the primary analysis. In fact, when we reviewed studies published in clinical research journals in 2014, we found that none of the 111 articles that presented results of correlation analyses included a sample size justification. It is sometimes of interest to compare two correlation coefficients between independent groups. For example, one may wish to compare diabetics and non-diabetics in terms of the correlation of systolic blood pressure with age. Tools for performing power and sample size calculations for the comparison of two independent Pearson correlation coefficients are widely available; however, we were unable to identify any easily accessible tools for power and sample size calculations when comparing two independent Spearman rank correlation coefficients or two independent Kendall coefficients of concordance. In this article, we provide formulas and charts that can be used to calculate the sample size that is needed when testing the hypothesis that two independent Spearman or Kendall coefficients are equal.

Share and Cite:

May, J. and Looney, S. (2022) On Sample Size Determination When Comparing Two Independent Spearman or Kendall Coefficients. Open Journal of Statistics, 12, 291-302. doi: 10.4236/ojs.2022.122020.

1. Introduction

One of the most commonly used statistical methods is bivariate correlation analysis. Sometimes it is of interest to compare the correlation of two variables X and Y that have been calculated using two independent samples. For example, in an unpublished Master’s thesis [1], Stuart evaluated several potential biomarkers for severity of symptoms of dry mouth (xerostomia). As part of her assessment of these biomarkers, she examined the association of one of the potential biomarkers, p21, with another potential biomarker, PCNA. Stuart wished to compare two independent groups in terms of the association between p21 and PCNA: 1) healthy subjects and 2) dental patients with xerostomia. An important part of the planning of this study is to ask the question “How large a sample is needed in the two groups to achieve adequate power?”

Tools are widely available for performing sample size and power calculations when the analysis involves the comparison of two Pearson correlation coefficients (PCCs). These include, for example, tables [2], software packages (PASS, nQuery, G*Power), and internet-based tools (e.g., https://www.unistat.com/guide/sample-size-and-power-two-correlations/). However, as best we can determine, there are no easily accessible tools for sample size calculation when the planned analysis will be a comparison of either two Spearman rank correlation coefficients (SCCs) or two Kendall coefficients of concordance (KCCs). The PCC is the most commonly used measure of bivariate association; however, the SCC and KCC are also widely used. For example, Brough et al. [3] used the SCC to measure the associations between peanut protein levels found in various household environments, including dust, surfaces, bedding, furnishings and air. Heist et al. [4] used the KCC as the measure of association in their study of the use of bevacizumab as a chemotherapeutic agent for the treatment of advanced non-small cell lung cancer. The KCC is also frequently used in the analysis of environmental data. For example, Helsel ([5], pp. 227-228) used the KCC in his examination of concentrations of dissolved iron in water samples.

In May and Looney [6], we provided formulas and charts that can be used to determine the sample size needed for a hypothesis test of a single Spearman or Kendall coefficient. In this article, we extend those results to the situation in which it is desired to compare two independent Spearman or Kendall coefficients.

In Section 2, we briefly describe our previously published methods for sample size determination for a single Spearman or Kendall coefficient [6]. This will facilitate the description of our methods for determining the sample size when comparing two independent coefficients since the results in the two situations are similar in many ways. In Section 3, we present the formulas and charts for comparing two independent Spearman or Kendall coefficients; in Section 4, we provide some notes on the use of the sample size charts; and, in Section 5, we discuss our results.

2. Methods for Sample Size Determination for a Single Measure of Association

Let ξ denote the population value of the measure of association to be used (either the PCC, SCC or KCC). Suppose that we wish to test a hypothesis involving ξ. In the two-sided case, for example, the hypotheses to be tested can be stated as:

$$H_0: \xi = \xi_0 \quad \text{vs.} \quad H_a: \xi \neq \xi_0, \tag{1}$$

where $\xi_0$ is the pre-specified null value of the desired measure of association and $-1 < \xi_0 < 1$. For the PCC, the most common approach for testing the hypotheses in (1) is to use a test statistic based on the Fisher z-transform of the sample value of the PCC, denoted by r. For any $-1 < r < 1$, the Fisher z-transform of r is given by [7]:

$$z(r) = \tanh^{-1} r = \frac{1}{2}\ln\left(\frac{1+r}{1-r}\right), \tag{2}$$

which is asymptotically distributed as $N(\tanh^{-1}\rho, \sigma_z^2)$, where $\rho$ denotes the population value of the PCC and $\sigma_z^2$ denotes the asymptotic variance of z(r). The transformation in (2) can be applied to the sample SCC or KCC; this also yields an approximately normally distributed transformed coefficient. For the PCC, $\sigma_z^2 = 1/(n-3)$ [7] and, for the KCC, $\sigma_z^2 = 0.437/(n-4)$ [8]. For the SCC, the preferred value of $\sigma_z^2$ depends on $\rho_s$, the population value of the Spearman coefficient: for $|\rho_s| < 0.95$, $\sigma_z^2 = (1 + \rho_s^2/2)/(n-3)$ [9]; for $|\rho_s| \geq 0.95$, $\sigma_z^2 = 1.06/(n-3)$ [8]. The value $\sigma_z^2 = 1.06/(n-3)$ is more commonly used for the SCC than the improved value proposed by Bonett and Wright [9] when $|\rho_s| < 0.95$; however, the results in this article are based on the Bonett and Wright values.

A test statistic for testing $H_0: \xi = \xi_0$ based on the Fisher z-transform is given by:

$$z_{\xi_0} = \frac{z(\hat{\xi}) - z(\xi_0)}{\sqrt{c^2/(n-b)}}, \tag{3}$$

where $\hat{\xi}$ denotes the sample estimate of $\xi$, $\xi_0$ denotes the hypothesized value of $\xi$, n denotes the sample size, $z(\cdot)$ denotes the Fisher z-transform, and b and $c^2$ are obtained from Table 1.

An approximate p-value is obtained using the appropriate tail probability for zξ0 given in (3) using the standard normal distribution.
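The test based on (3) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' software: the function name `fisher_z_test`, its `alternative` argument, and the example data are hypothetical, and the constants b and c^2 must be supplied by the user from the values given in this section.

```python
import math
from statistics import NormalDist


def fisher_z_test(xi_hat, xi0, n, b, c2, alternative="two-sided"):
    """Test H0: xi = xi0 using the statistic in Equation (3).

    b and c2 are the constants described in Section 2
    (for example b = 4, c2 = 0.437 for the Kendall coefficient).
    Returns the statistic and an approximate p-value.
    """
    se = math.sqrt(c2 / (n - b))
    stat = (math.atanh(xi_hat) - math.atanh(xi0)) / se
    cdf = NormalDist().cdf(stat)
    if alternative == "two-sided":
        p_value = 2 * min(cdf, 1 - cdf)
    elif alternative == "greater":  # Ha: xi > xi0
        p_value = 1 - cdf
    elif alternative == "less":  # Ha: xi < xi0
        p_value = cdf
    else:
        raise ValueError("alternative must be 'two-sided', 'greater' or 'less'")
    return stat, p_value


# Hypothetical data: sample Spearman coefficient 0.5 from n = 28 pairs,
# H0: rho_s = 0, so b = 3 and c2 = 1 + 0**2 / 2 = 1.
stat, p_value = fisher_z_test(0.5, 0.0, n=28, b=3, c2=1.0)
print(round(stat, 3), round(p_value, 4))
```

The `alternative` argument only selects which tail of the standard normal distribution is used, which is what "the appropriate tail probability" amounts to in practice.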

Suppose we wish to test $H_0: \xi = \xi_0$ using the test statistic in (3). Let $\xi_1$ denote the alternative value of the measure of association that we wish to detect with our planned hypothesis test. The required sample size for detecting the value $\xi_1$ with power $\varphi$ using a two-tailed test of $H_0: \xi = \xi_0$ with significance level $\alpha$ is given by:

$$n = b + c^2\left[\frac{z_{\alpha/2} + z_{1-\varphi}}{z(\xi_1) - z(\xi_0)}\right]^2, \tag{4}$$

Table 1. Constants needed to apply the Fisher z-transform to measures of association.

Measure of association    b    c^2
PCC                       3    1
SCC                       3    $1 + \rho_{s0}^2/2$ (a)
KCC                       4    0.437

(a) $\rho_{s0}$ denotes the null value of the SCC. $c^2 = 1 + \rho_{s0}^2/2$ is used in the sample size calculation if $|\rho_{s0}| < 0.95$; otherwise 1.06 is used.

where $z_\gamma$ denotes the upper $\gamma$-percentage point of the standard normal distribution, $z(\cdot)$ denotes the Fisher z-transform, and b and $c^2$ are obtained from Table 1.

For a one-tailed test of $H_0: \xi = \xi_0$, replace $z_{\alpha/2}$ in (4) by $z_\alpha$. If the formula in (4) does not yield an integer value, round up to the next largest integer.
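As a rough illustration of (4), the following Python sketch computes the single-coefficient sample size. The function name `sample_size_single` and the planning values in the usage lines are hypothetical; b and c^2 are passed in by the caller using the constants given earlier in this section.

```python
import math
from statistics import NormalDist


def sample_size_single(xi0, xi1, b, c2, alpha=0.05, power=0.80, two_tailed=True):
    """Sample size from Equation (4), rounded up to the next integer."""
    tail = alpha / 2 if two_tailed else alpha
    z_alpha = NormalDist().inv_cdf(1 - tail)  # upper-tail point z_{alpha/2} or z_alpha
    z_power = NormalDist().inv_cdf(power)  # z_{1 - power}, the upper (1 - power) point
    effect = math.atanh(xi1) - math.atanh(xi0)
    return math.ceil(b + c2 * ((z_alpha + z_power) / effect) ** 2)


# Hypothetical planning values: detect a coefficient of 0.3 against a null of 0.
n_kendall = sample_size_single(0.0, 0.3, b=4, c2=0.437)
n_spearman = sample_size_single(0.0, 0.3, b=3, c2=1.0)  # c2 = 1 + 0**2 / 2
print(n_kendall, n_spearman)
```

Note that the upper (1 - power) point of the standard normal distribution equals its `power` quantile, which is why `inv_cdf(power)` is used for the second term.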

In our previous article [6], we provided charts based on (4) for finding the sample size needed to achieve 80% power for tests of the SCC and KCC using significance level 0.05.

3. Methods for Sample Size Determination for Two Independent Coefficients

Suppose that we wish to test the equality of two coefficients ξ1 and ξ2, where ξ denotes either the SCC or the KCC, and that ξ1 will be estimated using a sample that is independent of the sample used to estimate ξ2. In the two-sided case, for example, the hypotheses to be tested are given by:

$$H_0: \xi_1 = \xi_2 \quad \text{vs.} \quad H_a: \xi_1 \neq \xi_2. \tag{5}$$

Let ξ0 denote the common value of ξ1 and ξ2 in H0.

We can use the approximate distributional results given in Section 2 for the Fisher z-transform of the estimators of the SCC and KCC to derive a test statistic for testing the hypotheses in (5). Let $z(\hat{\xi}_1)$ and $z(\hat{\xi}_2)$ denote the Fisher z-transformed estimators of $\xi_1$ and $\xi_2$, respectively. Assume that $\xi_1$ is estimated using a sample that is independent of the sample used to estimate $\xi_2$; hence, $\hat{\xi}_1$ and $\hat{\xi}_2$ are independent. Suppose that the same sample size n will be used to estimate $\xi_1$ and $\xi_2$. (The assumption of equal sample sizes will be relaxed in Section 4.5.) Under the null hypothesis H0 in (5), the random variable $z(\hat{\xi}_1) - z(\hat{\xi}_2)$ is asymptotically normally distributed with mean 0 and variance $2c^2/(n-b)$, where the appropriate values of b and $c^2$ are given in Table 1. A test statistic for testing the hypotheses in (5) is given by:

$$z_{\xi_0} = \frac{z(\hat{\xi}_1) - z(\hat{\xi}_2)}{\sqrt{2c^2/(n-b)}}, \tag{6}$$

where $\hat{\xi}_1$ and $\hat{\xi}_2$ denote the sample estimates of $\xi_1$ and $\xi_2$, respectively, n denotes the sample size in each group, $z(\cdot)$ denotes the Fisher z-transform, and b and $c^2$ are given in Table 1.

An approximate p-value is then obtained by calculating the appropriate tail probability for zξ0 in (6) using the standard normal distribution.
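The two-sample statistic in (6) can be computed along the following lines. This is only a sketch: the function name `compare_two_coefficients` and the sample values are hypothetical, and for the SCC the caller has to decide which planning value of the common null coefficient to use when computing c^2.

```python
import math
from statistics import NormalDist


def compare_two_coefficients(xi_hat1, xi_hat2, n, b, c2):
    """Statistic in Equation (6) and its two-sided p-value.

    n is the common per-group sample size; b and c2 are the
    constants described in Section 2.
    """
    se = math.sqrt(2 * c2 / (n - b))
    stat = (math.atanh(xi_hat1) - math.atanh(xi_hat2)) / se
    p_two_sided = 2 * (1 - NormalDist().cdf(abs(stat)))
    return stat, p_two_sided


# Hypothetical data: sample Kendall coefficients 0.6 and 0.4, n = 99 per group.
stat, p_value = compare_two_coefficients(0.6, 0.4, n=99, b=4, c2=0.437)
print(round(stat, 3), round(p_value, 4))
```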

For the SCC, the hypotheses in (5) would be written as

$$H_0: \rho_{s1} = \rho_{s2} \quad \text{vs.} \quad H_a: \rho_{s1} \neq \rho_{s2}. \tag{7}$$

Let $\rho_{s0}$ denote the common value of the two Spearman coefficients under the null hypothesis in (7); namely, $\rho_{s0} = \rho_{s1} = \rho_{s2}$. The value of $c^2$ in the test statistic in (6) when testing two SCCs is therefore $c^2 = 1 + \rho_{s0}^2/2$ if $|\rho_{s0}| < 0.95$; otherwise $c^2 = 1.06$ is used.

Assume that the two independent measures of association ξ1 and ξ2 (either two PCCs, two SCCs, or two KCCs) will be estimated using samples of equal size. Let ξ0 denote the common value of ξ1 and ξ2 in the null hypothesis in (5) and let ξ21 denote the alternative value of ξ2 that one wishes to detect, assuming that ξ1 = ξ0. For a two-tailed test of the null hypothesis in (5), the required sample size for detecting the alternative value ξ21 with power φ using a test based on (6) with significance level α is given by:

$$n = b + \frac{2c^2\left(z_{\alpha/2} + z_{1-\varphi}\right)^2}{\left[z(\xi_0) - z(\xi_{21})\right]^2}, \tag{8}$$

where $z_\gamma$ denotes the upper $\gamma$-percentage point of the standard normal distribution, and $z(\cdot)$ denotes the Fisher z-transform.

The values of b and $c^2$ in (8) are given in Table 1. For a one-tailed test of $H_0: \xi_1 = \xi_2$, replace $z_{\alpha/2}$ in (8) by $z_\alpha$. If the formula in (8) does not yield an integer value, round up to the next largest integer.
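For readers who prefer to compute (8) directly rather than read it from a chart, a minimal Python sketch follows. The names `table1_constants` and `sample_size_two` are hypothetical, and the constants are the ones stated in Section 2 for the SCC and KCC; this is an illustration under those assumptions, not the authors' spreadsheet.

```python
import math
from statistics import NormalDist


def table1_constants(kind, xi0):
    """Return (b, c2) for the SCC or KCC, using the values stated in Section 2."""
    if kind == "kendall":
        return 4, 0.437
    if kind == "spearman":
        # Bonett-Wright value for |rho_s0| < 0.95, otherwise 1.06
        c2 = 1 + xi0 ** 2 / 2 if abs(xi0) < 0.95 else 1.06
        return 3, c2
    raise ValueError("kind must be 'spearman' or 'kendall'")


def sample_size_two(xi0, xi21, kind, alpha=0.05, power=0.80, two_tailed=True):
    """Per-group sample size from Equation (8), rounded up."""
    b, c2 = table1_constants(kind, xi0)
    tail = alpha / 2 if two_tailed else alpha
    z_alpha = NormalDist().inv_cdf(1 - tail)
    z_power = NormalDist().inv_cdf(power)
    effect = math.atanh(xi0) - math.atanh(xi21)
    return math.ceil(b + 2 * c2 * (z_alpha + z_power) ** 2 / effect ** 2)


print(sample_size_two(0.4, 0.2, "spearman"))                    # two-tailed SCC
print(sample_size_two(0.4, 0.2, "spearman", two_tailed=False))  # one-tailed SCC
print(sample_size_two(0.6, 0.4, "kendall"))                     # two-tailed KCC
```

Because the formula squares the difference of the transformed values, the order in which the null and alternative values are passed matters only through c^2 for the SCC, which is computed from the first argument.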

A chart for finding the required per-group sample size that will yield 80% power for comparing two independent Spearman coefficients based on samples of equal size using significance level 0.05 is provided in Figure 1 for a two-tailed test and in Figure 2 for a one-tailed test. The corresponding charts for two independent Kendall coefficients are given in Figure 3 and Figure 4, respectively.

Figure 1. Curves for finding the required sample size that will yield 80% power for a two-tailed test to compare two independent Spearman coefficients using significance level 0.05, assuming equal sample sizes in the two groups. To use the chart, first locate the larger value for the SCC associated with the alternative hypothesis along the horizontal axis. Next, draw a vertical line that intersects with the curve corresponding to the smaller value associated with the alternative hypothesis. Finally, draw a horizontal line from the curve to the vertical axis. The point of intersection is the required sample size in each group.

Figure 2. Curves for finding the required sample size that will yield 80% power for a one-tailed test of two independent Spearman coefficients using significance level 0.05, assuming equal sample sizes in the two groups. To use the chart, first locate the larger value for the SCC associated with the alternative hypothesis along the horizontal axis. Next, draw a vertical line that intersects with the curve corresponding to the smaller value associated with the alternative hypothesis. Finally, draw a horizontal line from the curve to the vertical axis. The point of intersection is the required sample size in each group.

Figure 3. Curves for finding the required sample size that will yield 80% power for a two-tailed test to compare two independent Kendall coefficients using significance level 0.05, assuming equal sample sizes in the two groups. To use the chart, first locate the smaller value for the KCC associated with the alternative hypothesis along the horizontal axis. Next, draw a vertical line that intersects with the curve corresponding to the larger value associated with the alternative hypothesis. Finally, draw a horizontal line from the curve to the vertical axis. The point of intersection is the required sample size in each group.

Figure 4. Curves for finding the required sample size that will yield 80% power for a one-tailed test of two independent Kendall coefficients using significance level 0.05, assuming equal sample sizes in the two groups. To use the chart, first locate the smaller value for the KCC associated with the alternative hypothesis along the horizontal axis. Next, draw a vertical line that intersects with the curve corresponding to the larger value associated with the alternative hypothesis. Finally, draw a horizontal line from the curve to the vertical axis. The point of intersection is the required sample size in each group.

To use the chart in Figure 1, first locate the larger of the two SCCs associated with the alternative hypothesis along the horizontal axis. Next, draw a vertical line that intersects with the curve corresponding to the smaller value of the SCC associated with the alternative hypothesis. Finally, draw a horizontal line from the curve to the vertical axis. The point of intersection is the required sample size in each group.

For example, suppose one wishes to test

$$H_0: \rho_{s1} = \rho_{s2} \quad \text{vs.} \quad H_a: \rho_{s1} \neq \rho_{s2}$$

and that the common null value under H0 is $\rho_{s0} = \rho_{s1} = \rho_{s2} = 0.6$ and the alternative value of the SCC that one wishes to detect is 0.4. In other words, the alternative difference that one wishes to detect is $\rho_{s1} - \rho_{s2} = 0.6 - 0.4 = 0.2$. To use Figure 1 to determine the sample size needed in each group to detect this difference with 80% power using a significance level of 0.05, locate $\max(|\rho_{s1}|, |\rho_{s2}|) = 0.6$ on the horizontal axis and locate the curve corresponding to $\min(|\rho_{s1}|, |\rho_{s2}|) = 0.4$. After drawing the horizontal and vertical lines as described above, we find n = 258 (Figure 5). For $\rho_{s1} = 0.4$ and $\rho_{s2} = 0.2$, locate $\max(|\rho_{s1}|, |\rho_{s2}|) = 0.4$ on the horizontal axis and locate the curve corresponding to $\min(|\rho_{s1}|, |\rho_{s2}|) = 0.2$; the resulting sample size is n = 351. For a one-tailed test, the sample sizes obtained using Figure 2 are 204 and 277, respectively.

Figure 5. Illustration of how to use the sample size curves in Figure 1. For the example described in Section 3, first locate $\max(|\rho_{s1}|, |\rho_{s2}|) = 0.6$ on the horizontal axis and draw a vertical line that intersects with the curve corresponding to $\min(|\rho_{s1}|, |\rho_{s2}|) = 0.4$. Then, draw a horizontal line from the curve to the vertical axis. The point of intersection with the vertical axis is the required sample size in each group. After drawing the horizontal and vertical lines described above, we find n = 258 after using graphical interpolation.

If we use the KCC instead of the SCC as the measure of association, the required sample size for a two-tailed test, assuming $\tau_1 = 0.6$ and $\tau_2 = 0.4$, is n = 99, according to Figure 3. For $\tau_1 = 0.4$ and $\tau_2 = 0.2$, the sample size is n = 145. For a one-tailed test, the required sample sizes obtained from Figure 4 are 79 and 115, respectively. Note that, in the KCC charts, we use the maximum and minimum values associated with the alternative hypothesis differently than in the SCC charts. For the KCC charts, we first locate $\min(|\tau_1|, |\tau_2|)$ (rather than the max) along the horizontal axis, and then draw a vertical line that intersects with the curve corresponding to $\max(|\tau_1|, |\tau_2|)$ (rather than the min).
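As a cross-check on the chart readings, the two-tailed examples with coefficients 0.6 and 0.4 can be worked directly from Equation (8). The arithmetic below assumes the rounded normal quantiles 1.960 (upper 0.025 point) and 0.842 (upper 0.20 point), so the last digit of each intermediate value is approximate.

```latex
% SCC: rho_{s0} = 0.6, alternative 0.4, b = 3, c^2 = 1 + 0.6^2/2 = 1.18
% KCC: tau_0 = 0.6, alternative 0.4, b = 4, c^2 = 0.437
\begin{align*}
z(0.6) &= \tfrac{1}{2}\ln\!\left(\tfrac{1.6}{0.4}\right) \approx 0.6931, \qquad
z(0.4) = \tfrac{1}{2}\ln\!\left(\tfrac{1.4}{0.6}\right) \approx 0.4236, \\
\left[z(0.6) - z(0.4)\right]^2 &\approx 0.2695^2 \approx 0.07263, \qquad
(1.960 + 0.842)^2 \approx 7.849, \\
n_{\mathrm{SCC}} &\approx 3 + \frac{2(1.18)(7.849)}{0.07263} \approx 3 + 255.0 = 258.0, \\
n_{\mathrm{KCC}} &\approx 4 + \frac{2(0.437)(7.849)}{0.07263} \approx 4 + 94.5 = 98.5 .
\end{align*}
```

The SCC value agrees with the chart reading n = 258 to within graphical interpolation; because the unrounded value lies slightly above 258, applying the round-up rule strictly would give 259. The KCC value rounds up to the n = 99 read from Figure 3.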

4. Notes on Using the Charts

4.1. Specification of Planning Values

It may not be possible for the analyst to specify the relevant planning values $\rho_{s0}$, $\rho_{s1}$, and $\rho_{s2}$ for the SCC (or $\tau_0$, $\tau_1$, and $\tau_2$ for the KCC). Since the null hypothesis for the SCC in (7) can also be written as $H_0: \rho_{s1} - \rho_{s2} = 0$, it is not necessary to specify the common null value $\rho_{s0} = \rho_{s1} = \rho_{s2}$ in H0. However, the required sample sizes in the two independent groups depend on the alternative values of $\rho_{s1}$ and $\rho_{s2}$ to be detected, as illustrated in Table 2 for $\rho_{s1} - \rho_{s2} = 0.2$ (assuming that $\rho_{s1} > \rho_{s2} > 0$).

Table 2. Required sample sizes to detect $\rho_{s1} - \rho_{s2} = 0.2$, Power = 80%, Significance Level = 0.05.

As can be seen from Table 2, for a given alternative difference $\rho_{s1} - \rho_{s2}$ to be detected, the required sample size in each group depends very heavily on the values of $\rho_{s1}$ and $\rho_{s2}$. If there is no information available to help specify reasonable planning values, we recommend performing the type of sensitivity analysis illustrated in Table 2 to assist in selecting values of $\rho_{s0}$, $\rho_{s1}$, and $\rho_{s2}$ for the SCC (or $\tau_0$, $\tau_1$, and $\tau_2$ for the KCC).
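A sensitivity analysis of this kind can be scripted. The sketch below applies Equation (8) to pairs of Spearman planning values that differ by 0.2, for a two-tailed test with 80% power and significance level 0.05. Two assumptions are ours: the function name `n_per_group_spearman` is hypothetical, and c^2 is computed from the larger planning value, following the worked example in Section 3. The output is therefore an illustration of the approach and not necessarily identical to the published Table 2.

```python
import math
from statistics import NormalDist


def n_per_group_spearman(rho_large, rho_small, alpha=0.05, power=0.80):
    """Two-tailed per-group sample size from Equation (8) for two SCCs."""
    # Assumption: rho_s0 in c^2 is taken to be the larger planning value.
    c2 = 1 + rho_large ** 2 / 2 if abs(rho_large) < 0.95 else 1.06
    z_sum = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    effect = math.atanh(rho_large) - math.atanh(rho_small)
    return math.ceil(3 + 2 * c2 * z_sum ** 2 / effect ** 2)


rows = []
for tenths in range(3, 10):  # larger value 0.3, 0.4, ..., 0.9
    rho1, rho2 = tenths / 10, (tenths - 2) / 10
    rows.append((rho1, rho2, n_per_group_spearman(rho1, rho2)))

print("rho_s1  rho_s2  n per group")
for rho1, rho2, n in rows:
    print(f"{rho1:6.1f}  {rho2:6.1f}  {n:11d}")
```

The required sample size falls steadily as the pair of planning values moves toward 1, because the Fisher z-transform stretches differences between large coefficients.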

4.2. Minimum Detectable Difference

In addition to finding the sample size needed for planning statistical inference for the SCC and the KCC, the charts can also be used to find the minimum detectable difference for a given sample size. For example, suppose that one wishes to compare two independent Kendall coefficients using a two-tailed test, and that the largest possible sample size available to the investigators is n = 100 in each group. Assuming that the smaller of the two alternative values is τ1 = 0.4, one could determine the minimum detectable difference by first drawing a horizontal line from n = 100 on the vertical axis, and then drawing a vertical line from 0.4 on the horizontal axis. One can then simply read off the alternative values larger than 0.4 from this vertical line until it intersects with the horizontal line drawn from the vertical axis. Using Figure 3, we see that with n = 100 in each group, we could detect any alternative value τ2 greater than or equal to 0.6 with 80% power using a two-tailed significance level of 0.05.
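The same quantity can be obtained by solving Equation (8) for the larger coefficient, which avoids reading values off the chart. The sketch below does this for the Kendall coefficient, where c^2 = 0.437 does not depend on the planning values; the function name `min_detectable_kendall` is hypothetical, and rounding of n is ignored.

```python
import math
from statistics import NormalDist


def min_detectable_kendall(tau_small, n, alpha=0.05, power=0.80, two_tailed=True):
    """Smallest tau_2 > tau_small detectable with n per group, from Equation (8)."""
    b, c2 = 4, 0.437  # Kendall constants stated in Section 2
    tail = alpha / 2 if two_tailed else alpha
    z_sum = NormalDist().inv_cdf(1 - tail) + NormalDist().inv_cdf(power)
    # Required separation on the Fisher z scale for the given n
    shift = z_sum * math.sqrt(2 * c2 / (n - b))
    return math.tanh(math.atanh(tau_small) + shift)


tau_2 = min_detectable_kendall(0.4, n=100)
print(round(tau_2, 3))
```

For the example in this section the result is just under 0.6, consistent with the chart-based reading.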

4.3. Negative Values

The sample size charts in this article can be used only when ξ1 and ξ2 have the same sign. For negative values of ξ1 and ξ2, one simply enters the appropriate chart with |ξ1| and |ξ2|. If ξ1 and ξ2 are of opposite signs, the formula in (8) is still valid; however, the charts provided in this article do not apply.

4.4. Interpolation Errors

As with any graphical method, these charts are subject to error. For example, one must interpolate graphically if the horizontal line drawn from the appropriate curve in the charts intersects the vertical axis at a value between the tick marks. This is more of a problem for larger sample sizes since the distances between the tick marks are much larger. However, the error for the interpolated value of n will be no larger than the difference between the values of n corresponding to the two relevant tick marks, and an adjustment can be made by slightly inflating the value read from the chart. Our experience has been that increasing the n value obtained from the chart by 5 is usually adequate. We have found that a 6-inch ruler marked off in millimeters is particularly useful for carrying out any necessary graphical interpolation in the charts.

4.5. Unequal Sample Sizes

In Section 3, the assumption was made that the sample sizes were equal in the two independent samples used to estimate $\xi_1$ and $\xi_2$. If it is desirable that the sample sizes be different in the two groups, this can be accomplished as follows. Let $n_1$ and $n_2$ denote the sizes of the samples on which the estimates of $\xi_1$ and $\xi_2$ will be based, respectively. Let $a = n_2/n_1$ denote the desired allocation ratio of the sample sizes in the two groups. Let n denote the per-group sample size obtained from the charts in Figures 1-4. Then allocating $n_1 = 2n/(1+a)$ to the sample used to estimate $\xi_1$ and allocating $n_2 = 2n - n_1$ to the sample used to estimate $\xi_2$ will yield the desired sample sizes $n_1$ and $n_2$. A non-integer value $n_1$ obtained from the above formula should be rounded up to the next largest integer. To illustrate, consider the example in Section 3, in which the alternative values to be detected were $\rho_{s1} = 0.6$ and $\rho_{s2} = 0.4$. The per-group sample size obtained from Figure 1 was n = 258. Assume that the desired allocation ratio is a = 1/2. Then, $n_1 = 2n/(1+a) = 2(258)/(1+0.5) = 344$ and $n_2 = 2(258) - 344 = 172$. Equal sample sizes in the two independent groups will maximize power for the test of $H_0: \rho_{s1} = \rho_{s2}$ or $H_0: \tau_1 = \tau_2$, so allocating unequal sample sizes to the two groups will result in a loss of power and should not be done unless absolutely necessary.
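The allocation rule above is simple enough to script. In this sketch the function name `allocate` is hypothetical, and the allocation ratio is handled as an exact fraction so that the round-up step is not affected by floating-point error.

```python
import math
from fractions import Fraction


def allocate(n_per_group, ratio):
    """Split a total of 2n into (n1, n2) with n2/n1 approximately equal to ratio."""
    a = Fraction(ratio)  # accepts Fraction(1, 2), "1/2", or an int
    total = 2 * n_per_group
    n1 = math.ceil(total / (1 + a))  # round n1 up, as described in the text
    n2 = total - n1
    return n1, n2


n1, n2 = allocate(258, Fraction(1, 2))
print(n1, n2)  # 344 172
```

Passing the ratio as a `Fraction` or a string such as `"1/2"` is a design choice of this sketch; a binary float such as 0.3 would be converted to its exact binary value, which can push a borderline result up by one.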

4.6. Choice of Values of ρs1 and ρs2 in the Charts

The values of $\rho_{s1}$ and $\rho_{s2}$ presented in our charts were chosen to make the charts as easy to use as possible and to avoid cluttering the graphs. In particular, for the Spearman charts (Figure 1, Figure 2, Figure 5), we presented results for the following choices of $\max(|\rho_{s1}|, |\rho_{s2}|)$, which was plotted on the horizontal axis: 0.05 (0.05) 0.90. We presented results for sample size curves corresponding to the following choices of $\min(|\rho_{s1}|, |\rho_{s2}|)$: 0.0 (0.1) 0.8. If we had included a curve for $\min(|\rho_{s1}|, |\rho_{s2}|) = 0.9$, for example, this would have consisted of a single point, which we felt would have detracted from the overall visual appeal and interpretability of the charts. In general, for any values of $\rho_{s1}$ and $\rho_{s2}$ that are not available on the sample size curves for either the Spearman or Kendall coefficients, the simple formula for n given in Equation (8) can be used to determine the required sample size.

5. Discussion

In this article, we have presented charts that can be used for sample size determination when planning a study in which hypothesis testing will be used to compare two independent Spearman or Kendall coefficients. In addition to the charts, we have provided simple sample size formulas that can be used for more accurate calculations. We have found Microsoft Excel© to be particularly useful for performing these calculations, and a spreadsheet that accomplishes this is available from the second author.

Despite the widespread use of correlation analysis, it is usually the case that, when planning a study in which correlation will be the primary analysis, little or no attention is given to sample size determination. This general impression was confirmed by our review of studies that used correlation as the primary analysis; none of the 111 studies published in clinical research journals in 2014 provided a power analysis or sample size calculation.

While it is true that two of the key references in this article ([7], [9]) are rather old, they provide valid results that are directly relevant to the present article. As stated in the Introduction, we were unable to locate any modern tools (software, tables, graphs, etc.) that can be used to determine the sample size needed for comparing two independent Spearman or Kendall coefficients. Hence, we made use of the classical results from these two articles to develop our new tools for addressing this problem. The present article represents an application of the results in [7], [8], and [9] to extend the results of our recently published article [6], which considered only a single Spearman or Kendall coefficient.

We hope that, by making available the easy-to-use tools presented in this article, analysts will be encouraged to perform sample size calculations for correlation coefficient inference. Given the adverse consequences that can occur when studies are either under- or over-powered, it is extremely important that such calculations be made prior to beginning a research study. Our future research efforts will focus on extending our results in [6] to sample size estimation for the intra-class correlation.

Conflicts of Interest

The authors declare no conflicts of interest.

References

[1] Stuart, M. (2013) Identification of Novel Molecular Biomarkers for Diagnosis of Salivary Dysfunction. Master’s Thesis, Georgia Regents University, Augusta.
[2] Cohen, J. (1988) Statistical Power Analysis for the Behavioral Sciences. 2nd Edition, Lawrence Erlbaum Associates, Hillsdale, New Jersey.
[3] Brough, H.A., Makinson, K., Penagos, M., Maleki, S.J., Cheng, H., Douiri, A., Stephens, A.C., Turcanu, V. and Lack, G. (2013) Distribution of Peanut Protein in the Home Environment. Journal of Allergy and Clinical Immunology, 132, 623-629.
https://doi.org/10.1016/j.jaci.2013.02.035
[4] Heist, R.S., Duda, G.D., Sahani, D., Pennell, N.A., Neal, J.W., Ancukiewicz, M., Engelman, J.A., Lynch, T.J. and Jain, R.K. (2010) In Vivo Assessment of the Effects of Bevacizumab in Advanced Non-Small Cell Lung Cancer (NSCLC). Journal of Clinical Oncology, 28, 7612.
https://doi.org/10.1200/jco.2010.28.15_suppl.7612
[5] Helsel, D.R. (2012) Statistics for Censored Environmental Data Using Minitab and R. 2nd Edition, John Wiley & Sons, Hoboken, New Jersey.
https://doi.org/10.1002/9781118162729
[6] May, J.O. and Looney, S.W. (2020) Sample Size Charts for Spearman and Kendall Coefficients. Journal of Biometrics & Biostatistics, 11, 7 p.
[7] Fisher, R.A. (1925) Statistical Methods for Research Workers. Hafner Press, London.
[8] Fieller, E.C., Hartley, H.O. and Pearson, E.S. (1957) Tests for Rank Correlation Coefficients. Biometrika, 44, 470-481.
https://doi.org/10.1093/biomet/44.3-4.470
[9] Bonett, D.G. and Wright, T.A. (2000) Sample Size Requirements for Estimating Pearson, Kendall, and Spearman Correlations. Psychometrika, 65, 23-28.
https://doi.org/10.1007/BF02294183

Copyright © 2024 by authors and Scientific Research Publishing Inc.

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.