Guastello’s polynomial regression method for fitting the cusp catastrophe model has been widely applied to analyze nonlinear behavioral outcomes. However, no statistical power analysis for this modeling approach has been reported, probably because of the complex nature of the cusp catastrophe model. Since statistical power analysis is essential for research design, we propose a novel method in this paper to fill this gap. The method is simulation-based and can be used to calculate statistical power and sample size when Guastello’s polynomial regression method is used for cusp catastrophe modeling. With this approach, a power curve is first produced to depict the relationship between statistical power and sample size under different model specifications. This power curve is then used to determine the sample size required for a specified statistical power. We verify the method first with four scenarios generated through Monte Carlo simulations, and then apply it to real published data on early sexual initiation among young adolescents. The findings suggest that this simulation-based power analysis method can be used to estimate sample size and statistical power for Guastello’s polynomial regression method in cusp catastrophe modeling.

Popularized in the 1970s by Thom [

Even though this polynomial regression method has been widely applied in behavioral studies to investigate the existence of a cusp catastrophe, to the best of our knowledge, no reported research has addressed the determination of sample size and statistical power for this analytical approach. Statistical power analysis is essential for researchers to plan and design a research project efficiently, as pointed out in [

The structure of the paper is as follows. We start with a brief review of the cusp catastrophe model (Section 2), followed by the development of the novel simulation-based approach to calculating statistical power (Section 3). This approach is then verified through Monte Carlo simulations and further illustrated with data derived from a published study (Section 4). Conclusions and discussion are given at the end of the paper (Section 5).

The cusp catastrophe model was proposed to describe system outcomes. It incorporates the linear model and extends it to nonlinear behavior, including discontinuous transitions between equilibrium states as the control variables vary. According to catastrophe systems theory [

where

Furthermore, the paths A, B, and C in

outcome measure

From the above description, it is clear that a cusp model differs from a linear model in that: 1) A cusp model allows the forward and backward progressions to follow different paths in the outcome measure, and both processes can be modeled simultaneously (see Paths B and C in

To operationalize the cusp catastrophe model for behavior research, Guastello [

where

To demonstrate the efficiency of the polynomial regression approach in describing behavioral changes that follow a cusp, Guastello [

1) Change scores linear models

2) Pre-and post-linear models

These alternative linear models add another analytical strategy to strengthen the polynomial regression method. A better data-model fit (or a larger

In statistics, power is defined as the probability of correctly rejecting the null hypothesis when it is false. Stated in common language, power is the fraction of times that a statistical test rejects the specified null-hypothesis value when the alternative is true. Operationally, based on this definition, if we specify an alternative hypothesis

As detailed in Chapter 7 in [

Extending the same concept described above for Guastello’s polynomial cusp regression, we would need to specify the corresponding parameter effect size for all

Power analysis and sample size determination can be developed for a specific purpose. Typically, they are developed to detect a treatment effect, as in clinical trials, or to detect the effect of a specific risk factor, as in regression. A similar development can be made for Guastello’s cusp regression model for a specific regressor in the asymmetry variable

1) Simulate data with sample size

2) Specify model parameter effect size

3) Calculate

4) Fit Guastello’s cusp regression model (Equation (2)) by least squares using the data generated for

5) Repeat Steps 1 to 4 a large number of times (typically 1000) and calculate the proportion of simulations that satisfy Guastello’s decision rules. This proportion provides an estimate of the statistical power for the pre-specified sample size and the study specifications given in Steps 1 and 2;

6) With the five steps established above for power assessment, the sample size is then determined to reach a pre-specified level of statistical power. This is carried out by running Steps 1 to 5 with a range of sample sizes
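The six steps above can be sketched in a few lines of code. The sketch below is only an illustration under several assumptions that are not taken from this paper: it is written in Python rather than the software used by the authors; it assumes the regression has the form Δz = b0 + b1·z1³ + b2·z1² + b3·y·z1 + b4·x + b5·y + e; it draws x, y, and z1 from standard normal distributions; it uses the normal critical value 1.96 in place of the exact t quantile; and it reduces the decision rules to two coefficient tests (a significant cubic term plus a significant bifurcation or asymmetry term), leaving out the comparison with the alternative linear models. The function names `solve`, `ols`, `simulate_once`, and `estimate_power` are hypothetical.

```python
import math
import random


def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [u - f * v for u, v in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]


def ols(X, y):
    """Least-squares fit; returns coefficients and estimate/std-error ratios."""
    n, k = len(X), len(X[0])
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    Xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    beta = solve(XtX, Xty)
    rss = sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2
              for r, yi in zip(X, y))
    s2 = rss / (n - k)  # residual variance
    ratios = []
    for j in range(k):
        unit = [1.0 if i == j else 0.0 for i in range(k)]
        var_j = s2 * solve(XtX, unit)[j]  # j-th diagonal of s2 * inv(X'X)
        ratios.append(beta[j] / math.sqrt(var_j))
    return beta, ratios


def simulate_once(n, beta, sigma, rng):
    """Steps 1-3: draw predictors and compute the change score (assumed form)."""
    X, dz = [], []
    for _ in range(n):
        x, y, z1 = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
        row = [1.0, z1 ** 3, z1 ** 2, y * z1, x, y]
        X.append(row)
        dz.append(sum(b * v for b, v in zip(beta, row)) + rng.gauss(0, sigma))
    return X, dz


def estimate_power(n, beta, sigma, n_sim=1000, seed=1, crit=1.96):
    """Steps 4-5: fraction of simulated data sets passing the simplified rule."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sim):
        X, dz = simulate_once(n, beta, sigma, rng)
        _, t = ols(X, dz)
        # Simplified rule (assumption): cubic term significant, and the
        # bifurcation term (y*z1) or the asymmetry term (x) significant.
        if abs(t[1]) > crit and (abs(t[3]) > crit or abs(t[4]) > crit):
            hits += 1
    return hits / n_sim
```

Calling `estimate_power` over a grid of sample sizes, for example `[(n, estimate_power(n, beta, sigma)) for n in range(20, 201, 20)]`, would give the points of a power curve as in Step 6. A faithful implementation would replace the simplified rule with the full set of decision rules and use the t distribution for the tests.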

The simulation-based approach described above is implemented in free

To verify the novel approach proposed in Section 3, we simulated four scenarios with

Data are generated with the asymmetry variable

With the generated

Four data sets for the four scenarios (e.g.,

Results of the other three scenarios in

the existence of a cusp. With regard to Scenario 3 where

To demonstrate the proposed novel simulation method, we estimate sample sizes needed for each of the four scenarios to achieve 85% statistical power employing this method and the estimated parameter
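Reading a required sample size off a power curve can be sketched as follows. This is a hypothetical helper, not the authors' code: it assumes the curve is available as a list of (sample size, estimated power) pairs sorted by sample size, takes the first point at which the curve reaches the target, and interpolates linearly between that point and the one before it. The numbers in the usage note are made up for illustration and are not results from this study.

```python
import math


def required_sample_size(curve, target=0.85):
    """Smallest sample size at which the power curve reaches `target`.

    `curve` is a list of (n, power) pairs sorted by n. The crossing point is
    found by linear interpolation between the two neighbouring grid points and
    rounded up. Returns None if the target is never reached on the grid.
    """
    prev = None
    for n, p in curve:
        if p >= target:
            if prev is None:
                return n  # the first grid point already reaches the target
            n0, p0 = prev
            return math.ceil(n0 + (target - p0) * (n - n0) / (p - p0))
        prev = (n, p)
    return None
```

For a made-up curve `[(20, 0.30), (40, 0.60), (60, 0.90)]` and a target of 0.85, the helper interpolates between the last two points and returns 57. Because simulated power estimates are noisy, a finer grid or more replications near the crossing point would make the estimate more stable.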

If the novel simulation-based approach is valid, the sample size estimated for each of the four scenarios described in the previous section should give approximately an 85% chance of detecting the underlying cusp. Therefore, we took a reverse approach and computed statistical power using the calculated sample size as input for each of the four scenarios. Results in

| | 0.487^{***} | 0.473. | 0.459 | 0.446 |
| | 0.540^{***} | 0.581^{***} | 0.621^{***} | 0.661^{***} |
| | 0.456^{***} | 0.411^{*} | 0.367 | 0.323 |
| | 0.360^{**} | 0.221 | 0.081 | −0.058 |
| | 0.563^{***} | 0.626^{**} | 0.689^{*} | 0.753 |
| | 0.468^{***} | 0.435. | 0.403 | 0.371 |
| | 0.763 | 0.454 | 0.278 | 0.1856 |
| Estimated | 1.053 | 2.107 | 3.160 | 4.214 |
| F-Statistic with df = (5, 94) | 60.71^{***} | 15.61^{***} | 7.227^{***} | 4.286^{*} |

Significance codes: ^{***} p-value < 0.00001, ^{**} p-value < 0.001, ^{*} p-value < 0.01, “.” p-value < 0.05.

To demonstrate this result, we use a Monte Carlo procedure and randomly sample 36 observations from the simulated data

The best approach to demonstrate the validity of the simulation approach would be to test it with observed data. To use our approach, we need two sets of data from any reported study: parameter estimates as effect size

Briefly, in Chen’s study, participants were 469 virgins in the control group of a randomized controlled trial assessing the effect of an HIV behavioral prevention intervention program [

To verify the simulation-based method, the parameter effect size estimates were obtained from the paper with

In cases where an analytical solution for power analysis and sample size determination is difficult to obtain, simulation represents an ideal alternative, as recommended in [

With this approach, researchers can compute statistical power and estimate sample size if they plan to conduct cusp modeling analysis using Guastello’s polynomial regression method. A detailed introduction to the method can be found in [

To simplify the presentation, we confined this novel simulation approach to the situation of one regressor for each control variable in the cusp model. The approach can be easily adapted and extended to multiple regressors for each of the asymmetric

A growing body of data suggests the utility of the cusp modeling approach in characterizing a number of human behaviors, particularly health risk behaviors, such as tobacco smoking, alcohol consumption, hardcore drug use, dating violence, and unprotected sex [

In conducting this study, we also noted that previous studies published in the literature do not report adequate information for power analysis. We highly recommend that journal editors ask authors to report all parameter estimates, including

There are a number of strengths with the method we present in this study. The principle and the computing process are not difficult to follow; the data used for the computing can be obtained; the computing software is written with

This research was supported in part by two NIH grants, one from the National Institute on Drug Abuse (NIDA, R01 DA022730, PI: Chen X) and another from the Eunice Kennedy Shriver National Institute of Child Health and Human Development (NICHD, R01HD075635, PIs: Chen X and Chen D).