**Background: Daily paediatric asthma readmissions within 28 days are a good example of a low count time series and are not easily amenable to the common time series methods used in studies of asthma seasonality and time trends. We sought to model and predict daily trends of childhood asthma readmissions over time in Victoria, Australia. Methods: We used a database of 75,000 childhood asthma admissions from the Department of Health, Victoria, Australia in 1997-2009. Daily admissions over time were modeled using a semi-parametric Generalized Additive Model (GAM), overall and by sex and age group. Predictions were also estimated using these models. Results: N = 2401 asthma readmissions within 28 days occurred during the study period. Of these, n = 1358 (57%) were boys. Overall, seasonal peaks occurred in winter (30.5%) followed by autumn (28.6%) and then spring (24.6%) (p < 0.0005). Day of the week and month were significantly associated with trends in readmission. The smooth function of time was significant (p < 0.0005) and indicated a declining trend in readmissions until 2001-2002, followed by an increase back to roughly initial levels. Predictions suggested readmissions would continue to increase by 5% per year, with boys in the 2 to 5 years age group experiencing the largest increase. Conclusions: GAMs are reliable methods for low count time series such as repeat admissions. Our model implies that health services may need to be revised to accommodate seasonal peaks in readmission, especially for younger age groups.**

Asthma is the most common long-term medical condition in children [

The utility of predictive models for management of paediatric hospital readmissions in general and their benefit for health services research have been demonstrated [

The analysis of seasonality and time trends arises naturally within a time series framework. Many studies of childhood asthma hospital admissions have developed time series models for examining the associations between aeroallergens, pollution, weather and viral infections [17-22]. However, none of these models has been used to make predictions. Furthermore, only one study of childhood asthma readmissions has used a time series model, and that model’s capability to predict future readmission counts was not tested [

We used daily childhood asthma hospital admissions in Victoria between 1997 and 2009 from the Department of Human Services (DHS). The data collection methodology has been described elsewhere [

We assumed that the mean varied smoothly over time, and that a Poisson process underlay the association between the mean and observed counts. We modeled the mean using a GAM of the form mean = f_{1} (time) + f_{2} (month) + f_{3} (day of week) + error term, with the mean and covariates on a log scale. f_{1} was a smooth non-parametric function, obtained as a combination of smoothing spline basis functions. f_{2} and f_{3} were linear functions, built using model coefficient matrices for the parametric components of the model. In other words, we employed a semi-parametric model.
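The additive structure on the log scale can be illustrated with a minimal sketch. All coefficient values and the shape of `smooth_time` below are hypothetical placeholders chosen for illustration, not estimates from the study; in the fitted model, f_{1} would be a penalized spline rather than a fixed quadratic.

```python
import math

# Hypothetical log-scale effects (illustrative values only, not study estimates).
INTERCEPT = math.log(0.5)
MONTH_EFFECT = {1: -0.40, 6: 0.25}            # months not listed contribute 0.0
WEEKDAY_EFFECT = {"Sat": -0.15, "Mon": 0.10}  # weekdays not listed contribute 0.0


def smooth_time(day_index: int, n_days: int = 4748) -> float:
    """Stand-in for f_1(time): a gentle U-shaped curve on the log scale."""
    x = day_index / n_days  # rescale time to [0, 1]
    return 0.8 * (x - 0.4) ** 2 - 0.1


def expected_count(day_index: int, month: int, weekday: str) -> float:
    """Mean daily count: exponential of the additive predictor (log link)."""
    eta = (
        INTERCEPT
        + smooth_time(day_index)
        + MONTH_EFFECT.get(month, 0.0)
        + WEEKDAY_EFFECT.get(weekday, 0.0)
    )
    return math.exp(eta)
```

Because the terms add on the log scale, they multiply on the count scale: in this sketch a Monday in June has exp(0.25 + 0.10 + 0.40 + 0.15) times the expected count of a Saturday in January at the same point in time.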

We assumed a conditional Poisson distribution and, although the results are not shown here, χ^{2} tests gave no evidence to reject this assumption for male or female readmissions separately or, therefore, for all readmissions. We also tried fitting models assuming an underlying negative binomial distribution for the outcome, but these fitted the data poorly. To cope with any over-dispersion in the data, we used a self-adjusting over-dispersion parameter. The GAM method we used guarded against over-fitting by incorporating a penalized regression spline approach, in which the penalty increases with the curve’s “wiggliness”, as measured by its second derivative. The choice of a low spline basis dimension also guards against over-fitting by limiting model flexibility and capping the effective degrees of freedom [
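The penalty on wiggliness can be written down under the common formulation for penalized regression splines. The symbols \ell (the Poisson log-likelihood), \beta (the model coefficients) and \lambda (the smoothing parameter) are notation introduced here for illustration and do not appear in the original text.

```latex
\hat{\beta} \;=\; \arg\max_{\beta}
\left\{ \ell(\beta) \;-\; \frac{\lambda}{2} \int \left[ f_{1}''(t) \right]^{2} \, dt \right\},
\qquad \lambda \ge 0
```

In this form, a larger \lambda penalizes curvature more heavily and pulls f_{1} toward a straight line, while \lambda = 0 leaves the spline unpenalized. The basis dimension sets an upper limit on how flexible f_{1} can be, whatever value of \lambda is selected.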

We validated our model predictions by training the model on the first 11 years of data and then using the model to predict the final two years. We used χ^{2} goodness-of-fit tests to compare these predictions to the actual data. Using the full 13 years of data, we forecasted readmission counts one year beyond the end of the study. We also included 95% error limits around the predictions. We constructed our graphs using either the Lattice package [
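A minimal sketch of the hold-out split and of 95% limits around a predicted mean follows. The text does not state how the error limits were computed; the sketch assumes the usual approach of forming limits on the log (link) scale from a standard error and then back-transforming. The cutoff date of 1 July 2007 is likewise an assumption, taken as the start of the final two fiscal years.

```python
import math
from datetime import date


def split_by_date(records, cutoff):
    """Split (day, count) records into training (< cutoff) and hold-out (>= cutoff)."""
    train = [r for r in records if r[0] < cutoff]
    holdout = [r for r in records if r[0] >= cutoff]
    return train, holdout


def prediction_limits(log_mean, se_log_mean, z=1.96):
    """Return (lower, point, upper) for the mean count on the response scale.

    Limits are built on the log scale and exponentiated, so they are
    asymmetric around the point prediction and can never be negative.
    """
    lower = math.exp(log_mean - z * se_log_mean)
    point = math.exp(log_mean)
    upper = math.exp(log_mean + z * se_log_mean)
    return lower, point, upper


# Assumed cutoff: the first 11 fiscal years end on 30 June 2007.
CUTOFF = date(2007, 7, 1)
```

Building the limits on the log scale is a design choice suited to low counts, since symmetric limits around a mean near 0.5 could otherwise extend below zero.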

There were 2401 readmissions covering fiscal years 1997-2009, a total of 4748 days, resulting in a daily mean of 0.5057. Boys accounted for 1358 (57%) and girls for 1043 (43%) readmissions. Overall, readmissions peaked in winter (30.5%), followed by autumn (28.6%) and spring (24.6%), with a trough in summer (16.2%). In contrast, a χ^{2} test showed that all other asthma admissions had a different overall seasonal distribution, peaking in autumn (29.4%), followed by winter (25.6%) and spring (23.8%), with a shallower trough in summer (20.9%) (p < 0.0005).
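The headline figures above can be checked with a few lines of arithmetic. The sketch assumes that fiscal year 1997 begins on 1 July 1996 and fiscal year 2009 ends on 30 June 2009, which is consistent with the stated total of 4748 days.

```python
from datetime import date

# Assumption: fiscal years 1997-2009 span 1 July 1996 to 30 June 2009 inclusive.
start = date(1996, 7, 1)
end_exclusive = date(2009, 7, 1)
n_days = (end_exclusive - start).days

readmissions = 2401
boys, girls = 1358, 1043

daily_mean = readmissions / n_days      # mean readmissions per day
boys_share = boys / readmissions        # proportion of readmissions in boys
girls_share = girls / readmissions      # proportion of readmissions in girls
```

With these inputs the day count, the daily mean rounded to four decimal places, and the rounded percentages for boys and girls all agree with the values reported in the text.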

Daily readmission counts are presented by fiscal year (1^{st} July to 30^{th} June). These are typical of low count time series, that is, infrequently occurring and low magnitude counts. The distribution of daily readmission counts had a very narrow range, between 0 and 5, with the majority of counts being 0, including the median. The 75^{th} percentile was 1 and even the 99^{th} was only 3. For the boys’ and girls’ distributions separately, the respective 75^{th} percentiles were even lower, both being equal to 0. Even on a per-month aggregate basis, the counts were as low as 2. Runs of consecutive zero readmission days were as long as 54 days for boys and 48 days for girls, with respective quartiles (25^{th} percentile, median, 75^{th} percentile) of (1, 3, 5) and (2, 4, 7). Runs of consecutive zero readmission days for total readmissions had quartiles (1, 2, 3) with a maximum length of 27 days. On a monthly basis, there were as many as 12 distinct blocks (interspersed by at least 1 readmission) of these consecutive zero runs for total, male and female readmissions. The quartiles for the number of blocks of consecutive zero runs per month were (4, 7, 10) for total and boys’ readmissions and (4, 7, 9) for girls’ readmissions.
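The zero-run summaries described above can be computed with a simple run-length pass over the daily series. The sketch below uses a short made-up series; the function name and the example data are illustrative, and the quartile method used in the study is not stated, so the default of `statistics.quantiles` is an assumption.

```python
from statistics import quantiles


def zero_run_lengths(counts):
    """Lengths of maximal runs of consecutive zero-count days, in order."""
    runs, current = [], 0
    for c in counts:
        if c == 0:
            current += 1          # extend the current run of zeros
        elif current:
            runs.append(current)  # a non-zero day closes the run
            current = 0
    if current:
        runs.append(current)      # series ended while still inside a run
    return runs


# Made-up daily counts, for illustration only.
daily = [0, 0, 1, 0, 0, 0, 2, 0, 1, 0, 0, 0, 0, 0, 1]
runs = zero_run_lengths(daily)
longest = max(runs)
quartiles = quantiles(runs, n=4)  # 25th percentile, median, 75th percentile
n_blocks = len(runs)              # number of distinct zero-run blocks
```

Applied to one month of data at a time, `n_blocks` corresponds to the per-month count of distinct blocks of consecutive zero days.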

In Figures 2(a) to (c), we display readmissions, stratified by age and gender, as yearly counts, rates per 100,000 population and readmission rates, respectively. A close correspondence between counts and population rates can be seen for all age groups. However, similarity between population and readmission rates is present only for 2 - 5 year olds, while the older age groups show a stark contrast. Although not shown here, regression analyses indicated a linear correspondence between counts and population rates but not between readmission rates and population rates.
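The two kinds of rate differ only in their denominators, which a short sketch makes explicit. The definitions below are assumptions, since the text does not spell them out: the population rate is taken as readmissions per 100,000 residents, and the readmission rate as readmissions divided by asthma admissions.

```python
def rate_per_100k(count, population):
    """Readmissions per 100,000 population (assumed definition)."""
    return 100_000 * count / population


def readmission_rate(readmissions, admissions):
    """Readmissions as a proportion of asthma admissions (assumed definition)."""
    return readmissions / admissions


# Rough overall figure using the totals quoted in the abstract
# (2401 readmissions out of about 75,000 admissions).
overall = readmission_rate(2401, 75_000)
```

Under these definitions, counts and population rates move together whenever the population denominator is fairly stable, whereas the readmission rate also responds to changes in the number of admissions. That is one possible reason the two rates could diverge in some age groups.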

We fitted our model to total readmissions and to male and female readmissions separately. The model showed that both month and day of the week had significant associations with mean daily readmission counts. Higher counts were expected in March, May, June and November, and the lowest counts in January. This pattern held for both males and females, with males experiencing higher peaks. The day of the week had more of an impact for girls than for boys. Overall, more readmissions were expected earlier in the week, from Sunday onwards, and the fewest on Saturday.

The fitted model produced a dispersion scaling parameter which was very close to unity, further justifying the assumption of a Poisson process and explaining why the negative binomial was such a bad fit. The smooth function of time was also significant (p < 0.0005), and indicated that after an initial trend of declining readmission counts until about 2001-2002, they subsequently began to increase returning to roughly initial levels (
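One common way to estimate a dispersion scaling parameter is the Pearson chi-square statistic divided by the residual degrees of freedom. The sketch below assumes that estimator. The text does not say which estimator the fitting software used, and in a GAM the residual degrees of freedom would be based on effective rather than integer parameter counts.

```python
def pearson_dispersion(observed, fitted, n_params):
    """Pearson chi-square over residual degrees of freedom.

    For a Poisson model the variance equals the mean, so each squared
    residual is scaled by the fitted mean. Values near 1 are consistent
    with the Poisson assumption; values well above 1 suggest over-dispersion.
    """
    if len(observed) != len(fitted):
        raise ValueError("observed and fitted must have the same length")
    resid_df = len(observed) - n_params
    if resid_df <= 0:
        raise ValueError("need more observations than parameters")
    pearson = sum((y - mu) ** 2 / mu for y, mu in zip(observed, fitted))
    return pearson / resid_df


# Made-up example: eight days of counts against a constant fitted mean of 0.5.
phi = pearson_dispersion([0, 1, 0, 2, 0, 1, 0, 0], [0.5] * 8, n_params=1)
```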

Exploration with autocorrelation plots and Durbin-Watson tests indicated a weak but statistically significant autocorrelation at lag 1 of 0.1, 0.08 and 0.06 for total, male and female readmissions respectively, which violated the assumption of independence of the daily readmission counts. To adjust for this non-independence, lag one values were included as part of the GAM. They made no substantial difference to the results and were ultimately not included, given the predictive role of our model.
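The lag 1 autocorrelation and the lagged covariate mentioned above can be sketched as follows. This uses the standard sample autocorrelation formula; the function names are illustrative, and the exact way the lag term entered the GAM is not described in the text.

```python
def lag1_autocorrelation(series):
    """Sample autocorrelation at lag 1 (standard estimator)."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    if denom == 0:
        raise ValueError("series has zero variance")
    num = sum((series[t] - mean) * (series[t - 1] - mean) for t in range(1, n))
    return num / denom


def with_lag1(counts):
    """Pair each day's count with the previous day's count as a covariate.

    The first day has no predecessor and is dropped.
    """
    return [(counts[t], counts[t - 1]) for t in range(1, len(counts))]


# Made-up alternating series, which is strongly negatively autocorrelated.
r = lag1_autocorrelation([0, 1, 0, 1, 0, 1])
```

A lagged covariate needs the previous day's observed count, which is not available when forecasting more than one day ahead. This may be part of why such terms sit awkwardly with a model intended for prediction.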

The non-parametric model also showed time, day of the week and day of year to be significantly associated with readmission counts. The day of year variable was in complete agreement with the parametric monthly variable of our final model for both the timing and the amplitude of peaks and troughs throughout the year.

We validated the model predictions by training the model on the first 11 years of data and then using the model to predict the final two years of data. The predictions were subsequently aggregated by month and compared to the actual data. χ^{2} goodness-of-fit tests indicated that there was no statistical difference between the predictions and the data (p = 0.65). In comparison, when the mean of the training data was used as a predictor of the final 2 years of data, χ^{2} tests indicated that the monthly aggregated mean predictions were not a good fit to the data (p = 0.03). We also used this method to test the parametric model’s fit to the data and found results very similar to the semi-parametric model’s but with slightly higher χ^{2} values, so the parametric model did not fit the data as well as the semi-parametric model. As the semi-parametric model outperformed both the mean and the non-parametric model in predicting the final 2 years of readmission data, this gave us some confidence in using it to predict future readmissions.
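The monthly aggregation and the χ^{2} comparison can be sketched as below. The record layout `(year, month, value)` and the function names are assumptions made for illustration; only the test statistic is computed, and converting it to a p-value would additionally require the χ^{2} distribution with the appropriate degrees of freedom.

```python
from collections import defaultdict


def monthly_totals(daily):
    """Sum (year, month, value) records into per-month totals."""
    totals = defaultdict(float)
    for year, month, value in daily:
        totals[(year, month)] += value
    return dict(totals)


def chi_square_statistic(observed, expected):
    """Pearson goodness-of-fit statistic over months present in `observed`."""
    return sum(
        (observed[k] - expected[k]) ** 2 / expected[k] for k in sorted(observed)
    )


# Made-up monthly totals for two hold-out months.
observed = {(2008, 1): 10, (2008, 2): 20}
model_pred = {(2008, 1): 12.5, (2008, 2): 16.0}
stat = chi_square_statistic(observed, model_pred)
```

The same function can score a mean-based baseline by setting each month's expected value to the training-period daily mean multiplied by the number of days in that month. A smaller statistic indicates predictions closer to the observed monthly totals.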

Semi-parametric models were also fitted to data subsets consisting of the 6 combinations of age group and sex displayed in Figure 2. χ^{2} goodness-of-fit tests validated the fit and prediction, but the models were only marginally better than the mean. Due to the extremely small numbers and the marginal improvement on the mean, we could not consider the models reliable for these 6 subsets.