Growth Curves for Diameter and Height Using Mixed Models : An Application in Eucalyptus Seedling

The growth of a tree or a forest settlement is of great value to a forest enterprise, because many decisions are directly dependent of this information, for instance, determining the optimal cutting age. This study aims to apply a new class of models to fit growth curves for diameter and height of Eucalyptus grandis X Eucalyptus urophylla seedling data. Data were collected from a trial conducted in a green house at the Natural Resources Department at School of Agriculture, Botucatu, São Paulo, Brazil. The experiment’s design was completely randomized with eight treatments and four replications. In this trial, the growth variables referring to the height and the diameter were evaluated, being measured five and four times, respectively. The methodology was carried in a mixed longitudinal model using a new approach based on Box-Cox Normal (BCN) distribution, and comparisons with this model were made assuming normality of the data. The results revealed that the BCN mixed model provided similar results to the standard model in order to estimate growth curves; however, the BCN model was the best result according to Akaike criterion, considering the slight asymmetry in the data set. This approach is of great interest in case of outliers and robust procedures for parameter estimation.


Introduction
The increase in wood consumption and its derivatives highlight the need to gen- erate new seedlings production technologies with a standard of appropriate quality to establish more productive forests.
In the seedlings selection for planting, the criteria are based on characteristics which generally do not determine the real quality, because the seeds vary according to the species, ecological sites, cultivation, transportation, distribution and planting.So, there are several reasons for the use of tests in the standard definition of seedlings quality, being able to add some values that are often required by the market (Gomes et al., 2002).
The survival, the establishment, the cultivation frequency and the early forest growth are necessary ratings for the forest enterprise success, which is directly related to the seedling quality at planting (Gomes, 2001).
In this context, the height/diameter relation is one of the features used to evaluate the forest seedling quality, because it reflects the reserve accumulation, a greater resistance and better fixation in the soil (Carneiro, 1995).Furthermore, the average height and the base diameter are two parameters, which show the high plant quality considered relevant to forestry industry companies (Gomes et al., 1996).
A set of mathematical models for growth in height and in diameter was proposed using linear and nonlinear models by many authors, including Schumacher (1939), Bertalanffy (1951), Richards (1959), Prodan (1968), Machado (1979), Silva (1986), which are known as Brody, von Bertalanffy, logistic and Gompertz models.These models have been widely and successfully used in many works, such as in Machado et al. (1997), Dias et al. (2005), Oliveira et al. (2008), Teo et al. (2011), Carvalho et al. (2014).Such models assume normal distribution for the dataset but, generally this assumption is not verified.Ferrari & Fumes (2017) proposed a new class of distributions called Box-Cox symmetric class, containing distributions that allow it to model asymmetry and that can also include outliers.Special cases include log-symmetric distribution (Vanegas & Paula, 2015), (for instance, log-normal distribution) and the truncated symmetric distributions with support on the positive real line, such as the Box-Cox t (Rigby & Stasinopoulos, 2006), the Box-Cox Normal (Cole & Green, 1992;Stasinopoulos et al., 2008) and the Box-Cox power exponential distributions (Voudouris et al., 2012), among others.
Therefore, the aim of this work is to apply models taken into account this new distribution class to fit growth curves for height and diameter data from a forestry trial of the Eucalyptus grandis X Eucalyptus urophylla hybrid seedlings and compare them with models that assume normality of the data.

The Data Set
This work arised from the data collected from a trial carried out in a green house of the Department of Natural Resources -Forestry Sciences, School of Agriculture, UNESP, Botucatu, São Paulo, Brazil (Bazzo, 2009).The experiment eva-luated the effects of different substrates on the Eucalyptus grandis X Eucalyptus urophylla hybrid seedling development.Different proportions of sewage sludge and carbonized rice husk, which composed the substrates, were compared to a standard substrate used in the green house.For the substrate composition, the sewage used sludge by carbonized rice husk proportions were: 100:0, 80:20,  Figure 2 shows the mean profile analysis.We observed that the eighth

(Non) Linear Mixed Model Approach
In this section, it was described the (non) linear mixed model approach for height and diameter modelling that was used to compare it with a new approach based in the Box-Cox symmetric class (Ferrari & Fumes, 2017).
Having problems involving growth data, the structure of linear mixed model is ordinarily used to model longitudinal data (Melesse & Zewotir, 2017;Soares et al., 2017;Pinheiro & Bates, 2000).In this case, the height and the diameter data were assumed to be normally distributed and the independence assumption was violated in longitudinal data, hence random effects were required.
So, for the height data, which presented a linear trend in descriptive analysis (Section 2.1), the model was defined by: where ijk Y was the height related to the kth time, the jth replication and the ith treatment, 1 i β was the mean of the ith treatment, 2 i β was the slope parameter of the ith treatment, k t was the time (continuous), For the diameter data, which presented a nonlinear trend in the descriptive analysis (Section 2.1), the Gompertz mixed model was proposed with the structure:

(
) ( ) where ijk Y was the diameter related to the kth time, the jth replication and the ith treatment, 1 β was the parameter representing the asymptote, ( ) β was the parameter related to the value of the function at β was the parameter related to the scale of the t The components of variance parameters of the linear mixed model were estimated the restricted maximum likelihood (REML).This methodology is a particular approach form of maximum likelihood estimation which does not calculate estimates on a maximum likelihood fit of all the information, but instead, uses a likelihood function calculated from a transformed set of data.
REML can produce less unbiased estimates of variance and covariance parameters (Pinheiro & Bates, 2000, Chapter 2).The penalized nonlinear least squares (PNLS) algorithm was used for nonlinear model (Pinheiro & Bates, 2000, Chapter 7).All the procedures were in R (R Core Team, 2015), version 3.3.1,using lme 4 for linear and nlme for nonlinear models routines (Pinheiro & Bates, 2000).For the goodness of fit, the standardized residual was used for linear and nonlinear models (Pinheiro & Bates, 2000).

The BCN Model Approach
Ferrari & Fumes (2017) proposed a new class of distributions called Box-Cox symmetric class, which includes the Box-Cox Normal distribution, that can be fitted to data with symmetric and asymmetric shapes.Then, an alternative approach, using the same structure of linear mixed model for the height data, was: ( ) | ~, , , where , For the diameter data, another alternative approach using BCN model was: where ijk Y was the diameter related to the kth time, the jth replication and the ith treatment with a Box-Cox Normal (BCN) distribution, where ijk µ was asso- ciated with: the median of the ith treatment ( i β ), ( ) was the smooth function (in this case, p-spline smoother was used, see Stasinopoulos et al. (2008) for details) which models the time effects, The marginal maximum likelihood was used for BCN models.In this context, the main distribution was been marginalized (Rigby et al., 2014).All the proce-Open Journal of Forestry dures were performed in R (R Core Team, 2015), version 3.3.1,using gamlss routine for linear as well as semiparametric BCN models (Rigby et al., 2014).For the goodness of fit, the quantile residual plot was calculated for BCN models (Rigby et al., 2014).
In these approaches, the random effect ij  represented the between plot variability in the experiment, while the within plot variability was modelled with the variance of the residual error ijk ε in the standard models, and in the BCN, it was modelled through the parameter σ.The Akaike criteria were calculated to compare the different approaches.

Results
The BCN and the normal distributions were plotted on top of the raw data in each time.For the height data, the BCN distribution performed better than the normal distribution (Figure 4).Plots for the diameter data were omitted because they presented the same feature.
In Table 1, there were the mean and the median estimated of growth for height by treatment for linear models ( 1) and ( 2).The estimated means were smaller than the estimated median.This result was expected due to the slight asymmetry presented in the data.In the model ( 4), it was possible to observe the estimated intercept for each treatment, and in this case, the smooth function described the growth curve for diameter.
For nonlinear mixed model (model ( 3)), the estimated parameters for  |AIC| = 111.51.It means that the estimate of the parameter associated to asymptotic growth for this observed period was around 3.5 cm, with an estimated growth efficiency of 0.47 cm in each period.From AIC criteria, the BCN model had a better performance than the standard ones, even though we could observe that the parameter estimate were quite close as well as the AIC (models (1) and ( 2)).This occurred probably due to the presence of slight asymmetry already commented above, but still indicating the BCN can be considered superior than the other.
For diameter, we can observe that the nonlinear BCN mixed model (model (4)) presented lower AIC than the Gompertz model (model (3)).It is important to notice that in the BCN model, a smooth function was taking into account to obtain a better fit.In this case, the smooth function can lead to a better fit but it had a cost of lack of parsimony while Gompertz model uses only three parameters.We guess this sort of modelling can be better investigated in order to get a better fit with few parameters.
Figure 5 shows the plots of standardized residuals for the fitted models to the height and diameter data.We observed that the linear and nonlinear models had a good fit for the height and diameter variables, respectively.For the BCN models (Figure 6), it was explored the quantile residual as a result of implementation of the gamlss routines, whereby we also observed a good fit.
Figure 7 presents the plots of estimated lines of growth for the height data using normal and BCN models and shows that both models were adequate to model the data, although they have different estimation approaches.The linear model is based in the mean and standard deviation around the mean (Figure 7(a)).On the other hand, the BCN model describes the median and a coefficient of the variation around the median (Figure 7(b)).In this context, we observed that there was less variability in the BCN model than in the normal model, as showed by the growth lines.These lines for the BCN model were more concentrated due to the robustness in the estimation process of the parameters.Analogously to the linear model, there was less variability in the BCN model than in the nonlinear model because of the robust estimation.
Furthermore, for all the models, comparisons among the eight treatments were done.The eighth treatment, which was the standard compound, was statistically different from the others with a better growth performance (Figure 9 and Figure 10).This result agrees with the descriptive analysis (Figure 2).
Although the other treatments did not show such growth as the standard, they yielded satisfactory results.This is very important, because the treatments based on compounds by sewage sludge mixture are sanitarily safe for agriculture and a promising alternative to recycle municipal waste.

Conclusion
This paper presented a new approach based on the Box-Cox Normal distribution   to analyse growth data.Additionally, the classical linear and nonlinear models were used to compare the new proposal.We observed that the fitted models presented similar performances, which means that the BCN model can be used to estimate growth curves.Moreover, its structure considered the slight asymmetry of the data and the AIC criteria primed as the best models.Furthermore, in the presence of outliers, this model can be more proper, because it involves a robust process of estimation.However, more studies can be made involving smooth functions in the nonlinear case.
The growth curves are important for determining the time of delivery of the seedlings to the field.Besides that, the genus Eucalyptus species is very important for the economic wood production in Brazil.Finally, curve predictions can be useful when making efficient decisions on the use of this natural renewable resource.
60:40, 50:50, 40:60, 20:80 and 0:100.For this article, the different proportions composed the treatments from one to seven, in the order of proportions as shown above, and the eighth treatment was the standard substrate.The experimental design was a completely randomized design with eight treatments and four replications, each replication being origined by the growth average of 24 plantings.The growth variables measured were the height in five periods (after 90, 105, 120, 135 and 150 days) and the diameter in four periods (after 105, 120, 135 and 150 days), both measurement in centimeters.

Figure 1
Figure 1 revealed a linear growth trend for height (Figure 1(a)) and nonlinear growth trend for diameter (Figure 1(b)).

Figure 1 .
Figure 1.Profile analysis of each treatment.The sequence of the treatment begins from bottom left (Treatment 1) to top right (Treatment 8).(a) Growth in height.(b) Growth in diameter.

Figure 2 .
Figure 2. Mean profile analysis.(a) Growth in height.(b) Growth in diameter.
where ijk Y was the height related to the kth time, the jth replication and the ith treatment with Box-Cox Normal (BCN) distribution, where ijk µ was associated with: the median of the ith treatment ( to the coefficient of variation based on the median and ν was the parameter associated to the asymmetry, with σ was the parameter related to the coefficient of variation based on the median and ν was the parameter associated to the asymmetryand ν ∈ » .

Figure 7 .
Figure 7. Fitted growth lines of height for each treatment.The sequence of the treatment begins from bottom left (Treatment 1) to top right (Treatment 8).(a) Linear mixed-effect model (1).(b) Linear BCN mixed-effect model (3).

Figure 8
Figure8shows the plots of the estimated curves for growth in diameter.The Gompertz mixed model (Figure8(a)) reasonably described the data behaviour, but the Gompertz model has a specific mode and it did not show

Figure 8 .
Figure 8. Fitted growth curves of diameters for each treatment.The sequence of the treatment begins from bottom left (Treatment 1) to top right (Treatment 8).(a) Nonlinear mixed-effect model (2).(b) Nonlinear BCN mixed-effect model (4).
variable and k

Table 1 .
Estimated parameters for fixed effects of linear model and linear and nonlinear BCN mixed models.