Analyzing Accuracy of the Power Functions for Modeling Aboveground Biomass Prediction in Congo Basin Tropical Forests

Allometric equations are common tools for quantifying and monitoring the amount of carbon stored in forest ecosystems. The choice of model can be one of the major sources of error in wood biomass estimation. The power function of plant growth was examined by comparing sixteen models. Goodness-of-fit criteria, model selection criteria and prediction uncertainties were computed. Published data from biomass studies and plot inventories were used for this analysis. The results show that the power function is the best model for aboveground biomass and that additive effects of the predictor variables on the logarithmic scale must be prioritized. Powers of the logarithm of diameter should be avoided as predictor variables because they lead to the worst fit and the highest prediction uncertainty. Tree height as a third predictor variable gives the best fit and reduces the uncertainty of the biomass prediction by about 8 t/ha compared with models using only the two other predictor variables, diameter and wood specific density. The goodness-of-fit criteria are sufficient for assessing the prediction quality of the models. The exponent of wood density as a predictor variable needs to be better understood.

ecosystems (Vashum & Jayakumar, 2012). Significant attention has been paid to the fact that changes in aboveground biomass may have considerable impacts on climate change or climate change mitigation (Lu et al., 2002). Its estimation is central to quantifying and monitoring the amount of carbon stored by trees.
Different methods are used for estimating biomass and forest carbon. These include the average carbon stock, forest inventory, remote sensing techniques that correlate spectral indices with biomass or terrestrial forest carbon (e.g., Landsat, MODIS), aerial photography, 3D digital imaging, radar signals that measure the vertical structure of the forest (ALOS PALSAR, ERS-1, JERS-1, Envisat) and LiDAR. Each method has advantages and disadvantages (Gibbs et al., 2007); however, allometric equations are commonly used (van Breugel et al., 2011) in association with forest inventory and remote sensing.
Several studies (Ketterings et al., 2001; Chave et al., 2004; Molto et al., 2012) have shown that, when assessing forest carbon stocks, the sources of uncertainty start with the tree inventory. To improve the accuracy of forest biomass estimates, the different sources of error must be identified and prioritized, and action must be taken to minimize them (Picard et al., 2014). For allometric equations, four sources of uncertainty can be identified: 1) the error due to the choice of the allometric equation, or model misspecification; 2) the prediction error (uncertainty in the model's coefficients and in the residual error); 3) the measurement error in the tree dimension variables; and 4) the sampling error.
The sampling error, which depends on the landscape heterogeneity and on the size, shape and number of the plots, can be minimized by the sampling design (Picard et al., 2014), whereas a great effort must be made to reduce the measurement error. The two other sources of error are model errors that depend on the allometric model. The choice of allometric model appears to be the most important (Chave et al., 2004; van Breugel et al., 2011; Melson et al., 2011; Molto et al., 2012; Picard et al., 2016). Choosing an appropriate allometric model has become a major scientific concern for the accurate estimation of forest biomass (Rutishauser et al., 2013). The type of equation used (species-specific, site-specific, ecosystem-specific, pan-tropical, etc.) can also affect the errors (van Breugel et al., 2011). Chave et al. (2014) developed a single allometric equation for all ecosystems and concluded that the site effect can be negligible if diameter, height and wood density are included. In recent studies, Djomo et al. (2016) and Djomo & Chimi (2017) recommended using existing site-specific or ecosystem-specific equations in preference to pan-tropical allometric equations in tropical moist forests.
As presented by Zianis and Mencuccini (2004) and Pilli et al. (2006), the mathematical model commonly used for modeling aboveground biomass is based on the power function. This rests on the premise that the growth of a plant is characterized by a proportionality relation between its total biomass and its size (West et al., 1997; 1999). According to Parresol (1999), existing equations for modeling wood biomass fall into three types: the linear model with additive error (Equation (1)), the nonlinear model with additive error (Equation (2)) and the nonlinear model with multiplicative error (Equation (3)), written respectively as follows:
$AGB = \beta_0 + \beta_1 X_1 + \cdots + \beta_j X_j + \varepsilon$ (1)
$AGB = \beta_0 X_1^{\beta_1} X_2^{\beta_2} \cdots X_j^{\beta_j} + \varepsilon$ (2)
$AGB = \beta_0 X_1^{\beta_1} X_2^{\beta_2} \cdots X_j^{\beta_j} \, \varepsilon$ (3)
where AGB denotes the aboveground biomass, $X_i$ the tree dimension variables (diameter at breast height, total height, age, crown length and their combinations), $\beta_i$ the coefficients of the equations and $\varepsilon$ the residual error. To estimate the fitted parameters (coefficients), the log-transformation is appropriate, indeed necessary, for allometric analysis (Kerkhoff & Enquist, 2009). Linear regression can be applied to Equation (3), assuming that the error is normally distributed and additive on the logarithmic scale:
$\ln(AGB) = \ln(\beta_0) + \beta_1 \ln(X_1) + \cdots + \beta_j \ln(X_j) + \ln(\varepsilon)$ (4)
which is not the case for Equation (2).
Thus, modeling of biomass cannot be limited to goodness of fit and selection criteria. It is also essential to examine whether the fitted model is consistent with the biological process of tree growth. The objectives of this research are: 1) to evaluate the sensitivity of the parameters and their combinations included in the models and to compare additive effects with multiplicative effects; 2) to analyze the uncertainty in the model predictions; and 3) to evaluate a methodology for reducing the uncertainty of a selected model for biomass determination.

Harvest Biomass and Forest Inventory Data
Two types of data were used in this study: destructive aboveground biomass data from different published works and forest inventory data from tropical African forests.
The tree harvest data comprised 362 sample trees with diameter and wood density measurements (Table 1), 225 of which also had height measurements. These data were collected in the transition forest between dense evergreen forest and semi-deciduous forest in the Democratic Republic of Congo (Ebuy et al., 2011) with 12 trees (diameter, height and wood density), in Cameroon (Fayolle et al., 2013) with 137 trees (diameter and wood density), in Gabon (Ngomanda et al., 2014) with 101 trees (diameter, height and wood density), in evergreen forest in Cameroon (Djomo et al., 2010) with 71 trees (diameter, height and wood density) and in the Boi Tano forest reserve in Ghana (Henry et al., 2010) with 41 trees (diameter, height and wood density). The mean diameter was 44.9 cm and the median 37.6 cm. A quarter of the trees (90 trees) had a diameter greater than or equal to 70 cm. The numbers of trees with diameter greater than 80 cm and 90 cm were 67 and 47, respectively. The maximum diameter was 192.5 cm.
Inventory data came from permanent plots of the Central African Regional Program for the Environment (CARPE), installed with the Smithsonian Institution's assistance. This work was conducted as part of the assessment of biodiversity in the forest reserves of Dzanga Sangha in Central African Republic with 5 plots (Balinga et al., 2006), of Monts de Crystal National Park with 5 plots (Sunderland et al., 2004) and of Waka National Park with 5 plots (Balinga, 2006) in Gabon, and of Nouabale Ndoki National Park with 4 plots (Sunderland & Balinga, 2005) in the Republic of Congo. Across those four forest reserves, 19 one-hectare permanent plots were set up. Trees over 10 cm dbh (diameter at breast height, 1.30 m above ground level) were measured and identified to species. The maximum diameter was 188 cm, with a mean of 24.8 cm and a median of 17.8 cm.
The height allometric equation (Equation (5)), which is the best one for moist forests (Djomo et al., 2016), was used to estimate the unmeasured tree heights (H) in the inventory and harvest biomass data sets. The wood density values (ρ) of species were obtained from the international wood density database (Zanne et al., 2009). For species without wood density values, the plot average was assigned.
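The fallback rule for missing wood density can be sketched as follows. This is a minimal illustration and not the authors' code: the record layout (`plot`, `species`, `rho`), the function name `fill_missing_density` and all values are hypothetical, and the plot average is assumed here to be the mean over the trees of the plot that do have a density value.

```python
from collections import defaultdict
from statistics import mean


def fill_missing_density(trees):
    """Return a copy of `trees` in which a missing wood density (None)
    is replaced by the mean density of the known trees in the same plot.

    Assumption: every plot has at least one tree with a known density.
    """
    known = defaultdict(list)
    for tree in trees:
        if tree["rho"] is not None:
            known[tree["plot"]].append(tree["rho"])
    plot_mean = {plot: mean(values) for plot, values in known.items()}

    filled = []
    for tree in trees:
        rho = tree["rho"] if tree["rho"] is not None else plot_mean[tree["plot"]]
        filled.append({**tree, "rho": rho})
    return filled


# Made-up records; wood density in g/cm^3.
trees = [
    {"plot": "P1", "species": "sp_a", "rho": 0.50},
    {"plot": "P1", "species": "sp_b", "rho": 0.70},
    {"plot": "P1", "species": "sp_unknown", "rho": None},
]
filled = fill_missing_density(trees)
```

The input list is left unchanged, so the original gaps remain visible for later checks.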

Modeling Aboveground Biomass
The power function relating the aboveground biomass (AGB) to the predictor variables dbh (D), H and ρ is given by Equation (6), derived from Equation (3):
$AGB = \beta_0 D^{\beta_1} H^{\beta_2} \rho^{\beta_3} \, \varepsilon$ (6)
where $\beta_0$ is the allometric coefficient and $\beta_1$, $\beta_2$ and $\beta_3$ are the allometric exponents.
When the natural logarithmic transformation is applied, Equation (6) is rewritten as:
$\ln(AGB) = \ln(\beta_0) + \beta_1 \ln(D) + \beta_2 \ln(H) + \beta_3 \ln(\rho) + \ln(\varepsilon)$ (7)
Following the recent discussion between Kerkhoff and Enquist (2009) and Packard (2009), Xiao et al. (2011) used Monte Carlo simulations to compare the different approaches; they concluded that log-transformed linear regression produces more accurate estimates and recommended applying both statistical and biological analyses. For this purpose, data analysis of the sampled trees was limited to graphical analysis (diameter distribution and scatter plots) to check the nature of the error (Figure 1). This led to the choice of log-transformed linear regression, because the error is multiplicative on the original scale.
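As an illustration of log-transformed linear regression for a power model with multiplicative error, the sketch below fits ln(AGB) on ln(D), ln(H) and ln(ρ) by ordinary least squares. It is a minimal example under stated assumptions, not the authors' procedure: the function name `fit_log_power_model`, the synthetic trees and the "true" coefficients used to generate them are all invented for the demonstration.

```python
import numpy as np


def fit_log_power_model(D, H, rho, AGB):
    """Fit ln(AGB) = b0 + b1*ln(D) + b2*ln(H) + b3*ln(rho) by OLS.

    Returns the coefficient vector (b0 is ln(beta_0)), the residual
    standard error on the log scale, and the coefficient covariance matrix.
    """
    X = np.column_stack([np.ones(len(D)), np.log(D), np.log(H), np.log(rho)])
    y = np.log(AGB)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dof = len(y) - X.shape[1]
    rse = np.sqrt(resid @ resid / dof)
    cov = rse**2 * np.linalg.inv(X.T @ X)
    return beta, rse, cov


# Synthetic trees with a multiplicative (log-normal) error; values are made up.
rng = np.random.default_rng(0)
n = 500
D = rng.uniform(10, 150, n)      # diameter, cm
H = rng.uniform(5, 50, n)        # height, m
rho = rng.uniform(0.3, 0.9, n)   # wood density, g/cm^3
AGB = 0.2 * D**2.3 * H**0.3 * rho**1.0 * np.exp(rng.normal(0, 0.3, n))

beta, rse, cov = fit_log_power_model(D, H, rho, AGB)
```

Because the simulated error is multiplicative, it becomes additive with constant variance after the transformation, which is the situation in which ordinary least squares on the log scale is appropriate.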
Based on the values of the allometric exponents ($\beta_1$, $\beta_2$, $\beta_3$), nine models were established and divided into two groups (Table 2). The first group comprised four models with two predictor variables, diameter and wood density, so that $\beta_2 = 0$. The predictor variable of Equation (a) was a compound variable, the combination of D and wood density, while Equation (b) and Equation (c) were models with additive effects of these variables. Equation (d) captured the effect of using D squared instead of D in Equation (a).
The second group, with five models, analyzed the effect of height as the third predictor variable. As with the first group, the product of the three predictor variables (Equation (e)), the square of diameter, the height and the wood density, and their additive effects were examined in Equation (f) to Equation (i). The third group used seven other models, drawn from several studies, characterized by powers of the logarithm of the diameter as predictor variables (Equation (j) to Equation (p)).
[Displaced equation fragment: ln ... 5.783 ... 6.010 ln ... 0.929 ln ... 0.079 ln ... ln; operators and variables not recoverable.]

Selecting the Best Allometric Models
For each model, the following goodness-of-fit criteria were calculated: the adjusted coefficient of determination (R2aj), RMSE, RMAE, RRMSE, AIC, PRESS and P1_alpha. In the definitions of these criteria, $AGB_i$ is the observed biomass of sample tree i, n the sample size and $AGB_{est,-i}$ the predicted value of sample tree i when it has been excluded from the fitted model. The predicted values of aboveground biomass were calculated for each model with the plot inventory data.
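The leave-one-out prediction used by PRESS, in which each tree is predicted from a model fitted without it, can be sketched as below. This is a simplified illustration rather than the authors' computation: it uses a single predictor (ln D), evaluates the squared errors on the logarithmic scale (an assumption, since the source formula is not available here), and the helper names and data are hypothetical.

```python
import math


def fit_line(x, y):
    """Least-squares intercept and slope of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def press(x, y):
    """Sum of squared leave-one-out prediction errors."""
    total = 0.0
    for i in range(len(x)):
        # Refit the model without observation i, then predict it.
        a, b = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        total += (y[i] - (a + b * x[i])) ** 2
    return total


# Made-up diameters (cm) and biomasses (kg).
D = [12.0, 20.0, 35.0, 50.0, 80.0, 120.0]
AGB = [60.0, 250.0, 1100.0, 2600.0, 9000.0, 24000.0]
ln_D = [math.log(d) for d in D]
ln_AGB = [math.log(a) for a in AGB]

a, b = fit_line(ln_D, ln_AGB)
rss = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(ln_D, ln_AGB))
press_value = press(ln_D, ln_AGB)
```

For a least-squares fit, PRESS is never smaller than the ordinary residual sum of squares, so a large gap between the two points to observations that strongly influence the fit.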
The sixteen models were compared and the range of predicted values was estimated for each plot. For each inventory plot, this range reflects the uncertainty (error) due to the choice of the allometric equation.
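The per-plot range described above is simply the difference between the largest and smallest plot-level prediction across models. A minimal sketch, with a hypothetical data layout and made-up predictions for three of the models:

```python
def prediction_range_by_plot(predictions):
    """Map each plot to max - min of its model predictions (t/ha).

    `predictions` is a dict {plot: {model: predicted AGB in t/ha}}.
    """
    return {
        plot: max(by_model.values()) - min(by_model.values())
        for plot, by_model in predictions.items()
    }


# Made-up plot-level predictions (t/ha); model labels are illustrative.
predictions = {
    "plot_1": {"a": 480.0, "b": 430.0, "h": 428.0},
    "plot_2": {"a": 390.0, "b": 350.0, "h": 340.0},
}
ranges = prediction_range_by_plot(predictions)
```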

Prediction Error Calculation
Model accuracy was analyzed by computing the propagation of the prediction error at tree level using the inventory plot data. The Monte Carlo simulation method was used. The residual error of each model was simulated by adding to the prediction a random, normally distributed error $\varepsilon_{ik}$ with mean zero and standard deviation equal to the residual standard error of the fitted model. The uncertainty in the fitted parameters was simulated by Monte Carlo iteration according to a multivariate normal distribution whose mean is the vector of estimated parameters and whose covariance is the variance-covariance matrix of the model's coefficients. For each Monte Carlo iterate k, random coefficients ($\beta_{0k}$, $\beta_{1k}$, $\beta_{2k}$ and $\beta_{3k}$) and the kth random residual error ($\varepsilon_{ik}$) were generated and the corresponding biomass computed. At the kth Monte Carlo iterate, the predicted biomass of the ith tree was, for Equation (7), as follows:
$\ln(AGB_{ik}) = \ln(\beta_{0k}) + \beta_{1k} \ln(D_i) + \beta_{2k} \ln(H_i) + \beta_{3k} \ln(\rho_i) + \varepsilon_{ik}$
with k varying from 1 to 10,000. For each model, the 10,000 predictions of the aboveground biomass of each inventory tree, by plot, were computed to assess the level of uncertainty. The prediction data were used to calculate, for each model and each plot, the Monte Carlo 95% confidence interval (25% CI and 95%
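The Monte Carlo propagation described in this section can be sketched as follows. This is an illustrative implementation under stated assumptions, not the authors' code: the function name `simulate_plot_agb`, the coefficient vector, its covariance matrix, the residual standard error and the tree measurements are all invented, and tree-level predictions are summed to a plot total in kilograms.

```python
import numpy as np


def simulate_plot_agb(beta_hat, cov, rse, D, H, rho, n_iter=10_000, seed=0):
    """Monte Carlo plot totals of AGB (kg), one value per iteration.

    beta_hat : fitted (ln b0, b1, b2, b3) on the log scale
    cov      : variance-covariance matrix of beta_hat
    rse      : residual standard error of the log-scale fit
    """
    rng = np.random.default_rng(seed)
    X = np.column_stack([np.ones(len(D)), np.log(D), np.log(H), np.log(rho)])
    # Coefficient uncertainty: one random coefficient vector per iteration.
    betas = rng.multivariate_normal(beta_hat, cov, size=n_iter)
    # Residual uncertainty: one random error per tree and per iteration.
    eps = rng.normal(0.0, rse, size=(n_iter, len(D)))
    ln_agb = betas @ X.T + eps
    return np.exp(ln_agb).sum(axis=1)


# Hypothetical fitted model and a made-up plot of 50 trees.
beta_hat = np.array([np.log(0.2), 2.3, 0.3, 1.0])
cov = np.diag([0.02, 0.001, 0.001, 0.004])
rng = np.random.default_rng(1)
D = rng.uniform(10, 100, 50)      # diameter, cm
H = rng.uniform(8, 45, 50)        # height, m
rho = rng.uniform(0.3, 0.9, 50)   # wood density, g/cm^3

totals = simulate_plot_agb(beta_hat, cov, 0.3, D, H, rho, n_iter=2000)
lo, q25, q75, hi = np.percentile(totals, [2.5, 25, 75, 97.5])
iqr = q75 - q25
```

Dividing the totals by 1000 gives tonnes, which for a one-hectare plot is directly t/ha. The percentiles give a 95% interval and the interquartile range used to summarize the prediction uncertainty.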

Fitted Allometric Parameters
The allometric coefficients and allometric exponents of the power function were calculated (Table 2). The bias arising from back-transformation from the logarithmic scale to the original scale was corrected by changing the coefficient. Equation (j) to Equation (p) cannot be back-transformed into a power function of the predictor variables, so their allometric exponents are not applicable (NA). For those models, the values of $\beta_0$ are either higher (4.8 to 131.3) or lower (0.003). The allometric exponents of models Equation (a) to Equation (i) [...] The poorest fit (RMSE = 0.426) is explained by the use of the product of the two predictor variables D and ρ, whereas their additive effects in Equation (b) and Equation (c) improve the quality of fit. The allometric coefficients of the models of group 2 are about 0.11 for multiplicative effects (Equations (e) to (g)) and 0.21 for additive effects (Equations (h) to (i)). For this group, the allometric exponents $\beta_1$ and $\beta_2$ of D and H are about 1.85 and 0.93, respectively, for multiplicative effects and about 2.27 and 0.26 for additive effects.
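The exact expression used for the bias correction mentioned at the start of this section is not legible in the text. A widely used correction for log-transformed regressions multiplies the back-transformed coefficient by exp(RSE²/2), the mean of a log-normal error term; the sketch below assumes that form, which may differ from the one the authors applied. The function names are hypothetical, and the two residual errors are the extremes reported for the fitted models (0.296 and 0.426).

```python
import math


def bias_correction_factor(rse):
    """exp(rse^2 / 2): mean of a log-normal error with log-scale s.d. `rse`."""
    return math.exp(rse ** 2 / 2)


def corrected_coefficient(ln_b0, rse):
    """Back-transformed allometric coefficient with the assumed correction."""
    return math.exp(ln_b0) * bias_correction_factor(rse)


# Correction factors for the smallest and largest reported residual errors.
low = bias_correction_factor(0.296)
high = bias_correction_factor(0.426)
```

Under this assumption the correction raises the coefficient by roughly 4.5% for the best-fitting model and roughly 9.5% for the worst, so it matters most for the models with the largest residual error.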

Choosing the Best Allometric Models
All the comparison criteria (RMSE, R2aj, RMAE, RRMSE, AIC, PRESS, P1_alpha) show the same trend in assessing the goodness of fit of the models (Figure 2). The adjusted coefficients of determination lie between 0.927 and 0.964 and the residual errors vary between 0.296 and 0.426. All the allometric coefficients and exponents are significant except the allometric coefficient of Equation (a), as presented in Table 2. In group 3 the regression coefficients of [...] The tests of significance of the allometric exponent of ρ make it possible to accept that it is equal to 1. These results show that the best models are Equation (h) and Equation (i) for the three predictor variables D, H and ρ, while Equation (b) and Equation (c) are the best for the two predictor variables D and ρ.

Comparison of Model Predictions
For the sixteen models compared, aboveground biomass predictions were made on each of the nineteen permanent plots. The ranges of the estimates varied from 46.1 t/ha to 218.1 t/ha (Figure 3). The lowest range was obtained for plot 1 of Waka forest reserve, while the highest was for plot 5 of Monts de Crystal forest. These results highlight the need for an equation that is as reliable as possible. Analysis of variance was used to compare the three groups and the sixteen models (Table 3). Significant differences are observed between groups of models (F value = 54.967, Pr(>F) = 2e-16) and between the 16 models (F value = 26.2507, associated probability Pr(>F) [...]). The interactions between plots and groups of models are not significant (F value = 1.10, [...]).

Propagation Error Analysis
The propagation of the prediction error was analyzed using the interquartile values (IQ) of the aboveground biomass Monte Carlo iterations. The analysis of variance showed a difference between group 3 and the two other groups, which do not differ from each other.
Group 3 is characterized by the highest uncertainty, 93.6, against 85.7 and 80.7 for groups 1 and 2, respectively (Table 3). The comparison of the 16 models shows that the additive-effect models of groups 1 and 2 form a homogeneous group with models Equation (m) to Equation (p) of group 3, distinct from the other models, as presented in Table 3. The greatest uncertainties, more than 100.0 t/ha, are obtained with four models, Equation (a) and Equation (j) to Equation (l), while the lowest, 75.4 t/ha, is obtained with Equation (h), as presented in Figure 4. Despite the homogeneity of group 2, models Equation (e), Equation (f) and Equation (g), characterized by non-additive effects of the predictor variables, have high uncertainty. Compared with the goodness-of-fit criteria, the best models (Equation (h) and Equation (i)) have low uncertainty, whereas the poorly fitted models (Equation (a) and Equation (l)) have the highest uncertainties.

Discussion and Conclusion
Mathematical functions that describe the growth of a plant (Niklas, 1994; Kaitaniemi, 2004; Pilli et al., 2006) are applied to model aboveground biomass. This study shows that the models of group 3 differ from the other models in that powers of the logarithm of diameter were used as predictor variables.
However, the worst model is Equation (a), whose predictor variable is the product of diameter and wood density. When the predictor variables enter with additive effects, Equation (b) becomes the best model with two predictors and can be expressed as a power function. The fitted models show that the allometric exponent of wood density as a predictor variable is equal to 1. This value agrees with the results of several studies (Fayolle et al., 2013; Ngomanda et al., 2014; Chave et al., 2005, 2014). Indeed, according to Franceschini et al. (2016), the allometric exponent can be interpreted in terms of relative growth rates, and this kind of growth rate cannot be applied to wood density. Under these conditions one can reasonably accept that the ideal model is summarized, for each tree i, as [...]. This result agrees with that of Pilli et al. (2006), who treated the allometric coefficient as the product of a constant (scalar) and the wood density of the tree. This can explain the poor quality of Equation (a), with an exponent of 2.205; consequently, its aboveground biomass prediction is the highest, with the highest uncertainty. Further research should give more consideration to the exponent of the wood density variable in allometric equations. The exponents of diameter lie between 1.85 and 2.42. Zianis and Mencuccini (2004), using a list of 279 biomass allometric equations, showed that this value should rather be close to 2.36.
Many studies have highlighted the importance of tree height as a predictor variable in aboveground biomass equations (Chave et al., 2014; Djomo et al., 2016). This study confirms the fit and prediction qualities of models with height, but the additive effect of each predictor variable must be taken into account. Using the same allometric exponent for two or three predictor variables (Equation (e), Equation (f) and Equation (g)) is not appropriate for modeling aboveground biomass. The best equations were obtained with an additive effect of each predictor variable, as in Equation (h) and Equation (i). This study concludes that, in allometric modeling, the power function that characterizes the growth of a plant should guide the choice of the models to be estimated. Imposing a single common exponent leads to poor models, so the additive effect of each predictor variable must be prioritized.
This study highlights the trend of the model choice error. The highest (481.6 t/ha) and the lowest (404.7 t/ha) aboveground biomass predictions are obtained with the worst models, while those of the best models lie in between (428.3 to 431.4 t/ha). The biomass predicted by the "best" models agrees with the aboveground biomass estimated for the Congo basin forests by Lewis et al. (2013).
Among the models with additive effects, those with two predictor variables predict on average 8 t/ha more aboveground biomass than those with three predictor variables. The models with the highest prediction errors are those with the worst fit. Therefore, the goodness-of-fit and model selection criteria are able to anticipate the prediction quality of the best model chosen.

Figure 1. (a) Diameter distribution, (b) scatter plot of aboveground biomass against diameter and (c) scatter plot on the natural logarithm scale, which achieves normality and homogeneity of variance.

Figure 2. Trend of the goodness-of-fit criteria of the models and the associated selection criteria AIC, PRESS and P1_alpha. In the legend, letters a to p represent the models; for example, h corresponds to Equation (h).

Figure 3. The range ($AGB_{max} - AGB_{min}$) of the estimated aboveground biomass on each inventory plot across the 16 model equations.

Figure 4. Boxplot of the interquartile range of aboveground biomass, representing the prediction uncertainty of the sixteen models from Monte Carlo iterations.

Table 1. Description of harvest sample trees by author: n = sample size; ne = number of species; DR = diameter range; transition = transition between evergreen forest and semi-deciduous forest.

Table 2. The fitted allometric models with their allometric coefficients ($\beta_0$) and allometric exponents ($\beta_1$, $\beta_2$ and $\beta_3$) on the natural scale after back-transformation from the logarithmic scale; D is the diameter (in cm), H the height (in m), ρ the wood density (in g/cm³) and AGB the aboveground biomass (in kg).