Use of Logistic Regression Model for Prediction of Non-Timber Forest Products

The use of non-timber forest products is a valuable alternative for the conservation of tropical forests. Juçara (Euterpe edulis Mart.) is considered one of the main alternatives in the Atlantic Forest for the production of açaí pulp. However, few studies have evaluated its production. The present study aimed to construct a probabilistic model to predict the production of Euterpe edulis bunches, using dendrometric variables and a competition index. Twenty plots of 10 × 50 m were sampled in an area where the species occurs, measuring the tree individuals with diameter at breast height ≥ 4.8 cm and recording the phenophase of Euterpe edulis individuals. The main variables influencing the production of bunches were assessed using a logistic regression model. The logistic regression showed diameter at breast height (DBH) and total height (h) to be significant in explaining the variation between productive and non-productive individuals. The competition index tested was not significant (p-value = 0.221). The prediction model for bunch production in Juçara can be written as: Zi = −6.878594 + 0.2522454 × DBH + 0.1951574 × h. The use of a logistic regression model showed potential for the prediction of non-timber forest products.


Introduction
The suppression of forests throughout history to develop land for other uses has reduced the Atlantic Forest to fragments, threatening the biodiversity of this Brazilian biome [1]. Currently, only 12% of the original Atlantic Forest cover remains, with less than half of it in protected areas [2]. Most of the agricultural land of Brazil, as well as most of the country's population, 125 million inhabitants, are located in this biome.
Governments and nongovernmental organizations (NGOs) have supported rural communities as a way to promote conservation and sustainability of tropical forests in Latin America and the world [3]. Their actions are often directed towards the use of non-timber forest products (NTFP). For instance, in Brazil the National Plan for the Promotion of Socio-biodiversity Products was implemented in 2009 [4].
There is, however, scarce information on the production of NTFPs of commercial interest and on its sources of variation. Since NTFPs are considered an important component in tropical forest conservation strategies, this represents a contradiction [5].
The Juçara palm (Euterpe edulis Mart.) is one of the most important, abundant and valuable non-timber forest products exploited in the Atlantic Forest.
The extraction of palm heart is its main use [1]. Harvesting palm heart involves felling the palm, which kills the plant, since it has a single stipe and does not resprout [6]. Clandestinely exploited in sections of the Atlantic Forest, palm heart harvest has been the greatest threat to the species [7].
The exploitation of fruits for pulp production is a recent economic use that has a low impact on the Euterpe edulis population [8]. The product is equivalent to the Amazonian açaí, produced from Euterpe oleracea Mart. It is widely consumed today due to its nutritional and health properties, such as its high energy, mineral and anthocyanin contents [7].
Juçara is one of the few tropical species with potential for commercial exploitation. It can be cultivated in native forests through sustainable management practices that guarantee the conservation of the remaining forest fragments [9].
The increasing interest in the management of the species for pulp production makes it necessary to further study the variation in fruit productivity and its causes [10].
Some recent studies have evaluated Euterpe edulis fruit production; however, they propose no methodology for using field survey data to estimate yield [10] [11]. Nor do the models constructed to assess Juçara fruit production include variables that quantify the competition experienced by the sampled individuals.
Studies on competition among trees have received great attention because of its strong effects on the structure and development of stands [12]. Competition between trees takes place when resources are scarce, that is, when their supply falls below the demand of the individuals [13].
In forest growth and yield models, tree competition is an important quantitative variable [14]. However, as these authors report, it is difficult to measure because its direct causes are not known. There are three categories of competition indices [14]: distance-independent, which use variables at stand level; distance-dependent, which incorporate the size and location of neighboring or competing trees relative to the object tree; and distance semi-independent, which are calculated by considering the neighboring trees in circular plots around the object tree.
This investigation aimed at constructing a probabilistic model to predict the production of Euterpe edulis bunches, using dendrometric variables and a competition index.

Study Area and Collection of Data
The present study was carried out in fragments of Semideciduous Seasonal Forest belonging to the Dênis Gonçalves Settlement Project (21°34'30"S and 43°12'33"W), located in the Zona da Mata region of Minas Gerais, in February and March 2016 (Figure 1). The study area lies between 409 m and 928 m of altitude. Latosols are the predominant soil class [15]. The climate in the region is classified as Cwb (Köppen): high-altitude tropical, mesothermal, with hot summers and high precipitation (October to April), and cold and dry winters (May to September) [16].
The annual average temperature is 18.7 °C and the annual rainfall is 1528 mm.
Fragments in the settlement are of secondary forest and total 1393.8 ha. They are maintained as Legal Reserve, that is, areas within rural properties set aside for conservation, as described in Brazilian law [17]. These areas were previously planted with Coffea arabica L. and have been left to natural regeneration for more than 60 years (personal communication).
Areas were first identified in order to quantify and measure Euterpe edulis individuals. Forest inventory plots were randomly allocated at these sites.
Twenty plots of 500 m² (10 × 50 m) were measured, totaling a sampling effort of one hectare.
All living and standing dead woody individuals with circumference at breast height (CBH) equal to or greater than 15.0 cm (DBH ≥ 4.8 cm) were sampled. Each individual's scientific name, CBH measured with a tape measure, total height, and phenophase were recorded. Botanical material was identified by consulting the literature and specialists, or by comparison with specimens in the herbarium of the University of Brasília, following the APG III classification system [18].
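The inclusion threshold and the sampling effort stated above follow from simple arithmetic. The short Python sketch below reproduces them, assuming a circular stem cross-section (DBH = CBH / π); the function and variable names are illustrative and not part of the study.

```python
import math

def cbh_to_dbh(cbh_cm: float) -> float:
    """Convert circumference at breast height (cm) to diameter (cm),
    assuming a circular stem cross-section."""
    return cbh_cm / math.pi

# Inclusion threshold used in the survey: CBH >= 15.0 cm
dbh_threshold = cbh_to_dbh(15.0)

# Sampling effort: 20 plots of 10 m x 50 m, expressed in hectares
n_plots = 20
plot_area_m2 = 10 * 50
total_area_ha = n_plots * plot_area_m2 / 10_000

print(f"DBH threshold: {dbh_threshold:.1f} cm")
print(f"Sampled area: {total_area_ha:.1f} ha")
```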

Data Analysis
Distance-independent competition indices (IID) [19] were calculated in order to assess the competition experienced by each Euterpe edulis individual (Table 1).
DBH, total height (h) and IID1 were selected using Pearson correlation coefficients (Table 2) for the construction of the model for the probability of bunch production. At the significance level alpha = 0.05, the p-value was lower than 2.2e−16 for all indices.
Sampled individuals were stratified by their diametric distribution and vertical structure, in order to understand the influence of dendrometric variables on the production of Juçara bunches. The definition of vertical strata followed the criteria described by [20] (Table 3). Strata were calculated by plot, including all sampled tree individuals of all species found.
Table 1. Distance-independent competition indices calculated in the study.

Columns: Competition index; Author (year); Equation.
Where: ASi = sectional area of the object tree stem, measured at 1.30 m height (m²); ASq = sectional area corresponding to the mean diameter (q) of neighboring tree stems (m²); BALi = sum of the sectional areas of neighboring tree stems larger than the object tree's stem.
For the height strata (Table 3): h = total height of each individual; hm = mean height of sampled individuals; S = standard deviation of height.
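As an illustration of how a distance-independent quantity such as BAL can be computed from inventory data, the sketch below derives stem sectional areas from DBH and sums those of the neighbors larger than the object tree. It is a minimal sketch that assumes all other trees in the same plot count as neighbors; the function names and DBH values are hypothetical, and it does not reproduce the exact index equations of Table 1.

```python
import math

def sectional_area_m2(dbh_cm: float) -> float:
    """Stem sectional area (m²) at 1.30 m from DBH in cm: pi * DBH² / 40000."""
    return math.pi * dbh_cm ** 2 / 40_000

def bal(object_dbh_cm: float, neighbor_dbhs_cm: list[float]) -> float:
    """Sum of sectional areas (m²) of the neighbors larger than the object tree."""
    return sum(sectional_area_m2(d) for d in neighbor_dbhs_cm if d > object_dbh_cm)

# Hypothetical plot: DBH (cm) of the object palm and of the other trees in the plot
object_dbh = 12.0
neighbors = [5.0, 9.5, 14.0, 22.0, 30.0]

print(f"BAL = {bal(object_dbh, neighbors):.4f} m²")
```

A larger BAL means that more basal area is held by trees bigger than the object tree, which is why quantities of this kind are used as proxies for the competition it experiences.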
The Logit probability model was chosen to estimate the probability of bunch production in Euterpe edulis individuals. This model has been used in several studies and in different areas, such as biology, epidemiology, medicine, economics and engineering, among others [21].
It is a limited dependent variable model in which the regressand is a binary variable that assumes two values: 1 if the event occurs and 0 otherwise [22]. Thus, the results give the probability of production for a given Juçara individual, given its diameter, total height and competition index. All sampled individuals of the species were used. The number of bunches per individual ranged from 0 to 4 and was transformed into a binary variable (1 for at least one bunch, 0 for none).
The logistic distribution is used as the link function in the Logit model:

$$P_i = f(Z_i) = f(X\beta)$$

where $P_i$ is the probability that individual $i$ produces at least one bunch of fruits; $f$ is the cumulative distribution function; $X$ is a vector of explanatory variables; and $\beta$ are unknown parameters to be estimated. In the Logit model, this relation takes the following form:

$$P_i = \frac{1}{1 + e^{-Z_i}} \qquad (4)$$

where $Z_i = X\beta$, $\beta$ are the unknown parameters, and $X$ is the vector of explanatory variables.
If the probability that a Juçara individual produces is $P_i$, the probability of non-occurrence is $(1 - P_i)$:

$$1 - P_i = \frac{1}{1 + e^{Z_i}}$$

Thus, dividing the probability of occurrence by that of non-occurrence:

$$\frac{P_i}{1 - P_i} = \frac{1 + e^{Z_i}}{1 + e^{-Z_i}} = e^{Z_i} \qquad (5)$$

Equation (5) represents the odds that a palm produces: the probability that a Juçara individual produces divided by the probability that it does not. Taking the natural logarithm of Equation (5) gives the logit ($L_i$):

$$L_i = \ln\left(\frac{P_i}{1 - P_i}\right) = Z_i = X\beta$$

Therefore, the logarithm of the odds (logit), $L_i$, is a linear function of the explanatory variables and the estimated parameters.
The parameters of the Logit model are estimated by the maximum likelihood (ML) method. The goal of this method is to maximize the likelihood function, that is, to obtain the values of the unknown parameters that maximize the probability of observing the data of the dependent variable present in the input matrix.
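To make the maximum likelihood step concrete, the following sketch fits a logit with an intercept and a single explanatory variable by Newton-Raphson in plain Python. It is only an illustration under stated assumptions: the DBH values and outcomes are invented toy data, the function names are hypothetical, and the study itself used Stata rather than this code.

```python
import math

def logistic(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))

def log_likelihood(b0, b1, xs, ys):
    """Bernoulli log-likelihood of a logit model with Z = b0 + b1 * x."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = logistic(b0 + b1 * x)
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return total

def fit_logit(xs, ys, n_iter=25):
    """Newton-Raphson maximization of the log-likelihood (intercept + one slope)."""
    b0, b1 = 0.0, 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = logistic(b0 + b1 * x)
            w = p * (1 - p)
            g0 += y - p            # gradient with respect to the intercept
            g1 += (y - p) * x      # gradient with respect to the slope
            h00 += w               # entries of the negated Hessian
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: beta += H^{-1} * gradient (2 x 2 inverse written out)
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Toy data (hypothetical): DBH in cm and whether the palm produced bunches
dbh = [5, 7, 9, 10, 12, 13, 15, 16, 18, 20]
produced = [0, 0, 0, 1, 0, 0, 1, 1, 1, 1]

b0, b1 = fit_logit(dbh, produced)
print(f"intercept={b0:.3f}, slope={b1:.3f}")
print(f"log-likelihood={log_likelihood(b0, b1, dbh, produced):.3f}")
```

The toy outcomes overlap across DBH values on purpose: with perfectly separated data the likelihood has no finite maximum and the iteration would not converge.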
The logistic function is expressed in this study as:

$$Z_i = \beta_0 + \beta_1\,\mathrm{DBH}_i + \beta_2\,h_i + \beta_3\,\mathrm{IID}_{1i} + \varepsilon_i$$

where DBH is the diameter at breast height, h is the total height of the Euterpe edulis individual, IID1 is the calculated competition index, and ε is the random error.
The estimated coefficients give the change in the logit for a one-unit change in the independent variable. Positive parameter values mean that an increase in the independent variable increases the probability of occurrence of the phenomenon. Negative values mean an inverse relation, in which an increase in the independent variable decreases the probability of occurrence of the phenomenon. The hypothesis is that the parameters estimated by the Logit model are positive, that is, that increases in diameter, height and competition index values (the latter corresponding to decreased competition) increase the probability of Juçara bunch production. The likelihood ratio (LR) chi-square test was applied to test the null hypothesis that all slope coefficients are simultaneously equal to zero. Similar to the F test for linear regression models, the likelihood ratio statistic follows the χ² distribution, where the number of degrees of freedom is equal to the number of independent variables in the model [22].
Goodness of fit in binary regression models is not assessed with the conventional measure, R². Similar measures, called pseudo R², are employed instead [22]. McFadden's R² ($R^2_{McF}$) and count R² were used in this study.
The model was re-estimated without the non-significant variables, to test whether their removal would affect the fit. The calculations of the present work were carried out using Stata/SE 12.0 software [23].
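The likelihood ratio test described above can be sketched numerically. The snippet below uses the closed-form upper-tail probabilities of the chi-square distribution for 2 and 3 degrees of freedom (standard results, so no external library is needed) and applies them to the two LR statistics reported in this study, 355.15 for the model with three regressors and 353.56 for the model with two. The function name is illustrative.

```python
import math

def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of the chi-square distribution (df = 2 or 3 only)."""
    if df == 2:
        return math.exp(-x / 2)
    if df == 3:
        return (math.erfc(math.sqrt(x / 2))
                + math.sqrt(2 * x / math.pi) * math.exp(-x / 2))
    raise ValueError("closed form implemented only for df = 2 or df = 3")

# LR statistics reported in the study
p_full = chi2_sf(355.15, df=3)      # model with DBH, h and IID1
p_reduced = chi2_sf(353.56, df=2)   # model with DBH and h

# Both p-values are far below alpha = 0.05, so H0 (all slopes equal zero) is rejected
print(p_full < 0.05, p_reduced < 0.05)
```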
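The two pseudo R² measures can be written in a few lines. The sketch below assumes the conventional 0.5 classification cutoff for count R² and uses invented outcomes and fitted probabilities; the function names are illustrative, not taken from the study or from Stata.

```python
def count_r2(y_true, p_hat, cutoff=0.5):
    """Share of observations whose predicted class (p_hat >= cutoff) matches the outcome."""
    hits = sum(int(p >= cutoff) == y for y, p in zip(y_true, p_hat))
    return hits / len(y_true)

def mcfadden_r2(loglik_model, loglik_null):
    """McFadden's pseudo R²: 1 - lnL(model) / lnL(intercept-only model)."""
    return 1 - loglik_model / loglik_null

# Hypothetical outcomes (1 = produced bunches) and fitted probabilities
y = [0, 0, 1, 1, 0, 1]
p = [0.10, 0.40, 0.80, 0.45, 0.20, 0.90]

print(f"count R² = {count_r2(y, p):.3f}")
print(f"McFadden R² = {mcfadden_r2(-50.0, -100.0):.3f}")
```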

Entity Distribution according to Diametric Class and Height Stratification
A total of 809 Euterpe edulis individuals were sampled in the floristic survey, of which 208 were in the reproductive phenophase, with a total of 373 bunches.
The distribution of Juçara individuals in diameter classes showed that production is concentrated in the classes between 15 cm and 20 cm, with 59.62% of productive individuals and 63% of observed bunch production. Table 4 also shows that, with increasing diameter, the proportion of productive individuals in relation to the class total also increases, with all individuals producing in the class between 25 cm and 30 cm DBH. The minimum observed height of sampled Juçara individuals was 2 m and the maximum was 27 m, with a mean of 10.13 ± 5.45 m. The vertical stratification of the forest shows a direct relationship, as observed for DBH, between height and bunch production (Table 5).
The upper stratum holds 25.34% of the Juçara individuals. This stratum concentrates 66.22% of registered bunches and, unlike the middle stratum, has more productive than non-productive individuals.
In the upper stratum, 65.85% of individuals produced bunches, while only 15.70% of individuals in the middle stratum were in the reproductive phenophase.
The highest concentration of individuals was naturally expected in the middle height stratum, owing to the expression that defines the vertical stratification. Yet Euterpe edulis individuals in the upper stratum showed a high percentage of individuals in the reproductive phase, indicating a strong relationship between bunch production and vertical position. This is a valuable contribution to the management of the species for bunch and fruit production.

Juçara bunch production in the Semideciduous Seasonal Forest showed characteristics in this study similar to those found in other research on the species in the Dense Ombrophilous Forest [6] [10] [11] [24]. Studies in seasonal forest should be continued in order to deepen the understanding of the characteristics of the species in this phytophysiognomy, supporting its management.

Model Adjustment
The multiple logistic regression model identified DBH and total height as statistically significant (Table 6). The competition index did not have a statistically significant influence on bunch production.
The chi-square LR value was 355.15, which implies the rejection of the H0 hypothesis. Therefore, the coefficients are jointly significant in explaining the production of bunches in Euterpe edulis. The count R² value, which gives the number of correct predictions in relation to the total number of observations, was 0.8232, indicating that the model predicts 82.32% of observations correctly. The McFadden R² value was 0.3851, indicating that 38.51% of the variation of the dependent variable can be explained by the independent variables of the model. The pseudo R² value, below 40%, does not necessarily indicate an insufficient fit, given the use of dendrometric variables to estimate a non-timber forest product. Moreover, in binary regression models, goodness of fit is of secondary importance. What matters most is the statistical and practical significance of the coefficients of the explanatory variables [22].
The coefficients were positive. This shows that increases in DBH and total height increase the probability of Juçara bunch production. The marginal effect indicates the average change in the probability of occurrence of the phenomenon for a one-unit increase in the analyzed variable, with the other variables held constant. A one-meter increase in height represents, on average, an increase of 2.5 percentage points in the probability of producing bunches. For DBH, an increase of 1 cm represents, on average, a rise of 2.1 percentage points in the production probability of a given individual.
Another useful interpretation, in terms of the odds ratio, is obtained by calculating the antilogarithm of the slope coefficient of each variable [22]. Thus, taking the antilogarithm of the DBH and height coefficients, the values of 1.19 and 1.23 are respectively obtained. With each one-unit increase in DBH, the odds that the individual produces are multiplied by 1.19. For height, a one-unit increase multiplies the odds of production by 1.23. The importance of height in explaining Euterpe edulis bunch production is therefore evident. Height reflects the light incident on the crown of the palm trees. Souza [10], when studying different forms of Juçara management, observed an increase in fruit production and a reduction of inter-annual variation under agroforestry management. The author highlights that the main morpho-climatic characteristic altered in this type of management is the light pattern.
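Because an odds ratio multiplies the odds rather than the probability, its effect on the probability depends on the starting point. The sketch below illustrates this with the odds ratio of 1.23 reported for height; the baseline probability of 20% and the function name are hypothetical.

```python
def apply_odds_ratio(p: float, odds_ratio: float) -> float:
    """Probability obtained after multiplying the odds p / (1 - p) by odds_ratio."""
    odds = p / (1 - p) * odds_ratio
    return odds / (1 + odds)

# Hypothetical palm with a 20% baseline probability of producing bunches
p_before = 0.20
p_after = apply_odds_ratio(p_before, 1.23)   # one additional meter of height

print(f"odds: {p_before / (1 - p_before):.4f} -> {p_after / (1 - p_after):.4f}")
print(f"probability: {p_before:.3f} -> {p_after:.3f}")
```

In this example the odds rise by 23%, while the probability rises by only a few percentage points, which is in line with the distinction between odds ratios and marginal effects made in the text.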
Shading is recognized as an important facilitation mechanism between pioneer and secondary species. However, it may also be a mechanism of inhibition, as species that demand more light for their development suffer from its absence [25]. Forest composition and structure are products of the interaction between several factors, one of which is competition for light, for which each species adopts a different strategy [26].
Plant height is related to the position the individual occupies in the forest canopy, and to the availability of light incident on the crown of the palm tree [11]. The information in Table 5 shows that individuals in the upper stratum produce more bunches.
Light was also recorded as an important variable in natural regeneration studies. In an experiment conducted to evaluate the survival and growth of Euterpe edulis seedlings, Ribeiro et al. [9] obtained higher survival of palms transplanted into clearings than of those transplanted into the forest understory. A nonlinear regression showed a positive relation between canopy opening and seedling survival.
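Removing the competition index IID1, which was not significant, did not change the significance of the remaining variables in the re-estimated model (Table 7). The chi-square LR value for the model with DBH and height was 353.56, rejecting the H0 hypothesis that the coefficients equal zero. The count R² value, 0.8171, indicated that 81.71% of predicted values are the same as observed. The McFadden R² value was 0.3834, indicating that 38.34% of the variation of the dependent variable can be explained by DBH and h. These results demonstrate a low contribution of the competition index tested, given the small decrease in the goodness of fit compared to the first model. DBH is highly correlated with IID1 (r = 0.7917, Table 2), indicating multicollinearity between these variables.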
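The fit statistics reported for the two models can be cross-checked against each other. The sketch below assumes that the null model is an intercept-only logit fitted to the 809 sampled individuals, 208 of them productive, and uses the identity LR = 2 (lnL_model − lnL_null) to recover the McFadden R² implied by each reported LR value. It also applies a nested LR test with one degree of freedom to the removal of IID1. This is a consistency check written for illustration, not the authors' procedure.

```python
import math

n_total, n_productive = 809, 208   # sampled and productive individuals reported in the study

# Log-likelihood of the intercept-only (null) model
p0 = n_productive / n_total
loglik_null = (n_productive * math.log(p0)
               + (n_total - n_productive) * math.log(1 - p0))

def mcfadden_from_lr(lr: float) -> float:
    """McFadden R² implied by LR = 2 * (lnL_model - lnL_null)."""
    return lr / (-2 * loglik_null)

r2_full = mcfadden_from_lr(355.15)      # model with DBH, h and IID1
r2_reduced = mcfadden_from_lr(353.56)   # model with DBH and h

# Nested LR test for dropping IID1 (chi-square upper tail with 1 degree of freedom)
lr_diff = 355.15 - 353.56
p_drop = math.erfc(math.sqrt(lr_diff / 2))

print(f"McFadden R²: full={r2_full:.4f}, reduced={r2_reduced:.4f}")
print(f"p-value for dropping IID1: {p_drop:.3f}")
```

Under these assumptions the implied values agree with the reported McFadden R² of both models, and the nested test gives a p-value above 0.05, in line with the non-significance of the competition index.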
Thus, the prediction model for bunch production in Juçara can be written as:

$$Z_i = -6.878594 + 0.2522454 \times \mathrm{DBH} + 0.1951574 \times h$$

Substituting each individual's Z value into the formula for the logit probability (Equation (4)) gives each individual's probability of producing at least one bunch of fruits. The graphical representation of the production probability for all 809 sampled individuals shows the sigmoid curve characteristic of the cumulative distribution function (Figure 2).
It can be noted that in the re-estimated model, DBH had a greater contribution than height, given the higher value of its coefficient. Larger diameters are associated with individuals that make greater use of available natural resources, such as water and nutrients. Recent studies have demonstrated a strong relationship between the number of infructescences and palm diameter [10] [11].
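The re-estimated model can be applied directly to field measurements. The following sketch evaluates it for a few palms, assuming DBH is given in centimeters and total height in meters, as measured in the survey; the example palms and the function name are hypothetical.

```python
import math

# Coefficients of the re-estimated model reported in the study
B0, B_DBH, B_H = -6.878594, 0.2522454, 0.1951574

def production_probability(dbh_cm: float, height_m: float) -> float:
    """Probability of at least one bunch: P = 1 / (1 + exp(-Z))."""
    z = B0 + B_DBH * dbh_cm + B_H * height_m
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical palms (DBH in cm, total height in m)
for dbh, h in [(8.0, 6.0), (15.0, 12.0), (20.0, 18.0)]:
    print(f"DBH={dbh:4.1f} cm, h={h:4.1f} m -> P={production_probability(dbh, h):.2f}")
```

A palm is predicted to be more likely productive than not when Z is positive, that is, when 0.2522454 × DBH + 0.1951574 × h exceeds 6.878594.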
Despite the significant results from the model, a limitation of the present study is worth noting: Juçara individuals were measured only once. Several studies have reported variation in Euterpe edulis production between years, with alternation of productive individuals and of productivity per plant [10] [11] [27]. Another issue is fruit ripening. Paludo et al. [11] found that 31.3% and 66.2% of the matrices, in years of high and low production, respectively, did not form mature fruits, their production having been aborted or lost prematurely.
Long-term monitoring is necessary to evaluate the behavior of the species in Semideciduous Seasonal Forest and, thus, to continue adjusting the model. Although the competition index tested in the present study was not significant, distance-dependent and distance semi-independent indices should be evaluated, since they may better explain the probability of Juçara bunch production.

Conclusions
The use of a logit regression model demonstrated potential for the prediction of non-timber forest products. It made it possible to identify the variables that contribute most to production, which can be important in determining management practices.
Higher bunch production in the upper stratum (h > 12 m) indicated the

A. L. S. de Abreu et al. DOI: 10.4236/ajps.2017.811193. American Journal of Plant Sciences


Table 2. Correlation between dendrometric variables, number of bunches and competition indices.

Table 3. Height stratification of sampled forest stands.

Table 4. Diameter distribution of Juçara individuals and number of bunches per diameter class.

Table 5. Euterpe edulis distribution and bunch production.

Table 6. Logit results for the estimation of Euterpe edulis production probability.

Table 7. Logit results for the assessment of production probability of Euterpe edulis with independent variables DBH and h.