Normal and Bootstrap Confidence Intervals in Bitterlich Sampling

Bitterlich sampling (horizontal point sampling) is a common method in forest inventories. With this method, the Horvitz-Thompson estimator is applied at a number of independent sampling points to estimate the total tree volume of a forest area or stand. In this paper, confidence intervals are constructed and evaluated using the normal approach and two bootstrap methods: the percentile method (Cα) and the bias-corrected and accelerated method (BCα). The simulation results show that the normal confidence interval has better coverage of the true value at sample size 10. At sample sizes 20 and 30, there are no substantial differences in coverage among the confidence intervals, although the BCα method shows a small advantage. At sample size 40, the coverage of all three confidence intervals is higher than the nominal coverage (95%).


Introduction
Sampling in forest inventories is usually done by locating random points on the ground and selecting a group of trees around each point. Trees are generally selected using one of the two best-known forest sampling methods: fixed-area plot sampling and Bitterlich sampling (BS), also called horizontal point sampling.
In fixed-area plot sampling, a plot of fixed shape and size is centered at each point and forms the basic sampling unit, within which all trees are measured (Kershaw Jr., Ducey, Beers, & Husch, 2016; Matis, 2004). In BS, tree j is selected into the sample if the random point i lies within a distance c r_j of the tree, where r_j is the radius of the tree's circular cross-section at 1.30 m above the ground (the cross-section whose area is the basal area) and c is a constant chosen to achieve a desired sampling density (Gregoire & Valentine, 2007; Roesch, Green, & Scott, 1993). With this method, the probability of selecting a tree is proportional to its basal area. The Horvitz-Thompson estimator can be used to estimate parameters such as the total volume of the forest area (Horvitz & Thompson, 1952; Schreuder, Gregoire, & Wood, 1993). The distribution of total estimates from sampling with probability proportional to size is unknown (Hájek, 1981); therefore, confidence intervals based on the normal distribution may not be accurate. In forestry, many sampling designs with probability proportional to size (prediction) have a small sample size, which raises the question of how accurate and consistent the estimated confidence intervals can be in these cases (Magnussen, 2001). This situation also arises frequently in small-scale forest management, where for economic reasons only a small number of fixed-area plots or Bitterlich sampling points is selected. A simple application of the bootstrap method gives reliable variance estimates for all the regression estimators that have been used, as well as for the Horvitz-Thompson estimator under BS (Schreuder, Ouyang, & Williams, 1992). For small sample sizes, however, confidence intervals estimated with bootstrap methods did not behave well (Schreuder & Williams, 2000). For nearest neighbor techniques, parametric, bootstrap and jackknife variance estimators produced comparable results (McRoberts, Magnussen, Tomppo, & Chirici, 2011).
Recent research (Lyons, Keith, Phinn, Mason, & Elith, 2018) reveals that a resampling procedure provided accurate estimates of error for remote sensing classification and accuracy assessment. In general, there seem to be no published results on the evaluation of confidence intervals under BS with bootstrap methods.
The purpose of this research is to evaluate confidence intervals constructed with the Horvitz-Thompson estimator under BS, using bootstrap methods with small sample sizes. The results should be of practical value because the data come from a productive forest ecosystem.
In the next section, BS is described somewhat more extensively, since the method is little known outside the study of forest ecosystems. In addition, the methods of constructing and evaluating confidence intervals are given and the dataset acquisition is described. In Section 3, the results are given and discussed, while conclusions are drawn in Section 4.

Methods and Data
BS can be described in various ways (Eriksson, 1995). The method can be applied (De Vries, 1986; Overton & Stehman, 1995) as follows. A sample of n points is placed, in a simple random or systematic way, on the forest area for which the characteristic Y is to be estimated. From each sample point, all surrounding trees are sighted at 1.30 m above the ground (breast height) through a fixed horizontal angle α projected by an instrument (e.g. a relascope), while making a complete (360°) rotation around the point. Trees whose diameter at breast height appears wider than the projected angle α are included in the sample. If the diameter appears exactly equal to the projection of the angle α, there are procedures for deciding whether such borderline trees belong to the sample (De Vries, 1986; Kershaw Jr. et al., 2016). If $y_j$ is the stem volume of the j-th tree, then the volume of all M trees of the forest area is given by

$$Y = \sum_{j=1}^{M} y_j.$$

The Horvitz-Thompson estimator of Y (De Vries, 1986; Schreuder et al., 1993) at the i-th sampling point is given by

$$\hat{Y}_i = \sum_{j=1}^{m_i} \frac{y_j}{\pi_{ij}} = F A \sum_{j=1}^{m_i} \frac{y_j}{g_{ij}},$$

where F is the criterion of tree selection (Basal Area Factor, BAF), A is the area of the forest, $g_{ij}$ is the basal area (the area of the cross-section at breast height) of tree j, and $m_i$ is the number of trees selected at sampling point i. The probability of selection, $\pi_{ij} = g_{ij} / (F A)$, depends on the basal area of the tree, and therefore trees of larger volume have a greater probability of being selected in the sample.
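The point-level estimator described above can be illustrated with a minimal Python sketch. The function names, the tally values, and the units (F in m²/ha, A in ha, basal area in m²) are assumptions made for illustration only and are not taken from the study.

```python
import math

def basal_area_m2(dbh_cm):
    """Basal area (m^2) of a tree from its diameter at breast height (cm)."""
    return math.pi * (dbh_cm / 200.0) ** 2

def ht_point_estimate(volumes_m3, dbh_cm, baf, area_ha):
    """Horvitz-Thompson estimate of total volume from one sampling point.

    Each tallied tree j has inclusion probability pi_j = g_j / (F * A),
    so the estimate is sum(y_j / pi_j) = F * A * sum(y_j / g_j).
    Assumed units: baf in m^2/ha, area_ha in ha, g_j in m^2.
    """
    total = 0.0
    for y, d in zip(volumes_m3, dbh_cm):
        g = basal_area_m2(d)
        pi_j = g / (baf * area_ha)  # inclusion probability of this tree
        total += y / pi_j
    return total

# Hypothetical tally: three trees selected at one point (made-up values).
volumes = [0.50, 1.20, 0.85]  # stem volumes, m^3
dbhs = [25.0, 38.0, 31.0]     # diameters at breast height, cm
estimate = ht_point_estimate(volumes, dbhs, baf=2.0, area_ha=10.0)
print(round(estimate, 1))
```

Because each tree is weighted by the inverse of its basal area, a tree whose volume were numerically equal to its basal area would contribute exactly F·A to the estimate, regardless of its size.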
Although BS has many attractive features, the sample of trees selected at a single sampling point is a cluster of adjacent trees, so that their values are correlated (Overton & Stehman, 1995). Better estimates of the characteristics of the forest area are obtained by taking n independent points. Then, the estimate of Y is given as

$$\hat{Y} = \frac{1}{n} \sum_{i=1}^{n} \hat{Y}_i.$$

The variance of each point estimate depends on the joint inclusion probabilities $\pi_{jj'}$, where $\pi_{jj'}$ is the probability of both trees j and j′ being included in the sample. An unbiased variance estimator (Palley & Horwitz, 1961; Schreuder et al., 1993) is given by the formula

$$v(\hat{Y}) = \frac{1}{n(n-1)} \sum_{i=1}^{n} \left(\hat{Y}_i - \hat{Y}\right)^2.$$

The variance, as well as the Ŷ estimates, can be easily derived (Schreuder et al., 1993), either by considering BS as a special case of sampling with probability proportional to size, where the number of trees is a random variable (Palley & Horwitz, 1961), or by considering it as a simple random sample of n from N clusters in the population (Schreuder, 1970).
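As a rough illustration of how the n point estimates are combined, the following Python sketch averages them and computes a between-point variance estimate of the form Σ(Ŷ_i − Ŷ)² / (n(n − 1)). The function name and the input values are hypothetical, and the between-point form is assumed here as the variance estimator for independent points.

```python
import math

def combine_points(point_estimates):
    """Combine independent point estimates of the total.

    Returns (mean of the point estimates, estimated variance of that mean,
    estimated standard error). The variance uses the between-point form
    sum((Y_i - Y_bar)^2) / (n * (n - 1)).
    """
    n = len(point_estimates)
    if n < 2:
        raise ValueError("at least two sampling points are needed")
    y_bar = sum(point_estimates) / n
    var = sum((y - y_bar) ** 2 for y in point_estimates) / (n * (n - 1))
    return y_bar, var, math.sqrt(var)

# Hypothetical point estimates of total volume (made-up values, m^3).
points = [820.0, 640.0, 1150.0, 930.0, 700.0]
y_hat, var_hat, se_hat = combine_points(points)
print(y_hat, round(se_hat, 2))
```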
One normal and two bootstrap confidence intervals were estimated (Efron, 1982; Efron & Tibshirani, 1993). The bootstrap intervals were calculated with the percentile method (Cα) and the bias-corrected and accelerated method (BCα).
Assuming that Ŷ is normally distributed, a (1 − α) 100% confidence interval for Y is given by

$$\hat{Y} \pm z_{\alpha/2}\, se(\hat{Y}),$$

where $z_{\alpha/2}$ is the upper α/2 quantile of the standard normal distribution and $se(\hat{Y})$ is the estimated standard error.
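A minimal Python sketch of the normal interval follows. The function name and the numerical inputs are hypothetical; the quantile is taken from the standard library's `statistics.NormalDist`.

```python
from statistics import NormalDist

def normal_ci(y_hat, se, alpha=0.05):
    """Normal-theory interval: y_hat -/+ z_{alpha/2} * se."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # upper alpha/2 quantile
    return y_hat - z * se, y_hat + z * se

# Hypothetical estimate and standard error (made-up values).
lower, upper = normal_ci(y_hat=848.0, se=90.0)
print(round(lower, 1), round(upper, 1))
```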
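The (1 − α) 100% confidence interval with the percentile method, Cα, is given by

$$\left[\hat{Y}^{*}_{(\alpha/2)},\ \hat{Y}^{*}_{(1-\alpha/2)}\right],$$

where $\hat{Y}^{*}_{(\alpha/2)}$ and $\hat{Y}^{*}_{(1-\alpha/2)}$ are the 100α/2 and 100(1 − α/2) percentiles, respectively, of the bootstrap distribution. With BCα, the Cα interval is corrected for bias and skewness. Thus, the corresponding interval with BCα is given by

$$\left[\hat{Y}^{*}_{(\alpha_1)},\ \hat{Y}^{*}_{(\alpha_2)}\right],$$

where

$$\alpha_1 = \Phi\!\left(z_0 + \frac{z_0 - z_{\alpha/2}}{1 - a\,(z_0 - z_{\alpha/2})}\right) \tag{1}$$

$$\alpha_2 = \Phi\!\left(z_0 + \frac{z_0 + z_{\alpha/2}}{1 - a\,(z_0 + z_{\alpha/2})}\right) \tag{2}$$

In Equations (1) and (2), Φ(·) is the standard normal cumulative distribution function, $z_{\alpha/2}$ is the upper α/2 quantile of the standard normal distribution, and $z_0$ and $a$ are the coefficients for bias and acceleration.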
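The two bootstrap intervals can be sketched in Python as follows. This is an illustration under stated assumptions, not the study's implementation: the statistic is taken to be the mean of the point estimates, the acceleration coefficient is computed from jackknife values in the manner of Efron & Tibshirani (1993), empirical quantiles use simple linear interpolation, and the data are made up.

```python
import random
from statistics import NormalDist, mean

def bootstrap_means(data, n_boot, rng):
    """Sorted bootstrap replicates of the mean (resampling with replacement)."""
    n = len(data)
    return sorted(mean(rng.choices(data, k=n)) for _ in range(n_boot))

def quantile(sorted_vals, p):
    """Empirical quantile by linear interpolation between order statistics."""
    pos = p * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])

def percentile_ci(boot, alpha=0.05):
    """Percentile interval: alpha/2 and 1 - alpha/2 bootstrap quantiles."""
    return quantile(boot, alpha / 2), quantile(boot, 1 - alpha / 2)

def bca_ci(data, boot, alpha=0.05):
    """Bias-corrected and accelerated interval for the mean."""
    nd = NormalDist()
    theta = mean(data)
    b = len(boot)
    # Bias correction z0: normal quantile of the share of replicates below
    # the estimate, clipped so that inv_cdf stays finite.
    prop = sum(v < theta for v in boot) / b
    prop = min(max(prop, 1 / (b + 1)), b / (b + 1))
    z0 = nd.inv_cdf(prop)
    # Acceleration a from leave-one-out (jackknife) means.
    n, total = len(data), sum(data)
    jack = [(total - x) / (n - 1) for x in data]
    jbar = mean(jack)
    den = 6.0 * sum((jbar - j) ** 2 for j in jack) ** 1.5
    a = sum((jbar - j) ** 3 for j in jack) / den if den > 0 else 0.0
    z = nd.inv_cdf(1 - alpha / 2)
    a1 = nd.cdf(z0 + (z0 - z) / (1 - a * (z0 - z)))  # Equation (1)
    a2 = nd.cdf(z0 + (z0 + z) / (1 - a * (z0 + z)))  # Equation (2)
    return quantile(boot, a1), quantile(boot, a2)

# Hypothetical point estimates for n = 10 sampling points (made-up values).
data = [820, 640, 1150, 930, 700, 1480, 560, 890, 1020, 760]
rng = random.Random(1)  # fixed seed so the sketch is reproducible
boot = bootstrap_means(data, n_boot=1500, rng=rng)
print("percentile:", percentile_ci(boot))
print("BCa:       ", bca_ci(data, boot))
```

When z0 and a are both zero, α1 and α2 reduce to α/2 and 1 − α/2, so the BCα interval coincides with the percentile interval.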
Finally, the following quantities were calculated: the percentage of confidence intervals that cover the parameter Y, the percentages of miscoverage of Y on each side, the average width of the confidence intervals, and the coefficient of variation of the confidence interval widths.
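The evaluation measures listed above can be computed from a collection of simulated intervals, as in the following Python sketch. The function name and the example intervals are hypothetical. Which side counts as a "left" failure is an assumption made here: a left failure is recorded when the true value lies below the lower limit, and a right failure when it lies above the upper limit.

```python
from statistics import mean, stdev

def evaluate_intervals(intervals, true_value):
    """Summarize coverage and width of a list of (lower, upper) intervals."""
    n = len(intervals)
    # Assumed convention: "left" means the true value falls below the interval.
    left = sum(true_value < lo for lo, hi in intervals)
    right = sum(true_value > hi for lo, hi in intervals)
    widths = [hi - lo for lo, hi in intervals]
    return {
        "Lfcov": 100.0 * left / n,                # % failures on the left
        "Rfcov": 100.0 * right / n,               # % failures on the right
        "Tcov": 100.0 * (n - left - right) / n,   # % total coverage
        "Width": mean(widths),                    # average interval width
        "CVwidth": 100.0 * stdev(widths) / mean(widths),  # CV of widths, %
    }

# Hypothetical intervals from four simulation runs (made-up values).
runs = [(0.0, 10.0), (5.0, 15.0), (12.0, 20.0), (-5.0, 3.0)]
summary = evaluate_intervals(runs, true_value=8.0)
print(summary)
```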

Results and Discussion
The results of the experiment are presented in Table 1. For all confidence intervals, the total coverage of the true population value increases with the sample size. The opposite holds for the percentages of coverage failure of the true value, which decrease steadily.

Table 1. Percentage (%) coverage failure of the true value Y, from the left (Lfcov) and from the right (Rfcov), % total coverage (Tcov), width of confidence intervals (Width) and coefficient of variation of the widths (CVwidth) as a function of the sample size (1 − α = 0.95).

The widths of the confidence intervals, as well as their variability, decrease as the sample size increases for all estimated confidence intervals. Thus, the number of simulation iterations (5000) and the number of bootstrap resamples (1500) appear to be sufficient, ensuring consistency for all the estimates. At sample size 10, the overall coverage of the normal confidence interval is better (92.8%) than the coverage of both bootstrap methods (90.95% and 91.65%), but the width of the normal interval (1730) is greater than the widths of the bootstrap intervals (1653 and 1681). At sample sizes 20, 30 and 40, no substantial differences are observed in the overall coverage of the three confidence intervals, with the BCα method having slightly better coverage rates. The same is true for the confidence interval widths, although here they are slightly smaller for the normal confidence interval. The variability of the confidence interval widths is approximately the same (15.40, 15.46, 15.83). The nominal 95% coverage appears to be reached between sample sizes 30 and 40, since at size 40 all three confidence intervals exceed it. Comparing the two bootstrap methods, BCα has slightly better coverage up to sample size 30 and, correspondingly, slightly larger confidence interval widths.
With the t-approach, the coverage probability reached 94.5% for sample size 10, but at the same time the confidence interval width increased considerably (by 21.35%). The z-approach was therefore preferred, in order to keep the differences in confidence interval widths below 5%. A result of Zhou & Dinh (2005) for the sample mean shows that if $\hat{\gamma}/\sqrt{n} < 0.3$, where $\hat{\gamma}$ is the skewness of the sample, the confidence interval based on the t-approximation is good enough. This study found $\hat{\gamma}/\sqrt{n} < 0.15$ with $\hat{\gamma} = 0.46$, so the result could be verified by considering BS as a simple random sample of n from N clusters of the population. The bootstrap confidence intervals were not well behaved for sample size 10, which agrees with a related result by Schreuder & Williams (2000) for small sample sizes, although for different variables of the forest stand.
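The skewness criterion can be checked with simple arithmetic. The short Python sketch below evaluates γ̂/√n for the four sample sizes considered in the study, using the reported skewness of 0.46; the variable names are illustrative only.

```python
import math

gamma_hat = 0.46             # sample skewness reported in the text
sample_sizes = (10, 20, 30, 40)

# Ratio gamma_hat / sqrt(n), compared with the 0.3 threshold quoted above.
ratios = {n: gamma_hat / math.sqrt(n) for n in sample_sizes}
for n, ratio in ratios.items():
    print(n, round(ratio, 3), ratio < 0.3)
```

For n = 10 the ratio is about 0.145, and it decreases as n grows, so every sample size considered stays below both 0.15 and the 0.3 threshold.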

Conclusion
In conclusion, all three methods of constructing confidence intervals come close to the nominal coverage at sample size 30, while providing satisfactory coverage (>93%) at sample size 20. The normal confidence interval still has satisfactory coverage at sample size 10, whereas the bootstrap methods do not seem to perform well at that sample size. The results come from a particular forest ecosystem with a clustered spatial distribution of trees and continuous management. Further research on forest ecosystems with different structures is therefore needed, in order to better evaluate these confidence intervals as well as other types of confidence intervals suggested in the literature.