Optimizing Forest Sampling by Using Lagrange Multipliers

In two-phase sampling, or double sampling, from a population of size N we take one relatively large sample of size n. From this sample we take a small sub-sample of size m, which usually costs more per sampling unit than the first phase. In double sampling with regression estimators, the first-phase sample n is used to estimate the mean of an auxiliary variable X, which should be strongly related to the main variable Y (which is estimated from the sub-sample m). Sampling optimization can be achieved by minimizing the cost C for a fixed var(Ŷ), or by finding the minimum var(Ŷ) for a fixed C. In this paper we optimize sampling with the use of Lagrange multipliers, either by minimizing the variance of Y under a predetermined cost, or by minimizing the cost under a predetermined variance of Y.


Introduction
All decision-making requires information. In forestry, this information is acquired by means of forest inventories, systems for measuring the extent, quantity and condition of forests [1]. More specifically, the purpose of forest inventories is to estimate means and totals for measures of forest characteristics over a defined area. Such characteristics include the volume of the growing stock, the area of a certain type of forest and, nowadays, also measures concerned with forest biodiversity, e.g. the volume of dead wood or vegetation.
The main method used in forest inventories in the 19th century was complete enumeration, but it was soon noted that costs could be reduced by using representative samples [2]. Sampling-based methods were used in forestry a century before the mathematical foundations of sampling techniques were described [3][4][5][6][7][8][9]. In this paper we attempt to optimize sampling with the use of Lagrange multipliers, either by minimizing the variance of the forest variable of interest under a fixed cost, or by minimizing the cost under a fixed variance of the variable in question.

The Method of Lagrange Multipliers
Lagrange multipliers is a method of evaluating maxima or minima of a function of possibly several variables, subject to one or more constraints [10]. The method, which is due to Joseph Louis de Lagrange (1736-1813), is used to optimize a real-valued function f(x_1, x_2, …, x_n), where x_1, x_2, …, x_n are subject to m (< n) equality constraints of the form

g_i(x_1, x_2, …, x_n) = 0, i = 1, 2, …, m, (1)

where g_1, g_2, …, g_m are differentiable functions.
The determination of the stationary points in this constrained optimization problem is done by first considering the function

F(x, λ) = f(x) − Σ_{i=1}^{m} λ_i g_i(x), (2)

where x = (x_1, x_2, …, x_n) and λ_1, λ_2, …, λ_m are scalars called Lagrange multipliers. By differentiating (2) with respect to x_1, x_2, …, x_n and equating the partial derivatives to zero we obtain

∂f/∂x_j − Σ_{i=1}^{m} λ_i ∂g_i/∂x_j = 0, j = 1, 2, …, n. (3)
Equations (1) and (3) constitute a system of m + n equations in the m + n unknowns x_1, x_2, …, x_n and λ_1, λ_2, …, λ_m. The solutions for x_1, x_2, …, x_n determine the locations of the stationary points. The following argument explains why this is the case.
Suppose that in Equation (1) we can solve for m of the x_i's, for example x_1, x_2, …, x_m, in terms of the remaining n − m variables. By the Implicit Function Theorem (see Appendix 1), this is possible whenever the Jacobian determinant

det[∂g_k/∂x_i] ≠ 0, k, i = 1, 2, …, m. (4)

In this case we can write

x_i = h_i(x_{m+1}, x_{m+2}, …, x_n), i = 1, 2, …, m. (5)

Thus f(x) is a function of only n − m variables, namely x_{m+1}, x_{m+2}, …, x_n. If the partial derivatives of f with respect to these variables exist and if f has a local optimum, then these partial derivatives must necessarily vanish, that is,

∂f/∂x_j + Σ_{i=1}^{m} (∂f/∂x_i)(∂h_i/∂x_j) = 0, j = m + 1, m + 2, …, n. (6)

Now, if Equations (5) are used to substitute h_1, h_2, …, h_m for x_1, x_2, …, x_m, respectively, in Equation (1), then we obtain the identities g_k(h_1, …, h_m, x_{m+1}, …, x_n) ≡ 0, k = 1, 2, …, m. By differentiating these identities with respect to x_{m+1}, x_{m+2}, …, x_n we obtain

∂g_k/∂x_j + Σ_{i=1}^{m} (∂g_k/∂x_i)(∂h_i/∂x_j) = 0, k = 1, 2, …, m; j = m + 1, m + 2, …, n. (7)

Let us now define the vectors u = (∂f/∂x_1, …, ∂f/∂x_m)′ and v = (∂f/∂x_{m+1}, …, ∂f/∂x_n)′, and the matrices B = [∂g_k/∂x_i] (m × m), C = [∂g_k/∂x_j] (m × (n − m)) and H = [∂h_i/∂x_j] (m × (n − m)). Equations (6) and (7) can then be written as

v + H′u = 0, (8)

C + BH = 0. (9)

Since B is nonsingular by (4), Equation (9) gives H = −B⁻¹C. By making this substitution in Equation (8) we obtain

v − C′(B⁻¹)′u = 0. (10)

If we now define the vector of Lagrange multipliers λ = (λ_1, λ_2, …, λ_m)′ by λ = (B⁻¹)′u, Equation (10) can be expressed as

∂f/∂x_j − Σ_{i=1}^{m} λ_i ∂g_i/∂x_j = 0, j = m + 1, m + 2, …, n. (11)

From the definition of λ we also have u = B′λ, that is,

∂f/∂x_j − Σ_{i=1}^{m} λ_i ∂g_i/∂x_j = 0, j = 1, 2, …, m. (12)

Equations (11) and (12) can now be combined into a single vector equation of the form ∇f − Σ_{i=1}^{m} λ_i ∇g_i = 0, which is the same as Equation (3). We conclude that at a stationary point of f, the values of x_1, x_2, …, x_n and the corresponding values of λ_1, λ_2, …, λ_m must satisfy Equations (1) and (3).
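As a concrete check of the method, the small symbolic sketch below forms the Lagrange function (2) and solves Equations (1) and (3). The objective and constraint are illustrative choices, not taken from the paper:

```python
import sympy as sp

# Variables and one Lagrange multiplier (m = 1 constraint, n = 2 variables).
x, y, lam = sp.symbols("x y lam", real=True)

# Hypothetical objective f and equality constraint g = 0.
f = x**2 + 2 * y**2          # function to optimize
g = x + y - 3                # constraint x + y = 3

# Lagrange function F = f - lam * g; stationary points satisfy
# dF/dx = dF/dy = 0 together with g = 0 (Equations (1) and (3)).
F = f - lam * g
sols = sp.solve([sp.diff(F, x), sp.diff(F, y), g], [x, y, lam], dict=True)
print(sols)  # → [{x: 2, y: 1, lam: 4}]
```

The single stationary point x = 2, y = 1 is the constrained minimum, and λ = 4 is the rate at which the optimal value of f changes as the constraint level is relaxed.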

Lagrange Multipliers in Sampling Optimization
In two-phase sampling, or double sampling, from a population of size N we take one relatively large sample of size n. From this sample we take a small sub-sample of size m, which usually costs more per sampling unit than the first one. In double sampling with regression estimators, the first-phase sample n is used to estimate the mean X̄_1 of an auxiliary variable X, which should be strongly related to the main variable Y.
In the sub-sample m both the auxiliary variable X and the main variable Y are measured, in order to estimate their means X̄_2 and Ȳ_2, respectively. The regression estimator Ŷ and its estimated variance var(Ŷ) are ([7,11-13]):

Ŷ = Ȳ_2 + b(X̄_1 − X̄_2), (14)

var(Ŷ) = s_Ym²(1 − r²)/m + s_Ym² r²/n, (15)

where b is the regression coefficient of Y on X, s_Ym² is the variance of Y in the sub-sample m, and r is the estimated correlation coefficient between X and Y. If c_1 and c_2 are the costs per sampling unit in the first and second phases, respectively, the total cost is

C = c_1 n + c_2 m. (16)

Minimizing the cost C for fixed var(Ŷ), we construct the Lagrange function from (15) and (16); solving the system of Equations (14), (15) and (16) we find n, m and λ. The reverse problem, viz. finding the minimum var(Ŷ) for fixed C, is solved in a similar way.
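The estimator and its variance can be computed directly from sample data. The sketch below assumes the double-sampling variance approximation var(Ŷ) = s_Ym²(1 − r²)/m + s_Ym² r²/n and simple random sampling at both phases; the function name and data are illustrative:

```python
import numpy as np

def double_sampling_regression(x1, x2, y2, n):
    """Regression estimator of the population mean in double sampling.

    x1     : auxiliary values from the large first-phase sample (size n)
    x2, y2 : auxiliary and main values from the sub-sample (size m)
    """
    m = len(y2)
    xbar1, xbar2, ybar2 = np.mean(x1), np.mean(x2), np.mean(y2)
    b = np.cov(x2, y2, ddof=1)[0, 1] / np.var(x2, ddof=1)  # regression slope
    r = np.corrcoef(x2, y2)[0, 1]        # correlation in the sub-sample
    s2y = np.var(y2, ddof=1)             # variance of Y in the sub-sample
    y_hat = ybar2 + b * (xbar1 - xbar2)  # regression estimator (14)
    var_y = s2y * (1 - r**2) / m + s2y * r**2 / n  # its variance (15)
    return y_hat, var_y
```

For example, with perfectly correlated sub-sample data (r = 1) the first variance term vanishes and var(Ŷ) reduces to s_Ym²/n, as the formula implies.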
In order to explain how Lagrange multipliers work, we describe the following example. Assume that the total cost C of an industry producing two products x and y is given by an approximate cost function C(x, y), and that total production is limited to 20 units, that is, x + y = 20. We construct the Lagrange function F(x, y, λ) = C(x, y) − λ(x + y − 20) and equate its partial derivatives with respect to x, y and λ to zero, which yields a system of three equations (a), (b) and (c). By solving this system we find that x = 13, y = 7 and λ = −71, with a total cost of 710 monetary units. The economic meaning of λ is this: λ is the reduction of the total cost, at the limit, if production were 19 instead of 20 units. In other words, if we required 19 total production units, the total cost would be reduced by 71 monetary units (710 − 71 = 639). Generally, λ represents the marginal effect on the cost function when the production limit is increased by one unit.

Returning to double sampling, we assume an approximately normal distribution of Ŷ, so that the 95% confidence interval for Ȳ would be Ŷ ± 1.96 √var(Ŷ). Now we must choose n and m in such a way that half the confidence interval does not exceed a value D, fixed a priori, where D may also be expressed as a fraction (E) of Ȳ. To this end we construct the Lagrange function

F = c_1 n + c_2 m + λ(var(Ŷ) − (D/1.96)²),

minimizing the total cost subject to the required precision.
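For the dual problem of minimizing var(Ŷ) under a fixed budget C = c_1 n + c_2 m, the stationarity conditions of the Lagrange function yield a closed-form allocation, m/n = √((1 − r²)c_1 / (r² c_2)). A minimal numerical sketch follows; the variance, correlation, unit costs and budget are hypothetical values, not taken from the paper:

```python
import math

def optimal_allocation(s2, r, c1, c2, C):
    """Sizes n (first phase) and m (sub-sample) minimizing
    var = s2*(1 - r**2)/m + s2*r**2/n  subject to  C = c1*n + c2*m.
    The Lagrange conditions give m/n = sqrt((1 - r**2)*c1 / (r**2*c2)).
    """
    ratio = math.sqrt((1 - r**2) * c1 / (r**2 * c2))  # optimal m/n
    n = C / (c1 + c2 * ratio)                          # spend the whole budget
    m = ratio * n
    return n, m

# Hypothetical inputs: cheap first phase (c1 = 1), expensive sub-sample
# (c2 = 10), budget C = 1000, strong auxiliary correlation r = 0.9.
n, m = optimal_allocation(s2=25.0, r=0.9, c1=1.0, c2=10.0, C=1000.0)
```

Rounding n and m to integers in practice changes the attained variance only marginally, since the variance surface is flat near the optimum.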

Other Uses of Lagrange Multipliers in Forest Inventories
If there are not enough sample plots to give sufficiently good inventory results using only forest measurements, we may try to make use of auxiliary variables correlated with forest variables. The most obvious way is to use ratio or regression estimators (see Appendix 2). The calibration estimator of Deville and Särndal [14] is an extension of the regression estimator for obtaining population totals using auxiliary information. Both regression and calibration estimators can be employed if there are auxiliary variables known for the inventory sample plots for which the population totals are also known, e.g. variables obtained from remote sensing or from GIS systems. The appeal of calibration estimators for forest inventories comes from the fact that they lead to estimators which are weighted sums of the sample plot variables, where the weight can be interpreted as the area of forest in the population that is similar to the sample plot.
The basic features of the calibration estimator of Deville and Särndal [14], in terms of estimating means, can be described as follows. Consider a finite population U consisting of N units, and let j denote a general unit, thus j ∈ U = {1, 2, …, N}. In a forest inventory the population is a region where the units are pixels or potential sample plots. The units will be referred to here as pixels, and it will be assumed that an inventory sample plot gives values to the forest variables for an associated pixel. Each unit j is associated with a variable y_j and a vector of auxiliary variables x_j. The population mean of x, x̄, is assumed to be known. The y variables in a forest inventory are forest variables, and the x variables can be spectral variables from remote sensing or geographical or climatic variables obtained from GIS databases.
Assume that a probability sample S is drawn, and that y_j and x_j are observed for each j in S, the objective being to estimate the mean of y, ȳ. Let π_j be the inclusion probability of unit j and d_j the basic sampling design weight. The calibration estimator is

ŷ_w = Σ_{j∈S} w_j y_j, (17)

where the posterior weights w_j minimize a positive distance function G between the prior weights d_j and the posterior weights w_j, taking account of the calibration equation

Σ_{j∈S} w_j x_j = x̄. (18)

If the distance between d_j and w_j is defined as the chi-square distance, the calibration estimator will be the same as the regression estimator

ŷ_r = ŷ_d + b′(x̄ − x̂_d), (19)

where ŷ_d and x̂_d are the design-weighted sample means of y and x, and b (a weighted regression coefficient vector) is

b = (Σ_{j∈S} d_j x_j x_j′)⁻¹ Σ_{j∈S} d_j x_j y_j. (20)

If the model contains an intercept, the corresponding variable x will be one for all observations, and the calibration equation (18) will then guarantee that the weights w_j add up to one. This means that when estimating totals, the weights Nw_j will add up to the number of pixels in the population. Thus the calibration weight Nw_j can be interpreted as the total area, in pixel units, of forest similar to plot j. Standard least squares theory implies that the regression estimator (19) can be expressed in the weighted form (17), giving estimator (21). It is assumed that the intercept is always among the parameters. Estimator (21) is defined if the moment matrix Σ_{j∈S} d_j x_j x_j′ is nonsingular. Some of the weights w_j in (17) may be negative. Since the calibration estimator is asymptotically equivalent to the regression estimator, Deville and Särndal [14] suggest that the variance of the calibration estimator should be computed in the same way as the variance of the regression estimator, using regression residuals. There is no design-unbiased variance estimator for systematic sampling [7]. The emphasis on the area interpretation of the weights has the same argument behind it as was used by Moeur and Stage [14] for the most similar neighbour (MSN) method, where unknown plot variables are taken from a plot which is as similar as possible with respect to the known plot variables.
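The chi-square-distance case can be sketched numerically: with an intercept among the auxiliary variables, the calibrated weights reproduce the known mean vector x̄ and add up to one. All data below are simulated for illustration; the closed-form weights w_j = d_j(1 + (x̄ − x̂_d)′T⁻¹x_j), with T = Σ d_j x_j x_j′, are the standard solution for the chi-square distance:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sample of 20 units with equal design weights d_j summing
# to one, and auxiliary vector x_j = (1, z_j), i.e. with an intercept.
S = 20
z = rng.uniform(0.0, 10.0, S)
y = 2.0 + 3.0 * z + rng.normal(0.0, 1.0, S)
X = np.column_stack([np.ones(S), z])   # intercept column plus one covariate
d = np.full(S, 1.0 / S)                # prior (design) weights

xbar_known = np.array([1.0, 5.0])      # assumed known population means of (1, z)

# Chi-square-distance calibration weights:
# w_j = d_j * (1 + (xbar - sum d_j x_j)' T^{-1} x_j),  T = sum d_j x_j x_j'
T = X.T @ (d[:, None] * X)
xhat_d = X.T @ d
w = d * (1 + (xbar_known - xhat_d) @ np.linalg.solve(T, X.T))

print(X.T @ w)   # reproduces xbar_known (calibration equation (18))
print(w.sum())   # adds up to one, thanks to the intercept column
```

The calibrated mean Σ w_j y_j then equals the regression estimator (19) for the same data, which is the equivalence stated above.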
In both methods each sample plot represents a percentage of the total area, and all the forest variables are logically related to each other. The difference is that with the calibration estimator we obtain an estimate of the area of the sample plot for the whole population, whereas in the MSN method each pixel is associated with a sample plot. Since there is no straightforward way of showing that the MSN method produces optimal results in any way at the population level, it may be safer to use the calibration estimator for computing population-level estimates for forest variables. The problem with the calibration estimator is that it does not provide a map. If a map is needed, then the weights provided by the calibration estimator need to be distributed over pixels using separate after-processing.
Lappi [15] proposed a small-area modification of the calibration estimator which can be used when several subpopulation totals are required simultaneously. He used satellite data as auxiliary information for computing inventory results for counties. Sample plots in the surrounding inclusion zone are also used for each subpopulation, with the prior weight decreasing as distance increases. The error variance is computed using a spatial variogram model. Block kriging [16] provides an optimal estimator for subpopulation totals under such a model, but kriging can produce negative weights for sample plots, and the weights are different for each y variable. Thus it is not possible to give areal interpretations to sample plot weights in kriging.