Nitrogen rate trials are often performed to determine the economically optimum N application rate. For this purpose, the yield is modeled as a function of the N application. The regression analysis provides an estimate of the modeled function and thus also an estimate of the economic optimum, N_{opt}. Quantifying the accuracy of such estimates by confidence intervals for N_{opt} is subject to the model assumptions. The dependence on these assumptions is a further source of inaccuracy. The N_{opt} estimate also strongly depends on the N level design, i.e., the area on which the model is fitted. A small area around the supposed N_{opt} diminishes the dependence on the model assumptions, but lengthens the confidence interval. The investigations of the impact of these sources on the inaccuracy of the N_{opt} estimate rely on N rate trials on the experimental field Sieblerfeld (Bavaria). The models applied are the quadratic and the linear-plus-plateau yield regression models.

The effect of N fertilizer on the yield of agricultural crops can be studied using N response functions. Such functions are usually fitted to the data from N rate trials by regression. The function types used for modeling in this discussion are, for example, quadratic (e.g. [1,2]) or linear-plus-plateau functions (e.g. [

The economic optimum is reached when the marginal cost of the N fertilization equals the marginal revenue, i.e. when the returns above the N fertilizer cost (RANC) are maximized. For a given product price p (in EUR·Mg^{–1}) and an N fertilizer price r (in EUR·kg^{–1}), these returns are computed as

$$\mathrm{RANC} = p \times Y - r \times N, \tag{1}$$

where Y is the measured yield in Mg∙ha^{–1} and N is the N rate in kg∙ha^{–1}. The N rate at which these returns are maximized is the economically optimal N rate, N_{opt}.
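The RANC definition can be written as a one-line function. The following is a minimal sketch: the function name `ranc` is illustrative, the yield and N rate are hypothetical values, and the prices r = 1.20 EUR·kg^{–1} and p = 220 EUR·Mg^{–1} are the example prices used later in the text.

```python
def ranc(yield_mg_ha: float, n_rate_kg_ha: float,
         product_price: float, fertilizer_price: float) -> float:
    """Returns above N fertilizer cost in EUR per hectare: p*Y - r*N."""
    return product_price * yield_mg_ha - fertilizer_price * n_rate_kg_ha


# Hypothetical plot: 9.0 Mg/ha harvested after applying 180 kg N/ha.
value = ranc(9.0, 180.0, product_price=220.0, fertilizer_price=1.20)
print(round(value, 2))  # 1764.0
```

The units cancel as expected: EUR·Mg^{–1} × Mg∙ha^{–1} and EUR·kg^{–1} × kg∙ha^{–1} both give EUR∙ha^{–1}.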

The evaluation of any N rate trial is usually followed by an analysis of the residuals and the coefficient of determination R^{2} to justify the choice of the model applied. However, as discovered by [ R^{2} is not a suitable measure, as it barely depends on the model chosen. Moreover, the point estimate for N_{opt} derived from the fitted model provides no information on its accuracy or reliability. Therefore, our objective is to compute and discuss confidence intervals for N_{opt} based on quadratic and linear-plus-plateau N response functions, using extensive N rate trials as an example. The results could be used for optimizing decision making in nitrogen management. However, we shall see that the confidence intervals are very long and that they strongly depend on the model chosen and on the range of N levels used to fit the model. These sources of inaccuracy make it nearly impossible to locate the optimum N fertilizer rate in a way that supports decision making.

In the quadratic yield model, the expected yields, E(Y_{i}), are described by a quadratic function of the total N application rates, N_{i}. The yields, Y_{i}, i = 1, ∙∙∙, n, are therefore modeled as random variables that depend on N_{i} in the following way:

$$Y_{i} = b_{0} + b_{1} N_{i} + b_{2} N_{i}^{2} + e_{i}, \quad i = 1, \dots, n. \tag{2}$$

Here, N_{i} denote the fixed levels of the N application rate, b_{j}, j = 0, 1, 2, the fixed unknown regression coefficients, and e_{i} the error variables, which are assumed to be independent and normally distributed with expected value 0 and a common unknown variance s^{2} > 0. The unknown coefficients b_{j} are estimated by their least-squares estimates \hat{b}_{j}. This requires at least three different N levels; otherwise the estimates would not be unique.

The economically optimal N application rate, N_{opt}, is the N rate at which the expected returns above N fertilizer cost, E(RANC) = p×(b_{0} + b_{1}N + b_{2}N^{2}) – r×N, are maximized. For a concave parabola (b_{2} < 0), setting the first derivative with respect to N to zero yields the following optimum N rate:

$$N_{opt} = \frac{r \times p^{-1} - b_{1}}{2 b_{2}}. \tag{3}$$

Equivalently, N_{opt} is the N rate at which the first derivative of the model parabola, E(Y) = b_{0} + b_{1}N + b_{2}N^{2}, equals r×p^{–1}, the ratio between the N fertilizer price and the product price.

Replacing the coefficients b_{j} by their least-squares estimates \hat{b}_{j} immediately gives the point estimate for N_{opt}:

$$\hat{N}_{opt} = \frac{r \times p^{-1} - \hat{b}_{1}}{2 \hat{b}_{2}}. \tag{4}$$

Note that this point estimator is biased^{1} because it is not a linear combination of the unbiased least-squares estimators \hat{b}_{1} and \hat{b}_{2}; it is in essence a ratio.
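The point estimate can be illustrated with a small least-squares fit. This is a sketch under stated assumptions, not the authors' implementation: the yields are generated noise-free from a hypothetical parabola (b_{0} = 4.0, b_{1} = 0.05, b_{2} = –0.0001), block effects are ignored, the helper names are illustrative, and N is rescaled to N/100 inside the fit purely for numerical stability. The N levels and prices are those used in the trial description.

```python
def solve(a, b):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_quadratic(n_rates, yields, scale=100.0):
    """Least-squares estimates (b0, b1, b2) of Y = b0 + b1*N + b2*N^2."""
    rows = [[1.0, n / scale, (n / scale) ** 2] for n in n_rates]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, yields)) for i in range(3)]
    c0, c1, c2 = solve(xtx, xty)
    # Undo the rescaling of N so the coefficients refer to kg N per ha.
    return c0, c1 / scale, c2 / scale ** 2


def n_opt(b1, b2, fertilizer_price, product_price):
    """N rate at which the slope of the fitted parabola equals r/p."""
    return (fertilizer_price / product_price - b1) / (2.0 * b2)


levels = [0, 80, 100, 120, 140, 160, 180, 200, 220, 240, 260]
# Hypothetical, noise-free yields in Mg/ha.
yields = [4.0 + 0.05 * n - 0.0001 * n * n for n in levels]

b0, b1, b2 = fit_quadratic(levels, yields)
estimate = n_opt(b1, b2, fertilizer_price=1.20, product_price=220.0)
print(round(estimate, 1))  # 222.7
```

Because the estimate divides by the fitted curvature, small changes in the estimated b_{2} move it considerably, which is one way to see why the ratio form matters.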

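According to [ a confidence set for N_{opt} consists of all hypothetical values N_{0} whose simple null hypothesis,

$$H_{0}: N_{opt} = N_{0}, \tag{5}$$

cannot be rejected. By using (3), the null hypothesis in (5) can be reshaped to

$$H_{0}: b_{1} + 2 N_{0} b_{2} = r \times p^{-1}. \tag{6}$$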

This is a linear hypothesis, so it can be tested in the framework of general linear models using the usual F-test [ The confidence set then consists of all N_{0} for which H_{0} in (5) or (6) cannot be rejected. For readers interested in statistical inference, this test and the corresponding derivation of the confidence set, which can be expressed by explicit mathematical formulae, are described fully in [ The computation of the confidence set for N_{opt} was implemented in the Fortran program VINO.EXE, which can be downloaded from the internet [

Under the assumptions made (quadratic model, independent and normally distributed homoscedastic errors), tests of linear hypotheses such as (6) are exact, so exact confidence intervals are obtained. In contrast, the confidence intervals derived according to [11,16,17], also called Wald intervals, are only approximate. They are symmetric around the point estimate, which is not even unbiased. Therefore, they “may not accurately reflect the actual, often asymmetric, uncertainty in an estimate” [
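The test-inversion idea can be sketched numerically: scan candidate values N_{0}, compute the F statistic of the linear hypothesis that the slope of the parabola at N_{0} equals r×p^{–1}, and keep every N_{0} that is not rejected. This is only an illustration and not the VINO program. The assumptions are: simulated yields from a hypothetical parabola with a hypothetical error standard deviation of 0.4 Mg∙ha^{–1}, no block effects, a grid scan instead of the explicit formulae, and a 5% critical value of about 4.08 for 1 and 41 degrees of freedom taken from standard F tables.

```python
import random


def solve(a, b):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_quadratic(x, y):
    """Least squares for y = c0 + c1*x + c2*x^2.

    Returns the coefficients, the matrix (X'X)^-1 and the residual variance s^2.
    """
    rows = [[1.0, xi, xi * xi] for xi in x]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(3)]
    coef = solve(xtx, xty)
    # (X'X)^-1 is symmetric, so its columns can be stored as rows.
    inverse = [solve(xtx, [float(i == k) for i in range(3)]) for k in range(3)]
    rss = sum((yi - sum(c * v for c, v in zip(coef, r))) ** 2 for r, yi in zip(rows, y))
    return coef, inverse, rss / (len(y) - 3)


SCALE = 100.0  # N is rescaled to N / 100 to keep X'X well conditioned
F_CRIT = 4.08  # assumed approximate 95% quantile of F(1, 41)


def f_statistic(n0, coef, inverse, s2, ratio):
    """F statistic for H0: b1 + 2*N0*b2 = r/p, written for the rescaled fit."""
    c = [0.0, 1.0 / SCALE, 2.0 * n0 / SCALE ** 2]
    estimate = sum(ci * bi for ci, bi in zip(c, coef))
    variance = s2 * sum(c[i] * inverse[i][j] * c[j] for i in range(3) for j in range(3))
    return (estimate - ratio) ** 2 / variance


levels = [0, 80, 100, 120, 140, 160, 180, 200, 220, 240, 260]
rng = random.Random(1)
n_rates = [n for n in levels for _ in range(4)]  # 4 replicates, n = 44
# Hypothetical yields: parabola plus simulated normal errors.
yields = [4.0 + 0.05 * n - 0.0001 * n * n + rng.gauss(0.0, 0.4) for n in n_rates]

ratio = 1.20 / 220.0
coef, inverse, s2 = fit_quadratic([n / SCALE for n in n_rates], yields)
n_opt_hat = (ratio - coef[1] / SCALE) / (2.0 * coef[2] / SCALE ** 2)

# Confidence set: all grid values N0 whose hypothesis is not rejected.
accepted = [n0 for n0 in range(0, 501)
            if f_statistic(n0, coef, inverse, s2, ratio) <= F_CRIT]
print(round(n_opt_hat, 1), min(accepted), max(accepted))
```

The accepted values need not form a single interval, so `min` and `max` describe the set only when it is contiguous. The point estimate always belongs to the set because the F statistic is zero there, but it generally does not sit in the middle.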

In the linear-plus-plateau model, the economic optimum, N_{opt}, equals the transition of the increasing straight line to the horizontal, unless the price ratio r×p^{–1} is greater than the gradient of the increasing straight line (cf. [ Confidence intervals for N_{opt} could be obtained using PRISM. In contrast to the parabola-based confidence sets, they do not depend on prices.
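The decision rule for the linear-plus-plateau model can be sketched as follows. The function names and all parameter values are hypothetical, and returning 0 when the price ratio exceeds the slope is an inference from the rule (with a positive slope and non-negative N rates, every kilogram of N then costs more than the yield it adds), not a statement taken from the text.

```python
def lpp_yield(n, intercept, slope, knot):
    """Expected yield: rises linearly up to the knot, then stays constant."""
    return intercept + slope * min(n, knot)


def lpp_n_opt(slope, knot, fertilizer_price, product_price):
    """Economic optimum under the linear-plus-plateau model."""
    if fertilizer_price / product_price > slope:
        return 0.0  # assumption: no N pays off when r/p exceeds the slope
    return knot  # otherwise the optimum is the transition to the plateau


# Hypothetical parameters: slope in Mg yield per kg N, knot in kg N/ha.
print(lpp_n_opt(0.03, 150.0, fertilizer_price=1.20, product_price=220.0))  # 150.0
print(lpp_n_opt(0.03, 150.0, fertilizer_price=1.20, product_price=30.0))   # 0.0
```

As long as the price ratio stays below the slope, changing the prices leaves the optimum at the knot, which is why confidence intervals for the knot do not depend on prices.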

The test field, Sieblerfeld (5 ha), is in the Tertiary hills of Upper Bavaria, Germany, and has two very different yield zones. The soil texture in the high-yielding zone is a sandy loam with an available field capacity of the rooted soil horizons of 160 mm. In the low-yielding zone, the soil texture is a loamy sand with an available field capacity of the rooted soil horizons of 100 mm [

The trial design was a randomized complete block design with four blocks in each yield zone. To investigate the dependence of yield on the N fertilizer rate, 11 plots that received different N rates (0, 80, 100, 120, 140, 160, 180, 200, 220, 240 and 260 kg∙ha^{–1}) were selected randomly from each block in both zones, so that there were n = 44 yield measurements in each yield zone. The yields were measured with a plot combine. Each plot covered 12.5 m^{2}, with a length of 10 m and a width of 1.25 m.

Note that randomized complete block designs are advantageous over complete block strip trials because only the former ensure independence of the variable under study. Designs of the latter kind are widely used [23-25], but their lack of randomization within strips can result in the type of strip heteroscedasticity and correlation identified in [

The results are presented in four figures (Figures 2-5) with different models and different designs of N levels. The narrow boxes show the confidence intervals or, more generally, the confidence sets for N_{opt} when the ratio between the N fertilizer price r and the crop price p is r×p^{–1} = 0.0054545 (EUR·kg^{–1}) (EUR·Mg^{–1})^{–1}, i.e. 0.0054545 Mg·kg^{–1} or, with 1 Mg = 1000 kg, 5.4545 kg·kg^{–1}. This applies, for example, to r = 1.20 EUR·kg^{–1} and p = 220 EUR·Mg^{–1}. This ratio is the target slope of the yield model that is reached at the optimum N fertilization. It is indicated by dotted lines.
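The two forms of the price ratio follow from a unit conversion, which can be checked directly. The variable names are illustrative; the prices are the example values given above.

```python
r = 1.20   # N fertilizer price in EUR per kg
p = 220.0  # crop price in EUR per Mg

ratio_mg_per_kg = r / p                     # Mg of yield that pays for 1 kg of N
ratio_kg_per_kg = ratio_mg_per_kg * 1000.0  # the same ratio with 1 Mg = 1000 kg

print(f"{ratio_mg_per_kg:.7f}")  # 0.0054545
print(f"{ratio_kg_per_kg:.4f}")  # 5.4545
```

In words, at these prices about 5.45 kg of additional yield is needed to pay for one additional kilogram of N.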

In [

analysis and the mentioned methods to compute confidence sets for the N optima, can be applied.

A likelihood ratio confidence interval in the quadratic model need not be symmetric around the point estimate. In

including the zero N rate is considered, the confidence interval [204 kg∙ha^{–1}, 328 kg∙ha^{–1}] for the high-yielding zone extends only 36 kg∙ha^{–1} to the left of the point estimate of 240 kg∙ha^{–1}, whereas it extends 88 kg∙ha^{–1} to the right, which is about two and a half times as far. This is, above all, due to the fact that yields at very high N rates do not seem to decline. For the low-yielding zone, a point estimate of 199 kg∙ha^{–1} and a confidence interval of [173 kg∙ha^{–1}, 260 kg∙ha^{–1}] present a similar asymmetry. It seems that a concave parabola with a vertex far to the right of the point estimate is easily compatible with the measured data, whereas a concave parabola with a vertex far to the left of the point estimate is not suited for fitting the yields, since those to the right of such a vertex do not decline.
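The asymmetry can be quantified from the reported interval limits. In this small sketch the high-yielding point estimate is taken as 240 kg∙ha^{–1}, the value implied by the stated distances of 36 and 88 kg∙ha^{–1} to the interval limits; the function name is illustrative.

```python
def half_lengths(lower, estimate, upper):
    """Distances from the point estimate to the lower and upper limits."""
    return estimate - lower, upper - estimate


# (lower limit, point estimate, upper limit) in kg N per ha.
zones = {
    "high-yielding": (204, 240, 328),
    "low-yielding": (173, 199, 260),
}

for zone, (lower, estimate, upper) in zones.items():
    left, right = half_lengths(lower, estimate, upper)
    print(zone, left, right, round(right / left, 2))
# high-yielding 36 88 2.44
# low-yielding 26 61 2.35
```

In both zones the interval reaches more than twice as far to the right of the point estimate as to the left, which a symmetric Wald interval could not represent.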

We can see in Figures 2-5 that all confidence intervals or, more generally, confidence sets are very long.

The 95% confidence interval for N_{opt} in the low-yielding zone, [173 kg∙ha^{–1}, 260 kg∙ha^{–1}], is somewhat shorter than the 95% confidence interval for N_{opt} in the high-yielding zone, [204 kg∙ha^{–1}, 328 kg∙ha^{–1}]. This can be explained as follows. In contrast to the high-yielding zone, the higher N doses in the low-yielding zone did not result in further increases in yield, and a small yield depression was even observed at 260 kg∙ha^{–1}, which forced the regression function downward. Consequently, the parabola’s maximum can be determined more clearly, which avoids a long confidence interval. Higher N rates in the high-yielding zone would presumably have a shortening effect on the confidence interval for N_{opt}.

When considering the enormous length of both 95% confidence intervals, it becomes clear that even in such extensive N rate trials, the optimum N rates can only be roughly estimated ex post. It is not possible, in retrospect, to narrow the optimal nitrogen quantity down to a range shorter than 87 kg∙ha^{–1} in the low-yielding zone or 124 kg∙ha^{–1} in the high-yielding zone. The length of the confidence intervals results from the fact that N fertilization has a very wide marginal profit area. From the economic point of view, however, the cost of the additional N quantity offsets the economic advantage of the increase in yield. This leads to a very flat function for the returns above N fertilizer cost around the optimum, making it compatible with many model parabolae with widely ranging vertices. The set of these vertices is the confidence interval, which is therefore very long.
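The two lengths quoted above are simply the differences between the reported interval limits, as this short check shows (names are illustrative).

```python
# Reported 95% confidence intervals for N_opt in kg N per ha.
intervals = {
    "low-yielding": (173, 260),
    "high-yielding": (204, 328),
}

lengths = {zone: upper - lower for zone, (lower, upper) in intervals.items()}
print(lengths)  # {'low-yielding': 87, 'high-yielding': 124}
```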

The length of the confidence intervals gives us a first insight into the uncertainty concerning the true, unknown N_{opt}. Although their enormous length makes clear that reasonable statements about N_{opt} can hardly be made, this source of uncertainty is the only one that can be controlled statistically because the confidence sets could be shortened if more data were available.

In order to analyze the influence of the N rate trial design on the point estimate and confidence set for the economically optimum N rate, point estimates and confidence intervals for N_{opt} with different designs of N levels were calculated. In ^{–1}) are considered, whereas in

Not taking zero fertilization into account leads to a higher estimate of the economically optimum N rate, N_{opt}, and to a longer confidence interval. This can be observed in the quadratic model (compare

The gap could disappear if, in addition, the error probability were so “small” that every vertex could be considered “slightly” compatible with the data. In practice, all these confidence sets are, of course, worthless. Based on the research findings, feasible fertilization rules cannot be given at a high level of confidence. The gap type or the extreme length of these confidence sets, without zero fertilization, indicates that the function for the returns above N fertilizer cost, which is to be maximized, must be relatively flat in the area of its maximum, making the maximum very difficult to locate.

If, however, the zero N rate is taken into account (

However, by excluding zero fertilization, the model is limited to a small area of interest, and the weakness is then of little importance. The smaller the region, the better it can be modeled by a simple function. On the one hand, the confidence interval becomes longer when the zero fertilization is omitted; on the other hand, it also becomes more trustworthy since it is less influenced by the model assumptions.

In the introduction, five function types used for modeling the yield response are mentioned (quadratic, linear-plus-plateau, Mitscherlich, quadratic-plus-plateau, square-root). The evaluation of N rate trials based on these models typically concerns the point estimates for N_{opt}. Usually, confidence intervals are only computed in the linear-plus-plateau model, where N_{opt} is one of its parameters. The confidence intervals of [

As far as the goodness of fit is concerned, it can be stated that both models, the quadratic (_{opt} (cf. [28-31]).

Such discrepancies also apply to the confidence intervals for N_{opt} and their length. The linear-plus-plateau model provides smaller confidence sets. When considering all N levels, as in ^{–1}) (kg∙ha^{–1})^{–1} = 26 for the high-yielding zone and 0.033 (Mg∙ha^{–1}) (kg∙ha^{–1})^{–1} = 33 for the low-yielding zone. ^{–1}. Thus, the economic optimum, N_{opt}, equals the transition of the increasing straight line to the horizontal, where confidence intervals for N_{opt} could be obtained using PRISM [

The shorter length of the confidence intervals can be attributed to the fact that the transition from the increasing straight line to the horizontal is clearly easier to identify than the position where a parabola has a rather flat positive gradient (_{opt} differed so much in both models. When considering only the design with all N levels, these confidence intervals did not even overlap. In the high-yielding area, the linear-plus-plateau model resulted in a 95% confidence interval of [158 kg∙ha^{–1}, 200 kg∙ha^{–1}] (^{–1}, 328 kg∙ha^{–1}] (_{opt} estimate lies between 106 and 148 kg∙ha^{–1} (^{–1} (

The latter clearly shows that a confidence interval does not sufficiently reflect the uncertainty about the true optimum N rate. The fact that the true model is unknown and that no simple model reflects reality appropriately makes the uncertainty much greater.

From a statistical point of view, the accuracy of the N_{opt} estimate can be seen from a confidence interval. Confidence intervals would become shorter if the trials were based on more data. Their length would even tend to zero if the number of observations tended to infinity; thus, it seems that the inaccuracy of the N_{opt} estimate could be controlled by expanding the field trials.

However, as the Sieblerfeld trials have shown, another source of inaccuracy of the N_{opt} estimate is the yield response model, because no model reflects reality exactly. As far as the point estimate of N_{opt} is concerned, marked discrepancies with respect to the underlying model have already been pointed out by many authors [6,28-30]. These discrepancies equally affect the corresponding confidence intervals, and therefore, the confidence sets differ with respect to the model choice. In the Sieblerfeld trials, confidence intervals were obtained that did not even overlap when computed on the basis of two different models (quadratic and linear-plus-plateau). This kind of uncertainty would not decrease by expanding the trials.

A third source of inaccuracy arises from the chosen design of N levels on which the model is fitted. To avoid long confidence intervals or even gap-type confidence sets, which additionally contain profit minima, the empirical data should clearly indicate a concave model; the confidence intervals should therefore be based only on designs that include the zero N rate and very high N rates. In such datasets, these extreme N levels have, due to their leverage effect, a strong influence on the regression function and thus also on the point estimate and the confidence interval for N_{opt}. These are then mainly determined by how the yield function is modeled between the extreme N levels, which lie far away from the economic optimum. Therefore, to diminish the large impact of the model choice on the N_{opt} estimate, the model should be fitted only on a small neighborhood around the supposed true N optimum. Then, however, a concave shape of the N-Y scatterplot is no longer clearly indicated, and large confidence sets arise. They would be more trustworthy as they depend less on the model choice, but to make them usable, they would need to be much smaller, which requires an exorbitant increase in the number of observations in the trial, and this is hardly possible in practice.

The wide range of, and large differences between, confidence intervals under different models and N level designs might also be attributed to the fact that the range of N fertilizer applications that are very close to the economic optimum is very wide. It is a general experience that the economic profit hardly differs within an N application range of 150 kg∙ha^{–1} to 250 kg∙ha^{–1}. Usually, each of these values is close to the optimum. From an ecological point of view, the challenge should be to derive N rate recommendations from the lower limit of such ranges on the basis of trial results, so that economically unnecessary N balance surpluses are avoided.
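Under a quadratic yield model, the flatness of the profit function near the optimum can be made explicit: the loss in RANC from applying N instead of N_{opt} equals –p×b_{2}×(N – N_{opt})^{2}, so it grows only quadratically with the distance from the optimum. The sketch below checks this identity numerically. The coefficients are hypothetical and are not fitted to the Sieblerfeld data; the prices are the example values from the text.

```python
def ranc(n, coefficients, product_price, fertilizer_price):
    """Expected returns above N fertilizer cost for a quadratic yield model."""
    b0, b1, b2 = coefficients
    return product_price * (b0 + b1 * n + b2 * n * n) - fertilizer_price * n


COEFFICIENTS = (4.0, 0.05, -0.0001)  # hypothetical b0, b1, b2
P, R = 220.0, 1.20                   # EUR per Mg, EUR per kg

N_OPT = (R / P - COEFFICIENTS[1]) / (2.0 * COEFFICIENTS[2])


def profit_loss(n):
    """RANC lost by applying n instead of the optimum rate."""
    return ranc(N_OPT, COEFFICIENTS, P, R) - ranc(n, COEFFICIENTS, P, R)


def closed_form_loss(n):
    """The same loss written as -p * b2 * (n - N_opt)^2."""
    return -P * COEFFICIENTS[2] * (n - N_OPT) ** 2


for n in (150, 200, 250):
    print(n, round(profit_loss(n), 2), round(closed_form_loss(n), 2))
```

With a small curvature b_{2}, a deviation of a few tens of kg∙ha^{–1} changes the returns only slightly, which is consistent with many different N rates being nearly equally profitable.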