_{1}

^{*}

The world needs around 150 Pg of negative carbon emissions to mitigate climate change. Global soils may provide a stable, sizeable reservoir to help achieve this goal by sequestering atmospheric carbon dioxide as soil organic carbon (SOC). In turn, SOC can support healthy soils and provide a multitude of ecosystem benefits. To support SOC sequestration, researchers and policy makers must be able to precisely measure the amount of SOC in a given plot of land. SOC measurement is typically accomplished by taking soil cores selected at random from the plot under study, mixing (compositing) some of them together, and analyzing (assaying) the composited samples in a laboratory. Compositing reduces assay costs, which can be substantial. Taking samples is also costly. Given uncertainties and costs in both sampling and assay along with a desired estimation precision, there is an optimal composite size that will minimize the budget required to achieve that precision. Conversely, given a fixed budget, there is a composite size that minimizes uncertainty. In this paper, we describe and formalize sampling and assay for SOC and derive the optima for three commonly used assay methods: dry combustion in an elemental analyzer, loss-on-ignition, and mid-infrared spectroscopy. We demonstrate the utility of this approach using data from a soil survey conducted in California. We give recommendations for practice and provide software to implement our framework.

Climate change is likely to put enormous strain on nature and human societies in the coming decades. It is largely driven by the release of atmospheric carbon dioxide (CO_{2}) that was once sequestered in the earth, either in fossil fuels or as soil carbon. Since agriculture began, soils have lost about 50% - 70% of their carbon to the atmosphere. Soil still accounts for the 2nd largest store of carbon on Earth after the ocean, containing about 7.5 times that of the atmosphere [

In pursuit of solutions, a growing movement of farmers and other advocates are highlighting “regenerative agriculture” as a way to make agriculture a net sink, rather than a source, of carbon—CO_{2} from the atmosphere and sequestering it in the land as soil organic carbon (SOC). Regenerative agriculture provides a variety of ecosystem services including water use efficiency, biodiversity, and overall soil health. These may be sufficient to support its use, but to pay for regenerative agriculture on the basis of SOC sequestration, decision makers need to know how much carbon is sequestered by different strategies.

In order to measure SOC sequestration, at a minimum scientists must be able to measure how much SOC is in a given plot of land at a given point in time. This task is referred to as “SOC stock estimation” Soil scientists accomplish SOC stock estimation by collecting multiple cores of soil from a given plot, preparing/processing the samples, and analyzing (assaying) their SOC concentration by a number of different techniques. SOC is either presented on its own, as a concentration, or it is converted to stock using soil bulk density measured on nearby intact cores. Both sampling and assay of SOC concentration are subject to uncertainties and both become expensive at the volumes necessary to overcome these uncertainties. All else equal, increasing the number of samples and the number of assays will reduce uncertainty while driving up costs. A process called compositing allows investigators to reduce cost by mixing together sampled cores and assaying the mixture(s), but compositing incurs additional error when there is uncertainty in the assay.

In this paper, we resolve this trade-off by presenting sampling and assay as an optimization problem. Given a fixed budget, we derive the sampling and assay sizes that minimize estimation uncertainty. Conversely, given a fixed estimation precision we’d like to achieve, we derive the optimal sizes to minimize the budget. The solutions depend on the heterogeneity and mean SOC concentration of the plot(s) under study, the assay error, and the costs associated with sampling and assay.

Our paper is organized as follows. In Section 2, we situate our work in the soil science and statistics literature. In Section 3 we formalize the objectives of SOC

estimation. We then turn to the logistics and statistics of estimation, covering sampling in Section 4, compositing in Section 5, sample preparation in Section 6, and assay in Section 7. Section 8 contains our main results: optimal sample and assay sizes to maximize precision under budget constraints. Scientists, farmers, and policy-makers can use these results to design their own efficient sampling and compositing strategies. To facilitate practical use of our methods, we demonstrate their use by applying them to data from a soil survey in California in Section 9. We derive optimal assay strategies and composite sizes in this setting. Section 10 discusses additional nuances, challenges, and extensions of stock estimation, and provides recommendation for practice. All of our work is supported by R software, available at https://github.com/spertus/soil-carbon-simulations.

As part of this paper we review the components of stock estimation and the processes of sampling, compositing, and assay. We focus on estimating the average concentration of SOC in a plot. In order to make minimal assumptions about the plot under study and for our results to be as general as possible, we take the design-based perspective on estimation. Thus, the model of SOC concentration in the plot is minimal. Specifically, we do not make any assumptions about the spatial distribution of SOC concentration. Inference proceeds from random sampling, while SOC concentration is unknown but fixed. Webster and Lark [

The design-based perspective contrasts with the model-based or “geostatistical” perspective, originally developed to map gold mines [

Patil et al. [

We analyze three laboratory assay methods used to measure SOC concentration in soil samples: loss-on-ignition (LOI), dry combustion in an elemental analyzer (DC-EA), and mid-infrared spectroscopy (MIRS). LOI involves measuring the difference in mass before and after heating samples in a furnace. The heating cooks off the organic matter in the soil—along with an unpredictable amount of “mineral” or structural water. The amount of mass lost can be mapped to the SOC concentration in the sample using ordinary least squares regression [_{2} released during the burn. DC-EA machines vary in their specifics, but are generally considered the gold-standard for precise determination of SOC concentration [

The core contribution of this paper is similar in spirit to a classical power analysis, which determines how many samples are needed to estimate quantities to within a desired precision or to run a hypothesis test at a desired power. Kravchenko and Robertson [

There is a precedent for analyzing optimal designs in soil science, but most of this work has been done in the geostatistical literature and generally concerns how to optimally distribute samples given an assumed model. If scientists have access to a reliable variogram describing the spatial distribution of SOC, then the sampling design can be optimized to minimize estimation or prediction variance van Groenigen et al. [

New ways of measuring SOC stocks continue to emerge at a rapid pace, driven by advances in technology and data science. Assay can now be accomplished directly in the field using techniques like mobile infrared spectroscopy, eddy covariance assay, inelastic neutron scattering, and laser-induced breakdown spectroscopy. These techniques tend to involve far more assay error than laboratory analyses [

SOC concentration (e.g. percent SOC or grams of SOC per kg of soil) is a (non-random) three-dimensional function in latitude, longitude, and depth. In this paper, we are interested in estimating the average concentration, μ , in a bounded area of land to some fixed depth; or the total stock of SOC T in the area. Typically, estimation occurs within fixed depth profiles, which can then be aggregated to whole-profile stock or concentration estimates. The equivalent soil mass method provides an important alternative strategy wherein profiles are defined to some predetermined mass, not depth [

We follow the convention of estimating concentrations and stocks within profiles defined by depth or mass. We thus suppress dependence on depth as we develop our ideas. For concreteness, the reader may imagine we are only discussing top-soil concentration or stock in what follows, though our analysis applies to any profile. We also stress that the maximum depth of the survey is very important. Many physical, chemical, and biological mechanisms can move SOC downward or cause soil loss at depth. Long-term management can impact deep soil SOC, so concentrations and stocks may need to be estimated down to a meter or more to accurately account for the SOC sequestration of different management strategies [

If we are only interested in average concentration, it suffices to estimate μ . If we want to estimate the stock T , we also need the bulk density in grams per cubic centimeter d, the area of the plot in square meters A , and the length of the profile in meters L. Assuming that bulk density is constant within depth, the total amount of carbon within the depth profile is

T ≡ 10 4 × L × A × μ × d .

The factor 10^{4} includes conversion of %SOC to gram per gram, and bulk density to grams per cubic meter. Different factors may be applied to report SOC in tons per hectare (Mg∙ha^{−}^{1}).

In reality, SOC is never exactly the same across a study area. The degree of heterogeneity can be expressed as the plot variance, σ p 2 , which is the average squared distance of SOC concentration from the mean μ (for a definition in symbols see Section A in the Appendix). If every point in the plot has the same SOC concentration μ , then σ p = 0 . On the other hand, if the SOC concentration is highly variable across the plot then σ p will be large. The maximum value, σ p = 50 , is attained when half the plot is 0% SOC and the other half is 100% SOC. Along with the assay precision, the plot heterogeneity σ p allows us to characterize the uncertainty in estimates of μ .

Investigators typically estimate μ by sampling relatively small amounts of soil from the plot under study. Soil samples can be taken using an auger, a corer, or by digging a pit. An auger can mix soil horizons, while with a corer horizons are typically kept distinct. Compaction can occur with either method, which may skew depths or density estimates. Digging a pit and drawing samples from the side may yield the best samples, with clear horizons and no compaction, but is relatively destructive and very labor intensive. In what follows, we typically refer to a distinct (uncomposited) sample as a “core,” though in principle it could be drawn by any of the above methods. Taking cores at randomly sampled locations can ensure that estimates of μ are unbiased. In this section, we describe three random sampling approaches that are regularly used in practice: uniform independent random sampling, stratified sampling, and cluster sampling.

Uniform independent random samples (UIRSs) are generated by sampling n points uniformly—no particular locations are favored and independently—the location of a particular core does not affect the location of any other. In the soil science literature, UIRSs are sometimes equated with “simple random samples” [

Stratified sampling can be used to take advantage of auxiliary information about the distribution of SOC, which can yield more precise estimates. For example, in rangeland the distribution of SOC may be driven by topography, vegetation type, mineralogy, microclimates, or land-use history [

Finally, cluster random samples are drawn by first choosing a point at random and then deterministically sampling along a regular transect or grid extending from the original point. Cluster random samples with a single random starting point are sometimes called “systematic random samples” in the soil science literature [

In this paper, we assume that cores are gathered by UIRSing. This makes our results quite general, and covers the wide range of applied cases where UIRSing is used. Furthermore, the variance of sample means from a stratified or cluster sample is typically lower than that of a UIRS—lower variance is the main reason why more sophisticated designs are used. Thus our results can be interpreted as a providing an upper bound on the uncertainty of these other sampling designs. Finally, if we assume that SOC is distributed completely randomly in a given plot (i.e. with no spatial correlation), then the properties of estimates based on a UIRS are equivalent to those based on stratified or cluster random sampling.

There are, however, certain land types or surveys where UIRSing can be logistically infeasible. For example, in row crop studies, only treated rows can be sampled, which is typically much easier to achieve using cluster sampling. Furthermore, note that there is a logistically optimal way to collect n cores by UIRSing. First, sample all n points from the plot, find the shortest path through all n points, and move along that path collecting cores at the sampled points. This is called the “traveling salesman problem” in computer science. The length of the shortest path through a UIRS of size n generated in a plot of area A tends to be about 0.72 n A [

Compositing is the practice of combining cores together from a particular profile in order to capture variability in the plot while reducing assay costs. Where we call n the number of cores, sampled from the field, k is the number of samples left after compositing. Edge cases are n = k , when we do no compositing, and k = 1 , when we composite down to one sample. We assume here that each composited sample is comprised of equal proportions of the constituent cores. We also assume that n is divisible by k and that each composited sample is comprised of exactly n/k cores. For example, we might take a UIRS of n = 30 cores from a plot and composite down to k = 6 composited samples of size n / k = 5 constituent cores. We also assume that samples are perfectly homogenized after compositing, so that equal parts of constituent samples are present in any given aliquot of the composited sample. Perfect homogenization may be difficult to achieve in some types of soils, like soils with high clay content that tend to clod, which can compromise the validity of compositing. Our final assumption is compositing additivity, which implies that the SOC concentration in a composited sample is equal to the mean SOC concentration of its constituent cores. Compositing additivity is met for SOC, but not for other properties like pH, which needs to be considered if investigators plan to measure such properties using the same samples.

There are two reasons why more compositing is not always better. First, assay error leads to (hopefully) unbiased but still variable assays, which needs to be reduced by assaying multiple cores or else by assaying a single core multiple times. Second, compositing is itself an error prone process. It can be very difficult to achieve exactly equal proportions and perfect homogenization, especially in heavy clay soils. These challenges can be alleviated and the errors are hedged by assaying more, smaller composite samples. Finally, in order to do inference we typically need to estimate the plot heterogeneity σ p , which can only be estimated when k ≥ 2 , a topic we return to in Section 8.3.

Logistically, compositing is almost always done in the field to reduce the labor of transporting all n cores to the laboratory. A drawback is that it may be more difficult to achieve good homogenization in field using crude tools on field-moist soil. Furthermore, it is generally important to composite at random. If nearby cores are composited together, which can arise naturally if compositing is done sequentially along a transect or shortest UIRS path, the properties of the sample variance of composited samples may be different. For example, suppose that nearby points tend to have similar SOC concentrations and that nearby points are systematically composited together. In this case the sample variance of composited samples of nearby samples will underestimate σ p 2 , which will lead to over-optimistic conclusions about the precision of an estimate of μ [

Sample preparation affects both the cost and precision of estimates of μ , and generally depends on the assay method (see

Procedure | LOI | DC-EA | MIRS |
---|---|---|---|

Transportation | √ | √ | √ |

Oven Drying | √ | × | × |

Air Drying | × | √ | √ |

Sieving | √ | √ | √ |

Grinding | × | √ | √ |

Check for IC | × | √^{*} | × |

After drying, samples are passed through a 2 mm sieve, which helps remove large bits of organic material (e.g. large roots) and rock. Nevertheless, it can be challenging to differentiate between aggregates and rocks, and to make sure that all > 2 mm aggregate material makes it through the sieve. In particular, some soils are too hard once they dry and must be broken up with a mortar and pestle before they can be sieved. Roots may also be picked out by hand. Some studies aim to isolate and separately quantify root fractions. Furthermore, when comparing plots (e.g. in an experiment), carbon in roots can overshadow differences in SOC content [

After drying, samples are ground to a fine powder (e.g. in a ball mill), which helps ensure homogenization and accurate assay. MIRS can be very sensitive to the size and uniformity of the grind [

Finally, many elemental analyzers (EAs) used for DC-EA cannot distinguish between SOC and soil inorganic carbon (e.g. carbonates). For such machines, assays give the concentration of total carbon, not just organic carbon. Soils must be checked in advance for inorganic carbon before assay. If the pH is greater than 7.4, ground samples may be treated with hydrochloric acid to remove carbonates [

In this section, we review the three major methods for assaying SOC concentration before introducing the concept of assay error. For more details on these assay methods, as well as newer in situ methods see the recent reviews by Nayak et al. [

DC-EA is the gold standard for SOC assay. EAs are expensive to purchase, maintain, and run, but they measure carbon directly and at a fairly high throughput. EAs combust aliquots of soil at high temperatures (around 1000˚C) in a pure oxygen environment, and assay the CO_{2} released using gas chromatography. DC-EA is generally the most precise and expensive assay method for SOC, but the precision and cost per analysis will vary by EA model.

To assay a sample by LOI, investigators measure the mass of dried soil samples, bake them at around 550˚ celsius in a muffle furnace, and then measure how much mass was lost during baking [

MIRS works by shining light in the mid-infrared range (4000 - 400 cm^{−}^{1} or 2500 - 25,000 nm) on dried samples and measuring the wavelengths that are absorbed [

From a statistical perspective, the assay process is important because additional random error is introduced into the data. Unbiased assays are centered on the true SOC concentration of the (composited) sample. Biased assays systematically overestimate or underestimate the SOC concentration. It is not guaranteed that assays are unbiased (see Bellon-Maurel and McBratney [

As an example, a realized assay error of 1.1 will cause a true SOC concentration of 1% to appear as 1.1% and a true SOC concentration of 5% to appear as 5.5%. A precise assay method has a small σ δ 2 so realizations tend to be close to 1, and the measured SOC concentration is close to the true SOC concentration. Note that we will sometimes use an additional subscript to refer to a specific method, e.g. σ δ , DC-EA is the assay error variance of DC-EA.

In this section we highlight our main results. We provide a formula for the precision of estimates of μ given a sample size n and a number of assays k. We derive the optimal n and k that will maximize precision while maintaining a given budget.

Suppose we have a UIRS of size n and that composites are formed randomly from n/k samples in equal proportions and with perfect homogenization, so that k assays are taken. Suppose S i * is the assayed SOC concentration of the ith composited sample. Our estimator is the mean of these assayed composite samples:

μ ^ = 1 k ∑ i = 1 k S i * .

This is an unbiased estimator of μ , so that E [ μ ^ ] = μ . Its variance is

V ( μ ^ ) = σ p 2 ( 1 + σ δ 2 ) n + μ 2 σ δ 2 k . (1)

If there is no assay error, this reduces to the usual formula for the variance of a UIRS mean: σ p / n . Because the estimator is unbiased, it’s expected error (mean-squared error) is also equal to (1). In order to reduce the error, we can either gather more samples n or make more assays k. The optimal allocation of samples and assays will depend on the plot parameters σ p and μ , the assay error variance σ δ 2 , and a cost model for sampling and assay.

We now introduce such a cost model. Call cost_{c} the cost of sampling a single core, cost_{P} the cost of sample preparation, and cost_{A} the cost of assaying a single composited sample. Note that these costs depend on the sampling and assay methods employed. For example, cost_{A} under LOI is considerably lower than cost_{A} under DC-EA. We assume that the cost of compositing itself is negligible, but it could easily be included in cost_{c}. Finally, we assume a fixed cost of the study cost_{0}, which doesn’t vary over n and k. The total cost is:

cost 0 + n ⋅ cost c + k ⋅ ( cost P + cost A ) . (2)

We ultimately want to choose both an optimal n and k, which we call n_{opt} and k_{opt} respectively, as well as a sample prep and assay method. We first consider the sample prep and assay methods to be fixed, optimizing only for n and k, and then discuss how to choose among strategies.

Given the cost model along with the plot and assay parameters, the composite size that minimizes the error in Equation (1) is:

n opt k opt = σ p 1 + σ δ 2 μ σ δ × cost P + cost A cost c . (3)

The optimal composite size thus depends on the ratio of plot heterogeneity σ δ and the degree of assay error σ δ . It also depends on the ratio of assay and sampling costs, though it is less sensitive to small changes in cost due to the square root applied to this ratio. Note that there are two boundary conditions that are not reflected in Equation (3). Namely, if we initially find k opt < 1 then we take k opt = 1 with the implication that all cores should be fully composited to 1 composite sample. On the other hand, if we find k opt > n opt , then set k opt = n opt with the implication that all sampled cores should be assayed without compositing. Ultimately, there are only gains to compositing if

σ p 2 ( 1 + σ δ 2 ) ( cost P + cost A ) > μ 2 σ δ 2 cost c .

Otherwise no compositing should be done.

Given a fixed budget B, we can compute the optimal variance V ( μ ^ ) opt . The optimal variance can be difficult to interpret. Taking the square root yields the optimal standard error SE ( μ ^ ) opt we can achieve at budget B:

SE ( μ ^ ) opt = σ p ( 1 + σ δ 2 ) cost c + μ σ δ cost P + cost A B − cost 0 (4)

The optimal standard error is on the same scale as the estimate (i.e. percent SOC).

Finally, different sample prep and assay methods involve trade-offs between the costs and the assay error. Clearly, if a method is both cheaper and less erroneous, it is preferred. But how much error should we tolerate for a cheaper assay? The relative efficiency of different methods is the ratio of the minimum errors they are able to achieve, per Equation (4). The relative efficiency of method 1 over method 2 is:

SE ( μ ^ ) opt , 1 SE ( μ ^ ) opt , 2 = σ p ( 1 + σ δ 1 2 ) cost c + μ σ δ 1 cost P 1 + cost M 1 σ p ( 1 + σ δ 2 2 ) cost c + μ σ δ 2 cost P 2 + cost M 2 (5)

A relative efficiency close to 1 suggests a near toss-up between different sample prep and assay strategies. On the other hand, a large relative efficiency suggests that method 2 is more efficient than method 1, and vice versa for a small relative efficiency. The upshot is that for any budget, we can achieve substantially more precise estimates when the relative efficiency is far from 1.

Alternatively, given a maximum variance V that we can tolerate, we might ask for a minimum budget over all ways of allocating the budget to samples and assays. This is the inverse of the previous problem. The expressions for the optimum n and k are fairly complicated. We provide details in Section B.2 in our Appendix.

So far we have assumed that we know the parameters σ δ and σ p . In practice, these quantities must be estimated with gathered data or, when planning a survey, based on physical reasoning and past studies.

An unbiased estimator of the plot variance σ p 2 is the usual sample variance with an adjustment factor for the size of composites:

σ ^ p 2 = n k [ 1 k − 1 ∑ i = 1 k ( S i * − μ ^ ) 2 ] ,

whereas above, μ ^ = 1 k ∑ i = 1 k S i * As previously noted, this formula will underestimate the sample variance if composite samples are systematically more homogeneous than the plot itself. This can happen, for example, when composites are grouped together by distance instead of randomly.

We can estimate σ δ using replicated assays, detailed in Section C.1 of our appendix. For methods like LOI or MIRS that involve calibration, the additional error due to calibration must be taken into account. See Section C.2.

Putting these pieces together, we can estimate the overall standard error of μ ^ by:

SE ^ ( μ ^ ) = σ ^ p 2 ( 1 + σ ^ δ 2 ) n + μ ^ 2 σ ^ δ 2 k

If the sample size n is not too small, then an asymptotic confidence interval based on the t-distribution with n − 1 degrees of freedom will be approximately correct. Specifically, denote t ( 1 − α / 2 ) as the ( 1 − α / 2 ) quantile of the t-distribution with n − 1 degrees of freedom. The interval

[ μ ^ − t ( 1 − α / 2 ) × SE ^ ( μ ^ ) , μ ^ + t ( 1 − α / 2 ) × SE ^ ( μ ^ ) ]

bounds the true mean μ with probability about ( 1 − α ) . Checking if a particular value of μ (say μ 0 ) is in this interval is equivalent to a level α t-test of the null hypothesis H 0 : μ = μ 0 . Often, a researcher will set α = 0.05 to yield a 95% confidence interval: [ μ ^ − 1.96 × SE ^ ( μ ^ ) , μ ^ + 1.96 × SE ^ ( μ ^ ) ] . In instances where the confidence interval includes values less than 0, in particular if 1.96 × SE ^ ( μ ^ ) > μ ^ , it is valid to set the lower confidence limit equal to 0.

Often, investigators aim to estimate the difference between average SOC concentrations, either between two plots at the same time or within the same plot at different times. Let μ 1 and μ 2 be the mean SOC concentrations in plot 1 and plot 2. Then the parameter of interest is μ 1 − μ 2 . Let μ ^ 1 and μ ^ 2 be estimators of μ 1 and μ 2 , as above. Then the difference in means, Δ ^ 1,2 = μ ^ 1 − μ ^ 2 , is an unbiased estimator of μ 1 − μ 2 . Furthermore, assuming independent UIRSing in each plot, the standard error is:

SE ( Δ ^ 1,2 ) = V ( μ ^ 1 ) + V ( μ ^ 2 ) .

The optimum SE of the difference can be attained by separately optimizing V ( μ ^ 1 ) and V ( μ ^ 2 ) , as above, yielding sampling and assay sizes of n 1 , k 1 for plot 1 and n 2 , k 2 for plot 2. A reasonable estimate of the SE is SE ^ ( Δ ^ 1,2 ) ≡ V ^ ( μ ^ 1 ) + V ^ ( μ ^ 2 ) . An approximate ( 1 − α ) confidence interval on the difference is:

[ Δ ^ 1,2 − t ( 1 − α / 2 ) × SE ^ ( Δ ^ 1,2 ) , Δ ^ 1,2 + t ( 1 − α / 2 ) × SE ^ ( Δ ^ 1,2 ) ]

where t ( 1 − α / 2 ) is now the ( 1 − α / 2 ) quantile of the t-distribution with min ( n 1 , n 2 ) degrees of freedom.

If sample sizes are fairly small, say n 1 , n 2 < 30 , the difference-in-means will generally not have a normal distribution. In this case, a permutation test should be used to test for a difference between μ 1 and μ 2 . Permutation tests provide an exact level α test at any sample size, without assumptions about the distributions of the samples. Permutation confidence intervals can be derived by testing a range of hypotheses over a grid of effect sizes. The corresponding 1 − α confidence interval contains all effect sizes that are not rejected at level α . Pesarin and Salmaso [

In this section we demonstrate a practical application of our analysis. We draw on a variety of sources to estimate parameters and costs. We stress that the results are not intended to provide universal guidance on sampling, sample prep, and assay—they are highly sensitive to the inputs. The open-source software and web tool we provide are intended to enable investigators to draw their own conclusions from their own inputs.

We combine data from multiple sources to estimate σ δ for DC-EA, LOI, and MIRS: σ δ , DC-EA , σ δ , LOI , and σ δ , MIRS , respectively. σ δ , DC-EA is estimated from assays on samples taken from rangeland soils in Marin County, California by the Silver Lab at UC Berkeley, referred to here as the Marin data. The samples were run in duplicate on a Carlo Elantech Elemental analyzer at UC Berkeley. We use the method presented in Section 8.3 to compute σ ^ δ , DC-EA , i on each sample and took the median across samples to get σ δ , DC-EA . We applied the methods presented in detail in Section C.2 of our appendix to estimate the additional assay error in LOI and MIRS calibrated to DC-EA assays. Briefly, we derived the validation root mean squared error (RMSE_{v}) for LOI by regressing LOI assays on DC-EA assays taken at the Agricultural Diagnostic Laboratory at the University of Arkansas (ADL). These assays were taken on samples from several sites in Colorado collected by the Wainwright Lab at Lawrence Berkeley National Laboratory. We estimated σ δ , MIRS using the RMSE_{v} provided in _{v} values reported in a range of studies. These errors were then divided by our estimates of μ ^ and added to the DC-EA error variance estimate to approximate their overall assay error variance on a multiplicative scale. We also computed the SE assuming no assay error and no cost to assay, which represents a typical power analysis and provides a lower bound on the SE across assay methods.

We used the Marin data to get estimates of σ p and μ in the topsoil (0 - 10 cm) and in deep soil (50 - 100 cm). Within depth profiles, we computed the sample mean and standard deviation at each site and then took the median over sites as our estimates σ ^ p and μ ^ . The samples were collected using transect sampling, not UIRSing, but should provide reasonable estimates.

Inputs: All inputs are summarized in _{v} for LOI was 0.31 in the range of the Marin data assays. Dividing by μ ^ and combining this with the DC-EA error, we estimated an error variance for LOI of σ ^ δ , LOI = 0.11 in the top soil and σ ^ δ , LOI = 0.67 in deep soil. England and Viscarra Rossel [_{v} of 0.11 for MIRS, yielding an estimate of σ ^ δ , MIRS = 0.05 in the top soil and σ ^ δ , MIRS = 0.25 in deep soil.

The cost of sampling and the fixed costs of the survey are not well constrained. We set the fixed cost at cost 0 ≡ 200 and cost_{c} at 5, 20, or 40 USD to reflect cheap, medium, and expensive sampling. We assumed a transport cost of 2.00, a cost of 4.00 for oven drying, 1.00 for air drying, 2.00 for sieving, 4.00 for

Description | Notation | Value(s) |
---|---|---|

DC-EA assay variance | σ ^ δ , DC-EA | 0.02 |

LOI assay variance | σ ^ δ , LOI | 0.11, 0.67 |

MIRS assay variance | σ ^ δ , MIRS | 0.05, 0.25 |

Plot heterogeneity | σ ^ p | 0.68, 0.12 |

Plot mean | μ ^ | 3.57, 0.48 |

Fixed cost | cost 0 | 200 |

Cost of DC-EA assay | cost DC-EA | 26.00 |

Cost of LOI assay | cost LOI | 9.25 |

Cost of MIRS assay | cost MIRS | 10.30 |

Cost of taking a core | cost c | 5, 20, 40 |

grinding, and 2.00 for acid testing for inorganic carbon. Without root picking, this puts the cost of sample prep at 8.00 for LOI, 11.00 for DC-EA, and 9.00 for MIRS. The assay costs for DC-EA and MIRS were estimated in [_{P} and cost_{A} for each method yields cost DC-EA = 26.00 USD , cost LOI = 9.25 USD , and cost MIRS = 10.30 USD .

Outputs:

Optimal composite sizes are provided in

Topsoil (0 - 10 cm) | Deep Soil (50 - 100 cm) | |||||
---|---|---|---|---|---|---|

5 USD | 20 USD | 40 USD | 5 USD | 20 USD | 40 USD | |

DC-EA | 20 | 10 | 7 | 26 | 13 | 9 |

MIRS | 5 | 3 | 2 | 2 | 1 | 1 |

LOI | 2 | 1 | 1 | 1 | 1 | 1 |

Relative efficiencies are given in

In this paper, we statistically formalized the sampling and assay processes to characterize the precision of SOC concentration estimation while making minimal assumptions. We derived optimal composite sizes to maximize precision under a fixed budget. Although we did not discuss it extensively, we also solved the inverse problem of minimizing costs given a fixed precision (see section B.2 of our Appendix).

We applied our method to data from a California rangeland, bringing in costs and errors of measurement from other studies [

First, we found assay error to be a significant source of uncertainty in SOC estimation that is usually not taken into account. Indeed, many analyses in the SOC literature compute uncertainty estimates accounting only for plot heterogeneity (or in some cases, only inter-plot heterogeneity). We found that incorporating assay error and costs can double the uncertainty (in terms of standard error) compared to the conventional approach of not incorporating assay error.

Furthermore, the depth of soil under study was an especially important consideration for the assay method employed. We found that efficiencies varied much more across the assay methods when attempting to quantify deep soil concentrations rather than top soil. In terms of the coefficient of variation, top soil can be quantified by any assay method to within about 5% of the mean at a budget of 1000 USD. On the other hand, DC-EA seems far better at accurately quantifying deep soil concentrations than other methods despite its high cost. Equation (4) reveals that when the plot heterogeneity σ p is high the estimation error will be driven largely by the cost of sampling while the cost and precision of assay have little effect. Intuitively, we need many samples to characterize the heterogeneity within the plot, and cheaper, less precise assay methods generally

Soil Profile | Sampling Cost | SE_{DC-EA}/SE_{LOI} | SE_{DC-EA}/SE_{MIRS} |
---|---|---|---|

Topsoil (0 - 10 cm) | 5 USD | 0.70 | 0.90 |

Topsoil (0 - 10 cm) | 20 USD | 0.79 | 0.93 |

Topsoil (0 - 10 cm) | 40 USD | 0.83 | 0.95 |

Deep Soil (50 - 100 cm) | 5 USD | 0.25 | 0.48 |

Deep Soil (50 - 100 cm) | 20 USD | 0.36 | 0.61 |

Deep Soil (50 - 100 cm) | 40 USD | 0.42 | 0.67 |

allow many samples to be collected and assayed. Conversely, if the plot heterogeneity is low, it is better to collect a few samples that accurately represent the average plot concentration and focus the budget on assaying them as precisely as possible.

We also found that with our inputs, the benefits of compositing were quite variable. Compositing many cores together is beneficial when the assay method is fairly expensive and precise (e.g. DC-EA), while sampling is fairly cheap (e.g. 5.00 USD per core). Equation (3) reveals that compositing may also be resourceful when the plot heterogeneity is large compared to assay error.

We did not incorporate bulk density into our analysis. Estimating bulk density is critical to converting from concentrations to stocks. Estimating SOC stock is especially necessary in studies of carbon sequestration and climate change mitigation, while SOC concentration is typically the parameter of interest from a soil health and functioning perspective. As with SOC concentrations, bulk density tends to vary substantially across a landscape. However, investigators frequently take only one bulk density sample and bulk density is also prone to assay error [

When assays introduce error into an estimation process, compositing samples may have major ramifications for both the precision and cost of estimates. In this paper, we detailed the processes involved in soil organic carbon estimation and derived optimal composite sizes to maximize estimation precision when assays are subject to multiplicative errors. An analysis of data from California rangeland indicated that DC-EA would yield more precise estimates than LOI or MIRS, and that, for any given budget, compositing samples before assay would yield more precise estimates than assaying individual samples. Optimal composite sizes and assay methods will depend on parameters of the plot under study and on the costs of sampling, sample preparation, and assay. Thus, our results are not meant to provide universal guidance. We hope that the framework we presented here will be useful to investigators aiming to design efficient soil surveys for soil organic carbon concentration and stock.

I thank Philip Stark for his advice and mentorship on the statistics of soil carbon estimation. Whendee Silver and Haruko Wainwright provided their data and perspectives. This paper would not be possible without numerous conversations with Paige Stanley and Jessica Chiartas, who also provided enormously helpful feedback on an earlier draft of this paper.

Jacob Spertus is supported by a National Science Foundation (NSF) Graduate Research Fellowship (DGE 1752814). This work also received support from the Berkeley Food Institute’s 2020 summer fellows program. The open-access publication of this paper was supported by the Berkeley Research Impact Initiative.

The author declares no conflicts of interest regarding the publication of this paper.

Spertus, J.V. (2021) Optimal Sampling and Assay for Estimating Soil Organic Carbon. Open Journal of Soil Science, 11, 93-121. https://doi.org/10.4236/ojss.2021.112006

We have a plot P ⊂ ℝ 3 . An element of P is a 3-tuple ( x , y , z ) . x denotes longitude or x-axis distance from an origin (e.g. the lower left hand corner of a rectangular plot), y denotes latitude or y-axis distance, and z denotes depth.

At ( x , y , z ) the soil has some concentration of SOC, which we will denote by c ( x , y , z ) ∈ [ 0,100 ] with units in percent or equivalently grams SOC per hectogram of soil. Note that sometimes SOC concentration is reported in grams per kilogram. Note also that c ( x , y , z ) is best conceptualized as an average over a small window centered at ( x , y , z ) . Taking the design-based perspective, we consider c ( x , y , z ) to be fixed but unknown.

To convert to grams SOC per volume of soil, take d ( x , y , z ) to be the density of the soil at point ( x , y , z ) , e.g. in grams per cubic centimeter. This is called the “bulk density” in soil science. The amount or stock of carbon in a small area centered at point ( x , y , z ) is thus c ( x , y , z ) × d ( x , y , z ) . The total amount or stock of carbon in a plot is:

T = ∫ P c ( x , y , z ) × d ( x , y , z ) d P = ∫ 0 x max ∫ 0 y max ∫ 0 z max c ( x , y , z ) × d ( x , y , z ) d z d y d x

Assuming constant bulk density means that d ( x , y , z ) = d and the total carbon becomes:

T = d ∫ P c ( x , y , z ) d P = d × μ

where μ ≡ ∫ P c ( x , y , z ) d P is the population average SOC concentration—the key parameter to be estimated through soil sampling. The bulk density d must also be estimated.

The population variance of a plot is formally:

σ p 2 = ∫ P [ c ( x , y , z ) − μ ] 2 d P

The population variance is a measure of heterogeneity that is instrumental in determining the precision of estimates of μ .

μ and σ p are estimated using sampled cores. The plot is often sliced into profiles along depth, and positions ( x , y ) locations are randomly sampled within depth. From here on we will assume we are sampling within a profile and ignore depth. Randomly sampled positions are denoted { ( X i , Y i ) } i = 1 n and the n corresponding cores are denoted { c ( X i , Y i ) } i = 1 n , or { C 1 , ⋯ , C n } when the location is not important. We suppose here that these cores are selected by UIRSing. The properties of the sample mean of cores from a UIRS, C ¯ = 1 n ∑ i = 1 n C i , are simple and well-understood: C ¯ is unbiased ( E [ C ¯ ] = μ ) and has variance V [ C ¯ ] = σ p 2 / n .

Given a UIRS { C 1 , ⋯ , C n } compositing bins the cores into k groups of size n/k. The groups are denoted by a set of indices { g 1 , ⋯ , g k } , where g 1 = { 1, ⋯ , n / k } , g 2 = { n / k + 1, ⋯ , 2 n / k } , etc. The cores in each group are physically mixed together to form composite samples { S 1 , ⋯ , S k } . Under compositing additivity,

we have S i = k n ∑ j ∈ g i C j . Note also that the above notation covers the case where no compositing is done, with k = n and S i = C i .

Under equal proportions compositing of cores from a UIRS, it follows immediately that the sample mean of the composited cores is an unbiased estimate of μ :

E [ 1 k ∑ i = 1 k S i ] = E [ 1 k ∑ i = 1 k ∑ j ∈ g i k n C j ] = E [ 1 n ∑ i = 1 n C i ] = μ

Furthermore, because the sample mean of the composite samples is equivalent to the sample mean of the constituents, it’s variance is also

V [ 1 k ∑ i = 1 k S i ] = V [ 1 n ∑ i = 1 n C i ] = σ p 2 n .

assay error is drawn from an unknown distribution with positive support and denoted δ i . Measured samples are S i * = S i δ i . We assume assays are unbiased so that E ( δ i ) = 1 and E ( S i * ) = S i where the expectation is with respect to the assay error only (not the sampling distribution). We also assume that the assay error has constant variance V ( δ i ) = σ δ 2 , that does not depend on S i .

Our estimator is the mean of k measured samples composited from n cores: μ ^ = 1 k ∑ i = 1 k S i * . Under our assumptions, μ ^ is an unbiased estimator:

E [ μ ^ ] = E [ 1 k ∑ i = 1 k S i δ i ] = E [ 1 k ∑ i = 1 k [ ∑ j ∈ g i k n C j ] δ i ] = 1 k ∑ i = 1 k [ ∑ j ∈ g i k n E [ C j ] ] E [ δ i ] = 1 k ∑ i = 1 k [ ∑ j ∈ g i k n μ ] = μ . (6)

Its variance is:

V [ μ ^ ] = 1 k 2 ∑ i = 1 k V [ S i δ i ] = 1 k 2 ∑ i = 1 k ( V [ S i ] V [ δ i ] + E [ S i ] 2 V [ δ i ] + E [ δ i ] 2 V [ S i ] ) = 1 k 2 ∑ i = 1 k ( k n σ 2 σ δ 2 + μ 2 σ δ 2 + k n σ 2 ) = σ 2 ( 1 + σ δ 2 ) n + μ 2 σ δ 2 k

For a budget B, fixed in advance, we seek the solution to the optimization problem:

V ( μ ^ ) opt = min M , P min n , k σ p 2 ( 1 + σ δ 2 ) n + μ 2 σ δ 2 k (7)

st : cost 0 + n ⋅ cost c + k ⋅ ( cost P + cost A ) ≤ B (8)

k ≥ 1 (9)

k ≤ n (10)

whereas above cost_{0} is the fixed cost, cost_{c} is the cost of sampling a single core, cost_{P} is the cost of sample prep, and cost_{A} is the cost of assay. A and P additionally denote the assay and sample preparation schemes, which affect costs and σ δ .

For a fixed A and P, the inner optimization problem can be solved in closed form using a Lagrange multiplier for the constraint. The optimal sampling and assay sizes are then:

n opt = ( B − cost 0 ) σ p 1 + σ δ 2 [ σ p ( 1 + σ δ 2 ) cost c + μ σ δ cost P + cost A ] cost c (11)

k opt = ( B − cost 0 ) μ σ δ [ σ p ( 1 + σ δ 2 ) cost c + μ σ δ cost P + cost A ] cost P + cost A (12)

This solution ignores the constraints k ≥ 1 and k ≤ n . If we find k opt < 1 , then set k opt = 1 and n opt = ( B − cost 0 − cost P − cost A ) / cost c . If we find k opt > n P , M * then set k opt = n opt = ( B − cost 0 ) / ( cost c + cost P + cost A ) . To obtain integer solutions while staying under budget, n opt and k opt should be rounded down.

B.2. Minimum Cost for a Given PrecisionGiven a maximum variance V that we can tolerate, we seek the minimum budget over all ways of allocating the budget to samples and assays while achieving that precision. Formally:

B opt = min M , P min n , k cost 0 + n ⋅ cost c + k ⋅ ( cost P + cost A ) (13)

st : σ p 2 ( 1 + σ δ 2 ) n + μ 2 σ δ 2 k ≤ V (14)

k ≥ 1 (15)

k ≤ n (16)

The solution of the inner optimization (for fixed M and P) is:

n opt = [ ( cost c ⋅ σ p 2 ( 1 + σ δ 2 ) cost P + cost A ) 1 / 2 + 1 ] μ 2 σ δ 2 V (17)

k opt = σ p 2 ( 1 + σ δ 2 ) V ( 1 − [ ( cost c ⋅ σ p 2 ( 1 + σ δ 2 ) cost P + cost A ) 1 / 2 + 1 ] − 1 ) . (18)

The constraints k ≥ 1 and k ≤ n are not respected by these solutions. If we find k opt < 1 , then set k opt = 1 and n opt = σ p 2 / V (obtained, for example, when there is no assay error). If we find k opt ≥ n opt , then set k opt = n opt = ( σ p 2 ( 1 + σ δ 2 ) + μ 2 σ δ 2 ) / V . To get integer solutions n opt and k opt should be rounded up.

Suppose we have r replicated, unbiased assays for the ith sample. The replicates are denoted { S i 1 * , S i 2 * , ⋯ , S i r * } , where S i j * = S i δ i j is the true SOC concentration in composite sample i multiplied by an independent, mean 1 assay error. The

sample mean over replicates is S ¯ i = 1 r ∑ j = 1 r S i j * , which is an unbiased estimate of S i with variance V [ S ¯ i | S i ] = V [ S i j * | S i ] / r . An unbiased estimate of S i 2 is the squared sample mean minus its variance, i.e. S ¯ i 2 − 1 r ( 1 − r ) ∑ j = 1 r ( S i j * − S ¯ i ) 2 . Thus we might estimate σ δ 2 by plugging in the unbiased estimators of V [ S i j * | S i ] and S i 2 :

σ ^ δ 2 = 1 r − 1 ∑ j = 1 r ( S i j * − S ¯ i * ) 2 ( S ¯ i * ) 2 − 1 r ( r − 1 ) ∑ j = 1 r ( S i j * − S ¯ i * ) 2 . (19)

This is not necessarily unbiased. If the numerator and denominator are independent, then Jensen’s inequality shows this is conservative in expectation: E [ σ ^ δ 2 ] > σ δ . We used this replicated measurement technique to estimate the error of DC-EA.

σ ^ δ 2 can be computed on any sample that is replicated 2 or more times. One strategy to estimate σ δ is thus to duplicate every sample ( r = 2 ) and then take the average or median, though the variance of estimates may be high. Plotting σ ^ δ i 2 against S i * should indicate potential violations of the constant assay error variance assumption. Another strategy is to replicate a single sample some large number of times, say r = 30 , but this will not provide information about the constant variance assumption. A good balance is to measure a few samples along a grid of S i * values some moderately large number of times, say r = 5 . Under constant assay error variance, the estimates σ ^ δ i 2 should be fairly close and there should not be a trend in S i * .

C.2. Prediction MethodsSuppose we have a method that is calibrated to an unbiased assay (e.g. DC-EA) by regression, like LOI or MIRS. There are two sources of error in the estimate. First, there is the assay error of the calibration assay which can be estimated directly through replication as per Section C.1. Second, there is the error in the calibration itself, i.e. prediction error. We discuss two ways to estimate the prediction error. To approximate the total error of a prediction method we recommend simply adding the pieces together.

Prediction methods typically assume an additive error model and estimate the variance of the additive error out of sample. We will call this estimate RMSE_{v} for validation root mean squared error. Now, if SOC concentration has been first transformed to the log scale, then the model implicitly assumes that error is multiplicative on the original scale, our estimate of the error in prediction is then exp (RMSE_{v}). On the other hand, if SOC is modeled directly (i.e. without a log transformation) then we can approximate the error on a multiplicative scale by assuming that the additive error pertains to the average SOC assay, which suggests dividing by the average assay. Thus we estimate the prediction error by

RMSE v μ ^ .

In our application, LOI and MIRS were calibrated both to DC-EA assays. For both of these methods, we had an RMSE_{v} of SOC modeled on the original scale (without a log transform) so we estimated the error of these methods as

σ ^ δ , LOI = σ ^ δ , DC-EA + RMSE v , LOI μ ^

σ ^ δ , MIRS = σ ^ δ , DC-EA + RMSE v , MIRS μ ^

Recall that the area of plot P is A . Finding the shortest path through n points is known as the traveling salesman problem. If the n points are generated randomly and independently with density f ( x ) then the Beardwood-Halton-Hammersley theorem for ℝ 2 says that the length of the shortest path converges to:

L n / n → β 2 ∫ ℝ 2 f ( x ) d x

β 2 is an unknown constant, but analytical bounds and numerical simulations have pegged it at about 0.714 [