A Connection between Geometrical Spreading and the Adjoint Field in Travel Time Tomography

The goal of tomography is to reconstruct a spatially varying image function $s(\mathbf{x},\mathbf{m})$, where $\mathbf{x}$ is position and $\mathbf{m}$ is a finite-length vector of parameters. Many reconstruction methods minimize the total $L_2$ error $E \equiv \mathbf{e}^T\mathbf{e}$, where the individual errors $e_i$ quantify the misfit between predictions and observations. So-called adjoint state methods allow the gradient $\partial E/\partial m_i$ to be computed extremely efficiently from an adjoint field, facilitating image reconstruction by gradient-descent methods. We examine the structure of the differential equation for the adjoint field under the ray approximation and find that it has the same form as the transport equation, whose solution involves the well-known geometrical spreading function $R$. Consequently, as $R$ is routinely tabulated as part of a ray calculation, no extra work is needed to compute the adjoint field, permitting a rapid calculation of the gradient $\partial E/\partial m_i$.

Ray-based travel time tomography relies on the ray approximation, which is valid at high frequencies, when the scale length of heterogeneities in the medium is much larger than the wavelength of the waves. Since the 1970s, the simplicity of ray calculations has underpinned the use of travel time tomography in a variety of disciplines, including seismology [1] [2], oceanography [3], petroleum exploration [4], geotechnical engineering [5] and cosmology [6]. In some disciplines, ray-based tomography is being superseded by full wavefield methods [7] [8]; nevertheless, it remains an important part of a tomographer's toolbox on account of its computational efficiency. Over the last several decades, the development of the so-called adjoint state method [9] has allowed tomographic imaging to be applied in cases where it was hitherto infeasible, because of the vastly reduced computational effort. To date, this efficiency has mainly been used to enable computationally intensive forms of tomography, especially full wavefield tomography [7] [8]. Nevertheless, adjoint state methodology is very widely applicable. It has the potential to significantly speed up even computationally light problems, including ray-based tomography. The feasibility of using adjoint state methods in this form of tomography was first investigated by [10], who demonstrated its effectiveness. In this paper, we further explore its application. We study the mathematical structure of the differential equation that arises out of the adjoint state method (the equation for the so-called adjoint field) and show that it is very closely related to, and in important cases identical to, the transport equation of ray theory. This relationship provides an intuitive understanding of the adjoint field and suggests ways of obtaining further computational efficiency.
Our analysis is divided into four sections: first, we review how the adjoint state method is used to streamline the computation of a critical quantity needed to perform tomography; second, we review the concept of the geometrical spreading of rays and its connection to the transport equation; third, we use the adjoint state method to derive and solve the differential equation for the adjoint field; and lastly, we show that the adjoint equation is very closely related to the transport equation and that its solution can be trivially constructed when the solution to the transport equation (the geometrical spreading function) is known.

The Adjoint State Method for Computing the Error Derivative
The main purpose of this section is to define the error derivative, discuss its usefulness and review how the adjoint state method is used to compute it, in the special case where the unknown image is linked to the observed data via the source term in a linear differential equation.
Here $(\cdot,\cdot)$ is the inner product over spatial coordinates. Now, consider the simple case in which the data $T$ solve the linear differential equation $\mathcal{L}T = s$ (together with some appropriate boundary condition). Here, the image $s(\mathbf{x},\mathbf{m})$ is the source term in the differential equation. Differentiating the differential equation with respect to a model parameter yields an expression for $G_{jk}$. The partial derivative of total error $H_j$ is then computed by differentiating the total error and inserting this expression into (2). Here $\dagger$ denotes the adjoint and $\delta(\cdot)$ is the Dirac impulse function. The resulting equation is then solved only for those points at which the error is known, and the adjoint field $\lambda$ is taken to be the sum of the individual solutions. This procedure is equivalent to solving the original adjoint equation with an error built from Dirac impulses located at the points where the error is known.
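The procedure reviewed above can be illustrated with a minimal numerical sketch. Everything in it is an assumption made for illustration and is not taken from the original derivation: the operator is a one-dimensional backward difference along a single ray, the inner product is a plain sum over nodes, the error is $e = T^{obs} - T$, the adjoint field is defined by $\mathcal{L}^{\dagger}\lambda = -2e$, and all function names are hypothetical. Under these conventions the derivative is $H_k = (\lambda, \partial s/\partial m_k)$, and the sketch compares it with a finite-difference estimate.

```python
def forward_solve(s, h):
    """Solve L T = s, with L a backward-difference d/dl and T = 0 before node 0."""
    T, running = [], 0.0
    for s_i in s:
        running += h * s_i
        T.append(running)
    return T


def adjoint_solve(rhs, h):
    """Solve L^T lam = rhs by a backward sweep, with lam = 0 beyond the last node."""
    lam, running = [0.0] * len(rhs), 0.0
    for i in range(len(rhs) - 1, -1, -1):
        running += h * rhs[i]
        lam[i] = running
    return lam


def image(m, basis):
    """Image s = sum_k m_k * basis_k (linear in the parameters m)."""
    n = len(basis[0])
    return [sum(m_k * b[i] for m_k, b in zip(m, basis)) for i in range(n)]


def total_error(m, basis, T_obs, h):
    """Total error E = sum of squared travel time errors."""
    T = forward_solve(image(m, basis), h)
    return sum((t_obs - t) ** 2 for t_obs, t in zip(T_obs, T))


def gradient_adjoint(m, basis, T_obs, h):
    """All derivatives dE/dm_k from one forward solve and one adjoint solve."""
    T = forward_solve(image(m, basis), h)
    e = [t_obs - t for t_obs, t in zip(T_obs, T)]
    lam = adjoint_solve([-2.0 * e_i for e_i in e], h)
    # Since ds/dm_k = basis_k, H_k is the inner product of lam with basis_k.
    return [sum(l * b_i for l, b_i in zip(lam, b)) for b in basis]


def gradient_finite_difference(m, basis, T_obs, h, dm=1e-4):
    """Central-difference estimate; needs two forward solves per parameter."""
    grad = []
    for k in range(len(m)):
        up, dn = list(m), list(m)
        up[k] += dm
        dn[k] -= dm
        diff = total_error(up, basis, T_obs, h) - total_error(dn, basis, T_obs, h)
        grad.append(diff / (2.0 * dm))
    return grad


# Hypothetical test problem: 12 nodes, three boxcar basis functions of 4 nodes each.
n, h = 12, 0.5
basis = [[1.0 if 4 * k <= i < 4 * (k + 1) else 0.0 for i in range(n)] for k in range(3)]
m_true, m_trial = [1.0, 1.5, 0.8], [1.0, 1.0, 1.0]
T_obs = forward_solve(image(m_true, basis), h)
H_adjoint = gradient_adjoint(m_trial, basis, T_obs, h)
H_fd = gradient_finite_difference(m_trial, basis, T_obs, h)
```

The adjoint route needs one forward and one adjoint solve regardless of the number of parameters, whereas the finite-difference route needs two forward solves per parameter. That difference is the efficiency that the adjoint state method provides.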

The Transport Equation of Ray Theory
The main purpose of this section is to review the geometrical interpretation of the transport equation and to highlight its link to ray divergence. However, in order to provide some background for readers unfamiliar with ray theory, and to establish nomenclature, we also present an abridged derivation of the equation.
In many cases, the imaging problem involves a field that is a function of time $t$ as well as spatial coordinates $\mathbf{x}$ and that satisfies a wave equation. Here, the differential operator $\mathcal{L}$ in the wave equation contains only spatial derivatives and depends on parameters $\mathbf{m}$. The equation reduces to a purely spatial equation upon Fourier transformation of time $t$ to angular frequency $\omega$, with the transformed field distinguished notationally from the original. The ray approximation is the solution to this equation in the limit $\omega \to \infty$, and is achieved by postulating that the solution can be written as a Laurent series in $\omega$ [18] (Equation (6)). Here $i$ is the imaginary unit.
Here, $s(\mathbf{x})$ is the slowness function; that is, a material property that is inversely proportional to the local propagation velocity. Inserting (4) into the differential equation and equating equal powers of $\omega$ leads to the Eikonal equation for the travel time $T(\mathbf{x})$ and a sequence of equations for the amplitudes $A_k(\mathbf{x})$, the lowest order of which is the transport equation [19].
The unit normal to a surface of equal travel time is $\mathbf{t} = s^{-1}\nabla T$. A sequence of these vectors connecting surfaces of increasing travel time defines a ray; that is, a parametric curve $\mathbf{x}(\ell)$ with arclength $\ell$ and tangent $\mathbf{t}(\ell)$ (Figure 1(A)). The volume enclosed by a group of rays is called a ray tube. The Eikonal equation can be written as two coupled first-order equations in $\mathbf{x}(\ell)$ and $\mathbf{t}(\ell)$, which are integrated from the ray's starting point. The travel time $T$ is then the path integral of the slowness along the ray, as can be seen by manipulating the formula for the directional derivative, $dT/d\ell = \mathbf{t}\cdot\nabla T = s$. The transport equation can also be written in terms of $\mathbf{t}(\ell)$, in which form ray divergence enters through the term $\nabla\cdot\mathbf{t}$.
Figure 1. (A) Normals to wave fronts define rays (blue curves) with tangents $\mathbf{t}$. Neighboring rays enclosing a solid angle $d\Omega$ at the source define a ray tube. (B) Relationship between ray tangents $\mathbf{t}$ and ray tube cross-sectional area $S$. Gauss's theorem is applied to a small volume $V$ along the ray tube, with the shape of a section of a cone, whose cross-sectional area $S$ changes with arclength $\ell$ and whose volume is approximately $S\,d\ell$. The tangent $\mathbf{t}$ is parallel to the sides of the section and normal to its ends. See text for further discussion.
The quantity $\nabla\cdot\mathbf{t}$ has a simple geometric interpretation, as can be seen by applying Gauss' theorem (e.g. [20]) to a volume $V$ along a ray tube, which has the shape of a section of a cone (Figure 1(B)). The cross-sectional area of the ray tube increases from $S$ on the end nearest to the source, to $S + dS$ at a distance $d\ell$ further away. For small volumes, the integral in Gauss' theorem is
$$V\,\nabla\cdot\mathbf{t} \approx (S + dS) - S = dS, \qquad\text{so that}\qquad \nabla\cdot\mathbf{t} \approx \frac{1}{S}\frac{dS}{d\ell}.$$
According to the transport equation, the fractional decrease in the transported amplitude quantity, measured along a ray, is equal to the fractional increase in area $S$ of the ray tube.
In many cases, this quantity has the interpretation of the energy density, so the transport equation embodies conservation of energy. Conventionally, the area of the ray tube is written $S(\ell) = R^2(\ell)\,d\Omega$, where $R(\ell)$ is the geometrical spreading function and $d\Omega$ is the solid angle subtended by the ray tube at the source (e.g. [19]). Consequently, the quantity varies along the ray as $c/R^2(\ell)$, where $c$ is a constant. Ray-tracing algorithms that solve (9) typically tabulate both $T$ and $R$ (e.g. [21] [22]).
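The link between ray divergence and geometrical spreading can be checked numerically. The sketch below is illustrative only: it assumes the transport equation takes the form $d\mathcal{A}/d\ell = -(\nabla\cdot\mathbf{t})\,\mathcal{A}$ with $\nabla\cdot\mathbf{t} = S^{-1}\,dS/d\ell$ and $S = R^2\,d\Omega$, uses the hypothetical name `amp` for the transported quantity, and takes $R(\ell) = \ell$ for a spherical wave in a homogeneous medium and constant $R$ for a plane wave. It integrates the equation along a ray and compares the result with the $1/R^2$ decay.

```python
import math


def ray_tube_divergence(R, ell, d_ell=1e-5):
    """Estimate div(t) = (1/S) dS/d_ell for S = R(ell)**2 * dOmega (dOmega cancels)."""
    log_S_plus = 2.0 * math.log(R(ell + d_ell))
    log_S_minus = 2.0 * math.log(R(ell - d_ell))
    return (log_S_plus - log_S_minus) / (2.0 * d_ell)


def integrate_transport(R, amp0, ell0, ell1, n_steps=2000):
    """Integrate d(amp)/d_ell = -div(t) * amp along the ray with classical RK4."""
    def rate(ell, amp):
        return -ray_tube_divergence(R, ell) * amp

    h = (ell1 - ell0) / n_steps
    ell, amp = ell0, amp0
    for _ in range(n_steps):
        k1 = rate(ell, amp)
        k2 = rate(ell + h / 2.0, amp + h * k1 / 2.0)
        k3 = rate(ell + h / 2.0, amp + h * k2 / 2.0)
        k4 = rate(ell + h, amp + h * k3)
        amp += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        ell += h
    return amp


def spherical_R(ell):
    """Geometrical spreading of a spherical wave in a homogeneous medium."""
    return ell


def plane_R(ell):
    """A plane wave does not spread, so R is constant."""
    return 1.0


amp_sphere = integrate_transport(spherical_R, 1.0, 1.0, 5.0)
amp_plane = integrate_transport(plane_R, 1.0, 1.0, 5.0)
# Prediction amp = c / R**2, with c fixed by the starting value.
expected_sphere = 1.0 * spherical_R(1.0) ** 2 / spherical_R(5.0) ** 2
```

In practice no such integration is needed once $R$ has been tabulated by the ray tracer, because the decay follows directly from $c/R^2$.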

Adjoint Equation for Travel Time Tomography
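The main purpose of this section is to derive and solve the adjoint equation. Our derivation is equivalent to, but different from, the one by [10], being a direct application of perturbation theory, as contrasted to one that employs Lagrange multipliers.
In travel time tomography, travel time observations are used to estimate a small perturbation $s_1$ to a background slowness $s_0$, with a corresponding perturbation $T_1$ to the background travel time $T_0$. To first order, the Eikonal equation implies that the component of $\nabla T_1$ in the direction of the background ray direction $\mathbf{t}_0$ is $s_1$. Since $s_1$ plays the role of the source term in the differential equation, the formulation in (3) is applicable. If we define $d\ell$ to be an increment of arc length along the unperturbed ray, then this is just an equation involving the directional derivative, $dT_1/d\ell = s_1$. The perturbation in travel time is the integral of the perturbation in slowness along the unperturbed ray. We rewrite the equation for $T_1$ as $\mathcal{L}T_1 = s_1$ with $\mathcal{L} \equiv \mathbf{t}_0\cdot\nabla$. Using the rules for the adjoints of products and of derivatives (e.g., [23]), we obtain the adjoint operator $\mathcal{L}^{\dagger}\lambda = -\nabla\cdot(\mathbf{t}_0\lambda)$, and hence an expression for the adjoint equation. As is typical of first-order equations, the "left hand" boundary condition associated with $\mathcal{L}$ implies a "right hand" boundary condition for $\mathcal{L}^{\dagger}$ (e.g. [14]); that is, while $T_1$ vanishes at the start of the ray, $\lambda$ vanishes at its far end. The formal solution to (17) is well-known (e.g. [24]) and contains a constant $C$ that is chosen to enforce the boundary condition on $\lambda$.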
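The statement that a "left hand" boundary condition on the operator implies a "right hand" boundary condition on its adjoint can be verified with a small discrete sketch. The discretization is an assumption made for illustration: a single ray with uniform node spacing, so that the directional derivative becomes a backward difference and the ray-divergence term is absent. The function names are hypothetical. The check is the defining identity $(\mathcal{L}u, v) = (u, \mathcal{L}^{\dagger}v)$.

```python
import math


def directional_derivative(u, h):
    """L u: backward difference along the ray, with u = 0 before the first node."""
    return [(u[i] - (u[i - 1] if i > 0 else 0.0)) / h for i in range(len(u))]


def adjoint_derivative(v, h):
    """L-dagger v: minus the forward difference, with v = 0 beyond the last node."""
    n = len(v)
    return [-((v[i + 1] if i < n - 1 else 0.0) - v[i]) / h for i in range(n)]


def inner(a, b, h):
    """Discrete inner product along the ray."""
    return h * sum(x * y for x, y in zip(a, b))


# Two arbitrary test functions sampled on the ray nodes.
n, h = 50, 0.2
u = [math.sin(0.3 * i) + 0.1 * i for i in range(n)]
v = [math.cos(0.7 * i) - 0.05 * i for i in range(n)]

lhs = inner(directional_derivative(u, h), v, h)  # (L u, v)
rhs = inner(u, adjoint_derivative(v, h), h)      # (u, L-dagger v)
```

The identity holds only because the zero value is imposed before the first node for $\mathcal{L}$ and beyond the last node for $\mathcal{L}^{\dagger}$. Imposing both conditions at the same end breaks the equality, which is the discrete counterpart of the boundary-condition statement above.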

Analysis of the Role of the Geometrical Spreading
The main purpose of this section is to show that the solution to the adjoint equation can be constructed from the geometrical spreading function, and to interpret this result.
In any region in which $e_0 = 0$, the adjoint Equation (18) has the same form as the transport Equation (12). Since the error $e(\mathbf{x})$ is rarely known within the medium, but rather only on its boundary $\mathbf{x}_B$, this restriction is satisfied in all commonly encountered cases. As we will show below, the similarity of form provides considerable insight into the behavior of the adjoint field $\lambda$.
Ray divergence enters into the adjoint equation through the $\nabla\cdot\mathbf{t}_0$ term. We first consider a plane wave propagating in the $z$-direction. Since the rays of a plane wave do not diverge, $\nabla\cdot\mathbf{t}_0 = 0$. Now consider the case where the background slowness is everywhere too small by an amount $b$, so that the background error $e_0 = T^{obs} - T_0$ grows linearly with distance $z$; that is, $e_0(x,y,z) = bz$. We will assume that this error is known only on the boundary $z = z_B$. Following (5), the adjoint equation then contains a Dirac impulse function located at the boundary. Because of the Dirac impulse function, the boundary condition for $\lambda$ requires some scrutiny. We will consider that the error is defined just below the boundary, so that both the boundary condition on $\lambda$ and the adjoint equation can be satisfied. The error derivative $H_1$ is then evaluated for a slowness perturbation of amplitude $m_1$ confined to a prism of area $L^2$ centered at position $z_H$. As expected, $H_1 < 0$, since increasing $m_1$ lowers the error. Also as expected, $H_1$ is proportional to the area $L^2$ of the prism, since the larger its area, the larger the region to which the slowness perturbation is applied. Interestingly, $H_1$ is independent of the position $z_H$ of the prism; that is, the prism can be moved up or down without affecting the error. As we will show below, this insensitivity to position is due to the absence of ray divergence in this plane wave case.
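The plane wave example can be reproduced along a single ray with a short sketch. The conventions are assumptions chosen for illustration and may differ from those of the derivation by constant factors: the ray is discretized with uniform spacing, the inner product is a plain sum over nodes, the error is nonzero only at the last node (the boundary), and the adjoint field solves $\mathcal{L}^{\dagger}\lambda = -2e_0$. The prism is represented by a run of consecutive nodes with unit slowness perturbation, and all names are hypothetical.

```python
def adjoint_sweep(rhs, h):
    """Solve L^T lam = rhs for a backward-difference L, with lam = 0 beyond the last node."""
    lam, running = [0.0] * len(rhs), 0.0
    for i in range(len(rhs) - 1, -1, -1):
        running += h * rhs[i]
        lam[i] = running
    return lam


def error_derivative(lam, first_node, n_nodes):
    """H_1 per ray for a unit slowness perturbation on n_nodes consecutive nodes."""
    return sum(lam[first_node:first_node + n_nodes])


n, h, b = 100, 0.1, 0.05
z = [(i + 1) * h for i in range(n)]  # distance along the ray; the last node is z_B
e0 = [0.0] * n
e0[-1] = b * z[-1]                   # error e0 = b * z, known only on the boundary
lam = adjoint_sweep([-2.0 * e for e in e0], h)

# The same prism placed near the source and near the boundary.
H_near_source = error_derivative(lam, 10, 5)
H_near_boundary = error_derivative(lam, 80, 5)
```

Because the rays do not diverge, the adjoint field is constant between the source and the boundary, so the two placements give the same negative derivative.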
Figure caption (partially recovered): ... where the rays project the prism. Because the rays do not diverge, the size of this region is independent of the depth of the perturbation.
We now consider a spherical wave propagating in the $r$-direction through a homogeneous sphere with $0 \le r \le r_B$. The background error is taken to be $e_0(r,\theta,\phi) = br$. We will assume that this error is known only on the boundary $r = r_B$. In this case, the adjoint Equation (18) includes a nonzero ray divergence term, $\nabla\cdot\mathbf{t}_0 = 2/r$. For a position $r_H$ away from the origin where a spherical cap of thickness $D$ and area $L^2$ is possible, the partial derivative of total error can be evaluated from the adjoint field in the same manner as for the prism.
In the general case, the geometrical spreading function can be written as $R(\mathbf{x},\mathbf{x}_B)$; that is, the geometrical spreading function at $\mathbf{x}$ associated with the ray that ends at $\mathbf{x}_B$. The adjoint field is then expressed in terms of this function. The expression includes the dot product between the ray tangent and the surface normal, which accounts for the increased surface area intersected by the ray tube in the case (unlike the examples above) where the ray tube obliquely impinges upon the boundary. Now, suppose that the slowness perturbation is represented with voxels, where voxel $k$ has volume $V_k$, amplitude $m_k$, and centroid position $\mathbf{x}^{(k)}$.
When the adjoint field varies slowly compared to the length scale of a voxel (a requirement that excludes the source point), the error derivative $H_k$ can be approximated from the value of the adjoint field at the voxel centroid $\mathbf{x}^{(k)}$, where $\mathbf{x}_B$ is the end point of the ray passing through $\mathbf{x}^{(k)}$. This result emphasizes the link between the geometrical spreading function $R$ and the partial derivative of total error $E$. (When the voxel is close to, or overlaps, the origin, $H_k$ is still well-defined and finite, but the inner product in (27) must be computed appropriately.)
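The slowly-varying requirement can be illustrated with a rough numerical sketch. It rests on assumptions that are not taken from the original equations: the medium is a homogeneous sphere in which the adjoint field decays as $\lambda_B\,(r_B/r)^2$, the boundary value $\lambda_B$ and radius $r_B$ are arbitrary illustrative numbers, voxels are cubes, and the centroid approximation is taken to be $H_k \approx V_k\,\lambda(\mathbf{x}^{(k)})$. The sketch compares that approximation with a finer midpoint-rule integral over the voxel, once far from the source and once near it.

```python
def adjoint_field(x, y, z, lam_B=-1.0, r_B=10.0):
    """Assumed adjoint field in a homogeneous sphere: lam_B * (r_B / r)**2."""
    return lam_B * r_B ** 2 / (x * x + y * y + z * z)


def voxel_integral(center, side, n_sub=8):
    """Midpoint-rule integral of the adjoint field over a cubic voxel."""
    cx, cy, cz = center
    d = side / n_sub
    start = -side / 2.0 + d / 2.0
    total = 0.0
    for i in range(n_sub):
        for j in range(n_sub):
            for k in range(n_sub):
                total += adjoint_field(cx + start + i * d,
                                       cy + start + j * d,
                                       cz + start + k * d)
    return total * d ** 3


def centroid_approximation(center, side):
    """H_k approximated as voxel volume times the adjoint field at the centroid."""
    return side ** 3 * adjoint_field(*center)


def relative_error(center, side):
    """Relative difference between the centroid approximation and the integral."""
    exact = voxel_integral(center, side)
    return abs(centroid_approximation(center, side) - exact) / abs(exact)


side = 0.1
err_far = relative_error((2.0, 0.0, 0.0), side)   # voxel far from the source
err_near = relative_error((0.2, 0.0, 0.0), side)  # voxel close to the source
```

The approximation degrades as the voxel approaches the source, where the adjoint field varies rapidly across the voxel, which is why the inner product there has to be evaluated more carefully.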

Conclusion
The key result in this paper is the demonstration that the adjoint equation in