Analytical and Numerical Investigations of Probabilistic Monochromatic Problem ()
1. Introduction
Fourier transformation tools are used to obtain information about spectra for a given data set. As any data has an uncertainty, Fourier transformation techniques can be supported by probabilistic theory captured by Bayes’ theorem [1] to improve scientific results and conclusions. The works [2] [3] demonstrate some advantages of the probabilistic ansatz over a conventional approach.
A common and misleading assumption in the field is that the Nyquist theorem determines the spectral band limitation. However, as the Nyquist frequency follows from finite data sampling, it can only be an upper limit but cannot be an estimator for the limit caused actually by a source and/or diagnostic throughput. A similar reasoning applies to the lower spectral limit. After a basic formalism has been derived, it could be shown in Ref. [4] that the band limitation can be well inferred from experimental data. Thereby, a linear Gaussian inversion technique was used to infer spectral amplitudes. Furthermore, a settings posterior was introduced which estimates non-linear parameters and hyperparameters of a problem. For instance, the spectral band limits and the uncertainty on the derived quasi-continuous spectrum, originating in non-probed Fourier coefficients, have been inferred jointly.
For many applied analysis schemes, certain model assumptions and their implications are not tested but assumed to be valid with infinite trust. From scientific point of view, any analysis scheme should be tested given simulated noisy data for which all model assumptions are clear. Then, analysis results and model assumptions can be investigated which is achieved objectively by a probabilistic ansatz. If this can be carried out analytically, valuable information is available when only actual measured data are given for a scientific problem.
The problem of a monochromatic source is a good example to show the powerfulness of Bayesian inference and to test model assumptions. In this work, the formalism derived in Ref. [4] is adapted to a monochromatic problem in Section 2 and applied to simulated data with different noise levels in Section 3. In addition, analytical and numerical investigations are carried out, so that the probabilistic findings can be understood in more detail. For example, a better signal-to-noise ratio improves the estimation of the frequency, while more data points compensate a worsened signal-to-noise ratio. The conclusion section can be found at the end.
2. Adapted Probabilistic Formalism
The formalism derived in Ref. [4] is adapted below to a monochromatic problem, involving one frequency parameter
and one amplitude parameter
. In addition, a hyperparameter is at hand, entering in the normal prior for
. Abstractly, Bayes’ theorem reads then
(1)
with the joint posterior on the left-hand, the likelihood
devided by the evidence, the amplitude prior
the prior for the hyperparameter, and the prior for the frequency
.
The amplitude
maps linearly to the data domain via the vector
dependent on the frequency in a non-linear manner. The data
with
entries are assumed to be acquired independently, and the measurement uncertainty of each data point follows a normal distribution with standard deviation
. With these assumptions, the Gaussian likelihood becomes
(2)
with the covariance matrix
. The amplitude prior takes the form
(3)
for vanishing prior mean and variance
. After some analytical operations, one obtaines the Gaussian amplitude posterior
(4)
conditional on
and
. In the above equation, the posterior variance is given by
, and the posterior mean reads
.
From the remaining terms, the so-called settings posterior
(5)
is derived which can be interpreted as Ockham’s razor with respect to
. This settings posterior has no general analytical solution, and, thus, the normalisation constant
cannot be stated further. For the remainder of the paper,
and
are taken as uniform distributions. By this choice,
can be used to investigate analytically and numerically the quantity
(6)
Finally, the evidence can be identified as
(7)
The constant
varies with the chosen model H, including likelihood, priors, and the prior domains for
and
. Hence, this constant is linked to the model posterior
which is of importance, when the model is further abstracted, or a comparison with an alternative model is carried out.
3. Monochromatic Problem
For a monochromatic even source, data sets with different noise levels are simulated, and the formalism derived in the previous section is applied. To explain the results found, analytical and numerical investigations are carried out. While this problem is treated abstractly in the following, the main results will be presented for two examples with low and high noise contribution to ease the presentation.
(A) Simulated Data: Two Examples
Data sets are modelled for the real-world interferometer found in Ref. [5] which achieves an optical path difference to obtain an interferogram. Formally, the interferogram
is simulated with the amplitude
and the frequency f = 50 GHz. The optical path difference grid
is chosen as
with
with constant increment
. Accordingly, the total length L = 31.56 mm and the length
of the single-sided region follow. Two noisy data sets
are generated with the noise levels
at 0.05 and 1.0 (see Figure 1).
(B) Application of Formalism
The linear mapping of the amplitude parameter is given by
, yielding
. For the two simulated data sets, Figure 2 shows
, varying in
and for three values of
. Even for the high noise level case,
peaks close to 50 GHz, and the uncertainty
remains reasonably small. The peak has the full width at half maximum (FWHM) of about 8 GHz which is roughly twice as large as the classical spectral resolution stated by
. High and low values of
seem to have little effect on the results.
![]()
Figure 1. Simulated interferograms
(without noise, black) and
(with noise, red). Formally,
was used with amplitude
, frequency f = 50 GHz, and optical path difference grid x with increment
(number of data points
). The data-like quantity
resembles a sample of
, assuming a Gaussian noise component with variance
. Furthermore, the noise is chosen to be independent of x. (a)
(low noise); (b)
(high noise).
![]()
Figure 2. Conditional Gaussian amplitude posterior
with mean
and standard deviation
. This posterior depends on the frequency
, prior standard deviation
for
, and noisy interferogram data
. The conditional posterior is obtained for two sets
(see Figure 1) modelled by a monochromatic source with the frequency f = 50 GHz, the amplitude S = 0.5 and the noise levels
(left column) and 1 (right column). The posterior mean (cyan) peaks at values of about 0.5 close to 50 GHz. The spectral FWHM of this peak accounts for about 8 GHz. The posterior standard deviation (dashed-white for its multiples) increases with the noise level, as one would expect. Only small values of
affect
when the noise level is high.
The settings posterior
is proportional to the quantity
which takes very large values even on logarithmic scale and, thus,
is used in the following. For the two cases,
is shown in Figure 3(a) and Figure 3(b) versus the spectral parameter and for three prior values of
. All three cross-sections look similar, but close to 50 GHz the peak is highest for values near
. Identifying this global maximum by
, the ratio
takes reasonable numbers and is presented for the
![]()
Figure 3.
, being proportional to joint posterior
, dependent on frequency
and prior standard deviation
given interferogram data set
(see Figure 1) with low and high noise levels
(left column) and 1 (right column), respectively. Since
takes large values,
is presented in (a) and (b). Deviding by the global maximum, the ratio
becomes manageable from numerical point of view as shown in (c) and (d). This ratio peaks in the vicinity of 50 GHz and
(
). The spectral width of
increases with the noise level from about 10 to 200 MHz but remains well below the classical spectral resolution of about 4.75 GHz. The width with respect to
remains constant and covers one order of magnitude. (a)
(low noise case) for broad spectral domain at three values of
; (b)
(high noise case) for broad spectral domain at three values of
; (c)
(low noise case) for spectral domain 49.9 - 50.1 GHz; (d)
(high noise case) for spectral domain 49 - 51 GHz.
relevant domains of
and
by Figure 3(c) and Figure 3(d) for both data sets. This ratio is narrow in the vicinity of 50 GHz and in the domain
. Furthermore, the peaking has a spectral width of about 10 MHz and 200 MHz for the low and high noise case, respectively. When interpreted as spectral resolution, this width is smaller by several orders of magnitude with respect to its classical counterpart of about 4.75 GHz.
(C) Analytical and Numerical Investigations
To have further inside in the results, the conditional amplitude posterior and
are analytically investigated. In order to do so, the noise contribution in the data is neglected, meaning that
is replaced with the noise-less interferogram
. This allows the tracking of the dependencies of the posteriors on the original amplitude S and frequency f, noise level
, and number of data points
. Furthermore, a second factorisation
can be justified. Parallel to this investigation, numerics and conditional posteriors are presented.
1) Conditional Amplitude Posterior
Starting point is the variance
of the conditional posterior
(see Equation (4)). Formally, one finds for the inverse
(8)
As long as the spatial sampling is well enough, one can use the approximation
(9)
as shown by Equation (A1), to get the posterior variance
(10)
This variance increases quadratically with the noise level and reduces with the number of data samples. Thus, more data points per spatial unit can compensate the noise contribution, at least partly. However, this assumes that the noise is independent of the data sampling.
The modulated sinc function in the denominator of Equation (10) depends on
, L and
for which the interferogram is available. The sinc function becomes unity and vanishes for small and large frequencies, respectively. For instance, since both data examples share the same spatial domain, the sinc function is close to 0 above about 20 GHz. The amplitude prior influences
, when the ratio
becomes larger than 1 which is obtained for a large noise level, a small value of
and/or few data points.
For the posterior mean,
(11)
follows, leaving aside the noise contribution but the uncertainty on the mean is still captured by
. With the approximation (A1) and the trigonometric identity (B1), one can resolve
(12)
Indeed, one finds the original amplitude
for
and prior values with the condition
. Furthermore, due to
there is a peaking of
in the vicinity of
with the FWHM determined by both trigonometric arguments. Away from f, the oscillating amplitude drops like
due to the sinc function which can be seen in Figure 2.
2) Conditional Posterior for f1
Now the information is available to investigate
. Neglecting the noise, the triple product in the exponent of the exponential in
reads
(13)
The global maximum is close to
, and, assuming a sufficiently large frequency f, one gets the approximation
(14)
when the Taylor series expansions
and
are used. Then,
(15)
remains, and the term which is independent on
takes very large numerical values and is treated in more detail in the next subsection. The remaining term can be rewritten by a quadratic exponent of a Gaussian with posterior mean
and square root of the variance
(16)
using Equation (10) for large frequencies and
. This uncertainty increases with the noise but decreases with the square root of the number of data points per spatial unit, the signal level and the spatial domain covered. For the data set examples, one inserts
mm,
mm,
,
and 2 to find the astonishing numbers
and 171.9 MHz, respectively. This explains the narrowness of
with respect to
as shown in Figure 3(c) and Figure 3(d).
In fact,
changes with
on a GHz, but the peaking in
is of the order of MHz. Hence, the factorisation
with the conditional Gaussian posterior
(17)
and
(18)
is justified. Thereby,
needs to be taken at
. The remaining probability
is investigated in the next subsection.
Numerics
Because the variance
changes in
on GHz scale but
on MHz scale, only the exponential needs to be taken into account for the determination of
and
. Furthermore, one faces large numbers, and, hence, the exponent is dealt with directly.
At a given
, the frequency at which the maximum of the exponent occurs is identified as the posterior mean
. The posterior standard deviation
is obtained when the ratio of the exponent to its maximum at
reads
. For both data sets, the ratios are shown in Figure 4(a) and Figure 4(b) in the vicinity of the peak. While the uncertainty of
and, hence, the width of the Gaussian augment with the noise level (see also Figure 5(a) and Figure 5(b)) as described by Equation (16), f remains included in the 2-
band centred at
. Furthermore, the choice of
affects
and
only for small values (see Figure 5(a) and Figure 5(b)).
3) Posterior for
After the factorisation, one combines Equations (16) and (18) like
(19)
with the important term
![]()
Figure 4. Ratio of exponent to its maximum dependent on frequency
and keeping constant prior standard deviation
. For the data set with low noise level (
, left), the peaking is more narrow than the one for the high noise level (
, right). The maximum location and the width of the ratio are used to estimate posterior mean
and standard deviation
. (a) Ratio for low noise data set at three values of
. For
,
GHz and
MHz are extracted which are consistent with 50 GHz used to simulate the data; (b) Ratio for high noise data set at three values of
. For
,
GHz and
MHz are estimated.
and the deviation of
from 50 GHz increase with the noise level. Furthermore, small prior values for
affect
.
![]()
Figure 5. Posterior mean
and standard deviation
for frequency
dependent on prior standard deviation
for noise levels
(left) and 1 (right). The original frequency f = 50 GHz (red line) is included usually in the band
. However,
increases from about 10 MHz to 200 MHz given both noise levels. The posterior
(bottom row) changes little for the different noise levels and has a global maximum close to S = 0.5 (dashed-black). (a)
for
; (b)
for
; (c)
for
; (d)
for
.
(20)
The exponent of
becomes
(21)
which is monotonically rising in
from 0 to about
. In case
and
, the number of data points dominates. For the two examples with
, large numbers
and
are at hand as could be seen in Figure 3(a) and Figure 3(b).
As the exponent vanishes for
,
and, thus, P and
diverge. Hence, this global maximum should be excluded by a proper choice for the prior of
. Doing so, P has the global maximum
at
(22)
supposed a large
. Interestingly,
reduces the noise impact drastically, so that even for elevated noise levels
, the maximum is still given by S with minor corrections.
The uniform prior
may be finite for the domain
, and the maximum of P in this domain is denoted by
. Then, the peaking can be made more obvious by
(23)
after some algebra for a large
. With the normalisation of the above equation
(24)
the posterior becomes
(25)
In case, the prior is finite for the domain
, so that
, one can set
. This reveals the simple forms
(26)
and
(27)
of Equation (23). With the above expression and Equation (19), the constant
(28)
follows with proper unit Hz. K depends in a complicated way on the noise level, the signal, the number of data points, the spatial domain for which the data is acquired, and the limits
and
. For the two examples, one gets
(
) and
(
).
Numerics
Starting point is Equation (19) which is rewritten like
(29)
to handle large numerical values. Thereby,
is evaluated at
. The far right-hand side of the above equation is numerically available, and the maximum
at
and the normalisation
can be determined for the domain from
to
.
Choosing for the moment
and
, so that the global maximum
is included, the above formalism yields
(213) and
(6.36) for the data set simulated with
(1). For both examples, the order of magnitude of
is well described by Equation (26).
For the data sets, both
are located at
and 0.501 which is in the vicinity of S (see Figure 5(c) and Figure 5(d)) and is in agreement with expression (22). For the different noise levels,
has a similar shape; the probability reduces drastically below
but remains finite above up to
, at least.
4) Model Posterior
The posterior of the model with uniform priors for
and
is given by
(30)
This posterior weighs the constants obtained with respect to the spectral and amplitude domains. Hence,
increases with the constants in the numerator and smaller domains in the denominator. When one investigates Figure 3(a) and Figure 3(b), it is obvious that
increases, when
and
are chosen to include
and are separated by few GHz or even some 100s MHz. Similarly, choosing
and
to include the global maximum
at
(see Figure 5(c) and Figure 5(d)) elevates
. Furthermore,
has a global maximum when
and
approach
from either side. This is numerically confirmed by scanning
and
over several orders of magnitude and evaluating the model posterior for the two data sets (see Figure 6(a) and Figure 6(b)).
(D) Joint Posterior and Marginal Posteriors for S1 and f1
Because the model posterior peaks at
, the marginal posterior for
is simply given by the Gaussian
with posterior mean
and standard deviation
(see Section III C2 and Figure 5(a) and Figure 5(b)). Furthermore, the joint posterior
(31)
can be evaluated and is shown for both data sets in Figure 7(a) and Figure 7(b). This joint posterior peaks close to S and f used to simulate the data, and its width increases with the noise level. However, S and f are usually located inside the peak, and, hence, one can trust the results within the posterior uncertainties.
The marginal posterior for
is obtained by the marginalisation
(32)
The above integration is performed numerically, and the results are shown in Figure 7(c) and Figure 7(d) for both data sets. Another way to achieve the marginalisation is to draw samples for
from
, and for each of those samples draw samples for
from
. Histograms of these posterior samples are given in Figure 7(c) and Figure 7(d).
![]()
Figure 6. Model posterior
. The stated model uses a uniform prior for the prior standard deviation
in the domain from
to
. As these limits approach
for
and 0.501 for
(dashed-white), the posterior increases. Thereby,
is closely linked with the original amplitude S = 0.5. (a) Low noise case
; (b) High noise case
.
![]()
Figure 7. Joint posterior
and marginal posterior
of frequency
and amplitude
for low noise (
, left) and high noise data set (
, right). The joint posterior peaks close to the original values S = 0.5 and f = 50 GHz used to simulate the data sets. For the higher noise level, the posterior width increases, and the location of the global maximum separates further from S and f. By numerical integration, the marginal
is evaluated, and its shape is close to a normal distribution. This is confirmed by a second approach which uses 105 samples from the joint posterior to estimate the marginal posterior mean and standard deviation for
. The histogram of the samples is given as well. (a)
for low noise level; (b)
for high noise level; (c)
for low noise level. The marginal mean and standard deviation read 0.4998 and 0.0013, respectively; (d)
for high noise level. One finds that the marginal mean and standard deviation account for 0.505 and 0.026, respectively.
Evaluating the mean and standard deviation from these samples approximates
by a Gaussian. In doing so for the data set with
(1), the marginal posterior mean
(0.505) and standard deviation 0.0013 (0.026) are obtained. Both normal distributions are good approximations for
(see Figure 7(c) and Figure 7(d)).
4. Conclusions
The presented work investigates a monochromatic problem with an adapted version of Bayes’ theorem to infer an amplitude parameter, a frequency parameter and a hyperparameter which acts on the amplitude prior only. The amplitude at a given frequency is a parameter which has a linear mapping to the data domain. If the measurements have a normal noise contribution (Gaussian likelihood) and the amplitude prior is chosen to be Gaussian, an analytical linear inversion technique is enabled. The amplitude posterior, being conditional on the frequency and prior information, has similiarities with the result of a conventional Fourier transformation analysis approach. However, the estimation of the frequency reveals a well localised domain which is in accordance with the measured data. The corresponding posterior for the frequency is orders of magnitude narrower than the width of the convolution function used by conventional approaches to estimate the spectral resolution. The prior information captured by one hyperparameter is tested as well, and its posterior peaks over a one order of magnitude and is very robust for significantly different noise levels on the data.
The findings for the monochromatic problem are investigated analytically. This reveals, for instance, the influence of the signal-to-noise ratio, the amount of data points and the spatial domains covered by the measurements on the results. The presented approach can be followed to examine more complex problems which involve more than one frequency and one amplitude and other diagnostic imperfections like a variable offset. Since this offset has a linear correspondence to the data domain, it would enter in the conditional amplitude posterior. Any non-linear parameter and additional hyperparameter would be estimated by the joint settings posterior specific for a certain problem. In doing so, a profound understanding of model implications could be established in analytical terms for applications relying on Fourier transformations. This is essential for designing a diagnostic for a given problem with somewhat known signal-to-noise ratio and hardware limitations, or for comparing results of different models in an objective way when a data set is given and, for example, the number of contributing frequencies is unknown.
Acknowledgements
This work has been carried out within the framework of the EURO fusion Consortium and has received funding from the Euratom research and training programme 2014-2018 and 2019-2020 under grant agreement No. 633053. The views and opinions expressed herein do not necessarily reflect those of the European Commission.
Appendix A: Integral Approximation
The spatial grid
with
may be given by the set
of uniformly distributed locations separated by the constant increment
, so that
and
. Then, the sum
(A1)
is approximated by an integral over the spatial domain using the trigonometric identities (B1) and (B2). The approximation reveals two modulated sinc functions with a spatial and a spectral dependence. The spectral dependence is due to the sum and difference frequencies
and
, respectively. Spatially, the modulation depends on the length
of single-sided domain which is present on side only. The spatial dependence of the sinc function is given by the length L of the total domain covered.
Appendix B: Trigonometric Identities
The trigonometric identities are
(B1)
(B2)