History-by-History Variance in Monte Carlo Simulation of Radiation Interactions with Matter

Radiation interaction with matter is by nature stochastic. Monte Carlo simulation makes possible, practical, economical and ethical, many experiments otherwise not. In-silico simulation of random samples of radiation histories places into our hands data which may be treated and analysed in radically different ways. This work reports history-by-history computation of the variance in simulation output from FLUKA, a general-purpose Monte Carlo code. Variance computed history by history is compared with variance estimated from varying batch sizes. This work also addresses the issue of under-sampling where, against conventional expectations, the variance spikes as we sample more histories. The background to the spikes is traced by reconstructing single histories interaction by interaction—a novel level of details atypical of Monte Carlo simulations in the field, where statistical convergence of averaged quantities has been the focus.


Introduction
Radiation interaction with matter is a stochastic process. Two particles of the same type and the same energy colliding with the same target may lead to dramatically different outcomes. Take a third particle of the same type and the same energy, the outcome would again be very different. Collect enough samples, however, we find a well-behaved Gaussian distribution. Single collisions are not reproducible, but the collisions are predictable.

Radiation Histories
A radiation history begins with a particle colliding with a matter (something which is not vacuum). Each radiation history is typically very eventful, as the particle traverses the material, takes steps and collides with different atoms or nuclei. Most collisions produce progenies, which may be particles of different types. In a photoneutron collision, for example, the photon produces neutron(s), a different particle type. Some collisions lead to the absorption of a particle; it will no longer appear in the radiation history but the radiation history does not necessarily stop there, as there is all too likely to be progenies propagating the history further. A radiation history ends when the first parent (the source particle) and all progeny particles have either been absorbed or depleted their energy through collisions.
A radiation field consists of many radiation histories, each started off by a particle-the source particle which would later produce progeny particles. A radiation field may be of a nuclear reactor, a research collider (e.g. the Large Hadron Collider at CERN), an archeological or forensic laboratory, a student's laboratory or a radiotherapy clinic. Such is the broad spectrum of applications of ionizing radiation where we seek solutions from Monte Carlo simulation of radiation interactions. Analytical solutions based on probability distributions alone cannot solve the problem.
The predictable behavior described in the first paragraph, or the probability distribution, differs not only by the particle type (e.g. photon, electron, neutron, alpha or...) and the particle energy (spanning over ten orders of magnitude), but also the composition of the target (e.g. a% hydrogen, b% carbon, c% sulfur and d% lead). The target is itself far from homogeneous, each commanding a separate set of probability distributions. Radiation fields in real life are usually mixed with different types of particles, each at different energies evolving by different trends. Radiation histories are complex; different probability distributions take effect at different points in time and in space. This complexity easily overwhelms analytical methods. We resort to Monte Carlo simulations-taking random numbers at each step and each collision, to sample the step length, angle, energy, collision type, and progenies.
Generating random numbers and looking up probability densities are but the most straight-forward components in Monte Carlo simulation of radiation interactions. The physics components are far more complex, ruling out in-house software production. General-purpose Monte Carlo codes for radiation interactions have therefore emerged from CERN, Los Alamos National Laboratory and National Research Council Canada. FLUKA [1], MCNP [2], EGSnrc [3], PENE-LOPE [4] and GEANT4 [5], each with distinct design philosophy and development history, serve user communities worldwide. MCNP is particularly elaborate in neutron transport and statistical analysis. EGSnrc and PENELOPE simulate photons, electrons and positrons only. GEANT4 is a toolkit built on a contemporary object-oriented software design. FLUKA is not a toolkit; it simulates seamlessly a broad variety of particle types and heavy ions over a wide energy spectrum from eV to TeV [6] [7] [8].

Averaging and Convergence
Monte Carlo simulations in the literature generally report "statistically meaningful" averaged quantities produced from simulating millions to trillions of radiation histories. In radiotherapy for example, beams of particles are shot into the body in a carefully-controlled delivery. No two radiation histories would be the same because the dice is rolled too many times within a single radiation history. Each particle deposits different amounts of energy at different points; some in the tumour, others missing the tumour and deposit their energies within healthy tissues instead-which are never reported-such a suggestion could even appear scandalous.
Quantities of interest, the intent of the simulation requested by the user in the input, are called "tallies" and "scores" by the MCNP and FLUKA communities, respectively. Tallies vary with the application at hand. Absorbed dose, dose equivalent and displacement per atom (DPA), for instance, apply to treatment planning, radiation protection and studies of material damage, respectively.
Authors generally report averaged quantities-not outliers. Whereas it is customary for referees to request for the number of histories in the absence of such a statement, it is hardly ever practical or possible to infer any statistical uncertainty of a Monte Carlo output from the number of histories simulated-unless one has detailed knowledge of the full collection of probability densities involved, along with the full knowledge of the techniques (e.g. variance reduction 1 ) and transport parameters the user applied. That error-bars decrease inversely with the square-root of the number of histories is a common expectation but will not apply unless the simulation is sufficiently sampled i.e. there is no undersampling in space (e.g. obscure corners hardly tracked by particles), geometry (e.g. materials of small dimensions bound to be missed) and physics (e.g. resonance regions of neutron cross-sections).

Variance Estimation
The variance of a score may be estimated either batch by batch or history by history. The former is the existing practice in the FLUKA community. The latter estimates variance more accurately; a brute-force implementation, however, would require storing the score from each history, which can be prohibitive in terms of computer memory.
Batch-by-batch estimation of variance is realised by dividing the simulation into batches. Instead of running a simulation of N histories, B simulations of N B histories are ran. Each of the B simulations is made independent by applying unique random number seeds. At the completion of each simulation, the averaged score, 1 Variance reduction techniques cause simulations to converge statistically within a shorter amount of computation time by cutting corners during simulations. When correctly applied, quantities of interest to the user will not be biased.
is obtained. i x , the contribution by individual histories, is no longer recoverable. The standard deviation, is therefore not calculable. What remains calculable is the standard error, which is the standard deviation of N B samples of b x rather than N samples of i x . The standard error is therefore used as an estimate for the standard deviation; B is most often taken to be ten. This estimation is known to be B dependent unless B is sufficiently large.

Materials & Methods
In a FLUKA simulation, 10 5 independent samples of 7 MeV photons impinge a tungsten block which is partitioned into 128,797 voxels of 5 mm × 5 mm × 5 mm. History-by-history computation of variance is applied to each voxel. Spatially-differentiated energy deposition is studied alongside event-by-event single histories.

History-by-History Computation of Variance
In this work, history-by-history computation of variance is implemented in FLUKA via the comscw.f, usrini.f and usrout.f routines. The algorithm implemented here is adopted from Sempau et al. [13]; the same as that implemented in BEAMnrc by Walters et al. [14]. It provides an economical solution to calculating variance on the fly as a simulation progresses with growing number of histories. The memory footprint does not increase with increasing number of histories. The algorithm summons three counters for each score: • a counter for the score ( i x ) for the current history; • a counter for the score, cumulative over all the histories so far • a counter for the square of the score, cumulative over all the histories so far At the end of simulation, the two cumulative counters are used to calculate the standard deviation, For scores with multiple bins, the three counters come in the form of three arrays, one array element to each bin. The tungsten block simulated here consists of 128,797 voxels, corresponding to 128,797 bins.

Single Histories Interaction by Interaction
Interaction-by-interaction single histories are studied using FTREE; technical implementation has previously been reported [9].

Results & Discussion
Spatial distribution of energy deposition (E) in tungsten ( Figure 1) presents the penetration problem. Energy deposition along the central beam axis (radial position around zero) and at shallow depth is well sampled. Deeper into tungsten, further off axis, sampling is poor and σ is high.

History-by-History Computation of Variance
With increasing number of histories, there is considerable fluctuation (Figure 2) along the central axis itself in both E and σ , which was computed history by history. Low photon interaction cross-sections in air is evident from the fluctuating E and high σ at d < 0. Difficult penetration pf photons in tungsten is evident from the fluctuating E and increasing σ at d > 10 cm. Two spikes in σ are particularly noticeable in Figure 2

Single Histories Interaction by Interaction
From the 10 5 histories ran, five photoneutrons were observed. As expected, ( γ , In the listings, each row represents either the creation of a new particle (the particle type is named, preceded by an arrow) or a collision (the collision type is named, preceded by an asterisk). nth level of indentation indicates the nth generation within the radiation history. + > precedes particles which are siblings of the previous particle of the same generation; − > otherwise. The kinetic energy of the particle is given in the first column. For neutrons below 20 MeV, an This is now an opportune point to revisit the two spikes in Figure 2

Conclusion
History-history computation of variance has been implemented and compared against batch-by-batch estimation. The issue of undersampling was addressed.
Two spikes in variance, even as it was apparently decreasing monotonically up to that point, were traced to their origin by examining single histories interaction by interaction using FTREE. The particle undergoing a low-probability event (photoneutron collision), of entirely correct physics, was uniquely identified.
Two radiation histories were presented, featuring invaluable data which would