Second-Order MaxEnt Predictive Modelling Methodology. II: Probabilistically Incorporated Computational Model (2nd-BERRU-PMP)

Abstract

This work presents a comprehensive second-order predictive modeling (PM) methodology based on the maximum entropy (MaxEnt) principle for obtaining best-estimate mean values and correlations for model responses and parameters. This methodology is designated by the acronym 2nd-BERRU-PMP, where the attribute “2nd” indicates that this methodology incorporates second-order uncertainties (means and covariances) and second (and higher) order sensitivities of computed model responses to model parameters. The acronym BERRU stands for “Best-Estimate Results with Reduced Uncertainties” and the last letter (“P”) in the acronym indicates “probabilistic,” referring to the MaxEnt probabilistic inclusion of the computational model responses. This is in contradistinction to the 2nd-BERRU-PMD methodology, which deterministically combines the computed model responses with the experimental information, as presented in the accompanying work (Part I). Although both the 2nd-BERRU-PMP and the 2nd-BERRU-PMD methodologies yield expressions that include second (and higher) order sensitivities of responses to model parameters, the respective expressions for the predicted responses, for the calibrated predicted parameters and for their predicted uncertainties (covariances), are not identical to each other. Nevertheless, the results predicted by both the 2nd-BERRU-PMP and the 2nd-BERRU-PMD methodologies encompass, as particular cases, the results produced by the extant data assimilation and data adjustment procedures, which rely on the minimization, in a least-square sense, of a user-defined functional meant to represent the discrepancies between measured and computed model responses.


Cacuci, D. (2023) Second-Order MaxEnt Predictive Modelling Methodology. II: Probabilistically Incorporated Computational Model (2nd-BERRU-PMP). American Journal of Computational Mathematics, 13, 267-294. doi: 10.4236/ajcm.2023.132014.

1. Introduction

An accompanying work [1] has presented a comprehensive second-order predictive modeling (PM) methodology based on the maximum entropy (MaxEnt) principle [2] , for obtaining best-estimate mean values and correlations for model responses and parameters, which was designated by the acronym 2nd-BERRU-PMD. The attribute “2nd” indicates that this methodology incorporates second-order uncertainties (means and covariances) and second-order sensitivities of computed model responses to model parameters. The acronym BERRU stands for “Best-Estimate Results with Reduced Uncertainties” and the last letter (“D”) in the acronym indicates “deterministic,” referring to the deterministic inclusion of the computational model responses.

As an alternative to the 2nd-BERRU-PMD methodology, this work presents the 2nd-BERRU-PMP methodology for obtaining best-estimate mean values and correlations for model responses and parameters. The 2nd-BERRU-PMP methodology is also based on the MaxEnt principle but includes the computational model responses probabilistically (through a MaxEnt representation), in contradistinction to their deterministic inclusion within the 2nd-BERRU-PMD methodology. This distinction is indicated by the last letter (“P”) in the acronym, which refers to the “probabilistic” inclusion of the computed model responses.

This work is structured as follows: Section 2 presents the second-order MaxEnt probabilistic representation of the computational model. Section 3 presents the second-order predictive modeling methodology with probabilistically included computed responses (2nd-BERRU-PMP). Although both the 2nd-BERRU-PMP and the 2nd-BERRU-PMD methodologies yield expressions that include second (and higher) order sensitivities of responses to model parameters, the respective expressions for the predicted responses, calibrated predicted parameters and their predicted uncertainties (covariances) are not identical to each other, although they encompass, as particular cases, the results produced by the extant data assimilation [3] [4] and data adjustment procedures [5] [6] [7] [8] [9]. Comparisons with the results produced by the first-order BERRU-PM methodology [10] [11] [12] are presented in Section 4. The advantages of the 2nd-BERRU-PMP methodology over the results produced by the data assimilation method [3] [4] are presented in Section 5. The discussion presented in Section 6 concludes this work. Illustrative applications of the 2nd-BERRU-PMP and the 2nd-BERRU-PMD methodologies to forward and inverse problems are currently in progress.

2. Second-Order MaxEnt Probabilistic Representation of the Computational Model

The modeling of a physical system requires consideration of the following modeling components:

1) A mathematical model comprising linear and/or nonlinear equations that relate the system’s independent variables and parameters to the system’s state (i.e., dependent) variables. In this work, matrices will be denoted using capital bold letters while vectors will be denoted using either capital or lower-case bold letters. The symbol “$\triangleq$” will be used to denote “is defined as” or “is by definition equal to.” Transposition will be indicated by a dagger ($\dagger$) superscript. The equalities in this work are considered to hold in the weak (“distributional”) sense.

2) Nominal values and uncertainties that characterize the model’s parameters.

3) One or several computational results, customarily referred to as model responses (or objective functions, or indices of performance), which are computed using the mathematical model; and

4) External to the model: experimentally measured responses that correspond to the computed responses, with their respective experimentally determined nominal (mean) values and uncertainties (variances, covariances, skewness, kurtosis, etc.). Occasionally, measurements of correlations among the measured responses and the model parameters, as well as externally performed measurements of model parameters (in addition to the information about parameters used in the model-computations) might be available.

The model parameters usually stem from processes that are external to the physical system (and, hence, model) under consideration, and their precise values are seldom, if ever, known. The known characteristics of the model parameters may include their nominal (expected/mean) values and, possibly, higher-order moments (i.e., variances/covariances, skewness, kurtosis), which are usually determined from experimental data and/or processes external to the physical system under consideration. Occasionally, just the lower and upper bounds may be known for some model parameters. Without loss of generality, the model parameters will be considered in this work to be real-valued scalars, denoted as $\alpha_1, \ldots, \alpha_{TP}$, where the quantity $TP$ denotes the “total number of model parameters.” Mathematically, these parameters are considered to be the components of a $TP$-dimensional column vector $\boldsymbol{\alpha} \triangleq (\alpha_1, \ldots, \alpha_{TP})^\dagger \in D_\alpha$, defined over a domain $D_\alpha$ which is included in a $TP$-dimensional subset of $\mathbb{R}^{TP}$. The components of the vector $\boldsymbol{\alpha}$ are considered to include imprecisely known geometrical parameters that characterize the physical system’s boundaries in the phase-space of the model’s independent variables. The model parameters can be considered to be quasi-random scalar-valued quantities which follow an unknown multivariate distribution denoted as $p_\alpha(\boldsymbol{\alpha})$. The mean values of the model parameters, which will be called “nominal” values in the context of computational modeling, will be denoted as $\alpha_i^0$; the superscript “0” will be used throughout this work to denote “nominal values.” These nominal values are formally defined as follows:

$\alpha_i^0 \triangleq \int_{D_\alpha} \alpha_i \, p_\alpha(\boldsymbol{\alpha}) \, d\boldsymbol{\alpha}, \quad i = 1, \ldots, TP.$ (1)

The expected values of the measured parameters will be considered to constitute the components of a “vector of nominal values” denoted as $\boldsymbol{\alpha}^0 \triangleq [\alpha_1^0, \ldots, \alpha_i^0, \ldots, \alpha_{TP}^0]^\dagger$.

The covariance, $\mathrm{cov}(\alpha_i, \alpha_j)$, of two model parameters, $\alpha_i$ and $\alpha_j$, is defined as follows:

$\mathrm{cov}(\alpha_i, \alpha_j) \triangleq \int_{D_\alpha} (\alpha_i - \alpha_i^0)(\alpha_j - \alpha_j^0) \, p_\alpha(\boldsymbol{\alpha}) \, d\boldsymbol{\alpha}; \quad i, j = 1, \ldots, TP.$ (2)

The covariances $\mathrm{cov}(\alpha_i, \alpha_j)$ are considered to be the components of the parameter covariance matrix, denoted as $\mathbf{C}_{\alpha\alpha} \triangleq [\mathrm{cov}(\alpha_i, \alpha_j)]_{TP \times TP}$.
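The moments defined in Equations (1) and (2) can be illustrated by their Monte Carlo (sample) analogues. The following is a minimal NumPy sketch with hypothetical parameter values (the means and covariances below are illustrative, not taken from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical example: TP = 3 parameters with a known mean/covariance,
# sampled to mimic the (unknown) distribution p_alpha(alpha).
true_mean = np.array([1.0, 2.0, 3.0])
true_cov = np.array([[0.04, 0.01, 0.00],
                     [0.01, 0.09, 0.02],
                     [0.00, 0.02, 0.16]])
samples = rng.multivariate_normal(true_mean, true_cov, size=200_000)

# Sample analogues of Eq. (1) and Eq. (2):
alpha0 = samples.mean(axis=0)                  # nominal values alpha_i^0
C_alpha_alpha = np.cov(samples, rowvar=False)  # covariance matrix C_aa

assert np.allclose(alpha0, true_mean, atol=0.01)
assert np.allclose(C_alpha_alpha, true_cov, atol=0.01)
```

With enough samples, the sample mean and sample covariance converge to the integrals in Equations (1) and (2).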

The results computed using a mathematical model are customarily called “model responses” (or “system responses,” “objective functions,” or “indices of performance”). Consider that there is a total number of $TR$ such model responses, each of which can be considered to be a component of the “vector of model responses” $\mathbf{r} \triangleq (r_1, \ldots, r_k, \ldots, r_{TR})^\dagger$. Each of these model responses is formally a function (implicit and/or explicit) of the model parameters $\boldsymbol{\alpha}$, i.e., $r_k = r_k(\boldsymbol{\alpha})$, $k = 1, \ldots, TR$. The uncertainties affecting the model parameters $\boldsymbol{\alpha}$ will “propagate,” both directly and indirectly through the model’s dependent variables, to induce uncertainties in the computed responses, which will therefore be denoted as $r_k^c(\boldsymbol{\alpha})$. Each computed response can be formally expanded in a multivariate Taylor series around the parameters’ mean values. In particular, the fourth-order Taylor series of a system response, denoted as $r_k(\boldsymbol{\alpha})$, around the expected (or nominal) parameter values $\boldsymbol{\alpha}^0$ has the following formal expression:

$r_k(\boldsymbol{\alpha}) = r_k(\boldsymbol{\alpha}^0) + \sum_{j_1=1}^{TP} \left\{ \frac{\partial r_k(\boldsymbol{\alpha})}{\partial \alpha_{j_1}} \right\}_{\boldsymbol{\alpha}^0} \delta\alpha_{j_1} + \frac{1}{2} \sum_{j_1=1}^{TP} \sum_{j_2=1}^{TP} \left\{ \frac{\partial^2 r_k(\boldsymbol{\alpha})}{\partial \alpha_{j_1} \partial \alpha_{j_2}} \right\}_{\boldsymbol{\alpha}^0} \delta\alpha_{j_1} \delta\alpha_{j_2} + \frac{1}{3!} \sum_{j_1=1}^{TP} \sum_{j_2=1}^{TP} \sum_{j_3=1}^{TP} \left\{ \frac{\partial^3 r_k(\boldsymbol{\alpha})}{\partial \alpha_{j_1} \partial \alpha_{j_2} \partial \alpha_{j_3}} \right\}_{\boldsymbol{\alpha}^0} \delta\alpha_{j_1} \delta\alpha_{j_2} \delta\alpha_{j_3} + \frac{1}{4!} \sum_{j_1=1}^{TP} \sum_{j_2=1}^{TP} \sum_{j_3=1}^{TP} \sum_{j_4=1}^{TP} \left\{ \frac{\partial^4 r_k(\boldsymbol{\alpha})}{\partial \alpha_{j_1} \partial \alpha_{j_2} \partial \alpha_{j_3} \partial \alpha_{j_4}} \right\}_{\boldsymbol{\alpha}^0} \delta\alpha_{j_1} \delta\alpha_{j_2} \delta\alpha_{j_3} \delta\alpha_{j_4} + \varepsilon_k.$ (3)

In Equation (3), the quantity $r_k(\boldsymbol{\alpha}^0)$ indicates the value of the response computed using the expected/nominal parameter values $\boldsymbol{\alpha}^0 \triangleq (\alpha_1^0, \ldots, \alpha_{TP}^0)^\dagger$. The notation $\{\,\cdot\,\}_{\boldsymbol{\alpha}^0}$ indicates that the quantities within the braces are also computed using the expected/nominal parameter values. The quantity $\varepsilon_k$ in Equation (3) comprises all quantifiable errors in the representation of the computed response as a function of the model parameters, including the truncation errors $O[(\delta\alpha_j)^5]$ of the Taylor-series expansion, possible bias-errors due to incompletely modeled physical phenomena, and possible random errors due to numerical approximations. The radius/domain of convergence of the series in Equation (3) determines the largest parameter variations $\delta\alpha_j$ which are admissible before the respective series becomes divergent. In turn, these maximum admissible parameter variations limit, through Equation (3), the largest parameter covariances/standard deviations which can be considered when using the Taylor expansion for the subsequent purposes of computing moments of the distribution of computed responses.

As is well known, and as indicated by Equation (3), the Taylor series of a function of $TP$ variables [e.g., $r_k(\boldsymbol{\alpha})$] comprises $TP$ first-order derivatives, $TP(TP+1)/2$ distinct second-order derivatives, and so on. The computation by conventional methods of the $n$th-order functional derivatives (called “sensitivities” in the field of sensitivity analysis) of a response with respect to the $TP$ parameters (on which it depends) would require at least $O(TP^n)$ large-scale computations. The exponential increase with the order of response sensitivities of the number of large-scale computations needed to determine higher-order sensitivities is the manifestation of the “curse of dimensionality in sensitivity analysis,” by analogy to the expression coined by Bellman [13] to express the difficulty of using “brute-force” grid search when optimizing a function with many input variables. The “nth-order Comprehensive Adjoint Sensitivity Analysis Methodology for Nonlinear Systems” (nth-CASAM-N) conceived by Cacuci [14] and the “nth-order Comprehensive Adjoint Sensitivity Analysis Methodology for Response-Coupled Forward/Adjoint Linear Systems” (nth-CASAM-L) conceived by Cacuci [15] are currently the only methodologies that enable the exact and efficient computation of arbitrarily high-order sensitivities while overcoming the curse of dimensionality.
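The counting argument above can be made concrete: since mixed partial derivatives commute, the number of distinct $n$th-order sensitivities of a function of $TP$ variables equals the number of multisets of size $n$ drawn from $TP$ indices. A short sketch (the function name is illustrative):

```python
from math import comb

def num_distinct_sensitivities(TP: int, n: int) -> int:
    # Distinct n-th order partial derivatives (sensitivities) of a function
    # of TP variables: mixed partials commute, so one counts multisets of
    # size n drawn from the TP parameter indices.
    return comb(TP + n - 1, n)

TP = 100
assert num_distinct_sensitivities(TP, 1) == TP                  # TP first-order
assert num_distinct_sensitivities(TP, 2) == TP * (TP + 1) // 2  # TP(TP+1)/2
print(num_distinct_sensitivities(TP, 3))  # third order grows as O(TP**3)
```

The first two assertions reproduce the counts quoted in the text; the growth with $n$ illustrates the curse of dimensionality that the adjoint (CASAM) methodologies are designed to overcome.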

Uncertainties in the model’s parameters will evidently give rise to uncertainties in the computed model responses $r_k(\boldsymbol{\alpha})$. The computed model responses are considered to be distributed according to an unknown distribution denoted as $p_c(\mathbf{r})$. The unknown joint probability distribution of model parameters and responses will be denoted as $p_c(\boldsymbol{\alpha}, \mathbf{r}) \triangleq p_\alpha(\boldsymbol{\alpha}) \, p_c(\mathbf{r})$. The approximate moments of the unknown distribution of $r_k(\boldsymbol{\alpha})$ are obtained by using the so-called “propagation of errors” methodology, which entails the formal integration over $p_c(\boldsymbol{\alpha}, \mathbf{r})$ of various expressions involving the truncated Taylor-series expansion of the response provided in Equation (3). This procedure was first used by Tukey [16]. Tukey’s results were generalized to sixth order by Cacuci [15].

The expectation value, $E_c(r_k)$, of a computed response $r_k(\boldsymbol{\alpha})$ is obtained by formally integrating Equation (3) over $p_c(\boldsymbol{\alpha}, \mathbf{r})$, which yields the following expression:

$E_c(r_k) \triangleq \int r_k(\boldsymbol{\alpha}) \, p_c(\boldsymbol{\alpha}, \mathbf{r}) \, d\boldsymbol{\alpha} \, d\mathbf{r} = r_k(\boldsymbol{\alpha}^0) + \frac{1}{2} \sum_{j_1=1}^{TP} \sum_{j_2=1}^{TP} \left\{ \frac{\partial^2 r_k(\boldsymbol{\alpha})}{\partial \alpha_{j_1} \partial \alpha_{j_2}} \right\}_{\boldsymbol{\alpha}^0} \mathrm{cov}(\alpha_{j_1}, \alpha_{j_2}) + \frac{1}{6} \sum_{j_1=1}^{TP} \sum_{j_2=1}^{TP} \sum_{j_3=1}^{TP} \left\{ \frac{\partial^3 r_k(\boldsymbol{\alpha})}{\partial \alpha_{j_1} \partial \alpha_{j_2} \partial \alpha_{j_3}} \right\}_{\boldsymbol{\alpha}^0} t_{j_1 j_2 j_3} \sigma_{j_1} \sigma_{j_2} \sigma_{j_3} + \frac{1}{24} \sum_{j_1=1}^{TP} \sum_{j_2=1}^{TP} \sum_{j_3=1}^{TP} \sum_{j_4=1}^{TP} \left\{ \frac{\partial^4 r_k(\boldsymbol{\alpha})}{\partial \alpha_{j_1} \partial \alpha_{j_2} \partial \alpha_{j_3} \partial \alpha_{j_4}} \right\}_{\boldsymbol{\alpha}^0} q_{j_1 j_2 j_3 j_4} \sigma_{j_1} \sigma_{j_2} \sigma_{j_3} \sigma_{j_4} + O(\sigma_{j_1} \sigma_{j_2} \sigma_{j_3} \sigma_{j_4} \sigma_{j_5}).$ (4)

In Equation (4), $\sigma_j$ denotes the standard deviation of the parameter $\alpha_j$, while $t_{j_1 j_2 j_3}$ and $q_{j_1 j_2 j_3 j_4}$ denote the third-order (triple) and fourth-order (quadruple) parameter correlations, respectively.
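A minimal sketch of Equation (4) truncated at second order, for a hypothetical quadratic response (for which the truncation is exact and the third-order moments vanish for a symmetric parameter distribution); all numerical values below are illustrative, not from the paper:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical quadratic response r(alpha) = r0 + b.da + 0.5 da.H.da
alpha0 = np.array([1.0, -0.5])
b = np.array([2.0, 1.0])                 # 1st-order sensitivities at alpha0
H = np.array([[1.0, 0.3],
              [0.3, 2.0]])               # 2nd-order sensitivities (Hessian)
C_aa = np.array([[0.05, 0.01],
                 [0.01, 0.08]])          # parameter covariance matrix
r0 = 3.0                                 # r(alpha0)

# Eq. (4) through second order: E_c = r(alpha0) + 0.5 * sum_ij H_ij * C_ij
E_c = r0 + 0.5 * np.sum(H * C_aa)

# Monte Carlo check of the expectation of r(alpha):
da = rng.multivariate_normal(alpha0, C_aa, size=400_000) - alpha0
r_samples = r0 + da @ b + 0.5 * np.einsum('ni,ij,nj->n', da, H, da)
assert abs(E_c - r_samples.mean()) < 0.01
```

Note that the first-order term of Equation (3) does not contribute to the mean: only the even-order sensitivities shift $E_c(r_k)$ away from the nominal value $r_k(\boldsymbol{\alpha}^0)$.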

The expectation values $E_c(r_k)$, $k = 1, \ldots, TR$, are considered to be the components of a vector defined as follows: $E_c(\mathbf{r}) \triangleq [E_c(r_1), \ldots, E_c(r_k), \ldots, E_c(r_{TR})]^\dagger$.

The expression of the correlation between a parameter variation $\delta\alpha_i$ and a computed response $r_k$, which will be denoted as $\mathrm{cor}(\alpha_i, r_k)$, is presented below:

$\mathrm{cor}(\alpha_i, r_k) \triangleq \int \delta\alpha_i \left[ r_k(\boldsymbol{\alpha}) - E_c(r_k) \right] p_c(\boldsymbol{\alpha}, \mathbf{r}) \, d\boldsymbol{\alpha} \, d\mathbf{r} = \sum_{j=1}^{TP} \left\{ \frac{\partial r_k(\boldsymbol{\alpha})}{\partial \alpha_j} \right\}_{\boldsymbol{\alpha}^0} \mathrm{cov}(\alpha_i, \alpha_j) + \frac{1}{2} \sum_{j_1=1}^{TP} \sum_{j_2=1}^{TP} \left\{ \frac{\partial^2 r_k(\boldsymbol{\alpha})}{\partial \alpha_{j_1} \partial \alpha_{j_2}} \right\}_{\boldsymbol{\alpha}^0} t_{i, j_1 j_2} \sigma_i \sigma_{j_1} \sigma_{j_2} + \frac{1}{6} \sum_{j_1=1}^{TP} \sum_{j_2=1}^{TP} \sum_{j_3=1}^{TP} \left\{ \frac{\partial^3 r_k(\boldsymbol{\alpha})}{\partial \alpha_{j_1} \partial \alpha_{j_2} \partial \alpha_{j_3}} \right\}_{\boldsymbol{\alpha}^0} q_{i, j_1 j_2 j_3} \sigma_i \sigma_{j_1} \sigma_{j_2} \sigma_{j_3} + O(\sigma^5).$ (5)

The correlations $\mathrm{cor}(\alpha_i, r_k)$, $i = 1, \ldots, TP$, $k = 1, \ldots, TR$, are considered to be the components of a “parameter-response computed correlation matrix” denoted as $\mathbf{C}_{\alpha r}^c$ and defined as follows:

$\mathbf{C}_{\alpha r}^c \triangleq \begin{pmatrix} \mathrm{cor}(\alpha_1, r_1) & \cdots & \mathrm{cor}(\alpha_1, r_{TR}) \\ \vdots & \ddots & \vdots \\ \mathrm{cor}(\alpha_{TP}, r_1) & \cdots & \mathrm{cor}(\alpha_{TP}, r_{TR}) \end{pmatrix}.$ (6)

The expression of the covariance between two responses $r_k$ and $r_l$, denoted as $\mathrm{cov}(r_k, r_l)$, is presented below:

$\mathrm{cov}(r_k, r_l) \triangleq \int \left[ r_k(\boldsymbol{\alpha}) - E_c(r_k) \right] \left[ r_l(\boldsymbol{\alpha}) - E_c(r_l) \right] p_c(\boldsymbol{\alpha}, \mathbf{r}) \, d\boldsymbol{\alpha} \, d\mathbf{r} = \sum_{j_1=1}^{TP} \sum_{j_2=1}^{TP} \left\{ \frac{\partial r_k(\boldsymbol{\alpha})}{\partial \alpha_{j_1}} \right\}_{\boldsymbol{\alpha}^0} \left\{ \frac{\partial r_l(\boldsymbol{\alpha})}{\partial \alpha_{j_2}} \right\}_{\boldsymbol{\alpha}^0} \mathrm{cov}(\alpha_{j_1}, \alpha_{j_2}) + \text{(higher-order terms)},$ (7)

where the “higher-order terms” comprise contributions involving the second-, third- and fourth-order sensitivities together with the parameter correlations $t$ and $q$, in the same pattern as displayed in Equations (4) and (5).

The covariances $\mathrm{cov}(r_k, r_l)$, $k, l = 1, \ldots, TR$, are considered to be the components of a “computed-responses covariance matrix” denoted as $\mathbf{C}_{rr}^c$ and defined as follows:

$\mathbf{C}_{rr}^c \triangleq \begin{pmatrix} \mathrm{cov}(r_1, r_1) & \cdots & \mathrm{cov}(r_1, r_{TR}) \\ \vdots & \ddots & \vdots \\ \mathrm{cov}(r_{TR}, r_1) & \cdots & \mathrm{cov}(r_{TR}, r_{TR}) \end{pmatrix}.$ (8)

The information provided in Equations (1)-(8) regarding the mean values and correlations of the model parameters and computed model responses will be used to construct a second-order MaxEnt approximation, denoted as $p_c(\mathbf{z} \mid \mathbf{z}^c, \mathbf{C}_c)$, to represent the distribution $p_c(\boldsymbol{\alpha}, \mathbf{r}) \triangleq p_\alpha(\boldsymbol{\alpha}) \, p_c(\mathbf{r})$ of computed model responses and parameters. The distribution $p_c(\mathbf{z} \mid \mathbf{z}^c, \mathbf{C}_c)$ is constructed by following the procedure outlined in the Appendix, to obtain the following expression:

$p_c(\boldsymbol{\alpha}, \mathbf{r}) \approx p_c(\mathbf{z} \mid \mathbf{z}^c, \mathbf{C}_c) = (2\pi)^{-(TR+TP)/2} \left[ \mathrm{Det}(\mathbf{C}_c) \right]^{-1/2} \exp\left[ -\frac{1}{2} (\mathbf{z} - \mathbf{z}^c)^\dagger \mathbf{C}_c^{-1} (\mathbf{z} - \mathbf{z}^c) \right],$ (9)

where:

$\mathbf{C}_c \triangleq \begin{pmatrix} \mathbf{C}_{rr}^c & \mathbf{C}_{r\alpha}^c \\ \mathbf{C}_{\alpha r}^c & \mathbf{C}_{\alpha\alpha} \end{pmatrix}; \quad \mathbf{C}_{r\alpha}^c \triangleq (\mathbf{C}_{\alpha r}^c)^\dagger; \quad \mathbf{z} = \begin{pmatrix} \mathbf{r} \\ \boldsymbol{\alpha} \end{pmatrix}; \quad \mathbf{z}^c = \begin{pmatrix} E_c(\mathbf{r}) \\ \boldsymbol{\alpha}^0 \end{pmatrix}.$ (10)

It is important to note that even though the MaxEnt distribution represented in Equation (9) is a multivariate normal distribution characterized by just the first-order moments (mean values) and second-order moments (variances/covariances) of the full but unknown joint distribution $p_c(\boldsymbol{\alpha}, \mathbf{r})$ of computed responses and parameters, the components of the vector $\mathbf{z}^c$ of mean values and the components of the covariance matrix $\mathbf{C}_c$ may comprise terms involving third- and higher-order response sensitivities and parameter correlations (e.g., triple and quadruple correlations), if available, as indicated in Equations (4), (5), and (7).
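The assembly of the blocks in Equation (10) can be sketched for the simplest possible case: a hypothetical linear response $\mathbf{r}(\boldsymbol{\alpha}) = \mathbf{r}^0 + \mathbf{S}(\boldsymbol{\alpha} - \boldsymbol{\alpha}^0)$, for which the first-order truncations of Equations (4), (5) and (7) are exact (all names and values below are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)

TP, TR = 4, 2
S = rng.normal(size=(TR, TP))            # sensitivities {dr_k / d alpha_j}
alpha0 = np.array([1.0, 0.5, -1.0, 2.0])
r0 = np.array([3.0, -2.0])               # r(alpha0)
C_aa = np.diag([0.04, 0.09, 0.01, 0.16])

E_c_r = r0                               # Eq. (4): no higher-order terms
C_ar = C_aa @ S.T                        # Eq. (5): parameter-response block
C_rr = S @ C_aa @ S.T                    # Eq. (7): response-response block

# Assemble the mean vector z_c and the block covariance C_c of Eq. (10),
# with the ordering z = (r, alpha):
z_c = np.concatenate([E_c_r, alpha0])
C_c = np.block([[C_rr, C_ar.T],
                [C_ar, C_aa]])

assert np.allclose(C_c, C_c.T)                   # symmetric
assert np.all(np.linalg.eigvalsh(C_c) > -1e-12)  # positive semi-definite
```

For a nonlinear model, the higher-order sensitivities and the correlations $t$ and $q$ would add further terms to `E_c_r`, `C_ar` and `C_rr`, but the block structure of Equation (10) is unchanged.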

3. 2nd-BERRU-PMP: Second Order Predictive Modeling Methodology with Probabilistically Included Computed Responses

This Section presents the mathematical and physical considerations leading to the development of the second-order predictive modeling methodology in which the computational model is probabilistically incorporated with the measurements into the combined second-order MaxEnt posterior distribution, which represents all of the available computational and experimental information. This methodology will be designated using the acronym 2nd-BERRU-PMP, where the last letter (“P”) indicates “probabilistic.” Subsection 3.1 presents the general case, in which measurements (mean values and correlations) for parameters are available, in addition to the information used in the computational model, for incorporation together with the measured responses into the MaxEnt probabilistic representation of the computational model, in order to finally obtain the joint posterior distribution that represents all of the available second-order information. This general case seldom occurs in practice, because additional measurements regarding the model parameters (outside of, and in addition to, their use in the model) are rarely available. Usually, only measurements for responses are available for incorporation into the MaxEnt representation of the computational model; this case is analyzed in Subsection 3.2.

3.1. General Case: Measurements for Both Responses and Parameters Are Available for Combination with the MaxEnt Probabilistic Representation of the Computational Model to Obtain Their Joint Posterior Distribution

In the most general case, measurements could be available not only for the responses (results) of interest but also about the parameters used in the model to compute the respective results. In such a case, the second-order MaxEnt distribution which will represent probabilistically the computed model responses and model parameters is to be combined with a second-order MaxEnt distribution which is to represent probabilistically the measured model responses and the additionally measured model parameters (i.e., model parameters measured in addition to, and independently of, the computational model). Since only first- and second-order moments are considered to be available, each of these MaxEnt distributions will be multivariate normal distributions, so their combination will also yield a multivariate posterior MaxEnt distribution (for all of the first- and second-order available moments).

Consider that the total number of experimentally measured responses is $TR$. The information usually available regarding the distribution of such measured responses comprises the first-order moments (mean values), which will be denoted as $r_i^e$, $i = 1, \ldots, TR$, and the second-order moments (variances/covariances), which will be denoted as $\mathrm{cov}(r_i, r_j)^e$, $i, j = 1, \ldots, TR$, of the measured responses. Occasionally, measured correlations between the model parameters and the measured responses could also be available. The letter “e” will be used either as a subscript or as a superscript to indicate experimentally measured quantities. The expected values of the measured responses will be considered to constitute the components of a vector denoted as $\mathbf{r}^e \triangleq (r_1^e, \ldots, r_{TR}^e)^\dagger$. The covariances of the measured responses are considered to be the components of the $TR \times TR$-dimensional covariance matrix of measured responses, which will be denoted as $\mathbf{C}_{rr}^e \triangleq [\mathrm{cov}(r_i, r_j)^e]_{TR \times TR}$. In principle, it is also possible to obtain correlations between some measured responses and some model parameters. When such correlations between measured responses and measured model parameters are available, they are denoted as $\mathrm{cov}(\alpha_i, r_j)^e$, $i = 1, \ldots, TP$; $j = 1, \ldots, TR$, and can formally be considered to be elements of a rectangular correlation matrix which will be denoted as $\mathbf{C}_{\alpha r}^e \triangleq [\mathrm{cov}(\alpha_i, r_j)^e]_{TP \times TR}$. In the most general situation, it is possible to have not only measured responses but also new measurements of the parameters, in addition to, and independently of, the parameters used in the mathematical/computational model (where they are considered to have nominal values $\boldsymbol{\alpha}^0$ and covariance matrix $\mathbf{C}_{\alpha\alpha}$). These measured parameters will be considered to have a vector of mean values denoted as $\boldsymbol{\alpha}^e \triangleq (\alpha_1^e, \ldots, \alpha_{TP}^e)^\dagger$ and a covariance matrix denoted as $\mathbf{C}_{\alpha\alpha}^e \triangleq [\mathrm{cov}(\alpha_i, \alpha_j)^e]_{TP \times TP}$.

The MaxEnt principle can now be applied, as described in the Appendix, to construct the least informative (and hence most conservative) distribution using the available experimental information, to obtain the following expression:

$p_e(\mathbf{z} \mid \mathbf{z}^e, \mathbf{C}_e) = (2\pi)^{-(TR+TP)/2} \left[ \mathrm{Det}(\mathbf{C}_e) \right]^{-1/2} \exp\left[ -\frac{1}{2} (\mathbf{z} - \mathbf{z}^e)^\dagger \mathbf{C}_e^{-1} (\mathbf{z} - \mathbf{z}^e) \right],$ (11)

where:

$\mathbf{C}_e \triangleq \begin{pmatrix} \mathbf{C}_{rr}^e & \mathbf{C}_{r\alpha}^e \\ \mathbf{C}_{\alpha r}^e & \mathbf{C}_{\alpha\alpha}^e \end{pmatrix}; \quad \mathbf{z} = \begin{pmatrix} \mathbf{r} \\ \boldsymbol{\alpha} \end{pmatrix}; \quad \mathbf{z}^e = \begin{pmatrix} \mathbf{r}^e \\ \boldsymbol{\alpha}^e \end{pmatrix}.$ (12)

The joint posterior probability distribution of all computed and experimentally measured quantities, which will be denoted as $p_p(\mathbf{z} \mid \mathbf{z}^p, \mathbf{C}_p)$, where the subscript “p” indicates “posterior,” is obtained as the properly normalized product of the distributions $p_e(\mathbf{z} \mid \mathbf{z}^e, \mathbf{C}_e)$ and $p_c(\mathbf{z} \mid \mathbf{z}^c, \mathbf{C}_c)$. Since both $p_e(\mathbf{z} \mid \mathbf{z}^e, \mathbf{C}_e)$ and $p_c(\mathbf{z} \mid \mathbf{z}^c, \mathbf{C}_c)$ are normally distributed, it follows that the posterior probability $p_p(\mathbf{z} \mid \mathbf{z}^p, \mathbf{C}_p)$ will also be normally distributed and will be given by the following expression:

$p_p(\mathbf{z} \mid \mathbf{z}^p, \mathbf{C}_p) \sim p_c(\mathbf{z} \mid \mathbf{z}^c, \mathbf{C}_c) \, p_e(\mathbf{z} \mid \mathbf{z}^e, \mathbf{C}_e) = K \exp\left[ -\frac{1}{2} Q(\mathbf{z} \mid \mathbf{z}^p, \mathbf{C}_p) \right],$ (13)

where the normalization constant $K$ and the quadratic form $Q(\mathbf{z} \mid \mathbf{z}^p, \mathbf{C}_p)$ have the following expressions, respectively:

$K \triangleq (2\pi)^{-N/2} \left[ \mathrm{Det}(\mathbf{C}_c + \mathbf{C}_e) \right]^{-1/2} \exp\left[ -\frac{1}{2} (\mathbf{z}^c - \mathbf{z}^e)^\dagger (\mathbf{C}_c + \mathbf{C}_e)^{-1} (\mathbf{z}^c - \mathbf{z}^e) \right];$ (14)

$Q(\mathbf{z} \mid \mathbf{z}^p, \mathbf{C}_p) \triangleq (\mathbf{z} - \mathbf{z}^p)^\dagger \mathbf{C}_p^{-1} (\mathbf{z} - \mathbf{z}^p);$ (15)

$\mathbf{C}_p \triangleq (\mathbf{C}_c^{-1} + \mathbf{C}_e^{-1})^{-1} = \mathbf{C}_c - \mathbf{C}_c (\mathbf{C}_c + \mathbf{C}_e)^{-1} \mathbf{C}_c = \mathbf{C}_e - \mathbf{C}_e (\mathbf{C}_c + \mathbf{C}_e)^{-1} \mathbf{C}_e;$ (16)

$\mathbf{z}^p \triangleq \mathbf{C}_p (\mathbf{C}_c^{-1} \mathbf{z}^c + \mathbf{C}_e^{-1} \mathbf{z}^e).$ (17)

The expression obtained in Equation (13) provides the exact first-order moments (mean values) and second-order moments (variances and covariances) of the most comprehensive combined distribution of computations and measurements of responses and parameters. In practice, however, such a comprehensive amount of experimental information is highly unlikely to be available. Furthermore, even if such a massive amount of experimental information were available, the computation of the inverse matrix $(\mathbf{C}_c + \mathbf{C}_e)^{-1}$ may be impractical, since it would require massive computational resources when the physical system involves many (thousands of) parameters.
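The three equivalent forms of the posterior covariance in Equation (16), and the precision-weighted posterior mean of Equation (17), can be verified numerically. The following is a small self-contained sketch with hypothetical symmetric positive-definite matrices standing in for $\mathbf{C}_c$ and $\mathbf{C}_e$:

```python
import numpy as np

rng = np.random.default_rng(3)

def spd(n):
    # Random symmetric positive-definite matrix (illustrative stand-in).
    A = rng.normal(size=(n, n))
    return A @ A.T + n * np.eye(n)

n = 3
C_c, C_e = spd(n), spd(n)
z_c, z_e = rng.normal(size=n), rng.normal(size=n)

# Eq. (16): three algebraically equivalent forms of C_p
Cp1 = np.linalg.inv(np.linalg.inv(C_c) + np.linalg.inv(C_e))
Cp2 = C_c - C_c @ np.linalg.inv(C_c + C_e) @ C_c
Cp3 = C_e - C_e @ np.linalg.inv(C_c + C_e) @ C_e
assert np.allclose(Cp1, Cp2) and np.allclose(Cp1, Cp3)

# Posterior variances are reduced relative to both priors:
assert np.all(np.diag(Cp1) <= np.diag(C_c) + 1e-12)
assert np.all(np.diag(Cp1) <= np.diag(C_e) + 1e-12)

# Eq. (17): posterior mean as a precision-weighted average
z_p = Cp1 @ (np.linalg.solve(C_c, z_c) + np.linalg.solve(C_e, z_e))
```

The second and third forms of Equation (16) avoid inverting $\mathbf{C}_c$ and $\mathbf{C}_e$ individually, which is the algebraic seed of the single $TR \times TR$ inversion obtained later in Subsection 3.2.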

3.2. Practical Case: Only Response Measurements Are Available for Combination with the MaxEnt Probabilistic Representation of the Computational Model to Obtain Their Joint Posterior Distribution

In practice, the information (mean values and covariances) about the model parameters indicated in Equations (1) and (2) is obtained and assessed prior to using this information in the mathematical/computational model used for computing responses. Thus, all of the information available regarding the model parameters would be used to construct the components of the vector $\boldsymbol{\alpha}^0 \triangleq (\alpha_1^0, \ldots, \alpha_{TP}^0)^\dagger$ of parameter nominal values and the components of the parameter covariance matrix $\mathbf{C}_{\alpha\alpha}$. In practice, only information about measured responses would become additionally available, i.e., only the vector $\mathbf{r}^e \triangleq (r_1^e, \ldots, r_{TR}^e)^\dagger$ and the covariance matrix $\mathbf{C}_{rr}^e \triangleq [\mathrm{cov}(r_i, r_j)^e]_{TR \times TR}$ would become available for combination with the computational information. In such a case, the MaxEnt distribution corresponding to Equation (11) reduces to the following expression:

$p_e(\mathbf{r} \mid \mathbf{r}^e, \mathbf{C}_{rr}^e) = (2\pi)^{-TR/2} \left[ \mathrm{Det}(\mathbf{C}_{rr}^e) \right]^{-1/2} \exp\left[ -\frac{1}{2} (\mathbf{r} - \mathbf{r}^e)^\dagger (\mathbf{C}_{rr}^e)^{-1} (\mathbf{r} - \mathbf{r}^e) \right].$ (18)

Furthermore, when only the experimental information represented by the distribution $p_e(\mathbf{r} \mid \mathbf{r}^e, \mathbf{C}_{rr}^e)$ is available, the posterior joint probability distribution of the computed and measured quantities, which will be denoted as $p_p(\mathbf{r}, \boldsymbol{\alpha})$, will have the following form:

$p_p(\mathbf{r}, \boldsymbol{\alpha}) = N \exp\left[ -\frac{1}{2} Q(\mathbf{r}, \boldsymbol{\alpha}) \right]; \quad N \triangleq \left[ \int \exp\left[ -\frac{1}{2} Q(\mathbf{r}, \boldsymbol{\alpha}) \right] d\boldsymbol{\alpha} \, d\mathbf{r} \right]^{-1}; \quad Q(\mathbf{r}, \boldsymbol{\alpha}) \triangleq (\mathbf{z} - \mathbf{z}^c)^\dagger \mathbf{C}_c^{-1} (\mathbf{z} - \mathbf{z}^c) + (\mathbf{r} - \mathbf{r}^e)^\dagger (\mathbf{C}_{rr}^e)^{-1} (\mathbf{r} - \mathbf{r}^e).$ (19)

The evaluation of the moments of the posterior distribution represented by Equation (19) will involve the evaluation of ratios of integrals having the following form:

$I = \left[ \int \exp[g(\mathbf{z})] \, d\mathbf{z} \right]^{-1} \int f(\mathbf{z}) \exp[g(\mathbf{z})] \, d\mathbf{z}.$ (20)

The evaluations of expressions such as shown in Equation (20) can be performed to a high degree of accuracy, with a controlled error, by employing the saddle-point (Laplace) method, see e.g., [17] [18] , which yields the following result:

$I = \hat{f} - \frac{1}{2} \hat{f}_{i_1} \hat{g}_{i_2 i_3 i_4} \hat{g}^{i_1 i_2} \hat{g}^{i_3 i_4} + \frac{1}{2} \hat{f}_{j_1 j_2} \hat{g}^{j_1 j_2} + O(\hat{f}_{j_1 j_2 j_3}),$ (21)

where:

1) The derivative of a function with respect to a component of $\mathbf{z}$ is denoted using a subscript, e.g., $f_i \triangleq \partial f / \partial z_i$, $f_{ij} \triangleq \partial^2 f / \partial z_i \partial z_j$, $i, j = 1, \ldots, TI$, where $TI$ denotes the total number of independent variables;

2) Superscripts denote the respective components of the inverse Hessian of the respective function, e.g., $f^{ij}$ denotes the $(i,j)$-element of the inverse Hessian matrix $[f_{ij}]^{-1}$;

3) An index that appears both as a subscript and as a superscript implies summation over all possible values of that index;

4) The “hat” denotes that the respective quantity is to be evaluated at the saddle point of $\exp[g(\mathbf{z})]$, which is defined as the point at which the gradient $g_{\mathbf{z}}(\mathbf{z}) \triangleq (\partial g / \partial z_1, \ldots, \partial g / \partial z_{TI})$ of $g(\mathbf{z})$ vanishes, i.e., $g_{\mathbf{z}}(\mathbf{z}) = \mathbf{0}$.
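A one-dimensional numerical illustration of the saddle-point evaluation of the ratio of integrals in Equation (20) is given below, written here with the weight $\exp[-g(z)]$ so that the saddle point is the minimizer of $g$ (the sign convention and all numerical values are illustrative). For a quadratic $g$ (zero third derivatives) the correction terms displayed in Equation (21) reproduce the ratio exactly:

```python
import numpy as np

m, s = 0.7, 0.3                         # saddle point and scale of g
g = lambda z: 0.5 * ((z - m) / s) ** 2  # quadratic "exponent" function
f = lambda z: 1.0 + 2.0 * z + 3.0 * z ** 2

# Numerical reference value of the ratio I; the grid spacing cancels.
z = np.linspace(m - 10 * s, m + 10 * s, 200_001)
w = np.exp(-g(z))
I_num = np.sum(f(z) * w) / np.sum(w)

# Eq. (21): I = f_hat + (1/2) f''_hat * g^{11}_hat, where the inverse
# Hessian is g^{11} = 1/g'' = s**2 and the term involving the third
# derivatives of g vanishes for this quadratic g.
I_saddle = f(m) + 0.5 * 6.0 * s ** 2

assert abs(I_num - I_saddle) < 1e-6
```

For a non-quadratic $g$, the displayed terms yield a controlled-error approximation rather than an exact value, which is precisely how they are used in the remainder of this Section.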

The saddle point of the normalization integral $N$ will be denoted as $(\mathbf{r}^s, \boldsymbol{\alpha}^s)$ and is defined by the following relations:

$\frac{\partial Q(\mathbf{r}, \boldsymbol{\alpha})}{\partial \mathbf{r}} = \mathbf{0}, \quad \frac{\partial Q(\mathbf{r}, \boldsymbol{\alpha})}{\partial \boldsymbol{\alpha}} = \mathbf{0}, \quad \text{at } (\mathbf{r}, \boldsymbol{\alpha}) = (\mathbf{r}^s, \boldsymbol{\alpha}^s).$ (22)

To obtain the partial gradients (differentials) shown in Equation (22), it is convenient to write the matrix $\mathbf{C}_c^{-1}$ in the form $\mathbf{C}_c^{-1} = \begin{pmatrix} \mathbf{C}^{11} & \mathbf{C}^{12} \\ \mathbf{C}^{21} & \mathbf{C}^{22} \end{pmatrix}$, and to use this form together with the definitions provided in Equations (10) and (12) in Equation (19) to expand the functional $Q(\mathbf{r}, \boldsymbol{\alpha})$ into the following form:

$Q(\mathbf{r}, \boldsymbol{\alpha}) = [\mathbf{r} - E_c(\mathbf{r})]^\dagger \mathbf{C}^{11} [\mathbf{r} - E_c(\mathbf{r})] + [\mathbf{r} - E_c(\mathbf{r})]^\dagger \mathbf{C}^{12} (\boldsymbol{\alpha} - \boldsymbol{\alpha}^0) + (\boldsymbol{\alpha} - \boldsymbol{\alpha}^0)^\dagger \mathbf{C}^{21} [\mathbf{r} - E_c(\mathbf{r})] + (\boldsymbol{\alpha} - \boldsymbol{\alpha}^0)^\dagger \mathbf{C}^{22} (\boldsymbol{\alpha} - \boldsymbol{\alpha}^0) + (\mathbf{r} - \mathbf{r}^e)^\dagger (\mathbf{C}_{rr}^e)^{-1} (\mathbf{r} - \mathbf{r}^e).$ (23)

Taking the partial differentials of the expression in Equation (23) yields the following equation for determining the coordinates of the saddle point $(\mathbf{r}, \boldsymbol{\alpha}) = (\mathbf{r}^s, \boldsymbol{\alpha}^s)$:

$\begin{pmatrix} \mathbf{r}^s - E_c(\mathbf{r}) \\ \boldsymbol{\alpha}^s - \boldsymbol{\alpha}^0 \end{pmatrix} = - \begin{pmatrix} \mathbf{C}_{rr}^c & \mathbf{C}_{r\alpha}^c \\ \mathbf{C}_{\alpha r}^c & \mathbf{C}_{\alpha\alpha} \end{pmatrix} \begin{pmatrix} (\mathbf{C}_{rr}^e)^{-1} (\mathbf{r}^s - \mathbf{r}^e) \\ \mathbf{0} \end{pmatrix}.$ (24)

Solving Equation (24) leads to the following expressions for the coordinates r s and α s of the saddle point:

$\mathbf{r}^s = \mathbf{r}^e + \mathbf{C}_{rr}^e (\mathbf{C}_{rr}^e + \mathbf{C}_{rr}^c)^{-1} [E_c(\mathbf{r}) - \mathbf{r}^e],$ (25)

$\boldsymbol{\alpha}^s = \boldsymbol{\alpha}^0 - \mathbf{C}_{\alpha r}^c (\mathbf{C}_{rr}^e + \mathbf{C}_{rr}^c)^{-1} [E_c(\mathbf{r}) - \mathbf{r}^e].$ (26)
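That Equations (25) and (26) indeed solve the saddle-point system of Equation (24) can be checked numerically. The following sketch uses hypothetical, mutually consistent first-order blocks built from a random sensitivity matrix (all names and values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)
TP, TR = 4, 2

# Hypothetical consistent first-order blocks built from sensitivities S:
S = rng.normal(size=(TR, TP))
A = rng.normal(size=(TP, TP))
C_aa = A @ A.T + TP * np.eye(TP)       # parameter covariances, SPD
C_ar = C_aa @ S.T                      # C_{alpha r}^c
C_rr_c = S @ C_aa @ S.T                # C_{rr}^c
B = rng.normal(size=(TR, TR))
C_rr_e = B @ B.T + TR * np.eye(TR)     # measured-response covariances, SPD

E_c_r = rng.normal(size=TR)            # computed-response mean, Eq. (4)
r_e = rng.normal(size=TR)              # measured responses
alpha0 = rng.normal(size=TP)

K = C_rr_e + C_rr_c
d = E_c_r - r_e

# Eqs. (25)-(26): saddle-point coordinates
r_s = r_e + C_rr_e @ np.linalg.solve(K, d)
a_s = alpha0 - C_ar @ np.linalg.solve(K, d)

# Check that they satisfy the saddle-point system, Eq. (24):
lhs = np.concatenate([r_s - E_c_r, a_s - alpha0])
rhs = -np.block([[C_rr_c], [C_ar]]) @ np.linalg.solve(C_rr_e, r_s - r_e)
assert np.allclose(lhs, rhs)
```

Note that the system is solved using only the inverse of the $TR \times TR$ matrix $\mathbf{C}_{rr}^e + \mathbf{C}_{rr}^c$, never the full $(TR+TP) \times (TR+TP)$ matrix $\mathbf{C}_c$.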

The best-estimate predicted mean values for the responses and parameters are defined below:

$\mathbf{r}^{bep} \triangleq \int_{D_p} \mathbf{r} \, p_p(\boldsymbol{\alpha}, \mathbf{r}) \, d\boldsymbol{\alpha} \, d\mathbf{r},$ (27)

$\boldsymbol{\alpha}^{bep} \triangleq \int_{D_p} \boldsymbol{\alpha} \, p_p(\boldsymbol{\alpha}, \mathbf{r}) \, d\boldsymbol{\alpha} \, d\mathbf{r}.$ (28)

The superscript “bep” used in Equations (27) and (28), respectively, indicates that the expressions obtained for the respective predicted responses and parameters are “best estimates with probabilistic computational model.” The saddle-point method is applied to evaluate the integrals represented by Equations (27) and (28), respectively, to obtain the following expressions for the optimally predicted “best-estimate” values of the responses and calibrated parameters:

$\mathbf{r}^{bep} = \mathbf{r}^s = \mathbf{r}^e + \mathbf{C}_{rr}^e (\mathbf{C}_{rr}^e + \mathbf{C}_{rr}^c)^{-1} [E_c(\mathbf{r}) - \mathbf{r}^e],$ (29)

$\boldsymbol{\alpha}^{bep} = \boldsymbol{\alpha}^s = \boldsymbol{\alpha}^0 - \mathbf{C}_{\alpha r}^c (\mathbf{C}_{rr}^e + \mathbf{C}_{rr}^c)^{-1} [E_c(\mathbf{r}) - \mathbf{r}^e].$ (30)

Since the components of the vector $E_c(\mathbf{r})$ and the components of the matrices $\mathbf{C}_{rr}^c$ and $\mathbf{C}_{\alpha r}^c$ can contain arbitrarily high-order response sensitivities to model parameters, the expressions presented in Equations (29) and (30) generalize the previous formulas of this type found in the data adjustment/assimilation procedures published to date (which contain at most second-order sensitivities). The best-estimate parameter values are the “calibrated model parameters,” which can be used for subsequent computations with the “calibrated model.”
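The best-estimate updates of Equations (29) and (30) can be sketched numerically; the blocks below are built to first order from a hypothetical sensitivity matrix (higher-order sensitivities would only change how $E_c(\mathbf{r})$, $\mathbf{C}_{rr}^c$ and $\mathbf{C}_{\alpha r}^c$ are assembled, not the update formulas themselves; all values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(5)
TP, TR = 5, 2

S = rng.normal(size=(TR, TP))                 # sensitivities dr_k/d alpha_j
A = rng.normal(size=(TP, TP))
C_aa = A @ A.T + TP * np.eye(TP)              # parameter covariances
C_ar = C_aa @ S.T                             # C_{alpha r}^c
C_rr_c = S @ C_aa @ S.T                       # C_{rr}^c
B = rng.normal(size=(TR, TR))
C_rr_e = B @ B.T + TR * np.eye(TR)            # measured-response covariances

alpha0 = np.zeros(TP)
E_c_r = np.array([1.0, -2.0])                 # computed responses
r_e = np.array([1.3, -1.6])                   # measured responses

K = C_rr_e + C_rr_c
d = E_c_r - r_e

r_bep = r_e + C_rr_e @ np.linalg.solve(K, d)  # Eq. (29)
a_bep = alpha0 - C_ar @ np.linalg.solve(K, d) # Eq. (30): calibrated parameters

# Consistency check: Eq. (29) can equivalently be written relative to the
# computed responses, r_bep = E_c(r) - C_rr^c K^{-1} [E_c(r) - r_e]:
assert np.allclose(r_bep, E_c_r - C_rr_c @ np.linalg.solve(K, d))
```

The equivalent form checked at the end shows that $\mathbf{r}^{bep}$ interpolates between the measured and the computed responses, weighted by their respective covariances.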

The second-order moments of the posterior distribution $p_p(\mathbf{r}, \boldsymbol{\alpha})$ comprise the covariances of the best-estimate responses, which will be denoted as $\mathbf{C}_{rr}^{bep}$, the covariances of the best-estimate parameters, which will be denoted as $\mathbf{C}_{\alpha\alpha}^{bep}$, and the correlations between the best-estimate parameters and responses, which will be denoted as $\mathbf{C}_{\alpha r}^{bep}$. The expression of the “best-estimate probabilistic model” posterior response covariance matrix, $\mathbf{C}_{rr}^{bep}$, for the best-estimate responses $\mathbf{r}^{bep}$ is derived by using the result given in Equation (29), to obtain the following expression:

$\mathbf{C}_{rr}^{bep} \triangleq \int_{D_p} (\mathbf{r} - \mathbf{r}^{bep})(\mathbf{r} - \mathbf{r}^{bep})^\dagger \, p_p(\boldsymbol{\alpha}, \mathbf{r}) \, d\boldsymbol{\alpha} \, d\mathbf{r} = \mathbf{C}_{rr}^e - \mathbf{C}_{rr}^e (\mathbf{C}_{rr}^e + \mathbf{C}_{rr}^c)^{-1} \mathbf{C}_{rr}^e.$ (31)

The following important result has been used to obtain the expression provided in Equation (31):

$\left\langle [E_c(\mathbf{r}) - \mathbf{r}^e][E_c(\mathbf{r}) - \mathbf{r}^e]^\dagger \right\rangle = \left\langle [\mathbf{r} - \mathbf{r}^e - \mathbf{r} + E_c(\mathbf{r})][\mathbf{r} - \mathbf{r}^e - \mathbf{r} + E_c(\mathbf{r})]^\dagger \right\rangle = \mathbf{C}_{rr}^e + \mathbf{C}_{rr}^c.$ (32)

As indicated in Equation (31), the initial covariance matrix $\mathbf{C}_{rr}^e$ of the experimentally measured responses is multiplied by the matrix $[\mathbf{I} - (\mathbf{C}_{rr}^e + \mathbf{C}_{rr}^c)^{-1} \mathbf{C}_{rr}^e]$, which implies that the variances on the diagonal of the best-estimate matrix $\mathbf{C}_{rr}^{bep}$ will be smaller than the experimentally measured variances contained in $\mathbf{C}_{rr}^e$. Hence, the incorporation of experimental information reduces the predicted best-estimate response variances in $\mathbf{C}_{rr}^{bep}$ by comparison to the measured variances contained a priori in $\mathbf{C}_{rr}^e$. Since the components of the matrix $\mathbf{C}_{rr}^{bep}$ contain high-order sensitivities, the formula presented in Equation (31) generalizes the previous formulas of this type found in the data adjustment/assimilation procedures published to date.

The expression of the “best-estimate” posterior parameter covariance matrix $\mathbf{C}_{\alpha\alpha}^{bep}$ for the best-estimate parameters $\boldsymbol{\alpha}^{bep}$ is derived by using the result given in Equation (30), to obtain:

$\mathbf{C}_{\alpha\alpha}^{bep} \triangleq \int_{D_p} (\boldsymbol{\alpha} - \boldsymbol{\alpha}^{bep})(\boldsymbol{\alpha} - \boldsymbol{\alpha}^{bep})^\dagger \, p_p(\boldsymbol{\alpha}, \mathbf{r}) \, d\boldsymbol{\alpha} \, d\mathbf{r} = \mathbf{C}_{\alpha\alpha} - \mathbf{C}_{\alpha r}^c (\mathbf{C}_{rr}^e + \mathbf{C}_{rr}^c)^{-1} (\mathbf{C}_{\alpha r}^c)^\dagger.$ (33)

The matrices $\mathbf{C}_{\alpha\alpha}$ and $\mathbf{C}_{\alpha r}^c (\mathbf{C}_{rr}^e + \mathbf{C}_{rr}^c)^{-1} (\mathbf{C}_{\alpha r}^c)^\dagger$ are symmetric and positive definite. Therefore, the subtraction indicated in Equation (33) implies that the components of the main diagonal of $\mathbf{C}_{\alpha\alpha}^{bep}$ must have smaller values than the corresponding elements of the main diagonal of $\mathbf{C}_{\alpha\alpha}$. In this sense, the combination of computational and experimental information has reduced the best-estimate parameter variances on the diagonal of $\mathbf{C}_{\alpha\alpha}^{bep}$. Since the components of the matrices $\mathbf{C}_{\alpha\alpha}$, $\mathbf{C}_{\alpha r}^c$, and $\mathbf{C}_{rr}^c$ contain high-order response sensitivities, the formula presented in Equation (33) generalizes the previous formulas of this type found in the data adjustment/assimilation procedures published to date.
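The variance-reduction property stated above for Equations (31) and (33) can be verified numerically on hypothetical first-order blocks (all values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(6)
TP, TR = 5, 2

S = rng.normal(size=(TR, TP))
A = rng.normal(size=(TP, TP))
C_aa = A @ A.T + TP * np.eye(TP)       # prior parameter covariances
C_ar = C_aa @ S.T                      # C_{alpha r}^c
C_rr_c = S @ C_aa @ S.T                # C_{rr}^c
B = rng.normal(size=(TR, TR))
C_rr_e = B @ B.T + TR * np.eye(TR)     # measured-response covariances
K = C_rr_e + C_rr_c

# Eq. (31) and Eq. (33): best-estimate covariance matrices
C_rr_bep = C_rr_e - C_rr_e @ np.linalg.solve(K, C_rr_e)
C_aa_bep = C_aa - C_ar @ np.linalg.solve(K, C_ar.T)

# Predicted variances are reduced relative to the prior ones ...
assert np.all(np.diag(C_rr_bep) <= np.diag(C_rr_e) + 1e-9)
assert np.all(np.diag(C_aa_bep) <= np.diag(C_aa) + 1e-9)
# ... and the results remain valid (positive semi-definite) covariances:
assert np.all(np.linalg.eigvalsh(C_rr_bep) > -1e-9)
assert np.all(np.linalg.eigvalsh(C_aa_bep) > -1e-9)
```

Since the subtracted matrices are positive semi-definite, the diagonal (variance) entries can only decrease, in agreement with the discussion of Equations (31) and (33).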

The expressions of the “best-estimate” posterior parameter-response correlation matrix $\mathbf{C}_{\alpha r}^{bep}$ and of its transpose $\mathbf{C}_{r\alpha}^{bep}$, for the best-estimate parameters $\boldsymbol{\alpha}^{bep}$ and best-estimate responses $\mathbf{r}^{bep}$, are derived by using the results given in Equations (29) and (30), to obtain the following expressions:

$\mathbf{C}_{\alpha r}^{bep} \triangleq \int_{D_p} (\boldsymbol{\alpha} - \boldsymbol{\alpha}^{bep})(\mathbf{r} - \mathbf{r}^{bep})^\dagger \, p_p(\boldsymbol{\alpha}, \mathbf{r}) \, d\boldsymbol{\alpha} \, d\mathbf{r} = \mathbf{C}_{\alpha r}^e - \mathbf{C}_{\alpha r}^c (\mathbf{C}_{rr}^e + \mathbf{C}_{rr}^c)^{-1} \mathbf{C}_{rr}^e;$ (34)

C r α b e p D p ( r r b e p ) ( α α b e p ) p p ( α , r ) d α d r = C r α e C r r e ( C r r e + C r r c ) 1 C r α c = ( C α r b e p ) . (35)

Since the components of the matrices C α r c and C r r c contain high-order sensitivities, the formulas presented in Equations (34) and (35) generalize the previous formulas of this type found in data adjustment/assimilation procedures published to date.

It is important to note from the results shown in Equations (29)-(35) that the computation of the best-estimate parameter and response values, together with their corresponding best-estimate covariance matrices, involves only a single matrix inversion, namely the computation of $(C_{rr}^{e}+C_{rr}^{c})^{-1}$, which entails inverting a matrix of size $TR \times TR$. This is computationally very advantageous, since $TR \ll TP$, i.e., the number of responses is much smaller than the number of model parameters in the overwhelming majority of practical situations.
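The single-inversion structure can be sketched as follows: the one $TR \times TR$ inverse is computed once and reused for all of the best-estimate covariances. All matrices below are randomly generated illustrative stand-ins, and the signs follow one consistent reading of Equations (31), (33) and (34):

```python
import numpy as np

rng = np.random.default_rng(1)
TR, TP = 4, 200  # TR << TP, as in most practical situations

def spd(n):
    """Random symmetric positive-definite matrix (stand-in covariance)."""
    A = rng.standard_normal((n, n))
    return A @ A.T + n * np.eye(n)

C_rr_e, C_rr_c, C_aa = spd(TR), spd(TR), spd(TP)
C_ar_c = rng.standard_normal((TP, TR))  # parameter-response covariance (assumed)
C_ar_e = rng.standard_normal((TP, TR))  # parameter-measurement covariance (assumed)

# The only inverse required, of size TR x TR, shared by Eqs. (31), (33), (34):
K = np.linalg.inv(C_rr_e + C_rr_c)

C_rr_bep = C_rr_e @ (np.eye(TR) - K @ C_rr_e)  # Eq. (31)
C_aa_bep = C_aa - C_ar_c @ K @ C_ar_c.T        # Eq. (33)
C_ar_bep = C_ar_e + C_ar_c @ K @ C_rr_e        # Eq. (34), one sign convention
```

No inverse of any $TP \times TP$ matrix is ever needed, even though the calibrated parameter covariance $C_{\alpha\alpha}^{bep}$ is of size $TP \times TP$.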

Using Equations (24), (29) and (30) in Equation (23) yields the following expression for the minimum value, $Q_{\min} = Q(\alpha^{bep}, r^{bep})$, of the quadratic form $Q(\alpha,r)$:

\[ Q_{\min} = \begin{pmatrix} r^{bep}-E^{c}(r) \\ \alpha^{bep}-\alpha^{0} \end{pmatrix}^{\dagger} \begin{pmatrix} C_{rr}^{c} & C_{r\alpha}^{c} \\ C_{\alpha r}^{c} & C_{\alpha\alpha} \end{pmatrix}^{-1} \begin{pmatrix} r^{bep}-E^{c}(r) \\ \alpha^{bep}-\alpha^{0} \end{pmatrix} + \left(r^{bep}-r^{e}\right)^{\dagger}\left(C_{rr}^{e}\right)^{-1}\left(r^{bep}-r^{e}\right) = \left[E^{c}(r)-r^{e}\right]^{\dagger}\left(C_{rr}^{e}+C_{rr}^{c}\right)^{-1}\left[E^{c}(r)-r^{e}\right]. \tag{36} \]

As the expression obtained in Equation (36) indicates, the quantity $Q_{\min}$ represents the square of the length of the vector $[E^{c}(r)-r^{e}]$, measuring (in the metric defined by $(C_{rr}^{e}+C_{rr}^{c})^{-1}$) the deviations between the experimental and nominally computed responses. The quantity $Q_{\min}$ is independent of calibrating (or adjusting) the original data, so it can be evaluated directly from the given data (i.e., model parameters and computed and measured responses, together with their original uncertainties) once the matrix $(C_{rr}^{e}+C_{rr}^{c})^{-1}$ has been computed. As the dimension of the vector $[E^{c}(r)-r^{e}]$ indicates, the number of degrees of freedom characteristic of the calibration under consideration is equal to the number $TR$ of experimental responses.
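A minimal sketch of evaluating $Q_{\min}$ directly from the original (uncalibrated) data; the arrays below are illustrative stand-ins, not values from the paper:

```python
import numpy as np

rng = np.random.default_rng(2)
TR = 5  # number of measured responses (illustrative)

def spd(n):
    """Random symmetric positive-definite matrix (stand-in covariance)."""
    A = rng.standard_normal((n, n))
    return A @ A.T + n * np.eye(n)

C_rr_e, C_rr_c = spd(TR), spd(TR)
Ec_r = rng.standard_normal(TR)  # expected computed responses (assumed data)
r_e = rng.standard_normal(TR)   # measured responses (assumed data)

# Eq. (36): Q_min needs no calibrated quantities, only the original data.
dev = Ec_r - r_e
Q_min = dev @ np.linalg.solve(C_rr_e + C_rr_c, dev)

# As a squared length in the metric (C_rr_e + C_rr_c)^{-1}, Q_min >= 0,
# vanishing only when computed and measured responses agree exactly.
assert Q_min >= 0.0
```

Since $Q_{\min}$ has $TR$ degrees of freedom, comparing its value against a chi-square distribution with $TR$ degrees of freedom provides a consistency check of the data before any calibration is performed.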

4. Inter-Comparison: 2nd-BERRU-PMP vs. 2nd-BERRU-PMD

In this section, the results obtained for the best-estimate mean values for the responses and parameters, together with their corresponding best estimate covariances/correlations will be compared to the corresponding results produced by the 2nd-BERRU-PMD methodology, as well as to the corresponding results produced by the 1st-BERRU-PM [10] [11] [12] .

4.1. Inter-Comparison of Expressions for the Best-Estimate Predicted Mean Values for Responses

Recall that the 2nd-BERRU-PMD (best-estimated values with deterministically incorporated computed responses) methodology incorporates deterministically (into the second-order MaxEnt representation of the experimentally measured responses) the following second-order Taylor-expansion of the computed responses:

\[ r = r^{c}(\alpha^{0}) + \left[S(\alpha^{0})\right](\alpha-\alpha^{0}) + p(\alpha;\alpha^{0}); \quad p(\alpha;\alpha^{0}) \equiv \left[p_{1}(\alpha;\alpha^{0}),\ldots,p_{TR}(\alpha;\alpha^{0})\right]^{\dagger}; \quad p_{k}(\alpha;\alpha^{0}) \equiv \frac{1}{2}(\alpha-\alpha^{0})^{\dagger}\left[Q_{k}(\alpha^{0})\right](\alpha-\alpha^{0}). \tag{37} \]

In Equation (37), the components of the vector $r^{c}(\alpha^{0}) \equiv [r_{1}^{c}(\alpha^{0}),\ldots,r_{TR}^{c}(\alpha^{0})]^{\dagger}$ represent the values of the model responses computed at the nominal parameter values $\alpha^{0}$, while the other vectors and matrices contain the first-order and second-order sensitivities of the responses with respect to the model parameters, evaluated at the nominal parameter values $\alpha^{0}$, as defined below:

\[ S(\alpha) \equiv \begin{pmatrix} s_{11} & \cdots & s_{1,TP} \\ \vdots & \ddots & \vdots \\ s_{TR,1} & \cdots & s_{TR,TP} \end{pmatrix} \equiv \begin{pmatrix} \partial r_{1}(\alpha)/\partial\alpha_{1} & \cdots & \partial r_{1}(\alpha)/\partial\alpha_{TP} \\ \vdots & \ddots & \vdots \\ \partial r_{TR}(\alpha)/\partial\alpha_{1} & \cdots & \partial r_{TR}(\alpha)/\partial\alpha_{TP} \end{pmatrix}; \quad Q_{k}(\alpha) \equiv \begin{pmatrix} q_{11}^{(k)} & \cdots & q_{1,TP}^{(k)} \\ \vdots & \ddots & \vdots \\ q_{TP,1}^{(k)} & \cdots & q_{TP,TP}^{(k)} \end{pmatrix} \equiv \begin{pmatrix} \partial^{2}r_{k}^{c}(\alpha)/\partial\alpha_{1}\partial\alpha_{1} & \cdots & \partial^{2}r_{k}^{c}(\alpha)/\partial\alpha_{1}\partial\alpha_{TP} \\ \vdots & \ddots & \vdots \\ \partial^{2}r_{k}^{c}(\alpha)/\partial\alpha_{TP}\partial\alpha_{1} & \cdots & \partial^{2}r_{k}^{c}(\alpha)/\partial\alpha_{TP}\partial\alpha_{TP} \end{pmatrix}. \tag{38} \]
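When adjoint-based sensitivities are unavailable, the entries of $S$ and of the Hessians $Q_{k}$ can be approximated by central finite differences. The model below is a purely illustrative two-response, three-parameter function (an assumption for demonstration, not a model from the paper):

```python
import numpy as np

def model(a):
    """Illustrative model: TR = 2 responses of TP = 3 parameters."""
    return np.array([a[0]**2 + a[1] * a[2], np.sin(a[0]) + a[2]**2])

def sensitivities(f, a0, h=1e-4):
    """Central-difference estimates of S (TR x TP) and the Hessians Q_k."""
    TP = a0.size
    TR = f(a0).size
    S = np.zeros((TR, TP))
    Q = np.zeros((TR, TP, TP))
    for i in range(TP):
        ei = np.zeros(TP); ei[i] = h
        S[:, i] = (f(a0 + ei) - f(a0 - ei)) / (2 * h)
        for j in range(TP):
            ej = np.zeros(TP); ej[j] = h
            # Central cross-difference for d^2 f / (da_i da_j):
            Q[:, i, j] = (f(a0 + ei + ej) - f(a0 + ei - ej)
                          - f(a0 - ei + ej) + f(a0 - ei - ej)) / (4 * h * h)
    return S, Q

a0 = np.array([0.5, 1.0, 2.0])  # nominal parameter values (illustrative)
S, Q = sensitivities(model, a0)
```

Each Hessian slice `Q[k]` corresponds to $Q_{k}(\alpha^{0})$; note that the finite-difference cost grows quadratically with $TP$, which is precisely why the adjoint methods cited in the paper are preferred for large models.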

The expressions of the end-results produced by the “best-estimated values with deterministically incorporated computed responses” (2nd-BERRU-PMD) methodology will be designated using the superscript “bed.” The expression obtained using the 2nd-BERRU-PMD methodology for the best-estimate responses is as follows [1] :

\[ r^{bed} = r^{e} + \left(C_{rr}^{e} - C_{r\alpha}^{e}S^{\dagger}\right)\theta^{(h,n+1)}, \tag{39} \]

where the vector $\theta^{(h,n+1)} \equiv \left(\theta_{1}^{(h,n+1)},\ldots,\theta_{k}^{(h,n+1)},\ldots,\theta_{TR}^{(h,n+1)}\right)^{\dagger}$ is the solution of the following iterative equation (Newton's method):

\[ \theta^{(n+1)} = \theta^{(n)} - \left[A(\theta^{(n)})\right]^{-1}w(\theta^{(n)}); \quad n = 0,1,\ldots. \tag{40} \]

In Equation (40), the components $A_{kn}(\theta)$ of the Jacobian matrix $A(\theta)$ have the following expressions:

\[ A_{kn}(\theta) = D_{kn} - V_{kn}(\theta), \tag{41} \]

where:

\[ D_{kj} \equiv \left(C_{rr}^{e} + SC_{\alpha\alpha}S^{\dagger} - C_{r\alpha}^{e}S^{\dagger} - SC_{\alpha r}^{e}\right)_{kj}, \quad D \equiv \left[D_{kj}\right]_{TR\times TR}; \tag{42} \]

\[ V_{kn}(\theta) \equiv \sum_{j=1}^{TR}\theta_{j}\left[M_{nj}^{(k)} + M_{kj}^{(n)} + M_{kn}^{(j)}\right], \quad V(\theta) \equiv \left[V_{kn}(\theta)\right]_{TR\times TR}. \tag{43} \]

\[ M^{(k)} \equiv \left[M_{ij}^{(k)}\right]_{TR\times TR} \equiv \left(C_{\alpha r}^{e} - C_{\alpha\alpha}S^{\dagger}\right)^{\dagger}\left[Q_{k}(\alpha^{0})\right]\left(C_{\alpha r}^{e} - C_{\alpha\alpha}S^{\dagger}\right) = \left[M^{(k)}\right]^{\dagger}. \tag{44} \]

\[ w(\theta) \equiv \left[w_{1}(\theta),\ldots,w_{k}(\theta),\ldots,w_{TR}(\theta)\right]^{\dagger} \equiv -d + D\theta + u(\theta). \tag{45} \]

\[ d \equiv r^{c}(\alpha^{0}) - r^{e}; \quad d \equiv \left(d_{1},\ldots,d_{TR}\right)^{\dagger}; \quad d_{i} \equiv r_{i}^{c}(\alpha^{0}) - r_{i}^{e}; \quad i = 1,\ldots,TR. \tag{46} \]

\[ u(\theta) \equiv \left[u_{1}(\theta),\ldots,u_{TR}(\theta)\right]^{\dagger}; \quad u_{k}(\theta) \equiv -\frac{1}{2}\theta^{\dagger}M^{(k)}\theta - \sum_{i=1}^{TR}\theta_{i}\sum_{j=1}^{TR}M_{kj}^{(i)}\theta_{j}. \tag{47} \]

The initial iterate for starting the iterations aimed at obtaining the solution of Equation (40) can be chosen [1] as follows:

\[ \theta^{(h,0)} \equiv \left(\theta_{1}^{(h,0)},\ldots,\theta_{j}^{(h,0)},\ldots,\theta_{TR}^{(h,0)}\right)^{\dagger}; \quad \theta_{j}^{(h,0)} = 2D_{jj}\big/\left(3M_{jj}^{(j)}\right); \quad j = 1,\ldots,TR. \tag{48} \]

\[ A_{kn}(\theta^{(h,0)}) = D_{kn} - \sum_{j=1}^{TR}\theta_{j}^{(h,0)}\left[M_{nj}^{(k)} + M_{kj}^{(n)} + M_{kn}^{(j)}\right]. \tag{49} \]

\[ w_{k}(\theta^{(h,0)}) = u_{k}(\theta^{(h,0)}) + \sum_{j=1}^{TR}D_{kj}\theta_{j}^{(h,0)} - d_{k}, \quad k = 1,\ldots,TR; \tag{50} \]

\[ u_{k}(\theta^{(h,0)}) = -\frac{1}{2}\sum_{i=1}^{TR}\sum_{j=1}^{TR}\theta_{i}^{(h,0)}M_{ij}^{(k)}\theta_{j}^{(h,0)} - \sum_{i=1}^{TR}\theta_{i}^{(h,0)}\sum_{j=1}^{TR}M_{kj}^{(i)}\theta_{j}^{(h,0)} = -\frac{4}{9}\sum_{i=1}^{TR}\sum_{j=1}^{TR}\left(\frac{D_{ii}}{M_{ii}^{(i)}}\right)\left(\frac{D_{jj}}{M_{jj}^{(j)}}\right)\left[\frac{1}{2}M_{ij}^{(k)} + M_{kj}^{(i)}\right], \quad k = 1,\ldots,TR. \tag{51} \]

The first iterate $\theta^{(h,1)}$ of the solution of Equation (40) has the following expression:

\[ \theta^{(h,1)} = \theta^{(h,0)} - \left(D - V^{(h,0)}\right)^{-1}\left(u^{(h,0)} + D\theta^{(h,0)} - d\right); \quad V^{(h,0)} \equiv V(\theta^{(h,0)}); \quad u^{(h,0)} \equiv u(\theta^{(h,0)}). \tag{52} \]

Using the first iterate $\theta^{(h,1)}$ in Equation (39) yields the following expression for the “first-iterate best-estimate” values of the responses:

\[ r_{(1)}^{bed} = r^{e} + r^{(0)}; \quad r^{(0)} \equiv \left(C_{rr}^{e} - C_{r\alpha}^{e}S^{\dagger}\right)\left[\theta^{(h,0)} - \left(D - V^{(h,0)}\right)^{-1}\left(u^{(h,0)} + D\theta^{(h,0)} - d\right)\right]. \tag{53} \]
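Since Equation (40) is a standard Newton iteration for the root of $w(\theta) = 0$, its mechanics can be illustrated with a minimal sketch. The matrices `D` and `M` and the vector `d` below are small illustrative stand-ins (not quantities from the paper), and $w$ is given the same affine-plus-quadratic structure as in Equation (45):

```python
import numpy as np

def newton(w, jac, theta0, tol=1e-12, max_iter=50):
    """Newton's method, cf. Eq. (40): theta <- theta - A(theta)^{-1} w(theta)."""
    theta = np.asarray(theta0, dtype=float)
    for _ in range(max_iter):
        step = np.linalg.solve(jac(theta), w(theta))
        theta = theta - step
        if np.linalg.norm(step) < tol:
            break
    return theta

# Illustrative stand-in data with an affine-plus-quadratic w(theta):
D = np.array([[3.0, 0.5], [0.5, 2.0]])   # plays the role of the matrix D
M = np.array([[0.2, 0.0], [0.0, 0.1]])   # a single illustrative quadratic kernel
d = np.array([1.0, -0.5])                # plays the role of the deviation vector

w = lambda t: d + D @ t + 0.5 * (t @ M @ t) * np.ones(2)  # affine + quadratic
jac = lambda t: D + np.outer(np.ones(2), M @ t)           # exact Jacobian of w

theta = newton(w, jac, np.zeros(2))  # start from the zero vector
```

Because the quadratic terms are small relative to `D`, the iteration converges in a few steps; the first iterate already captures most of the correction, which is why Equation (53) uses $\theta^{(h,1)}$.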

It is noteworthy that the “vector of deviations” $d$ can be considered to be a (quasi-)random variable; it has been shown in [1] that the “vector of mean values of deviations” $\bar{d} \equiv [\bar{d}_{1},\ldots,\bar{d}_{TR}]^{\dagger}$ and the covariance matrix $D \equiv [D_{ij}]_{TR\times TR}$ of the vector of deviations $d$ have the following expressions (up to second-order sensitivities):

\[ \bar{d} = \frac{1}{2}\left[\operatorname{tr}\left\{Q_{1}(\alpha^{0})C_{\alpha\alpha}\right\},\ldots,\operatorname{tr}\left\{Q_{TR}(\alpha^{0})C_{\alpha\alpha}\right\}\right]^{\dagger}; \tag{54} \]

\[ D = C_{rr}^{e} + SC_{\alpha\alpha}S^{\dagger} - C_{r\alpha}^{e}S^{\dagger} - SC_{\alpha r}^{e}. \tag{55} \]

Comparing the expression of the first-iterate, cf., Equation (53) produced by the 2nd-BERRU-PMD methodology with the corresponding expression produced by the 2nd-BERRU-PMP methodology, cf. Equation (29), indicates that both expressions include the second-order sensitivities of the responses with respect to the parameters, but in distinct ways. Furthermore, the expression produced by the 2nd-BERRU-PMP methodology, cf. Equation (29), can include all of the response sensitivities and parameter correlations of order higher than two, if available. Such an inclusion of higher-order sensitivities and parameter correlations is not provided by the result, cf. Equation (53), obtained using the 2nd-BERRU-PMD methodology.

The expression provided in Equation (29) coincides with the expression provided in Equation (53) only if the following relationship holds:

\[ r^{bep} = r_{(1)}^{bed} \quad \text{iff:} \quad C_{rr}^{e}\left(C_{rr}^{e}+C_{rr}^{c}\right)^{-1}\left[E^{c}(r)-r^{e}\right] = \left(C_{rr}^{e} - C_{r\alpha}^{e}S^{\dagger}\right)\left[\theta^{(h,0)} - \left(D - V^{(h,0)}\right)^{-1}\left(u^{(h,0)} + D\theta^{(h,0)} - d\right)\right]. \tag{56} \]

The equality in Equation (56) can hold if the following conditions are simultaneously satisfied: $C_{r\alpha}^{e} = 0$, $\theta^{(h,0)} = 0$, $V^{(h,0)} = 0$, $u^{(h,0)} = 0$, $C_{rr}^{c} = SC_{\alpha\alpha}S^{\dagger}$, $d = E^{c}(r) - r^{e}$, in which case we also have the relation $D = C_{rr}^{e} + SC_{\alpha\alpha}S^{\dagger} = C_{rr}^{e} + C_{rr}^{c}$. These conditions are satisfied if all of the sensitivities higher than first order are ignored and if the experimental responses are uncorrelated with the model parameters. In this particular case, the results produced by the 2nd-BERRU-PMP and 2nd-BERRU-PMD methodologies also coincide with the results produced by the first-order 1st-BERRU-PM methodology.

4.2. Inter-Comparison of Expressions for the Best-Estimate Predicted Mean Values for Parameters

The expression obtained using the 2nd-BERRU-PMD methodology for the best-estimate parameters is as follows [1] :

\[ \alpha^{bed} = \alpha^{0} + \left(C_{\alpha r}^{e} - C_{\alpha\alpha}S^{\dagger}\right)\theta^{(h,n+1)}. \tag{57} \]

Using the first iterate $\theta^{(h,1)}$ in Equation (57) yields the following first-iterate expression for the best-estimate parameters:

\[ \alpha_{(1)}^{bed} = \alpha^{0} + \beta^{(0)}; \quad \beta^{(0)} \equiv \left(C_{\alpha r}^{e} - C_{\alpha\alpha}S^{\dagger}\right)\left[\theta^{(h,0)} - \left(D - V^{(h,0)}\right)^{-1}\left(u^{(h,0)} + D\theta^{(h,0)} - d\right)\right]. \tag{58} \]

Comparing the expression provided in Equation (58), namely the first-iterate produced by the 2nd-BERRU-PMD methodology, with the corresponding expression produced by the 2nd-BERRU-PMP methodology, cf. Equation (30), indicates that both expressions include the second-order sensitivities of the responses with respect to the parameters, but in distinct ways. Furthermore, the expression produced by the 2nd-BERRU-PMP methodology, cf. Equation (30), can include all of the response sensitivities and parameter correlations of order higher than two, if available. Such an inclusion of higher-order sensitivities and parameter correlations is not possible with the result obtained using the 2nd-BERRU-PMD methodology, cf. Equation (58).

The expression provided in Equation (30) coincides with the expression provided in Equation (58) only if the following relationship holds:

\[ \alpha^{bep} = \alpha_{(1)}^{bed} \quad \text{iff:} \quad -C_{\alpha r}^{c}\left(C_{rr}^{e}+C_{rr}^{c}\right)^{-1}\left[E^{c}(r)-r^{e}\right] = \left(C_{\alpha r}^{e} - C_{\alpha\alpha}S^{\dagger}\right)\left[\theta^{(h,0)} - \left(D - V^{(h,0)}\right)^{-1}\left(u^{(h,0)} + D\theta^{(h,0)} - d\right)\right]. \tag{59} \]

The equality in Equation (59) can hold if the following conditions are simultaneously satisfied: $C_{\alpha r}^{e} = 0$, $\theta^{(h,0)} = 0$, $V^{(h,0)} = 0$, $u^{(h,0)} = 0$, $C_{rr}^{c} = SC_{\alpha\alpha}S^{\dagger}$, $d = E^{c}(r) - r^{e}$, in which case we also have the relation $D = C_{rr}^{e} + SC_{\alpha\alpha}S^{\dagger} = C_{rr}^{e} + C_{rr}^{c}$. These conditions are satisfied if all of the sensitivities higher than first order are ignored. In this particular case, the results produced by the 2nd-BERRU-PMP and 2nd-BERRU-PMD methodologies also coincide with the results produced by the first-order 1st-BERRU-PM methodology.

4.3. Inter-Comparison of Expressions for the Best-Estimate Predicted Response Covariances

The expression obtained using the 2nd-BERRU-PMD methodology [1] for the “first-iterate best-estimate” covariance matrix $C_{rr(1)}^{bed}$ for the responses is as follows:

\[ C_{rr(1)}^{bed} = C_{rr}^{e} - C_{rd}^{(1)}Y_{r}^{\dagger} + X_{r}\theta^{(h,0)}(\theta^{(h,0)})^{\dagger}X_{r}^{\dagger} - X_{r}\theta^{(h,0)}\bar{d}^{\dagger}Y_{r}^{\dagger} + X_{r}\theta^{(h,0)}(u^{(h,0)})^{\dagger}Y_{r}^{\dagger} - Y_{r}(C_{rd}^{(1)})^{\dagger} - Y_{r}\bar{d}(\theta^{(h,0)})^{\dagger}X_{r}^{\dagger} + Y_{r}DY_{r}^{\dagger} - Y_{r}\bar{d}(u^{(h,0)})^{\dagger}Y_{r}^{\dagger} + Y_{r}u^{(h,0)}(\theta^{(h,0)})^{\dagger}X_{r}^{\dagger} - Y_{r}u^{(h,0)}\bar{d}^{\dagger}Y_{r}^{\dagger} + Y_{r}u^{(h,0)}(u^{(h,0)})^{\dagger}Y_{r}^{\dagger}, \tag{60} \]

where:

\[ C_{rd}^{(1)} = C_{rr}^{e} - C_{r\alpha}^{e}S^{\dagger}; \quad X_{r} \equiv C_{rd}^{(1)}\left[\left(D - V^{(h,0)}\right)^{-1}D - I\right]; \quad Y_{r} \equiv C_{rd}^{(1)}\left(D - V^{(h,0)}\right)^{-1}. \tag{61} \]

Comparing the 2nd-BERRU-PMD-expression provided in Equation (60) with the corresponding expression obtained using the 2nd-BERRU-PMP methodology, cf. Equation (31), indicates that both expressions include the second-order sensitivities of the responses with respect to the parameters, but in distinct ways. Furthermore, the expression produced by the 2nd-BERRU-PMP methodology, cf. Equation (31), can include all of the response sensitivities and parameter correlations of order higher than two, if available. Such an inclusion of higher-order sensitivities and parameter correlations is not possible with the result obtained using the 2nd-BERRU-PMD methodology, cf. Equation (60).

The expression provided in Equation (31) coincides with the expression provided in Equation (60) only if the following relationship holds:

\[ C_{rr}^{bep} = C_{rr(1)}^{bed} \quad \text{iff:} \quad -C_{rr}^{e}\left(C_{rr}^{e}+C_{rr}^{c}\right)^{-1}C_{rr}^{e} = -C_{rd}^{(1)}Y_{r}^{\dagger} + X_{r}\theta^{(h,0)}(\theta^{(h,0)})^{\dagger}X_{r}^{\dagger} - X_{r}\theta^{(h,0)}\bar{d}^{\dagger}Y_{r}^{\dagger} + X_{r}\theta^{(h,0)}(u^{(h,0)})^{\dagger}Y_{r}^{\dagger} - Y_{r}(C_{rd}^{(1)})^{\dagger} - Y_{r}\bar{d}(\theta^{(h,0)})^{\dagger}X_{r}^{\dagger} + Y_{r}DY_{r}^{\dagger} - Y_{r}\bar{d}(u^{(h,0)})^{\dagger}Y_{r}^{\dagger} + Y_{r}u^{(h,0)}(\theta^{(h,0)})^{\dagger}X_{r}^{\dagger} - Y_{r}u^{(h,0)}\bar{d}^{\dagger}Y_{r}^{\dagger} + Y_{r}u^{(h,0)}(u^{(h,0)})^{\dagger}Y_{r}^{\dagger}. \tag{62} \]

The equality shown in Equation (62) can hold only if all of the second-order sensitivities are neglected and if the experimental responses are uncorrelated with the model parameters, in which case the following simplifications occur: $X_{r} = 0$, $Y_{r} = C_{rd}^{(1)}D^{-1}$, and $C_{rd}^{(1)} = C_{rr}^{e}$, so that we also have the relation $D = C_{rr}^{e} + SC_{\alpha\alpha}S^{\dagger} = C_{rr}^{e} + C_{rr}^{c}$. In this particular case, the results produced by the 2nd-BERRU-PMP and 2nd-BERRU-PMD methodologies also coincide with the results produced by the first-order 1st-BERRU-PM methodology.

4.4. Inter-Comparison of Expressions for the Best-Estimate Predicted Parameter Covariances

The expression obtained using the 2nd-BERRU-PMD methodology [1] for the “first-iterate best-estimate” covariance matrix $C_{\alpha\alpha(1)}^{bed}$ for the model parameters is as follows:

\[ C_{\alpha\alpha(1)}^{bed} = C_{\alpha\alpha}^{e} - C_{\alpha d}^{(1)}Y_{\alpha}^{\dagger} + X_{\alpha}\theta^{(h,0)}(\theta^{(h,0)})^{\dagger}X_{\alpha}^{\dagger} - X_{\alpha}\theta^{(h,0)}\bar{d}^{\dagger}Y_{\alpha}^{\dagger} + X_{\alpha}\theta^{(h,0)}(u^{(h,0)})^{\dagger}Y_{\alpha}^{\dagger} - Y_{\alpha}(C_{\alpha d}^{(1)})^{\dagger} - Y_{\alpha}\bar{d}(\theta^{(h,0)})^{\dagger}X_{\alpha}^{\dagger} + Y_{\alpha}DY_{\alpha}^{\dagger} - Y_{\alpha}\bar{d}(u^{(h,0)})^{\dagger}Y_{\alpha}^{\dagger} + Y_{\alpha}u^{(h,0)}(\theta^{(h,0)})^{\dagger}X_{\alpha}^{\dagger} - Y_{\alpha}u^{(h,0)}\bar{d}^{\dagger}Y_{\alpha}^{\dagger} + Y_{\alpha}u^{(h,0)}(u^{(h,0)})^{\dagger}Y_{\alpha}^{\dagger}, \tag{63} \]

where:

\[ C_{\alpha d}^{(1)} = C_{\alpha r}^{e} - C_{\alpha\alpha}^{e}S^{\dagger}; \quad X_{\alpha} \equiv C_{\alpha d}^{(1)}\left[\left(D - V^{(h,0)}\right)^{-1}D - I\right]; \quad Y_{\alpha} \equiv C_{\alpha d}^{(1)}\left(D - V^{(h,0)}\right)^{-1}. \tag{64} \]

Comparing the 2nd-BERRU-PMD-expression provided in Equation (63) with the corresponding expression obtained using the 2nd-BERRU-PMP methodology, cf. Equation (33), indicates that both expressions include the second-order sensitivities of the responses with respect to the parameters, but in distinct ways. Furthermore, the expression produced by the 2nd-BERRU-PMP methodology, cf. Equation (33), can include all of the response sensitivities and parameter correlations of order higher than two, if available. Such an inclusion of higher-order sensitivities and parameter correlations is not possible with the result obtained using the 2nd-BERRU-PMD methodology, cf. Equation (63).

The expression provided in Equation (33) coincides with the expression provided in Equation (63) only if the following relationship holds:

\[ C_{\alpha\alpha}^{bep} = C_{\alpha\alpha(1)}^{bed} \quad \text{iff:} \quad -C_{\alpha r}^{c}\left(C_{rr}^{e}+C_{rr}^{c}\right)^{-1}\left(C_{\alpha r}^{c}\right)^{\dagger} = -C_{\alpha d}^{(1)}Y_{\alpha}^{\dagger} + X_{\alpha}\theta^{(h,0)}(\theta^{(h,0)})^{\dagger}X_{\alpha}^{\dagger} - X_{\alpha}\theta^{(h,0)}\bar{d}^{\dagger}Y_{\alpha}^{\dagger} + X_{\alpha}\theta^{(h,0)}(u^{(h,0)})^{\dagger}Y_{\alpha}^{\dagger} - Y_{\alpha}(C_{\alpha d}^{(1)})^{\dagger} - Y_{\alpha}\bar{d}(\theta^{(h,0)})^{\dagger}X_{\alpha}^{\dagger} + Y_{\alpha}DY_{\alpha}^{\dagger} - Y_{\alpha}\bar{d}(u^{(h,0)})^{\dagger}Y_{\alpha}^{\dagger} + Y_{\alpha}u^{(h,0)}(\theta^{(h,0)})^{\dagger}X_{\alpha}^{\dagger} - Y_{\alpha}u^{(h,0)}\bar{d}^{\dagger}Y_{\alpha}^{\dagger} + Y_{\alpha}u^{(h,0)}(u^{(h,0)})^{\dagger}Y_{\alpha}^{\dagger}. \tag{65} \]

The equality shown in Equation (65) can hold only if all of the second-order sensitivities are neglected and if the experimental responses are uncorrelated with the model parameters, in which case the following simplifications occur: $X_{\alpha} = 0$ and $Y_{\alpha} = C_{\alpha d}^{(1)}D^{-1}$, so that we also have the relation $D = C_{rr}^{e} + SC_{\alpha\alpha}S^{\dagger} = C_{rr}^{e} + C_{rr}^{c}$. In this particular case, the results produced by the 2nd-BERRU-PMP and 2nd-BERRU-PMD methodologies also coincide with the results produced by the first-order 1st-BERRU-PM methodology.

4.5. Inter-Comparison of Expressions for the Best-Estimate Predicted Correlations between Parameters and Responses

The expression obtained using the 2nd-BERRU-PMD methodology [1] for the “first-iterate best-estimate” correlation matrix $C_{\alpha r(1)}^{bed}$ for the parameters and responses is as follows:

\[ C_{\alpha r(1)}^{bed} = C_{\alpha r}^{e} - C_{\alpha d}^{(1)}Y_{r}^{\dagger} + X_{\alpha}\theta^{(h,0)}(\theta^{(h,0)})^{\dagger}X_{r}^{\dagger} - X_{\alpha}\theta^{(h,0)}\bar{d}^{\dagger}Y_{r}^{\dagger} + X_{\alpha}\theta^{(h,0)}(u^{(h,0)})^{\dagger}Y_{r}^{\dagger} - Y_{\alpha}(C_{rd}^{(1)})^{\dagger} - Y_{\alpha}\bar{d}(\theta^{(h,0)})^{\dagger}X_{r}^{\dagger} + Y_{\alpha}DY_{r}^{\dagger} - Y_{\alpha}\bar{d}(u^{(h,0)})^{\dagger}Y_{r}^{\dagger} + Y_{\alpha}u^{(h,0)}(\theta^{(h,0)})^{\dagger}X_{r}^{\dagger} - Y_{\alpha}u^{(h,0)}\bar{d}^{\dagger}Y_{r}^{\dagger} + Y_{\alpha}u^{(h,0)}(u^{(h,0)})^{\dagger}Y_{r}^{\dagger}. \tag{66} \]

Comparing the 2nd-BERRU-PMD expression provided in Equation (66) with the corresponding expression obtained using the 2nd-BERRU-PMP methodology, cf. Equation (34), indicates that both expressions include the second-order sensitivities of the responses with respect to the parameters, but in distinct ways. Furthermore, the expression produced by the 2nd-BERRU-PMP methodology, cf. Equation (34), can include all of the response sensitivities and parameter correlations of order higher than two, if available. Such an inclusion of higher-order sensitivities and parameter correlations is not possible with the result obtained using the 2nd-BERRU-PMD methodology, cf. Equation (66).

The expression provided in Equation (34) coincides with the expression provided in Equation (66) only if the following relationship holds:

\[ C_{\alpha r}^{bep} = C_{\alpha r(1)}^{bed} \quad \text{iff:} \quad C_{\alpha r}^{c}\left(C_{rr}^{e}+C_{rr}^{c}\right)^{-1}C_{rr}^{e} = -C_{\alpha d}^{(1)}Y_{r}^{\dagger} + X_{\alpha}\theta^{(h,0)}(\theta^{(h,0)})^{\dagger}X_{r}^{\dagger} - X_{\alpha}\theta^{(h,0)}\bar{d}^{\dagger}Y_{r}^{\dagger} + X_{\alpha}\theta^{(h,0)}(u^{(h,0)})^{\dagger}Y_{r}^{\dagger} - Y_{\alpha}(C_{rd}^{(1)})^{\dagger} - Y_{\alpha}\bar{d}(\theta^{(h,0)})^{\dagger}X_{r}^{\dagger} + Y_{\alpha}DY_{r}^{\dagger} - Y_{\alpha}\bar{d}(u^{(h,0)})^{\dagger}Y_{r}^{\dagger} + Y_{\alpha}u^{(h,0)}(\theta^{(h,0)})^{\dagger}X_{r}^{\dagger} - Y_{\alpha}u^{(h,0)}\bar{d}^{\dagger}Y_{r}^{\dagger} + Y_{\alpha}u^{(h,0)}(u^{(h,0)})^{\dagger}Y_{r}^{\dagger}. \tag{67} \]

The equality shown in Equation (67) can hold only if all of the second-order sensitivities are neglected, and if the experimental responses are uncorrelated to the model parameters. In this particular case, the results produced by the 2nd-BERRU-PMP and 2nd-BERRU-PMD methodologies also coincide with the results produced by the first-order 1st-BERRU-PM methodology.

5. Fundamental Advantages of 2nd-BERRU-PMP over Second-Order Data Assimilation

This section summarizes the decisive advantages of the 2nd-BERRU-PMP methodology over the 2nd-Order Data Assimilation methodology [3] [4] . The 2nd-Order Data Assimilation methodology [3] [4] relies on using a second-order procedure to minimize a user-defined functional meant to represent, in a chosen norm (usually the energy norm), the discrepancies between computed and experimental results (“responses”). Mathematically, the 2nd-Order Data Assimilation method involves the following steps [3] [4] :

1) Consider that the physical system is represented, as in [1] , by the operator equations $N^{m}[v(x),\alpha] = Q^{m}(x,\alpha)$. Consider that the vector of measured responses (“observations”), denoted as $z \equiv (z_{1},\ldots,z_{TR})^{\dagger}$, is a known function of the vector of state variables $v(x) \equiv [v_{1}(x),\ldots,v_{TD}(x)]^{\dagger}$ and of the vector of errors $w \equiv (w_{1},\ldots,w_{TR})^{\dagger}$, having the following expression: $z = h(v) + w$, where $h(v) \equiv [h_{1}(v),\ldots,h_{TR}(v)]^{\dagger}$ denotes a known vector-valued function of $v$. The error term $w$ is considered here to include “representative errors” stemming from sampling and grid interpolation; the mean value of $w$ corresponds to $r^{e} \equiv (r_{1}^{e},\ldots,r_{TR}^{e})^{\dagger}$ and the covariance matrix of $w$ corresponds to $C_{rr}^{e}$. As described in [3] [4] , $w$ is often considered to have the characteristics of “white noise,” in which case $z \sim N[h(v), C_{rr}^{e}]$ is a normal distribution with mean $h(v)$ and covariance $C_{rr}^{e}$. In addition, it is assumed that the prior “background” information is also known, being represented by a multivariate normal distribution with a known mean, denoted as $v^{b}$, and a known covariance matrix, denoted as $B$, i.e., $v \sim N[v^{b}, B]$. The posterior distribution, $p(v|z)$, is obtained by applying Bayes’ theorem to the above information, which yields the result $p(v|z) \sim \exp[-J(v)]$, where:

\[ J(v) \equiv \frac{1}{2}\left\{\left[z - h(v)\right]^{\dagger}\left(C_{rr}^{e}\right)^{-1}\left[z - h(v)\right] + \left(v - v^{b}\right)^{\dagger}B^{-1}\left(v - v^{b}\right)\right\}. \tag{68} \]

The maximum posterior estimate is obtained by determining the minimum of the functional $J(v)$, which occurs at the root(s) of the following equation:

\[ 0 = \nabla J(v) = B^{-1}\left(v - v^{b}\right) - \left[Dh(v)\right]^{\dagger}\left(C_{rr}^{e}\right)^{-1}\left[z - h(v)\right], \tag{69} \]

where $Dh(v)$ denotes the Jacobian matrix of $h(v)$ with respect to the components of $v$.
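For the linear-observation case $h(v) = Hv$, the stationarity condition (69) can be checked numerically against the standard closed-form analysis of linear data assimilation. All arrays below are randomly generated illustrative stand-ins:

```python
import numpy as np

rng = np.random.default_rng(3)
nv, nz = 6, 3  # state and observation dimensions (illustrative)

def spd(n):
    """Random symmetric positive-definite matrix (stand-in covariance)."""
    A = rng.standard_normal((n, n))
    return A @ A.T + n * np.eye(n)

B, R = spd(nv), spd(nz)            # background covariance B and obs covariance C_rr_e
H = rng.standard_normal((nz, nv))  # linear observation operator h(v) = H v (assumed)
v_b = rng.standard_normal(nv)      # background state
z = rng.standard_normal(nz)        # observations

def grad_J(v):
    # Eq. (69) for linear h: grad J = B^{-1}(v - v_b) - H^T R^{-1}(z - H v)
    return np.linalg.solve(B, v - v_b) - H.T @ np.linalg.solve(R, z - H @ v)

# Standard closed-form minimizer of J for linear h (3D-Var analysis):
v_da = v_b + B @ H.T @ np.linalg.solve(H @ B @ H.T + R, z - H @ v_b)
```

Note that this minimizer calibrates only the state $v$; there is no mechanism in Equation (69) for directly calibrating the model parameters $\alpha$, which is the contrast drawn in point 2) below.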

The “first-order data assimilation” procedure solves Equation (69) by using a “partial quadratic approximation” of $J(v)$, while the “second-order data assimilation” procedure solves Equation (69) by using a “full quadratic approximation” of $J(v)$, as detailed in [3] [4] , to obtain the “optimal data assimilation solution,” denoted here as $v^{DA}$, as the solution of Equation (69). The following fundamental differences become apparent by comparing the “Data Assimilation” result represented by Equation (69) with the 2nd-BERRU-PMP results.

2) Data assimilation (DA) is formulated conceptually [3] [4] either just in the phase-space of measured responses (“observation-space formulation”) or just in the phase-space of the model’s dependent variables (“state-space formulation”). Hence, DA can calibrate initial conditions as “direct results” but cannot directly calibrate any other model parameters. In contradistinction, the 2nd-BERRU-PMP methodology is formulated conceptually in the most inclusive “joint-phase-space of parameters, computed and measured responses.” Consequently, the 2nd-BERRU-PMP methodology simultaneously calibrates responses and parameters, thus simultaneously providing results for forward and inverse problems.

3) If the experiments are perfectly well known, i.e., if $C_{rr}^{e} = 0$, Equation (69) indicates that the DA methodology fails fundamentally. In contradistinction, Equations (33)-(35) indicate that the 2nd-BERRU-PMP methodology does not fail when $C_{rr}^{e} = 0$ because, in any situation, $C_{rr}^{c} \neq 0$.

4) The DA methodology also fails fundamentally when the response measurements happen to coincide with the computed values of the responses, i.e., when $z = h(v)$ at some point in the state-space. In such a case, DA’s Equation (69) yields the trivial result $v^{DA} = v^{b}$. In contradistinction, the 2nd-BERRU-PMP methodology does not yield such a trivial result when the response measurements happen to coincide with the computed values of the responses, i.e., when $r_{k}^{e} = r_{k}(\alpha^{0})$, because the difference $E^{c}(r) - r^{e}$, which appears on the right sides of Equations (29) and (30), remains non-zero due to the contributions of the second- and higher-order sensitivities of the responses with respect to the model parameters, as shown in Equation (4), i.e.:

\[ \left\{E^{c}(r_{k}) - r_{k}^{e}\right\}_{r_{k}^{e} = r_{k}(\alpha^{0})} = \frac{1}{2}\sum_{i=1}^{TP}\sum_{j=1}^{TP}\left\{\partial^{2}r_{k}(\alpha)/\partial\alpha_{i}\partial\alpha_{j}\right\}_{\alpha^{0}}\operatorname{cov}(\alpha_{i},\alpha_{j}) + \cdots \neq 0, \]

for $k = 1,\ldots,TR$. This situation clearly underscores the need for computing and retaining (at least) the second-order sensitivities of the responses to the model parameters. Although a situation in which $r_{k}^{e} = r_{k}(\alpha^{0})$ is not expected to occur frequently in practice, there are no negative consequences (should such a situation occur) if the 2nd-BERRU-PMP methodology is used, in contradistinction to using the DA methodology.
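The surviving second-order term can be checked by Monte Carlo sampling for a simple quadratic response (an illustrative assumption, not a model from the paper); for such a response the identity is exact:

```python
import numpy as np

rng = np.random.default_rng(4)

def r(a0, a1):
    """Illustrative quadratic response of two parameters."""
    return a0**2 + 3.0 * a0 * a1

alpha0 = np.array([1.0, 2.0])
C_aa = np.array([[0.04, 0.01], [0.01, 0.09]])  # parameter covariance (assumed)
hess = np.array([[2.0, 3.0], [3.0, 0.0]])      # exact Hessian of r at alpha0

# Second-order term: sum_ij {d^2 r / da_i da_j}_alpha0 cov(a_i, a_j) / 2
shift = 0.5 * np.sum(hess * C_aa)

# Monte Carlo check: for a quadratic response, E[r(alpha)] - r(alpha0) equals
# this shift exactly, so E^c(r) - r^e stays nonzero even when r^e = r(alpha0).
a = rng.multivariate_normal(alpha0, C_aa, size=200_000)
mc_shift = r(a[:, 0], a[:, 1]).mean() - r(*alpha0)
```

For these assumed values the shift equals $0.5\,(2 \times 0.04 + 2 \times 3 \times 0.01) = 0.07$, a nonzero bias that a first-order (DA-type) treatment would miss entirely.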

5) The 2nd-BERRU-PMP methodology only requires the inversion of the matrix $(C_{rr}^{e}+C_{rr}^{c})$, of size $TR \times TR$. In contradistinction, the solution of the “1st-order DA” requires the inversion of the Jacobian $Dh(v)$ of $h(v)$, while the solution of the “2nd-order DA” also requires the inversion of a matrix-vector product involving the Hessian matrix of $h(v)$; these matrices are significantly larger [3] [4] than the matrix $(C_{rr}^{e}+C_{rr}^{c})$. Hence, the 2nd-BERRU-PMP methodology is significantly more efficient computationally than DA.

6) The DA methodology is practically non-extendable beyond “second-order.” A “3rd-order DA” would be computationally impractical because of the massive sizes of the matrices that would need to be inverted. In contradistinction, the 2nd-BERRU-PMP methodology presented herein already comprises fourth-order sensitivities of responses to parameters and can be readily extended/generalized to include even higher-order sensitivities and parameter correlations.

All of the above advantages of the 2nd-BERRU-PMP methodology over the DA methodology stem from the fact that the 2nd-BERRU-PMP methodology is fundamentally anchored in physics-based principles (thermodynamics and information theory) formulated in the most inclusive possible phase-space (namely, the combined phase-space of computed and measured parameters and responses), whereas the DA methodology is fundamentally based on the minimization of a subjective, user-chosen functional.

6. Discussion and Conclusions

This work has presented the “second order predictive modeling methodology with probabilistically included computed responses” (2nd-BERRU-PMP) methodology, as underscored by the letter “P” in the acronym (which indicates the “probabilistic” inclusion of the computed model responses). This methodology is a companion to the alternative 2nd-BERRU-PMD methodology, in which the computational model is included deterministically, as underscored by the letter “D” in this acronym. Both methodologies are fundamentally based on the MaxEnt principle. Although both the 2nd-BERRU-PMP and the 2nd-BERRU-PMD methodologies yield expressions that include second (and higher) order sensitivities of responses to model parameters, it is shown in this work that the respective expressions for the predicted responses, calibrated predicted parameters and their predicted uncertainties (covariances) are not identical to each other. Nevertheless, these second-order methodologies encompass, as particular cases, the results produced by the extant data assimilation [3] [4] , data adjustment procedures [5] [6] [7] [8] [9] , and the first-order BERRU-PM methodology [10] [11] [12] .

Notably, the 2nd-BERRU-PMP methodology enables the use of sensitivities and parameter correlations beyond second order, if available; this opportunity is not provided by the 2nd-BERRU-PMD methodology. If it is imperative to combine experimental information for both the responses and the parameters with the computational information produced by the model, then the complete Gaussian expression of the 2nd-BERRU-PMP methodology should be used, provided that the massive computational requirements underlying the inversion of the matrices arising within this methodology can be accommodated. On the other hand, the simplified 2nd-BERRU-PMP methodology is the easiest to implement computationally; it should be used when experimental information about the model parameters, beyond the information already used in the computational model, is not important. The 2nd-BERRU-PMD methodology offers an intermediate alternative, being easier to implement computationally than the complete version of the 2nd-BERRU-PMP methodology but involving more computations than the simplified 2nd-BERRU-PMP methodology.

Illustrative applications of the 2nd-BERRU-PMP and the 2nd-BERRU-PMD methodologies to various paradigm forward and inverse problems are currently in progress.

Appendix: Construction of the Second-Order Maximum Entropy Distribution for Responses and Parameters

When an unknown distribution $p(\alpha,r)$, defined on a domain $D \equiv D_{\alpha} \times D_{r}$ (where $D_{\alpha}$ denotes the domain of definition of the parameters and $D_{r}$ denotes the domain of definition of the responses), needs to be reconstructed from a finite number of its known moments, the principle of maximum entropy (MaxEnt), originally formulated by Jaynes [2] , provides optimal compatibility with the available information while simultaneously ensuring minimal spurious information content. In particular, when only the first-order and second-order moments of the joint distribution $p(\alpha,r)$ of model parameters and responses are known, the corresponding second-order MaxEnt distribution is constructed by following the procedure outlined below.

1) Known means and covariances for responses:

\[ r_{i}^{0} \equiv \int_{D} r_{i}\,p(\alpha,r)\,dr\,d\alpha, \quad i = 1,\ldots,TR; \tag{70} \]

\[ \operatorname{cov}(r_{i},r_{j}) \equiv \int_{D}\left(r_{i}-r_{i}^{0}\right)\left(r_{j}-r_{j}^{0}\right)p(\alpha,r)\,dr\,d\alpha; \quad i,j = 1,\ldots,TR. \tag{71} \]

2) Known means and covariances for parameters:

\[ \alpha_{j}^{0} \equiv \int_{D}\alpha_{j}\,p(\alpha,r)\,dr\,d\alpha, \quad j = 1,\ldots,TP; \tag{72} \]

\[ \operatorname{cov}(\alpha_{i},\alpha_{j}) \equiv \int_{D}\left(\alpha_{i}-\alpha_{i}^{0}\right)\left(\alpha_{j}-\alpha_{j}^{0}\right)p(\alpha,r)\,dr\,d\alpha; \quad i,j = 1,\ldots,TP; \tag{73} \]

\[ \operatorname{cov}(\alpha_{i},r_{j}) \equiv \int_{D}\left(\alpha_{i}-\alpha_{i}^{0}\right)\left(r_{j}-r_{j}^{0}\right)p(\alpha,r)\,dr\,d\alpha; \quad i = 1,\ldots,TP;\ j = 1,\ldots,TR. \tag{74} \]

According to the MaxEnt principle, the probability density p ( α , r ) would satisfy the “available information” provided in Equations (70)-(74), without implying any spurious information or hidden assumptions, if:

1) p ( α , r ) maximizes the Shannon [19] information entropy, S, as defined below:

\[ S = -\int_{D} p(\alpha,r)\ln\left[p(\alpha,r)\right]d\alpha\,dr, \tag{75} \]

2) p ( α , r ) satisfies the “moments constraints” defined by Equations (70)-(74);

3) p ( α , r ) satisfies the normalization condition:

\[ \int_{D} p(\alpha,r)\,d\alpha\,dr = 1. \tag{76} \]

The MaxEnt distribution $p(\alpha,r)$ is obtained as the solution of the constrained variational problem $\delta H(p)/\delta p = 0$, where the entropy (Lagrangian functional) $H(p)$ is defined as follows:

\[ H(p) = -\int_{D} p(\alpha,r)\ln\left[p(\alpha,r)\right]d\alpha\,dr - \lambda_{0}\left[\int_{D} p(\alpha,r)\,d\alpha\,dr - 1\right] - \sum_{k=1}^{TR}\lambda_{k}^{(1)}\left[\int_{D} r_{k}\,p(\alpha,r)\,d\alpha\,dr - r_{k}^{0}\right] - \sum_{i=1}^{TP}\lambda_{i}^{(2)}\left[\int_{D}\alpha_{i}\,p(\alpha,r)\,d\alpha\,dr - \alpha_{i}^{0}\right] \]

\[ -\frac{1}{2}\sum_{k=1}^{TR}\sum_{l=1}^{TR}\lambda_{kl}^{(11)}\left[\int_{D} r_{k}r_{l}\,p(\alpha,r)\,d\alpha\,dr - \operatorname{cov}(r_{k},r_{l}) - r_{k}^{0}r_{l}^{0}\right] - \sum_{k=1}^{TR}\sum_{i=1}^{TP}\lambda_{ki}^{(12)}\left[\int_{D} r_{k}\alpha_{i}\,p(\alpha,r)\,d\alpha\,dr - \operatorname{cov}(r_{k},\alpha_{i}) - r_{k}^{0}\alpha_{i}^{0}\right] - \frac{1}{2}\sum_{i=1}^{TP}\sum_{j=1}^{TP}\lambda_{ij}^{(22)}\left[\int_{D}\alpha_{i}\alpha_{j}\,p(\alpha,r)\,d\alpha\,dr - \operatorname{cov}(\alpha_{i},\alpha_{j}) - \alpha_{i}^{0}\alpha_{j}^{0}\right]. \tag{77} \]

In Equation (77), the quantities $\lambda_{k}^{(1)}$, $\lambda_{i}^{(2)}$, $\lambda_{kl}^{(11)}$, $\lambda_{ki}^{(12)}$, and $\lambda_{ij}^{(22)}$ denote the respective Lagrange multipliers, and the factors $1/2$ have been introduced for subsequent computational convenience.

Solving the equation $\delta H(p)/\delta p = 0$ yields the following expression for the resulting MaxEnt distribution $p(z)$:

\[ p(z) = \frac{1}{Z(b,\Lambda)}\exp\left(-b^{\dagger}z - \frac{1}{2}z^{\dagger}\Lambda z\right), \tag{78} \]

where the various vectors and matrices are defined as follows:

\[ z \equiv \begin{pmatrix} r \\ \alpha \end{pmatrix}; \quad b \equiv \begin{pmatrix} \lambda^{(1)} \\ \lambda^{(2)} \end{pmatrix}; \quad \lambda^{(1)} \equiv \begin{pmatrix} \lambda_{1}^{(1)} \\ \vdots \\ \lambda_{TR}^{(1)} \end{pmatrix}; \quad \lambda^{(2)} \equiv \begin{pmatrix} \lambda_{1}^{(2)} \\ \vdots \\ \lambda_{TP}^{(2)} \end{pmatrix}; \tag{79} \]

\[ \Lambda \equiv \begin{pmatrix} \Lambda^{(11)} & \Lambda^{(12)} \\ \left[\Lambda^{(12)}\right]^{\dagger} & \Lambda^{(22)} \end{pmatrix}; \quad \Lambda^{(11)} \equiv \left[\lambda_{kl}^{(11)}\right]_{TR\times TR}; \quad \Lambda^{(12)} \equiv \left[\lambda_{kl}^{(12)}\right]_{TR\times TP}; \quad \Lambda^{(22)} \equiv \left[\lambda_{kl}^{(22)}\right]_{TP\times TP}. \tag{80} \]

The normalization constant $Z(b,\Lambda)$ in Equation (78) is defined as follows:

\[ Z(b,\Lambda) = \int_{D}\exp\left(-b^{\dagger}z - \frac{1}{2}z^{\dagger}\Lambda z\right)dz; \quad dz \equiv d\alpha\,dr. \tag{81} \]

In statistical mechanics, the normalization constant $Z$ is called the partition function (or sum over states) and carries all of the information available about the possible states of the system, while the MaxEnt distribution $p(z)$ is called the canonical Boltzmann-Gibbs distribution. The integral in Equation (81) can be evaluated in closed form by conservatively extending the computational domain $D$ to the entire multidimensional real vector space $\mathbb{R}^{N}$, where $N \equiv TR + TP$, to obtain the following expression:

\[ Z(b,\Lambda) = \int_{\mathbb{R}^{N}}\exp\left(-b^{\dagger}z - \frac{1}{2}z^{\dagger}\Lambda z\right)dz = \frac{(2\pi)^{N/2}}{\sqrt{\mathrm{Det}(\Lambda)}}\,\exp\left(\frac{1}{2}b^{\dagger}\Lambda^{-1}b\right). \tag{82} \]

The Lagrange multipliers are determined in terms of the known information (means and covariances of the parameters and responses) by differentiating the “free energy” $F(b,\Lambda) \equiv -\ln Z(b,\Lambda)$ with respect to the components of the vector $b \equiv (\lambda^{(1)},\lambda^{(2)})^{\dagger}$, to obtain the following expressions:

\[ \frac{\partial F(b,\Lambda)}{\partial\lambda_{k}^{(1)}} = \frac{1}{Z}\int_{D} r_{k}\,e^{\left(-b^{\dagger}z - \frac{1}{2}z^{\dagger}\Lambda z\right)}dz = \int_{D} r_{k}\,p(\alpha,r)\,dz = r_{k}^{0}; \quad k = 1,\ldots,TR; \tag{83} \]

\[ \frac{\partial F(b,\Lambda)}{\partial\lambda_{i}^{(2)}} = \frac{1}{Z}\int_{D}\alpha_{i}\,e^{\left(-b^{\dagger}z - \frac{1}{2}z^{\dagger}\Lambda z\right)}dz = \int_{D}\alpha_{i}\,p(\alpha,r)\,d\alpha\,dr = \alpha_{i}^{0}; \quad i = 1,\ldots,TP. \tag{84} \]

The results obtained in Equations (83) and (84) can be collectively written in vector-matrix form as follows:

$$\frac{\partial F(\mathbf{b},\mathbf{C})}{\partial \mathbf{b}} = \mathbf{z}^0; \quad \mathbf{z}^0 \equiv \left(\mathbf{r}^0, \boldsymbol{\alpha}^0\right)^\top; \quad \mathbf{r}^0 \equiv \left(r_1^0,\ldots,r_{TR}^0\right)^\top. \quad (85)$$

On the other hand, it follows from Equation (82) that:

$$\frac{\partial F(\mathbf{b},\mathbf{C})}{\partial \mathbf{b}} = \frac{\partial\left[\ln Z(\mathbf{b},\mathbf{C})\right]}{\partial \mathbf{b}} = \boldsymbol{\Lambda}^{-1}\mathbf{b}. \quad (86)$$

The relations obtained in Equations (85) and (86) imply the following relation:

$$\mathbf{b} = \boldsymbol{\Lambda}\mathbf{z}^0. \quad (87)$$

Differentiating the relations provided in Equations (83) and (84) a second time yields the following relations:

$$\frac{\partial^2 F(\mathbf{b},\mathbf{C})}{\partial \lambda_j^{(1)}\partial \lambda_k^{(1)}} = -\frac{1}{Z}\frac{\partial Z}{\partial \lambda_j^{(1)}}\, r_k^0 + \frac{1}{Z}\int_{D} r_j r_k \exp\left(\mathbf{b}^\top\mathbf{z} - \frac{1}{2}\mathbf{z}^\top\boldsymbol{\Lambda}\mathbf{z}\right)\mathrm{d}\mathbf{z} = \mathrm{cov}\left(r_j, r_k\right); \quad j,k = 1,\ldots,TR; \quad (88)$$

$$\frac{\partial^2 F(\mathbf{b},\mathbf{C})}{\partial \lambda_i^{(2)}\partial \lambda_k^{(1)}} = \mathrm{cov}\left(\alpha_i, r_k\right); \quad i = 1,\ldots,TP; \quad k = 1,\ldots,TR; \quad (89)$$

$$\frac{\partial^2 F(\mathbf{b},\mathbf{C})}{\partial \lambda_i^{(2)}\partial \lambda_j^{(2)}} = \mathrm{cov}\left(\alpha_i, \alpha_j\right); \quad i,j = 1,\ldots,TP. \quad (90)$$

The results obtained in Equations (88)-(90) can be collectively written in vector-matrix form as follows:

$$\frac{\partial^2 F(\mathbf{b},\mathbf{C})}{\partial \mathbf{b}\,\partial \mathbf{b}} = \mathbf{C}; \quad \mathbf{C} \equiv \begin{pmatrix} \mathbf{C}_{rr} & \mathbf{C}_{r\alpha} \\ \mathbf{C}_{\alpha r} & \mathbf{C}_{\alpha\alpha} \end{pmatrix}; \quad \mathbf{C}_{rr} \equiv \left[\mathrm{cov}\left(r_j, r_k\right)\right]_{TR\times TR}; \quad \mathbf{C}_{r\alpha} \equiv \left[\mathrm{cov}\left(r_k, \alpha_i\right)\right]_{TR\times TP} = \left(\mathbf{C}_{\alpha r}\right)^\top; \quad \mathbf{C}_{\alpha\alpha} \equiv \left[\mathrm{cov}\left(\alpha_i, \alpha_j\right)\right]_{TP\times TP}. \quad (91)$$

On the other hand, it follows from Equation (86) that:

$$\frac{\partial^2 F(\mathbf{b},\mathbf{C})}{\partial \mathbf{b}\,\partial \mathbf{b}} = \boldsymbol{\Lambda}^{-1}. \quad (92)$$

The relations obtained in Equations (91) and (92) imply the following relation:

$$\boldsymbol{\Lambda}^{-1} = \mathbf{C}. \quad (93)$$
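Since $F(\mathbf{b},\mathbf{C}) = \ln Z$ is an explicit quadratic function of $\mathbf{b}$ by Equation (82), the identities in Equations (86) and (92) can be verified by finite differences. The sketch below, with assumed illustrative values for $\boldsymbol{\Lambda}$ and $\mathbf{b}$, compares numerical first and second derivatives of $\ln Z$ with $\boldsymbol{\Lambda}^{-1}\mathbf{b}$ and $\boldsymbol{\Lambda}^{-1}$:

```python
import numpy as np

# Assumed illustrative values (any symmetric positive-definite Lambda works)
Lam = np.array([[2.0, 0.5],
                [0.5, 1.5]])
b = np.array([0.7, -0.3])
N = len(b)

def F(bv):
    # Free energy F = ln Z, with ln Z taken from Equation (82)
    return (N/2)*np.log(2*np.pi) - 0.5*np.log(np.linalg.det(Lam)) \
           + 0.5 * bv @ np.linalg.solve(Lam, bv)

h = 1e-5
# Central-difference gradient: should equal Lambda^{-1} b (Equation (86))
grad = np.array([(F(b + h*e) - F(b - h*e)) / (2*h) for e in np.eye(N)])
# Central-difference Hessian: should equal Lambda^{-1} (Equation (92))
hess = np.array([[(F(b + h*(ei + ej)) - F(b + h*(ei - ej))
                   - F(b - h*(ei - ej)) + F(b - h*(ei + ej))) / (4*h*h)
                  for ej in np.eye(N)] for ei in np.eye(N)])

assert np.allclose(grad, np.linalg.solve(Lam, b), atol=1e-6)
assert np.allclose(hess, np.linalg.inv(Lam), atol=1e-4)
```

Because $F$ is exactly quadratic in $\mathbf{b}$, the central differences agree with the analytical derivatives up to floating-point rounding.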

Introducing the results obtained in Equations (87) and (93) into Equations (78) and (82) yields the following expression for the MaxEnt distribution $p\left(\mathbf{z}\,|\,\mathbf{z}^0,\mathbf{C}\right)$:

$$p\left(\mathbf{z}\,|\,\mathbf{z}^0,\mathbf{C}\right) = \frac{1}{\sqrt{(2\pi)^{N}\,\mathrm{Det}(\mathbf{C})}}\exp\left[-\frac{1}{2}\left(\mathbf{z}-\mathbf{z}^0\right)^\top\mathbf{C}^{-1}\left(\mathbf{z}-\mathbf{z}^0\right)\right]. \quad (94)$$
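As a consistency check, substituting $\mathbf{b} = \boldsymbol{\Lambda}\mathbf{z}^0$ and $\boldsymbol{\Lambda}^{-1} = \mathbf{C}$ into the exponential form of Equation (78) should reproduce the Gaussian density of Equation (94) pointwise. The following sketch, with an assumed illustrative $3 \times 3$ covariance $\mathbf{C}$ and mean $\mathbf{z}^0$, evaluates both forms at a random point:

```python
import numpy as np

rng = np.random.default_rng(0)
C = np.array([[1.5, 0.4, 0.1],
              [0.4, 1.0, 0.2],
              [0.1, 0.2, 0.8]])    # assumed positive-definite covariance C
z0 = np.array([1.0, -0.5, 0.25])   # assumed mean vector z^0
N = len(z0)

Lam = np.linalg.inv(C)             # Equation (93): Lambda^{-1} = C
b = Lam @ z0                       # Equation (87): b = Lambda z^0
Z = (2*np.pi)**(N/2) / np.sqrt(np.linalg.det(Lam)) \
    * np.exp(0.5 * b @ C @ b)      # Equation (82), using Lambda^{-1} = C

def p_exp_family(z):
    # Exponential form of Equation (78)
    return np.exp(b @ z - 0.5 * z @ Lam @ z) / Z

def p_gaussian(z):
    # Gaussian form of Equation (94)
    d = z - z0
    return np.exp(-0.5 * d @ Lam @ d) / np.sqrt((2*np.pi)**N * np.linalg.det(C))

z = rng.normal(size=N)
assert np.isclose(p_exp_family(z), p_gaussian(z))
```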

Conflicts of Interest

The author declares no conflicts of interest regarding the publication of this paper.


Copyright © 2023 by authors and Scientific Research Publishing Inc.


This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.