Low Default Portfolios—A Proposed Rule to Identify Differences between Imprudence, Conservatism, and Exaggeration

Abstract

Internal models may be used by banks to calculate their regulatory capital for credit risk. There are a variety of methodologies for estimating default probabilities, which leads to major differences in credit provisions and capital requirements. Using either a classical or a Bayesian technique, the computation of default probabilities can be ensured. Reduced form models are a choice. These models, however, might not be used to quantify economic capital because they assume independence among default events. Banks are compelled to employ structural models since defaults in the real world of banking are not solely due to exogenous causes. Because of the diversification effects between credit losses for one obligor and credit losses for other obligors in each bank’s portfolio, total unexpected losses do not equal the sum of individual unexpected losses. Those two types of models—reduced form and structural—are provided in either a theoretical or a numerical format. This paper covers both the classical and Bayesian techniques, with the latter employing a broader set of prior functions that offer considerably different probabilities. Distinguishing between imprudence, conservatism, and exaggeration might be difficult in the context of low default portfolios with scarce data. A realistic rule is proposed for finding the minimum and maximum bounds and therefore assessing the required conservatism margin by comparing classical and Bayesian probabilities.

Dinis, D. (2022) Low Default Portfolios—A Proposed Rule to Identify Differences between Imprudence, Conservatism, and Exaggeration. Journal of Financial Risk Management, 11, 1-40. doi: 10.4236/jfrm.2022.111001.

1. Introduction

Banks may use internal models to assess their credit risk exposure within an internal ratings-based framework under the Basel Accords issued by the Basel Committee on Banking Supervision, according to capital requirements rules adopted by banking institutions and transposed into the legal system across the European Union¹. The default probability is the same for different internal rating systems, whether foundation internal ratings-based or advanced internal ratings-based (the nomenclature used in those accords).

Loans with no defaults over a lengthy period of time are a utopian concept. Indeed, during an economic downturn, it is projected that unemployment will grow and, as a result, higher default rates will emerge, projections that are heightened by the contagion risk. It will be impossible to generate accurate and realistic estimations of default probabilities if there is no correlation among individual exposures within portfolios.

Therefore, the zero-default assumption for all risk classes will not be presented here. In any case, if the observed number of defaults is nil, the formulas in this paper can be immediately adapted to the zero-default assumption.

A utopia would likewise imply independence of default events foreseen in reduced form models. This independence will be studied in order to compare the results to those obtained from structural models (which account for the existence of a dependence structure among borrowers in the same risk class) and to assess the significance of the asset correlation component.

The methodology used in this document is explained in Chapter 2, which is divided into four sections. Section 2.1 addresses general considerations (specifically, asset correlation) that are necessary for each axis of credit risk models applied to low default portfolios—reduced form models and structural models—and for each axis of statistical approaches used in credit risk assessment—classical and Bayesian approaches. The classical technique is covered in Section 2.2, which has two subsections: one for reduced form models and the other for structural models. The Bayesian technique is covered in Section 2.3, which is broken into two subsections: prior distributions and posterior distributions, the latter of which encompasses both reduced form models and structural models. In Section 2.4, a reasonable criterion is used to connect the classical and Bayesian approaches, allowing one to distinguish between imprudence, conservatism, and exaggeration in terms of default probability.

The outputs of such models and approaches are provided in Chapter 3. In the last section of this chapter, the main conclusions drawn from the comparison of those models and approaches are presented.

Finally, some closing remarks are made. They address subjects like uncertainty, mix of prior functions, and other open issues that need to be investigated further.

2. Formulae and Other Technical Issues

2.1. General Considerations

Reduced form models are based on the default independence hypothesis, which indicates that asset correlation, “ρ”, is assumed to be zero. The use of these models to assess banking regulatory capital for credit risk should not be acceptable because the reality follows a completely different pattern, defined by interdependence among default occurrences.

Because defaults are not independent, reduced form models must be replaced with structural models. In the latter, default events are assumed to be positively correlated. The stronger the link to the systematic risk, the larger “ρ”. When ρ > 0, the default probabilities are computed using an integration algorithm, which requires a stochastic treatment or a simulation procedure for the classical (or frequentist) approach. The algorithm takes into account all possible values “y” of a standard normally distributed random variable “Y” that represents the range of realizations of the systematic risk. Under the Bayesian (or subjective) approach, the stochastic treatment or the simulation procedure must be doubled: one for “y” and the other for “λ”, the default probability.

The frequentist approach and the subjective approach are used to discuss both reduced form models and structural models. The trapezoidal rule for numerical integration approximation is used to obtain outputs when an analytical solution is not attainable.

For each risk class, the binomial and Poisson distributions are used to simulate the default probability random variable. Nonetheless, unlike the binomial distribution, using the Poisson distribution to represent the default probability is not entirely accurate because the size of the risk class, “n”, is not fixed in this type of distribution.

The posterior distribution, according to Bayesian inference, corresponds to the conditional distribution of the default probability random variable, “Λ”, given a fixed number of borrowers, “n”, a fixed number of defaults, “k”, and the prior distribution of the default probability. The posterior density of the default probability is obtained by combining the likelihood function with the prior function.

The Bayesian approach provides another conceptual distinction. In the classical approach, there are frequentist confidence intervals, while in the Bayesian approach, there are posterior credible intervals. Although the latter are commonly thought of as a Bayesian variant of the confidence intervals used in classical probability, they have different meanings. For a unimodal posterior density², the highest density interval is the shortest possible credible interval and is determined by numerical calculation. Because “n”, “k”, and “ρ” are regarded as constants, there is a $(100 - \delta)\%$ probability that the true value of the unidimensional parameter, “λ”, falls inside the credible interval, where “δ” is the risk level (equivalent to the significance level of classical probabilities)³.
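As a minimal sketch of how a highest density interval can be obtained numerically, the Python snippet below evaluates a density on a uniform grid and keeps the highest-density grid points until the required probability mass is reached. The example density (the shape $\lambda^{2}(1-\lambda)^{348}$, which a flat prior with n = 350 and k = 2 would give under independence), the grid size, and the function name `highest_density_interval` are illustrative assumptions, not the procedure used in the paper.

```python
def highest_density_interval(grid, density, mass):
    """Shortest interval holding `mass` probability for a unimodal density
    tabulated on a uniform grid (the density need not be normalized)."""
    total = sum(density)
    # Rank grid points from highest to lowest density.
    order = sorted(range(len(grid)), key=lambda j: density[j], reverse=True)
    picked, covered = [], 0.0
    for j in order:
        picked.append(j)
        covered += density[j] / total
        if covered >= mass:
            break
    # For a unimodal density the picked points form one contiguous interval.
    return grid[min(picked)], grid[max(picked)], covered

# Assumed example: posterior shape for n = 350 obligors and k = 2 defaults.
n, k = 350, 2
steps = 20000
grid = [j / steps for j in range(steps + 1)]
density = [lam**k * (1 - lam)**(n - k) for lam in grid]

delta = 0.05  # risk level, so the interval holds 95% of the mass
low, high, covered = highest_density_interval(grid, density, 1 - delta)
print(f"HDI: [{low:.4%}, {high:.4%}] covering {covered:.2%}")
```

Because the posterior is right-skewed, the interval found this way sits closer to zero than an equal-tailed interval with the same coverage.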

2.2. Classical Approach

2.2.1. Classical Approach with Reduced Form Models

Let “K” be the random variable that represents the number of defaults, “k” the observed number of defaults, “n” the number of obligors, “λ” the default probability, and “ω” the confidence level. Assuming that “K” follows a standard binomial distribution, the probability of having no more than “k” defaults inside a risk class with “n” obligors is given by:

$$P(K \le k) = \sum_{i=0}^{k} \binom{n}{i} \lambda^{i} (1-\lambda)^{n-i} \ge 1-\omega \qquad (1)$$

If a standard Poisson distribution is used to represent “K”, the probability is:

$$P(K \le k) = \sum_{i=0}^{k} e^{-n\lambda} \frac{(n\lambda)^{i}}{i!} \ge 1-\omega \qquad (2)$$

When default events are completely independent (ρ = 0%), “upper confidence bounds”⁴ for both binomial and Poisson distributions can be computed analytically through beta and gamma approximations or numerically. They provide the same outputs because the binomial distribution is proportional to the beta distribution and the Poisson distribution is proportional to the gamma distribution:

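$$\text{Binomial}(n, k) \propto \text{Beta}(\alpha, \beta), \quad \alpha = k + 1 \ \text{and} \ \beta = n - k: \qquad \binom{n}{k} \lambda^{k} (1-\lambda)^{n-k} \propto \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\,\Gamma(\beta)} \lambda^{\alpha-1} (1-\lambda)^{\beta-1} \qquad (3)$$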

and

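$$\text{Poisson}(n\lambda) \propto \text{Gamma}(\alpha, \beta), \quad \alpha = k + 1 \ \text{and} \ \beta = 1/n: \qquad e^{-n\lambda} \frac{(n\lambda)^{k}}{k!} \propto \frac{\beta^{\alpha}}{\Gamma(\alpha)} e^{-\beta\lambda} \lambda^{\alpha-1} \qquad (4)$$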
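The numerical route for ρ = 0% can be sketched in a few lines of Python: the upper confidence bound is taken as the largest “λ” for which the probability of observing no more than “k” defaults is still at least 1 − ω, located by bisection. The function names, the bisection tolerance, and the inputs n = 350, k = 2, and ω = 95% are illustrative assumptions; only the standard library is used.

```python
import math

def binomial_cdf(k, n, lam):
    """P(K <= k) for the binomial model of Equation (1)."""
    return sum(math.comb(n, i) * lam**i * (1 - lam)**(n - i) for i in range(k + 1))

def poisson_cdf(k, n, lam):
    """P(K <= k) for the Poisson model of Equation (2)."""
    mu = n * lam
    return sum(math.exp(-mu) * mu**i / math.factorial(i) for i in range(k + 1))

def upper_confidence_bound(cdf, k, n, omega, tol=1e-12):
    """Largest lambda with cdf(k, n, lambda) >= 1 - omega, found by bisection.

    Relies on P(K <= k) being a decreasing function of lambda.
    """
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cdf(k, n, mid) >= 1 - omega:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

n, k, omega = 350, 2, 0.95  # illustrative inputs
bound_binomial = upper_confidence_bound(binomial_cdf, k, n, omega)
bound_poisson = upper_confidence_bound(poisson_cdf, k, n, omega)
print(f"binomial: {bound_binomial:.4%}  Poisson: {bound_poisson:.4%}")
```

The two bounds differ only marginally, with the Poisson bound slightly above the binomial one.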

2.2.2. Classical Approach with Structural Models

Positive values of the asset correlation, “ρ”, drastically change the theoretical setting. Instead of basic and unrealistic reduced form models that rely on default independence, more complex but adequate structural models should be used.

$J(\lambda, y, \rho)$ represents the probability function of the sample data resulting from the binomial function of “λ” or the Poisson function of “λ”, $\phi(y)$ represents the standard normal probability density function of “Y”, $\Phi(\cdot)$ represents the standard normal cumulative distribution function, and $\Phi^{-1}(\lambda)$ represents the inverse standard normal cumulative distribution function for “λ”. The meanings of “K”, “k”, “n”, “λ”, and “ω” are the same as above.

Therefore, the probability of having no more than “k” defaults inside a risk class with “n” obligors is provided by:

$$P(K \le k) = \int_{-\infty}^{+\infty} \left[ \sum_{i=0}^{k} J(\lambda, y, \rho)_{i} \right] \phi(y)\, dy \ge 1-\omega, \qquad (5)$$

with

$$J(\lambda, y, \rho)_{i} = \binom{n}{i} \left[ G(\lambda, y, \rho) \right]^{i} \left[ 1 - G(\lambda, y, \rho) \right]^{n-i} \qquad (6)$$

or

$$J(\lambda, y, \rho)_{i} = e^{-n G(\lambda, y, \rho)} \frac{\left[ n G(\lambda, y, \rho) \right]^{i}}{i!}, \qquad (7)$$

if the probability function of the sample data follows a binomial distribution or a Poisson distribution, respectively, and

$$G(\lambda, y, \rho) = \Phi\left\{ \frac{\Phi^{-1}(\lambda) - y\sqrt{\rho}}{\sqrt{1-\rho}} \right\} \qquad (8)$$

The function $G(\lambda, y, \rho)$ distinguishes the formulae of structural models from those of reduced form models. A brief explanation of that crucial function and its origin can be found in Appendix A.
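The structural computation can be sketched as follows: the conditional default probability of Equation (8) is plugged into the binomial sum, the result is integrated over “y” with the trapezoidal rule, and the upper confidence bound is located by bisection. This is a hedged illustration using only the Python standard library. The function names, the integration range of ±8, the number of nodes, and the inputs n = 350, k = 2, ω = 95% are assumptions, and the sign convention inside `conditional_pd` follows the usual one-factor form with a minus sign.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()

def conditional_pd(lam, y, rho):
    """Equation (8): default probability conditional on the systematic factor y."""
    z = (STD_NORMAL.inv_cdf(lam) - y * math.sqrt(rho)) / math.sqrt(1 - rho)
    return STD_NORMAL.cdf(z)

def correlated_binomial_cdf(k, n, lam, rho, y_max=8.0, steps=400):
    """Equations (5)-(6): P(K <= k), integrated over y by the trapezoidal rule."""
    h = 2 * y_max / steps
    total = 0.0
    for j in range(steps + 1):
        y = -y_max + j * h
        g = conditional_pd(lam, y, rho)
        inner = sum(math.comb(n, i) * g**i * (1 - g)**(n - i) for i in range(k + 1))
        weight = 0.5 if j in (0, steps) else 1.0  # trapezoidal end-point weights
        total += weight * inner * STD_NORMAL.pdf(y)
    return total * h

def upper_confidence_bound(k, n, omega, rho, tol=1e-8):
    """Largest lambda with P(K <= k) >= 1 - omega, found by bisection."""
    lo, hi = 1e-10, 1 - 1e-10  # inv_cdf is undefined at exactly 0 and 1
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if correlated_binomial_cdf(k, n, mid, rho) >= 1 - omega:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

n, k, omega = 350, 2, 0.95  # illustrative inputs
bound_independent = upper_confidence_bound(k, n, omega, rho=0.0)
bound_correlated = upper_confidence_bound(k, n, omega, rho=0.12)
print(f"rho = 0%: {bound_independent:.4%}   rho = 12%: {bound_correlated:.4%}")
```

With ρ = 0 the function $G$ collapses to “λ” and the routine reproduces the reduced form bound, which gives a simple consistency check.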

2.3. Bayesian Approach

2.3.1. Prior Distributions

The default probability of the posterior distribution is derived from the likelihood and prior densities matching, as previously indicated. Multiple prior functions, ranging from less prudent to over conservative, are identified in this subsection. They reflect various elements or beliefs about the effective default probability.

When there is no understanding of the behavior of posterior default probability, it is common to use a non-informative prior. The most non-informative prior is a flat prior, concretely a uniform distribution between 0 and 1. However, prior functions are useful in most circumstances since they represent the default risk profile, in which case distributions need to be parametrized.

Because of its versatility in expressing the uncertainty of the default probability, the beta distribution is a widely used parametrized prior distribution. Let “Λ” be the random variable of the default probability “λ”. Assuming “Λ” follows a beta distribution, $\Lambda \sim \text{Beta}(\alpha, \beta)$, with $\alpha, \beta > 0$ and $\lambda \in\ ]0, 1[$, it is simple to adapt this distribution to subjective information about the mean (or average) and variance of default probability using the hyperparameters “α” and “β”. The mean and variance of a beta distribution are respectively:

$$E(\Lambda) = \frac{\alpha}{\alpha+\beta} \qquad (11)$$

$$V(\Lambda) = \frac{\alpha\beta}{(\alpha+\beta)^{2}(\alpha+\beta+1)} \qquad (12)$$

When no (objective or subjective) information about the default probability is available, a beta distribution with α = β = 1 can be taken because it represents a uniform distribution between 0 and 1, making it a non-informative prior.

The set of prior functions, $f(\lambda)$, addressed in this document is listed below.

1) Uniform distribution

$$f(\lambda) = 1/u, \quad \lambda \in\ ]0, u], \qquad (13)$$

where “u” is the upper limit of “λ”. Four values of “u” are tested: 1, 0.25, 0.1, and 0.01, the same values evaluated by Dirk Tasche in his work⁵. These values are also used in two other types of priors: linear growth and linear decrease, as shown below. Strictly speaking, when u = 1 the condition $\lambda < u$ must be met, not $\lambda \le u$. This note also applies to the prior functions with linear growth and linear decrease.

2) Linear growth

$$f(\lambda) = \lambda, \quad \lambda \in\ ]0, u] \qquad (14)$$

3) Linear decrease

$$f(\lambda) = 1 - \lambda/u, \quad \lambda \in\ ]0, u] \qquad (15)$$

4) Conservative⁶

$$f(\lambda) = \frac{1}{1-\lambda}, \quad \lambda \in\ ]0, 1[ \qquad (16)$$

5) Immoderate

$$f(\lambda) = \frac{1}{\lambda}, \quad \lambda \in\ ]0, 1[ \qquad (17)$$

6) Expert judgement

6.1) Base scenario (linear growth and linear decrease)

$$f(\lambda) = \begin{cases} \dfrac{2(\lambda - m)}{(M - m)(Mo - m)}, & \lambda \in\ ]m, Mo] \\ \dfrac{2(\lambda - M)}{(M - m)(Mo - M)}, & \lambda \in\ ]Mo, M] \\ 0, & \text{other } \lambda \end{cases} \qquad (18)$$

where “m” stands for the minimum of “λ”, “Mo” stands for the mode of “λ”, and “M” stands for the maximum of “λ”, as assumed by the expert who defines the prior distribution. The values for “m”, “Mo”, and “M” were initially set to be conservative: m = 0.01, Mo = 0.025, and M = 0.075⁷.

6.2) Beta distribution as a proxy

For the beta distribution, α = 6.67 and β = 175.31 were used. These two parameters were set so that the beta distribution’s mean and variance match those of the base scenario defined by expert opinion (prior 6.1), namely μ = 0.03667 and σ² = 0.00019, respectively.

6.3) Normal distribution as a proxy

To ensure that the normal distribution’s mean is equal to the mean of prior 6.1 and that the variance is equal to 0.03667/1.645, μ = 0.03667 and σ² = 0.02229 were used for the normal distribution⁸.

7) Beta distribution based on empirical default rate

The prior’s mean is assumed to be the observed default rate for each combination of “n” and “k”, and the prior’s variance is assumed to be equal to the ratio between that rate (and thus the mean) and the number 1.645. This mean and variance, together with Equation (11) and Equation (12), are used to compute the beta distribution’s parameters⁹. This prior will only be used in Section 3.4 to compare the classical and Bayesian approaches.
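The link between prior 6.1 and prior 6.2 can be checked with a short moment-matching sketch: the mean and variance of the triangular base scenario are computed from the standard closed forms for a triangular density, and Equation (11) and Equation (12) are then inverted for “α” and “β”. The function names are illustrative assumptions; the same inversion would apply to prior 7 once its mean and variance are fixed.

```python
def triangular_moments(m, mo, M):
    """Mean and variance of a triangular density with minimum m, mode mo, maximum M."""
    mean = (m + mo + M) / 3
    var = (m * m + mo * mo + M * M - m * mo - m * M - mo * M) / 18
    return mean, var

def beta_from_moments(mean, var):
    """Invert Equations (11) and (12): beta parameters with the given mean and variance."""
    s = mean * (1 - mean) / var - 1  # s equals alpha + beta
    return mean * s, (1 - mean) * s

# Expert-judgement base scenario of prior 6.1.
mean, var = triangular_moments(0.01, 0.025, 0.075)
alpha, beta = beta_from_moments(mean, var)
print(f"mean = {mean:.5f}, variance = {var:.5f}")
print(f"alpha = {alpha:.2f}, beta = {beta:.2f}")
```

The resulting parameters agree with the values quoted for prior 6.2 up to rounding.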

2.3.2. Posterior Distributions

1) Bayesian Approach with Reduced Form Models

Let “K” again represent the random variable of the number of defaults. For any potential default probability, “λ”, each value “y” of a standard normally distributed random variable, “Y”, and a specified asset correlation, “ρ”, the posterior probability, based on Bayes’ theorem¹⁰, of observing exactly “k” defaults (rather than no more than “k” defaults, as in Equation (1) and Equation (2)) is as follows:

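$$P(K = k) = H(\lambda, y) = \frac{f(\lambda) \int_{-\infty}^{+\infty} \binom{n}{k} \lambda^{k} (1-\lambda)^{n-k} \phi(y)\, dy}{\int_{0}^{u} f(\lambda) \int_{-\infty}^{+\infty} \binom{n}{k} \lambda^{k} (1-\lambda)^{n-k} \phi(y)\, dy\, d\lambda}, \qquad (19)$$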

if the likelihood function follows a binomial distribution, or

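$$P(K = k) = H(\lambda, y) = \frac{f(\lambda) \int_{-\infty}^{+\infty} e^{-n\lambda} \frac{(n\lambda)^{k}}{k!} \phi(y)\, dy}{\int_{0}^{u} f(\lambda) \int_{-\infty}^{+\infty} e^{-n\lambda} \frac{(n\lambda)^{k}}{k!} \phi(y)\, dy\, d\lambda}, \qquad (20)$$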

if the likelihood function follows a Poisson distribution.

It is worth noting that “u” is the upper limit of “λ”¹¹, $f(\lambda)$ is the prior probability density function of “λ” as described in 2.3.1, and $\phi(y)$ is the standard normal probability density function of “Y”.

The denominator of $H(\lambda, y)$, also known as the prior predictive distribution, normalizes the posterior distribution function.

Analytical outputs can be generated if no correlation of defaults among borrowers is assumed (reduced form models), as mentioned in Subsection 2.2.1. Only a few circumstances in Bayesian estimation have an explicit analytical solution: when there is independence among borrowers and, simultaneously, when combining the prior function with the likelihood function yields a standard distribution.

Concretely, analytical forms occur in two special cases: when the likelihood is a binomial distribution and the prior is a beta distribution, and when the likelihood is a Poisson distribution and the prior is a gamma distribution. The beta-binomial distribution and the gamma-Poisson distribution arise from these combinations.
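The beta-binomial case can be illustrated with a short check. With a Beta(α, β) prior and a binomial likelihood under independence, the posterior is Beta(α + k, β + n − k), so its mean is (α + k)/(α + β + n). The sketch below compares that closed form with a direct numerical evaluation on a grid; the function name, the grid size, and the choice of the prior 6.2 parameters with n = 350 and k = 2 are illustrative assumptions.

```python
def posterior_mean_numeric(alpha, beta, n, k, steps=100000):
    """Posterior mean of lambda from the unnormalized product prior x likelihood,
    summed over interior grid points of ]0, 1[ (the integrand vanishes at both ends)."""
    h = 1.0 / steps
    numerator = denominator = 0.0
    for j in range(1, steps):
        lam = j * h
        weight = lam**(alpha - 1 + k) * (1 - lam)**(beta - 1 + n - k)
        numerator += lam * weight
        denominator += weight
    return numerator / denominator

alpha, beta = 6.67, 175.31   # beta prior used as a proxy for expert judgement
n, k = 350, 2                # illustrative risk class

analytic_mean = (alpha + k) / (alpha + beta + n)
numeric_mean = posterior_mean_numeric(alpha, beta, n, k)
print(f"analytic: {analytic_mean:.6f}  numeric: {numeric_mean:.6f}")
```

The common grid step and the binomial coefficient cancel between numerator and denominator, which is why neither appears in the function.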

2) Bayesian Approach with Structural Models

It should be noted that the existence of correlation requires numerical outputs from stochastic treatments or simulation procedures. With ρ > 0, the formulae stated in the preceding subsection must be changed:

$$P(K = k) = H(\lambda, y, \rho) = \frac{f(\lambda) \int_{-\infty}^{+\infty} L(\lambda, y, \rho)\, \phi(y)\, dy}{\int_{0}^{u} f(\lambda) \int_{-\infty}^{+\infty} L(\lambda, y, \rho)\, \phi(y)\, dy\, d\lambda}, \qquad (21)$$

with

$$L(\lambda, y, \rho) = \binom{n}{k} \left[ G(\lambda, y, \rho) \right]^{k} \left[ 1 - G(\lambda, y, \rho) \right]^{n-k} \qquad (22)$$

or

$$L(\lambda, y, \rho) = e^{-n G(\lambda, y, \rho)} \frac{\left[ n G(\lambda, y, \rho) \right]^{k}}{k!}, \qquad (23)$$

if the likelihood function is represented by a binomial or a Poisson distribution, respectively, and $G(\lambda, y, \rho)$ has the meaning expressed in Equation (8).

$L(\lambda, y, \rho)$ denotes the probability of the sample data generated from binomial or Poisson likelihood functions of “λ”, for a risk class with “n” counterparties and “k” defaults. $\Phi(\cdot)$ stands for the standard normal cumulative distribution function, and $\Phi^{-1}(\lambda)$ is the inverse standard normal cumulative distribution function for “λ” (see 2.2.2).

Once again, the denominator of $H(\lambda, y, \rho)$ ensures that the posterior distribution function is normalized.
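The double numerical treatment (one integration over “y”, one over “λ”) can be sketched as below for the binomial likelihood. The inner integral uses the trapezoidal rule; the outer one is a plain sum over interior grid points of ]0, u[, which is adequate here because the integrand is negligible at both ends. The function names, grid sizes, the uniform prior with u = 0.25, and the inputs n = 350 and k = 2 are illustrative assumptions rather than the paper's actual settings.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()

def integrated_likelihood(lam, n, k, rho, y_nodes):
    """Inner integral of Equation (21) with the binomial form of Equation (22)."""
    q = STD_NORMAL.inv_cdf(lam)
    h = y_nodes[1] - y_nodes[0]
    last = len(y_nodes) - 1
    total = 0.0
    for j, y in enumerate(y_nodes):
        g = STD_NORMAL.cdf((q - y * math.sqrt(rho)) / math.sqrt(1 - rho))
        weight = 0.5 if j in (0, last) else 1.0  # trapezoidal end-point weights
        total += weight * math.comb(n, k) * g**k * (1 - g)**(n - k) * STD_NORMAL.pdf(y)
    return total * h

def posterior_on_grid(n, k, rho, prior, u, lam_steps=500, y_steps=160, y_max=8.0):
    """Normalized posterior density on interior points of ]0, u[ and its mean."""
    y_nodes = [-y_max + 2 * y_max * j / y_steps for j in range(y_steps + 1)]
    h = u / lam_steps
    grid = [h * j for j in range(1, lam_steps)]  # interior points only
    unnormalized = [prior(lam) * integrated_likelihood(lam, n, k, rho, y_nodes)
                    for lam in grid]
    norm = h * sum(unnormalized)                 # denominator of Equation (21)
    density = [v / norm for v in unnormalized]
    mean = h * sum(lam * d for lam, d in zip(grid, density))
    return grid, density, mean

u = 0.25

def uniform_prior(lam):
    """Flat prior on ]0, u], as in Equation (13)."""
    return 1.0 / u

n, k = 350, 2  # illustrative risk class
grid, density_ind, mean_independent = posterior_on_grid(n, k, 0.0, uniform_prior, u)
_, density_cor, mean_correlated = posterior_on_grid(n, k, 0.12, uniform_prior, u)
print(f"posterior mean, rho = 0%: {mean_independent:.4%}")
print(f"posterior mean, rho = 12%: {mean_correlated:.4%}")
```

For ρ = 0 and a flat prior the posterior mean should be close to (k + 1)/(n + 2), which offers a quick sanity check on the grid settings.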

2.4. Conservative Zone

The posterior default probabilities differ significantly depending on the priors used. Some of those probabilities are unwise, while others are overblown. The key goal is to establish limits that will allow for the identification of a conservative zone.

Drift and volatility should be included in every estimate of a random variable to explain the uncertainty. The historical default data for the drift and the standard deviation of the likelihood distribution for the volatility are adopted to ensure an acceptable margin of conservatism, avoiding unwanted levels of imprudence and exaggeration. The coefficient of skewness is also included in the rule because default distributions are heavily skewed when asset correlations are positive.

On the one hand, if at least one default occurrence is recorded, the following rule is applied to create imprudent, conservative, and exaggerated zones:

$\lambda < \lambda_{97.5\%} \Rightarrow$ Imprudent zone,

$\lambda \in [\lambda_{97.5\%}, \lambda_{99.9\%}] \Rightarrow$ Conservative zone,

$\lambda > \lambda_{99.9\%} \Rightarrow$ Exaggerated zone.

The default probability $\lambda_{c}$ is determined as follows for a given confidence level “c”:

$$\lambda_{c} = \left[ \frac{k}{n} + \frac{\sigma_{L}}{\sqrt{n}}\, \Phi^{-1}(c) \right] \gamma_{L} \qquad (24)$$

The ratio k/n represents the empirical default experience, $\sigma_{L}$ represents the standard deviation of the correlated likelihood function, $\gamma_{L}$ represents the coefficient of skewness of this function, and $\Phi^{-1}(c)$ represents the inverse standard normal cumulative distribution function for “c”. The likelihood function is used to indicate the level of conservatism since that function depicts the group’s intrinsic risk and so eliminates the need for any prior risk data. Furthermore, to improve the required conservatism, volatility is computed using the standard deviation of the likelihood function at 97.5% and 99.9% confidence levels.

The expression inside square brackets in Equation (24) results from an adaptation of the binomial test of significance¹². It should be noted that the conventional binomial test assumes mutual independence of events, which is an incorrect assumption in default probability models. Aside from the standard deviation of the correlated binomial distribution (i.e., the correlated likelihood function), the conservative margin should also account for the asymmetry of the same distribution.

On the other hand, when no default event is identified, $\lambda = \mu_{L}$ is assumed instead of the conservative zone $\lambda \in [\lambda_{97.5\%}, \lambda_{99.9\%}]$, where $\mu_{L}$ is the mean of the correlated likelihood distribution. Keep in mind that when ρ > 0 the posterior distribution’s mean is much greater than when independence of default events is assumed. As a result, when no past defaults have been recorded, the mean will be an overly cautious estimator of default probability.

It is important to note that the suggested practical rule should not be used to calculate default probabilities. Its advantage is that it provides for a more impartial comparison between an expected default probability and a conservative threshold based on the likelihood distribution.
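Once the two thresholds are available, the zoning rule of this section reduces to a simple comparison, sketched below in Python. The function name `classify_zone` and the threshold values in the usage lines are hypothetical placeholders; in practice the thresholds would come from Equation (24) evaluated at the 97.5% and 99.9% confidence levels.

```python
def classify_zone(lam, lam_975, lam_999):
    """Assign an estimated default probability to a zone, given the 97.5% and
    99.9% thresholds (both supplied by the caller)."""
    if lam_975 > lam_999:
        raise ValueError("the 97.5% threshold cannot exceed the 99.9% threshold")
    if lam < lam_975:
        return "imprudent"
    if lam <= lam_999:
        return "conservative"
    return "exaggerated"

# Hypothetical thresholds, for illustration only.
lam_975, lam_999 = 0.02, 0.05
for estimate in (0.01, 0.03, 0.08):
    print(f"{estimate:.2%} -> {classify_zone(estimate, lam_975, lam_999)}")
```

Both thresholds belong to the conservative zone, matching the closed interval used in the rule.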

3. The Application of the Methodology

3.1. Risk Classes and Asset Correlation

The figures in the tables of this paper were produced for three different risk categories described in Table 1¹³.

The results are provided from two perspectives: an individual approach by credit risk class, in which each class is seen as a separate group; and an integrated approach combining two or more classes¹⁴, in which the upper confidence bound concept¹⁵ is used assuming that the classes combined have the same rating category. When a unique rating grade is assigned to homogeneous classes, it is presumed that all counterparties within the integrated group are exposed to the same default probability (and the same asset correlation).

As a result of regulatory regimes, banks are frequently subjected to a constant “ρ”. The Basel Committee on Banking Supervision recognizes that major corporations are more dependent on systematic risk than small firms and retail counterparties because they are more exposed to overall economic conditions. Small firms and retail counterparties are less affected by economic cycles, therefore their defaults are more idiosyncratic rather than systematic.

In this paper, the asset correlation is always set to ρ = 12%, as this is one of the standards available to banking regulators.

Four risk criteria are used to calculate the classical confidence intervals and the Bayesian credible intervals: 50%, 25%, 10%, and 5%.

3.2. Classical Approach

The binomial and Poisson distributions describe the counting of defaults. Table 2 and Table 3 show that those distributions produce effectively identical outcomes. In fact, it is expected that the means of both distributions are identical, $n\lambda = k$, and the variances are likewise quite similar¹⁶. In the tables, the terms “basic binomial” and “basic Poisson” are used to describe the corresponding distributions where default occurrences are assumed to be completely independent. It is also worth noting that the default probability increases as the confidence level rises. The observed default rates are presented in the tables so that the results from those distributions can be quickly compared to these rates.

Assuming ρ = 0%, upper confidence bound computations can be done analytically (through beta and gamma approximations) or numerically, as described in Subsection 2.2.1. For the five scenarios assumed, the largest discrepancy between the numerical simulation method and the analytical alternative approximation is 0.0003%.

Table 1. Number of obligors (n) and defaults (k) for each risk class.

Table 2. Classical default probability for basic binomial (asset correlation = 0%).

Table 3. Classical default probability for basic Poisson (asset correlation = 0%).

When the independence assumption is replaced with the correlation assumption, and standard distributions are thus turned into correlated distributions, the greater the confidence level, the greater the difference between the results of reduced form and structural models. Table 4 and Table 5 show that at a 50% confidence level, from ρ = 0% to ρ = 12%, default probabilities for the binomial distribution move from 0.34% - 0.76% to 0.53% - 1.11% (depending on “n” and “k”). At a 95% confidence level, they rise from 0.77% - 1.98% to 2.89% - 5.16%. Therefore, at a 50% confidence level, default probabilities increase by 46% - 59%¹⁷, and at a 95% confidence level, they increase by 138% - 274%¹⁸. At 75% and 90% confidence levels, the growth ranges from 81% to 130%¹⁹ and from 115% to 214%²⁰, respectively.

Table 4. Classical default probability for correlated binomial (asset correlation = 12%).

Table 5. Classical default probability for correlated Poisson (asset correlation = 12%).

There are significant capital savings when an integrated approach is used rather than an individual approach²¹. The greater the confidence level, the greater the savings: from 27%²² at a 50% confidence level to 45% at a 95% confidence level when three risk classes are aggregated as one homogeneous group, and from 16% at a 50% confidence level to 30% at a 95% confidence level when risk classes B and C are aggregated as one homogeneous group. These savings were obtained with ρ = 0%.

When reduced form models are replaced with structural models and asset correlation is assumed to be uniform, with ρ = 12%, the savings are lower: from 23% at a 50% confidence level to 28% at a 95% confidence level, and from 13% at a 50% confidence level to 16% at a 95% confidence level, respectively for three and two aggregated classes.

3.3. Bayesian Approach

3.3.1. Likelihood Functions

Table 6 and Table 7, on the one hand, and Table 8 and Table 9, on the other hand, show that there are no significant differences between the binomial distribution and the Poisson distribution used as the likelihood function of the Bayesian default probability, with the outcome being roughly the same for both the basic and correlated techniques (similar to the classical approach). Nonetheless, the Poisson distribution’s means are marginally higher than the binomial distribution’s because the Poisson distribution is slightly more skewed than the binomial distribution²³. Hence, the percentiles associated with the mean in the binomial distribution are immaterially higher than the equivalent percentiles in the Poisson distribution.

Table 6. Bayesian default probability—Likelihood: basic binomial (asset correlation = 0%).

Table 7. Bayesian default probability—Likelihood: basic Poisson (asset correlation = 0%).

Table 8. Bayesian default probability—Likelihood: correlated binomial (asset correlation = 12%).

Table 9. Bayesian default probability—Likelihood: correlated Poisson (asset correlation = 12%).

When comparing Table 6 and Table 8 (or Table 7 and Table 9), it is clear that (regardless of “n” and “k”) the larger “ρ”, the greater the likelihood function’s mean and standard deviation. When the asset correlation is introduced, the likelihood function becomes considerably more skewed than when no correlation is employed. Furthermore, the results confirm the rule that the larger the risk group, the higher the coefficient of skewness and the smaller the mean and the standard deviation²⁴.

The lowering effect on the standard deviation when “n” grows is explained by the fact that the percentage increase in “n” is smaller than the modulus of the percentage decrease in “λ”.

There are capital savings with the integration of risk categories, as seen in the classical approach. In the Bayesian context, assuming ρ = 12%, savings amount to 9% or 18%, depending on whether two risk classes (B + C) or three risk classes (A + B + C) are aggregated. For ρ = 0%, the corresponding savings increase to 20% or 33% (34% when the likelihood function follows a Poisson distribution) when two or three risk classes are aggregated, respectively²⁵.

3.3.2. Posterior Distributions

A binomial distribution is better suited than a Poisson distribution for depicting the number of defaults within a risk class containing a fixed number of obligors, as stated in Section 2.1. As a result, and for the sake of simplicity, only outcomes based on the binomial distribution are reported from here on.

The prior functions discussed in 2.3.1 are used to find a group of statistical values for the posterior distributions: mean, median, mode, standard deviation, and coefficient of skewness, as well as four highest density intervals. Table 10 displays the outcomes for a 12% asset correlation, while Table 11 in Appendix B shows the results for no correlation²⁶. These tables indicate that different prior functions and asset correlations have a big impact on posterior probabilities²⁷.

Table 10. Default probability distribution—Posterior using the correlated binomial (asset correlation = 12%) as likelihood, for several priors.

Highest density intervals expressed in percentage.

It is worth noting that, according to the concept of expected value, the mean of the posterior probability of “λ”, $\mu_{\lambda}$, is computed by:

$$\mu_{\lambda} = \frac{\int_{0}^{u} \lambda\, f(\lambda) \int_{-\infty}^{+\infty} L(\lambda, y, \rho)\, \phi(y)\, dy\, d\lambda}{\int_{0}^{u} f(\lambda) \int_{-\infty}^{+\infty} L(\lambda, y, \rho)\, \phi(y)\, dy\, d\lambda}, \qquad (25)$$

where $f(\lambda)$, $L(\lambda, y, \rho)$, and $\phi(y)$ have the same meaning as before.

The effect described in the last phrase of the penultimate paragraph of 3.3.1 about the growth of “n” is validated for all priors used: the larger the risk group, the smaller the mean and the standard deviation, and the greater the coefficient of skewness.

When the prior function is a uniform distribution spanning from 0 to 1 (i.e., the same range of values that the default probability can assume), the values for posterior probabilities are obviously the same as those provided by the likelihood function. This can be seen by comparing Table 8 and Table 10, as well as Table 6 and Table 11. By definition, default probabilities are tiny; empirical rates for the five scenarios range from 0% to 0.57%. As a result, a uniform distribution with values between 0 and 0.25 is expected to yield identical results (for both the likelihood function and the posterior with a uniform distribution as a prior). Even if the upper limit of the uniform distribution is set to 0.1, there are no discernible differences.

Table 10 shows that the posterior with an immoderate prior (the polar opposite of the conservative prior) produces the lowest default probability, with a mean 4.3 to 11.9 times lower than that of the posterior resulting from linear growth as a prior (with “u” between 0 and 1) and 2.4 to 5 times lower than that of the posterior resulting from the uniform distribution as a prior (also with “u” between 0 and 1). Therefore, the most cautious or conservative prior is the linear growth function (the theoretical polar opposite of the linear decrease prior), not the conservative prior itself.

The uniform distribution, linear decrease (both with “u” between 0 and 1), and conservative priors generate comparable posterior default probabilities for all the pairs of “n” and “k” studied. Those three types of functions appear to be the most appropriate and beneficial priors, in contrast to the immoderate and linear growth priors. Comments on expert knowledge priors (the base scenario as well as the beta and normal distributions as proxies) will be addressed later.

Returning to capital savings through risk group integration: using the linear growth function with “u” between 0 and 1 as a prior and ρ = 12%, savings are substantially equivalent to savings at a 50% confidence level under the classical approach: 13% for n = 850 and 23% for n = 1000. When ρ = 0%, the relevant comparison is instead at a 95% confidence level, because savings obtained using the linear growth function as a prior are similar to those obtained using the classical approach at that level: 28% for n = 850 and 44% for n = 1000, which are close to the 30% and 45% mentioned in 3.2, respectively.

Because the uniform distribution is a non-informative prior, posterior distribution outputs match the likelihood function outputs, as mentioned above; the corresponding capital savings have already been presented (in the last paragraph of 3.3.1). Because the means of posterior probabilities are identical for the uniform distribution, linear decrease, and conservative priors (ranging from 0 to 1), the savings are also equivalent. The immoderate prior yields the smallest savings.

Table 10 and Table 11 show a simple rule: the stronger the asset correlation, the wider the tail of the right side of the distribution (or the higher the coefficient of skewness)28. However, this rule does not hold in two instances: when u = 0.01 (and in some cases, when u = 0.1 for the linear growth prior), and when expert information is the basis of the prior function. The first exception is self-evident, as “λ” has a smaller upper limit, 0.01 (or 0.1 in some cases), than the broad range of values that default probabilities can take (based on the binomial distribution).

The second exception is explained by the nature of the prior subjected to expert judgment. The previous point concerning the 0.01 threshold also applies to priors based on that judgment. These judgments typically specify a much narrower range of values than the likelihood function’s available values29.

It is important to remember that priors that provide information have a stronger influence on the posterior than priors that do not. Furthermore, as previously stated, a prior based on expert opinion is excessively rigid because the posterior probability density is constrained by the range of values for expert knowledge as a prior.

Table 12 compares the cumulative densities of prior and posterior distributions linked to expert information thresholds to the densities of different kinds of priors and their corresponding posterior distributions. The mean is utilized as a reference point for prior and posterior distributions. In the case of employing the normal distribution as a proxy for the base scenario of expert knowledge, μ = 0.03912 rather than 0.03667 because only positive values less than 1 were used30.

Table 12. Prior and posterior distributions—Densities by intervals of default probabilities (λ), for 350 obligors, 2 defaults, and a 12% asset correlation.

(#) minimum of λ = 0.01, mode of λ = 0.025, and maximum of λ = 0.075.

Figure 1 graphically depicts the pattern of eleven types of default rates as a function of the number of obligors, when k = 2 and ρ = 12%: posterior default probabilities for nine prior functions (UD, LG, LD, C, I, EJ-BS, EJ-BD, EJ-ND, and EJ-ABS), probabilities for the likelihood of the default distribution (BL), and historical default occurrences (EDR). The most important topics of that figure are described below.

In general, the larger the risk group, the less relevant the prior distribution in defining the posterior distribution, and the stronger the convergence of the posterior distribution to the likelihood function. This rule is not obvious when expert information is used as a prior. The figure depicts the rule, the exception in the case of expert judgment as a prior, and the trend associated with an alternative base scenario for expert judgment (EJ-ABS).

Figure 1. Posterior default probability for different prior distributions (number of defaults = 2; asset correlation = 12%).

In other words, when the size of the risk class, “n”, grows, the ratio between posterior probability with prior based on expert opinion and likelihood probability climbs significantly. The likelihood probabilities decrease greatly when “n” rises, which is not the case with the posterior based on expert information as a prior. By contrast, when posterior probability is calculated from any other prior, the ratio remains roughly constant because both posterior and likelihood probabilities are sensitive to “n” in almost equal proportions. Similarly, the ratio of empirical default rate to likelihood probability is quite stable across “n”.

When “k” and “ρ” are fixed, the figure shows that posterior default probabilities tend to empirical rates as the number of obligors grows, as expected by the law of large numbers. There is no such coherence when posterior probabilities are generated from any prior that relies on expert judgment.

From n = 50 to n = 500, with k = 2 and ρ = 12%, the empirical default rate falls by 360 basis points (bp), from 4% to 0.4%. The likelihood decreases by 770 bp, from 9.77% to 2.07%, the same drop as the posterior with the uniform distribution between 0 and 1 as a prior. For posteriors with linear growth, conservative, linear decrease, and immoderate functions, the respective declines are 1,072, 825, 721, and 441 bp. For posteriors with the hypothetical scenarios 1%/2.5%/7.5% (EJ-BS) and 0.3%/1%/2% (EJ-ABS), given as the minimum/mode/maximum values of the prior default probability, the decreases are only 101 and 16 bp, respectively.
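The basis-point arithmetic in this paragraph can be checked with a few lines of Python. This is only an illustrative sketch: the helper name `drop_in_bp` is hypothetical, and the likelihood values are the ones quoted in the text rather than recomputed.

```python
def drop_in_bp(start, end):
    """Decline between two probabilities, in basis points (1 bp = 0.01%)."""
    return round((start - end) * 10_000)


k = 2
edr_50, edr_500 = k / 50, k / 500        # empirical default rates: 4% and 0.4%
print(drop_in_bp(edr_50, edr_500))       # 360
print(drop_in_bp(0.0977, 0.0207))        # 770 (likelihood values quoted above)
```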

Finally, Figure 1 shows that the concavity generally observed for posterior functions of default probabilities does not exist when expert prior knowledge is provided, regardless of the values taken by the base scenario as a prior. The (in)flexibility of expert distributions can be demonstrated with coefficients of variation, computed from the mean and variance of the 10 posterior means (one for each n-value31). In the two aforesaid expert scenarios these coefficients are vastly smaller: 0.1 for EJ-BS and 0.04 for EJ-ABS. The coefficients of variation for posteriors with other priors are substantially greater, ranging from 0.48 (linear growth) to 0.7 (immoderate).

The prior density of the default probability for EJ-BS has a mean of 3.67% and a standard deviation of 1.39%, whereas EJ-ABS as a prior has a mean of 1.1% and a standard deviation of 0.35%. Although these two scenarios are quite different, they both display the same inflexibility: large risk class sizes are insufficient to ensure a substantial convergence of the posterior distribution to the likelihood function, and thus the prior is stronger than the likelihood. In light of the foregoing concerns, selecting a prior based on expert information should be approached with caution.

The 0.01 and 0.1 thresholds are now revisited. Of the four thresholds of “u” used for uniform, linear growth, and linear decrease priors (1, 0.25, 0.1, and 0.01), the first two will be sufficient because they reflect the default probability profile. The others, particularly the last one (0.01), present some difficulties due to differences in probability profiles. The following is a quick rundown of these difficulties.

There are a number of inconsistencies when the upper threshold assumed for “λ” is believed to be low. As previously mentioned (after the analysis of Table 10 and Table 11), the stronger the asset correlation, the higher the coefficient of skewness. However, this rule does not work with u = 0.01 and (for linear growth function as a prior) with u = 0.1.

The default distribution is right-skewed, regardless of the asset correlation assumed. Nevertheless, the coefficient of skewness might be negative when u = 0.01 (either for ρ = 0% or ρ = 12%).

The higher the asset correlation, the broader the range of the highest density interval; that is, as correlation increases, the distance between the upper and lower bounds of the shortest credible interval grows. At times, this rule does not hold when u = 0.01. Furthermore, in some circumstances with u = 0.01, that range is wider with ρ = 0% than with ρ = 12%.

3.4. Comparison of the Results

Table 13 compares the classical and Bayesian approaches. For each value of the mean linked to the posterior default probability of Table 10 and Figure 1, the corresponding matching percentile was found using the classical binomial approach, assuming a correlation of 12%. That table also provides the matching percentile associated with the beta prior for the empirical default rate as well as the matching percentile related to this empirical rate. It also provides minimum and maximum default probabilities, which are derived using the practical rule described in 2.4.
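One way to read the "matching percentile" is as the confidence level at which the classical upper bound equals a given default probability. The Python sketch below follows that reading under stated assumptions: it uses the one-factor conditional probability G(p, s, ρ) of Appendix A, integrates over the systematic factor with a simple trapezoidal rule, and is not the author's own numerical procedure. All function names are hypothetical.

```python
from math import comb, sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution


def conditional_pd(p, s, rho):
    """Conditional default probability G(p, s, rho), as in Appendix A."""
    return _STD_NORMAL.cdf((_STD_NORMAL.inv_cdf(p) - s * sqrt(rho)) / sqrt(1.0 - rho))


def prob_at_most_k(n, k, p, rho, steps=2000, s_max=8.0):
    """P(defaults <= k), averaging the binomial over the systematic factor s."""
    h = 2.0 * s_max / steps
    total = 0.0
    for i in range(steps + 1):
        s = -s_max + i * h
        g = conditional_pd(p, s, rho)
        tail = sum(comb(n, j) * g**j * (1.0 - g) ** (n - j) for j in range(k + 1))
        weight = 0.5 if i in (0, steps) else 1.0  # trapezoidal end points
        total += weight * tail * _STD_NORMAL.pdf(s) * h
    return total


def matching_percentile(n, k, p, rho):
    """Confidence level at which the classical upper bound equals p."""
    return 1.0 - prob_at_most_k(n, k, p, rho)


# Example: n = 150, k = 0, Bayesian mean of 1.87%, asset correlation of 12%.
print(f"{matching_percentile(150, 0, 0.0187, 0.12):.2%}")
```

With ρ = 0% the function reduces to the plain binomial case, 1 − (1 − p)^n for k = 0. With ρ = 12% the example should land close to the 77.91th percentile reported in the text, although small differences from the paper's own numerical method are possible.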

Table 13. Mean of the Bayesian default distributions and corresponding matching percentiles based on the classical binomial approach (asset correlation = 12%).

(a) minimum of λ = 0.01, mode of λ = 0.025, and maximum of λ = 0.075; (b) minimum of λ = 0.003, mode of λ = 0.01, and maximum of λ = 0.02; EDR - Empirical default rate.

Because the data in Table 13 shows a wide range of percentiles, it is necessary to distinguish between imprudence, conservatism, and exaggeration. Applying the aforementioned practical rule to the figures in the table, one may deduce that the 65th percentile can be used to separate imprudence from conservatism, and the 70th percentile can be used to separate conservatism from exaggeration. Other caps may be considered, in part because evaluating probabilities in low default portfolios requires a great deal of personal judgment. Given this, a 70th minimum percentile would be dangerous for some people, whereas a 75th maximum percentile would not be excessive. Over and above percentiles, the most significant consideration is the need to objectively discern different levels of safety or prudence, which demands the use of a practical rule like the one stated above.
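The two cut-offs can be expressed as a small classification helper. The sketch below is illustrative only: the function name is hypothetical, the treatment of values exactly at 65 or 70 is an assumption, and it covers risk classes with at least one default (the no-default case uses a separate reference percentile).

```python
def classify_percentile(percentile, lower=65.0, upper=70.0):
    """Label a matching percentile using the 65th/70th cut-offs from the text.

    Boundary values are counted as conservative, which is an assumption.
    """
    if percentile < lower:
        return "imprudence"
    if percentile <= upper:
        return "conservatism"
    return "exaggeration"


for pct in (60.0, 67.5, 80.0):
    print(pct, classify_percentile(pct))
```

Keeping `lower` and `upper` as parameters reflects the remark that other caps, such as a 70th minimum or a 75th maximum, may be preferred.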

Figure 2 shows the percentiles for all posterior distributions established in this paper (also assuming 12% for asset correlation).

When there are one or more defaults, the 65th and 70th percentiles (dashed vertical lines) correspond to the lower and upper limits of conservatism. Using the practical rule as a reference to choose the model for estimating the default probability, one notes that only the prior related to the alternative base scenario of expert judgement may be deemed acceptable for a prudent estimation approach. In that scenario, it is reliable to combine empirical data (integrated into the likelihood function) with expert knowledge (integrated into the prior function). The other priors do not deliver such reliable outputs because they produce either imprudent or exaggerated default probabilities.

When there is no default event (therefore, only for n = 150), the reference corresponds to the 77.91th percentile—solid vertical line. The priors 1.1 and 1.2 (uniform distribution), 3.1 (linear decrease), and 4 (conservative) are suitable since they correspond to the intended conservatism32.

Figure 2. Percentiles of the classical binomial approach corresponding to mean of several Bayesian default distributions (asset correlation = 12%).

Because default distributions are heavily skewed to the right, using the mean to distinguish between conservatism and exaggeration in defaulted portfolios looks to be unnecessarily cautious. On the other hand, if there are no defaults in the portfolios, using the mean of the default distributions is a good strategy. When n = 150 and k = 0, the mean of default probability is 1.87%, which corresponds to the 77.91th percentile in a classical approach (see the prior 1.1 in Table 13)33. The corresponding upper limits of the highest density interval for 75%, 90%, and 95% are 2.43%, 4.51%, and 6.17%, respectively (as shown in Table 10), which are clearly unrealistic default probabilities for a low default portfolio.

Table 14 compares the effective number of observed defaults in each risk class to the projected number of defaults using the practical rule. Even with 65th and 70th percentiles, which would appear to be insufficiently cautious at first glance, default estimates are significantly higher than empirical data, a condition that any risk management tool for capital buffer measurement must take into account.

Table 14. Effective and estimated number of defaults.

The disparities between effective defaults and estimated defaults for risk groups having at least one default (corresponding to the 63.83th - 71.02th percentiles) are larger than for the risk group with no default (corresponding to the 77.91th percentile). Adaptations to the standard binomial test result in more prudent results, ensuring that the prudence margin demanded by internal models is sufficient to absorb losses in low default portfolios.

4. Final Thoughts

Both classical and Bayesian techniques carry a degree of uncertainty in default probabilities, as seen in the figures. The estimation of default probabilities for low default portfolios is a wide topic with many questions, and each question has a number of possible answers. This spread of answers is greatest in the Bayesian approach because there is a very broad array of possible prior functions, ranging from the non-informative flat prior distribution (where there is no information about the parameter representing the default probability) to the most informative priors based solely on expert opinion (where all available data about that parameter, both quantitative and qualitative, must be brought in).

It is easier to apply a non-informative prior if there is no idea about the default probability. Nonetheless, it is beneficial to make efforts to get information about that probability. Although the prior function is just one of many assumptions in the entire complex model, it is desirable if it reflects knowledge or feelings about the default probability. In low default portfolios with a lack of loss observations, additional information and expertise become particularly crucial.

The more informative the prior function, the worse the convergence of the posterior distribution to the likelihood function. This resistance holds true for both the base scenarios (regardless of how conservative they are) and the theoretical distributions generated by these scenarios, particularly the beta distribution.

The figures also show that the prior distribution can be chosen with a lot of freedom. Thus, it is possible to use a combination of priors, i.e., an average prior rather than simply one as a way of attempting to reflect the uncertainty degree of the default probability. However, differentiations between imprudence, conservatism, and exaggeration must first be made.
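The idea of an average prior can be illustrated on a grid. The Python sketch below makes several assumptions that are not taken from the paper: the linear growth and linear decrease priors are given the forms 2λ and 2(1 − λ) on [0, 1], the likelihood is the plain binomial (asset correlation of 0%), and all function names are hypothetical.

```python
def posterior_mean(n, k, prior, grid_size=20_000):
    """Posterior mean of the default probability on a midpoint grid over (0, 1).

    Uses the plain binomial likelihood (asset correlation of 0%); the
    binomial coefficient cancels in the normalisation and is omitted.
    """
    h = 1.0 / grid_size
    num = den = 0.0
    for i in range(grid_size):
        lam = (i + 0.5) * h
        w = prior(lam) * lam**k * (1.0 - lam) ** (n - k)
        num += lam * w
        den += w
    return num / den


def linear_growth(lam):
    return 2.0 * lam            # assumed form of the linear growth prior


def linear_decrease(lam):
    return 2.0 * (1.0 - lam)    # assumed form of the linear decrease prior


def average_prior(lam):
    # Equally weighted combination of the two priors above.
    return 0.5 * (linear_growth(lam) + linear_decrease(lam))


for name, prior in [("growth", linear_growth),
                    ("decrease", linear_decrease),
                    ("average", average_prior)]:
    print(name, f"{posterior_mean(350, 2, prior):.4%}")
```

Under these assumed forms, the equally weighted average of the two linear priors is the flat prior, so its posterior mean lies between the two individual results. The paper's figures with a 12% asset correlation would differ.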

Subjectivity is present in both “the Bayesian (or subjective) approach” and “the classical (or frequentist) approach”—terminology stated in the second paragraph of 2.1. Indeed, the classical approach requires the subjective selection of confidence levels (as well as upper confidence bounds), whereas the Bayesian approach requires the selection of prior functions. A realistic rule was proposed to deal with such a wide variety of arbitrary choices. The rule will have the benefit of validating default financial models in general and Bayesian options in particular. As a result, unduly optimistic or overly pessimistic default probabilities are eliminated, allowing banks’ pricing to reflect adequate and consistent levels of provisioning and economic capital.

Practical experts, theoretical academics and wary regulators cannot agree on methods and strategies for estimating default probabilities in portfolios with scarce historical data. Some topics were addressed in this paper; others, such as the model calibrations, and the use of a multi-period method by estimating models, are difficult open issues that require further exploration.

Appendix A. A Brief Summary of Vasicek’s Model (See 2.2.2)

The Basel Committee on Banking Supervision’s structural models for credit risk are based on an approximation of Vasicek’s equations, which use a one-factor Gaussian copula to incorporate asset correlation between each pair of counterparties. Merton pioneered that type of model in 1974, when he developed the default theory using Black-Scholes option pricing and the associated stochastic process.

The value of an asset changes over time and follows a dynamic evolution marked by drift and volatility. The systematic and idiosyncratic features incorporated into Vasicek’s (2002) portfolio value model correspond to the Merton model’s concepts of drift and volatility.

According to this model, a corporation “j” breaks its payment obligation if the random variable $X_j$, which reflects its asset value, falls below a specific default threshold $m_j$. In addition, under the same model, over a specified length of time “t” (usually a one-year period), one has:

$X_{jt} = S_t \sqrt{\rho} + Z_{jt} \sqrt{1 - \rho}$, (9)

where $S$ and $Z_j$ are the systematic and idiosyncratic components of the asset value, respectively, and $\rho$ is the constant asset correlation between two separate corporations.

Despite the fact that each corporation “j” has its own individual and idiosyncratic risk (represented by $Z_{jt} \sqrt{1 - \rho}$), corporations are all exposed to a common and systematic risk, namely the overall status of the economy in which they operate (represented by $S_t \sqrt{\rho}$)34.

$S$ and $Z_j$ are standardized normal random variables. Because these variables are mutually independent, $X_j$ is also a standardized normal random variable, and the asset values of any two corporations are equicorrelated, with correlation $\rho$.
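The standardization and the role of ρ can be checked directly from Equation (9). The short derivation below assumes, as is usual for the one-factor model, that the idiosyncratic terms of two different corporations “i” and “j” are independent of each other as well as of the systematic factor.

```latex
\begin{aligned}
E[X_{jt}] &= \sqrt{\rho}\,E[S_t] + \sqrt{1-\rho}\,E[Z_{jt}] = 0,\\
\operatorname{Var}(X_{jt}) &= \rho\,\operatorname{Var}(S_t) + (1-\rho)\,\operatorname{Var}(Z_{jt}) = \rho + (1-\rho) = 1,\\
\operatorname{Cov}(X_{it}, X_{jt}) &= \rho\,\operatorname{Var}(S_t) = \rho \qquad (i \neq j).
\end{aligned}
```

Since each asset value has unit variance, the covariance ρ is also the pairwise asset correlation.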

As $P(X < m) = \Phi(m) = p$, it follows that:

$P[X < \Phi^{-1}(p)] = P[s\sqrt{\rho} + z_j\sqrt{1-\rho} < \Phi^{-1}(p)] = P\{z_j < [\Phi^{-1}(p) - s\sqrt{\rho}]/\sqrt{1-\rho}\} = \Phi\{[\Phi^{-1}(p) - s\sqrt{\rho}]/\sqrt{1-\rho}\} = G(p, s, \rho)$, (10)

the main theoretical axis in structural default models.
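Equation (10) is straightforward to evaluate numerically. The Python sketch below uses only the standard library; the inputs p = 1% and ρ = 12% are illustrative, and the function names are hypothetical. It also checks, by a simple trapezoidal rule, that averaging G over the systematic factor returns the unconditional probability p.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def conditional_pd(p, s, rho):
    """G(p, s, rho) from Equation (10)."""
    threshold = STD_NORMAL.inv_cdf(p)
    return STD_NORMAL.cdf((threshold - s * sqrt(rho)) / sqrt(1.0 - rho))


def average_over_factor(p, rho, steps=2000, s_max=8.0):
    """Trapezoidal average of G(p, s, rho) over a standard normal s."""
    h = 2.0 * s_max / steps
    total = 0.0
    for i in range(steps + 1):
        s = -s_max + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * conditional_pd(p, s, rho) * STD_NORMAL.pdf(s) * h
    return total


p, rho = 0.01, 0.12  # illustrative inputs
for s in (-2.0, 0.0, 2.0):
    print(f"s = {s:+.1f}: G = {conditional_pd(p, s, rho):.4%}")
print(f"average over s: {average_over_factor(p, rho):.4%}")
```

A negative value of the systematic factor (a weak economy) raises the conditional default probability, while a positive value lowers it.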

Appendix B. Posterior Default Probabilities Assuming Independence of Events (See 3.3.2)

Table 11. Default probability distribution—Posterior using the basic binomial (asset correlation = 0%) as likelihood, for several priors.

Highest density intervals expressed in percentage.

NOTES

*The opinions expressed in this work are a purely personal matter.

1Prudential requirements for credit institutions and investment firms—Regulation 575/2013 of the European Parliament and the Council of 26 June 2013 and its several amending.

2If the posterior density is a multimodal distribution, “the highest density region” should be used, instead of “the highest density interval”.

3The true value of “λ” has a $(1 - \delta)\%$ probability of belonging to the shortest possible interval. If $[\lambda_{lower}, \lambda_{upper}]$ is the 90% credible interval of “λ”, then $P(\lambda_{lower} < \lambda < \lambda_{upper}) = (100 - 10)\% = 90\%$.

4This term is tied to Katja Pluto and Dirk Tasche’s “the most prudent estimation” concept that they established in 2005 and applied to the classical default probability. Each risk class contains not only “n ” and “k ” from that particular class, but also “n ” and “k ” from other classes with lower rating grades.

5 Tasche (2012).

6Term used by Tasche (2012).

7In the second stage, another hypothetical scenario with m = 0.003, Mo = 0.01, and M = 0.02 was found considerably more suitable to the real-world issue of low default portfolios.

8The 1.645 denominator corresponds to a 90% confidence level.

9When there are no defaults, k = 0.000000001 is used instead of k = 0 to ensure that the analytical solution of the beta distribution is valid.

10For events “X” and “Z”, the conditional probability of “X” given the occurrence of “Z”, with “s” denoting the number of disjoint events, is computed as follows:

$P(X|Z) = P(X \cap Z)/P(Z) = P(Z|X)P(X)/P(Z) = P(Z|X)P(X) / \sum_{j=1}^{s} P(Z|X_j)P(X_j)$

If the probability $P(Z|X)$ is derived from a statistical model $L(z|x)$ that describes the likelihood function and the probability $P(X)$ is derived from a prior function $\phi(x)$, the posterior density function is obtained by:

$L(x|z) = L(z|x)\phi(x) / \int L(z|x)\phi(x)\,dx$
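As a worked instance of the posterior formula in footnote 10, take a flat prior $\phi(\lambda) = 1$ on [0, 1] and the plain binomial likelihood with “k” defaults among “n” obligors. Both choices are simplifying assumptions (an asset correlation of 0%), used here only for illustration.

```latex
\begin{aligned}
L(\lambda \mid k) &= \frac{\lambda^{k}(1-\lambda)^{n-k}}{\int_0^1 t^{k}(1-t)^{n-k}\,dt}
 = \frac{\lambda^{k}(1-\lambda)^{n-k}}{B(k+1,\,n-k+1)},\\
E[\lambda \mid k] &= \frac{k+1}{n+2}.
\end{aligned}
```

The binomial coefficient cancels between numerator and denominator, so the posterior is a beta distribution with parameters k + 1 and n − k + 1. For n = 350 and k = 2, the posterior mean is 3/352, or about 0.85%.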

11It should be remembered that four different values of “u ” were tested: 1, 0.25, 0.1, and 0.01.

12It’s worth noting that the rule is an approximation, thus it does not exactly match the conventional binomial test. The mean for drift and the standard deviation for volatility from the same distribution, the ratio of the standard deviation to the number of observations, a two tailed method, and the lack of skewness are all used in that conventional test.

13Similar to Pluto and Tasche (2005).

14 A + B + C = (150, 0) + (500, 1) + (350, 2) = (1000, 3); and B + C = (500, 1) + (350, 2) = (850, 3).

15See also footnote 4.

16Because the default probability is believed to be low, the binomial variance, $n\lambda(1 - \lambda)$, is marginally lower than the Poisson variance, $n\lambda$.

17 46% = 1.11% (Table 4)/0.76% (Table 2) – 1, for (350, 2), and 59% = 0.58% (Table 4)/0.37% (Table 2) – 1, for (1000, 3).

18 138% = 4.71% (Table 4)/1.98% (Table 2) – 1, for (150, 0), and 274% = 2.89% (Table 4)/0.77% (Table 2) – 1, for (1000, 3).

19 81% = 1.66% (Table 4)/0.92% (Table 2) – 1, for (150, 0), and 130% = 1.18% (Table 4)/0.51% (Table 2) – 1, for (1000, 3).

20 115% = 3.28% (Table 4)/1.52% (Table 2) – 1, for (150, 0), and 214% = 2.10% (Table 4)/0.67% (Table 2) – 1, for (1000, 3).

21For the sake of simplicity, one assumes that the exposure amounts for all loan contracts are always the same, regardless of the rating grade associated with each risk class.

22 27% = 1 − [0.37% × (150 + 500 + 350)] / [0.46% × 150 + 0.34% × 500 + 0.76% × 350].

23The largest deviation in the mean of the default probability (0.014%) occurs with n = 150, k = 0 and ρ = 12%.

24This rule may also be shown comparing the values of the posterior probabilities in the last two columns of Table 10 and Table 11 which will be presented later. They are both tied to the same number of defaults, k = 3.

25When capital savings derived from the classical approach—see the last two paragraphs of 3.2—are compared to capital savings derived from the Bayesian approach, the former are higher (than the latter) at a 50% confidence level when ρ = 12%—13% > 9% and 23% > 18%—, and lower (at the same confidence level) when ρ = 0%—16% < 20% and 27% < 33%.

26Although both priors 6.2 and 7 are connected to beta distribution, only the first one is included in Table 10 and Table 11.

27With ρ = 12%, the means for likelihood distribution—or posterior with uniform distribution and u = 1 as a prior—are three or four times bigger than those with ρ = 0%, depending on the pair “n ” and “k ” considered.

28The second paragraph of 3.3.1 came to a similar conclusion.

29This is clear in the base scenario since the full mass of probability density is fixed between 0.01 and 0.075, the minimum and maximum values of the expert prior function. However, it can be seen in any other priors that are used as a proxy for the base scenario.

30One notes that, unlike the beta distribution, whose variable ranges from 0 to 1 (the same range as the default probability), the normal distribution’s domain runs from –∞ to +∞, whereas the prior function’s domain may only allow values between 0 and 1.

31Multiples of 50, from n = 50 to n = 500.

32The probability indicated by the prior 6.3 might also be adequate. It is, nevertheless, connected to an exaggerated base scenario of expert knowledge (i.e., the prior 6.1).

33If there is no default occurrence, one expects that the prudence level given by the percentile is higher than the one when there are historical defaults.

34For instance, an economic index could be a random variable “S ” that reflects the portfolio’s exposure to a common factor.

Conflicts of Interest

The authors declare no conflicts of interest regarding the publication of this paper.

References

[1] Pluto, K. and Tasche, D. (May 2005). Estimating Probabilities of Default for Low Default Portfolios. The Basel II Risk Parameters 2006—Estimation, Validation and Stress Testing (2006). Springer.
[2] Tasche, D. (November 2012). Bayesian Estimation of Probabilities of Default for Low Default Portfolios. Journal of Risk Management in Financial Institutions, 6, 302-326. https://doi.org/10.2139/ssrn.2048818
[3] Vasicek, O. A. (December 2002). The Distribution of Loan Portfolio Value. Risk, 15, No. 12.

Copyright © 2024 by authors and Scientific Research Publishing Inc.

Creative Commons License

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.