An Analytical Portfolio Credit Risk Model Based on the Extended Binomial Distribution

The binomial distribution describes the probability of the number of successes in a fixed number of identical, independent experiments, each with a binary outcome. In practical applications such as portfolio credit risk management, the trials are not identical and have different realization probabilities. In addition to the number of successes, the quantitative impact of each outcome is also important. Up to now, no complete model-side implementation of such an extension of the binomial distribution exists, in particular not for trial-specific quantitative parameters. Here, a solution to this problem is described: the extended binomial distribution. The key to the solution lies in the use of the bijection between the elementary events of the binomial distribution and the digit sequences of binary numbers. Based on the extended binomial distribution, an analytical portfolio credit risk model is described. The binomial distribution approach minimizes the approximation error in modeling. In particular, the tails of the loss distribution can be determined in a realistic manner. This analytical portfolio credit risk model is therefore especially suited to the management of risk concentrations and tail risks.


Introduction
Many projects are composed of several partial projects. In general, the probabilities of success of the individual partial projects are not uniform. Furthermore, the partial projects have different weights, in other words different values for the overall project. The aim of this study is to describe the distribution of the value of the overall project.

Extended Binomial Distribution
The simplest discrete distribution is the Bernoulli distribution. Here, it only has to be checked whether a particular event was successful ($X = 1$) or has failed ($X = 0$). The probability of $X = 1$ is $P(X = 1) = p$, and that of the complementary event $X = 0$ is $P(X = 0) = 1 - p$. Compared with the Bernoulli distribution, the binomial distribution is the hierarchically higher-order distribution. The binomial distribution describes random variables based on the so-called Bernoulli trial scheme. For this, $n$ identical, independent trials of a Bernoulli distributed random variable are performed. The number of trials in which the successful event $X = 1$ occurs is the outcome of the binomial distributed random variable.
In this paper, an extension of the binomial distribution without the restriction to identical probabilities $p$ of the trials is described. To allow the trials to be weighted differently where required, each trial is additionally given a specific weighting parameter.
Bernoulli distributed random variables have only two experimental outcomes and can be represented in binary code. This makes it possible to map the Bernoulli trial scheme for $n$ trials one-to-one onto a matrix of zeros and ones with $n$ columns (the number of trials) and $2^n$ different rows (the number of possible combinations of the experimental outcomes). This matrix is called the scenario matrix.
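The construction of the scenario matrix can be sketched in a few lines of Python. This is a minimal illustration, not the author's implementation; the function name `scenario_matrix` and the row order (descending binary numbers) are assumptions made here for the example.

```python
from itertools import product

def scenario_matrix(n):
    """Return the 2**n rows of zeros and ones for n Bernoulli trials.

    Each row is the digit sequence of one n-digit binary number and
    stands for one elementary event: 1 marks a success of trial j,
    0 marks a failure. The row order is an assumption of this sketch.
    """
    # product((1, 0), repeat=n) enumerates every digit sequence exactly once,
    # which is the one-to-one mapping between events and binary numbers.
    return [list(row) for row in product((1, 0), repeat=n)]

# Hypothetical usage: three trials give 2**3 = 8 scenarios.
for row in scenario_matrix(3):
    print(row)
```

Because every additional trial doubles the number of rows, this explicit enumeration is only practical for a moderate number of trials.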
The scenario matrix plays the central role in the description of the following distribution.
Definition 1: The distribution with the distribution function (3) is called the extended binomial distribution.
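The formulas belonging to this definition can be written in the following form. It is a reconstruction from the properties used in the proof that follows (the $2^n$ single probabilities $f_i$, factors equal to $p_j$ or $1 - p_j$ depending on $s_{ij}$, and a strict inequality in the summation condition), not a verbatim quotation. The symbol $x_i$ for the weighted row sums is an assumption introduced here.

```latex
x_i = \sum_{j=1}^{n} s_{ij}\, w_j, \qquad
f_i = \prod_{j=1}^{n} p_j^{\,s_{ij}} (1 - p_j)^{\,1 - s_{ij}}, \qquad
F(t) = \sum_{i \,:\, x_i < t} f_i, \qquad i = 1, \ldots, 2^n .
```

Here $s_{ij}$ are the elements of the scenario matrix, $p_j$ the success probabilities and $w_j$ the weights of the trials.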
The definition is meaningful only if function (3) is a distribution function. To show this, the criteria of the following theorem should be checked.
Theorem (Fisz, 1981; Gnedenko, 1987): A real-valued function $F(x)$ is a distribution function if and only if
1) it satisfies the two conditions $F(-\infty) = 0$ and $F(+\infty) = 1$,
2) it is monotonically non-decreasing, and
3) it is left-continuous.
For all $t < B$, the sum in Equation (3) runs over the empty set. That means $F(t) = 0$ for all $t < B$ and in particular $F(-\infty) = 0$. Moreover, if all weights are non-negative, $F(t)$ is identically zero for all $t < 0$, and the condition $F(-\infty) = 0$ is also satisfied.
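The proof of $F(+\infty) = 1$ is carried out by complete induction. For $t \to +\infty$, all single probabilities $f_i$, $i = 1, \ldots, 2^n$, in Equation (3) have to be summed up, where each $f_i$ is a product of $n$ factors of the form $p_j^{s_{ij}} (1 - p_j)^{1 - s_{ij}}$.

First, the statement is examined for the base case $n = 1$. Without loss of generality, $s_{11} = 1$ and $s_{21} = 0$. This implies

$$F(+\infty) = f_1 + f_2 = p_1 + (1 - p_1) = 1.$$

The induction assumption is that $F(+\infty) = 1$ is satisfied for $n = k$:

$$\sum_{i=1}^{2^k} \prod_{j=1}^{k} p_j^{s_{ij}} (1 - p_j)^{1 - s_{ij}} = 1.$$

In the inductive step it has to be shown that the statement holds for $n = k + 1$. For this purpose, the products under the summation sign are decomposed:

$$\sum_{i=1}^{2^{k+1}} \prod_{j=1}^{k+1} p_j^{s_{ij}} (1 - p_j)^{1 - s_{ij}} = \sum_{i=1}^{2^{k+1}} p_{k+1}^{s_{i,k+1}} (1 - p_{k+1})^{1 - s_{i,k+1}} \prod_{j=1}^{k} p_j^{s_{ij}} (1 - p_j)^{1 - s_{ij}}.$$

The coefficients $s_{i,k+1}$ consist of $2^k$ coefficients equal to zero and $2^k$ coefficients equal to one, each in pairs with an identical factor under the product sign. The sum therefore splits into

$$p_{k+1} \sum_{i=1}^{2^k} \prod_{j=1}^{k} p_j^{s_{ij}} (1 - p_j)^{1 - s_{ij}} + (1 - p_{k+1}) \sum_{i=1}^{2^k} \prod_{j=1}^{k} p_j^{s_{ij}} (1 - p_j)^{1 - s_{ij}} = p_{k+1} + (1 - p_{k+1}) = 1.$$

From the induction assumption it thus follows that $F(+\infty) = 1$ for $n = k + 1$. So condition (1) of the theorem is satisfied.

Journal of Financial Risk Management

As $t$ increases, the number of summands in function (3) increases or stays the same. To demonstrate monotonicity, it has to be shown that all summands are non-negative.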
The summands are products themselves. The factors have the structure $p_j^{s_{ij}} (1 - p_j)^{1 - s_{ij}}$. A distinction has to be made between two cases:
- $s_{ij} = 0$: the factor reduces to $(1 - p_j)$, which is non-negative since $0 \le p_j \le 1$.
- $s_{ij} = 1$: the factor reduces to $p_j$, which is non-negative.
So the function (3) also satisfies the condition (2) of the theorem.
The function (3) is a step function and, because of the strict inequality in its summation condition, it is left-continuous. So the function (3) also satisfies condition (3) of the theorem, and it is a distribution function.
To illustrate Definition 1, the probability mass function and the cumulative distribution function are considered for the example in Table 1 and Figure 1.
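A small numerical illustration of the definition can be sketched as follows. The probabilities and weights used here are hypothetical values chosen for this example and are not the values of Table 1. The function names are also assumptions, and the distribution function is taken as left-continuous, $F(t) = P(X < t)$, in line with the strict inequality discussed above.

```python
from itertools import product
from collections import defaultdict

def extended_binomial_pmf(p, w):
    """Probability mass function of X = sum_j w_j * X_j by full enumeration."""
    pmf = defaultdict(float)
    for row in product((0, 1), repeat=len(p)):   # one row of the scenario matrix
        prob = 1.0
        value = 0.0
        for s, p_j, w_j in zip(row, p, w):
            prob *= p_j if s == 1 else 1.0 - p_j  # factor p_j or (1 - p_j)
            value += s * w_j                      # weighted outcome of the row
        pmf[value] += prob                        # rows with equal value are merged
    return dict(sorted(pmf.items()))

def cdf(pmf, t):
    """Left-continuous distribution function F(t) = P(X < t)."""
    return sum(prob for value, prob in pmf.items() if value < t)

# Hypothetical example with three trials.
p = [0.1, 0.2, 0.3]
w = [1.0, 2.0, 4.0]
pmf = extended_binomial_pmf(p, w)
for value, prob in pmf.items():
    print(value, round(prob, 4), round(cdf(pmf, value), 4))
```

With these weights all eight scenarios lead to different values, so the probability mass function has eight points of support; with equal weights it would collapse to the support of a sum of Bernoulli variables.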

Moments and Characteristics of the Extended Binomial Distribution
By Definition 1, an extended binomial distributed random variable $X$ is a linear combination of independent Bernoulli distributed random variables $X_j$, $j = 1, \ldots, n$, with probabilities $p_j$ and with the weights $w_j$ as linear coefficients: $X = \sum_{j=1}^{n} w_j X_j$.
Based on these considerations, the moments of the extended binomial distribution are determined in the following.
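The expected value of a random variable is generated by the expected value operator, which is linear (Fisz, 1981; Rényi, 1971). This means that, for the sum of a finite number of random variables $X_j$ and real numbers $a_j$, $j = 1, \ldots, n$,

$$E\left(\sum_{j=1}^{n} a_j X_j\right) = \sum_{j=1}^{n} a_j E(X_j).$$

With $E(X_j) = p_j$ for Bernoulli distributed random variables, the expected value of the extended binomial distribution is

$$E(X) = \sum_{j=1}^{n} w_j p_j.$$

The variance of a sum of independent random variables $X_j$ and real numbers $a_j$, $j = 1, \ldots, n$, is

$$\mathrm{Var}\left(\sum_{j=1}^{n} a_j X_j\right) = \sum_{j=1}^{n} a_j^2 \, \mathrm{Var}(X_j)$$

(Rényi, 1971). Bernoulli distributed random variables $X_j$ have the variances $\mathrm{Var}(X_j) = p_j (1 - p_j)$ (Gribakin, 2002), so that

$$\mathrm{Var}(X) = \sum_{j=1}^{n} w_j^2 \, p_j (1 - p_j).$$

The probability generating function of a Bernoulli distributed random variable $X_j$ is given by $G_{X_j}(z) = 1 - p_j + p_j z$ (Rényi, 1971). For the probability generating function of the extended binomial distributed random variable $X$, the independence of the $X_j$ yields

$$G_X(z) = E\left(z^X\right) = \prod_{j=1}^{n} \left(1 - p_j + p_j z^{w_j}\right).$$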
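The closed-form moments of a weighted sum of independent Bernoulli variables, $E(X) = \sum_j w_j p_j$ and $\mathrm{Var}(X) = \sum_j w_j^2 p_j (1 - p_j)$, can be checked against a full enumeration of the scenarios. The following sketch does this for hypothetical input values; the function names are assumptions of this example.

```python
import math
from itertools import product

def moments_closed_form(p, w):
    """Mean and variance from the linearity and independence arguments."""
    mean = sum(w_j * p_j for p_j, w_j in zip(p, w))
    var = sum(w_j ** 2 * p_j * (1.0 - p_j) for p_j, w_j in zip(p, w))
    return mean, var

def moments_by_enumeration(p, w):
    """Mean and variance computed over all 2**n scenarios."""
    mean = 0.0
    second = 0.0
    for row in product((0, 1), repeat=len(p)):
        prob = math.prod(p_j if s else 1.0 - p_j for s, p_j in zip(row, p))
        x = sum(s * w_j for s, w_j in zip(row, w))
        mean += prob * x
        second += prob * x * x
    return mean, second - mean ** 2   # Var(X) = E(X^2) - E(X)^2

# Hypothetical probabilities and weights.
p = [0.05, 0.10, 0.20, 0.02]
w = [3.0, 1.5, 2.0, 10.0]
print(moments_closed_form(p, w))
print(moments_by_enumeration(p, w))
```

Both functions should agree up to floating-point rounding, while the closed form needs only $n$ terms instead of $2^n$ scenarios.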

A Numerical Approach to Apply the Extended Binomial Distribution on Higher Number of Independent Trials
The computational effort for calculating the extended binomial distribution doubles with each additional trial, so the limits of computational feasibility are reached quickly. Under the additional condition that the weights are of approximately the same order of magnitude, the computational effort can be reduced significantly. The extended binomial distribution is then also numerically applicable, in approximation, to problems with a large number of trials.
In Definition 1 it is assumed that the trials $X_j$ are independent of each other. Under this assumption, the extended binomial distribution can be calculated for problems with a large number of trials by:
- splitting the larger problem into partial tests,
- determining the distribution functions and the probability functions of the partial tests separately, and
- finally, successively aggregating the distributions of the partial tests into the distribution of the complete problem.
For the aggregation, the following calculus is used.
Definition 2 (Smirnow & Dunin-Barkowski, 1969): Let $D \subseteq \mathbb{Z}$ be a discrete subset of the integers and let $P_1$ and $P_2$ be two functions with $P_i : D \to \mathbb{R}$ for $i = 1, 2$. Then

$$(P_1 * P_2)(X = n) = \sum_{k \in D} P_1(X = k) \cdot P_2(X = n - k)$$

is the discrete convolution of $P_1$ and $P_2$.
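The discrete convolution of Definition 2 can be sketched as follows. The representation of a probability function as a dictionary mapping integer values to probabilities, and the function name `convolve`, are assumptions of this example.

```python
def convolve(pmf1, pmf2):
    """Discrete convolution of two probability functions.

    Each argument maps an integer value k to its probability. The result
    maps n to the sum over k of pmf1[k] * pmf2[n - k]; looping over all
    pairs (k, m) and adding to n = k + m produces exactly these sums.
    """
    result = {}
    for k, p1 in pmf1.items():
        for m, p2 in pmf2.items():
            result[k + m] = result.get(k + m, 0.0) + p1 * p2
    return result

# Hypothetical example: two weighted Bernoulli trials with weights 1 and 2.
first = {0: 0.9, 1: 0.1}
second = {0: 0.8, 2: 0.2}
combined = convolve(first, second)
print(sorted(combined.items()))
```

The dictionary form stores only values with positive probability, which is convenient when the support has gaps.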
The computational implementation of the convolution of the probability functions of extended binomial distributed random variables poses a practical problem. It results from the potentially large number of different quantitative manifestations in (3). Moreover, Definition 1 does not require the weights $w_j$ to be integers. To apply the convolution calculus in a computationally efficient way, the probability functions are approximated for the aggregation. For this purpose, the quantitative manifestations of the extended binomial distributions are projected onto reference points (Figure 2). The projection is done by rounding the quantitative manifestations to integer multiples of a given discretization unit $U$. The approximation error caused by the projection is low if the weights $w_j$ are of approximately the same order of magnitude. In this way, the model of the extended binomial distribution is applicable to problems with a large number of trials. Up to this point, the meaning of "approximately the same order of magnitude" has not been specified. The key role is played by the discretization unit $U$.
On the one hand, the discretization unit $U$ should not be larger than the smallest weight, since otherwise the impact of the trials with smaller weights is neutralized in the approximation. On the other hand, $U$ should not be too small a fraction of the largest weight, since this would degrade the performance of the convolution. Experience shows that there is no significant performance impairment if one percent of the largest weight is chosen as the discretization unit. From both restrictions it follows that, in the above context, the weights are of approximately the same order of magnitude when the largest weight is not significantly more than one hundred times the smallest.
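The numerical approach of this section (splitting into partial tests, rounding to multiples of $U$, aggregating by convolution) can be sketched as follows. This is an illustration under stated assumptions, not the author's implementation: the function names, the chunk size of ten trials, and the input values are hypothetical, and $U$ is set to one percent of the largest weight as suggested above.

```python
from itertools import product

def discretized_pmf(p, w, unit):
    """Probability function of one partial test on the grid k * unit."""
    pmf = {}
    for row in product((0, 1), repeat=len(p)):
        prob = 1.0
        value = 0.0
        for s, p_j, w_j in zip(row, p, w):
            prob *= p_j if s else 1.0 - p_j
            value += s * w_j
        # Nearest reference point; Python's round() resolves exact ties
        # to the even integer, which is a detail of this sketch.
        k = round(value / unit)
        pmf[k] = pmf.get(k, 0.0) + prob
    return pmf

def convolve(pmf1, pmf2):
    """Discrete convolution of two probability functions on the grid."""
    result = {}
    for k, p1 in pmf1.items():
        for m, p2 in pmf2.items():
            result[k + m] = result.get(k + m, 0.0) + p1 * p2
    return result

def aggregate(p, w, unit, chunk_size=10):
    """Split the trials into partial tests and aggregate them successively."""
    total = {0: 1.0}                      # neutral element of the convolution
    for start in range(0, len(p), chunk_size):
        stop = start + chunk_size
        part = discretized_pmf(p[start:stop], w[start:stop], unit)
        total = convolve(total, part)
    return total

# Hypothetical problem with 30 trials; weights range from 10 to 97.
p = [0.01 + 0.002 * j for j in range(30)]
w = [10.0 + 3.0 * j for j in range(30)]
unit = 0.01 * max(w)                      # one percent of the largest weight
grid_pmf = aggregate(p, w, unit)
```

A full enumeration of 30 trials would require $2^{30}$ scenarios, whereas the sketch enumerates only $3 \cdot 2^{10}$ scenarios and performs three convolutions on an integer grid.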

Application of the Extended Binomial Distribution
A project in this sense is the investment in a loan portfolio. A loan portfolio consists of a certain number of loans. Each loan has a specific exposure and its own probability of default. An estimation of the expected portfolio loss and the loss distribution is required for the management of the portfolio.
To illustrate the problem, a portfolio of four loans is considered. Usually, a tree structure is used to represent the elementary events. The characteristic feature of the extended binomial distribution, the bijection between the tree structure and the scenario matrix, is shown in Figure 3.
In the following, concrete values are used in the example (Table 2).
For portfolios with a larger number of loans, the portfolio is divided into partial portfolios as described in section 4. The loss distributions are computed for the partial portfolios, with the losses rounded to integer multiples of a discretization unit $U$. Next, the loss distributions of the partial portfolios are successively aggregated by convolution until the loss distribution of the complete portfolio is determined, see Figure 4. By decomposing the problem and aggregating the partial results as described above, it is possible to determine the loss distribution for portfolios consisting of a few hundred loans. In this way, the numerical restrictions are relaxed, but not completely eliminated. What does this mean in practice?
A partial portfolio of the largest loans is taken from the complete portfolio. For this sub-portfolio the loss distribution is determined by the extended binomial distribution. What should be done with the remaining portfolio?
In the remaining portfolio, the largest loan accounts for only a few per mille of the complete portfolio. If the remaining portfolio consists of only a few loans, its influence on the loss distribution of the complete portfolio is marginal. This is not the case if the remaining portfolio consists of many loans. Then, because of the large number of small loans, the remaining portfolio is in general well diversified and heterogeneous. The loss distribution of such a portfolio can be well approximated by a Gaussian normal distribution. The parameters $\mu$ and $\sigma$ for the normal distribution approximation are the expected value and the standard deviation of the loss of the remaining portfolio (Fischer, 2012).

Adjusted decomposed probabilities of default for debtors reacting with different sensitivity to changes in the systematic factor are shown schematically in Table 3 for different scenarios of the systematic factor $M$. In this case, an a priori probability of default of 0.03 is assumed. Similar to the one-factor model, the approach of the decomposed probability of default can be further differentiated and extended to several risk factors (Gordy, 2002). Through the synchronization of the probabilities of default, implied correlations or dependencies arise, which are also observable in the real world.
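The normal approximation of the remaining portfolio described above can be sketched as follows. The sketch assumes independent defaults and takes $\mu = \sum_j w_j p_j$ and $\sigma^2 = \sum_j w_j^2 p_j (1 - p_j)$, the moments of a weighted sum of Bernoulli variables; the exact form given by Fischer (2012) may differ. The function names and the portfolio data are hypothetical.

```python
import math

def normal_parameters(p, w):
    """Mean and standard deviation of the loss of independent loans.

    p holds the probabilities of default, w the exposures (weights).
    """
    mu = sum(w_j * p_j for p_j, w_j in zip(p, w))
    variance = sum(w_j ** 2 * p_j * (1.0 - p_j) for p_j, w_j in zip(p, w))
    return mu, math.sqrt(variance)

def normal_cdf(x, mu, sigma):
    """Distribution function of the normal distribution N(mu, sigma^2)."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

# Hypothetical remaining portfolio: 1000 small loans of exposure 1.0,
# each with a probability of default of 0.02.
p = [0.02] * 1000
w = [1.0] * 1000
mu, sigma = normal_parameters(p, w)
print(mu, sigma, normal_cdf(mu + 2.0 * sigma, mu, sigma))
```

To combine this approximation with the exactly computed distribution of the largest loans, the normal distribution would still have to be evaluated on the same grid of multiples of $U$ before the convolution; that step is not shown here.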
The parameter $M$ has not yet been considered. It is used for mapping clusters of relative changes in the economic situation. The state of the economy itself is not directly measurable. As a representative of changes in the economy, changes in insolvency frequencies are used as a measurable quantity, see Figure 5. For risk considerations, this substitution should be appropriate. From the aggregated distribution function and the aggregated probability mass function of the loss of the complete loan portfolio, the well-known risk measures value at risk and expected shortfall are determined (Albrecht, 2004).
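How the two risk measures can be read off a discrete loss distribution is sketched below. The definitions used are common textbook conventions and are assumptions of this example: value at risk as the smallest loss whose cumulative probability reaches the confidence level, and expected shortfall as the average loss in the worst $1 - \alpha$ probability tail. The conventions in Albrecht (2004) may differ in detail, and the loss distribution shown is hypothetical.

```python
def value_at_risk(pmf, alpha):
    """Smallest loss x with P(L <= x) >= alpha."""
    cumulative = 0.0
    for loss in sorted(pmf):
        cumulative += pmf[loss]
        if cumulative >= alpha - 1e-12:   # tolerance for floating-point sums
            return loss
    return max(pmf)

def expected_shortfall(pmf, alpha):
    """Average loss over the worst (1 - alpha) share of probability."""
    var = value_at_risk(pmf, alpha)
    tail_sum = sum(loss * prob for loss, prob in pmf.items() if loss > var)
    tail_mass = sum(prob for loss, prob in pmf.items() if loss > var)
    # The atom at the value at risk contributes only the part of its
    # probability that is needed to fill the tail up to 1 - alpha.
    return (tail_sum + var * ((1.0 - alpha) - tail_mass)) / (1.0 - alpha)

# Hypothetical aggregated loss distribution (loss -> probability).
loss_pmf = {0: 0.90, 10: 0.06, 50: 0.03, 100: 0.01}
print(value_at_risk(loss_pmf, 0.95), expected_shortfall(loss_pmf, 0.95))
```

Because both measures are computed directly from the probability mass function, repeated runs give identical results, in contrast to simulation-based estimates.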

Conclusion
A technique was developed to determine the probability mass function and the cumulative distribution function of the extended binomial distribution, that is, of trials composed of independent, heterogeneous Bernoulli distributed single trials. Additionally, a numerical approach was described for approximating solutions to problems with a larger number of trials. The extended binomial distribution provides the foundation for a new analytical portfolio credit risk model. The new model expands the set of analytical portfolio credit risk models, which was previously represented essentially only by the family of CreditRisk+ models. The analytical approach makes results exactly reproducible. This in turn allows separate analyses with regard to individual risk factors or risk positions. The approach of the extended binomial distribution allows the approximation error to be reduced. This has practical benefits, in particular in determining the tails of the loss distribution. Hence the model is well suited to the identification of tail phenomena and to the management of risk concentrations.

Conflicts of Interest
The author reports no conflicts of interest. The author alone is responsible for the content and writing of the paper.