J. Serv. Sci. & Management, 2009, 2: 15-28
Published Online March 2009 in SciRes (www.SciRP.org/journal/jssm)
Copyright © 2009 SciRes JSSM
CDS Evaluation Model with Neural Networks
Eliana Angelini¹, Alessandro Ludovici¹
¹ University “G. d’Annunzio” of Pescara
Email: e.angelini@unich.it, a.ludovici1@tin.it
Received May 6th, 2008; revised December 27th, 2008; accepted February 5th, 2009.
ABSTRACT
This paper provides a methodology for valuing credit default swaps (CDS). In these financial instruments a sequence of
payments is promised in return for protection against the credit losses in the event of default. Given the widespread use
of credit default swaps, one major concern is whether the credit risk has been priced accurately. The assessment of
counterparty credit risk is an area of renewed interest due to the present financial crisis.
This article proposes a non-parametric model for pricing CDS using learning networks. The independent variables
follow the structural approach pioneered by Merton [1], who proposed a model for assessing the credit risk of a
company by characterizing the company’s equity as a call option on its assets. The model that we introduce is
distinctive not only for its use of a neural network, but also for its use of the implied volatility of one-year
options written on the shares of the analyzed companies instead of historical volatility: this improves its ability
to capture the signals sent by the market about the future creditworthiness of the firm (historical volatility, being
an average value, introduces time lags into the evaluation). In addition, our analysis differs from the structural
approach in that it considers the 30-month mean-reverting historical series of CDS spreads, which turns out to be
one of the main advantages of our forward-looking model.
Keywords: credit derivatives, CDS, neural networks, pricing models, credit spreads, implied volatility
1. Introduction
In recent years, the market for credit derivatives has ex-
panded dramatically. Credit derivatives are flexible and
efficient instruments that enable users to isolate and trade
credit risk. They allow users to isolate credit
risk from other quantitative and qualitative factors
associated with owning an exposure. Hence, they can be used
to transfer and hedge credit risk in an efficient and flexi-
ble manner, customized to a client’s requirements. This
transfer of credit risk may be complete or partial, and
may be for the life of the asset or for a shorter period.
Credit risk includes not just default or insolvency risk but
also changes in credit spreads and thereby market values,
changes in credit ratings and generic changes in credit
quality. Credit derivatives can be used when a sale in the
cash market is either not efficient or not possible. Even
when cash market alternatives exist, credit derivatives
may be preferred because they do not require funding.
Furthermore, since these derivatives are over-the-counter
contracts, transactions are confidential. Finally, speed of
settlement and liquidity are reasons why credit derivatives
are a better alternative to the reinsurance market. Credit
derivatives take the form of swaps, forwards and option
contracts, most notably credit default swaps (CDS); they can
be used to hedge against all these types of credit risk. In a
simple credit default swap, over some time period, one
counterparty (the protection seller) receives a predetermined
fee payment from another counterparty (the protection buyer);
in return, the protection seller agrees that, in the case of a
credit event of a reference entity, it will pay the buyer the
loss on a bond of the reference entity, that is, the bond’s
par value less its recovery value.
Nowadays, banks, corporations, hedge funds, insurance
companies and pension funds are heavily exposed as buyers
or sellers, or both. By transferring risk, CDS
have acted as a kind of insurance and provided incentives
for risk-taking. They are therefore at the heart of the
present crisis.
Given the widespread use of credit default swaps, as an
investment or a risk management tool, one major concern
is whether the credit risk has been priced accurately. This
article proposes a non-parametric model for pricing
these credit derivatives using learning networks.
The recent application of nonlinear methods, such
as neural networks, to credit risk analysis shows promise
of improving on traditional credit models. Neural
networks differ from classical credit systems mainly in their
black-box nature and because they assume a non-linear
relation among variables. The two main issues to be de-
fined in a neural network application are the network
typology and structure and the learning algorithm. The
connections (links) among neurons have an associated
weight which determines the type and intensity of the
information exchanged. As regards the independent vari-
ables of the model, we start from the typical assumption
of the structural approach based on the theoretical foun-
dation of Merton’s [1] option pricing model: the relevant
information in order to evaluate credit risk can be ob-
tained from the market data of the analyzed companies.
The model developed by Merton views a firm’s equity as
an option on the firm (held by the shareholders) to either
repay the debt of the firm when it is due, or abandon the
firm without paying the obligations. What makes that
model successful is its reliance on the equity market as an
indicator, since it can be argued that the market
capitalization of the firm (together with the firm’s liabilities)
reflects the solvency of the firm. Therefore, option pricing
theory is used in order to create a link between the credit
market and the securities market. The model that we
introduce is distinctive not only for its use of
neural networks, but also for its use of the implied
volatility of one-year options written on the shares of the
companies instead of historical volatility: this improves its
ability to capture the signals sent by the
market about the creditworthiness of the firm (historical
volatility, being an average value, introduces time lags into
the evaluation). In addition, our analysis differs from the
structural approach in that it considers the
30-month historical series of CDS spreads, which turns
out to be one of the main advantages of our
forward-looking model.
The paper is organized as follows. It begins,
in Section 1, by stating the implications of credit
derivatives in portfolio credit risk management. In Section 2,
we first briefly review the main principles and
characteristics of neural networks, focusing above
all on the concepts that are most useful for the application
to financial instruments; then we describe the pricing
model we developed and tested for credit derivatives.
Section 3 develops the theory underlying our
implementation of Merton’s model. Section 4 describes the data
and presents our results, illustrating the effectiveness of
neural networks in approximating the valuation of credit
default swaps. The sample includes 18
American firms from various sectors, including
financial institutions which, typically operating with high
leverage due both to the nature of their business and to the
regulation of bank capital, usually introduce
considerable distortions in parametric models. We
shall show that neural networks are not affected by this
problem. The sample period runs from September 2002
to March 2006: we have considered the five-year
CDS spread of each firm, for a total of 180
quarterly observations obtained from the
Fitch™ database. As already pointed out, implied
volatility plays a decisive role among the variables; in fact we
have obtained a positive correlation with CDS spreads
equal to 0.6338. Leverage is another key variable,
obtained by dividing the face value of the debt of the firm by the
total of its liabilities (including the market capitalization),
with data taken from the Bloomberg™ database. We have
set the risk-free rate equal to the one-year constant
maturity Treasury Bill yield, taken from the Federal
Reserve System database. We then discuss in detail the
experimental settings and the results we obtained, which show
considerable accuracy in prediction. The architecture of
the neural network is feed-forward, trained for 17000
learning epochs using the back-propagation algorithm,
with two hidden layers of 9 and 10 neurons respectively: the
study shows that neural networks
are able to fully capture the variability of the
market dynamics of credit default swaps. The paper ends
by showing that, as far as this field of the financial markets
is concerned, neural networks constitute a highly valid
calculation tool: the literature still lacks
a valuation formula for CDS able to
tie the quoted spreads to the specific underlying variables
of each examined firm, and the neural network can, as will
be shown, fill this gap effectively by addressing
the problem of determining the functional form from a
statistical point of view. As we will show, it is easy to
calculate the sensitivity of the CDS spread to each
independent variable, in order to determine a statistical pricing
formula for CDS.
The paper concludes with a discussion of advantages
and limitations of the solution achieved.
2. Credit Derivatives: Innovative Financial
Instruments
Credit derivatives are financial instruments used to trans-
fer credit risk of loans and other assets. They are bilateral
financial contracts with payoffs linked to a credit related
event such as a default, credit downgrade or bankruptcy.
There are various types, but the basic structures of all
credit derivatives are swaps, options and forwards. Due to
their high flexibility, credit derivatives can be structured
according to the end-users’ needs. For instance, the
transfer of credit risk can be effected for the whole life of
the underlying asset or for a shorter time, and the transfer
can be complete or partial. Delivery can take place
in the form of over-the-counter contracts or embedded in
notes. Moreover, the underlying can consist of a single
credit-sensitive asset or a pool of credit-sensitive assets [2].
2.1 Credit Derivatives: Products and Structures
The most important and widely used credit derivative is the
credit default swap¹. It is an agreement in which one
counterparty (the protection buyer) pays a periodic fee,
typically expressed in fixed basis points on the notional
amount, to the other counterparty (the protection seller)
in return for a contingent payment in the event that a
third-party reference credit defaults. A default is strictly
defined in the contract to include, for example, bankruptcy,
insolvency, and/or payment default. The definition of a
credit event, the relevant obligations and the settlement
mechanism used to determine the contingent payment are
flexible and determined by negotiation between the
counterparties at the inception of the transaction. Since
1991, the International Swaps and Derivatives Association
(ISDA) has made available a standardized letter
confirmation allowing dealers to transact credit swaps under the
umbrella of an ISDA Master Agreement. The evolution of
increasingly standardized terms in the credit derivatives
market has been a major driver of growth because it has reduced
the legal uncertainty that hampered the market’s growth.
¹ The credit default swap is also known as credit default put, credit swap,
default swap, credit put or default put.
The contingent payment in the event of default can be
identified as either:
-a payment of par by the protection seller in exchange
for physical delivery of the defaulted underlying;
-a payment of par less the recovery value of the
underlying as obtained from dealers;
-a payment of a binary, i.e. fixed, amount.
Credit default swaps can be viewed as an insurance
against the default of the underlying or a put option on the
underlying. Figure 1 exhibits the basic structure of a credit
default swap.
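The three settlement alternatives listed above, together with the periodic fee quoted in basis points, can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions (a single period, no accrual conventions, no discounting); the function names `contingent_payment` and `periodic_fee` and the numbers in the example are hypothetical and do not come from the paper.

```python
def contingent_payment(notional, settlement, recovery_rate=None, binary_amount=None):
    """Cash paid by the protection seller after a credit event.

    settlement is one of:
      "physical" - par is paid in exchange for delivery of the defaulted asset
      "cash"     - par less the dealer-quoted recovery value
      "binary"   - a fixed amount agreed at inception
    """
    if settlement == "physical":
        # The seller pays par and receives the defaulted underlying.
        return notional
    if settlement == "cash":
        if recovery_rate is None:
            raise ValueError("cash settlement needs a recovery rate")
        return notional * (1.0 - recovery_rate)
    if settlement == "binary":
        if binary_amount is None:
            raise ValueError("binary settlement needs a fixed amount")
        return binary_amount
    raise ValueError(f"unknown settlement type: {settlement}")


def periodic_fee(notional, spread_bps, year_fraction):
    """Premium paid by the protection buyer for one period (spread in basis points)."""
    return notional * spread_bps / 10_000.0 * year_fraction


# Hypothetical example: 10 million notional, 120 bp spread paid quarterly,
# 40% recovery on the reference bond.
fee = periodic_fee(10_000_000, 120, 0.25)                            # 30,000 per quarter
loss = contingent_payment(10_000_000, "cash", recovery_rate=0.40)    # about 6,000,000
```

Under physical settlement the seller's economic loss is the same par-less-recovery amount, since the delivered asset is worth only its recovery value.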
Moreover, there is the total return swap, in which one
counterparty (the total return payer) pays the other
counterparty (the total return receiver) the total return of an asset (the
reference obligation) in exchange for a regular floating rate
payment, such as Libor plus a spread. “Total return”
comprises the sum of interest, fees and any change-in-value
payments (any appreciation or depreciation) with
respect to the reference obligation.
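As a rough illustration of the cash flows just described, the net payment to the total return receiver for one period can be written as the total return minus the floating leg. The sketch below assumes a single period and simple interest; the function names and the figures in the example are hypothetical.

```python
def total_return(interest, fees, change_in_value):
    """Total return on the reference obligation over one period."""
    return interest + fees + change_in_value


def net_to_receiver(notional, interest, fees, change_in_value,
                    libor, spread, year_fraction):
    """Net cash flow to the total return receiver for one period.

    The receiver gets the total return of the reference obligation and
    pays the floating leg (Libor plus a spread) on the notional.
    """
    floating_leg = notional * (libor + spread) * year_fraction
    return total_return(interest, fees, change_in_value) - floating_leg


# Hypothetical example: over half a year the bond pays 300,000 of interest
# but loses 200,000 in value; the floating leg is Libor 3% + 0.5% on 10 million.
net = net_to_receiver(10_000_000, 300_000, 0, -200_000, 0.03, 0.005, 0.5)
# net is about -75,000: the receiver bears the fall in market value.
```

The negative result in the example shows why the total return swap passes on market risk as well as credit risk: a price fall with no credit event still costs the receiver money.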
In contrast to the credit default swap, the total return
swap transfers not only the credit risk but also the
market risk of the underlying; it effectively creates a
synthetic credit-sensitive instrument. A total return swap
allows an investor to enjoy all of the cash flow benefits of
a security without actually owning the security.
A credit spread option is an option on a reference
credit’s spread in the loan or bond market. In a spread put
option one party pays a premium for the right to sell a
bond to a counterparty at a certain spread at a definite
time in the future. A credit spread option gives the buyer
protection in the event of any unfavourable credit mi-
gration. In a default option, the asset can be put only on
default. The credit spread is the differential yield be-
tween the reference credit and a pre-determined bench-
mark rate. Thus, in credit spread derivatives, payment is
based on the movement of the value of one reference
credit against another.
Figure 1. Credit default swap
that pays out if a specified company’s rating is down-
graded. This kind of option is sometimes embedded in
bond structures.
Finally, credit linked notes are created by embedding
credit derivatives in notes. Credit derivatives have the
advantage that funding is not necessary; whereas credit
linked notes have the benefit of avoiding counterparty risk.
Credit linked notes are frequently issued by special pur-
pose vehicles (corporations or trusts) that hold some form
of collateral securities financed through the issuance of
notes or certificates to the investor. The investor receives a
coupon and par redemption, provided there has been no
credit event of the reference entity. The vehicle enters into
a credit swap with a third party in which it sells default
protection in return for a premium that subsidizes the
coupon to compensate the investor for the reference entity
default risk.
2.2 Fundamental Attractions of Using Credit
Derivatives
In theory, credit derivatives are tools that enable financial
operators to manage their portfolio of credit risks more
efficiently; they enable market participants to devise
flexible personal approaches to the management of credit
risk associated with a variety of underlying financial as-
sets. The promise of these important instruments has not
escaped regulators and policymakers. “Credit derivatives
and other complex financial instruments have contributed
to the development of a far more flexible, efficient, and
hence resilient financial system than existed just a quar-
ter-century ago” [3].
The credit derivatives market offers its users a range of
tools which enable the transfer of credit risk. A brief
review of the available products reveals that in most
cases one party to a transaction receives a fee and com-
mits to provide the other party with a payment should the
credit quality of a third party deteriorate. Whilst the
mechanisms contained in these products are easy to
understand, the broad range of applications is not
immediately obvious.
The users of the risk-management benefits of credit
derivatives tend to be quite diverse. An increasingly
important user group includes financial institutions,
corporates and fund managers. Financial institutions have
embraced the full range of benefits; the use of credit
derivatives by banks has been motivated by the desire to improve
portfolio diversification and the management
of credit portfolios. Corporates are also looking to reduce their
credit exposure to key trading partners, and specifically
they are interested in using credit derivatives to isolate
credit risks in project financing. For fund managers,
although the asset benefits of credit derivatives still suffer
from a lack of liquidity, the use of structures that hedge out
spread risk has some appeal.
This section focuses on a range of uses for credit
derivatives and divides them between credit risk
management and asset opportunities² [4,5].
² For more detailed information on the characteristics of credit
derivatives see DAS, S., (1998); TAVAKOLI, J.M., (1998).
2.2.1 Using Credit Derivatives for Managing Credit Risk
The principal feature of these instruments is that they
separate and isolate credit risk facilitating the trading of
credit risk with the purpose of:
-replicating credit risk;
-transferring credit risk;
-hedging credit risk.
In practice, the rationale behind a transaction may relate
to the management of credit lines, to regulatory capital
offsets, to balance sheet optimization, portfolio hedging
and diversification or pure risk reduction itself. Credit
derivatives can be used as a risk management tool by
portfolio managers to:
-Achieve portfolio diversification: credit derivatives
can be used to achieve portfolio diversification by
allowing access to previously unavailable credits.
They can also be used to diversify across a range of
borrowers and to gain exposure to an asset without
owning it.
-Reduce concentration risk: investors can reduce
portfolio credit risk concentrations using derivatives
structures; they can thus manage country and industry
risks. Reducing credit concentration in loan portfolios
is commonly viewed as the main use of credit
derivatives. However, to date credit derivatives are
generally referenced to assets which are widely traded,
i.e. for which market prices are readily available, or for
which a rating by an international agency is at hand.
-Manage exposures while maintaining client relation-
ships. Changes to credit risk management in the
banking sector are an additional factor contributing to
greater use of credit derivatives. Investors can use
credit derivatives to reduce exposures without selling
them. This effectively frees up credit lines, allowing
more business to be done with a customer.
Furthermore, a bank that is concerned about credit
loss on a particular loan can protect itself by
transferring the risk to someone else while keeping
the loan on its books. As part of their credit risk
management, banks are viewing credit derivatives
more and more often as tradable products, which can
be transferred to third parties before the maturity date
[6,7,8].
-Manage regulatory capital: the new supervisory rules
provided for by Basel II are also increasing the
incentives for banks to use credit derivatives. Where
guarantees or credit derivatives are direct, explicit,
irrevocable and unconditional, and supervisors are
satisfied that banks fulfil certain minimum
operational conditions relating to risk management
processes, they may allow banks to take account of
such credit protection in calculating capital
requirements. A guarantee or credit derivative must
represent a direct claim on the protection provider
and must be explicitly referenced to specific
exposures or a pool of exposures, so that the extent of
the cover is clearly defined and incontrovertible.
Other than non-payment by a protection purchaser of
money due in respect of the credit protection contract
it must be irrevocable; there must be no clause in the
contract that would allow the protection provider
unilaterally to cancel the credit cover or that would
increase the effective cost of cover as a result of
deteriorating credit quality in the hedged exposure. It
must also be unconditional; there should be no clause
in the protection contract outside the direct control of
the bank that could prevent the protection provider
from being obliged to pay out in a timely manner in
the event that the original counterparty fails to make
the payment due. There are cases where a bank
obtains credit protection for a basket of reference
names and where the first default among the
reference names triggers the credit protection and the
credit event also terminates the contract. In this case,
the bank may recognise regulatory capital relief for
the asset within the basket with the lowest
risk-weighted amount, but only if the notional amount
is less than or equal to the notional amount of the
credit derivative. In the case where the second default
among the assets within the basket triggers the credit
protection, the bank obtaining credit protection
through such a product will only be able to recognise
any capital relief if first-default-protection has also been
obtained or when one of the assets within the basket
has already defaulted [9].
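The first-to-default rule summarized in the last bullet can be expressed as a small selection routine. The sketch below is only a literal reading of that summary, not a statement of the regulatory text: the class `BasketAsset`, the function `first_to_default_relief` and the simplification that the risk-weighted amount equals notional times a risk weight are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class BasketAsset:
    name: str
    notional: float
    risk_weight: float  # assumed convention: 1.0 stands for a 100% risk weight

    @property
    def risk_weighted_amount(self) -> float:
        return self.notional * self.risk_weight


def first_to_default_relief(basket: Sequence[BasketAsset],
                            derivative_notional: float) -> Optional[BasketAsset]:
    """Return the asset for which capital relief may be recognised, or None.

    Relief applies to the asset with the lowest risk-weighted amount, and
    only if its notional does not exceed the notional of the credit derivative.
    """
    if not basket:
        return None
    candidate = min(basket, key=lambda asset: asset.risk_weighted_amount)
    if candidate.notional <= derivative_notional:
        return candidate
    return None


# Hypothetical basket of three reference names.
basket = [
    BasketAsset("A", 100.0, 1.0),   # risk-weighted amount 100
    BasketAsset("B", 80.0, 0.5),    # risk-weighted amount 40 (lowest)
    BasketAsset("C", 50.0, 1.0),    # risk-weighted amount 50
]
relieved = first_to_default_relief(basket, derivative_notional=100.0)
```

In this example asset B is selected; with a derivative notional of 60 no relief would be recognised, because B's notional of 80 exceeds it.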
2.2.2 Asset Opportunities
Credit derivatives have evolved to become an important
financial asset class. As already argued, credit derivatives
enable credit risk to be separated from the funding com-
ponent of its underlying instrument; as it is often the form
of the underlying instrument that creates obstacles for the
investor, this separation of the credit risk creates important
opportunities. The decision to use the asset opportunities
of credit derivatives tends to be based on one of the fol-
lowing needs:
-Access to new markets: investors can create new
assets with a specific maturity not currently available
in the market;
-Obtain tailored investments: credit derivatives can be
used to create instruments with the exact risk-return
profile sought. Maintaining diversity in credit
portfolios can be challenging. This is particularly true
when the portfolio manager has to comply with
constraints such as currency denominations, listing
considerations or maximum or minimum portfolio
duration. Credit derivatives are being used to address
this problem by providing tailored exposure to credits
that are not otherwise available in the desired form or
not available at all in the cash market.
-Improve the risk-return profile of portfolios: credit
derivatives offer new possibilities of turning a given
market opinion into an investment strategy. This
particularly entails assumption of specific types of
credit risk without the acquisition of the asset itself.
Instead of purchasing a specific bond, a market
participant who considers some credit risks to be
overvalued can earn an attractive premium as a
protection seller in the credit default swap market.
Premiums are generated without having to tie up any
capital for the purchase of a bond issue (at least as long
as no credit event occurs). On the other hand, market
participants who consider risks to be underestimated
can purchase protection by paying a premium. Owing
to the limited possibilities for short sales in the bond
market, hedge funds are increasingly entering into
positions in the credit derivatives market to implement
their financial strategies. In particular:
- to hedge dynamic risks: exposures that change
with market movements can be hedged using credit
derivatives;
- to manage illiquid credits: credit derivatives can be
utilized to actively manage risk in large illiquid loan
portfolios;
- to execute short credit positions: credit derivatives
can be employed to execute short credit positions
without the risk of a short squeeze or high financing
costs. Hence, investors can use them to hedge or take
advantage of deteriorating credit qualities;
- to hedge declining credit quality: default and spread
options and swaps can be used to hedge declining credit
quality. Credit spread options and swaps can be
used to hedge fluctuations in credit spreads without
having to wait for default to get a payout.
3. The Neural Network Model
The general structure of a neural network model consists
of simple processing units called nodes that interact with
each other using weighted connections. Each unit (node)
receives and processes inputs, and delivers a single out-
put. The input can be raw or output of other processing
units. The output can be the final product or an input to
another unit. In processing the inputs, the model assigns a
weight to each input, where weights represent the relative
strength or importance of inputs. A neural net essentially
represents a nonlinear discriminant function as a pattern
of connections between its processing units.
Neural networks have been used in different fields of
study, such as engineering, medicine, physics and others.
Although their structures differ markedly from
one another, it is possible to point out some fundamental
principles that govern how such tools work.
Moreover, it is important to emphasize from the outset
that relatively simple networks are effective for
analyzing financial dynamics, at least compared to those
used in other fields³ [10,11,12].
Neural networks offer several advantages over
traditional statistical methods. First, neural networks do not
require the restrictive assumptions imposed by
conventional methodologies. Second, neural networks can
develop input-output map boundaries that are highly
nonlinear⁴ [13,14]. Third, they have greater fault tolerance
and adaptability. A neural network examines all the
information available and can promptly incorporate new information
into the analysis through its memorization of
previous learning; it updates its weighting scheme so that
it continually “learns” from experience. Thus, neural
networks are flexible, adaptable systems that can
incorporate changing conditions.
3.1 Architecture of Neural Networks
A neural network relates a set of input variables $\{x_i\}$,
$i = 1, 2, \ldots, k$, to a set of one or more output variables $\{y_j\}$,
$j = 1, 2, \ldots, h$. An essential characteristic of a neural network,
differently from other methods of approximation, is that
it uses one or more hidden layers, in which the input
variables are transformed by a logistic or logsigmoid
function: this characteristic, as shown later, gives to these
instruments a particular efficiency in modeling nonlinear
statistical processes.
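The logsigmoid (logistic) transformation mentioned here is easy to write down. The following sketch, with the hypothetical function name `logsigmoid`, uses a numerically stable form and compares the effect of the same input change near the centre and in the tail of the curve.

```python
import math


def logsigmoid(n):
    """Logistic (logsigmoid) activation, 1 / (1 + exp(-n)), with values between 0 and 1."""
    if n >= 0:
        return 1.0 / (1.0 + math.exp(-n))
    # Algebraically identical form that avoids overflow for very negative n.
    e = math.exp(n)
    return e / (1.0 + e)


# The same input step of 0.5 moves the output far less in the tail
# than around the centre of the curve.
step_centre = logsigmoid(0.5) - logsigmoid(0.0)   # roughly 0.12
step_tail = logsigmoid(6.5) - logsigmoid(6.0)     # roughly 0.001
```

This flattening at both ends is the nonlinearity that the hidden layers exploit.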
In the feed-forward neural network, parallel processing
is combined with the sequential processing typical of
linear methods of approximation. In
sequential processing, particular weights are given to the
input variables through the neurons of the input layer, while in
parallel processing the neurons of the hidden layer perform
further transformations in order to improve the predictions.
The connectors (between the input neurons and the neu-
rons in the hidden layers, and between these and the output
neurons) are called synapses. The feed-forward neural
network with a single hidden layer is the simplest and at
the same time the most used network in the economic and
financial field.
Figure 2. Logsigmoid function
³ DOLCINO, F., GIANNINI, C., ROSSI, E., (1998). For a useful
description of the phenomenon in general terms, see FLOREANO, D., (1993)
and GORI, M., (2003).
⁴ This feature is important for financial analysis because several studies
have shown that the relations between default risk and financial factors
(variables) are often nonlinear. See WU and YU (1996); WU (1991).
Therefore the neurons process the input variables in two
steps: first by forming linear combinations and then by
transforming these combinations through a particular
function, typically the logsigmoid function, illustrated in
Figure 2. An essential characteristic of this function is
its threshold behavior near the values 0 and 1, which turns
out to be particularly suitable for economic problems,
which usually, for very high (or very low) values of the
independent variables, show little change in response to
small changes in the variables. At the analytical level,
the neural network can be described by the following
equations [15]:
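$$n_{k,t} = \omega_{k,0} + \sum_{i=1}^{m} \omega_{k,i}\, x_{i,t} \tag{1}$$
$$N_{k,t} = L(n_{k,t}) = \frac{1}{1 + e^{-n_{k,t}}} \tag{2}$$
$$y_t = \gamma_0 + \sum_{k=1}^{q} \gamma_k N_{k,t} \tag{3}$$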
where $L(n_{k,t})$ represents the logsigmoid activation function.
It is a system with $m$ input variables $x_i$ and $q$ neurons. A
linear combination of these input variables, observed at
time $t$, with the weights of the input neurons $\omega_{k,i}$ and the
constant term (bias) $\omega_{k,0}$ forms the variable $n_{k,t}$. This
variable is then transformed by the logistic function and
becomes the neuron $N_{k,t}$ at time or observation $t$. The set of $q$
neurons at time or observation $t$ is then linearly
combined with the coefficient vector $\gamma_k$ and added to the
constant term $\gamma_0$ in order to obtain the output $y_t$
for time or observation $t$, representing the prediction
of the neural network for the analyzed variable. The
feed-forward neural network used with the logsigmoid
activation function is often called a multi-layer perceptron or MLP
network. A highly complex problem could be treated by
widening this structure, using two
(respectively N and P) or more hidden layers [15]:
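$$n_{k,t} = \omega_{k,0} + \sum_{i=1}^{m} \omega_{k,i}\, x_{i,t} \tag{4}$$
$$N_{k,t} = L(n_{k,t}) = \frac{1}{1 + e^{-n_{k,t}}} \tag{5}$$
$$p_{l,t} = \rho_{l,0} + \sum_{k=1}^{s} \rho_{l,k} N_{k,t} \tag{6}$$
$$P_{l,t} = \frac{1}{1 + e^{-p_{l,t}}} \tag{7}$$
$$y_t = \gamma_0 + \sum_{l=1}^{q} \gamma_l P_{l,t} \tag{8}$$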
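To make the two-hidden-layer system of equations (4) to (8) concrete, the forward pass for a single observation can be sketched in plain Python. This is a minimal illustration, not the implementation used in the paper: the function names `hidden_layer` and `forward`, the nested-list layout of the weights and the toy numbers are all assumptions.

```python
import math


def logsigmoid(n):
    """Numerically stable logistic function."""
    if n >= 0:
        return 1.0 / (1.0 + math.exp(-n))
    e = math.exp(n)
    return e / (1.0 + e)


def hidden_layer(inputs, weights, biases):
    """One logsigmoid layer: bias plus weighted sum of inputs, then squashing.

    weights[k][i] is the weight from input i to neuron k; biases[k] is the
    constant term of neuron k.
    """
    return [
        logsigmoid(b + sum(w * x for w, x in zip(row, inputs)))
        for row, b in zip(weights, biases)
    ]


def forward(x, omega, omega0, rho, rho0, gamma, gamma0):
    """Output y_t of a network with two hidden layers for one observation x."""
    N = hidden_layer(x, omega, omega0)   # first hidden layer, equations (4)-(5)
    P = hidden_layer(N, rho, rho0)       # second hidden layer, equations (6)-(7)
    return gamma0 + sum(g * p for g, p in zip(gamma, P))  # linear output, equation (8)


# Toy example with one input and one neuron per layer:
# N = logsigmoid(0) = 0.5, P = logsigmoid(-1 + 2 * 0.5) = 0.5, y = 1 + 4 * 0.5 = 3.
y = forward([0.0], [[1.0]], [0.0], [[2.0]], [-1.0], [4.0], 1.0)
```

Dropping the second call to `hidden_layer` and feeding `N` straight into the output sum recovers the single-hidden-layer network of equations (1) to (3).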
Adding another hidden layer increases the number of
parameters (weights) to be estimated by (s+1)(q-1)+(q+1),
since the net with a single hidden layer, with
m input variables and s neurons, has (m+1)s+(s+1)
parameters, while the same net with two hidden layers and q
neurons in the second hidden layer has (m+1)s+(s+1)q+
(q+1) parameters. However, the main disadvantage of this
added complexity is not the number of
parameters as such, although these use up degrees of freedom if
the sample size is limited and require a longer training
time, but the greater probability that the net converges
to a local rather than a global optimum. Anyway, it has been
demonstrated that a neural network with two layers is able
to approximate any nonlinear function [16]. A further
quality of this instrument consists exactly of the fact that it
does not just approximate a phenomenon on the basis of a
presumed functional form to be adapted, but at the same
time it determines the functional form and proceeds to the
evaluation of the weights.
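The parameter counts quoted above can be checked with a few lines of arithmetic. In the sketch below the function names are hypothetical and the sizes m, s and q are illustrative values chosen for the example, not the configuration reported for the experiments.

```python
def params_one_hidden(m, s):
    """Weights and biases with m inputs, s hidden neurons and one output."""
    return (m + 1) * s + (s + 1)


def params_two_hidden(m, s, q):
    """Same network with a second hidden layer of q neurons."""
    return (m + 1) * s + (s + 1) * q + (q + 1)


# Illustrative sizes only (assumed for this example).
m, s, q = 3, 9, 10
one = params_one_hidden(m, s)            # 4*9 + 10 = 46
two = params_two_hidden(m, s, q)         # 36 + 100 + 11 = 147
extra = two - one                        # 101 additional parameters
assert extra == (s + 1) * (q - 1) + (q + 1)
```

The difference matches the expression (s+1)(q-1)+(q+1) for any choice of m, because the input-to-first-layer weights are common to both networks.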
In Figure 3 a net with a multiple number of output
variables is illustrated. A neural network with a hidden
layer and two output variables is described by the fol-
lowing equations:
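$$n_{k,t} = \omega_{k,0} + \sum_{i=1}^{m} \omega_{k,i}\, x_{i,t} \tag{9}$$
$$N_{k,t} = L(n_{k,t}) = \frac{1}{1 + e^{-n_{k,t}}} \tag{10}$$
$$y_{1,t} = \gamma_{1,0} + \sum_{k=1}^{q} \gamma_{1,k} N_{k,t} \tag{11}$$
$$y_{2,t} = \gamma_{2,0} + \sum_{k=1}^{q} \gamma_{2,k} N_{k,t} \tag{12}$$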
It can be observed that adding an output variable
requires estimating (q+1) additional parameters, that is,
the number of neurons in the hidden layer plus one.
The number of additional parameters therefore depends on
the number of neurons in the hidden layer, not on the
number of input variables. Using a neural network with multiple
outputs makes sense only if these are closely correlated to
the same set of input variables: as an example we could
mention the term structure of inflation rates or
of interest rates. One of the most common criticisms
made of these instruments is that they are substantially
black boxes: questions regarding the nature of the
parameters, the reasons for the choice of their number, of
the number of neurons and of the number of hidden
layers, and the reasons that relate the architecture of the net to
the structure of the underlying problem to be explained do
not find an answer.
Figure 3. Neural network with one hidden layer and two
output neurons
The risk, when models are based on a large number of parameters, is that their extreme flexibility [17], being able to explain anything and its opposite, ends up contributing no knowledge. However, we must stress that the same criticism can be made of any statistical approximation method: not only neural networks, but also linear models, univariate and multivariate regression, and so on. Neural networks, in particular, are able to explain very irregular processes, for which it is difficult to identify a precise cause-effect relation. The black box criticism therefore points, paradoxically, to one of the greatest qualities of neural networks. In any case, the ease with which the number of parameters of the net can be increased must never make us forget the importance, in any model, of the clarity of the assumptions.
3.2 Data Scaling
A neural network cannot work reliably with raw data in absolute values: unusually large or small values can cause problems of overflow or underflow. When sigmoid functions are used, preprocessing becomes indispensable: this family of functions has a bounded range, (0,1) for the log-sigmoid (logistic) function and (-1,1) for the tan-sigmoid (hyperbolic tangent) function, so the values must be scaled to these intervals. Otherwise the output of the net becomes useless, being equal to the upper or lower threshold for every value above or below a certain limit. In other words, for much of the unscaled data the neurons would simply transmit the threshold value, and a large part of the information would be lost. As for the methods, the linear reduction transforms the series of values $x_k$ into the series $\hat{x}_k$, using the following formulas:
$$\hat{x}_{k,t} = \frac{x_{k,t} - \min(x_k)}{\max(x_k) - \min(x_k)} \tag{13}$$

if the range is between 0 and 1, and

$$\hat{x}_{k,t} = 2\,\frac{x_{k,t} - \min(x_k)}{\max(x_k) - \min(x_k)} - 1 \tag{14}$$

if the desired range is between -1 and 1, while the logarithmic reduction uses the formula:

$$\hat{x}_{k,t} = \frac{\log(x_{k,t} + 1)}{\log(\max(x_k))} \tag{15}$$
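The three reductions can be sketched in a few lines of Python. This is an illustration under stated assumptions: the function names and the sample values are invented, and the logarithmic version follows equation (15) as it can be read from the source, where the exact placement of the "+ 1" term is uncertain.

```python
import math

def scale_unit(series):
    """Equation (13): linear reduction to the interval [0, 1]."""
    lo, hi = min(series), max(series)
    return [(x - lo) / (hi - lo) for x in series]

def scale_symmetric(series):
    """Equation (14): linear reduction to the interval [-1, 1]."""
    lo, hi = min(series), max(series)
    return [2.0 * (x - lo) / (hi - lo) - 1.0 for x in series]

def scale_log(series):
    """Equation (15) as read here: log(x + 1) / log(max(x)).

    Assumes non-negative data with a maximum greater than 1.
    """
    denom = math.log(max(series))
    return [math.log(x + 1.0) / denom for x in series]

# Made-up spread values (basis points) with one outlier.
spreads = [20.0, 45.0, 60.0, 700.0]
linear = scale_unit(spreads)
logarithmic = scale_log(spreads)
```

With these made-up values the linear reduction squeezes the three ordinary observations into the bottom 6% of the interval, while the logarithmic reduction keeps them well separated, which is the effect of outliers discussed later in the empirical section.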
3.3 Learning Process
After the data have been scaled, we have to deal with the problem of estimating the parameters (weights) through the process known as learning (training) of the neural network. It is certainly a much more complex problem than estimating the parameters of a linear model, because of the highly nonlinear nature of neural networks. For this reason numerous locally optimal solutions can exist that do not minimize the difference between the predictions of the net and the actual values. In short, in any nonlinear model the estimation of the parameters must begin from initial conditions that represent a guess of their values. As will be shown, however, the capability of the estimation process to converge to a global optimum depends on the quality of this initial guess: if it lies near a local optimum instead of the global one [10], it is likely that the local optimum will be reached.
This is illustrated in Figure 4: the initial guess of the parameters (or weights of the neurons) could fall anywhere on the x-axis, and if it is near a local minimum, the training process of the net would lead towards it. Later on, it will be observed that the training process of the network is complete when a point is reached at which the derivative of the loss function is zero: we must remember that this condition identifies not only the global optimum but also the local optima and the saddle points. It can therefore be anticipated that if the learning coefficient, which indicates the sensitivity of the net to the training process, is too low, the network may be unable to escape from local optima; if it is too high, the training process may oscillate continuously far from the optimum point, and the network would diverge. In analytical terms, we can illustrate the learning process of a net with two hidden layers, for which it is necessary to determine the set of parameters $\Omega = \{\omega_{k,i}, \rho_{l,k}, \gamma_l\}$. The problem consists of minimizing the loss function [18], defined as the sum of the squares of the differences between the observed data sample y and the prediction of the net ŷ:
Figure 4. Example of a succession of local and global minima

$$\min_{\Omega} \psi(\Omega) = \sum_{t=1}^{T} (y_t - \hat{y}_t)^2 \tag{16}$$

$$\hat{y}_t = f(x_t; \Omega) \tag{17}$$
in which T is the number of observations of the output vector y, and $f(x_t; \Omega)$ represents the neural network. $\psi$ is a nonlinear function of $\Omega$. All nonlinear optimizations begin with an initial guess about the solution and try further, better solutions until the best possible one is found within a reasonable number of iterations. Different methodologies have been proposed to guide this search: some rely on advanced results of numerical analysis, e.g. genetic algorithms, as alternatives to the classic gradient descent or Newton-Raphson methods. In any case the chosen algorithm continues until the last iteration n; alternatively, a tolerance criterion can be set, stopping the iterations when the reduction of the error function falls below a predefined tolerance value. To avoid local optima, one solution is to run the process to a first convergence and then repeat it with a different set of initial parameters to verify whether the solution changes. Alternatively, many runs can be carried out to determine the best solution.
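The restart strategy just described can be sketched on a one-dimensional example. The loss function below is a hypothetical stand-in with one local and one global minimum, not the network loss of equation (16); the step size, tolerance and starting points are likewise assumptions.

```python
def loss(w):
    # Hypothetical loss: local minimum near w = 1.13, global minimum near w = -1.30.
    return w**4 - 3.0 * w**2 + w

def grad(w):
    return 4.0 * w**3 - 6.0 * w + 1.0

def descend(w, rate=0.05, tol=1e-12, max_iter=10_000):
    """Gradient descent that stops when the loss reduction falls below tol."""
    previous = loss(w)
    for _ in range(max_iter):
        w -= rate * grad(w)
        current = loss(w)
        if previous - current < tol:
            break
        previous = current
    return w

# Repeat the search from several initial guesses and keep the best solution.
starts = [-2.0, -0.5, 0.5, 2.0]
solutions = [descend(w0) for w0 in starts]
best = min(solutions, key=loss)
```

Starting points on the right of the hump end in the local minimum, those on the left in the global one; comparing the final loss values across runs is what reveals the difference.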
However, the most serious problems arise when the number of parameters increases or the architecture of the network becomes particularly complex. At the beginning of the 1970s Paul John Werbos proposed an alternative to the gradient method, called the back-propagation method. It is a very flexible method that avoids the problems caused by evaluating the Hessian matrix in gradient descent, and it is surely the most widely used. In passing from one iteration to the next in the estimation of the parameters, the inverse Hessian matrix is replaced by an identity matrix of dimension equal to the number k of parameters, multiplied by the learning coefficient ρ:
$$(\Omega_1 - \Omega_0) = -H_0^{-1} Z_0 = -\rho Z_0 \tag{18}$$
To avoid oscillations this coefficient is chosen in the range [0.05, 0.5]. It can also be endogenous, that is, it can take different values as the gradient decreases and the process appears to converge; finally, different coefficients can be adopted for the various parameters. However, the problem of choosing this coefficient remains, together with the existence of local minima. Moreover, low values of the learning coefficient, although able to avoid oscillations as anticipated, can needlessly prolong the convergence of the minimization process. Convergence can however be accelerated by adding a 'momentum' term, for which at iteration n we will have:
$$(\Omega_n - \Omega_{n-1}) = -\rho Z_{n-1} + \mu\,(\Omega_{n-1} - \Omega_{n-2}) \tag{19}$$
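A minimal sketch of the update in equation (19), reading Z as the gradient of the loss (an assumption, since Z is not defined in this section). The loss used here is a hypothetical, almost flat quadratic chosen to mimic a plateau; the function names are invented for illustration.

```python
def train(grad, w0, rate, momentum, n_iter):
    """Iterate equation (19); momentum = 0 reduces to the plain step -rate * Z."""
    w_prev, w = w0, w0
    for _ in range(n_iter):
        w_next = w - rate * grad(w) + momentum * (w - w_prev)
        w_prev, w = w, w_next
    return w

def flat_grad(w):
    # Gradient of the hypothetical loss 0.005 * w**2: a very flat region around 0.
    return 0.01 * w

plain = train(flat_grad, w0=1.0, rate=0.1, momentum=0.0, n_iter=100)
with_momentum = train(flat_grad, w0=1.0, rate=0.1, momentum=0.9, n_iter=100)
```

After the same 100 iterations the run with momentum 0.9 ends much closer to the minimum at zero than the plain run, because each step carries forward a share of the previous displacement.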
Therefore, with µ generally equal to 0.9, the estimation of the parameters moves faster across a plateau in the error surface. We now briefly discuss the methods used to assess the quality of the output of the net. The most common index of in-sample fit, that is, of the capability of the net to predict the data with which it has been trained, is R-squared (goodness of fit); the root mean squared error (Rmse) measures the capability to generalize the predictions outside the data sample used for the training. In other words, the sample is divided into two parts: the first (in sample) is used to train the net, and the other (out of sample), in general equal to about 25% of the total data, is used to estimate the capability of the net to predict data coming from the same population but not used for the training.
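The two indices and the sample split can be sketched as follows. The function names are illustrative, the formulas are the standard textbook definitions, and taking the out-of-sample part from the end of the list is an assumption (the paper shuffles its sample before training).

```python
import math

def r_squared(actual, predicted):
    """Goodness of fit: 1 - residual sum of squares / total sum of squares."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

def rmse(actual, predicted):
    """Root mean squared error of the predictions."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def split(sample, out_fraction=0.25):
    """Return (in_sample, out_of_sample); the last 25% is held out by default."""
    cut = len(sample) - round(len(sample) * out_fraction)
    return sample[:cut], sample[cut:]

in_sample, out_of_sample = split(list(range(8)))
```

R-squared would then be computed on `in_sample` predictions and Rmse on `out_of_sample` predictions.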
As to the total amount of data required⁵ [10], a neural network undoubtedly requires the estimation of many more coefficients than, for example, a linear model, and this calls for a large sample. Large samples certainly improve the predictive abilities of the net, but they also imply longer training times. Moreover, a large sample is not always an advantage, especially in the financial field, where using very old data distorts the models: market conditions change with extreme rapidity, so very remote data no longer bear any relation to present ones.
4. Credit Risk Approach: Our Assumptions
The recent history of financial markets shows how the rapid development of financial innovation, which has affected all of their structural components, has been accompanied by the constant effort of market operators to find more efficient computational methodologies that can effectively and dynamically support the analysis. Growing concerns about credit risk have created the need for sophisticated credit risk analysis and management tools. Credit risk measurement models and credit risk management tools are both of significant importance in the credit market.
The valuation of a credit default swap depends on the credit quality of the reference entity. Default prediction has long been an important and widely studied topic. There are two main types of models that attempt to describe default processes in the credit risk literature: structural and reduced form models. The first approach is based on modeling the underlying dynamics of interest rates and firm characteristics and deriving the default probability from these dynamics⁶ [1,19,20,21]. Structural models thus use the evolution of firms' structural variables, such as asset and debt values, to determine the time of default.
Merton's model was the first modern model of default and is considered the first structural model. In Merton's model, a firm defaults if, at the time of servicing the debt, its assets are below its outstanding debt. In the second approach, instead of modeling the relationship of default with the features of a firm, this relationship is learned from the data. Reduced form models do not consider the relation between default and firm value in an explicit manner [22,23,24]. The time of default in intensity models is the first jump of an exogenously given jump process. The parameters governing the default hazard rate are inferred from market data. Structural default models provide a link between the credit quality of a firm and the firm's economic and financial conditions. Thus, defaults are endogenously generated within the model instead of exogenously given as in the reduced form approach.

⁵ F. Dolcino, C. Giannini, and E. Rossi, 1998, where the concepts of "evaluation error" and "approximation error" are analyzed.
⁶ R. C. Merton, 1974; F. Black and J. Cox, 1976; F. A. Longstaff and E. Schwartz, 1995; H. E. Leland and K. B. Toft, 1996; C. Dufresne and R. Goldstein, 2001.
The focus of our model is on the structural approach, pioneered by Merton, with some important extensions.
4.1 A Brief Review of the Structural Approach:
Merton’s Model
Merton proposes a simple model of the firm that provides a way of relating credit risk to the capital structure of the firm. The firm has issued two classes of securities: equity and debt. The equity receives no dividends. The debt is a pure discount bond. The value of the firm's assets is assumed to obey a lognormal diffusion process with a constant volatility. The assumptions Merton adopts are: the absence of transaction costs, bankruptcy costs, taxes or problems with indivisibilities of assets; continuous time trading; unrestricted borrowing and lending at a constant interest rate r; no restrictions on the short selling of the assets; the invariance of the value of the firm under changes in its capital structure (Modigliani-Miller Theorem); and a diffusion process for the firm's asset value.
Merton models equity in this levered firm as a call option on the firm's assets with a strike price equal to the debt repayment amount (D). If at expiration (coinciding with the maturity of the firm's short-term liabilities, assumed to be composed of pure discount debt instruments) the market value of the firm's assets (V) exceeds the value of its debt, the firm's shareholders will exercise the option to "repurchase" the company's assets by repaying the debt. However, if the market value of the firm's assets falls below the value of its debt (V < D), the option will expire unexercised and the firm's shareholders will default. The time to expiration is set equal to the maturity of the firm's pure discount debt, typically assumed to be one year. Thus, the probability of default (PD) until expiration is equal to the likelihood that the option will expire out of the money. To determine the PD, the call option can be valued using an iterative method to estimate the unobserved variables that determine the value of the equity call option, in particular V (the market value of assets) and $\sigma_V$ (the volatility of assets). These values for V and $\sigma_V$ are then combined with the amount of debt liabilities D that have to be repaid at a given credit horizon in order to calculate the firm's distance to default, defined as $(V-D)/\sigma_V$, or the number of standard deviations between current asset values and the debt repayment amount. The higher the distance to default (denoted DD), the lower the PD. To convert the DD into a PD estimate, Merton assumes that asset values are log-normally distributed.
Define E as the value of the firm's equity and V as the value of its assets. Let $E_0$ and $V_0$ be the values of E and V today; in the Merton framework we have:

$$E_0 = V_0 N(d_1) - D e^{-rT} N(d_2)$$

$$d_1 = \frac{\ln(V_0/D) + (r + \sigma_V^2/2)\,T}{\sigma_V \sqrt{T}}$$

$$d_2 = d_1 - \sigma_V \sqrt{T}$$

where $\sigma_V$ is the volatility of the asset value and r is the risk free rate of interest, both of which are assumed to be constant. Define $D^* = D e^{-rT}$ as the present value of the promised debt payment and let $L = D^*/V_0$ be a measure of leverage. Because the equity value is a function of the asset value, we can use Ito's lemma to determine the instantaneous volatility of the equity from the asset volatility:
$$\sigma_E E_0 = \frac{\partial E}{\partial V}\, \sigma_V V_0$$

$$\frac{\partial E}{\partial V} = N(d_1)$$
where $\sigma_E$ is the instantaneous volatility of the company's equity at time zero. These equations allow $V_0$ and $\sigma_V$ to be obtained from $E_0$, $\sigma_E$, L and T. The risk neutral probability, P, that the company will default by time T is the probability that shareholders will not exercise their call option to buy the assets of the company for D at time T. This depends only on the leverage, L, the asset volatility, $\sigma_V$, and the time of repayment T.
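The two equations of this subsection can be solved numerically for $V_0$ and $\sigma_V$. The sketch below is one possible scheme, not the authors' procedure: it uses nested bisection, takes D, r and T as direct inputs, and computes the risk neutral default probability as $N(-d_2)$, the standard Merton-model expression for the probability that the call is not exercised. All function names and the example firm are hypothetical.

```python
import math

def norm_cdf(x):
    """Standard normal cumulative distribution function N(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def d1_d2(V, D, r, sigma_V, T):
    d1 = (math.log(V / D) + (r + 0.5 * sigma_V**2) * T) / (sigma_V * math.sqrt(T))
    return d1, d1 - sigma_V * math.sqrt(T)

def equity_value(V, D, r, sigma_V, T):
    """Equity as a call on the assets: V N(d1) - D exp(-rT) N(d2)."""
    d1, d2 = d1_d2(V, D, r, sigma_V, T)
    return V * norm_cdf(d1) - D * math.exp(-r * T) * norm_cdf(d2)

def bisect(f, lo, hi, n_iter=100):
    """Root of f on [lo, hi], assuming f(lo) <= 0 <= f(hi)."""
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def asset_value(E0, D, r, sigma_V, T):
    """Asset value consistent with the observed equity value for a given sigma_V."""
    return bisect(lambda V: equity_value(V, D, r, sigma_V, T) - E0,
                  E0, E0 + D * math.exp(-r * T))

def calibrate(E0, sigma_E, D, r, T):
    """Find (V0, sigma_V) such that sigma_E * E0 = N(d1) * sigma_V * V0."""
    def residual(sigma_V):
        V = asset_value(E0, D, r, sigma_V, T)
        d1, _ = d1_d2(V, D, r, sigma_V, T)
        return norm_cdf(d1) * sigma_V * V - sigma_E * E0
    sigma_V = bisect(residual, 1e-6, sigma_E)
    return asset_value(E0, D, r, sigma_V, T), sigma_V

def default_probability(V0, D, r, sigma_V, T):
    """Risk neutral probability that the option expires unexercised: N(-d2)."""
    _, d2 = d1_d2(V0, D, r, sigma_V, T)
    return norm_cdf(-d2)

# Hypothetical firm: equity 40, equity volatility 45%, debt 60 due in one year.
V0, sigma_V = calibrate(E0=40.0, sigma_E=0.45, D=60.0, r=0.05, T=1.0)
p_default = default_probability(V0, 60.0, 0.05, sigma_V, 1.0)
```

Bisection is slower than Newton-type methods but needs no derivatives and cannot overshoot, which keeps the sketch short and robust.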
4.2 CDS Valuation
In our analysis we present some extensions, because the model requires assumptions about the dynamics of the firm's asset value process.

We suggest a new way of implementing Merton's model using implied volatility instead of historical volatility: this gives a better ability to capture the signals sent by the market about the creditworthiness of the firm. Historical volatility is the realized volatility of a financial instrument over a given time period. Generally, this measure is calculated by determining the average deviation from the average price of a financial instrument in the given time period. Standard deviation is the most common but not the only way to calculate historical volatility. By definition, historical volatility is always backward looking and lags the real-time volatility environment. In the current market environment, however, where both stocks and implied volatility measures are rising, many measures of historical volatility begin to lose their usefulness.
The implied volatility of an option contract is the volatility implied by the market price of the option under an option pricing model. Implied volatility is a forward-looking measure, and differs from historical volatility, which is calculated from known past prices of a security.

Historical volatility tells us how volatile an asset has been in the past. Implied volatility is the market's view of how volatile an asset will be in the future. To determine an option's implied volatility, we have to use a pricing model. We can tell how high or low implied volatility is by comparing the market price of an option with the option's theoretical fair value. This is why we need an option pricing model: to determine the fair value of an option and hence know whether the market price of the option is over- or undervalued.
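As an illustration of how an implied volatility is backed out of a market price, the sketch below inverts a pricing model by bisection. The paper does not say which pricing model is used; the Black-Scholes formula for a European call on a non-dividend-paying share is an assumption made here, and the function names and numbers are hypothetical.

```python
import math

def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def call_price(S, K, r, sigma, T):
    """Black-Scholes price of a European call (assumed pricing model)."""
    d1 = (math.log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    return S * norm_cdf(d1) - K * math.exp(-r * T) * norm_cdf(d2)

def implied_volatility(market_price, S, K, r, T, lo=1e-6, hi=5.0, n_iter=100):
    """Volatility at which the model price matches the market price.

    Works by bisection because the call price increases with volatility.
    """
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if call_price(S, K, r, mid, T) < market_price:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical one-year at-the-money option quoted at the price implied by 25% volatility.
quote = call_price(S=100.0, K=100.0, r=0.03, sigma=0.25, T=1.0)
iv = implied_volatility(quote, S=100.0, K=100.0, r=0.03, T=1.0)
```

If the quoted price were higher than `quote`, the recovered volatility would be higher too, which is the sense in which the option price carries the market's forward-looking view.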
In our analysis, the equity implied volatilities observed in the equity options market have been explored extensively. Our neural network model is based on the implied volatility of one-year options written on the shares issued by the company. It is an attractive alternative to the traditional structural approach, since this implementation makes the model forward-looking. In addition, our model differs from the structural approach in that it considers the 30-month historical series of CDS spreads: we show that the use of these credit spreads, in addition to the other inputs, provides a significant improvement in the accuracy of the model.
We use a model that takes these inputs:
· Leverage of the firm: the level of indebtedness is a significant firm-specific determinant of risk.
· Implied volatility: a theoretical value representing the volatility of the security underlying an option, as determined by the price of the option. The factors that affect implied volatility are the exercise price, the risk-free rate, the maturity date and the price of the option.
· Historical CDS spread series: a CDS is a derivative that protects the buyer against default by a particular company. The CDS spread is the amount paid for protection and is a direct market-based measure of the company's credit risk. CDS spreads contain information which is significant for estimating the probabilities of the occurrence of credit events.
· Recovery rate: the percentage of the notional of the reference asset that is repaid in the event of default.
· Risk free rate: the interest rate that is assumed to be obtainable by investing in financial instruments with no default risk.
5. Data and Empirical Results
In this section the potential of neural networks in approximating the pricing of credit derivatives is shown using real market data, collected from the Fitch™ and Bloomberg™ databases.

Starting from September 2002, we have collected on a quarterly basis data on the 5-year maturity CDS spreads of 18 companies from various economic sectors, together with data on the leverage of the firms, the implied volatility of 1-year maturity call options written on the equities of the firms, and the risk free rate, assumed to be equal to the 1-year constant maturity Treasury Bill yield. As regards the recovery rate, we have used the values most commonly adopted by operators to price CDS, depending on the economic sector to which the reference entity belongs. In the following diagrams we show the sample collected until March 2006, therefore covering 14 quarters.
Table 1. Details of the companies included in the sample (sample description)

N   Ticker  Name                                          Market Cap. (bln $)
1   AA      ALCOA Inc.                                    30,18
2   BA      Boeing Company (The)                          71,91
3   CCL     Carnival Corporation                          30,13
4   COX     Cox Communications Inc. *                     5,9
5   CTX     Centex Corporation                            6,15
6   CVS     CVS Corporation                               26,96
7   CZN     Citizens Communications Corporation           4,81
8   FD      Federated Department Stores Inc.              23,16
9   GPS     Gap, Inc. (The)                               16,23
10  IBM     International Business Machines Corporation   149,11
11  JPM     JPMorgan Chase & Co.                          177,41
12  JWN     Nordstrom Incorporated                        15,03
13  LEH     Lehman Brothers Holdings Inc.                 43,46
14  LEN     Lennar Corporation                            6,74
15  MAR     Marriott International, Inc.                  19,51
16  MCD     McDonald's Corporation                        56,05
17  SBC     AT&T Inc.                                     233,83
18  TXT     Textron Financial Corporation                 12,21

* Company was delisted on December 9th, 2004. This fact does not affect our results in any way.

Figure 5. Risk free rate during our study (Source: Federal Reserve System)
[Plot: risk-free rate (vertical axis, 0 to 5) by quarter from 09/02 to 03/06 (horizontal axis)]

Table 2. Recovery rates (Source: Altman and Kishore (1996))

Figure 6. Relationship between CDS spread, leverage and equity volatility in our sample (Source: our elaborations)

As regards the risk free rate, we must consider that a portfolio made up of a risky bond with yield equal to i and a CDS written on it with a spread equal to sp is virtually free of any credit risk, so its yield must be equal to the risk free rate; therefore we have the following approximation:

$$r_f = i - sp$$

showing an inverse relationship between sp and $r_f$, confirmed by market data. We have the following correlation values:

Variable                 Correlation with CDS Spread
Risk-free (Rf)           -0,2187
Recovery rate (R)        -0,1475
Leverage (L)             -0,0485
Equity volatility (V)    0,6338

Source: our elaborations
As expected, we observe a negative correlation with R (the recovery rate) and a strong positive correlation with V (the implied volatility, which in our study proves to be very effective in predicting a deterioration in creditworthiness). The absence of a correlation with leverage should not seem strange: our sample includes financial companies too, which typically have a very high gearing ratio and a low CDS spread due to prudential regulation. In any case the neural network can handle this problem very well because of its nonparametric capabilities. Excluding the financial firms, the correlation between leverage and credit spreads would rise to 0.317.

The sample is made up of companies from different economic sectors, as can be seen from the recovery rates applied; we consider only large (or at least medium) caps, the only ones for which a liquid CDS market exists. In Figure 6 we show the relationship between CDS spread, leverage and equity volatility. It is evident that there is no linear relation between them. Moreover, only a few data points have a leverage of more than 2: these can only be banks, which under prudential regulation can have a high gearing ratio. In the following part we will show how neural networks are able to price both industrial and financial firms at the same time, even though their leverage differs strongly.
We have used a feed-forward neural network trained with the back-propagation algorithm; it is a 4-layer network, with two hidden layers and an output layer of only one node (the CDS spread).

The input layer consists of 18 nodes: the first four nodes hold the risk free rate, the recovery rate, the leverage and the implied volatility of the firm; the remaining 14 nodes hold the series of quarterly CDS spreads of the firm. If a data point is missing, we simply use the value of the preceding quarter. This approach merges data coming from the firm with data (the CDS spreads) coming from the market, which makes the predictions of the network very effective. Moreover, the power of this approach can be appreciated by observing that the network is thus able to price CDS on reference entities from both the industrial field (which usually have low leverage and high CDS spreads) and the financial field (which have an extremely high gearing ratio but a history of low CDS spreads because of prudential regulation); the network uses this detail to discriminate between them. Figure 7 shows the structure of the network. The sample has been shuffled; the learning parameter has been set to 0.5 and the initial parameters of the neurons have been chosen in the range [-2,2]. Our study shows that a logarithmic reduction is more efficient: because our sample consists of extremely variable data, a simple linear reduction would amplify the distortions introduced by the so-called outliers, that is, data very different from the rest of the sample.
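The construction of the 18-node input vector described above can be sketched as follows. The function names, the use of `None` for a missing quarter and the sample numbers are assumptions made for illustration; scaling of the inputs is left out.

```python
def forward_fill(series):
    """Replace each missing value (None) with the value of the preceding quarter.

    A missing first value has no predecessor and is left as None.
    """
    filled, last = [], None
    for value in series:
        if value is None:
            value = last
        filled.append(value)
        last = value
    return filled

def build_input(rf_rate, recovery, leverage, implied_vol, cds_history):
    """Four firm and market nodes followed by 14 quarterly CDS spreads."""
    if len(cds_history) != 14:
        raise ValueError("expected 14 quarterly CDS spreads")
    return [rf_rate, recovery, leverage, implied_vol] + forward_fill(cds_history)

# Hypothetical firm with two missing quarters in its CDS history.
history = [55.0, 52.0, None, 48.0, 47.0, None, 45.0,
           44.0, 46.0, 43.0, 41.0, 40.0, 42.0, 39.0]
x = build_input(rf_rate=3.1, recovery=0.42, leverage=0.8,
                implied_vol=24.0, cds_history=history)
```

Each firm then contributes one such vector per observation date, with the CDS spread to be priced as the single target value.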
Figure 7. Structure of the neural network (Source: our elaborations)
[Diagram: an input layer of 18 nodes (RF rate, Recov. Rate, Leverage, Equity vol., CDS SP-1 to CDS SP-14), a first hidden layer, a second hidden layer, and an output layer with one node (CDS SP)]

Recovery rates by economic sector (Table 2):

Economic sector        Recovery rate
Hotel chains           0,26
Department stores      0,33
Finance                0,36
Telecommunications     0,37
Constructions          0,39
Metal and mechanic     0,42
Food                   0,45

[Figure 6 plot axes: CDS_Spread, Leverage, Equity_Volat]
Figure 8. Typical correlogram of a CDS spread time series (Source: our elaborations)
In Figure 8 we show as an example the correlogram of the CDS spread time series of The Boeing Company only, for the sake of simplicity, but we obtained the same structure for all the companies included in our sample. The first part shows the correlation between each value and a lagged value (the lag being expressed on the x-axis); the second part shows the correlation between each value and the p preceding values, with p on the x-axis. It is evident that the correlation between values, although decreasing, is strong, so the series is autoregressive and each value can be expressed in terms of the preceding ones. In this sense a CDS spread is more similar to an interest rate than to an equity price: it shows a mean reversion process which tends to pull spreads that are higher (lower) than some long-run average level back to this value over time, that is, with a negative (positive) drift. The sinusoidal cycle observable in the correlogram reflects this phenomenon; moreover, it is a consequence of the close relationship between CDS spreads and risk-free interest rates already discussed [25].
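The first panel of a correlogram plots the sample autocorrelation at each lag. A minimal sketch of that statistic, together with the sign of a mean-reverting drift, is given below; the function names, the speed parameter `kappa` and the numbers are assumptions for illustration and do not come from the paper.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of a series at the given lag."""
    n = len(series)
    mean = sum(series) / n
    variance = sum((x - mean) ** 2 for x in series)
    covariance = sum((series[t] - mean) * (series[t - lag] - mean)
                     for t in range(lag, n))
    return covariance / variance

def mean_reverting_drift(spread, long_run_level, kappa):
    """Drift pulling the spread towards its long-run level at speed kappa."""
    return kappa * (long_run_level - spread)

# A steadily trending toy series has strong positive lag-1 autocorrelation.
toy_series = [float(t) for t in range(10)]
rho_1 = autocorrelation(toy_series, 1)

# A spread above its long-run level gets a negative drift, and vice versa.
drift_high = mean_reverting_drift(150.0, 100.0, 0.2)
drift_low = mean_reverting_drift(60.0, 100.0, 0.2)
```

Evaluating `autocorrelation` for lags 1, 2, 3 and so on on a CDS spread series gives the values plotted in the first part of a correlogram.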
Figure 9, showing in red the neural network predictions and in yellow the real market data, confirms the effectiveness of the neural network in predicting CDS spreads. Tables 3 and 4 report the values of R-squared and Rmse: as can easily be seen, the results are highly consistent. We compare the results from our implementation with another model, Creditgrades™. We must stress that traditional models such as Creditgrades™ would give almost useless predictions, even excluding banks from the sample; neural networks prove to be a powerful pricing instrument for evaluating credit spreads. The architecture of the neural network is feed-forward, trained for 17000 learning epochs using the back-propagation algorithm. The results indicate that neural networks are able to capture most of the variability of the market dynamics of credit derivatives: since there is no unanimity in the literature on the form of the CDS spread evaluation function, neural networks can be seen as effective tools for filling this gap from a statistical point of view.
Figure 9. Market data (in yellow) and predictions of the neural network (in red) (Source: our elaborations)
[Plot: values and predictions; horizontal axis from 0 to 70, vertical axis from 0 to 350]

Figure 10. Relationship between delta and leverage (Source: our elaborations)
[Plot: Delta(leverage); horizontal axis (leverage) from 0,01 to 2,98, vertical axis from -0,0125 to 0,0150]

Table 3. Approximation of the neural network (Source: our elaborations)

Error                      Value
R-squared                  0,9082
Root mean squared error    14,3988

Table 4. Comparing statistical results (Source: our elaborations)

               NN        Credit Grades   Linear regression
Correlation    0,9636    -0,02           0,9309
Rmse           14,3988   >100            30,86
R-square       0,9086    >1              0,8566
Figure 10 shows a "delta" for a CDS contract: the x-axis shows the leverage, and the y-axis the values calculated with the finite differences method, that is:

$$\Delta = \lim_{h \to 0} \frac{SP(lev + h) - SP(lev)}{h} \tag{20}$$

In a similar manner we can calculate for a CDS all the "Greek" letters typical of derivative contracts, using the outputs of the neural network with $h = 10^{-6}$. It is evident in the diagram that for high leverage "delta" becomes negative: highly leveraged companies usually belong to the financial sector, so they are less risky because of prudential regulation. This effect is explained very well by the network; for low leverage (typical of the industrial field) we see a direct relationship between leverage and CDS spreads. In other words, the neural network is able to recognize the risk of the activity carried out by the company using the time series of its CDS spread: in the part of our study covering the correlation, we obtained an average correlation between each observation and the preceding one of 0.90, as is evident from the correlogram shown above. This correlation, along with the part regarding the independent variables, typical of the structural approach, explains the major part of the variability of CDS spreads.
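The finite-difference "delta" of equation (20) can be sketched with any function that maps leverage to a spread. In the paper that function is the trained network; the quadratic `toy_spread` below is only a hypothetical stand-in with a similar shape (rising at low leverage, falling at high leverage).

```python
def delta(spread_fn, leverage, h=1e-6):
    """Forward finite difference (SP(lev + h) - SP(lev)) / h with a small h."""
    return (spread_fn(leverage + h) - spread_fn(leverage)) / h

def toy_spread(lev):
    # Hypothetical stand-in for the network output, not a fitted model.
    return 100.0 * lev - 20.0 * lev**2

delta_low = delta(toy_spread, 1.0)    # low leverage: spread rises with leverage
delta_high = delta(toy_spread, 3.0)   # high leverage: spread falls with leverage
```

The other sensitivities are obtained the same way by perturbing a different input, for example the implied volatility for "vega".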
6. Conclusions and Future Work
In this paper we have discussed an innovative approach to the study of CDS valuation, using neural networks. Our analysis is based on modeling the underlying dynamics of interest rates and firm characteristics and deriving the default probability from these dynamics (the structural approach).

Figure 11. Relationship between vega and equity volatility (Source: our elaborations)
Figure 12. Relationship between gamma and leverage (Source: our elaborations)
Figure 13. Relationship between omega and leverage (Source: our elaborations)
The model that we propose is distinctive in its use of the
implied volatility of one-year options written on the shares
of the analyzed companies, instead of historical volatility.
In addition, the model differs from the structural approach
in that it considers the 30-month historical series of CDS
spreads, thereby including additional market variables. This
makes it possible to use a forward-looking model and to
capture the dynamic behavior of CDS spreads and equity
volatility. The approach merges data coming from the firm
with data coming from the market (the CDS spreads), which
makes the predictions of the neural network considerably
more effective. Moreover, the power of the model can be
appreciated by observing that the network is thus able to
price CDS whose reference entities come both from the
industrial sector (which usually have low leverage and high
CDS spreads) and from the financial sector (which have an
extremely high gearing ratio but a history of low CDS
spreads because of prudential regulation); the network uses
the spread history to discriminate between the two.
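The merging of firm data with market data can be sketched as the construction of one input row per date. The sketch below assumes that each target spread is paired with the spread of the preceding date and with the firm variables observed on the same date; this alignment, the function name and the sample values are assumptions made for illustration and are not taken from the study.

```python
def build_training_rows(spreads, leverage, implied_vol, recovery_rate):
    """Build (inputs, targets) for a network that predicts the spread at t.

    Each input row is [spread at t-1, leverage at t, implied volatility at t,
    recovery rate at t]; the target is the spread at t. The one-period lag
    and the same-date alignment are illustrative assumptions.
    """
    n = len(spreads)
    if not (len(leverage) == len(implied_vol) == len(recovery_rate) == n):
        raise ValueError("all series must have the same length")
    if n < 2:
        raise ValueError("need at least two observations")
    inputs, targets = [], []
    for t in range(1, n):
        inputs.append(
            [spreads[t - 1], leverage[t], implied_vol[t], recovery_rate[t]]
        )
        targets.append(spreads[t])
    return inputs, targets


# Hypothetical observations for a single firm (illustrative values only).
spreads = [50.0, 52.0, 55.0, 53.0]          # CDS spread, basis points
leverage = [0.40, 0.41, 0.43, 0.42]         # firm leverage
implied_vol = [0.25, 0.27, 0.30, 0.28]      # option-implied equity volatility
recovery_rate = [0.40, 0.40, 0.40, 0.40]    # assumed recovery rate

X, y = build_training_rows(spreads, leverage, implied_vol, recovery_rate)
print(X[0], y[0])
```

With rows of this kind, a financial firm and an industrial firm with similar leverage still differ in the lagged-spread input, which is what allows a network to tell them apart.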
We find that the neural network technique is useful for
analyzing the pricing of a credit default swap. Our model
produces a much lower forecasting error than traditional
models such as CreditGrades™, indicating a relatively high
precision in the neural network prediction. In particular,
in the last part, starting from the high correlation
observed between each CDS spread value and the preceding
one in the time series of each company, we have trained a
neural network based both on these time series and on the
structural details of the firms, namely leverage,
option-implied equity volatility and recovery rates. Our
results in terms of R-squared and RMSE are highly coherent
and are confirmed by the empirical data. Our analysis
presents the results that we have achieved and shows that
the neural network model offers an alternative to
traditional methodologies for dealing with complicated
issues related to CDS valuation.
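For reference, the two goodness-of-fit measures mentioned above can be computed as in the following minimal sketch. It uses the usual textbook definitions of RMSE and R-squared, which we assume here since they are not restated in this section; the sample spreads are hypothetical.

```python
from math import sqrt


def rmse(observed, predicted):
    """Root mean squared error between observed and predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return sqrt(sum(squared) / len(squared))


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("need two non-empty sequences of equal length")
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("R-squared is undefined for constant observations")
    return 1.0 - ss_res / ss_tot


# Hypothetical observed and predicted spreads in basis points.
observed = [52.0, 55.0, 53.0, 58.0]
predicted = [51.0, 54.5, 54.0, 57.0]
print(rmse(observed, predicted), r_squared(observed, predicted))
```

Both measures should be computed on observations that were not used for training if they are to indicate forecasting precision.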
At present, however, the CDS market is particularly
volatile. The impact on the economy of the deflating
housing bubble, and of the credit crisis in general, has
stoked fear about increasing corporate defaults. This crisis is
about credit risk. A credit bubble has ballooned for years,
enhanced by the existence of CDS. As credit originators can
pass their risk to other agents, they have been less careful
about the quality of their loans. In that sense, CDS have
given an incentive to extend more credit to riskier
borrowers. As banks and other financial institutions and
companies have committed themselves to the CDS market, they
are now highly dependent on market continuity and on its
smooth functioning. The failure of a major participant (as
with Bear Stearns, and then AIG and Lehman Brothers) can put
all the others at risk; faith in the reliability of the
market has been deeply shaken by these events.
In any case, some aspects of the proposed evaluation
methodology require additional research: a possible next
step for the research community is to improve the models
for catastrophic circumstances (the so-called LFHI (low
frequency-high impact) events); another interesting case
study would be the analysis of the recent financial crisis,
once more reliable information regarding financial
companies becomes available.
REFERENCES
[1] R. C. Merton, “On the pricing of corporate debt: The risk structure of interest rates,” The Journal of Finance, Vol. 29, 1974.
[2] S. Henke, H. P. Burghof, and B. Rudolph, “Credit securitization and credit derivatives: Financial instruments and the credit risk management of middle market commercial loan portfolios,” CFS Working Paper Nr, July 1998.
[3] A. Greenspan, “Economic flexibility,” Speech to HM Treasury Enterprise Conference, London, UK, 2004.
[4] S. Das, “Credit derivatives: Trading & management of credit & default risk,” John Wiley & Sons, Chicago, 1998.
[5] J. M. Tavakoli, “Credit derivatives: A guide to instruments and applications,” John Wiley & Sons, Chicago, 1998.
[6] G. R. Duffee and C. Zhou, “Credit derivatives in banking: Useful tools for managing risk?” Journal of Monetary Economics, No. 48, 2001.
[7] R. Stulz, “Risk management and derivatives,” South-Western Publishing, 2003.
[8] B. A. Minton, R. Stulz, and R. Williamson, “How much do banks use credit derivatives to reduce risk?” Working Papers, 2005.
[9] Bank for International Settlements, “International convergence of capital measurement and capital standards,” Basel Committee on Banking Supervision, A Revised Framework, Update November 2005.
[10] F. Dolcino, C. Giannini, and E. Rossi, “Reti neurali artificiali per l’analisi e la previsione di serie finanziarie,” Collana studi del Credito Italiano, 1998.
[11] D. Floreano and S. Nolfi, “Reti neurali: algoritmi di apprendimento, ambiente di apprendimento, architettura,” in Giornale Italiano di Psicologia, a. XX, pp. 15-50, febbraio 1993.
[12] M. Gori, “Introduzione alle reti neurali artificiali,” in Mondo Digitale n. 4, AICA, settembre 2003.
[13] C. Wu and C. H. Yu, “Risk aversion and the yield of corporate debt,” in Journal of Banking and Finance, No. 20, 1996.
[14] C. Wu, “A certainty equivalent approach to municipal bond default risk estimation,” in Journal of Financial Research, 1991.
[15] P. D. McNelis, “Neural networks in finance,” Elsevier Academic Press, 2005.
[16] A. Beltratti, M. Serio, and P. Terna, “Neural networks for economic and financial modelling,” International Thomson Computer Press, 1996.
[17] S. Haykin, “Neural networks: A comprehensive foundation,” Prentice Hall International, 1999.
[18] P. Werbos, “Backpropagation, past and future,” in Proceedings of the IEEE International Conference on Neural Networks, IEEE Press, 1988.
[19] F. Black and J. Cox, “Valuing corporate securities: Some effects of bond indenture provisions,” Journal of Finance, Vol. 31, 1976.
[20] H. E. Leland and K. B. Toft, “Optimal capital structure, endogenous bankruptcy, and the term structure of credit spreads,” The Journal of Finance, Vol. 51, 1996.
[21] P. Collin-Dufresne and R. S. Goldstein, “Do credit spreads reflect stationary leverage ratios?” Journal of Finance, Vol. 56, 2001.
[22] R. A. Jarrow and S. M. Turnbull, “Pricing derivatives on financial securities subject to credit risk,” The Journal of Finance, Vol. 50, 1995.
[23] R. Jarrow, D. Lando, and S. Turnbull, “A Markov model for the term structure of credit spreads,” Review of Financial Studies, Vol. 10, 1997.
[24] D. Duffie and K. J. Singleton, “Modelling term structures of defaultable bonds,” Review of Financial Studies, Vol. 12, 1999.
[25] J. C. Hull, “Opzioni, futures e altri derivati,” Il Sole 24 Ore S.p.A., 2003.
(Edited by Vivian and Ann)