Kernel Alternatives to Approximate Operational Severity Distribution: An Empirical Application
Filippo di Pietro
University of Seville
[email protected]
María Dolores Oliver Alfonso
University of Seville
Ana I. Irimia Diéguez
University of Seville
The severity loss distribution is the central issue in operational risk estimation.
Numerous modelling methods have been suggested, although very few work for
both high-frequency small losses and low-frequency large losses. In this paper
several estimation methods are explored. The good performance of the
double transformation kernel estimation in the context of operational risk
severity deserves special mention. This method, based on the work of
Bolancé and Guillén (2009), was initially proposed for modelling the cost of
insurance claims, and it represents an advance in operational risk research.
Keywords: Operational risk, Basel II, Loss Distribution Approach, Non-parametric estimation, Kernel smoothing estimation.
JEL Classification: G10, G20, G21, G32, C11, C14, C16
1. Introduction
The revised Basel Capital Accord requires banks to meet a capital requirement
for operational risk as part of an overall risk-based capital framework. As
regards the definition aspects, the Risk Management Group (RMG) of the Basel
Committee and industry representatives have agreed on a standardised
definition of operational risk, i.e. “the risk of loss resulting from inadequate or
failed internal processes, people and systems or from external events” (BIS,
2001b).
The proposed discipline establishes various schemes for the calculation of the
operational risk charge, of increasing sophistication and risk sensitivity.
The most sophisticated approach is the Advanced Measurement Approaches
(AMA), based on the adoption of internal models of banks. The framework gives
banks a great deal of flexibility in the choice of the characteristics of their
internal models, provided they comply with a set of eligible qualitative and
quantitative criteria and can demonstrate that their internal measurement
systems are able to produce reasonable estimates of unexpected losses. Use
of the AMA is subject to supervisory approval.
The Basel Committee on Banking Supervision (the Committee) recognises that
the approach to operational risk management chosen by an
individual bank will depend on a range of factors, including its size and
sophistication and the nature and complexity of its activities.
As regards the measurement issue, a growing number of articles, research
papers and books have addressed the topic from a theoretical point of view. In
practice, this objective is complicated by the relatively short period over which
operational risk data have been gathered by banks; obviously, the greatest
difficulty is in collecting information on infrequent, but large losses, which, on
the other hand, contribute the most to the capital requirement. The need to
evaluate the exposure to potentially severe tail events is one of the reasons why
the new Capital framework requires banks to supplement internal data with
further sources (e.g. external data, scenario analysis) in order to compute their
operational risk capital requirement.
Unlike the modelling of market and credit risk, the measurement of operational
risk is faced with the challenge of limited data availability. Furthermore, due to
the sensitive nature of operational loss data, institutions are unlikely to freely
share their loss data. Recently, the measurement of operational risk has moved
towards the data-driven Loss Distribution Approach (LDA) and therefore, many
financial institutions have begun collecting operational loss data in order to take
advantage of this approach.
Within the AMA approach, LDA is a tool used by banks to identify and assess
operational risk. LDA is a quantitative approach based on actuarial models
which can provide the risk profile of a financial entity. LDA has emerged as the
best model for statistical treatment of operational risk.
The selection of the severity loss distribution is probably the most important
phase, and the most complicated one, in the estimation of the operational risk
capital requirement. An incorrect choice of distribution leads to a severe
distortion of the model and to an underestimate or overestimate of the
regulatory capital for operational risk, which could have a large impact on the
Basel II economic capital requirement. Recent literature on operational risk has
focused attention on the use of a parametric specification of the loss
distribution. It is the simplest method to follow, since it attempts to fit
analytical distributions with certain properties. The aim of this approach is to
find a distribution that fits the severity of the losses in the available sample.
Another technique commonly applied in operational risk is Extreme Value Theory
(EVT), which is well suited to cases where the main interest is the tail of the
distribution.
The main problem of parametric methodologies is that it is frequently
impossible to represent the operational severity distribution with a single
parametric distribution that fits both the body and the tail of the distribution
at the same time. According to the most recent literature on operational risk
(Gustafsson, 2008), one good candidate for an operational risk severity
distribution that can solve the problem of fitting the whole distribution is the
modified Champernowne distribution.
An alternative is offered by EVT, which attempts to solve this problem by fitting
the tail alone; however, no objective method exists to decide the threshold
where the tail starts, and this greatly affects the estimation. Embrechts et al.
(2003, 2004) explain that, whereas EVT is the natural set of statistical
techniques for estimating high quantiles of a loss distribution, this can be
carried out with sufficient accuracy only when the data satisfy specific
conditions. Another problem is that EVT tends to overestimate the capital
requirement for operational risk; this might be particularly relevant in this
case, where the database comes from a medium-sized Spanish Savings Bank. Despite
these problems, EVT is widely used in risk management and especially in
operational risk, as in McNeil (1996), Moscadelli (2004), and Cruz (2002).
For this reason, we take as alternative techniques those that permit the
quantification of operational loss severity by fitting the whole distribution
and that do not require the specification of a parametric distribution. To this
end, kernel estimation is taken as the starting point, improved with the
parametric transformation approach given in Wand, Marron and Ruppert (1991) and
recently considered in Bolancé, Guillén and Nielsen (2008) and Buch-Larsen,
Nielsen, Bolancé and Guillén (2005). The analysis is completed with the latest
methodology developed by Bolancé and Guillén (2009), based on a double
transformation, which in our opinion can notably improve the estimation of
operational loss severity. In order to explore all these methods, a sample of
operational losses from a medium-sized Spanish Savings Bank is used.
In this article, we attempt to find which techniques (parametric and
non-parametric) yield a more appropriate measure of operational risk severity,
with a particular focus on the Savings Bank case, demonstrating that
non-parametric estimation with the latest improvements (especially the double
transformation kernel estimation) is a good alternative for fitting the loss
severity distribution and performs much better than parametric distributions.
These methodologies make it possible to fit the whole distribution, so that
estimating a threshold becomes unnecessary. Another good property of
non-parametric techniques, with respect to EVT, is that they do not seem to
overestimate the capital requirement. The double transformation kernel
estimation was first applied to insurance claims; we think that this methodology
can also represent an advance in the estimation of operational loss severity.
The remainder of the paper is organized as follows. In the second section, a
full description of the parametric and non-parametric methodologies is given.
In the third section, the characteristics of the data, an exploratory analysis,
and the analysis and comparison of the results are presented. The final section
concludes.
2. Theoretical Framework
In operational risk and specifically in the dataset considered, most operational
losses are small and extreme losses are rarely observed, although the latter
can have a substantial influence on the capital charge estimation.
In previous studies (e.g. de Fontnouvelle, Rosengren, and Jordan, 2005), the
authors have tried to find a parametric distribution that fits the whole
distribution well; however, it is a major problem to find a distribution that
provides a good fit for both the body and the tail of the loss distribution.
This method suffers from another drawback, since no special emphasis is given to
the quality of the fit in the tail, where the problem of operational risk is
concentrated.
One way to overcome this problem is to analyze small and large losses separately
with Extreme Value Theory (EVT). This approach involves some drawbacks: choosing
the appropriate parametric model; finding the best way to estimate the
parameters; and, most importantly, identifying a criterion to determine the
threshold, not to mention that any loss threshold applied would discard
important information and therefore reduce the robustness of the results
obtained.
A non-parametric approximation, especially kernel estimation, can be a good
alternative. The classic kernel estimation of the distribution function is a
simple smoothing of the empirical distribution, and for this reason the lack of
sample information implies a loss of precision in the estimation of the heavy
losses. The transformed kernel estimation is an alternative that improves the
results obtained with the classic kernel estimation. This approach transforms
the variable by applying a concave function that symmetrizes the original data,
and then applies the classic kernel estimation to the transformed variable.
The methodology of Bolancé and Guillén (2009) is proposed as an alternative
method. They propose a new transformation kernel estimation, based on a double
transformation, which can improve the estimation of risk measures. In this paper
this methodology is tested in the field of operational risk in order to compare
the results with those obtained through more traditional methodologies. To this
end, in addition to the non-parametric methodologies, parametric distributions
such as the lognormal, the Weibull, and a special case of the Champernowne
distribution are fitted to the data.
2.1. The parametric loss models.
This section explores three parametric alternatives for capturing the severity
of operational losses.
The most common distribution for modelling operational risk is the lognormal
distribution1. A random variable X is said to be lognormally distributed if its
density and distribution function are, respectively2:
f(x) = \frac{1}{x\sigma\sqrt{2\pi}} \exp\left(-\frac{(\log x - u)^{2}}{2\sigma^{2}}\right)   (1)

F(x) = \Phi\left(\frac{\log x - u}{\sigma}\right), \quad x > 0   (2)
where Φ(·) is the standard normal distribution function.
The parameters u (−∞ < u < ∞) and σ (σ > 0) are the centre and scale
parameters, respectively, and can be estimated by maximum likelihood:

\hat{u} = \frac{1}{n}\sum_{j=1}^{n}\log x_{j}, \qquad
\hat{\sigma}^{2} = \frac{1}{n}\sum_{j=1}^{n}\left(\log x_{j} - \hat{u}\right)^{2}   (3)
The inverse distribution function is F^{-1}(p) = e^{\sigma\Phi^{-1}(p) + u}; therefore the lognormal random
variable can easily be simulated.
1 The lognormal distribution was proposed by the Basel Committee for modelling operational risk.
2 See Chernobai, A., Rachev, S., Fabozzi, F., Operational Risk: A Guide to Basel Capital Requirements, Models, and Analysis (2007).
The Weibull distribution is a generalization of the exponential distribution,
whereby two parameters instead of one allow more flexibility and heavier tails.
Its density and distribution function are:

f(x) = \alpha\beta x^{\alpha-1} e^{-\beta x^{\alpha}}   (4)

F(x) = 1 - e^{-\beta x^{\alpha}}   (5)

where x > 0, with β (β > 0) as the scale parameter and α (α > 0) as the shape
parameter. From (5), the inverse distribution is
F^{-1}(p) = \left(-\beta^{-1}\log(1-p)\right)^{1/\alpha}; equivalently, a Weibull
random variable can be generated by drawing an exponential random variable Y
with parameter β and applying the transformation X = Y^{1/\alpha}.
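A minimal sketch of this generation scheme (Python; function names are illustrative):

```python
import numpy as np

def simulate_weibull(alpha, beta, size, seed=None):
    """Simulate Weibull losses with shape alpha and scale beta as in (4)-(5):
    draw Y ~ Exponential(rate beta), then return X = Y**(1/alpha)."""
    rng = np.random.default_rng(seed)
    y = rng.exponential(scale=1.0 / beta, size=size)  # exponential with rate beta
    return y ** (1.0 / alpha)

def weibull_quantile(p, alpha, beta):
    """Direct inversion of (5): X = (-log(1-p)/beta)**(1/alpha)."""
    return (-np.log(1.0 - np.asarray(p)) / beta) ** (1.0 / alpha)
```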
The Champernowne distribution has been used in the field of operational risk
both as a suitable parametric distribution for the severity of operational
losses3 and as a transformation in kernel estimation4. The original Champernowne
distribution has density
f(x) = \frac{c}{x\left[\frac{1}{2}\left(\frac{x}{M}\right)^{-\alpha} + \lambda + \frac{1}{2}\left(\frac{x}{M}\right)^{\alpha}\right]}, \quad x \geq 0,   (6)
where c is a normalizing constant and α, λ and M are parameters. When λ = 1 and
c = α/2, the density is simply called the Champernowne distribution:

f(x) = \frac{\alpha x^{\alpha-1} M^{\alpha}}{\left(x^{\alpha} + M^{\alpha}\right)^{2}},   (7)
The Champernowne distribution converges to a Pareto distribution in the tail,
while resembling a lognormal distribution near 0 when α > 1. Its density is
either 0 or infinite at 0 (unless α = 1).
3 See Gustafsson, J., Thuring, F. (2008), "A Suitable Parametric Model for Operational Risk Applications".
4 See Bolancé, C., Guillén, M. (2009), "Estimación Núcleo Transformada para aproximar el Riesgo de Pérdida".
In this work the modified Champernowne distribution function5 is used, with an
additional parameter c that ensures a positive finite value of the density at 0
for all α. The modified Champernowne cdf is defined for x ≥ 0 and has the form

T_{\alpha,M,c}(x) = \frac{(x+c)^{\alpha} - c^{\alpha}}{(x+c)^{\alpha} + (M+c)^{\alpha} - 2c^{\alpha}}, \quad \forall x \in \mathbb{R}_{+}   (8)
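A minimal sketch of this transformation (Python; the function name and the example call are illustrative), implementing the modified Champernowne cdf of equation (8):

```python
import numpy as np

def champernowne_cdf(x, alpha, M, c=0.0):
    """Modified Champernowne cdf T_{alpha,M,c}(x), equation (8)."""
    x = np.asarray(x, dtype=float)
    num = (x + c) ** alpha - c ** alpha
    den = (x + c) ** alpha + (M + c) ** alpha - 2.0 * c ** alpha
    return num / den

# Example with the parameters estimated later in the paper (alpha=1.27, M=50, c=0):
# T maps x = M to exactly 0.5.
t = champernowne_cdf([10.0, 50.0, 1000.0], alpha=1.27, M=50.0, c=0.0)
```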
2.2. Extreme Value Theory
As seen in conventional inference, the influence of the small and medium-sized
losses on the estimated curve parameters prevents the attainment of models that
fit the tail data accurately. An obvious solution to this problem is to
disregard the body of the distribution and to focus the analysis only on the
large losses. When only the tail area is considered, several distributions could
be adopted, such as the lognormal and the Pareto, which are often used in
insurance to model large claims. However, in this section the attention is
focused on the extreme value distributions stemming from Extreme Value Theory
(EVT)6, and especially on the Peaks Over Threshold (POT) approach7.
As shown by Chapelle et al. (2008), this approach enables us to simultaneously
determine the cut-off threshold and to calibrate a distribution for the extreme
losses above this threshold. The severity component of the POT method is based
on the Generalized Pareto Distribution (GPD), whose cumulative function is
usually expressed as the following two-parameter distribution8:

GPD_{\xi,\sigma}(x) =
\begin{cases}
1 - \left(1 + \dfrac{\xi x}{\sigma}\right)^{-1/\xi} & \text{if } \xi \neq 0 \\
1 - \exp\left(-\dfrac{x}{\sigma}\right) & \text{if } \xi = 0
\end{cases}   (9)
5 See Buch-Larsen, T., Guillén, M., Nielsen, J.P. and Bolancé, C. (2005), "Kernel density estimation for heavy-tailed distributions using the Champernowne transformation".
6 For a comprehensive source on the application of EVT in finance and insurance see Embrechts et al. (1997) and Reiss and Thomas (2001).
7 In the peaks over threshold (POT) model, the focus of the statistical analysis is put on the observations that lie above a certain high threshold.
8 See Moscadelli, M. (2004), "The Modelling of Operational Risk: Experience with the Analysis of the Data Collected by the Basel Committee".
where x ≥ 0 if ξ ≥ 0, and 0 ≤ x ≤ −σ/ξ if ξ < 0; ξ and σ represent the shape
and the scale parameter, respectively.
In this work we use the extended version of the GPD, which includes a location
parameter u:

GPD_{\xi,\sigma,u}(x) =
\begin{cases}
1 - \left(1 + \dfrac{\xi (x-u)}{\sigma}\right)^{-1/\xi} & \text{if } \xi \neq 0 \\
1 - \exp\left(-\dfrac{x-u}{\sigma}\right) & \text{if } \xi = 0
\end{cases}   (10)
The inverse of the GPD distribution has a simple form:

F^{-1}(p) = u - \sigma\log(1-p) \quad \text{for } \xi = 0, \qquad
F^{-1}(p) = u + \frac{\sigma\left[(1-p)^{-\xi} - 1\right]}{\xi} \quad \text{for } \xi \neq 0   (11)
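For illustration, a sketch of the GPD cdf (10) and its inverse (11) as reconstructed above (Python; function names are ours):

```python
import numpy as np

def gpd_cdf(x, xi, sigma, u=0.0):
    """Generalized Pareto cdf with location u, equation (10)."""
    z = (np.asarray(x, dtype=float) - u) / sigma
    if xi == 0.0:
        return 1.0 - np.exp(-z)
    return 1.0 - (1.0 + xi * z) ** (-1.0 / xi)

def gpd_quantile(p, xi, sigma, u=0.0):
    """Inverse of the GPD, equation (11)."""
    p = np.asarray(p, dtype=float)
    if xi == 0.0:
        return u - sigma * np.log(1.0 - p)
    return u + sigma * ((1.0 - p) ** (-xi) - 1.0) / xi

# e.g. gpd_quantile(0.999, xi=0.81, sigma=159.67, u=197.90) evaluates the fitted
# GPD quantile; in the POT approach this is combined with the proportion of
# observations below the threshold to obtain a quantile of the loss distribution.
```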
Although several authors (see Dupuis, 2004; Matthys and Beirlant, 2003) have
suggested methods to identify the threshold, there is no single widely accepted
method to select the appropriate cut-off.
2.3. The Kernel Estimation of severity loss distribution
The kernel estimation of the distribution function equals the integral of the
kernel estimation of the density function, which is given by

\hat{f}_X(x) = \frac{1}{nh}\sum_{i=1}^{n} k\left(\frac{x - X_i}{h}\right),   (12)

where k(·) is the kernel, which normally coincides with a symmetric density
function centred at zero, with bounded or unbounded support. Two examples are
the Epanechnikov kernel:
k(t) =
\begin{cases}
\frac{3}{4}\left(1 - t^{2}\right) & \text{if } |t| \leq 1 \\
0 & \text{if } |t| > 1
\end{cases}   (13)
and the Gaussian kernel:

k(t) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{t^{2}}{2}\right), \quad -\infty < t < +\infty   (14)
The parameter h is the bandwidth used to control the degree of smoothing of the
estimation: the higher the value of this parameter, the greater the smoothing of
the estimation. By integrating the kernel estimation of the density function and
using the change of variable t = (u - X_i)/h, the kernel estimation of the
distribution function is obtained:

\hat{F}_X(x) = \int_{-\infty}^{x} \frac{1}{nh}\sum_{i=1}^{n} k\left(\frac{u - X_i}{h}\right) du
= \frac{1}{n}\sum_{i=1}^{n} \int_{-\infty}^{(x - X_i)/h} k(t)\, dt
= \frac{1}{n}\sum_{i=1}^{n} K\left(\frac{x - X_i}{h}\right)   (15)
In the above expression, K(·) is the distribution function associated with the
density function that acts as the kernel. The properties of the kernel
estimation of the distribution function were analyzed by Reiss (1981) and
Azzalini (1981).
Both authors suggest that, when n → ∞, the mean square error of \hat{F}_X(x) is
approximated by

\frac{1}{n}F_X(x)\left(1 - F_X(x)\right)
- \frac{h}{n} f_X(x)\int K(t)\left(1 - K(t)\right) dt
+ \frac{h^{4}}{4}\left(f'_X(x)\right)^{2}\left(\int t^{2}k(t)\, dt\right)^{2}   (16)
The first two parts of the above expression correspond to the asymptotic
variance and the third term is the squared asymptotic bias.
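As an illustration of the estimator in (15), a sketch with the Epanechnikov kernel (Python; the function names are ours, and the bandwidth h is simply passed in, as its choice is discussed later):

```python
import numpy as np

def epanechnikov_cdf(t):
    """Integral K(t) of the Epanechnikov kernel in (13)."""
    t = np.clip(t, -1.0, 1.0)
    return 0.5 + 0.75 * t - 0.25 * t ** 3

def kernel_cdf(x, sample, h):
    """Classic kernel estimate of the distribution function, equation (15)."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    t = (x[:, None] - np.asarray(sample, dtype=float)[None, :]) / h  # (x - X_i)/h
    return epanechnikov_cdf(t).mean(axis=1)
```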
2.4. Transformation Kernel Estimation
The classic kernel estimation of the distribution function is a simple smoothing
of the empirical distribution, and for this reason precision is lost in the
estimation of the heavy losses due to the lack of sample information. The
transformation kernel estimation is an alternative that improves the results
obtained with the classic kernel estimation. This approach transforms the
variable by applying a concave function that symmetrizes the original data, and
then applies the classic kernel estimation to the transformed variable. Let T(·)
be a concave transformation, y = T(x), and let Y_i = T(X_i), i = 1, ..., n, be
the observed transformed losses. The kernel estimation of the distribution
function of the transformed variable is then:
\hat{F}_Y(y) = \frac{1}{n}\sum_{i=1}^{n} K\left(\frac{y - Y_i}{h}\right)
= \frac{1}{n}\sum_{i=1}^{n} K\left(\frac{T(x) - T(X_i)}{h}\right)
= \hat{F}_Y\big(T(x)\big)   (17)
It follows that the transformed kernel estimation of F_X(x) equals:

\hat{F}_X(x) = \hat{F}_{T(X)}\big(T(x)\big)   (18)
In order to obtain the transformed kernel estimation, it is necessary to
determine which transformation to use, to choose the kernel function, and to
calculate the bandwidth. In the literature on kernel density estimation, several
methods have been proposed for the calculation of the bandwidth9; however, very
few alternatives have been analyzed in the context of kernel estimation of the
distribution function. Bolancé and Guillén (2009) propose an adaptation of the
method based on the normal distribution described by Silverman (1986).
This method implies the minimization of the mean integrated squared error
(MISE):

MISE\big(\hat{F}_Y\big) = E\left(\int \big(\hat{F}_Y(y) - F_Y(y)\big)^{2}\, dy\right).   (19)

The asymptotic value of the MISE is known as the A-MISE (Asymptotic Mean
Integrated Squared Error). By integrating the asymptotic mean square error given
in expression (16) and taking the estimate of the distribution function F_Y into
account, the A-MISE becomes:

n^{-1}\int F_Y(y)\left(1 - F_Y(y)\right) dy
- n^{-1}h\int K(t)\left(1 - K(t)\right) dt
+ \frac{h^{4}}{4}\,\bar{k}^{2}\int \big(F''_Y(y)\big)^{2}\, dy,   (20)

where \bar{k} = \int t^{2}k(t)\, dt and \int \big(F''_Y(y)\big)^{2} dy = \int \big(f'_Y(y)\big)^{2} dy.
Minimizing (20) with respect to h yields the asymptotically optimal smoothing
parameter:

h^{*} = \left(\frac{\int K(t)\left(1 - K(t)\right) dt}{\bar{k}^{2}\int \big(f'_Y(y)\big)^{2}\, dy}\right)^{1/3} n^{-1/3}   (21)
Silverman (1986) proposed approximating h^{*} by replacing the terms that depend
on the theoretical density function with the values they would take if f were
the density of a normal distribution with mean u and standard deviation σ. Using
the Epanechnikov kernel, this yields:

h^{*} = \left(\frac{900\sqrt{\pi}\,\sigma^{3}}{35}\right)^{1/3} n^{-1/3}   (22)
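A small sketch of this rule of thumb as reconstructed in (22) (Python; the function name is ours, and σ is replaced by the sample standard deviation, which is an assumption of this sketch):

```python
import numpy as np

def silverman_bandwidth(sample):
    """Normal-reference bandwidth of equation (22) for the Epanechnikov kernel."""
    sample = np.asarray(sample, dtype=float)
    n = sample.size
    sigma = sample.std(ddof=1)  # sample standard deviation stands in for sigma
    return (900.0 * np.sqrt(np.pi) * sigma ** 3 / 35.0) ** (1.0 / 3.0) * n ** (-1.0 / 3.0)
```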
A change of variable is the basis of the transformed kernel estimation and is
known to enhance the non-parametric estimation of densities with strong
asymmetry. In their work, Bolancé and Guillén (2009) review various types of
transformations, both parametric and non-parametric. In this article, we
consider the transformation used by Buch-Larsen et al. (2005), who transformed
the data by using the distribution function of a generalization of the
Champernowne (1952) distribution.

9 Wand, M.P. and Jones, M.C. (1995). Kernel Smoothing. Chapman & Hall.
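Combining the pieces above, a hedged sketch of the (single) transformation kernel estimator of (17)-(18); it reuses the champernowne_cdf, kernel_cdf and silverman_bandwidth helpers from the earlier sketches, and applying the normal-reference bandwidth on the transformed scale is a simplification made by this sketch:

```python
import numpy as np

def transformed_kernel_cdf(x, sample, alpha, M, c=0.0):
    """Transformation kernel estimate of F_X(x): transform with the modified
    Champernowne cdf T of (8), smooth with the kernel cdf of (15), evaluate at T(x)."""
    y_sample = champernowne_cdf(np.asarray(sample, dtype=float), alpha, M, c)
    y_x = champernowne_cdf(np.asarray(x, dtype=float), alpha, M, c)
    h = silverman_bandwidth(y_sample)      # bandwidth on the transformed scale
    return kernel_cdf(y_x, y_sample, h)    # F_hat_X(x) = F_hat_Y(T(x)), eq. (18)
```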
2.4.1. The double transformation kernel estimation
As an alternative, the methodology used in Bolancé and Guillén (2009) is
proposed. This estimation was initially applied by Bolancé, Guillén and Nielsen
(2008) in the density function context and later by Bolancé and Guillén (2009)
for the estimation of a distribution function. In this article, this methodology
is applied to the estimation of the operational loss severity distribution.
Bolancé and Guillén (2009) propose a new transformation kernel estimation, based
on a double transformation, which can improve the estimation of risk measures.
The A-MISE expression given in (20) shows that, in order to obtain the
asymptotically optimal smoothing parameter, it is sufficient to minimize:

\frac{h^{4}}{4}\,\bar{k}^{2}\int \big(f'(y)\big)^{2}\, dy - n^{-1}h\int K(t)\left(1 - K(t)\right) dt,   (23)

which is minimized when the functional \int \big(f'(y)\big)^{2} dy is minimal.
The proposed method is based on transforming the variable so that the
distribution of the transformed variable minimizes this functional.
Terrell (1990) proves that the Beta(3,3) density defined on the domain (−1, 1)
minimizes \int \big(f'(y)\big)^{2} dy among all densities with a known variance.
The density and distribution functions of the Beta(3,3) are:

h(x) = \frac{15}{16}\left(1 - x^{2}\right)^{2}, \quad -1 \leq x \leq 1,

H(x) = \frac{3}{16}x^{5} - \frac{5}{8}x^{3} + \frac{15}{16}x + \frac{1}{2}.   (24)
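Since H has no simple closed-form inverse, the following sketch (Python; function names are ours) evaluates H of (24) and inverts it numerically by bisection:

```python
import numpy as np

def beta33_cdf(x):
    """Beta(3,3) distribution function on (-1, 1), equation (24)."""
    x = np.clip(x, -1.0, 1.0)
    return 3.0 / 16.0 * x ** 5 - 5.0 / 8.0 * x ** 3 + 15.0 / 16.0 * x + 0.5

def beta33_inverse(p, tol=1e-10):
    """Numerical inverse of the Beta(3,3) cdf by bisection (H is monotone on [-1, 1])."""
    p = np.asarray(p, dtype=float)
    lo, hi = np.full_like(p, -1.0), np.full_like(p, 1.0)
    while np.max(hi - lo) > tol:
        mid = 0.5 * (lo + hi)
        too_low = beta33_cdf(mid) < p
        lo = np.where(too_low, mid, lo)
        hi = np.where(too_low, hi, mid)
    return 0.5 * (lo + hi)
```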
The double transformation kernel estimation involves carrying out a first
transformation of the data by using the generalized Champernowne distribution
function with three parameters, so that the transformed variable has a
distribution close to a Uniform(0,1). Subsequently, the data are transformed
again by using the inverse of the Beta(3,3) distribution function,
Z_i = H^{-1}(Y_i), i = 1, ..., n. The result of this double transformation has a
distribution close to that of the Beta(3,3). The resulting transformed kernel
estimation is:
\hat{F}(y) = \frac{1}{n}\sum_{i=1}^{n} K\left(\frac{y - Z_i}{h}\right)
= \frac{1}{n}\sum_{i=1}^{n} K\left(\frac{H^{-1}\big(T_{\hat{\alpha},\hat{M},\hat{c}}(x)\big) - H^{-1}\big(T_{\hat{\alpha},\hat{M},\hat{c}}(X_i)\big)}{h}\right)   (25)
The smoothing parameter h is estimated by following the methodology explained
above, with the knowledge that \int \big(f'(y)\big)^{2} dy = 15/7 for the
Beta(3,3); hence, using the Epanechnikov kernel, we obtain:

h^{*} = 3^{1/3} n^{-1/3} = \left(\frac{3}{n}\right)^{1/3}   (26)
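Putting the steps together, a hedged sketch of the double transformation kernel estimator of (25)-(26) (Python; it reuses champernowne_cdf, beta33_inverse and epanechnikov_cdf from the earlier sketches, and the names are illustrative):

```python
import numpy as np

def double_transform_kernel_cdf(x, sample, alpha, M, c=0.0):
    """Double transformation kernel estimate of the cdf, equations (25)-(26):
    Champernowne transform to roughly Uniform(0,1), inverse Beta(3,3) transform
    to roughly Beta(3,3), then a classic kernel cdf with h = (3/n)**(1/3)."""
    sample = np.asarray(sample, dtype=float)
    n = sample.size
    h = (3.0 / n) ** (1.0 / 3.0)  # equation (26)

    z_sample = beta33_inverse(champernowne_cdf(sample, alpha, M, c))
    z_x = beta33_inverse(champernowne_cdf(np.asarray(x, dtype=float), alpha, M, c))

    t = (np.atleast_1d(z_x)[:, None] - z_sample[None, :]) / h
    return epanechnikov_cdf(t).mean(axis=1)

# A quantile at level q can then be obtained by numerically inverting this cdf in x.
```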
3. Empirical Analysis
The selection of the severity loss distribution is probably the most important
phase, and the most complicated one, in the estimation of the operational risk
capital requirement. An incorrect choice of distribution leads to a severe
distortion of the model and to an underestimate or overestimate of the
regulatory capital for operational risk.
In this article, we attempt to find which techniques (parametric and
non-parametric) yield a more appropriate measure of operational risk severity,
with a particular focus on the Savings Bank case. To this end, all the
methodologies presented in the previous sections are applied to the dataset
described below.
The data used in this work are provided by a medium-sized Spanish Savings Bank
which has compiled internal data in order to make step-by-step advances in
measuring and modelling operational risk.
The threshold for collecting the loss data in the Savings Bank is set at 0
Euros, unlike other studies, where the threshold is usually set at 10,000 Euros;
95% of the losses are lower than 542.32 Euros. The data provided span the years
2000 to 2006. The years 2000, 2001, 2002 and 2003 contain 2, 5, 2, and 50
operational loss events, respectively, because in those years the bank did not
have a collection system for operational losses; such a system was introduced in
2004, and hence the years 2004, 2005 and 2006 contain 6,397, 4,959 and 6,580
operational loss events, respectively. In order to prevent distortion of the
estimation of the distribution models, the data from 2000 to 2003 are
disregarded. The core business of any Spanish Savings Bank is retail banking,
and for this reason all the data come from this business line. Our data set is
relatively small, but it is sufficient for an operational risk analysis at the
level of the whole bank.
Figure 1: Monthly amount of losses
[Figure: monthly loss amounts (0 to 600,000 Euros) for each month of 2004, 2005 and 2006.]
Figure 2: Monthly number of losses
[Figure: monthly number of loss events (0 to 1,400) for each month of 2004, 2005 and 2006.]
Exploratory Data Analysis (EDA) is quantitative detective work, according to
Tukey (1977). In EDA, the data should first be explored without making any
assumptions about probabilistic models, error distributions, number of groups,
relationships between the variables, and so on, in order to discover what the
data can tell us about the phenomena under investigation. EDA is also a
collection of techniques for discovering information about the data, and of
methods for visualizing that information, in order to expose any underlying
process.
In this work, EDA techniques are used in order to describe the main
characteristics of the data, such as central tendency and dispersion (i.e.
variance and standard deviation), and to determine how the data might be
distributed, by focusing on the evaluation of skewness and kurtosis. The data
are adjusted according to the Consumer Price Index (CPI), with 2006 as the base
year, to prevent distortion of the results.
In Table 1, the main features of central tendency, asymmetry and tail-heaviness
of the data set are given.
Table 1: Descriptive statistics

Statistic                  Operational risk
N                          17936
Mean                       254.48
Median                     51.35
Standard deviation         3602.71
Skewness coefficient       73.48
Kurtosis coefficient       6924.22
The mean is much higher than the median; this is a feature of skewed
distributions, and it is confirmed in our case by the skewness coefficient. The
kurtosis coefficient also reveals leptokurtic (heavy) tails.
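A minimal sketch of these exploratory computations (Python with pandas; the toy numbers and the column name are illustrative, not the bank's data):

```python
import pandas as pd

# One column of CPI-adjusted loss amounts (illustrative values)
losses = pd.Series([10.5, 52.0, 230.0, 48.3, 1200.0, 75.9], name="loss_eur")

summary = {
    "N": losses.size,
    "Mean": losses.mean(),
    "Median": losses.median(),
    "Standard deviation": losses.std(),
    "Skewness coefficient": losses.skew(),
    # pandas reports excess kurtosis (kurtosis minus 3)
    "Kurtosis coefficient": losses.kurtosis(),
}
```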
The Q-Q plot and the P-P plot show that the data are far from being normally
distributed.
Figure 3: Q-Q plot and P-P plot
[Figure: Q-Q plot of the empirical quantiles against the quantiles of a fitted normal model, and P-P plot of the empirical against the model probabilities.]
To advance the analysis, the parameters of the parametric models are estimated
over the whole distribution.
Table 2: Parameter estimation for the whole distribution

Distribution      Parameter    Estimate
Weibull           α            0.59
                  β            104.42
Lognormal         u            3.92
                  σ            1.44
Champernowne      α            1.27
                  c            0
                  M            50
Having estimated the parameters of the Weibull, lognormal and Champernowne
distributions, attention turns to the estimation of the GPD parameters.
Table 3: Parameter estimation of the Generalized Pareto Distribution

Distribution    Threshold    Obs.    ξ       u         σ
GPD             166.53       3049    0.81    197.90    159.67
To estimate the threshold, the method described in Reiss and Thomas (2007) is
applied.
3.1. Results analysis
In order to analyze how well the different methods fit the severity loss
distribution, we compare the empirical cumulative distribution function with the
cdf estimated by each method presented in this work, with particular attention
to the tail. We divide the cdf into two parts: one part for the body, from the 0
to the 0.99 quantile, and the other for the tail, from the 0.99 quantile to 1.
Figure 4: The cdf of the lognormal distribution compared with the empirical distribution function
The two plots above show that the lognormal distribution fits the body (left)
reasonably well but is not a good selection for the higher quantiles (right).
Figure 5: The cdf of the Weibull distribution compared with the empirical distribution function
The Weibull distribution has a worse fit than the lognormal distribution in the
body (left) and fits as poorly as the lognormal in the tail (right).
Figure 6: The cdf of the Champernowne distribution compared with the empirical distribution function
The Champernowne distribution performs better in the tail than the distributions
plotted previously, but its fit is still not suitable.
Figure 7: The cdf of the Generalized Pareto Distribution compared with the empirical distribution function
In the case of the GPD, only the tail is considered in the analysis of the fit.
As the figure shows, the GPD presents a better fit in the tail than the
parametric distributions that do not focus attention on the tail.
Figure 8: The cdf of the transformed kernel estimation compared with the empirical distribution function
[Figure: empirical cdf and transformed kernel (Champernowne) estimate; left panel, body of the distribution (losses up to about 3,000 Euros); right panel, tail (losses up to about 1,100,000 Euros).]
The kernel cdf estimation using the Champernowne distribution provides a good
fit in the body of the distribution but has a serious problem in the tail due to the
lack of information in this part of the distribution.
Figure 9: The cdf of the double transformation kernel estimation compared with the empirical distribution function
As Figure 9 shows, the double transformation in the kernel estimation
significantly improves the fit of the cumulative distribution in both the body
and the tail.
For numerical confirmation, the quantiles at α=0.95, α=0.99 and α=0.999 are
estimated for each distribution, compared with the empirical quantile estimates,
and shown in Table 4.
Table 4: Quantile estimation for the whole distribution

CDF              α=0.95    α=0.99    α=0.999
Empirical        541       2475      23205
Lognormal        533       1427      3718
Weibull          529       1098      2246
Champernowne     316       1126      6150
KTCH             557       10036     +∞
KDTB             560       2535      27233

KTCH and KDTB denote, respectively, the transformed kernel estimation (Champernowne transformation) and the double transformation kernel estimation.
There is a large gap between the parametric distributions and the empirical
distribution in the higher quantiles. The kernel estimation with only one
transformation fails to estimate the 99.9% quantile, owing to the lack of
information in the tail, and its estimate of the 99% quantile presents the
widest gap with respect to the empirical distribution. Focusing on the kernel
estimation with double transformation, the quantile estimates confirm the visual
results of the cdf plots: the KDTB is the estimator with the smallest difference
with respect to the empirical distribution.
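For reference, a sketch of how such quantiles can be read off any estimated cdf by numerical inversion (Python; the function names are ours, and the commented example assumes the double_transform_kernel_cdf sketch given earlier):

```python
import numpy as np

def cdf_quantile(level, cdf, x_max=1e7, tol=1e-3):
    """Invert a monotone cdf numerically by bisection to obtain the quantile at `level`."""
    lo, hi = 0.0, x_max
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if float(np.asarray(cdf(mid)).ravel()[0]) < level:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Example: 99.9% quantile of the double transformation kernel estimate
# (sample, alpha, M, c as estimated from the data).
# q999 = cdf_quantile(0.999, lambda x: double_transform_kernel_cdf(x, sample, alpha, M, c))
```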
Since the quantile values estimated with the non-parametric methodologies are
higher than those of the parametric ones, we compare these results with those of
the Generalized Pareto Distribution presented in Section 2.2. This is a
well-known extreme value model, widely used to fit operational loss severity
distributions.
Table 5: Quantile estimation of the GPD compared with the Empirical, KTCH and KDTB estimates

CDF          α=0.95    α=0.99    α=0.999
GPD          2705      10662     36178
Empirical    541       2475      23205
KTCH         557       10036     +∞
KDTB         560       2535      27233
Table 5 shows that the quantile values estimated with the GPD are higher at all
levels than those of the non-parametric models. The gap between the GPD quantile
estimates and the empirical quantile estimates increases as more extreme
quantiles are considered.
4. Conclusions
In the recent literature on operational risk, the severity loss distribution is
the main topic. Numerous modelling methods have been suggested, although very
few work for both high-frequency small losses and low-frequency large losses.
Hence, common sense suggests a mixture of two distributions: for the small
losses, distributions such as the lognormal and the Weibull are frequently used,
in combination with Extreme Value Theory for the large ones. In this practice,
questions arise such as at which point the data should be split in order to
obtain the optimal threshold, and which model should be used.
An alternative, the Champernowne distribution, is explored; its characteristics
promise a good fit of the whole distribution, but the results are disappointing.
Attention is then focused on an alternative one-method-fits-all approach: we
analyze the transformation kernel estimation method and the double
transformation kernel estimation. The good performance of the latter in the
context of operational risk severity deserves special mention. This method is
based on the work of Bolancé and Guillén (2009) and was initially proposed in
the context of insurance. In our opinion, these methodologies, and especially
the double transformation kernel estimation, can provide an excellent
alternative for estimating an operational risk loss severity distribution. This
methodology takes into account the tail behaviour while also including the
high-frequency small losses that form the body of the distribution.
However, the outcomes of this study cannot yet be used for deeper purposes, such
as the estimation of economic capital. We can conclude that the double
transformation kernel estimation performs well for the whole severity
distribution compared with the classic parametric models, particularly when not
enough data are available to split the sample into two parts. This is the case
for Savings Banks, whose main business is retail banking, characterized by
high-frequency small losses. These methodologies make it possible to fit the
whole distribution, so that estimating a threshold becomes unnecessary. Another
good property of non-parametric techniques, with respect to EVT, is that they do
not seem to overestimate the capital requirement. The double transformation
kernel estimation was first applied to insurance claims; we think that this
methodology can also represent an advance in the estimation of operational loss
severity. In this sense, the next step is to combine the double transformation
kernel estimation with a frequency loss distribution in order to estimate an
operational VaR, which the Committee specifies as the 99.9th percentile of the
aggregate loss distribution.
Bibliography
Azzalini, A. (1981) “A note on the estimation of a distribution function and
quantiles by a kernel method”, Biometrika, 68, 1, 326-328.
Basel Committee on Banking Supervision:
• (1998): Operational risk management, BASEL, Basilea.
• (2001a), Operational Risk-Consultative Document, Supporting Document
to the New Basel Capital Accord.
• (2001b), Working Paper on the Regulatory Treatment of Operational
Risk.
• (2002): Operational risk data collection exercise - 2002, BASEL, Basilea.
• (2003a), Third Consultative Paper, The New Basel Capital Accord.
• (2003b): Sound Practices for the Management and Supervision of
Operational Risk, BASEL, Basilea.
• (2006): Basel II: International Convergence of Capital Measurement and
Capital Standards: A Revised Framework – Comprehensive Version,
BASEL, Basilea.
Bolancé, C., Guillén, M. and Nielsen, J.P. (2003) “Kernel density estimation of
actuarial loss functions” Insurance: Mathematics and Economics, 32, 1, 19-36
Bolancé, C., Guillén M. and Nielsen, J.P. (2009): “Transformation Kernel
Estimation of Insurance Claim Cost Distributions” in Corazza, M. and Pizzi, C
(Eds.) Mathematical and Statistical Methods for Actuarial Sciences and
Finance, Springer, 223-231
Bolancé C., Guillén M., and Nielsen J.P. (2008) “Inverse Beta transformation
in kernel density estimation”, Statistics & Probability Letters, 78, 1757-1764.
Bolancé, C., Guillén, M. (2009) “Estimación Núcleo Transformada para
aproximar el Riesgo de Pérdida”, Investigaciones en Seguros y Gestión de
Riesgos: Riesgo 2009, Cuadernos de la Fundación.
Buch-Larsen, T., Guillén, M., Nielsen, J.P. and Bolance, C. (2005) “Kernel
density estimation for heavy-tailed distributions using the Champernowne
transformation”, Statistics, 39, 6, 503-518.
Champernowne, D.G. (1952). “The Graduation of Income Distributions”
Econometrica, 20, 4, 591-615.
Chapelle, A., Crama, Y., Hübner, G., and Peters, J., “Practical methods for
measuring and managing operational risk in the financial sector: A clinical
study” , Journal of Banking & Finance Volume 32, Issue 6, June 2008, Pages
1049-1061
Chernobai A., Menn C., Trück S., Rachev T. S., (2004), “A Note on the
Estimation of the Frequency and Severity Distribution Losses” . Applied
Probability Trust.
Chernobai A., Rachev S., Fabozzi F., (2007), Operational Risk: A guide to
Basel 2 Capital Requirements, Models, and Analysis. John Wiley and Son, Inc.
New Jersey.
Cruz, M. G. (2002) Modeling, Measuring and Hedging Operational Risk, John
Wiley & Sons, New York, Chichester.
Da Costa, L (2004), Operational Risk with Excel and VBA. John Wiley and Son,
Inc. New Jersey.
Danielsson, J. and de Vries, C.G., (1997), “Tail index and quantile estimation
with very high frequency data”, Journal of Empirical Finance, 4, 241-57.
de Fontnouvelle, P., Deleus-Rueff, V., Jordan, J. and Rosengren, E., (2003),
“Using loss data to quantify operational risk”, Federal Reserve Bank of Boston,
Working Paper.
de Fontnouvelle, P., Jordan, J. and Rosengren, E., (2005),”Implications of
Alternative Operational Risk Modeling Techniques”, Technical report, Federal
Reserve Bank of Boston and Fitch Risk.
Dupuis, D.,(2004), Exceedances over High Thresholds: A Guide to Threshold
Selection. Springer U.S.
Dutta, K., Perry, J., (2007), “An Empirical Analysis of Loss Distribution Models
for Estimating Operational Risk Capital”, Federal Reserve Bank of Boston
working paper.
Ebnother, S., Vanini, P., McNeil, A.J. and Antolinez-Fehr, P., (2001), “Modelling
Operational Risk”, Zurich Cantonal Bank and Swiss Federal Institute of
Technology Zurich (Department of Mathematics), Working Paper.
Embrechts, P., Klüppelberg, C., and Mikosch, T., (1997) “Modeling Extremal
Events for Insurance and Finance”, Berlin, Germany: Springer-Verlag.
Embrechts, P. (2000). Extremes and Integrated Risk Management. London:
Risk Books, Risk Waters Group.
Guillen, M., Gustafsson, J., Nielsen, J.P. and Pritchard, P. (2007). “Using
external data in operational risk”. Geneva Papers, 32, 178-189.
Gustafsson, J., Nielsen, J.P., Pritchard, P. and Roberts, D. (2006). “Quantifying
Operational Risk Guided by Kernel Smoothing and Continuous credibility: A
Practitioner’s view”. Journal of Operational Risk, 1, 1, 43-56.
Gustafsson, J., Hagmann, M., Nielsen, J.P. and Scaillet, O. (2008). “Local
Transformation Kernel Density Estimation of Loss Distributions”. Journal of
Business and Economic Statistics
Gustafsson, J., Thuring F., (2008). “A Suitable Parametric Model for
Operational Risk Applications”. Working Paper
Embrechts, P. (2004), ‘‘Extremes in Economics and Economics of Extremes’’, in
B. Finkenstädt and H. Rootzén, eds, Extreme Values in Finance,
Telecommunications and the Environment, Chapman & Hall, London, pp. 169–
183.
Embrechts, P., Furrer, H., and Kaufmann, R. (2003), ‘‘Quantifying Regulatory
Capital For Operational Risk,’’ Derivatives Use, Trading, and Regulation 9 (3),
pp. 217–233.
Frachot, A and Roncalli, T., (2002), “Mixing Internal and External Data for
Managing Operational Risk,” Groupe de Recherche Opérationnelle, Crédit
Lyonnais.
Klugman, S.A., Panjer, H.A. and Willmot, G.E. (1998). Loss Models: From Data
to Decisions. New York: John Wiley & Sons, Inc.
Jobst, A.A. (2007): “Operational Risk - The Sting is Still in the Tail But the
Poison Depends on the Dose”, Journal of Operational Risk, 2, 2.
Matthys G., and Beirlant J., (2003) “ Estimating the Extreme Value Index and
High Quantiles with Exponential Regression Models”. Statistica Sinica
13(2003), 853-880
McNeil, A.J., Frey, R. and Embrechts, P. (2005). Quantitative Risk
Management. Princeton Series in Finance.
McNeil, A. J. (1996), “Estimating the Tails of Loss Severity Distributions Using
Extreme Value Theory”, Technical report, ETH Zurich.
Moscadelli, M., (2004), “The Modelling of Operational Risk: Experience with the
Analysis of the Data Collected the Basel Committee”, Technical report, Bank of
Italy.
Müller, H., (2002), “Quantifying Operational Risk in a Financial Institution”,
Master’s thesis, Institut für Statistik und Wirtschaftstheorie, Universität Karlsruhe,
In Chernobai A., Rachev S., Fabozzi F., (2007), Operational Risk: A guide to
Basel 2 Capital Requirements, Models, and Analysis. John Wiley and Son, Inc.
New Jersey.
Reiss, R., Thomas, M., (2007), Statistical Analysis of Extreme Values with
Applications to Insurance, Finance, Hydrology and Other Fields. Springer
Reiss, R., (1981), “Nonparametric estimation of smooth distribution function”,
Scandinavian Journal of Statistics, 8, 116-119.
Reynolds, D., and Syer, D., (2003), “A General Simulation Framework for
Operational Loss Distributions”, In Alexander, C., et al., Operational Risk:
Regulation, Analysis and Management, Prentice Hall, Upper Saddle River, NJ,
pp. 193-214.
Rosenberg, J., and Schuermann, T., (2004), “A General Approach to Integrated
Risk Management with Skewed, Fat-Tailed Risks”, Technical Report, Federal
Reserve Bank of New York.
Tukey, J., W., (1977), “Exploratory Data Analysis”, Reading, MA: Addison-Wesley.
Wand, M.P. and Jones, M.C. (1995). Kernel Smoothing. Chapman & Hall
Wand, M.P., Marron, J.S. and Ruppert, D. (1991). “Transformation in Density
Estimation” Journal of the American Statistical Association, 86, 414, 343-361.