
Université Claude Bernard Lyon 1
Institut de Science Financière et d'Assurances

Work of Studies and Research

Statistics of Extremes
Theory and Applications with R

Author: Mehdi Cherid
Supervisor: Pierre Ribereau

April 26, 2016

Contents

Acknowledgement
Introduction
1 Basics and Recalled Results
  1.1 Order Statistics
  1.2 Cumulative Distribution Function
  1.3 Strong Law of Large Numbers
  1.4 Central Limit Theorem
  1.5 Quantile-Quantile Plot
2 Basics of the Univariate Extreme Value Theory
  2.1 Limit Distribution of the Maximum
  2.2 Domain of Attraction
  2.3 Max-Stability
3 Statistical Inference and Estimation
  3.1 Nonparametric Estimators of the Tail Index ξ
    3.1.1 Pickands Estimator
    3.1.2 Hill Estimator
    3.1.3 Moment Estimator
  3.2 Parametric Estimation of the GEV Parameters
    3.2.1 Estimation of the Parameters by Maximum Likelihood (MLE)
    3.2.2 Estimation of the Parameters by the Probability Weighted Moment (PWM) Method
4 Extreme Value Methods
  4.1 Block Maxima Method
  4.2 Peaks-Over-Threshold Method
    4.2.1 The Generalized Pareto Distribution (GPD)
    4.2.2 Estimation of the Parameters
    4.2.3 Choice of the Threshold
5 Application
  5.1 Block Maxima Method
  5.2 Peaks-Over-Threshold Method
Conclusion
Bibliography
Annex

Acknowledgement

First of all, I would like to thank my supervisor Pierre Ribereau for suggesting the topic of extreme value theory: I enjoyed the statistics course during the first semester and wanted to pursue work in that direction. He offered continuous help and advice throughout the course of this work.
I am thankful to my brother, who lent me his computer so that I could finish this work and who helped me check it for typos.
Finally, I take this opportunity to express my deepest gratitude to my beloved parents for their love and continuous support throughout my life.

Introduction

Emil J. Gumbel said: "It is impossible that the improbable will never happen." Rare events are events that have a low probability of occurrence; even though they are discrete occurrences that are statistically "improbable" in the sense of being very infrequent, such events remain plausible. Rare events are everywhere and encompass natural phenomena (earthquakes, tsunamis, hurricanes, ...), financial market crashes, etc. They are events which deviate strongly from the mean or the usual trend and whose consequences are disastrous for human beings and the environment.
Studying the occurrence of these events is of prime importance today for insurers and financiers. Measurement and risk management have become major issues for financial market operators, actuaries, etc.
The question is how to model rare phenomena that lie outside the range of available observations. Extreme value theory provides a good theoretical foundation on which we can build statistical models describing extreme events. This modeling corresponds to the study of the tail of the distribution, which will enable us to calculate or estimate rather extreme quantiles. Indeed, we would like to be able to answer two complementary questions: what is the threshold $z_p$ that will be exceeded only with a very small probability $p$ (i.e. the $(1-p)$-VaR), and what is the probability $p$ that a disaster exceeds a given threshold $z_p$? This is the purpose of our study.
The outline of the present report is as follows. First, we give or recall some definitions and theorems. Then, extreme value theory is introduced and we see how to estimate the extreme value index and the other parameters. Finally, we present two methods to estimate extreme quantiles: the block maxima method and the peaks-over-threshold method. All these approaches are then applied to a real dataset.

1 Basics and Recalled Results

1.1 Order Statistics

Let $X_1, \dots, X_n$ be a sample, i.e. a family of non-degenerate, independent and identically distributed random variables with a common distribution $F$. We define the ordered sample as the family $X_{n,n} \le \cdots \le X_{1,n}$, so that $X_{n,n} = \min(X_1, \dots, X_n)$ and $X_{1,n} = M_n = \max(X_1, \dots, X_n)$.
The random variable $X_{k,n}$ is called the $k$-th order statistic. $X_{n,n}$ and $X_{1,n} = M_n$ are called the smallest and largest order statistics, respectively. The difference $X_{1,n} - X_{n,n}$ is called the sample range.

1.2 Cumulative Distribution Function

The cumulative distribution function of a real-valued random variable $X$ is the function given by $F(x) = \mathbb{P}(X \le x)$ and the survival function is defined by $\bar F(x) = 1 - F(x)$.
We also define the right endpoint $x_F$ of the distribution function $F$ as:
$$x_F = \sup\{x : F(x) < 1\}.$$
Let $X_1, \dots, X_n$ be independent, identically distributed real random variables with common cumulative distribution $F$. Then we define the empirical distribution function as:
$$\hat F_n(x) = \frac{1}{n}\sum_{i=1}^{n}\mathbf{1}_{\{X_i \le x\}}.$$

1.3 Strong Law of Large Numbers

Let $X_1, \dots, X_n$ be a sequence of independent and identically distributed random variables with expected value $\mathbb{E}[X_1] = \mathbb{E}[X_2] = \mu < \infty$.
Then the sample mean $\bar X_n = \frac{1}{n}(X_1 + \cdots + X_n) \to \mu$ almost surely as $n \to \infty$.

1.4 Central Limit Theorem

Let $X_1, \dots, X_n$ be a sequence of independent and identically distributed random variables with mean $\mu$ and finite, positive variance $\sigma^2$. Then, defining
$$S_n = \frac{X_1 + \cdots + X_n}{n},$$
we have
$$\sqrt{n}\,\frac{S_n - \mu}{\sigma} \xrightarrow{d} Z \quad \text{as } n \to \infty, \qquad (1.1)$$
where $Z \sim \mathcal{N}(0,1)$.


The central limit theorem is generally used in statistical applications by interpreting (1.1) as an approximation for the distribution of the sample mean $\bar X_n$ for large $n$. This theorem is helpful for applications because the approximating distribution of the sample mean is normal regardless of the parent population of the $X_i$. Similar arguments will be used in this paper to obtain approximating distributions for sample extremes.

1.5 Quantile-Quantile Plot

The Quantile-Quantile (Q-Q) plot is a graph for determining whether two data sets come from populations with a common distribution. It is a plot of the quantiles of the first data set against the quantiles of the second data set; in practice it compares empirical quantiles with theoretical quantiles. The plot should be approximately linear (close to the diagonal) if the fit is good.
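
As an illustrative sketch (base R only; the sample and the reference distribution are arbitrary choices, not taken from the report), a Q-Q plot against exponential quantiles can be drawn as follows:

# Sketch of a Q-Q plot: empirical quantiles against theoretical exponential quantiles
x <- rexp(500, rate = 2)                         # simulated data
p <- (1:length(x) - 0.5) / length(x)             # plotting positions
plot(qexp(p, rate = 2), sort(x),
     xlab = "theoretical quantiles", ylab = "empirical quantiles")
abline(0, 1)                                     # points close to this line suggest a good fit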

2 Basics of the Univariate Extreme Value Theory

2.1 Limit Distribution of the Maximum

The model focuses on the statistical behavior of $M_n$. We know that the distribution of $M_n$ is given by:
$$F_{M_n}(x) = \mathbb{P}(M_n \le x) = \mathbb{P}\left(\bigcap_{i=1}^{n}\{X_i \le x\}\right) = \prod_{i=1}^{n}\mathbb{P}(X_i \le x) = \prod_{i=1}^{n}F(x) = F^{n}(x) \qquad \forall x \in \mathbb{R}$$

But this is not helpful in practice, since the distribution function $F$ is generally unknown.
Another nonparametric approach would be to use the empirical distribution function, but $\hat F_n(x) = 1$ whenever $x > M_n$, so we see that standard approaches are not suitable for studying the tails of distributions.
One approach is similar to the usual practice of approximating the distribution of sample means by the normal distribution, as justified by the central limit theorem. The arguments we develop here are essentially an extreme value analogue of central limit theory. The question is: can we find sequences of normalizing constants $\{a_n > 0\}$, $\{b_n\}$ and a non-degenerate distribution $G$ such that $\frac{M_n - b_n}{a_n} \xrightarrow{d} G$?
Theorem 2.1 (Fisher-Tippett-Gnedenko).
Let $X_1, \dots, X_n$ be a sequence of i.i.d. variables with common distribution $F$. If there exist sequences of constants $\{a_n > 0\}$, $\{b_n\}$ and a non-degenerate distribution function $G$ such that:
$$\mathbb{P}\left(\frac{M_n - b_n}{a_n} \le x\right) = F^{n}(a_n x + b_n) \to G(x) \qquad \text{as } n \to \infty,$$
then $G$ belongs to the generalized extreme value (GEV) family:
$$G(x) = \begin{cases} \exp\left(-\left[1 + \xi\,\dfrac{x-\mu}{\sigma}\right]^{-1/\xi}\right) & \xi \ne 0 \text{ and } 1 + \xi\,\dfrac{x-\mu}{\sigma} > 0 \\[8pt] \exp\left(-\exp\left(-\dfrac{x-\mu}{\sigma}\right)\right) & \xi = 0 \end{cases}$$

The GEV family encompasses three possible limit distributions of the maximum $M_n$:

• Gumbel or Type I ($\xi = 0$):
$$G_{\mu,\sigma,\xi}(x) = \exp\left(-\exp\left(-\frac{x-\mu}{\sigma}\right)\right), \qquad x \in \mathbb{R}$$

• Fréchet or Type II ($\xi > 0$):
$$G_{\mu,\sigma,\xi}(x) = \begin{cases} 0 & x \le \mu \\ \exp\left(-\left(\dfrac{x-\mu}{\sigma}\right)^{-1/\xi}\right) & x > \mu \end{cases}$$

• Weibull or Type III ($\xi < 0$):
$$G_{\mu,\sigma,\xi}(x) = \begin{cases} \exp\left(-\left(-\dfrac{x-\mu}{\sigma}\right)^{-1/\xi}\right) & x < \mu \\ 1 & x \ge \mu \end{cases}$$

$\mu \in \mathbb{R}$, $\sigma > 0$ and $\xi \in \mathbb{R}$ are the location parameter, the scale parameter and the shape parameter respectively.
With $\mu = 0$ and $\sigma = 1$, we obtain the standard expressions of these distributions:

• $\Lambda(x) = \exp(-\exp(-x))$, $x \in \mathbb{R}$ (Gumbel)

• $\Phi_\theta(x) = \exp(-x^{-\theta})$, $x > 0$, $\theta > 0$ (Fréchet)

• $\Psi_\theta(x) = \exp(-(-x)^{\theta})$, $x \le 0$, $\theta > 0$ (Negative Weibull)

Many classical laws satisfy the hypothesis of the existence of $G$ for the maximum $M_n$. It is the case in particular for the following distributions:

• Let $F$ be the cumulative distribution function of the Pareto distribution defined as $F(x) = 1 - x^{-\alpha}$, $x \ge 1$, $\alpha > 0$.
Choosing $a_n = n^{1/\alpha}$ and $b_n = 0$, we verify that $\forall x \in \mathbb{R}$:
$$\lim_{n\to\infty} F^{n}(a_n x + b_n) = e^{-x^{-\alpha}}\,\mathbf{1}_{x>0}$$

• Let $F$ be the cumulative distribution function of the standard Cauchy distribution, admitting a density $f$ defined by:
$$f(x) = \frac{1}{\pi(1+x^2)} \qquad \forall x \in \mathbb{R}$$
Selecting $a_n = \frac{n}{\pi}$ and $b_n = 0$, we verify that $\forall x \in \mathbb{R}$:
$$\lim_{n\to\infty} F^{n}(a_n x + b_n) = e^{-x^{-1}}\,\mathbf{1}_{x>0}$$

• Let $F$ be the cumulative distribution function of the standard exponential distribution defined by $F(x) = 1 - e^{-x}$, $x \ge 0$.
Choosing $a_n = 1$ and $b_n = \log n$, we verify that $\forall x \in \mathbb{R}$:
$$\lim_{n\to\infty} F^{n}(a_n x + b_n) = e^{-e^{-x}}$$

• Let $F$ be the cumulative distribution function of the uniform distribution on the interval $[0,1]$, defined by $F(x) = x$ for $0 \le x \le 1$.
Choosing $a_n = \frac{1}{n}$ and $b_n = 1$, we verify that $\forall x \in \mathbb{R}$:
$$\lim_{n\to\infty} F^{n}(a_n x + b_n) = e^{x}\,\mathbf{1}_{x\le 0} + \mathbf{1}_{x>0}$$
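
As a quick numerical check of the exponential case (a sketch assuming base R; the sample sizes are arbitrary), one can simulate normalized maxima and compare them with the standard Gumbel distribution:

# Maxima of n standard exponentials, normalized with a_n = 1 and b_n = log(n)
n <- 1000; N <- 5000
M <- replicate(N, max(rexp(n)))                  # N simulated maxima
Z <- M - log(n)                                  # normalization
plot(ecdf(Z), main = "Normalized maxima vs Gumbel")
curve(exp(-exp(-x)), add = TRUE, col = "red")    # standard Gumbel limit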


Figure 1: Normal, Gumbel, Fréchet and Weibull distributions

We can see that the Fréchet is bounded below and the Weibull is bounded above.

2.2 Domain of Attraction

Theorem 2.2.
Let $G_\xi$ be a non-degenerate distribution. The distribution $F$ is in the domain of attraction of the extreme value distribution $G_\xi$ (written $F \in D(G_\xi)$) if there exist sequences $\{a_n > 0\}$ and $\{b_n\}$ such that:
$$F^{n}(a_n x + b_n) \to G_\xi(x) \qquad \text{as } n \to \infty$$
Membership of a domain of attraction is determined by the form of the tail of the distribution, and more specifically by the speed at which the survival function $\bar F(x)$ decreases toward 0.

Therefore, we have 3 domains of attraction:
• Gumbel domain of attraction if ξ = 0. This domain encompasses laws having a
survival function with exponential decay. It is the case with normal distributions,
gamma distributions, exponential distributions for example.
• Fréchet domain of attraction if ξ > 0. It contains laws whose survival function
decreases as a power function. They are heavy-tailed laws as is the case with
Pareto distributions, Student’s t-distributions, Cauchy distributions.

• Weibull domain of attraction if ξ < 0. This domain contains laws for which the endpoint $x_F$ is finite. It is the case, for example, with uniform distributions and beta distributions.
In practice, we wish to determine, for a given sample, the shape of the tail of the distribution, or at least to have an idea of its domain of attraction. The quantile-quantile method can be used to provide an indication of the domain of attraction to which the distribution probably belongs.

2.3 Max-Stability

Proposition 2.1.
The GEV distribution has the property of max-stability, understood as follows: if $Y_1, \dots, Y_n$ are i.i.d. draws from $G$, then $\max(Y_1, \dots, Y_n)$ also has distribution $G$, i.e. $G^{n}(z) = G(a_n z + b_n)$ for appropriate sequences $\{a_n > 0\}$ and $\{b_n\}$. In fact, a distribution is max-stable if and only if it is a member of the GEV family.

For that matter, what about the minimum of the sample? The theoretical approach is the same as for the maximum:
$$\mathbb{P}(\min(X_1, \dots, X_n) \le x) = \mathbb{P}(-\max(-X_1, \dots, -X_n) \le x) = \mathbb{P}(\max(-X_1, \dots, -X_n) \ge -x) = 1 - \mathbb{P}(\max(-X_1, \dots, -X_n) < -x) \simeq 1 - G_\xi\left(\frac{-x-\mu}{\sigma}\right)$$

3 Statistical Inference and Estimation

3.1 Nonparametric Estimators of the Tail Index ξ

We present here three nonparametric estimators of the tail index, all based on the order statistics $X_{k,n} \le \cdots \le X_{1,n}$ obtained from the initial series by considering the $k$ largest values (we could also consider the smallest); $k$ depends a priori on $n$, even if we do not mention it in the notation. The idea is to have $k \to \infty$ as $n \to \infty$, but without taking too many values from the sample, which leads us to impose $\frac{k}{n} \to 0$.
This means the question of the optimal choice of $k$ will arise. Indeed, it is essential to compute these estimators on the tails of the distribution. Choosing $k$ too large risks taking into account values that are not extreme; conversely, too small a subsample does not allow the estimators to reach their level of stability.
Finally, we note that the nonparametric approach is only possible if we have a large number of observations.

3.1.1 Pickands Estimator

The Pickands Estimator is defined by the statistic:
$$\hat\xi^{P}_{k,n} = \frac{1}{\ln 2}\,\ln\left(\frac{X_{k,n} - X_{2k,n}}{X_{2k,n} - X_{4k,n}}\right)$$

Its advantage is that it is valid whatever the extreme value distribution (Gumbel, Fréchet or Weibull).
The graphical representation of this estimator as a function of the number of observations considered, $k$, generally shows very volatile behavior at first, which harms the legibility of the plot. Furthermore, it is very sensitive to the size of the selected subsample, which makes it not very robust. It is thus delicate to handle.
We can note that the estimator is asymptotically normal:
$$\sqrt{k}\,\frac{\hat\xi^{P}_{k,n} - \xi}{\sigma(\xi)} \xrightarrow{d} \mathcal{N}(0,1) \qquad \text{as } k \to \infty,$$
the asymptotic standard deviation being given by:
$$\sigma(\xi) = \frac{\xi\sqrt{2^{2\xi+1} + 1}}{2\,(2^{\xi} - 1)\ln 2}$$
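
For illustration, the estimator can be computed directly from the ordered sample (a minimal sketch; pickands_xi is a hypothetical helper, not a package function):

# Pickands estimator for a given k (requires 4k <= n)
pickands_xi <- function(x, k) {
  xs <- sort(x, decreasing = TRUE)               # X_{1,n} >= ... >= X_{n,n}
  log((xs[k] - xs[2 * k]) / (xs[2 * k] - xs[4 * k])) / log(2)
}
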
3.1.2 Hill Estimator

This estimator assumes the tail index ξ to be positive, so it is usable only for Fréchet-type distributions, for which it supplies a more efficient estimator of the tail index than the Pickands estimator.
The Hill estimator is defined by:
$$\hat\xi^{H}_{k,n} = \frac{1}{k-1}\sum_{j=1}^{k-1}\ln\left(\frac{X_{j,n}}{X_{k,n}}\right)$$
If we choose $k, n \to \infty$ such that $\frac{k}{n} \to 0$, then we can show that
$$\lim_{k\to\infty}\hat\xi^{H}_{k,n} = \xi.$$
The Hill estimator is also asymptotically normal:
$$\sqrt{k}\,\frac{\hat\xi^{H}_{k,n} - \xi}{\xi} \xrightarrow{d} \mathcal{N}(0,1),$$
the convergence being in law.
However, the Hill estimator can have a substantial bias, and for this reason it is delicate to handle.


Note: it also plays another role, that of estimating the threshold (see the Peaks-Over-Threshold method below).
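
A direct transcription of the Hill estimator formula above (an illustrative sketch only; hill_xi is a hypothetical helper, not a package function) is:

# Hill estimator for a given k (positive data, sorted in decreasing order)
hill_xi <- function(x, k) {
  xs <- sort(x, decreasing = TRUE)
  mean(log(xs[1:(k - 1)] / xs[k]))
}
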
3.1.3 Moment Estimator

This estimator is defined by the statistic:
$$\hat\xi^{M}_{k,n} = \hat\xi^{(1)}_{k,n} + 1 - \frac{1}{2}\left(1 - \frac{(\hat\xi^{(1)}_{k,n})^{2}}{\hat\xi^{(2)}_{k,n}}\right)^{-1}$$
where
$$\hat\xi^{(i)}_{k,n} = \frac{1}{k}\sum_{j=1}^{k}\left(\ln\frac{X_{j,n}}{X_{k+1,n}}\right)^{i}, \qquad i = 1, 2.$$
This estimator is convergent and asymptotically Gaussian:
$$\sqrt{k}\,\frac{\hat\xi^{M}_{k,n} - \xi}{\sqrt{1+\xi^{2}}} \xrightarrow{d} \mathcal{N}(0,1)$$

As for the Hill estimator, it is usable only for the Fréchet domain of attraction.

3.2 Parametric Estimation of the GEV Parameters

We will see through an example how to estimate the GEV parameters by the maximum likelihood method and by the probability weighted moment method. Then, we will estimate the Value-at-Risk.
Simulated example: we simulate data $z_1, \dots, z_m$ which represent, for example, the costs of car accidents and which are by construction maxima, obtained in R with M = rgev(100, loc = 9, scale = 2.6, shape = 0.3).

3.2.1 Estimation of the Parameters by Maximum Likelihood (MLE)

We know these observations are realizations of a GEV distribution (since they are maxima). We assume in addition that they are independent with a common distribution. We will therefore estimate the parameters of this distribution.
The log-likelihood associated with the case $\xi \ne 0$ is then:
$$l(\mu, \sigma, \xi) = -m\log\sigma - \left(1 + \frac{1}{\xi}\right)\sum_{i=1}^{m}\log\left(1 + \xi\,\frac{z_i - \mu}{\sigma}\right) - \sum_{i=1}^{m}\left(1 + \xi\,\frac{z_i - \mu}{\sigma}\right)^{-1/\xi}$$
with $1 + \xi\,\frac{z_i - \mu}{\sigma} > 0$ for $i = 1, \dots, m$.
The log-likelihood associated with the case $\xi = 0$ is different, since the GEV then corresponds to the Gumbel distribution:
$$l(\mu, \sigma) = -m\log\sigma - \sum_{i=1}^{m}\frac{z_i - \mu}{\sigma} - \sum_{i=1}^{m}\exp\left(-\frac{z_i - \mu}{\sigma}\right)$$
The asymptotic normality of the estimators is only valid for $\xi > -\frac{1}{2}$. In practice, the estimators have no explicit expression and many problems can occur if there are too many or, on the contrary, too few data.
With the function gev.fit, we find:

$\hat\mu = 8.6916369$, $\hat\sigma = 2.4248499$, $\hat\xi = 0.2882475$

Since we have the asymptotic normality of the estimators when $\xi > -\frac{1}{2}$, we have confidence intervals of the form:
$$CI_{1-\alpha} : \left[\hat\xi - q_{1-\alpha/2}\sqrt{\widehat{Var}(\hat\xi)}\ ;\ \hat\xi + q_{1-\alpha/2}\sqrt{\widehat{Var}(\hat\xi)}\right]$$
where $q_{1-\alpha/2}$ is the $(1-\frac{\alpha}{2})$-quantile of the standard normal distribution, i.e. $\mathbb{P}(Y \le q_{1-\alpha/2}) = 1 - \frac{\alpha}{2}$ with $Y \sim \mathcal{N}(0,1)$.
Thus, the results of the estimation of the parameters by maximum likelihood, with 95% confidence intervals, are:

$\hat\xi = 0.29$ [0.11, 0.46], $\hat\mu = 8.69$ [8.14, 9.23], $\hat\sigma = 2.42$ [1.97, 2.88]

3.2.2 Estimation of the Parameters by the Probability Weighted Moment (PWM) Method

It is a general method that was introduced by Greenwood et al. [1979]. The weighted moments of a random variable $Z$ having a cumulative distribution function $F$ are defined by:
$$M_{p,r,s} = \mathbb{E}\left[Z^{p}\,(F(Z))^{r}\,(1 - F(Z))^{s}\right] \qquad \text{for } p, r, s \in \mathbb{R}$$
The study for the GEV was carried out by Hosking et al. [1985]. We obtain, for $\xi \ne 0$, $\xi < 1$:
$$M_{1,r,0} = \beta_r = \frac{1}{r+1}\left(\mu - \frac{\sigma}{\xi}\left[1 - (r+1)^{\xi}\,\Gamma(1-\xi)\right]\right)$$
where $\Gamma(t) = \int_{0}^{\infty} x^{t-1}e^{-x}\,dx$.
If $Z_1, \dots, Z_m$ is a sample from a GEV distribution, the probability weighted moment estimators $(\hat\sigma, \hat\mu, \hat\xi)$ of $(\sigma, \mu, \xi)$ are the solutions of the system:
$$\begin{cases} \beta_0 = \mu - \dfrac{\sigma}{\xi}\left(1 - \Gamma(1-\xi)\right) \\[6pt] 2\beta_1 - \beta_0 = \dfrac{\sigma}{\xi}\,\Gamma(1-\xi)\,(2^{\xi} - 1) \\[6pt] \dfrac{3\beta_2 - \beta_0}{2\beta_1 - \beta_0} = \dfrac{3^{\xi} - 1}{2^{\xi} - 1} \end{cases}$$
replacing $\beta_r$ by the unbiased estimator based on the $m$ observations $Z_1, \dots, Z_m$:
$$\hat\beta_r = \frac{1}{m}\sum_{j=1}^{m}\left(\prod_{l=1}^{r}\frac{j-l}{m-l}\right) Z_{j,m}$$
(for further details, see Beirlant et al. [2004], chapter 5)
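
For illustration, the empirical $\hat\beta_r$ can be computed directly from the sample (a sketch; pwm_beta is a hypothetical helper, assuming the sample is sorted in ascending order):

# Unbiased empirical probability weighted moment beta_r
pwm_beta <- function(z, r) {
  z <- sort(z)                                   # Z_{1,m} <= ... <= Z_{m,m}
  m <- length(z)
  w <- sapply(1:m, function(j) {
    if (r == 0) 1 else prod((j - (1:r)) / (m - (1:r)))
  })
  mean(w * z)                                    # (1/m) * sum of weighted order statistics
}
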
For this method, we can use the function gevFit in the fExtremes package and we find:

$\hat\xi = 0.2888079$, $\hat\mu = 8.6770687$, $\hat\sigma = 2.4246313$

The asymptotic normality is valid here for $\xi < \frac{1}{2}$, but this domain of validity is restrictive for many applications. However, it can be extended with the generalized probability weighted moments method. In practice, this method performs well in simulations for small sample sizes.
We now turn to the calculation of the 99% VaR ($VaR(1-p) = z_p$). We have:
$$z_p = \mu - \frac{\sigma}{\xi}\left[1 - (-\log(1-p))^{-\xi}\right] \quad \text{if } \xi \ne 0$$
$$z_p = \mu - \sigma\log(-\log(1-p)) \quad \text{if } \xi = 0$$
We will then use the following estimators:
$$\hat z_p = \hat\mu - \frac{\hat\sigma}{\hat\xi}\left[1 - (-\log(1-p))^{-\hat\xi}\right] \quad \text{if } \hat\xi \ne 0$$
$$\hat z_p = \hat\mu - \hat\sigma\log(-\log(1-p)) \quad \text{if } \hat\xi = 0$$
We can also calculate confidence intervals for $z_p$. For that, it is first necessary to calculate the associated variance. Using the delta method, we have:
$$Var(\hat z_p) \simeq \nabla z_p^{T}\, V\, \nabla z_p$$
with $V$ the variance-covariance matrix of $(\hat\mu, \hat\sigma, \hat\xi)$ and
$$\nabla z_p^{T} = \left[\frac{\partial z_p}{\partial \mu}, \frac{\partial z_p}{\partial \sigma}, \frac{\partial z_p}{\partial \xi}\right]$$
evaluated at $(\hat\mu, \hat\sigma, \hat\xi)$.
We find the 99% VaR with a 95% confidence interval: $31.96 \in [19.97, 43.94]$.
Note: we can check the stability of the VaR. Simulate two different samples from the same distribution:
A = rgev(100, loc = 9, scale = 2.6, shape = 0.3)
B = rgev(100, loc = 9, scale = 2.6, shape = 0.3)
Using this time the function fgev in the evd package, we find:
fgev(A)$estimate: loc = 9.1707366, scale = 2.5334243, shape = 0.1961111
fgev(B)$estimate: loc = 8.4849578, scale = 2.3563261, shape = 0.3680578
The function fgev also provides the VaR directly (with prob = 0.01, the location parameter is replaced by the corresponding quantile):
sample A: quantile = 28.1156989, scale = 2.5323606, shape = 0.1967211
sample B: quantile = 28.4638800, scale = 2.3528401, shape = 0.2484344
We can see that the parameter estimates are quite different, but we obtain similar values of the 99% VaR in both cases.

4 Extreme Value Methods

In extreme value theory, there are two important approaches: the block maxima (BM) method and the peaks-over-threshold (POT) method. Whereas much theoretical research has gone into the POT method, the BM method has not been studied as thoroughly. We will present both methods in this paper with an application example, and we will also compare the two approaches.

4.1 Block Maxima Method

Up to this point, we have considered maxima directly. Now suppose we have independent realizations, with a common distribution $F$, of a certain phenomenon of interest.
One approach to working with extreme value data is to divide the sample into $m$ blocks of equal length and to fit a GEV distribution to the maxima of the blocks. Indeed, the block maxima method involves fitting a GEV distribution to the maximum data values taken over fixed time intervals of a given block length.
The delicate point of this method is the "good" choice of the block size: blocks that are too small can lead to bias, while blocks that are too large generate too few block maxima, which leads to a large estimation variance. This choice depends on the context, on the number of data, etc.
$$\underbrace{(X_1, X_2, \dots, X_k)}_{Z_1},\ \underbrace{(X_{k+1}, X_{k+2}, \dots, X_{2k})}_{Z_2},\ \dots,\ \underbrace{(X_{(m-1)k+1}, \dots, X_{mk})}_{Z_m}$$
We obtain a sample of $m$ realizations (approximately GEV distributed) to which we can apply the Fisher-Tippett-Gnedenko theorem and the methods seen previously.
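
A base-R sketch of this construction (the series and the block length k below are placeholders, not the report's data) could be:

# Split a series into blocks of length k and keep the maximum of each block
k <- 20
x <- rnorm(2000)                                 # placeholder series
m <- floor(length(x) / k)
Z <- sapply(1:m, function(i) max(x[((i - 1) * k + 1):(i * k)]))
# Z contains the m block maxima, to which a GEV distribution can be fitted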

Figure 2: Block Maxima method illustrated

However, the block maxima method is wasteful of data as only one data point in each
block is taken. The second highest value in one block may be larger than the highest
of another block and this is generally not accounted for.
The peaks-over-threshold (POT) method is a way to avoid this drawback as it uses
the data more efficiently: the idea is to consider several large values instead of only
the largest one.

4.2 Peaks-Over-Threshold Method

This approach consists in using observations which are above a given threshold, called
exceedances.

Figure 3: Peaks-Over-Threshold method illustrated

Definition 4.1.
Let $X$ be a random variable with distribution function $F$ and right endpoint $x_F$. For all $u < x_F$, the function:
$$F_u(x) = \mathbb{P}(X - u \le x \mid X > u), \qquad x \ge 0,$$
is called the distribution function of the exceedances above the threshold $u$.
Using conditional probabilities, $F_u$ can also be written as:
$$F_u(x) = \begin{cases} \dfrac{F(u+x) - F(u)}{1 - F(u)} & \text{if } x \ge 0 \\[8pt] 0 & \text{otherwise} \end{cases}$$

Let Y = X − u for X > u and for n observed variables X1 , . . . , Xn , we can write
Yj = Xi − u such that i is the index of the jth exceedance, j = 1, . . . , nu . The
distribution of the exceedances (Y1 , . . . , Ynu ) can be approximated by a Generalized
Pareto Distribution.
4.2.1 The Generalized Pareto Distribution (GPD)

Definition 4.2.
Let $\sigma_u$ be a strictly positive function and $\xi \in \mathbb{R}$. The Generalized Pareto Distribution (GPD) is given by:
$$G_{\xi,\sigma_u}(y) = \begin{cases} 1 - \left(1 + \dfrac{\xi y}{\sigma_u}\right)^{-1/\xi} & \xi \ne 0 \\[8pt] 1 - \exp\left(-\dfrac{y}{\sigma_u}\right) & \xi = 0 \end{cases} \qquad (4.1)$$
where $y \ge 0$ if $\xi \ge 0$, and $0 \le y \le -\dfrac{\sigma_u}{\xi}$ if $\xi < 0$.
Theorem 4.1 (Pickands-Balkema-de Haan).
If $F$ is in one of the three domains of attraction of the GEV (Gumbel, Fréchet or Weibull), then there exist a strictly positive function $\sigma_u$ and $\xi \in \mathbb{R}$ such that:
$$\lim_{u \to x_F}\ \sup_{0 \le y \le x_F - u}\ \left|F_u(y) - G_{\xi,\sigma_u}(y)\right| = 0$$
where $G_{\xi,\sigma_u}$ is the GPD and $F_u$ the distribution function of the exceedances above $u$.
Thus, for a threshold $u$ large enough, the distribution of the exceedances above $u$ is approximated by a GPD:
$$F_u \simeq G_{\xi,\sigma_u}$$
The parameters of the GPD are uniquely determined by those of the GEV: the shape parameter $\xi$ is equal to that of the corresponding GEV, and the scale parameter $\sigma_u$ is a function of the GEV parameters: $\sigma_u = \sigma + \xi(u - \mu)$.
The notation $\sigma_u$ was used to distinguish this scale parameter from the corresponding parameter of the GEV and to emphasize its dependence on $u$. From now on, the distinction will be dropped for notational convenience.
As with the GEV, the shape parameter of the GPD dominates the behavior of the tail. If $\xi < 0$, the distribution of excesses has an upper bound at $u - \frac{\sigma}{\xi}$, while on the contrary, if $\xi > 0$, the distribution has no upper limit.


Note: The expected value of the GPD with parameters σ and ξ is:
$$\mathbb{E}[Y] = \frac{\sigma}{1-\xi} \quad \text{provided } \xi < 1 \qquad (4.2)$$
and $\mathbb{E}[Y] = \infty$ when $\xi \ge 1$.
4.2.2 Estimation of the Parameters

As for the GEV distribution, the method of estimating $\sigma$ and $\xi$ which combines theoretical efficiency with a general basis for inference is the maximum likelihood method (although here as well several other methods exist, such as the Pickands estimator, the probability weighted moments method, the method of moments, etc.).
Let $y_1, \dots, y_{n_u}$ be a sequence of $n_u$ exceedances of a threshold $u$. For $\xi \ne 0$, the log-likelihood is derived from (4.1) as:
$$l(\sigma, \xi) = -n_u\log\sigma - \left(1 + \frac{1}{\xi}\right)\sum_{i=1}^{n_u}\log\left(1 + \xi\,\frac{y_i}{\sigma}\right)$$
provided $\left(1 + \xi\,\frac{y_i}{\sigma}\right) > 0$ for $i = 1, \dots, n_u$; otherwise $l(\sigma, \xi) = -\infty$.
When $\xi = 0$, the log-likelihood is derived in a similar way:
$$l(\sigma) = -n_u\log\sigma - \frac{1}{\sigma}\sum_{i=1}^{n_u} y_i$$
However, once more, the maximum likelihood estimator (MLE) is not always valid and the regularity conditions do not always hold. The MLE is valid for $\xi > -1$, but its asymptotic normality only holds for $\xi > -\frac{1}{2}$. When $\xi < -1$, maximum likelihood estimators generally do not exist.
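
A minimal sketch of this fit for the case ξ ≠ 0 (the negative log-likelihood is minimized with optim; the starting values are arbitrary and y is assumed to hold the excesses above the chosen threshold):

# Negative log-likelihood of the GPD and numerical maximum likelihood fit
gpd_nll <- function(par, y) {
  sigma <- par[1]; xi <- par[2]
  if (sigma <= 0 || any(1 + xi * y / sigma <= 0)) return(Inf)   # outside the support
  length(y) * log(sigma) + (1 + 1 / xi) * sum(log(1 + xi * y / sigma))
}
# y: vector of excesses above the chosen threshold
fit <- optim(c(1, 0.1), gpd_nll, y = y)
fit$par                                          # estimated (sigma, xi)
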
4.2.3 Choice of the Threshold

The choice of the threshold is not straightforward. Indeed, a compromise between bias and variance has to be found: a high threshold value reduces the bias but increases the variance of the GPD estimators, while a low threshold value results in the opposite, i.e. a high bias but a low variance of the estimators, since there are more data with which to estimate the parameters.
A solution is to fix a number of excesses $r$ and to deduce the threshold $u$ from it. We then look for $u$ such that $\mathbb{P}(X > u) = \frac{r}{n}$. In other words, we estimate $u = \bar F^{-1}\left(\frac{r}{n}\right)$ by $X_{n-r+1,n}$.
This solution, widely used in practice, moves the problem of choosing the threshold $u$ to the problem of choosing the number of excesses $r$ to consider.
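
In R, this simply amounts to taking an appropriate order statistic as the threshold (a sketch, with x the observed series and r the desired number of excesses):

# Threshold such that exactly r observations exceed it (ignoring ties)
r <- 100
u <- sort(x, decreasing = TRUE)[r + 1]
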
The Mean Excess Function (MEF)
The mean excess function or the mean residual life plot is a graphical method for the
selection of the threshold. This method is based on the mean of the GPD. Suppose
the GPD is valid as a model for the excesses of a threshold $u_0$ generated by a series $X_1, \dots, X_n$.
By (4.2) we have
$$\mathbb{E}(X - u_0 \mid X > u_0) = \frac{\sigma_{u_0}}{1-\xi}$$
The threshold stability property of the GPD means that if the GPD is a valid model for the excesses over some threshold $u_0$, then it is valid for the excesses over any threshold $u > u_0$. Hence, for all $u > u_0$ and $\xi < 1$:
$$\mathbb{E}(X - u \mid X > u) = \frac{\sigma_{u_0} + \xi(u - u_0)}{1-\xi} = \frac{\xi}{1-\xi}\,u + \frac{\sigma_{u_0} - \xi u_0}{1-\xi}$$
Thus, for all u > u0 , the MEF is a linear function of u. Furthermore, E(X − u|X > u)
is the mean of the excesses of the threshold u, and can be estimated by the sample
mean of the threshold excesses.
In practice, we plot the locus of points:
$$\left\{\left(u,\ \frac{1}{n_u}\sum_{i=1}^{n_u}\left(x_{(i)} - u\right)\right) : u < x_{\max}\right\}$$
where $x_{(1)}, \dots, x_{(n_u)}$ are the $n_u$ observations that exceed $u$, and $x_{\max}$ is the largest of the $X_i$.
It means that for a range of threshold u, we identify the corresponding mean threshold
excess, then plot this mean threshold excess against u, and look for the value u0 above
which we can see linearity in the plot.
Note:
• If the MEF is increasing, it corresponds to the case ξ > 0 and then to the Fréchet
domain of attraction.
• If the MEF is decreasing, it corresponds to the case ξ < 0 and then to the Weibull
domain of attraction.
• If the MEF is constant, it corresponds to the case ξ = 0 and then to the Gumbel
domain of attraction.
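
A hand-made version of this plot (an illustrative sketch, with x standing for the observed series; the mePlot and mrlPlot functions of the fExtremes package used in the annex produce similar graphs) is:

# Empirical mean excess function over a grid of candidate thresholds
u_grid <- quantile(x, probs = seq(0.50, 0.99, by = 0.01))
me <- sapply(u_grid, function(u) mean(x[x > u] - u))
plot(u_grid, me, type = "l", xlab = "threshold u", ylab = "mean excess")
# look for the smallest u beyond which the curve is approximately linear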

5 Application

Our dataset is bmwRet, available in the fExtremes package. These data correspond to daily log-returns of BMW share prices.
In finance we generally do not have i.i.d. observations, but we will suppose it is the case here: for reasons we will not develop in this paper, we may assume stationarity of the underlying time series, so that many limit results (such as the strong law of large numbers) remain valid under general conditions and allow us to interpret these data in a similar way to the i.i.d. case.
Furthermore, we will not distinguish here between positive and negative daily log-returns.

5.1 Block Maxima Method

As we have seen above, the block maxima method models the distribution of the extremes by the GEV.

[Figure: daily log-returns of BMW share prices for the period January 2, 1973 – July 23, 1996 (n = 6146).]

We then construct maxima on blocks of length 20.

[Figure: graphical diagnostics of the GEV fit.]

We then estimate the parameters of the GEV by maximum likelihood, together with a 95% confidence interval for each parameter. For the 99% VaR, we find 0.039 with a 95% confidence interval of [0.036, 0.042].

5.2 Peaks-Over-Threshold Method

We have seen that the POT method models the distribution of the excesses above a threshold by the GPD.
According to the mean excess plot, we can take the threshold u = 0.035, because the locus of points (u, e(u)) is approximately linear beyond that value. In addition, the findThreshold function in the fExtremes package, with a number of excesses equal to 100, also gives a threshold approximately equal to 0.035.
We then estimate the GPD by maximum likelihood using the gpd function and we find a 99% VaR equal to 0.042.

Conclusion
We have illustrated extreme value theory and how it can be used to model risk measures such as Value-at-Risk. Our conclusion is that extreme value theory can be useful for assessing the size of extreme events.
While we focused in this paper on the univariate case, when we look at extreme events it is often the simultaneous occurrence of several events that matters, so many variables come into play. But the univariate case cannot be generalized straightforwardly to the multivariate case. Leaving the univariate setting raises several questions: how do we order vectors of random variables? What is a multivariate extreme event? Is it enough that only one component is an extreme realization? Or must all the components be extreme? Or do we need another condition?
To be continued...

Bibliography
[1] Stuart Coles. An Introduction to Statistical Modeling of Extreme Values. Springer, 2001.
[2] P. Embrechts, C. Klüppelberg, T. Mikosch. Modelling Extremal Events for Insurance and Finance. Springer-Verlag, 1997.
[3] Pierre Ribereau. Valeurs extrêmes : applications en R, 2011.
[4] Jan Beirlant, Yuri Goegebeur, Jozef Teugels. Statistics of Extremes. John Wiley & Sons, Ltd, 2004.
[5] Jean-Noël Bacro, Pierre Ribereau. Théorie des valeurs extrêmes appliquée avec R, 2011.
[6] Frédéric Planchet. Utilisation de la théorie des valeurs extrêmes dans le contexte solvabilité 2, 2012. URL http://www.ressources-actuarielles.net/EXT/ISFA/fpisfa.nsf/0/72EE1310B7EBC2A2C1256FD2002E9C76/$FILE/Seance301.pdf?OpenElement
[7] J.-J. Droesbeke, M. Maumy-Bertrand, G. Saporta, C. Thomas-Agnan. Approches statistiques du risque. Technip, 2014, pp. 85-112.
[8] Esther Bommier. Peaks-Over-Threshold Modelling of Environmental Data, 2014. URL https://uu.diva-portal.org/smash/get/diva2:760802/FULLTEXT01.pdf
[9] Abhinay Sawant. Extreme Value Theory with High-Frequency Financial Data, 2010.
[10] Marco Rocco. Extreme value theory for finance: a survey, 2011. URL http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1998740
[11] Younes Bensalah. Steps In Applying Extreme Value Theory to Finance: A Review, 2000.

Annex
# Simulated example
library(evd)
library(ismev)
M <- rgev(100, loc = 9, scale = 2.6, shape = 0.3)
plot(M, xlab = "i", ylab = "maximum cost of car accidents")
# Estimation of the parameters by maximum likelihood
simuGEV <- gev.fit(M)
simuGEV$mle
# 95% confidence intervals for these parameters
simuGEV$mle - 1.96 * simuGEV$se
simuGEV$mle + 1.96 * simuGEV$se
# Calculation of the 99% VaR with a 95% confidence interval (delta method)
zp <- simuGEV$mle[1] - (simuGEV$mle[2] / simuGEV$mle[3]) *
  (1 - (-log(0.99))^(-simuGEV$mle[3]))
zp
ypm1 <- -log(0.99)
gradzp <- c(1,
            -simuGEV$mle[3]^(-1) * (1 - ypm1^(-simuGEV$mle[3])),
            simuGEV$mle[2] * simuGEV$mle[3]^(-2) * (1 - ypm1^(-simuGEV$mle[3])) -
              simuGEV$mle[2] * simuGEV$mle[3]^(-1) * ypm1^(-simuGEV$mle[3]) * log(ypm1))
varzp <- t(gradzp) %*% simuGEV$cov %*% gradzp
c(zp - 1.96 * sqrt(varzp), zp + 1.96 * sqrt(varzp))
# Estimation of the parameters by PWM
library(fExtremes)
resgevFit <- gevFit(M, type = "pwm")
resgevFit
# VaR stability
A <- rgev(100, loc = 9, scale = 2.6, shape = 0.3)
B <- rgev(100, loc = 9, scale = 2.6, shape = 0.3)
fgev(A)$estimate
fgev(B)$estimate
fgev(A, prob = 0.01)$estimate
fgev(B, prob = 0.01)$estimate

# Block Maxima Method
library("fExtremes")
library("ismev")
data(bmwRet)
head(bmwRet)
BMW <- as.timeSeries(bmwRet)
# Construction of the maxima on blocks of length 20
B <- blockMaxima(BMW, block = 20)
maxB <- B$max.BMW.RET
plot(maxB, main = "BMW.RET maxima by block of length 20",
     xlab = "t", ylab = "BMW.RET")
# Estimation of the GEV parameters by PWM
BMWGEV <- gevFit(maxB, type = "pwm")
BMWGEV
# Estimation of the GEV parameters by MLE
BMWGEV2 <- gev.fit(maxB)
BMWGEV2
BMWGEV2$mle
# 95% confidence interval for each parameter
BMWGEV2$mle - 1.96 * BMWGEV2$se
BMWGEV2$mle + 1.96 * BMWGEV2$se
# Calculation of the 99% VaR with a 95% confidence interval
# (0.99^20: the daily 99% level converted to the 20-day block maximum scale)
ypm1 <- -log(0.99^20)
zp <- BMWGEV2$mle[1] - (BMWGEV2$mle[2] / BMWGEV2$mle[3]) *
  (1 - ypm1^(-BMWGEV2$mle[3]))
zp
gradzp <- c(1,
            -BMWGEV2$mle[3]^(-1) * (1 - ypm1^(-BMWGEV2$mle[3])),
            BMWGEV2$mle[2] * BMWGEV2$mle[3]^(-2) * (1 - ypm1^(-BMWGEV2$mle[3])) -
              BMWGEV2$mle[2] * BMWGEV2$mle[3]^(-1) * ypm1^(-BMWGEV2$mle[3]) * log(ypm1))
varzp <- t(gradzp) %*% BMWGEV2$cov %*% gradzp
c(zp - 1.96 * sqrt(varzp), zp + 1.96 * sqrt(varzp))
# Calculation of the VaR with the VaR function
VaR(BMW, 0.99, type = "sample")
# Peaks-Over-Threshold Method
library("fExtremes")
data(bmwRet)
BMW <- as.timeSeries(bmwRet)
# Mean excess plot
mePlot(BMW)

# Mean Residual Life Plot
mrlPlot(BMW)
# We estimate here the threshold in a different way,
# by fixing the number of excesses at 100
findThreshold(BMW, 100)
# Estimation of the GPD by maximum likelihood
library("evir")
gpdBMW <- gpd(BMW, 0.035, method = "ml")
gpdBMW
# Calculation of the 99% VaR
riskmeasures(gpdBMW, 0.99)
