Asymptotic Extremal Distribution for Non-Stationary, Strongly-Dependent Data
1. Introduction
The statistical analysis of extreme values has a wide range of applications across many disciplines. Extreme wind speeds are a key design input in structural engineering. Maximum traffic levels are crucial for the design and operation of telecommunications networks. Maximum tides are essential for any policy concerning the management of coastal resources. Extreme chemical, physical or biological conditions may dramatically affect very sensitive ecosystems [1] - [8].
The classical Fisher-Tippett-Gnedenko theory determines the asymptotic behavior of the maximum of a sample of size n from an iid sequence of random variables. It shows that, as n tends to infinity, the limit distribution of the suitably normalized maximum may be degenerate and, in any other case, must be an extremal distribution [9] [10].
It has been known for about three decades that the classical theory also applies to stationary and weakly dependent sequences of random variables, thanks to the work of Leadbetter, Lindgren and Rootzén [11] - [17].
However, in many real examples, a very large collection of maxima recorded from a long series of measurements does not fit any extremal distribution. This is often associated with phenomena where the observed system may assume different states that produce drastic changes and may introduce strong dependence in the data. Think, for instance, of the classical data series of the Nile River, with its very ancient regime of annual floods, and of other long series of hydrological data, in particular those related to the impact of climate change [18] [19] [20].
At a theoretical level, we will apply in the extremal context a method that has been used for other purposes with non-stationary and strongly dependent data, where a random covariable indicates the state of the system and the global behavior may be represented by a mixture of models. This is the case, for instance, of the Compound-Poisson approximation of high-level exceedances of time series [21], the asymptotics of averages [22], and Nadaraya-Watson regression for functional data [23].
In this paper, we consider data that depend on two independent components: on the one hand, a categorical covariable process that describes the state of the system, which need be neither stationary nor weakly dependent and is only required to satisfy that the mean frequency of each state has a (possibly random) limit; on the other hand, an iid noise. We assume that, for a given state j of the covariable, the asymptotic distribution of the maximum is non-degenerate and depends on j. In the main result of the paper, we show that if the covariable process is stationary and weakly dependent, the asymptotic distribution of the maximum of our data is still an extremal distribution, consistent with the results of Leadbetter, Lindgren and Rootzén. We also show that if the covariable process has a strong dependence structure, then this asymptotic distribution is no longer an extremal one, but a mixture of extremal distributions.
The paper is organized as follows. We begin in Section 2 with a very brief summary of the classical Fisher-Tippett-Gnedenko theory, in particular the characterization of the Maximal Domains of Attraction of extremal distributions and an elementary lemma that we will use later. We then present in Section 3 our model, its hypotheses, some examples and the main result. In Section 4, we fit some simulated strongly dependent data to a mixture of extremal distributions, showing that they do not fit extremal distributions. In particular, we show the impact on return times of misfitting data to a single extremal distribution. Finally, in Section 5, we present the conclusions and some further work in progress.
Therefore, what we provide here is a statistical method that is a step towards a proper extreme value analysis of: 1) data that are suspected to be strongly dependent; 2) data of unknown dependence structure whose extremes do not fit classical extremal distributions.
2. Brief Review of Classical Extreme Values Theory
The classical Fisher-Tippett-Gnedenko theory states that a distribution F belongs to the Maximal Domain of Attraction of an extremal distribution H (Weibull, Gumbel or Fréchet) if there exists an iid sequence $X_1, X_2, \ldots$ of random variables with distribution F such that, for some deterministic sequences $b_n$ and $a_n > 0$, we have that

$$\frac{\max(X_1, \ldots, X_n) - b_n}{a_n} \xrightarrow{w} H \qquad (1)$$
Recall that
.
Then the three Maximal Domains of Attraction (MDA) are fully described as follows.
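1) Fréchet of order α: F belongs to $MDA(\Phi_\alpha)$, where $\Phi_\alpha(x) = \exp(-x^{-\alpha})$ for $x > 0$ and $\alpha > 0$ is called the order parameter, if and only if the right endpoint $x_F = \sup\{x : F(x) < 1\}$ is infinite and, for x tending to infinity, $1 - F(x) = x^{-\alpha} L(x)$ for some slowly varying function L. In that case, in (1), the deterministic sequences are $b_n = 0$, $a_n = F^{-1}(1 - 1/n)$, with $F^{-1}$ the generalized inverse of F.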
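2) Weibull of order α: F belongs to $MDA(\Psi_\alpha)$, where $\Psi_\alpha(x) = \exp(-(-x)^{\alpha})$ for $x < 0$ and $\alpha > 0$, if and only if the right endpoint $x_F$ of F is finite and, for x tending to infinity, $1 - F(x_F - 1/x) = x^{-\alpha} L(x)$, with L a slowly varying function. In this case $b_n = x_F$ and $a_n = x_F - F^{-1}(1 - 1/n)$, with $F^{-1}$ the generalized inverse of F.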
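3) Gumbel: F belongs to $MDA(\Lambda)$, where $\Lambda(x) = \exp(-e^{-x})$, if and only if there exist a right endpoint $x_F$ of F (which may be finite or infinite), some $z < x_F$ and a positive function h, with density $h'$, such that $\lim_{x \to x_F} h'(x) = 0$ and, for x tending to $x_F$, $1 - F(x)$ is equivalent to $c \exp\left(-\int_z^x \frac{dt}{h(t)}\right)$ for some constant $c > 0$, and $b_n = F^{-1}(1 - 1/n)$, $a_n = h(b_n)$.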
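The Fréchet case of convergence (1) can be checked numerically. The following is a minimal sketch, not taken from the paper: it assumes a pure Pareto tail $F(x) = 1 - x^{-\alpha}$ for $x \ge 1$ (so that the slowly varying function is constant and the scale sequence is $F^{-1}(1 - 1/n) = n^{1/\alpha}$), and the function names and the values of `alpha`, `n` and `replications` are illustrative choices.

```python
import math
import random


def pareto_inverse_cdf(u, alpha):
    """Inverse of the assumed cdf F(x) = 1 - x**(-alpha), x >= 1."""
    return (1.0 - u) ** (-1.0 / alpha)


def frechet_cdf(x, alpha):
    """Frechet distribution function of order alpha."""
    return math.exp(-x ** (-alpha)) if x > 0 else 0.0


def normalized_maxima(n, replications, alpha, rng):
    """Maxima of n iid Pareto draws, each divided by the scale F^{-1}(1 - 1/n)."""
    scale_n = pareto_inverse_cdf(1.0 - 1.0 / n, alpha)  # equals n**(1/alpha)
    return [
        max(pareto_inverse_cdf(rng.random(), alpha) for _ in range(n)) / scale_n
        for _ in range(replications)
    ]


def empirical_cdf(sample, x):
    """Fraction of sample points less than or equal to x."""
    return sum(1 for s in sample if s <= x) / len(sample)


rng = random.Random(0)
alpha = 2.0  # illustrative order parameter
maxima = normalized_maxima(n=500, replications=2000, alpha=alpha, rng=rng)
for x in (0.5, 1.0, 2.0):
    # Empirical cdf of the normalized maxima next to the Frechet limit.
    print(x, round(empirical_cdf(maxima, x), 3), round(frechet_cdf(x, alpha), 3))
```

With these settings the two printed columns should be close, which is what (1) predicts for a distribution in the Fréchet domain of attraction.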
3. Main Result
We will use the following elementary result, whose proof follows from simple analytical computations based on the characterizations of the involved sequences given in the previous section [24].
Lemma 1: If we denote by
,
,
,
,
, the deterministic sequences corresponding to each MDA, we then have:
i) If
0, and
0,
0.
ii)
,
0.
iii) If
,
0.
We will assume that the process Y satisfies:
(H1) For any state
, there exists a (possibly random)
such that
a.s.
If
, and
, then, since for any j,
is
-measurable, if
is trivial,
are deterministic, but, if
is not trivial, for some j
may be non-deterministic, corresponding to strong dependence on the process Y. Let us show this in a very simple case.
Example 1:
Let U be a random variable such that
,
. Let
, ...an iid sequence of random variables on
independent of U such that
,
,
,
, with
,
independent for any i,
,
.
Set
.
Thus,
has the same distribution as
(by the Strong Law of Large Numbers).
On the other hand
has the same distribution as
.
Therefore if we assume that
, we have that
Hence,
is non-deterministic and
is not trivial. Similar treatment applies to
.
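The mechanism behind Example 1 can be illustrated with a short simulation. The sketch below is an assumption-laden reading of the example, since its formulas are not reproduced here: a binary variable U is drawn once per trajectory and selects which of two iid two-state sequences is observed, so the long-run frequency of state 1 converges to one of two values depending on U. The probabilities `p`, `a` and `b` are hypothetical.

```python
import random


def simulate_state_frequency(n, p, a, b, rng):
    """Return (U, frequency of state 1 among Y_1..Y_n) for one trajectory.

    Assumed construction: U = 1 with probability p; given U, the Y_i are iid
    with P(Y_i = 1) = a if U = 1 and P(Y_i = 1) = b if U = 0.
    """
    u = 1 if rng.random() < p else 0      # drawn once for the whole trajectory
    prob_state_1 = a if u == 1 else b     # regime selected by U
    count = sum(1 for _ in range(n) if rng.random() < prob_state_1)
    return u, count / n


rng = random.Random(1)
p, a, b = 0.5, 0.8, 0.3  # hypothetical values
for _ in range(5):
    u, freq = simulate_state_frequency(20000, p, a, b, rng)
    # The frequency settles near a or near b, depending on the value of U.
    print(u, round(freq, 3))
```

Each trajectory obeys a law of large numbers, but the limit differs from one trajectory to another, which is the kind of random limiting frequency allowed by (H1).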
Indeed, for the sake of simplicity, we will assume:
(H2) for any j,
may only assume a finite number of values.
Our data will be of the form:
(2)
where f is unknown, and we will assume:
(H3) The three following conditions are fulfilled.
(i)
iid
(ii)
satisfies (H1) and (H2)
(iii) The processes
and
are independent.
Finally, we will assume that the state of the system is observable and that it affects the extremal behavior of our data. For instance, X may be a measure of the tide exceedance of a baseline on a given coast, which depends on the type of wind affecting the area, which in turn may be classified into a finite number of categories. Given the category of wind, the extreme tides behave in significantly different ways. Then Y indicates the type of wind on the coast, which may be observed from aerodynamic records. In addition, X also depends on a series of random effects that may be regarded as white noise.
More precisely, we shall assume:
(H4) There exist an integer
and an integer
,
such that
a) For
the iid process
belongs to
, where
b) For
the iid process
belongs to
c) For
, the iid process
belongs to
, with
Then we have our main result
Theorem 1:
Under (H3) there exists a random variable Z such that
In addition
a) If
is trivial, then the distribution of Z is
.
b) If
is not trivial and
assumes the values
with probabilities
, then the distribution of Z is
(Mixture of Fréchet distributions).
Proof:
Let us consider a fixed real x and set
the space of sequences taking values in
Consider
for any
in
.
Therefore,
(3)
Re-ordering the maximum according to the state j taken by each
, we get
(4)
The blocks of maxima taken with Y fixed at a given state belong to different MDAs. Taking into account Lemma 1 and (H4), all the blocks in (4) after the first tend in probability to zero and, therefore, the limit of (4) as n tends to infinity is the same as the limit, for n tending to infinity, of
(5)
where
(a random variable).
Then the limit of (3), as n tends to infinity, is the same as the limit, for n tending to infinity, of
(6)
Using that
are iid, and taking into account (5) and (6), the limit of (3) for n tending to infinity, is the same as
(7)
Since
, and by the Dominated Convergence Theorem, the limit in (7) equals
(8)
By the Fisher-Tippett-Gnedenko theorem and (H4),
(9)
and therefore, if
is trivial,
is deterministic and (8) equals
, with
and part a) of Theorem 1 follows.
On the other hand, if
is random, using (H2), part b) of Theorem 1 follows.
Remark 1:
It is clear that (H2) may be removed, leading in (b) to an integral with respect to the distribution of
instead of a sum.
Remark 2:
It is easy to obtain a similar result with Gumbel or Gumbel mixtures, if we modify (H4) removing part a) (taking
), and with Weibull or Weibull mixtures if we remove parts a) and b) (taking
).
Example 2:
Following the ideas of Example 1, consider
independent, such that
,
,
,
,
,
and
.
Taking
a sequence of independent copies of
it turns out that if U is a fixed random variable such that
,
,
, then if
, we have that
fulfills (H1) and (H2) with
random variables such that
and
Thus, if we assume
and consider two independent sequences
,
,
,
and we set:
a) If
b) If
then, by Example 1 and part b) of Theorem 1,
, with
, a mixture of Fréchet distributions of order
.
Figure 1 shows the difference between a Fréchet model with
(F1) and a mixture with the same
(MF1),
and
It is clear that the tail of MF1, even for moderate values like 5, is smaller than the tail of F1, which means that F1 leads to wrongly pessimistic predictions for high levels. Indeed, Table 1 presents the return times of MF1 and F1 for a series of levels, where the overly pessimistic results of F1 are very clear.
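The effect on return times can be reproduced with a small computation. The sketch below is illustrative only and does not use the parameter values of the paper: it assumes that the mixture has the form $\sum_r p_r \exp(-v_r x^{-\alpha})$, with weights $p_r$ and values $v_r \le 1$, and it uses the usual definition of the return time of a level x as $1 / (1 - G(x))$ for a distribution function G. The values of `alpha`, `values` and `probs` are hypothetical.

```python
import math


def frechet_cdf(x, alpha):
    """Frechet distribution function of order alpha."""
    return math.exp(-x ** (-alpha)) if x > 0 else 0.0


def frechet_mixture_cdf(x, alpha, values, probs):
    """Assumed mixture form: sum over r of probs[r] * exp(-values[r] * x**(-alpha))."""
    if x <= 0:
        return 0.0
    return sum(q * math.exp(-v * x ** (-alpha)) for v, q in zip(values, probs))


def return_time(cdf_value):
    """Mean number of observations between exceedances of a level with cdf value G(x)."""
    return 1.0 / (1.0 - cdf_value)


alpha = 1.0          # hypothetical order parameter
values = (0.3, 0.8)  # hypothetical values taken by the random limiting frequency
probs = (0.5, 0.5)   # hypothetical probabilities of those values

for level in (5.0, 10.0, 20.0, 50.0):
    t_single = return_time(frechet_cdf(level, alpha))
    t_mixture = return_time(frechet_mixture_cdf(level, alpha, values, probs))
    print(level, round(t_single, 1), round(t_mixture, 1))
```

Under these assumptions the single Fréchet model gives shorter return times than the mixture at every level, that is, it predicts high levels to be exceeded more often than the mixture does.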
Figure 1. Fréchet model with
(F1) and a mixture with the same
(MF1),
and
.
Table 1. Return times for Fréchet and Mixture models.
4. Application to Simulated Data
We now simulate data leading to the mixture model of the preceding example and check whether the data fit the mixture model only.
If we consider a distribution function of the form:
(10)
then
, with
which is a slowly varying function and therefore
.
Its inverse function is
(11)
Consider
a random variable such that
are independent and
,
,
,
, with
,
,
.
We shall take
and
.
Consider
,
and
(with
and
) a matrix of independent copies of
. As seen in Example 1, if for each
we define
, where
iid such that
,
(we will take
), then, for each fixed i,
is a sample of a process that fulfills (H1) and (H2).
We consider now two independent iid matrices
(with
and
) and
(with
and
) such that
(the distribution of (10) for
),
(the distribution of (10) for
), and we finally set for each
,
Hence, for each i,
is a sample where Theorem 1 applies, and therefore,
must be close to the distribution
.
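The simulation design described above can be sketched as follows. This is a simplified stand-in rather than the exact experiment: distribution (10) is replaced by pure Pareto laws of orders `alpha_1 < alpha_2`, the state process follows the construction assumed for Example 1 (a regime variable drawn once per row), and all numerical values (`p`, `a`, `b`, the orders, the sample sizes) are hypothetical. Under these assumptions the normalized maxima should be close to the mixture $p\,e^{-a x^{-\alpha_1}} + (1 - p)\,e^{-b x^{-\alpha_1}}$.

```python
import math
import random


def pareto_draw(alpha, rng):
    """Inverse-transform draw from the assumed cdf F(x) = 1 - x**(-alpha), x >= 1."""
    return (1.0 - rng.random()) ** (-1.0 / alpha)


def normalized_maximum(n, p, a, b, alpha_1, alpha_2, rng):
    """One realization of max(X_1..X_n) divided by n**(1/alpha_1)."""
    regime = rng.random() < p            # drawn once per row
    prob_state_1 = a if regime else b    # frequency of the heavy-tailed state
    best = 0.0
    for _ in range(n):
        in_state_1 = rng.random() < prob_state_1
        best = max(best, pareto_draw(alpha_1 if in_state_1 else alpha_2, rng))
    return best / n ** (1.0 / alpha_1)


def limit_mixture_cdf(x, p, a, b, alpha_1):
    """Mixture of Frechet-type terms expected in the limit under the assumptions."""
    if x <= 0:
        return 0.0
    return p * math.exp(-a * x ** (-alpha_1)) + (1.0 - p) * math.exp(-b * x ** (-alpha_1))


rng = random.Random(42)
p, a, b, alpha_1, alpha_2 = 0.5, 0.8, 0.3, 1.0, 3.0  # hypothetical values
maxima = [normalized_maximum(500, p, a, b, alpha_1, alpha_2, rng) for _ in range(2000)]
for x in (0.5, 1.0, 5.0):
    ecdf = sum(m <= x for m in maxima) / len(maxima)
    print(x, round(ecdf, 3), round(limit_mixture_cdf(x, p, a, b, alpha_1), 3))
```

Only the observations in the heavier-tailed state matter asymptotically, because the maxima of the lighter-tailed state grow at a slower rate than the normalizing sequence.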
We will now compute the empirical cumulative distribution function (ecdf) of
and check whether it is better fitted by MF1 or by a simple Fréchet model (F1).
First, Figure 2 displays both models (MF1 and F1) together with the ecdf; afterwards, we perform a Kolmogorov-Smirnov goodness-of-fit test (K-S test) on
with respect to MF1 and F1.
It can be seen that the ecdf is closer to the theoretical MF1 function. The K-S test supports this result, since H0 is retained with respect to MF1 (K-S test statistic = 0.088, critical value = 0.096 for a significance level of 0.05 and n = 200), while it is rejected with respect to F1 (K-S test statistic = 0.177).
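The mechanics of this comparison can be sketched in a few lines. The code below is illustrative and does not reproduce the data of the paper: it draws a sample directly from an assumed mixture $\sum_r p_r \exp(-v_r x^{-\alpha})$ with hypothetical parameters, computes the two-sided K-S statistic against the mixture and against a single Fréchet model, and uses the standard large-sample approximation $1.36 / \sqrt{n}$ for the 5% critical value (about 0.096 for n = 200).

```python
import math
import random

ALPHA = 1.0          # hypothetical order parameter
VALUES = (0.3, 0.8)  # hypothetical mixture values
PROBS = (0.5, 0.5)   # hypothetical mixture weights


def frechet_cdf(x):
    """Single Frechet model of order ALPHA."""
    return math.exp(-x ** (-ALPHA)) if x > 0 else 0.0


def mixture_cdf(x):
    """Assumed mixture model: sum of PROBS[r] * exp(-VALUES[r] * x**(-ALPHA))."""
    if x <= 0:
        return 0.0
    return sum(q * math.exp(-v * x ** (-ALPHA)) for v, q in zip(VALUES, PROBS))


def draw_from_mixture(size, rng):
    """Pick a component, then invert u = exp(-v * x**(-ALPHA))."""
    sample = []
    for _ in range(size):
        v = rng.choices(VALUES, weights=PROBS)[0]
        u = rng.random()
        while u == 0.0:  # avoid log(0)
            u = rng.random()
        sample.append((v / -math.log(u)) ** (1.0 / ALPHA))
    return sample


def ks_statistic(sample, cdf):
    """Two-sided Kolmogorov-Smirnov distance between the ecdf of sample and cdf."""
    xs = sorted(sample)
    n = len(xs)
    return max(max((i + 1) / n - cdf(x), cdf(x) - i / n) for i, x in enumerate(xs))


rng = random.Random(2019)
sample = draw_from_mixture(200, rng)
critical = 1.36 / math.sqrt(len(sample))  # approximate 5% critical value
print("critical value:", round(critical, 3))
print("K-S vs mixture:", round(ks_statistic(sample, mixture_cdf), 3))
print("K-S vs Frechet:", round(ks_statistic(sample, frechet_cdf), 3))
```

A statistic below the critical value means that the corresponding model is retained at the 5% level. The same functions can be applied to any sample of normalized maxima in place of `sample`.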
Figure 2. Theoretical cumulative distribution function of MF1 and F1, and ecdf of
.
5. Discussion & Conclusions
Our main interest is to provide theoretical results that guide practitioners performing extremal analysis of data with a complex dependence structure.
This paper shows that when the maximum of a large sample does not fit an extremal distribution, this may be due to a strong dependence structure, which may be handled by fitting the data to a mixture of extremal distributions.
Indeed, here we deal with the case where the data depend on a covariable Y that may be strongly dependent but is observable. This is a reasonable assumption in many cases.
For future research, we also have to consider situations where Y may be hidden; in a work in progress, we are improving methods for estimating and fitting a mixture of extremal distributions in such cases.
On the other hand, the very simple model used here to represent strongly dependent samples may also be applied to other techniques used in the statistical analysis of extreme values. In a work in progress, strongly dependent structures are considered for Peaks Over a Manifold (POM), an approach that includes the classical Peaks Over Threshold (POT) technique as a particular case [25].
Acknowledgements
Sincere thanks to the participants of the “Jornadas de Estadística Aplicada 2019” in La Paloma-Uruguay, for suggestions on a previous version of these results. We are also grateful to anonymous reviewers for their valuable comments on this paper.