Detecting seasonality changes in multivariate extremes of climatological time series
Philippe Naveau, Laboratoire des Sciences du Climat et de l'Environnement, France
Joint work with Sebastian Engelke (Geneva University) and Chen Zhou (Erasmus University Rotterdam)
Motivation: heavy rainfall in Brittany
[Map of the two study regions (longitude–latitude axes). Daily rainfall from 1976 to 2015.]
[Figure: rainfall (in mm) across the seasons of 1976–1977 at three stations: Brest, Lanvéoc and Quimper.]
Our climatological objectives
Do heavy rainfall dependence structures change from season to season?
Do extreme precipitation dependence structures differ from region to region?
Our statistical objective
Detecting changes in the dependence structure of multivariate time series of extremes
Statistical desiderata
Few assumptions and non-parametric models
Fast, simple and general tools
No complicated MEVT jargon for climatologists
Our tools
Strengthen links between the MEVT (Multivariate Extreme Value Theory) and information theory communities by revisiting the Kullback–Leibler divergence
Detect changes in the dependence structure of precipitation extremes
Why the Kullback–Leibler divergence?
Machine learning: supervised learning = minimizing a KL divergence objective
Proper scoring rules in forecasting: the logarithmic score
Information theory: information gain for comparing two distributions
Causality theory: Bayesian causal conditionals
Dynamical systems: statistical mechanics (Boltzmann thermodynamics)
Climate science: detection & attribution and compound events
Kullback–Leibler divergence
Definition and notation
Let X and Y be two random variables with pdfs f and g:
DKL(X||Y) = Ef log{f(X)} − Ef log{g(X)} and D(X, Y) = DKL(X||Y) + DKL(Y||X)
Properties
DKL(X||Y) ≥ 0
DKL(X||Y) = DKL(X1||Y1) + DKL(X2||Y2) if X1 ⊥ X2 and Y1 ⊥ Y2
DKL is convex
DKL is not a metric. Still, the total variation distance δ(X, Y) satisfies δ(X, Y) ≤ √(DKL(X||Y)/2) (Pinsker's inequality)
Kullback–Leibler divergence
A simple example
If X and Y are two Bernoulli random variables with p = P(X = 1) and q = P(Y = 1), then
D(X, Y) = (p − q) × log{ p(1 − q) / (q(1 − p)) }
[Figure: the Bernoulli Kullback–Leibler divergence as a function of (p, q), and as a function of the probability of success.]
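A quick numerical check of this closed form, as a minimal Python sketch (the values of p and q are arbitrary illustrations):

```python
import numpy as np

def kl_bernoulli(p, q):
    """DKL(X||Y) between Bernoulli(p) and Bernoulli(q)."""
    return p * np.log(p / q) + (1 - p) * np.log((1 - p) / (1 - q))

def sym_kl_bernoulli(p, q):
    """Symmetrized divergence D(X, Y) = DKL(X||Y) + DKL(Y||X)."""
    return kl_bernoulli(p, q) + kl_bernoulli(q, p)

p, q = 0.7, 0.3
closed_form = (p - q) * np.log(p * (1 - q) / (q * (1 - p)))
print(sym_kl_bernoulli(p, q), closed_form)   # both are about 0.678
```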
Kullback–Leibler divergence
A more complicated example
If X and Y are two multinomial random variables with probabilities p1, p2, . . . , pK and q1, q2, . . . , qK, where q1 + · · · + qK = 1 = p1 + · · · + pK, then
D := D(p1, . . . , pK; q1, . . . , qK) = Σ_{j=1}^{K} (pj − qj)(log pj − log qj)
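The same symmetrized formula in code, as a minimal sketch of the population quantity (an empirical counterpart based on estimated cell probabilities appears later):

```python
import numpy as np

def sym_kl_multinomial(p, q):
    """Symmetric KL divergence D(p1,...,pK; q1,...,qK) between two probability vectors."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    return np.sum((p - q) * (np.log(p) - np.log(q)))

print(sym_kl_multinomial([0.5, 0.3, 0.2], [0.2, 0.3, 0.5]))   # about 0.55; zero iff p = q
```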
Kullback–Leibler divergence
Inference
Let X and Y be two random variables with pdfs f and g:
DKL(X||Y) = Ef log{ f(X) / g(X) }
Estimation hurdle
It seems that the densities f and g need to be known to estimate DKL(X||Y). This will be an issue for multivariate extremes.
Univariate case
Univariate regularly varying case
P(X > x) = F̄(x) = x^(−α) L_X(x) and P(Y > x) = Ḡ(x) = x^(−β) L_Y(x),
where x > 0, L_X and L_Y are slowly varying, and α, β > 0 are the tail indices.
Examples: Cauchy, t-distribution, α-stable, Pareto, ...
Exceedances above a high threshold u
Xu = (X/u | X > u) and Yu = (Y/u | Y > u), with respective densities fu and gu on [1, ∞)
Symmetric Kullback–Leibler divergence
D(Xu, Yu) = E log{ fu(Xu) / gu(Xu) } + E log{ gu(Yu) / fu(Yu) }
Extremes for univariate regularly varying functions
lim_{u→∞} D(Xu, Yu) = (α − β)² / (αβ)
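A Monte Carlo sanity check of this limit, as a minimal sketch assuming exact Pareto tails (L_X ≡ L_Y ≡ 1), so that the rescaled exceedances are again standard Pareto; the tail indices below are arbitrary illustrations:

```python
import numpy as np

rng = np.random.default_rng(0)
alpha, beta, n = 2.0, 4.0, 200_000      # illustrative tail indices and sample size

# With exact Pareto tails, (X/u | X > u) is standard Pareto(alpha) on [1, inf),
# so the exceedances can be sampled directly.
xu = rng.pareto(alpha, n) + 1.0
yu = rng.pareto(beta, n) + 1.0

def log_pareto_pdf(z, a):
    """Log of the density a * z^(-a-1) on [1, inf)."""
    return np.log(a) - (a + 1.0) * np.log(z)

# Symmetric KL: E_f[log f/g] + E_g[log g/f], estimated by Monte Carlo.
d_hat = (np.mean(log_pareto_pdf(xu, alpha) - log_pareto_pdf(xu, beta))
         + np.mean(log_pareto_pdf(yu, beta) - log_pareto_pdf(yu, alpha)))

print(d_hat, (alpha - beta) ** 2 / (alpha * beta))   # both close to 0.5
```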
Survival functions versus densities?
PN, Guillou and Rietsch (2014, JRSSB)
The KL divergence has the representation
D(Xu, Yu) = −[ 2 + E log{ Ḡ(uXu) / Ḡ(u) } + E log{ F̄(uYu) / F̄(u) } ] + ∆(u) = L(Xu, Yu) + ∆(u),
where ∆(u) → 0 as u → ∞, under a second-order condition.
Example for the GPD distribution.
Inference only based on cdf's
For two independent samples X(1), . . . , X(n) ∼ F and Y(1), . . . , Y(n) ∼ G,
Ln(fu, gu) = −[ 2 + (1/Nn) Σ_{X(i)>u} log{ Ḡn(X(i)) / Ḡn(u) } + (1/Mn) Σ_{Y(i)>u} log{ F̄n(Y(i)) / F̄n(u) } ],
where F̄n and Ḡn are the empirical survival functions and Nn, Mn are the numbers of exceedances above u in each sample.
PN, Guillou and Rietsch (2014, JRSSB) + Engelke, PN, Zhou (2020+)
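A minimal Python sketch of this survival-function based estimator; the (k + 1)/(n + 1) convention for the empirical survival function (to avoid log 0) and the pooled-quantile threshold are illustrative choices rather than the paper's exact recipe, and exact Pareto samples are used only to check the value against (α − β)²/(αβ):

```python
import numpy as np

def ecsf(sample):
    """Empirical survival function, with a (k + 1)/(n + 1) adjustment so that it
    never returns exactly zero (one common convention; the paper's may differ)."""
    s = np.sort(sample)
    n = len(s)
    return lambda t: (n - np.searchsorted(s, t, side="right") + 1.0) / (n + 1.0)

def L_n(x, y, u):
    """Survival-function based estimate of the symmetric KL divergence between
    the exceedances of x and y above the threshold u."""
    Fbar, Gbar = ecsf(x), ecsf(y)
    x_exc = x[x > u]                 # the N_n exceedances from the X sample
    y_exc = y[y > u]                 # the M_n exceedances from the Y sample
    term_x = np.mean(np.log(Gbar(x_exc) / Gbar(u)))
    term_y = np.mean(np.log(Fbar(y_exc) / Fbar(u)))
    return -(2.0 + term_x + term_y)

rng = np.random.default_rng(1)
x = rng.pareto(2.0, 5000) + 1.0      # tail index alpha = 2
y = rng.pareto(4.0, 5000) + 1.0      # tail index beta = 4
u = np.quantile(np.concatenate([x, y]), 0.95)
print(L_n(x, y, u))                  # roughly (2 - 4)^2 / (2 * 4) = 0.5
```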
Multivariate case
Classical EVT trick
Let X be in R^d. We do not want to define extremes directly in the full multivariate space, so we condition on a one-dimensional summary r(X) in R+.
Definition of the tail region¹
Homogeneity condition of order one: r(tx) = t × r(x), for any scalar t > 0 and x ∈ R^d_+
Examples: r(x) = max(x1, . . . , xd), r(x) = x1 + · · · + xd, r(x) = min(x1, . . . , xd)
1. Dombry and Ribatet (2015, Statistics and its Interface)
Cutting the tail region into smaller regions
Partition: {x ∈ Rd : r(x) > 1} = ∪_{j=1}^{K} Aj
Our main assumption, under such a partitioning
lim_{u→∞} pj(u) := lim_{u→∞} Pr(X ∈ uAj) / Pr(r(X) > u) = pj ∈ (0, 1)
lim_{u→∞} qj(u) := lim_{u→∞} Pr(Y ∈ uAj) / Pr(r(Y) > u) = qj ∈ (0, 1)
A special case with r(x) = max(x1, 0), with X1 = X2 in distribution and Xi > 0
Pr(r(X) > u) = Pr(X1 > u) and Pr(X ∈ uA1) = Pr(X1 > u, X2 > u), with A1 = {min(x1, x2) > 1}
χ, the classical extremal dependence coefficient:
lim_{u→∞} p1(u) := lim_{u→∞} Pr(X2 > u | X1 > u) = χ and lim_{u→∞} p2(u) = 1 − χ
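An empirical, finite-threshold version of p1(u), as a minimal sketch (the helper name chi_hat and the toy samples are illustrative):

```python
import numpy as np

def chi_hat(x1, x2, u):
    """Empirical estimate of p1(u) = Pr(X2 > u | X1 > u), a finite-threshold
    version of the extremal dependence coefficient chi."""
    return np.mean(x2[x1 > u] > u)

rng = np.random.default_rng(2)
z = rng.pareto(1.0, 10_000) + 1.0
w = rng.pareto(1.0, 10_000) + 1.0
u = np.quantile(z, 0.95)
print(chi_hat(z, z, u))   # complete dependence (X1 = X2): close to 1
print(chi_hat(z, w, u))   # independent components: close to Pr(X2 > u), about 0.05
```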
Main objective: two-sample hypothesis testing
Given two independent samples of X in R^d and Y in R^d, we want to test
H0 : pj = qj, for all j,
with lim_{u→∞} Pr(X ∈ uAj) / Pr(r(X) > u) = pj ∈ (0, 1) and lim_{u→∞} Pr(Y ∈ uAj) / Pr(r(Y) > u) = qj ∈ (0, 1)
Back to the multinomial Kullback–Leibler divergence
Reminder: if X and Y are two multinomial random variables with probabilities p1, p2, . . . , pK and q1, q2, . . . , qK, where q1 + · · · + qK = 1 = p1 + · · · + pK, then
D := D(p1, . . . , pK; q1, . . . , qK) = Σ_{j=1}^{K} (pj − qj)(log pj − log qj)
Multinomial Kullback–Leibler divergence
Estimation
D̂(u, v) := Σ_{j=1}^{K} (p̂j(u) − q̂j(v))(log p̂j(u) − log q̂j(v)),
with p̂j(u) = Σ_{i=1}^{n} 1{Xi ∈ uAj} / Σ_{i=1}^{n} 1{r(Xi) > u} and q̂j(v) = Σ_{i=1}^{n} 1{Yi ∈ vAj} / Σ_{i=1}^{n} 1{r(Yi) > v}.
Our main result
Multinomial Kullback–Leibler divergence
Choose two sequences un and vn such that mn = n Pr(r(X) > un) = n Pr(r(Y) > vn) → ∞ and mn/n → 0, and assume some second-order conditions on the convergence of pj(u) to pj (and similarly for qj(u)).
Under H0 : pj = qj,
(mn/2) D̂(un, vn) → χ²(K − 1), as n ↑ ∞.
If pj ≠ qj for some j, then D is positive and
√mn (D̂(un, vn) − D) → N(0, σ²), as n ↑ ∞,
where σ² is an explicit function of the pj and qj.
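A minimal end-to-end sketch of the resulting two-sample test in the bivariate case, under several illustrative assumptions not prescribed by the slides: K = 2 with r(x) = max(x1, x2) and the partition A1 = {min > 1}, A2 = the rest; empirical-quantile thresholds; and a toy common-factor model for the data:

```python
import numpy as np
from scipy.stats import chi2

def region_probs(sample, u):
    """Empirical proportions (p1_hat, p2_hat) of exceedances of r(x) = max(x1, x2)
    falling in u*A1 = {both components > u} and u*A2 = {only one component > u}."""
    x1, x2 = sample[:, 0], sample[:, 1]
    exceed = np.maximum(x1, x2) > u
    both = (x1 > u) & (x2 > u)
    m = int(exceed.sum())
    p1 = both.sum() / m
    return np.array([p1, 1.0 - p1]), m

def kl_two_sample_test(x, y, prob_level=0.95):
    """Test H0: p_j = q_j via the symmetric multinomial KL divergence;
    (m_n / 2) * D_hat is compared with a chi^2(K - 1) distribution.
    Thresholds are per-sample empirical quantiles of r (an illustrative choice)."""
    ux = np.quantile(np.maximum(x[:, 0], x[:, 1]), prob_level)
    uy = np.quantile(np.maximum(y[:, 0], y[:, 1]), prob_level)
    p_hat, mx = region_probs(x, ux)
    q_hat, my = region_probs(y, uy)
    m_n = min(mx, my)                     # crude stand-in for equal exceedance numbers
    d_hat = np.sum((p_hat - q_hat) * (np.log(p_hat) - np.log(q_hat)))
    stat = 0.5 * m_n * d_hat
    return stat, chi2.sf(stat, df=len(p_hat) - 1)

def common_factor_sample(n, rng):
    """Asymptotically dependent toy model: X_i = Z * U_i with Z heavy tailed."""
    z = rng.pareto(1.0, n) + 1.0
    u = rng.uniform(0.0, 1.0, size=(n, 2))
    return z[:, None] * u

rng = np.random.default_rng(3)
x = common_factor_sample(5000, rng)
y = common_factor_sample(5000, rng)       # same dependence structure: H0 holds
print(kl_two_sample_test(x, y))           # under H0 the p-value is roughly uniform, so rarely small
```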
Dealing with marginals and random thresholds
In theory
We have mn = n Pr(r(X) > un) = n Pr(r(Y) > vn), and
p̂j(un) = Σ_{i=1}^{n} 1{Xi ∈ unAj} / Σ_{i=1}^{n} 1{r(Xi) > un} and q̂j(vn) = Σ_{i=1}^{n} 1{Yi ∈ vnAj} / Σ_{i=1}^{n} 1{r(Yi) > vn}
In practice: the distributions of r(X) and r(Y) are unknown
p̂j = (1/mn) Σ_{i=1}^{n} 1{Xi ∈ R^(X)_{n−mn,n} Aj} and q̂j = (1/mn) Σ_{i=1}^{n} 1{Yi ∈ R^(Y)_{n−mn,n} Aj},
where R^(X)_{n−mn,n} and R^(Y)_{n−mn,n} denote rank-based random thresholds (the (n − mn)-th order statistics of r(X1), . . . , r(Xn) and r(Y1), . . . , r(Yn)).
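A minimal sketch of the random-threshold idea only: the per-sample threshold is taken as an order statistic of r, so that exactly mn observations are treated as extreme. Marginal standardization (relevant for Cases A and B below) is not shown, and the partition and toy data are illustrative:

```python
import numpy as np

def rank_based_probs(sample, m_n):
    """Rank-based variant: replace the deterministic threshold u_n by the
    (n - m_n)-th order statistic of r(X_i), so exactly m_n points are extreme
    (illustrative convention, with partition A1 = {min > 1}, A2 = the rest)."""
    r = sample.max(axis=1)                       # r(x) = max(x1, ..., xd)
    threshold = np.sort(r)[len(r) - m_n - 1]     # data-driven, random threshold
    extreme = sample[r > threshold]              # the m_n largest observations
    p1 = np.mean(extreme.min(axis=1) > threshold)
    return np.array([p1, 1.0 - p1])

rng = np.random.default_rng(4)
z = rng.pareto(1.0, 2000) + 1.0
sample = z[:, None] * rng.uniform(0.0, 1.0, size=(2000, 2))   # common-factor toy model
print(rank_based_probs(sample, m_n=100))
```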
Effect of the marginals and number of sets
[Figure, known versus unknown margins: type-1 error of the test under weak and strong dependence, and power of the test, at a significance level of 5%, as a function of the number of sets K (from 2 to 9).]
Dealing with marginals and random thresholds
Case A (rainfall): unknown marginals with the same RV tail index
Under some second-order RV conditions on r(X) and r(Y), we still have (mn/2) D̂(un, vn) → χ²(K − 1) under H0
Is there seasonality in the dependence of Brittany rainfall extremes?
[Figure: KL divergence estimates versus the number of excesses (200, 150, 100, 50) for each pair of seasons in Brittany: Spring vs Winter, Summer vs Winter, Fall vs Winter, Fall vs Spring, Fall vs Summer, Summer vs Spring.]
Is there seasonality in the dependence of Eastern rainfall extremes?
[Figure: KL divergence estimates versus the number of excesses (200, 150, 100) for each pair of seasons in the East: Spring vs Winter, Summer vs Winter, Fall vs Winter, Fall vs Spring, Fall vs Summer, Summer vs Spring.]
Dealing with marginals and random thresholds
Case B (compound events): unknown marginals with different tails
For d = 2, r(x) = min(x1, x2) and under some second-order AI² (asymptotic independence) condition with η < 1, we still have (mn/2) D̂(un, vn) → χ²(K − 1) under H0.
For η = 1 (AD, asymptotic dependence), a more complex limiting result holds.
2. Ledford and Tawn (1996) and Draisma et al. (2004), with transformed standard Pareto marginals and corresponding partition Aj