Concentration of risk measures: A Wasserstein distance approach
Prashanth L. A. (IIT Madras)
Joint work with Sanjay P. Bhat (TCS Research)
To appear in the proceedings of NeurIPS-2019.
Introduction
Risk criteria
• Conditional Value-at-Risk (Rockafellar, Uryasev 2000)
• Spectral risk measures (Acerbi 2002)
• Cumulative prospect theory (Tversky, Kahneman 1992)
Open Question
Given i.i.d. samples and an empirical version of the risk measure, for a distribution with unbounded support, obtain concentration bounds for each of the three risk measures.
Idea: Use finite sample bounds for Wasserstein distance between empirical and true distributions.
Empirical risk concentration: summary of contributions
Goal: Bound $P[\,|\hat{r}_n - r(X)| > \epsilon\,]$, where $\hat{r}_n$ → empirical risk using n i.i.d. samples, $r(X)$ → true risk

Risk measure                  Bounded support                 Sub-Gaussian
Conditional Value-at-Risk     [Brown et al.], [Gao et al.]    Our work
Spectral risk measures        Our work                        Our work
Cumulative prospect theory    [Cheng et al. 2018]             Our work

Unified approach: For each bound, the estimation error is related to Wasserstein distance between empirical and true distributions [1].
[1] N. Fournier and A. Guillin. On the rate of convergence in Wasserstein distance of the empirical measure. Probability Theory and Related Fields, 2015.
Wasserstein Distance
The Wasserstein distance between two CDFs $F_1$ and $F_2$ on $\mathbb{R}$ is
$W_1(F_1, F_2) = \inf \left[ \int_{\mathbb{R}^2} |x - y| \, dF(x, y) \right]$,
where the infimum is over all joint distributions having marginals $F_1$ and $F_2$.
Related to the Kantorovich mass transference problem:
• Ship masses around so that the initial mass distribution $F_1$ changes into $F_2$
• Shipping plan: given by joint distribution $F$ with marginals $F_1$ and $F_2$ such that the amount of mass shipped from a neighborhood $dx$ of $x$ to the neighborhood $dy$ of $y$ is proportional to $dF(x, y)$
• The integral above is then the total transportation distance under the shipping plan $F$
• Wasserstein distance between $F_1$ and $F_2$ is the transportation distance under the optimal shipping plan
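As a rough illustration of the shipping-plan picture, the sketch below computes $W_1$ between two empirical distributions on the real line. It assumes both samples have the same size and that every sample point carries mass 1/n; under that assumption the optimal plan matches the i-th smallest point of one sample to the i-th smallest point of the other. The function name `wasserstein_1d` is illustrative and not from the talk.

```python
def wasserstein_1d(xs, ys):
    """W1 distance between two empirical distributions on the real line.

    Assumption (not from the slides): both samples have the same size n and
    each point carries mass 1/n, so the optimal shipping plan pairs the
    i-th smallest x with the i-th smallest y.
    """
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equally sized samples")
    xs_sorted, ys_sorted = sorted(xs), sorted(ys)
    # Average transport distance under the sorted (monotone) matching.
    total = sum(abs(x - y) for x, y in zip(xs_sorted, ys_sorted))
    return total / len(xs_sorted)


# Shifting every point by 5 moves each unit of mass a distance of 5.
shifted = wasserstein_1d([0.0, 1.0, 3.0], [5.0, 6.0, 8.0])
# Reordering the same points changes nothing: the distribution is identical.
same = wasserstein_1d([1.0, 2.0, 3.0], [3.0, 1.0, 2.0])
```

Unequal sample sizes or weighted samples need the general formula in terms of the two CDFs, which this sketch does not cover.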
Wasserstein Distance: Concentration Bounds
$X$ → r.v. with CDF $F$, $F_n$ → empirical CDF formed using n i.i.d. samples. Then [2], for any $\epsilon > 0$,
$P(W_1(F_n, F) > \epsilon) \le B(n, \epsilon)$.
Exponential moment bound: If $\exists\, \beta > 1$ and $\gamma > 0$ such that $E\left(\exp\left(\gamma |X - E(X)|^{\beta}\right)\right) < \top < \infty$, then
$B(n, \epsilon) = C \left( \exp(-c n \epsilon^2) \, I\{\epsilon \le 1\} + \exp(-c n \epsilon^{\beta}) \, I\{\epsilon > 1\} \right)$.
Higher moment bound: If $\exists\, \beta > 2$ such that $E\left(|X - E(X)|^{\beta}\right) < \top < \infty$, then, for any $\eta \in (0, \beta)$,
$B(n, \epsilon) = C \left( \exp(-c n \epsilon^2) \, I\{\epsilon \le 1\} + n (n \epsilon)^{-(\beta - \eta)/p} \, I\{\epsilon > 1\} \right)$.
[2] N. Fournier and A. Guillin. On the rate of convergence in Wasserstein distance of the empirical measure. Probability Theory and Related Fields, 2015.
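For $\epsilon \le 1$, both forms of $B(n, \epsilon)$ reduce to $C \exp(-c n \epsilon^2)$. A short worked step shows what this implies for the number of samples. The confidence parameter $\delta \in (0, 1)$ is introduced here for illustration and is not part of the slides; $C$ and $c$ are the unspecified constants of the bound.

```latex
C \exp(-c\, n\, \epsilon^2) \le \delta
\quad\Longleftrightarrow\quad
n \ge \frac{1}{c\, \epsilon^2} \ln\frac{C}{\delta},
\qquad 0 < \epsilon \le 1 .
```

So, in this regime, on the order of $1/\epsilon^2$ samples suffice for $W_1(F_n, F) \le \epsilon$ to hold with probability at least $1 - \delta$.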
Conditional Value-at-Risk
VaR and CVaR are Risk-Sensitive Metrics
• Widely used in financial portfolio optimization, credit risk assessment and insurance
• Let $X$ be a continuous random variable
• Fix a 'risk level' $\alpha \in (0, 1)$ (say $\alpha = 0.95$)
Value at Risk: $v_\alpha(X) = F_X^{-1}(\alpha)$
Conditional Value at Risk: $c_\alpha(X) = E[X \mid X > v_\alpha(X)] = v_\alpha(X) + \frac{1}{1 - \alpha} E[X - v_\alpha(X)]^+$
Defining CVaR
Value at Risk: $v_\alpha(X) = F_X^{-1}(\alpha)$
Conditional Value at Risk: $c_\alpha(X) = E[X \mid X > v_\alpha(X)] = v_\alpha(X) + \frac{1}{1 - \alpha} E[X - v_\alpha(X)]^+$
For a general r.v. $X$,
$c_\alpha(X) = \inf_{\xi} \left\{ \xi + \frac{1}{1 - \alpha} E(X - \xi)^+ \right\}$, where $(y)^+ = \max(y, 0)$
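The infimum formula can be tried out on a finite sample. The sketch below takes the expectation under the empirical distribution (an assumption made for illustration) and minimises over $\xi$. Because the objective is convex and piecewise linear in $\xi$ with kinks only at the sample points, it is enough to evaluate it at the samples themselves. The function names are illustrative.

```python
def positive_part(y):
    """(y)^+ = max(y, 0)."""
    return max(y, 0.0)


def cvar_objective(samples, xi, alpha):
    """xi + E[(X - xi)^+] / (1 - alpha), with E taken over the sample."""
    n = len(samples)
    expected_excess = sum(positive_part(x - xi) for x in samples) / n
    return xi + expected_excess / (1.0 - alpha)


def cvar_by_minimisation(samples, alpha):
    """Minimise the objective over xi.

    The objective is convex and piecewise linear in xi, with kinks at the
    sample points, so a minimiser can always be found among the samples.
    """
    return min(cvar_objective(samples, xi, alpha) for xi in samples)


losses = [float(k) for k in range(1, 11)]  # hypothetical losses 1, 2, ..., 10
cvar_80 = cvar_by_minimisation(losses, alpha=0.8)
```

For these ten losses and $\alpha = 0.8$, the minimum is attained at $\xi = 8$ (and also at $\xi = 9$) and equals 9.5 up to floating-point rounding, which is the average of the two largest losses.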
CVaR is a Coherent Risk Metric [3]
• Monotonicity: If $X \le Y$, then $c(X) \le c(Y)$
• Sub-additivity: $c(X + Y) \le c(X) + c(Y)$, i.e., diversification cannot lead to increased risk.
• Positive Homogeneity: $c(\lambda X) = \lambda c(X)$ for any $\lambda \ge 0$.
• Translation Invariance: For deterministic $a > 0$, $c(X + a) = c(X) + a$.
Note: VaR is not sub-additive
[3] P. Artzner et al. "Coherent measures of risk." Mathematical Finance 9.3 (1999).
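The remark that VaR is not sub-additive can be checked on a small discrete example. The numbers below are hypothetical and not from the talk: two independent loans, each losing 100 with probability 0.04 and nothing otherwise. Distributions are represented as `{value: probability}` dictionaries, and VaR is computed as $\inf\{x : F(x) \ge \alpha\}$.

```python
from itertools import product


def value_at_risk(dist, alpha):
    """inf{x : F(x) >= alpha} for a discrete distribution {value: probability}."""
    cumulative = 0.0
    for value in sorted(dist):
        cumulative += dist[value]
        if cumulative >= alpha:
            return value
    return max(dist)  # guard against rounding in the probabilities


def sum_of_independent(dist_x, dist_y):
    """Distribution of X + Y when X and Y are independent."""
    total = {}
    for (x, px), (y, py) in product(dist_x.items(), dist_y.items()):
        total[x + y] = total.get(x + y, 0.0) + px * py
    return total


# Hypothetical loan: loss of 100 with probability 0.04, otherwise no loss.
loan = {0.0: 0.96, 100.0: 0.04}
portfolio = sum_of_independent(loan, loan)

var_single = value_at_risk(loan, 0.95)          # 0.96 >= 0.95, so VaR is 0
var_portfolio = value_at_risk(portfolio, 0.95)  # P(no default) = 0.9216 < 0.95
```

Each loan alone has a 0.95-level VaR of 0, yet the two-loan portfolio has a VaR of 100, so the VaR of the sum exceeds the sum of the VaRs.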
Examples
1. Exponential Case: Suppose $X \sim \mathrm{Exp}(\mu)$
• $v_\alpha(X) = \frac{1}{\mu} \ln\left(\frac{1}{1 - \alpha}\right)$
• $c_\alpha(X) = v_\alpha(X) + \frac{1}{\mu}$ (memoryless!)
2. Gaussian Case: Suppose $X \sim N(\mu, \sigma^2)$
• $v_\alpha(X) = \mu - \sigma Q^{-1}(\alpha)$
• $c_\alpha(X) = \mu + \sigma c_\alpha(Z)$, $Z \sim N(0, 1)$
For these distributions, no separate CVaR estimate is necessary: estimating $\mu$ and $\sigma$ would do.
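The closed forms for the two examples can be evaluated directly. The sketch below makes two assumptions that are not stated on the slide: $\mathrm{Exp}(\mu)$ is parameterised by its rate $\mu$ (mean $1/\mu$), and the standard normal CVaR is computed with the standard identity $c_\alpha(Z) = \phi(z_\alpha)/(1 - \alpha)$, where $z_\alpha = \Phi^{-1}(\alpha)$. The function names are illustrative.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def exp_var(mu, alpha):
    """VaR of Exp(mu), with mu the rate: (1/mu) * ln(1 / (1 - alpha))."""
    return math.log(1.0 / (1.0 - alpha)) / mu


def exp_cvar(mu, alpha):
    """CVaR of Exp(mu): by memorylessness, VaR plus the mean 1/mu."""
    return exp_var(mu, alpha) + 1.0 / mu


def gaussian_var(mu, sigma, alpha):
    """mu + sigma * Phi^{-1}(alpha), which equals mu - sigma * Q^{-1}(alpha)."""
    return mu + sigma * STD_NORMAL.inv_cdf(alpha)


def std_normal_cvar(alpha):
    """E[Z | Z > z_alpha] = phi(z_alpha) / (1 - alpha) for Z ~ N(0, 1)."""
    z_alpha = STD_NORMAL.inv_cdf(alpha)
    return STD_NORMAL.pdf(z_alpha) / (1.0 - alpha)


def gaussian_cvar(mu, sigma, alpha):
    """CVaR of N(mu, sigma^2): mu + sigma * c_alpha(Z)."""
    return mu + sigma * std_normal_cvar(alpha)
```

With plug-in estimates of $\mu$ and $\sigma$ substituted for the true parameters, these functions give parametric VaR and CVaR estimates for the two families.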
CVaR estimation: The problem
Problem: Given i.i.d. samples $X_1, \ldots, X_n$ from the distribution $F$ of r.v. $X$, estimate
$c_\alpha(X) = E[X \mid X > v_\alpha(X)]$
Nice to have: Sample complexity $O(1/\epsilon^2)$ for accuracy $\epsilon$
Empirical distribution function (EDF): Given samples $X_1, \ldots, X_n$ from distribution $F$,
$\hat{F}_n(x) = \frac{1}{n} \sum_{i=1}^{n} I\{X_i \le x\}, \quad x \in \mathbb{R}$
Using EDF and the order statistics $X_{[1]} \le X_{[2]} \le \ldots \le X_{[n]}$, form the following estimates [4]:
VaR estimate: $\hat{v}_{n,\alpha} = \inf\{x : \hat{F}_n(x) \ge \alpha\} = X_{[\lceil n\alpha \rceil]}$.
CVaR estimate: $\hat{c}_{n,\alpha} = \hat{v}_{n,\alpha} + \frac{1}{n(1 - \alpha)} \sum_{i=1}^{n} (X_i - \hat{v}_{n,\alpha})^+$
[4] Serfling, R. J. (2009). Approximation theorems of mathematical statistics, volume 162. John Wiley & Sons.
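The two estimators translate almost line for line into code. The following is a minimal sketch using only the standard library; the function names are illustrative, and the order statistic $X_{[\lceil n\alpha \rceil]}$ is read with 1-based indexing as on the slide.

```python
import math


def empirical_var(samples, alpha):
    """VaR estimate: the ceil(n * alpha)-th order statistic (1-indexed)."""
    ordered = sorted(samples)
    k = math.ceil(len(ordered) * alpha)
    # max(k, 1) keeps the index valid if alpha is extremely close to 0.
    return ordered[max(k, 1) - 1]


def empirical_cvar(samples, alpha):
    """CVaR estimate: VaR estimate plus the scaled average excess over it."""
    n = len(samples)
    v_hat = empirical_var(samples, alpha)
    excess = sum(max(x - v_hat, 0.0) for x in samples)
    return v_hat + excess / (n * (1.0 - alpha))


losses = [float(k) for k in range(1, 11)]  # hypothetical losses 1, 2, ..., 10
v_hat = empirical_var(losses, alpha=0.5)   # 5th order statistic
c_hat = empirical_cvar(losses, alpha=0.5)  # average of the losses above v_hat
```

Here `v_hat` is 5.0 and `c_hat` is 8.0, the mean of the five losses above the VaR estimate. Note that `n * alpha` is computed in floating point, so for some combinations of n and α the ceiling may land one index off; an exact rational α would avoid this.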
Concentration bounds for CVaR Estimation
• Need to put some restrictions on the tail distribution to obtain exponential concentration
• Our assumptions:
(C1) $X$ satisfies an exponential moment bound, i.e., $\exists\, \beta > 0$ and $\gamma > 0$ s.t. $E\left(\exp\left(\gamma |X - \mu|^{\beta}\right)\right) < \top < \infty$, where $\mu = E(X)$
or
(C2) $X$ satisfies a higher-moment bound, i.e., $\exists\, \beta > 0$ such that $E\left(|X - \mu|^{\beta}\right) < \top < \infty$
Sub-Gaussian r.v.s satisfy (C1), while sub-exponential r.v.s satisfy (C2)
A random variable $X$ is sub-Gaussian if $\exists\, \sigma > 0$ s.t.
$E\left[e^{\lambda X}\right] \le e^{\sigma^2 \lambda^2 / 2}, \quad \forall \lambda \in \mathbb{R}$.
Or equivalently, letting $Z \sim N(0, \sigma^2)$,
$P[X > \epsilon] \le c \, P[Z > \epsilon], \quad \forall \epsilon > 0$.
Tail dominated by a Gaussian
A random variable $X$ is sub-exponential if its tail is dominated by that of an exponential r.v.