  1. Asymptotic Robustness of Estimators in Rare-Event Simulation. P. L'Ecuyer, Université de Montréal, Canada; J. H. Blanchet, Columbia University, USA; P. W. Glynn, Stanford University, USA; B. Tuffin, IRISA, Rennes, France. RESIM 2008, Rennes.

  2. Outline ◮ Rare-event setting and motivation. ◮ Asymptotic robustness properties: Definitions. ◮ Markov chain model and zero-variance approximation. ◮ Examples: Highly-reliable Markovian systems. ◮ Examples: Random walks.

  5. Rare-event setting. We want to estimate a small quantity γ = γ(ε) > 0, where γ(ε) → 0 when ε → 0. We have an unbiased estimator (r.v.) Y = Y(ε) ≥ 0. Example: Y(ε) is an indicator function with P[Y(ε) = 1] = γ(ε). Then the relative variance (squared relative error) blows up: Var[Y]/γ²(ε) = (1 − γ(ε))/γ(ε) ≈ 1/γ(ε) → ∞ when ε → 0. Standard Monte Carlo (MC): estimate γ by Ȳ_n, the average of n i.i.d. copies of Y. For a meaningful estimate, we need n = O(1/γ(ε)). If γ = 10^{−10}, for example, we need n ≈ 10^{14} for 1% relative error. Two popular cases: γ(ε) ≈ e^{−η/ε} and γ(ε) ≈ poly(ε).
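The sample-size requirement above can be checked with a minimal Python sketch (the function names are our own, not from the talk), using the per-sample relative variance (1 − γ)/γ of an indicator estimator:

```python
def indicator_relative_variance(gamma):
    """Relative variance Var[Y]/gamma^2 = (1 - gamma)/gamma of an
    indicator estimator with P[Y = 1] = gamma."""
    return (1.0 - gamma) / gamma

def samples_for_relative_error(gamma, target_re):
    """n such that the relative error of the mean of n i.i.d. copies,
    sqrt((1 - gamma)/gamma) / sqrt(n), is at most target_re."""
    return indicator_relative_variance(gamma) / target_re ** 2

# As on the slide: gamma = 1e-10 and a 1% relative error require n ~ 1e14.
print(samples_for_relative_error(1e-10, 0.01))
```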

  6. Some applications where this type of problem happens ◮ Expected amount of radiation that crosses a given protection shield. ◮ Probability of a large loss from an investment portfolio. ◮ Value-at-risk (quantile estimation). ◮ Ruin probability for an insurance firm. ◮ Probability that the completion time of a large project exceeds a given threshold. ◮ Probability of buffer overflow, or mean time to overflow, in a queueing system. ◮ Proportion of packets lost in a communication system. ◮ Air traffic control. ◮ Mean time to failure or other reliability or availability measure for a highly reliable system (e.g., fault-tolerant computers, safety systems).

  8. Major techniques to improve this type of situation: importance sampling (IS) and splitting. Commonly-used robustness characterizations of Y(ε): It has bounded relative error (BRE) (bounded relative variance) if lim_{ε→0} Var[Y(ε)]/γ²(ε) < ∞. It is logarithmically efficient (LE), or asymptotically optimal, if lim_{ε→0} ln E[Y²(ε)] / (2 ln γ(ε)) = 1. This means (roughly) that if γ(ε) → 0 at an exponential rate, then the standard deviation converges at least at the same exponential rate.
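As an illustration of IS (an example of ours, not from the talk): take γ(t) = P[X ≥ t] = e^{−t} for X ~ Exp(1), and sample instead from an Exp(λ) density with the heuristic twisting choice λ = 1/t. The likelihood-ratio estimator then keeps a moderate relative error even when γ is tiny, in contrast with the 1/√γ blow-up of crude MC:

```python
import math
import random

def is_tail_estimate(t, lam, n, rng):
    """IS estimate of gamma = P[X >= t] = exp(-t), X ~ Exp(1):
    sample X ~ Exp(lam) and weight by the likelihood ratio
    f(x)/g(x) = exp(-x) / (lam * exp(-lam * x)) on {x >= t}."""
    total = total_sq = 0.0
    for _ in range(n):
        x = rng.expovariate(lam)
        w = math.exp(-(1.0 - lam) * x) / lam if x >= t else 0.0
        total += w
        total_sq += w * w
    mean = total / n
    var = max(total_sq / n - mean * mean, 0.0)
    return mean, math.sqrt(var)

rng = random.Random(12345)
for t in (5.0, 10.0, 20.0):
    gamma = math.exp(-t)
    mean, std = is_tail_estimate(t, lam=1.0 / t, n=200_000, rng=rng)
    # Per-sample relative error stays moderate even though gamma is tiny.
    print(t, mean / gamma, std / gamma)
```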

  9. Are there other useful characterizations? To get a meaningful variance estimator when ε → 0, we would like to control the relative error of the empirical variance. This involves the fourth moment of Y(ε). If we use a CLT, we may want to show that the quality of the normal approximation remains good when ε → 0, by bounding the Berry-Esseen bound on the approximation error. This involves the third moment. In some settings, we may be interested in bounding the relative moment of order 2 + δ for some small δ > 0. And so on.

  12. BRM-k. For k ∈ [1, ∞), the relative moment of order k for Y(ε) is m_k(ε) = E[Y^k(ε)] / γ^k(ε). For any fixed ε, m_k(ε) is nondecreasing in k (from Jensen's inequality). Y(ε) has a bounded relative moment of order k (BRM-k) if lim sup_{ε→0} m_k(ε) < ∞. Some properties: (i) BRE is equivalent to BRM-2. (ii) BRM-k implies BRM-k′ for 1 ≤ k′ < k. (iii) For positive real numbers k, ℓ, m, if Y(ε) = X^ℓ(ε) is BRM-mk, then Y′(ε) = X^{mℓ}(ε) is BRM-k.
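The Jensen monotonicity of m_k in k can be sanity-checked numerically; a small sketch of ours, with an Exp(1) sample standing in for Y(ε):

```python
import random

def relative_moment(sample, k):
    """Empirical m_k = E[Y^k] / (E[Y])^k."""
    n = len(sample)
    mean = sum(sample) / n
    return sum(y ** k for y in sample) / n / mean ** k

rng = random.Random(0)
sample = [rng.expovariate(1.0) for _ in range(50_000)]
ms = [relative_moment(sample, k) for k in (1.0, 1.5, 2.0, 3.0)]
# m_1 = 1 by construction; the sequence is nondecreasing in k,
# as Jensen's inequality predicts for the exact moments.
print(ms)
```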

  13. LE-k. Y(ε) has logarithmic efficiency of order k (LE-k) if lim_{ε→0} ln E[Y^k(ε)] / (k ln γ(ε)) = 1.

  15. Example: γ(ε) = exp[−η/ε], E[Y^k(ε)] = q(1/ε) exp[−kη/ε] for some constant η > 0 and a slowly increasing function q (e.g., a polynomial). Then we have LE-k but not BRM-k. Example: γ^k(ε) = q₁(ε) = ε^{t₁} + o(ε^{t₁}) and E[Y^k(ε)] = q₂(ε) = ε^{t₂} + o(ε^{t₂}), where t₂ ≤ t₁. Here, lim_{ε→0} ln E[Y^k(ε)] / (k ln γ(ε)) = t₂/t₁. We have BRM-k iff t₂ = t₁ iff LE-k.
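The t₂/t₁ limit in the polynomial example can be checked with a toy calculation (our own code; the o(ε^{t₂}) term is chosen concretely as ε^{t₂+1}):

```python
import math

def le_ratio(eps, k, t1, t2):
    """ln E[Y^k] / (k ln gamma) when gamma^k = eps^t1 and
    E[Y^k] = eps^t2 + eps^(t2 + 1) (a concrete o(eps^t2) term)."""
    log_gamma = (t1 / k) * math.log(eps)
    log_second = math.log(eps ** t2 + eps ** (t2 + 1))
    return log_second / (k * log_gamma)

for eps in (1e-2, 1e-4, 1e-8):
    print(eps, le_ratio(eps, k=2, t1=4.0, t2=3.0))  # tends to t2/t1 = 0.75
```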

  16. Bounded normal approximation. Berry-Esseen theorem (one version): let Y₁, …, Y_n be i.i.d. r.v.'s with E[Y₁] = 0, Var[Y₁] = σ², E[|Y₁|³] = β₃. Let Ȳ_n and S²_n be the empirical mean and variance, and F_n the distribution function of √n Ȳ_n / S_n. Then there is an absolute constant a < ∞ such that sup_{x ∈ R} |F_n(x) − Φ(x)| ≤ a β₃ / (σ³ √n) for all n ≥ 2. Y(ε) is said to have bounded normal approximation (BNA) if lim sup_{ε→0} E[|Y(ε) − γ(ε)|³] / σ³(ε) < ∞ (Tuffin 1999). This requires that the Berry-Esseen bound remains O(n^{−1/2}) when ε → 0.
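For the crude indicator estimator, the Berry-Esseen ratio β₃/σ³ is available in closed form, and it shows that crude MC does not have BNA (a small check of ours, not from the talk):

```python
def berry_esseen_ratio(gamma):
    """beta_3 / sigma^3 for an indicator Y with P[Y = 1] = gamma:
    beta_3 = E|Y - gamma|^3 = gamma (1-gamma)^3 + (1-gamma) gamma^3,
    sigma^3 = (gamma (1-gamma))^(3/2)."""
    beta3 = gamma * (1 - gamma) ** 3 + (1 - gamma) * gamma ** 3
    sigma3 = (gamma * (1 - gamma)) ** 1.5
    return beta3 / sigma3

# The ratio grows like gamma**(-1/2), so the Berry-Esseen bound degrades
# as the event gets rarer.
for g in (1e-2, 1e-4, 1e-6):
    print(g, berry_esseen_ratio(g))
```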

  17. BNA is not equivalent to BRM-3, because we divide by σ³(ε) here. One can have BNA and not BRM-3, or vice versa. There are more general versions of the Berry-Esseen inequality that require only a bounded moment of order 2 + δ, for any δ ∈ (0, 1], instead of the third moment β₃; see, e.g., Petrov (1995). But then the bound on |F_n(x) − Φ(x)| is only O(n^{−δ/2}) instead of O(n^{−1/2}).

  19. Robustness of the empirical variance. Let X₁(ε), …, X_n(ε) be an i.i.d. sample of X(ε). We consider the empirical variance Y(ε) = S²_n(ε) as an estimator of the variance σ²(ε). Let γ(ε) = E[X(ε)]. BRM-4 for X(ε) and BRE for S²_n(ε) are both linked to E[X⁴(ε)], but they are not equivalent. We have Var[S²_n(ε)] / σ⁴(ε) = Θ( E[(X(ε) − γ(ε))⁴] / σ⁴(ε) ), which differs in general from Θ( E[X⁴(ε)] / γ⁴(ε) ). Proposition: if σ²(ε) = Θ(γ²(ε)), then BRM-2k for X(ε) implies BRM-k for S²_n(ε), for any k ≥ 1. A similar observation applies to the equivalence between LE-4 for X(ε) and LE for S²_n(ε).
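The relative variance of S²_n can be computed explicitly from the standard i.i.d. identity for the variance of the sample variance; a sketch of ours, specialized to the indicator case:

```python
def rel_var_sample_variance(mu4, sigma2, n):
    """Var[S_n^2] / sigma^4, using the i.i.d. identity
    Var[S_n^2] = (mu4 - sigma^4 (n - 3)/(n - 1)) / n,
    where mu4 = E[(X - E X)^4] is the centered fourth moment."""
    sigma4 = sigma2 ** 2
    return (mu4 - sigma4 * (n - 3) / (n - 1)) / (n * sigma4)

def indicator_moments(gamma):
    """sigma^2 and mu4 for an indicator with P[X = 1] = gamma."""
    sigma2 = gamma * (1 - gamma)
    mu4 = gamma * (1 - gamma) ** 4 + (1 - gamma) * gamma ** 4
    return sigma2, mu4

# For fixed n, the relative variance of S_n^2 blows up as gamma -> 0,
# roughly like 1/(n * gamma).
for g in (1e-2, 1e-4, 1e-6):
    sigma2, mu4 = indicator_moments(g)
    print(g, rel_var_sample_variance(mu4, sigma2, n=1000))
```

As a consistency check, for a normal sample (μ₄ = 3σ⁴) the identity reduces to the familiar Var[S²_n] = 2σ⁴/(n − 1).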

  22. Vanishing relative centered moments. The relative centered moment of order k, for Y(ε), is c_k(ε) = E[|Y(ε) − γ(ε)|^k] / γ^k(ε). Y(ε) has a vanishing relative centered moment of order k (VRCM-k) if lim sup_{ε→0} c_k(ε) = 0. This holds if and only if lim sup_{ε→0} m_k(ε) = 1. It has vanishing relative variance, or vanishing relative error (VRE), if lim sup_{ε→0} σ(ε)/γ(ε) = 0. VRCM-k implies VRCM-k′ for 1 ≤ k′ ≤ k. It also implies BRM-k.

  23. VRCM implies convergence to the zero-variance IS. Suppose γ(ε) = E_{P_ε}[Y(ε)] = ∫_Ω Y(ε, ω) dP_ε(ω). Here, we can get zero variance (in principle) by doing importance sampling with the measure Q*_ε defined by dQ*_ε(ω) = (Y(ε, ω)/γ(ε)) dP_ε(ω). Proposition: If Y(ε) is VRCM-(1 + δ) for some δ > 0, then lim_{ε→0} sup_A |P_ε(A) − Q*_ε(A)| = 0.
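The zero-variance change of measure dQ*_ε = (Y/γ) dP_ε is easy to exhibit on a finite sample space; a toy sketch (names and example are our own) where Y is a rare-event indicator:

```python
def zero_variance_measure(p, y):
    """Given P = {outcome: prob} and Y(outcome) >= 0, return
    gamma = E_P[Y] and Q*(outcome) = y * p / gamma."""
    gamma = sum(p[w] * y[w] for w in p)
    q = {w: p[w] * y[w] / gamma for w in p}
    return gamma, q

p = {"fail": 1e-6, "ok": 1.0 - 1e-6}
y = {"fail": 1.0, "ok": 0.0}
gamma, q = zero_variance_measure(p, y)
# Under Q*, the IS estimator Y * dP/dQ* equals gamma on every outcome
# with q > 0, i.e., it has zero variance.
estimates = {w: y[w] * p[w] / q[w] for w in q if q[w] > 0}
print(gamma, q, estimates)
```

Note that here Q* is far from P (it puts all mass on the rare outcome), which is consistent with the proposition: an indicator Y is not VRCM, since Y/γ is far from 1.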

  24. Proof: For any measurable set A, sup_A |Q*_ε(A) − P_ε(A)| ≤ sup_A |E_{P_ε}[(dQ*_ε/dP_ε) I(A)] − E_{P_ε}[I(A)]| ≤ E_{P_ε}[|dQ*_ε/dP_ε − 1|] ≤ E_{P_ε}^{1/(1+δ)}[|dQ*_ε/dP_ε − 1|^{1+δ}] = E_{P_ε}^{1/(1+δ)}[|Y(ε)/γ(ε) − 1|^{1+δ}] = [c_{1+δ}(ε)]^{1/(1+δ)} → 0 as ε → 0.
