Confidence Bands for Distribution Functions: The Law of the Iterated Logarithm and Shape Constraints

Lutz Duembgen (Bern), Jon A. Wellner (Seattle), Petro Kolesnyk (Bern), Ralf Wilke (Copenhagen)

November 2014
I. The LIL for Brownian Motion and Bridge
II. A General LIL for Sub-Exponential Processes
III. Implications for the Uniform Empirical Process
    III.1 Goodness-of-Fit Tests
    III.2 Confidence Bands
IV. Bi-Log-Concave Distribution Functions
V. Bi-Log-Concave Binary Regression
I. The LIL for Brownian Motion and Bridge

Standard Brownian motion $W = (W(t))_{t \ge 0}$.

LIL for BM:
$$\limsup_{t \downarrow 0} \frac{\pm W(t)}{\sqrt{2t \log\log(t^{-1})}} = 1 \quad \text{a.s.},$$
$$\limsup_{t \uparrow \infty} \frac{\pm W(t)}{\sqrt{2t \log\log(t)}} = 1 \quad \text{a.s.}$$
Refined half of LIL for BM: For any constant $\nu > 3/2$,
$$\lim_{t \to \{0,\infty\}} \Bigl[ \frac{W(t)^2}{2t} - \log\log(t + t^{-1}) - \nu \log\log\log(t + t^{-1}) \Bigr] = -\infty \quad \text{a.s.}$$
Reformulation for standard Brownian bridge $U = (U(t))_{t \in (0,1)}$:
$$(0,1) \ni t \mapsto \operatorname{logit}(t) := \log\Bigl(\frac{t}{1-t}\Bigr) \in \mathbb{R}, \qquad \mathbb{R} \ni x \mapsto \ell(x) := \frac{e^x}{1 + e^x} \in (0,1).$$
Refined half of LIL for BB: For arbitrary constants $\nu > 3/2$,
$$\sup_{t \in (0,1)} \Bigl[ \frac{U(t)^2}{2t(1-t)} - C(t) - \nu D(t) \Bigr] < \infty \quad \text{a.s.},$$
where
$$C(t) := \log\sqrt{1 + \operatorname{logit}(t)^2/2} \;\approx\; \log\log\frac{1}{t(1-t)},$$
$$D(t) := \log\sqrt{1 + C(t)^2/2} \;\approx\; \log\log\log\frac{1}{t(1-t)}$$
as $t \to \{0,1\}$.
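The functions $C$ and $D$ reappear in every statement below, so a minimal NumPy sketch may be convenient (our code, not from the talk; the names logit, C, D mirror the notation above):

```python
import numpy as np

def logit(t):
    """logit(t) = log(t / (1 - t)) for t in (0, 1)."""
    return np.log(t / (1.0 - t))

def C(t):
    """C(t) = log sqrt(1 + logit(t)^2 / 2), ~ log log(1/(t(1-t))) as t -> {0, 1}."""
    return 0.5 * np.log1p(logit(t) ** 2 / 2.0)

def D(t):
    """D(t) = log sqrt(1 + C(t)^2 / 2), ~ log log log(1/(t(1-t))) as t -> {0, 1}."""
    return 0.5 * np.log1p(C(t) ** 2 / 2.0)

# The log log equivalence sets in very slowly:
for t in (1e-3, 1e-6, 1e-9):
    print(f"t={t:.0e}  C(t)={C(t):.3f}  loglog(1/(t(1-t)))={np.log(np.log(1.0 / (t * (1.0 - t)))):.3f}")
```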
II. A General LIL for Sub-Exponential Processes

Nonnegative stochastic process $X = (X(t))_{t \in T}$ on $T \subset (0,1)$.

Locally uniform sub-exponentiality:

LUSE$_0$: For arbitrary $a \in \mathbb{R}$, $c \ge 0$ and $\eta \ge 0$,
$$\mathbb{P}\Bigl( \sup_{t \in [\ell(a),\, \ell(a+c)]} X(t) \ge \eta \Bigr) \le M \exp(-L(c)\,\eta),$$
where $M \ge 1$ and $L : [0,\infty) \to [0,1]$ satisfies $L(c) = 1 - O(c)$ as $c \downarrow 0$.
Refinement for $\zeta \in [0,1]$:

LUSE$_\zeta$: For arbitrary $a \in \mathbb{R}$, $c \ge 0$ and $\eta \ge 0$,
$$\mathbb{P}\Bigl( \sup_{t \in [\ell(a),\, \ell(a+c)]} X(t) \ge \eta \Bigr) \le M \exp(-L(c)\,\eta) \max(1, L(c)\,\eta)^{\zeta},$$
with $M$ and $L(\cdot)$ as in LUSE$_0$.

Example: $X(t) := \dfrac{U(t)^2}{2t(1-t)}$ satisfies LUSE$_{1/2}$ with $M = 2$ and $L(c) = e^{-c}$.
Proposition. Suppose that $X$ satisfies LUSE$_\zeta$. For any $L_o \in (0,1)$ and $\nu > 2 - \zeta$ there exists a constant $M_o = M_o(M, L(\cdot), \zeta, L_o, \nu) \ge 1$ such that
$$\mathbb{P}\Bigl( \sup_T \,(X - C - \nu D) \ge \eta \Bigr) \le M_o \exp(-L_o \eta)$$
for arbitrary $\eta \ge 0$.
III. Implications for the Uniform Empirical Process

Let $U_1, U_2, \ldots, U_n$ be i.i.d. $\sim \mathrm{Unif}[0,1]$.

Auxiliary function $K : [0,1] \times (0,1) \to [0,\infty]$,
$$K(x,p) := x \log\Bigl(\frac{x}{p}\Bigr) + (1-x)\log\Bigl(\frac{1-x}{1-p}\Bigr),$$
i.e. the Kullback–Leibler divergence between $\mathrm{Bin}(1,x)$ and $\mathrm{Bin}(1,p)$.

Two key properties:
$$K(x,p) = \frac{(x-p)^2}{2p(1-p)} \bigl(1 + o(1)\bigr) \quad \text{as } x \to p,$$
$$K(x,p) \le c \;\Longrightarrow\; |x-p| \le \sqrt{2c\,p(1-p)} + c \ \text{ and } \ |x-p| \le \sqrt{2c\,x(1-x)} + c.$$
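Since everything that follows is built from $K$, here is a short sketch of it (our code; the endpoint conventions $0 \log 0 = 0$ are standard), together with a numerical check of the quadratic approximation:

```python
import numpy as np

def K(x, p):
    """Kullback-Leibler divergence between Bin(1, x) and Bin(1, p),
    with the conventions 0 log 0 = 0 at x = 0 and x = 1."""
    x, p = np.asarray(x, float), np.asarray(p, float)
    with np.errstate(divide="ignore", invalid="ignore"):
        t1 = np.where(x > 0, x * np.log(x / p), 0.0)
        t2 = np.where(x < 1, (1 - x) * np.log((1 - x) / (1 - p)), 0.0)
    return t1 + t2

# K(x, p) ~ (x - p)^2 / (2 p (1 - p)) as x -> p:
p = 0.3
for x in (0.4, 0.31, 0.301):
    print(f"x={x}: K={float(K(x, p)):.3e}  quadratic={(x - p) ** 2 / (2 * p * (1 - p)):.3e}")
```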
Implication 1: Uniform empirical distribution function
$$\hat{G}_n(t) := \frac{1}{n}\sum_{i=1}^n 1_{[U_i \le t]}.$$

Lemma 1. The process $X_n = (X_n(t))_{t \in (0,1)}$ with $X_n(t) := n\,K(\hat{G}_n(t), t)$ satisfies LUSE$_0$ with $M = 2$ and $L(c) = e^{-c}$.

Theorem 1. For any fixed $\nu > 2$,
$$\sup_{(0,1)} \,(X_n - C - \nu D) \;\to_{\mathcal{L}}\; \sup_{t \in (0,1)} \Bigl[ \frac{U(t)^2}{2t(1-t)} - C(t) - \nu D(t) \Bigr].$$
Main ingredients for proofs:
◮ $(\hat{G}_n(t)/t)_{t \in (0,1]}$ is a reverse martingale.
◮ Exponential transform and Doob's inequality for submartingales.
◮ Analytical properties of $K(\cdot,\cdot)$.
◮ Donsker's invariance principle for the uniform empirical process.
Implication 2: Uniform order statistics $0 < U_{n:1} < U_{n:2} < \cdots < U_{n:n} < 1$.

$T_n := \{t_{n1}, t_{n2}, \ldots, t_{nn}\}$ with $t_{ni} := \mathbb{E}(U_{n:i}) = \dfrac{i}{n+1}$.

Lemma 2. The process $\tilde{X}_n = (\tilde{X}_n(t))_{t \in T_n}$ with $\tilde{X}_n(t_{ni}) := (n+1)\,K(t_{ni}, U_{n:i})$ satisfies LUSE$_0$ with $M = 2$ and $L(c) = e^{-c}$.

Theorem 2. For any fixed $\nu > 2$,
$$\sup_{T_n} \,(\tilde{X}_n - C - \nu D) \;\to_{\mathcal{L}}\; \sup_{t \in (0,1)} \Bigl[ \frac{U(t)^2}{2t(1-t)} - C(t) - \nu D(t) \Bigr].$$
Main ingredients for proofs:
◮ $(U_{n:i}/t_{ni})_{i=1}^n$ is a reverse martingale.
◮ Exponential transform and Doob's inequality for submartingales.
◮ Connection between Beta and Gamma distributions.
◮ Analytical properties of $K(\cdot,\cdot)$.
◮ Donsker's invariance principle for the uniform quantile process.
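Realizations like the ones plotted next are cheap to simulate. A sketch reusing K, C and D from the code above (our code; the seed and sample size are arbitrary):

```python
import numpy as np

def X_tilde_path(n=5000, nu=3.0, seed=1):
    """One realization of t_ni -> (n+1) K(t_ni, U_{n:i}) - C(t_ni) - nu D(t_ni)."""
    rng = np.random.default_rng(seed)
    u = np.sort(rng.uniform(size=n))        # order statistics U_{n:1} < ... < U_{n:n}
    t = np.arange(1, n + 1) / (n + 1.0)     # t_ni = E(U_{n:i}) = i/(n+1)
    return t, (n + 1) * K(t, u) - C(t) - nu * D(t)

t, x = X_tilde_path()
i = int(np.argmax(x))
print(f"sup over T_n: {x[i]:.3f}, attained at t = {t[i]:.3f}")
```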
Some realizations of $\tilde{X}_n$ for $n = 5000$ and $\nu = 3$:

[Figures: four simulated paths of $\tilde{X}_n(t) - C(t) - \nu D(t)$, plotted against $t \in [0,1]$; the values fluctuate roughly between $-3$ and $3$.]
Distribution function of $\arg\max_t \tilde{X}_n(t)$:

[Figure: empirical distribution function of the argmax, plotted over $t \in [0,1]$.]
III.1 Goodness-of-Fit Tests

Let $X_1, X_2, \ldots, X_n$ be i.i.d. with unknown c.d.f. $F$ on $\mathbb{R}$. Empirical c.d.f.
$$\hat{F}_n(x) := \frac{1}{n}\sum_{i=1}^n 1_{[X_i \le x]}.$$

Testing problem: $H_o : F \equiv F_o$ versus $H_A : F \not\equiv F_o$.
Berk–Jones (1979) proposed the test statistic
$$T_n(F_o) := \sup_{\mathbb{R}} \,n\,K(\hat{F}_n, F_o)$$
with critical value
$$\kappa^{BJ}_{n,\alpha} := (1-\alpha)\text{-quantile of } \sup_{t \in (0,1)} n\,K(\hat{G}_n(t), t) = \log\log(n) + O\bigl(\log\log\log(n)\bigr).$$
New proposal:
$$T_n(F_o) := \sup_{\mathbb{R}} \bigl[ n\,K(\hat{F}_n, F_o) - C(F_o) - \nu D(F_o) \bigr]$$
with critical value
$$\kappa^{new}_{n,\alpha} := (1-\alpha)\text{-quantile of } \sup_{t \in (0,1)} \bigl[ n\,K(\hat{G}_n(t), t) - C(t) - \nu D(t) \bigr]$$
$$\to\; (1-\alpha)\text{-quantile of } \sup_{t \in (0,1)} \Bigl[ \frac{U(t)^2}{2t(1-t)} - C(t) - \nu D(t) \Bigr].$$
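For concreteness, a sketch of the new statistic (our code, reusing K, C and D from above). Evaluating the supremum only at the jump points of $\hat{F}_n$, from both sides, is the usual device for Berk–Jones-type statistics; for the $C$ and $\nu D$ corrections it is an approximation:

```python
import numpy as np
from scipy import stats

def T_new(sample, F_o, nu=3.0):
    """Approximates T_n(F_o) = sup_x [ n K(F_hat_n(x), F_o(x)) - C(F_o(x)) - nu D(F_o(x)) ]
    by scanning the jump points of the empirical c.d.f. from the left and the right."""
    x = np.sort(np.asarray(sample, float))
    n = x.size
    p = F_o(x)                           # F_o at the order statistics, in (0, 1) a.s.
    corr = C(p) + nu * D(p)
    right = n * K(np.arange(1, n + 1) / n, p) - corr   # F_hat_n(X_{n:i}) = i/n
    left = n * K(np.arange(0, n) / n, p) - corr        # F_hat_n(X_{n:i} - 0) = (i-1)/n
    return float(max(right.max(), left.max()))

# Example: testing H_o: F = Phi with a slightly shifted sample
# (comparison against a critical value kappa^new_{n,alpha} is not included here)
sample = np.random.default_rng(0).normal(loc=0.3, size=1000)
print(T_new(sample, stats.norm.cdf))
```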
Power. For any fixed $\kappa > 0$,
$$\mathbb{P}_{F_n}\bigl( T_n(F_o) > \kappa \bigr) \to 1$$
as
$$\sup_{\mathbb{R}} \frac{\sqrt{n}\,|F_n - F_o|}{\sqrt{(1 + C(F_o))\,F_o(1 - F_o)} + C(F_o)/\sqrt{n}} \;\to\; \infty.$$
Special case: Detecting heterogeneous Gaussian mixtures (Ingster 1997, 1998; Donoho–Jin 2004).

Setting 1: $F_o := \Phi$,
$$F_n := (1 - \varepsilon_n)\Phi + \varepsilon_n \Phi(\cdot - \mu_n)$$
with $\varepsilon_n = n^{-\beta + o(1)}$, $\beta \in (1/2, 1)$, $\mu_n \to \infty$.
Theorem. For any fixed $\kappa > 0$,
$$\mathbb{P}_{F_n}\bigl( T_n(F_o) > \kappa \bigr) \to 1$$
provided that $\mu_n = \sqrt{2r\log(n)}$ with
$$r > \begin{cases} \beta - 1/2 & \text{if } \beta \le 3/4, \\ \bigl(1 - \sqrt{1 - \beta}\bigr)^2 & \text{if } \beta \ge 3/4. \end{cases}$$
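The threshold on $r$ is the Ingster–Donoho–Jin detection boundary; encoded as a small function (our code):

```python
def detection_boundary(beta):
    """Minimal r such that mu_n = sqrt(2 r log n) makes the mixture detectable,
    for beta in (1/2, 1); the two branches agree at beta = 3/4."""
    assert 0.5 < beta < 1.0
    return beta - 0.5 if beta <= 0.75 else (1.0 - (1.0 - beta) ** 0.5) ** 2

print([round(detection_boundary(b), 4) for b in (0.6, 0.75, 0.9)])
```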
Setting 2 (Contiguous alternatives): $F_o := \Phi$,
$$F_n := \Bigl(1 - \frac{\pi}{\sqrt{n}}\Bigr)\Phi + \frac{\pi}{\sqrt{n}}\,\Phi(\cdot - \mu), \qquad \pi, \mu > 0.$$
The optimal level-$\alpha$ test of $F_o$ versus $F_n$ has asymptotic power
$$\Phi\Bigl( \Phi^{-1}(\alpha) + \frac{\pi^2(\exp(\mu^2) - 1)}{4} \Bigr).$$
Theorem. Let $\mu = \sqrt{2s\log(1/\pi)}$ for fixed $s > 0$. As $\pi \downarrow 0$,
$$\Phi\Bigl( \Phi^{-1}(\alpha) + \frac{\pi^2(\exp(\mu^2) - 1)}{4} \Bigr) \to \begin{cases} \alpha & \text{if } s < 1, \\ 1 & \text{if } s > 1, \end{cases}$$
while for any fixed $\kappa > 0$,
$$\mathbb{P}_{F_n}\bigl( T_n(F_o) > \kappa \bigr) \to 1 \quad \text{if } s > 1.$$
III.2 Confidence Bands

Owen (1995) proposed the $(1-\alpha)$-confidence band
$$\Bigl\{ F : \sup_{\mathbb{R}} \,n\,K(\hat{F}_n, F) \le \kappa^{BJ}_{n,\alpha} \Bigr\}.$$

New proposal: With order statistics $X_{n:1} \le X_{n:2} \le \cdots \le X_{n:n}$,
$$\Bigl\{ F : \max_{1 \le i \le n} \bigl[ (n+1)\,K(t_{ni}, F(X_{n:i})) - C(t_{ni}) - \nu D(t_{ni}) \bigr] \le \tilde{\kappa}_{n,\alpha} \Bigr\}.$$
Resulting bounds for $F(x)$: With confidence $1 - \alpha$, on $[X_{n:i}, X_{n:i+1})$, $0 \le i \le n$,
$$F \in \begin{cases} [a^{BJO}_{ni}, b^{BJO}_{ni}] & \text{with Owen's (1995) proposal}, \\ [a^{new}_{ni}, b^{new}_{ni}] & \text{with the new proposal}, \end{cases}$$
while $\hat{F}_n = s_{ni} := i/n$.
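Given a critical value $\tilde{\kappa}_{n,\alpha}$, e.g. from Monte Carlo simulation of the supremum in Theorem 2, the bounds on $F(X_{n:i})$ follow by inverting $K(t_{ni}, \cdot)$ numerically; the band on each interval $[X_{n:i}, X_{n:i+1})$ is then assembled from these values using the monotonicity of $F$. A sketch (our code, reusing K, C and D from above; the value of kappa below is a placeholder, not a real critical value):

```python
import numpy as np
from scipy.optimize import brentq

def band_bounds(n, kappa, nu=3.0, eps=1e-12):
    """Solves (n+1) K(t_ni, q) = kappa + C(t_ni) + nu D(t_ni) for q on both
    sides of t_ni; K(t, .) is decreasing on (0, t) and increasing on (t, 1)."""
    t = np.arange(1, n + 1) / (n + 1.0)
    lo, hi = np.zeros(n), np.ones(n)      # trivial bounds if no root in (0, 1)
    for i, ti in enumerate(t):
        c = (kappa + C(ti) + nu * D(ti)) / (n + 1.0)
        g = lambda q: float(K(ti, q)) - c
        if g(eps) > 0:
            lo[i] = brentq(g, eps, ti)
        if g(1.0 - eps) > 0:
            hi[i] = brentq(g, ti, 1.0 - eps)
    return t, lo, hi

t, lo, hi = band_bounds(n=500, kappa=3.5)   # kappa = 3.5 is a placeholder
print(lo[:3], hi[:3])
```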
$i \mapsto a^{new}_{ni},\; s_{ni},\; b^{new}_{ni}$ for $n = 500$:

[Figure: the new confidence band around the empirical c.d.f., plotted against $i = 0, \ldots, 500$.]
$i \mapsto a^{*}_{ni} - s_{ni},\; b^{*}_{ni} - s_{ni}$ for $n = 500$, $n = 2000$ and $n = 8000$:

[Figures: centered band bounds for both proposals, plotted against $i$; the displayed range shrinks from about $\pm 0.05$ for $n = 500$ to about $\pm 0.02$ for $n = 8000$.]
Theorem. For any fixed $\alpha \in (0,1)$,
$$\max_{0 \le i \le n} \frac{b^{new}_{ni} - a^{new}_{ni}}{b^{BJO}_{ni} - a^{BJO}_{ni}} \;\to\; 1,$$
while
$$\max_{0 \le i \le n} \bigl( b^{BJO}_{ni} - a^{BJO}_{ni} \bigr) = (1 + o(1))\sqrt{\frac{2\log\log n}{n}}, \qquad \max_{0 \le i \le n} \bigl( b^{new}_{ni} - a^{new}_{ni} \bigr) = O(n^{-1/2}).$$
IV. Bi-Log-Concave Distribution Functions

Shape constraint 1: Log-concave density. $F$ has density $f = e^{\varphi}$ with $\varphi : \mathbb{R} \to [-\infty, \infty)$ concave.

Shape constraint 2: Bi-log-concave distribution function. Both $\log(F)$ and $\log(1 - F)$ are concave.

• Log-concave density $\Longrightarrow$ bi-log-concave c.d.f.
• A bi-log-concave c.d.f. may have arbitrarily many modes!
Theorem. Let $J(F) := \{x \in \mathbb{R} : 0 < F(x) < 1\} \ne \emptyset$. Four equivalent statements:
◮ $F$ is bi-log-concave.
◮ $F$ has a density $f$. On $J(F)$, $f = F' > 0$, $f/F \searrow$ and $f/(1-F) \nearrow$.
◮ $F$ has a bounded density $f$. On $J(F)$, $f = F' > 0$ and $-\dfrac{f^2}{1-F} \le f' \le \dfrac{f^2}{F}$.
◮ $F$ has a density $f$ s.t. for arbitrary $x \in J(F)$ and $t \in \mathbb{R}$,
$$F(x + t) \le F(x)\exp\Bigl(\frac{f(x)}{F(x)}\,t\Bigr), \qquad F(x + t) \ge 1 - (1 - F(x))\exp\Bigl(-\frac{f(x)}{1 - F(x)}\,t\Bigr).$$
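The derivative characterization lends itself to a quick numerical check. A sketch for an equal-weight Gaussian location mixture (our code; a grid check is evidence, not a proof, and the grid and tolerance are arbitrary):

```python
import numpy as np
from scipy import stats

def is_bilogconcave(F, f, fprime, grid, tol=1e-10):
    """Checks -f^2/(1-F) <= f' <= f^2/F on a grid inside J(F)."""
    Fx, fx, dfx = F(grid), f(grid), fprime(grid)
    return bool(np.all((dfx <= fx ** 2 / Fx + tol) &
                       (dfx >= -fx ** 2 / (1.0 - Fx) - tol)))

# Equal-weight mixture of N(0,1) and N(m,1); note phi'(x) = -x phi(x)
m = 1.5
F = lambda x: 0.5 * (stats.norm.cdf(x) + stats.norm.cdf(x - m))
f = lambda x: 0.5 * (stats.norm.pdf(x) + stats.norm.pdf(x - m))
fp = lambda x: 0.5 * (-x * stats.norm.pdf(x) - (x - m) * stats.norm.pdf(x - m))
print(is_bilogconcave(F, f, fp, np.linspace(-5.0, 6.5, 4001)))
```

For small separation m the mixture density is log-concave, so the c.d.f. is bi-log-concave and the check prints True; pushing m up far enough makes bi-log-concavity fail, since $f/F$ is no longer monotone between well-separated components.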
[Figures: an example of a bi-log-concave $F$. Panel 1: $1 + \log(F)$, $F$ and $-\log(1 - F)$. Panel 2: $f/F$, $f = F'$ and $f/(1 - F)$. Panel 3: $f^2/F$, $f'$ and $-f^2/(1 - F)$, illustrating the derivative bounds from the theorem.]