L OG - CONCAVE DENSITY ESTIMATION WITH APPLICATIONS Co-authors: Y. Chen, M. Cule, L. D¨ umbgen, R. Gramacy, A Kim, D. Schuhmacher, M. Stewart, M. Yuan
R. J. Samworth Log-concave densities The original problem Let X 1 , . . . , X n be a random sample from a density f 0 in R d . How should we estimate f 0 ? Two main alternatives: • Parametric models: use e.g. MLE. Assumptions often too restrictive. • Nonparametric models: use e.g. kernel density estimate. Choice of bandwidth difficult, particularly for d > 1 . June 3, 2013- 2
R. J. Samworth Log-concave densities Shape-constrained estimation Nonparametric shape constraints are becoming increasingly popular (Groeneboom et al. 2001, Walther 2002, Pal et al. 2007, D¨ umbgen and Rufibach 2009, Schuhmacher et al. 2011, Seregin and Wellner 2010, Koenker and Mizera 2010 . . . ) . E.g. log-concavity, r -concavity, k -monotonicity, convexity. A density f is log-concave if log f is concave. • Univariate examples: normal, logistic, Gumbel densities, as well as Weibull, Gamma, Beta densities for certain parameter values. June 3, 2013- 3
R. J. Samworth Log-concave densities Characterising log-concave densities Cule, S. and Stewart (2010) Let X have density f in R d . For a subspace V of R d , let P V ( x ) denote the orthogonal projection of x onto V . Then in order that f be log-concave, it is: 1. necessary that for any subspace V , the marginal density of P V ( X ) is log-concave (Pr´ ekopa 1973) , and the conditional density f X | P V ( X ) ( ·| t ) of X given P V ( X ) = t is log-concave for each t 2. sufficient that, for every ( d − 1) -dimensional subspace V , the conditional density f X | P V ( X ) ( ·| t ) of X given P V ( X ) = t is log-concave for each t . June 3, 2013- 4
R. J. Samworth Log-concave densities Unbounded likelihood! Consider maximising the likelihood L ( f ) = � n i =1 f ( X i ) over all densities f . June 3, 2013- 5
R. J. Samworth Log-concave densities Existence and uniqueness Walther (2002), Cule, S. and Stewart (2010) Let X 1 , . . . , X n be independent with density f 0 in R d , and suppose that n ≥ d + 1 . Then, with probability one, a log-concave maximum likelihood estimator ˆ f n exists and is unique. June 3, 2013- 6
R. J. Samworth Log-concave densities Sketch of proof Consider maximising over all log-concave functions n ψ n ( f ) = 1 � � log f ( X i ) − R d f ( x ) dx. n i =1 Any maximiser ˆ f n must satisfy: 1. ˆ f n ( x ) > 0 iff x ∈ C n ≡ conv( X 1 , . . . , X n ) h y : R d → R be the smallest 2. Fix y = ( y 1 , . . . , y n ) and let ¯ concave function with ¯ h y ( X i ) ≥ y i for all i . Then log ˆ f n = ¯ h y ∗ for some y ∗ R d ˆ � 3. f n ( x ) dx = 1 . June 3, 2013- 7
R. J. Samworth Log-concave densities Schematic diagram of MLE on log scale June 3, 2013- 8
R. J. Samworth Log-concave densities Computation Cule, S. and Stewart (2010), Cule, Gramacy and S. (2009) First attempt: minimise n τ ( y ) = − 1 � ¯ exp { ¯ � h y ( X i ) + h y ( x ) } dx. n C n i =1 June 3, 2013- 9
R. J. Samworth Log-concave densities Computation Cule, S. and Stewart (2010), Cule, Gramacy and S. (2009) First attempt: minimise n τ ( y ) = − 1 � ¯ exp { ¯ � h y ( X i ) + h y ( x ) } dx. n C n i =1 Better: minimise n σ ( y ) = − 1 � � exp { ¯ y i + h y ( x ) } dx. n C n i =1 Then σ has a unique minimum at y ∗ , say, log ˆ f n = ¯ h y ∗ and σ is convex . . . June 3, 2013- 10
R. J. Samworth Log-concave densities Computation Cule, S. and Stewart (2010), Cule, Gramacy and S. (2009) First attempt: minimise n τ ( y ) = − 1 � ¯ exp { ¯ � h y ( X i ) + h y ( x ) } dx. n C n i =1 Better: minimise n σ ( y ) = − 1 � � exp { ¯ y i + h y ( x ) } dx. n C n i =1 Then σ has a unique minimum at y ∗ , say, log ˆ f n = ¯ h y ∗ and σ is convex . . . but non-differentiable ! June 3, 2013- 11
R. J. Samworth Log-concave densities Log-concave projections Let P k be the set of probability distributions P on R k with � R k � x � dP ( x ) < ∞ and P ( H ) < 1 for all hyperplanes H . Let F k be the set of upper semi-continuous log-concave densities on R k . The condition P ∈ P d is necessary and sufficient for the existence of a unique log-concave projection ψ ∗ : P d → F d given by � ψ ∗ ( P ) = argmax R d log f dP. f ∈F d umbgen, S., Schuhmacher, 2011) . (Cule, S. and Stewart, 2010; Cule and S., 2010; D¨ June 3, 2013- 12
R. J. Samworth Log-concave densities One-dimensional characterisation D¨ umbgen, S. and Schuhmacher (2011) Let P 0 ∈ P 1 have distribution function F 0 . Let S ( f ∗ ) = { x ∈ R : log f ∗ ( x ) > 1 2 log f ∗ ( x − δ )+ 1 2 log f ∗ ( x + δ ) ∀ δ > 0 } . Then the distribution function F ∗ of f ∗ is characterised by � x ≤ 0 for all x ∈ R { F ∗ ( t ) − F 0 ( t ) } dt for all x ∈ S ( f ∗ ) ∪ {∞} . = 0 −∞ June 3, 2013- 13
R. J. Samworth Log-concave densities Example 1 Suppose f 0 ( x ) = 1 2 (1 + x 2 ) − 3 / 2 . Then f ∗ ( x ) = 1 2 e −| x | . June 3, 2013- 14
R. J. Samworth Log-concave densities Example 2 June 3, 2013- 15
R. J. Samworth Log-concave densities Log-concave projections preserve independence Chen and S. (2012) Suppose P ∈ P d can be written as P = P 1 ⊗ P 2 , where P 1 and P 2 are probability measures on R d 1 and R d 2 , with d 2 = d − d 1 . If f ∗ is the log-concave projection of P and f ∗ ℓ is the projection of P ℓ ( ℓ = 1 , 2 ), then f ∗ ( x ) = f ∗ 1 ( x 1 ) f ∗ 2 ( x 2 ) 2 ) T ∈ R d . for x = ( x T 1 , x T This makes log-concave projections very attractive for independent component analysis (S. and Yuan, 2012) . June 3, 2013- 16
R. J. Samworth Log-concave densities Convergence of log-concave densities Cule and S. (2010), Kim and S. (2013) Let ( f n ) be a sequence of u.s.c. log-concave densities on R d with corresponding probability measures ( ν n ) d � � satisfying ν n → ν . If dim csupp( ν ) = d , then (a) ν is absolutely continuous w.r.t. Lebesgue measure on R d , with log-concave Radon–Nikodym derivative f = cl(lim inf f n ) . (b) Let a 0 > 0 and b 0 ∈ R be such that f ( x ) ≤ e − a 0 � x � + b 0 . If e a � x � | f n ( x ) − f ( x ) | dx → 0 and, if f is � a < a 0 then continuous, sup x e a � x � | f n ( x ) − f ( x ) | → 0 . June 3, 2013- 17
R. J. Samworth Log-concave densities Theoretical properties Cule and S. (2010), D¨ umbgen, S. and Schuhmacher (2011) The log-concave projection is continuous with respect to Wasserstein (Mallows-1) distance. iid ∼ P 0 ∈ P d , and let f ∗ = ψ ∗ ( P 0 ) . In particular, let X 1 , . . . , X n Taking a 0 > 0 and b 0 ∈ R such that f ∗ ( x ) ≤ e − a 0 � x � + b 0 , we have for any a < a 0 that � f n ( x ) − f ∗ ( x ) | dx a.s. R d e a � x � | ˆ → 0 , and, if f ∗ is continuous, sup x e a � x � | ˆ f n ( x ) − f ∗ ( x ) | a.s. → 0 . June 3, 2013- 18
R. J. Samworth Log-concave densities Global minimax bounds Kim and S. (2013) We have �� � d R d ( ˜ f n − f ) 2 560 × 2 5( d +1)( d +4) / (2 d ) n − 4 / ( d +4) . inf sup ≥ E ˜ f n f ∈F d Similar lower bounds (with different constants) hold for L 2 1 , Hellinger, Kullback–Leibler, chi-squared losses. Under a growth condition on − log f , we can obtain the same rates up to logarithmic factors for a Bayesian predictive estimator (Yang and Barron, 1999) . When d ≤ 4 , the log-concave MLE attains these rates for fixed f . June 3, 2013- 19
R. J. Samworth Log-concave densities Moment (in)equalities D¨ umbgen, S. and Schuhmacher (2011) Let P ∈ P d , let f ∗ = ψ ∗ ( P ) and let P ∗ ( B ) = B f ∗ . Then � � � R d x dP ∗ ( x ) = R d x dP ( x ) and � � R d h dP ∗ ≤ R d h dP for all convex h : R d → ( −∞ , ∞ ] . June 3, 2013- 20
R. J. Samworth Log-concave densities Smoothed log-concave density estimator D¨ umbgen and Rufibach (2009), Cule, S. and Stewart (2010), Chen and S. (2012) Let f n = ˆ ˜ f n ∗ φ ˆ A , where φ ˆ A is a d -dimensional normal density with mean zero and covariance matrix ˆ A = ˆ Σ − ˜ Σ . Here, ˆ Σ is the sample covariance matrix and ˜ Σ is the covariance matrix corresponding to ˆ f n . Then ˜ f n is a smooth, fully automatic log-concave estimator supported on the whole of R d which satisfies the same theoretical properties as ˆ f n . It offers potential improvements for small sample sizes. June 3, 2013- 21
R. J. Samworth Log-concave densities Breast cancer data June 3, 2013- 22
R. J. Samworth Log-concave densities Classification boundaries June 3, 2013- 23
R. J. Samworth Log-concave densities Testing for log-concavity Chen and S. (2012) Suppose P 0 ∈ P d and let A ∗ denote the difference between the covariance matrix of P 0 and that of its log-concave projection. Then A ∗ = 0 if and only if P 0 has a log-concave density. We can therefore use tr( ˆ A ) as a test statistic, and generate a critical value from bootstrap samples drawn from ˆ f n . This test is consistent: if P 0 is not log-concave, then the power converges to 1 as n → ∞ . June 3, 2013- 24
Recommend
More recommend