Some applications of interval computing in statistics



  1. Some applications of interval computing in statistics
Michal Černý
Department of Econometrics & DYME Research Center
University of Economics, Prague, Czech Republic
SWIM 2015, Prague

  2. Introduction
Many ideas and results are summarized in a wonderful book [cover shown on the slide].
Some results: joint research with M. Hladík, M. Rada, O. Sokol, J. Horáček, J. Antoch et al.

  3. The core problem of Interval Analysis
We are given a (continuous, say) function $f : \mathbb{R}^n \to \mathbb{R}$ and a box $\mathbf{x} \in \mathbb{IR}^n$. We are to determine the range
$$f(\mathbf{x}) = [\underline{f}(\mathbf{x}), \overline{f}(\mathbf{x})] = \{ f(x) : x \in \mathbf{x} \}.$$
Which particular functions f are interesting in statistics & data analysis?
Outline:
Part I: one-dimensional interval-valued data
Part II: multivariate data & regression
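To make the core problem concrete, here is a minimal Python sketch (the function name and setup are mine, not from the talk): it estimates the range $f(\mathbf{x})$ by sampling the box. Sampling only ever yields an inner approximation of the true range, which is exactly why the exact methods discussed in the talk are needed.

```python
import numpy as np

def sampled_range(f, lower, upper, n_samples=100_000, seed=None):
    """Approximate the range {f(x) : lower <= x <= upper} by random sampling.

    This is only an *inner* approximation: the true range
    [f_lo, f_hi] can only be wider than what sampling finds.
    """
    rng = np.random.default_rng(seed)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    X = rng.uniform(lower, upper, size=(n_samples, lower.size))
    values = np.apply_along_axis(f, 1, X)
    return values.min(), values.max()

# Example: approximate the range of the sample variance over a small box.
print(sampled_range(lambda x: np.var(x, ddof=1), [0.0, 1.0, 2.0], [0.5, 1.5, 2.5]))
```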

  4. Part I. One-dimensional data

  5. One-dimensional data: a model
Assumptions. Let $x_1, \dots, x_n$ be a dataset; for example, let the data be a random sample from a distribution $\Phi$. The dataset is unobservable. What is observable is a collection of intervals $\mathbf{x}_1, \dots, \mathbf{x}_n$ such that $x_1 \in \mathbf{x}_1, \dots, x_n \in \mathbf{x}_n$ a.s.
A general goal: We want to make inference about the original dataset $x_1, \dots, x_n$, about the generating distribution $\Phi$ and its parameters; we want to test hypotheses, etc. We are given a statistic $S(x_1, \dots, x_n)$ and we want to determine/estimate its value, distribution, or other properties, using only the observable interval-valued data $\mathbf{x}_1, \dots, \mathbf{x}_n$.
Now: the appropriate toolbox depends on whether we can make further assumptions on the distribution of $(x, \mathbf{x})$.

  6. Example
Allmaras et al., SIAM Review 55 (2013); Aguilar et al., SIAM Review 57 (2015)
Measurement of a falling box: the aim is to estimate the gravitational acceleration and air resistance.
A camera takes snapshots at discrete times: the position $x_i$ (= the distance traveled by time $i$) is uncertain due to unpredictable rotation.
They assume that the distribution of $x_i$ given $\underline{x}_i, \overline{x}_i$ is beta and apply a Bayesian framework.

  7. Example (contd.)
[Figure: the falling box with a known initial height and scale; $x_i$ is the true distance traveled, $\beta$-distributed within $[\underline{x}_i, \overline{x}_i]$.]

  8. The possibilistic approach
Interval computation comes into play when the only assumption we make about the distribution of $(x, \mathbf{x})$ is $x \in \mathbf{x}$ a.s. Nothing more.
Then, given a statistic S, the only information we can infer about S from the observable interval-valued data $\mathbf{x}$ is the pair of tight bounds
$$\overline{S} = \max\{ S(\xi) : \xi \in \mathbf{x} \}, \qquad \underline{S} = \min\{ S(\xi) : \xi \in \mathbf{x} \},$$
clearly satisfying $\underline{S} \le S(x) \le \overline{S}$ a.s.
Remark. In econometrics, partial knowledge about the distribution of $(x, \mathbf{x})$ is referred to as partial identification: see the survey paper E. Tamer, Partial identification in econometrics, Annual Review of Economics 2 (2010), pp. 167–195. Also many papers in Econometrica and other journals.
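For a statistic that is monotone in each data point, the tight bounds are attained at endpoint configurations. A minimal sketch for the simplest case, the sample mean (names are mine):

```python
import numpy as np

def mean_bounds(lower, upper):
    """Tight possibilistic bounds on the sample mean.

    The mean is nondecreasing in every x_i, so the minimum is attained
    at x = lower and the maximum at x = upper.
    """
    return float(np.mean(lower)), float(np.mean(upper))

print(mean_bounds([1.0, 2.0, 3.0], [1.5, 2.5, 3.5]))  # (2.0, 2.5)
```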

  9. Which statistics are interesting?
Descriptive statistics: sample mean, variance, median, coefficient of variation, quantiles, higher-order moments, ...
Many well-known people did a lot of work: Kreinovich, Ferson, Ginzburg, Aviles, Longpré, Xiang, Ceberio, Dantsin, Wolpert, Hajagos, Oberkampf, Jaulin, Patangay, Starks, Beck, ... (sorry that I cannot mention all)
Estimators of parameters of the data-generating distribution $\Phi$
Test statistics for various hypotheses

  10. Test statistics
We are to test a null hypothesis ($H_0$) against an alternative A.
We usually construct a test statistic S such that its distribution D under $H_0$ is known.
Then, quantiles of D determine the critical region, where we reject $H_0$ at a pre-selected confidence level $\alpha$ (say, $\alpha = 95\%$).
Given the intervals $\mathbf{x}_1, \dots, \mathbf{x}_n$: if we can compute $\underline{S}, \overline{S}$, then we can make at least partial conclusions.
[Figure: three densities of D, each with a central 95% region and 2.5% tails; the interval $[\underline{S}, \overline{S}]$ either lies entirely outside the critical region (DO NOT REJECT), entirely inside it (REJECT), or straddles its boundary (???).]
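A sketch of the three-way decision rule illustrated on the slide, assuming for concreteness a symmetric two-sided critical region $\{S : |S| > c\}$ with critical value c (the shape of the region is my simplification):

```python
def interval_test(S_lo, S_hi, c):
    """Decide a two-sided test from interval bounds on the statistic.

    Critical region: {S : |S| > c}.  If the whole range [S_lo, S_hi]
    falls on one side of the boundary, we can conclude; otherwise not.
    """
    if S_hi < -c or S_lo > c:        # entire range inside the critical region
        return "REJECT"
    if -c <= S_lo and S_hi <= c:     # entire range outside the critical region
        return "DO NOT REJECT"
    return "???"                     # range straddles the boundary
```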

  11. Test statistics: An example
Example. Say that $x_1, \dots, x_{n/2}$ and $x_{(n/2)+1}, \dots, x_n$ are two independent samples from $N(\mu_1, \sigma_1^2)$ and $N(\mu_2, \sigma_2^2)$, respectively. We want to test stability of variance: $\sigma_1^2 = \sigma_2^2$. A well-known test statistic is the F-ratio
$$F = \frac{\text{sample variance of } x_1, \dots, x_{n/2}}{\text{sample variance of } x_{(n/2)+1}, \dots, x_n}.$$
Problem: computation of both values $\underline{F}, \overline{F}$ is NP-hard! (How serious is this obstacle? We will see later...)
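One remark worth making explicit: the numerator and denominator of F depend on disjoint blocks of variables, so the bounds decompose as $\overline{F} = \overline{s^2}_{(1)} / \underline{s^2}_{(2)}$ and $\underline{F} = \underline{s^2}_{(1)} / \overline{s^2}_{(2)}$. A hypothetical sketch, relying on the variance-bound helpers s2_lo and s2_hi_bruteforce defined in the sketch under slide 14 below (assumed names); the NP-hard ingredient is the upper variance bound.

```python
def F_bounds(lo1, hi1, lo2, hi2):
    """Bounds on the F-ratio over interval data.

    The two halves of the sample are disjoint variable blocks, so the
    extrema of the ratio decompose into extrema of the two variances.
    Relies on s2_lo / s2_hi_bruteforce from the slide-14 sketch below.
    """
    F_hi = s2_hi_bruteforce(lo1, hi1) / s2_lo(lo2, hi2)
    F_lo = s2_lo(lo1, hi1) / s2_hi_bruteforce(lo2, hi2)
    return F_lo, F_hi
```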

  12. Test statistics: Further examples
Let $x_1, \dots, x_n$ be an $N(\mu, \sigma^2)$ sample. Given $\mu_0 \in \mathbb{R}$, to test $\mu = \mu_0$ we use the t-ratio (coefficient of variation)
$$t = \frac{|\widehat{\mu} - \mu_0|}{\widehat{\sigma}} = \frac{\left| \frac{1}{n} \sum_{i=1}^n x_i - \mu_0 \right|}{\sqrt{\frac{1}{n-1} \sum_{j=1}^n \left( x_j - \frac{1}{n} \sum_{k=1}^n x_k \right)^2}}.$$
Some results:
$\overline{t}$ is NP-hard and inapproximable with an arbitrary absolute error;
$\overline{t}$ is computable in pseudopolynomial time;
$\underline{t}$ is computable in polynomial time.

  13. Test statistics: Further examples
Testing independence: the Durbin–Watson statistic
$$DW = \frac{\sum_{i=2}^n (r_i - r_{i-1})^2}{\sum_{j=1}^n r_j^2}, \quad \text{where } r_i = x_i - \frac{1}{n} \sum_{k=1}^n x_k.$$
Testing stability of the mean (important e.g. in quality control):
$$H_0: \ \mathbb{E}x_1 = \mathbb{E}x_2 = \dots = \mathbb{E}x_n,$$
$$A: \ \exists k: \ \mathbb{E}x_1 = \dots = \mathbb{E}x_k = \mu_1 \ne \mu_2 = \mathbb{E}x_{k+1} = \dots = \mathbb{E}x_n.$$
Test statistic:
$$T = \max_{k=1,\dots,n-1} \sqrt{\frac{n}{k(n-k)}} \cdot \frac{\left| \sum_{\ell=1}^k \left( x_\ell - \frac{1}{n} \sum_{\iota=1}^n x_\iota \right) \right|}{\sqrt{\frac{1}{n-1} \sum_{i=1}^n \left( x_i - \frac{1}{n} \sum_{j=1}^n x_j \right)^2}}.$$
Computational aspects of $\underline{S}$ and $\overline{S}$ have been investigated for many statistics S... and many are still waiting...
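To pin down the reconstructed formulas, here is a sketch computing DW and T for an ordinary point-valued sample (the interval problems then ask for the range of these functions over a box; names are mine):

```python
import numpy as np

def durbin_watson(x):
    """Durbin-Watson statistic with r_i = x_i - mean(x)."""
    r = x - x.mean()
    return np.sum(np.diff(r) ** 2) / np.sum(r ** 2)

def mean_stability_T(x):
    """CUSUM-type statistic T for testing stability of the mean."""
    n = x.size
    s = np.sqrt(np.sum((x - x.mean()) ** 2) / (n - 1))  # sample std. deviation
    partial = np.cumsum(x - x.mean())[: n - 1]          # sum_{l<=k} (x_l - mean)
    k = np.arange(1, n)
    return np.max(np.sqrt(n / (k * (n - k))) * np.abs(partial)) / s
```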

  14. Sample variance
$$\overline{s^2} = \max\left\{ \frac{1}{n-1} \sum_{i=1}^n \left( x_i - \frac{1}{n} \sum_{j=1}^n x_j \right)^2 : x \in \mathbf{x} \right\},$$
$$\underline{s^2} = \min\left\{ \frac{1}{n-1} \sum_{i=1}^n \left( x_i - \frac{1}{n} \sum_{j=1}^n x_j \right)^2 : x \in \mathbf{x} \right\}.$$
Observation: $\underline{s^2}$ → CQP → weakly polynomial time.
Ferson et al.: a strongly polynomial algorithm, $O(n^2)$.
Unfortunately: $\overline{s^2}$ is NP-hard.
Even worse: $\overline{s^2}$ is NP-hard to approximate with an arbitrary absolute error.
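A minimal sketch of both bounds (my implementation choices, not the algorithms from the talk): the lower bound is a convex quadratic program, handled here by a general-purpose bounded minimizer rather than a dedicated CQP solver; the upper bound is computed by brute-force vertex enumeration, which is exact because the maximum of a convex function over a box is attained at a vertex, but exponential, consistent with the NP-hardness statement.

```python
from itertools import product
import numpy as np
from scipy.optimize import minimize

def s2_lo(lower, upper):
    """Min of the sample variance over the box -- a convex QP."""
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    x0 = (lower + upper) / 2
    res = minimize(lambda x: np.var(x, ddof=1), x0,
                   bounds=list(zip(lower, upper)), method="L-BFGS-B")
    return res.fun

def s2_hi_bruteforce(lower, upper):
    """Max of the sample variance over the box, by enumerating all 2^n vertices."""
    return max(np.var(v, ddof=1) for v in product(*zip(lower, upper)))
```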

  15. NP-hardness of $\overline{s^2}$
NP-hardness of $\overline{s^2}$ → investigation of special cases solvable in polynomial time.
Ferson et al.: consider the "$\frac{1}{n}$-narrowed" intervals
$$\tfrac{1}{n}\mathbf{x}_i := [x_i^C - \tfrac{1}{n} x_i^\Delta, \ x_i^C + \tfrac{1}{n} x_i^\Delta], \quad i = 1, \dots, n.$$
Theorem: If $\frac{1}{n}\mathbf{x}_i \cap \frac{1}{n}\mathbf{x}_j = \emptyset$ for all $i \ne j$, then $\overline{s^2}$ can be computed in polynomial time.
Another formulation: If there is no k-tuple of indices $1 \le i_1 < \dots < i_k \le n$ such that
$$\bigcap_{\ell \in \{i_1, \dots, i_k\}} \tfrac{1}{n}\mathbf{x}_\ell \ne \emptyset,$$
then $\overline{s^2}$ can be computed in time $O(p(n) \cdot 2^k)$, where p is a polynomial.
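A sketch of the narrowing condition (names are mine): build the $\frac{1}{n}$-narrowed intervals from centers and radii and test whether they are pairwise disjoint, which by the theorem guarantees polynomial-time computability of $\overline{s^2}$.

```python
import numpy as np

def narrowed_intervals(lower, upper):
    """Return the (1/n)-narrowed intervals [c_i - d_i/n, c_i + d_i/n]."""
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    n = lower.size
    c = (lower + upper) / 2   # centers x^C
    d = (upper - lower) / 2   # radii   x^Delta
    return c - d / n, c + d / n

def pairwise_disjoint(lo, hi):
    """Closed intervals are pairwise disjoint iff, sorted by left endpoint,
    each one ends strictly before the next one begins."""
    order = np.argsort(lo)
    lo, hi = lo[order], hi[order]
    return bool(np.all(hi[:-1] < lo[1:]))
```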

  16. Computation of $\overline{s^2}$ & Ferson et al. (contd.)
Graph-theoretic reformulation: Let $G_n = (V_n, E_n)$ be the interval graph over $\frac{1}{n}\mathbf{x}_1, \dots, \frac{1}{n}\mathbf{x}_n$:
Vertices: $V_n$ = the set of narrowed intervals $\frac{1}{n}\mathbf{x}_1, \dots, \frac{1}{n}\mathbf{x}_n$.
Edges: $\{i, j\} \in E_n$ ($i \ne j$) iff $\frac{1}{n}\mathbf{x}_i \cap \frac{1}{n}\mathbf{x}_j \ne \emptyset$.
Let $\omega_n$ be the size of the largest clique of $G_n$. Now: the algorithm works in time $O(p(n) \cdot 2^{\omega_n})$.
Remark. Determining the largest clique of an interval graph is easy.
Remark. The worst case is bad, e.g. when $x_1^C = x_2^C = \dots = x_n^C$. (Such instances result from the NP-hardness proof.)
But: What if the data are generated by a random process? Then, do the "ugly" instances occur frequently, or only rarely?
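As the remark says, the clique number of an interval graph is easy: it equals the maximum number of intervals overlapping at a single point, found by a sweep over sorted endpoints. A minimal sketch:

```python
def interval_clique_number(lo, hi):
    """Clique number of the interval graph over closed intervals [lo_i, hi_i]:
    the maximum overlap depth, via an endpoint sweep."""
    # At equal coordinates, process starts (kind 0) before ends (kind 1),
    # so that touching closed intervals count as overlapping.
    events = sorted([(l, 0) for l in lo] + [(h, 1) for h in hi])
    depth = best = 0
    for _, kind in events:
        depth += 1 if kind == 0 else -1
        best = max(best, depth)
    return best
```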

  17. Simulations
Assumption: The centers and radii of the intervals $\mathbf{x}_i$ are generated by a "reasonable" random process:
Centers $x_i^C$: sampled from a "reasonable" distribution (continuous, finite variance): uniform, normal, exponential, ...
Radii $x_i^\Delta$: sampled from a "reasonable" nonnegative distribution (continuous, finite variance): uniform, one-sided normal, exponential, ...
Simulations support Sokol's conjecture: The clique number is logarithmic on average!
Thus: The algorithm is polynomial on average.
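A hedged reproduction of the simulation setup (my choice of distributions within the stated assumptions), reusing narrowed_intervals and interval_clique_number from the sketches above; Sokol's conjecture predicts that the average clique number grows like log n.

```python
import numpy as np

def avg_clique_number(n, trials=200, seed=0):
    """Average clique number omega_n of the narrowed-interval graph."""
    rng = np.random.default_rng(seed)
    total = 0
    for _ in range(trials):
        c = rng.normal(size=n)          # centers: a "reasonable" distribution
        d = rng.exponential(size=n)     # radii: a "reasonable" nonnegative one
        lo, hi = narrowed_intervals(c - d, c + d)
        total += interval_clique_number(lo, hi)
    return total / trials

for n in (10, 100, 1000):
    print(n, round(avg_clique_number(n), 2), round(np.log(n), 2))
```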

  18. Sokol's conjecture
[Figure shown on the slide.]
