Nonparametric testing by convex optimization
Anatoli Juditsky (University J. Fourier)
Joint research with Alexander Goldenshluger (University of Haifa) and Arkadi Nemirovski (ISyE, Georgia Tech, Atlanta)
Gargantua, November 26, 2013
Motivation: event detection in sensor networks [Tartakovsky, Veeravalli, 2004, 2008]
[Figure: an array of 20 sensors on the uniform grid along the left and bottom edges of $[0,1]^2$. “+” marks the points of the uniform 20 × 20 grid Γ, “•” marks the sensor positions, superimposed on a contour plot of the response of the 6th sensor.]
Suppose that $m$ sensors are deployed on a domain $G \subseteq \mathbb{R}^d$, and we are given a grid $\Gamma = (\gamma_i)_{i=1,\dots,n} \subset G$.
• An event at a node $\gamma_i \in \Gamma$ produces the signal $s = r e[i] \in \mathbb{R}^n$ (a function on $\Gamma$) with known signature $e[i]$ and unknown real factor $r$.
• The signal is contaminated by a nuisance (a background signal) $v \in V$, where $V$ is a known convex compact set in $\mathbb{R}^n$.
• The observation $\omega = [\omega_1; \dots; \omega_m]$ of the array of $m$ sensors is a linear transformation of the signal, contaminated with random noise: $\omega \sim P_\mu$ is a random vector in $\mathbb{R}^m$ whose distribution is parameterized by $\mu \in \mathbb{R}^m$, where $\mu = A(s + v)$ and $A \in \mathbb{R}^{m \times n}$ is a known matrix of sensor responses.
Objective: test the (null) hypothesis $H_0$ that no event happened against the alternative $H_1$ that exactly one event took place. We require that
• $A e[i] \neq 0$ for all $i$;
• under $H_1$, when an event occurs at a node $\gamma_i \in \Gamma$, we have $s = r e[i]$ with $|r| \ge \rho_i$ for some given $\rho_i > 0$.
Problem ($D_\rho$): given $\rho = [\rho_1; \dots; \rho_n] > 0$, decide between
• hypothesis $H_0$: $s = 0$, and
• alternative $H_1(\rho)$: $s = r e[i]$ for some $i \in \{1, \dots, n\}$ and $r$ with $|r| \ge \rho_i$.
The risk of a test is the maximal probability of rejecting $H_0$ when it is true or of accepting $H_0$ when $H_1(\rho)$ is true.
Our goal: given $\epsilon \in (0, 1)$, construct a test with risk $\le \epsilon$ for as wide an alternative $H_1(\rho)$ as possible (i.e., with $\rho$ as small as possible).
A particular case: signal detection in convolution [Yin, 1988, Wang, 1995, Muller 1999, Gustavson, 2000, Antoniadis, Gijbels, 2002, Goldenshluger et al., 2008, ...]
We consider the model with observation $\omega = A(s + v) + \sigma\xi$, where $s, v \in \mathbb{R}^n$ and $\xi \sim N(0, I_m)$ with known $\sigma > 0$.
Let $\mu = [\mu_1; \dots; \mu_m]$ be the vector of $m$ consecutive outputs of a discrete-time linear dynamical system with a given impulse response $\{g_k\}$, $k = 1, \dots, T$; that is, $\mu \in \mathbb{R}^m$ is the convolution image of an $n$-dimensional “signal” $s$ (so that $n = m + T - 1$). $A$ is the $m \times n$ Toeplitz matrix of the described linear mapping $x \mapsto \mu$.
[Figure: convolution kernel, $m = 100$, $n = 159$.]
We want to detect the presence of the signal $s = r e[i]$, where $e[i]$, $i = 1, \dots, n$, are given vectors in $\mathbb{R}^n$.
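As a small illustration of the convolution model on this slide, the sketch below builds an $m \times n$ Toeplitz matrix from an impulse response and applies it to a spike signal. The function names `toeplitz_conv_matrix` and `matvec`, the 0-based indexing, the toy kernel, and the alignment convention (output $j$ depends on inputs $j, \dots, j + T - 1$) are assumptions made for the example; the slides do not fix them.

```python
def toeplitz_conv_matrix(g, m):
    """Return the m x n matrix A (as a list of rows), n = m + T - 1, with
    (A s)[j] = sum_k g[k] * s[j + T - 1 - k]  (assumed alignment, 0-based)."""
    T = len(g)
    n = m + T - 1
    A = [[0.0] * n for _ in range(m)]
    for j in range(m):
        for k in range(T):
            A[j][j + T - 1 - k] = g[k]
    return A


def matvec(A, x):
    """Plain matrix-vector product."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


g = [0.5, 0.3, 0.2]      # toy impulse response (hypothetical), T = 3
m = 4
A = toeplitz_conv_matrix(g, m)
n = len(A[0])            # n = m + T - 1 = 6
s = [0.0] * n
s[3] = 2.0               # s = r * e[i] with r = 2 and a unit-spike signature
mu = matvec(A, s)        # noiseless sensor output mu = A s
```

With $m = 100$ and $n = 159$ as in the figure, the relation $n = m + T - 1$ gives $T = 60$.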
Situation, formally
Given are
• “Observation space” $(\Omega, P)$:
  $\Omega$: a Polish (complete separable metric) space;
  $P$: a $\sigma$-finite $\sigma$-additive Borel measure on $\Omega$.
• A family $\mathcal{P} = \{P_\mu(d\omega) = p_\mu(\omega) P(d\omega) : \mu \in \mathcal{M}\}$ of probability distributions on $\Omega$:
  $\mu$: the distribution's parameter, running through the “parameter space” $\mathcal{M} \subset \mathbb{R}^m$;
  $p_\mu$: the density of the distribution $P_\mu$ w.r.t. the reference measure $P$.
• “Parameter spaces”: two nonempty convex compact subsets $M_0 \subset \mathcal{M}$ and $M_1 \subset \mathcal{M}$.
Assumptions
We assume that
• $\mathcal{M} \subset \mathbb{R}^m$ is a convex set which coincides with its relative interior;
• the distributions $P_\mu \in \mathcal{P}$ possess densities $p_\mu(\omega)$ w.r.t. the measure $P$ on the space $\Omega$; $p_\mu(\omega)$ is continuous in $\mu \in \mathcal{M}$ and positive for all $\omega \in \Omega$;
• we are given a finite-dimensional linear space $\mathcal{F}$ of continuous functions on $\Omega$, containing constants, such that $\ln(p_\mu(\cdot)/p_\nu(\cdot)) \in \mathcal{F}$ whenever $\mu, \nu \in \mathcal{M}$;
• for every $\phi \in \mathcal{F}$, the function
  $F_\phi(\mu) = \ln\left(\int_\Omega \exp\{\phi(\omega)\}\, p_\mu(\omega)\, P(d\omega)\right)$
  is well defined and concave in $\mu \in \mathcal{M}$.
We call the situation just described a good observation scheme.
... and goal
Given an observation scheme [observation space $(\Omega, P)$ and family of distributions $\{p_\mu(\cdot)\}_{\mu \in \mathcal{M}}$], “parameter spaces” $M_0$, $M_1$, and a random observation $\omega \sim p_\mu(\cdot)$ coming from some unknown $\mu$ known to belong either to $M_0$ (hypothesis $H_0$) or to $M_1$ (hypothesis $H_1$), decide between $H_0$ and $H_1$.
Risk of a test: given a test (we interpret the value 0 as accepting $H_0$ and 1 as accepting $H_1$), we consider the quantities
  $\epsilon_0 = \sup_{\mu \in M_0} \mathrm{Prob}_{\omega \sim P_\mu}\{\text{test rejects } H_0\}$,
  $\epsilon_1 = \sup_{\mu \in M_1} \mathrm{Prob}_{\omega \sim P_\mu}\{\text{test rejects } H_1\}$.
We say that the risk of the test is $\le \epsilon$ if both error probabilities are $\le \epsilon$.
Example: Gaussian case
Given a noisy observation $\omega = \mu + \xi$, $\xi \sim N(0, I)$, make conclusions about $\mu$. The observation scheme is
• $(\Omega, P)$: $\mathbb{R}^m$ with Lebesgue measure;
• $p_\mu(\omega) = N(\mu, I)$, $\mu \in \mathcal{M} := \mathbb{R}^m$;
• $\mathcal{F} = \{\phi(\omega) = a^T\omega + b : a \in \mathbb{R}^m, b \in \mathbb{R}\}$, and
  $\ln\left(\int_{\mathbb{R}^m} e^{a^T\omega + b}\, p_\mu(\omega)\, d\omega\right) = b + a^T\mu + \frac{a^T a}{2}$
  is concave (indeed affine) in $\mu$.
The Gaussian observation scheme is good!
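The closed form for the Gaussian scheme can be checked numerically in one dimension. The sketch below compares a trapezoid-rule approximation of the integral with $b + a\mu + a^2/2$; the function names, the integration window, and the test values of $a$, $b$, $\mu$ are assumptions chosen for illustration.

```python
import math


def log_mgf_numeric(a, b, mu, lo=-12.0, hi=12.0, steps=4800):
    """ln of the integral of exp(a*w + b) * N(w; mu, 1) dw over [lo, hi],
    approximated by the trapezoid rule (1-D case only)."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        w = lo + i * h
        dens = math.exp(-0.5 * (w - mu) ** 2) / math.sqrt(2.0 * math.pi)
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * math.exp(a * w + b) * dens
    return math.log(total * h)


def log_mgf_closed_form(a, b, mu):
    """b + a*mu + a^2/2, the 1-D instance of the formula on the slide."""
    return b + a * mu + 0.5 * a * a


a, b, mu = 0.7, -0.2, 1.5
numeric = log_mgf_numeric(a, b, mu)
exact = log_mgf_closed_form(a, b, mu)
```

Since the closed form is affine in $\mu$, the concavity requirement of a good observation scheme holds trivially here.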
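Example: Poisson case
Given $m$ realizations of independent Poisson random variables $\omega_i \sim \mathrm{Poisson}(\mu_i)$ with parameters $\mu_i$, make conclusions about $\mu$. The observation scheme is
• $(\Omega, P)$: $\mathbb{Z}^m_+$ with counting measure;
• $p_\mu(\omega) = \frac{\mu^\omega}{\omega!}\, e^{-\sum_i \mu_i}$ (with $\mu^\omega = \prod_i \mu_i^{\omega_i}$ and $\omega! = \prod_i \omega_i!$), $\mu \in \mathcal{M} = \mathrm{int}\, \mathbb{R}^m_+$;
• $\mathcal{F} = \{\phi(\omega) = a^T\omega + b : a \in \mathbb{R}^m, b \in \mathbb{R}\}$, and
  $\ln\left(\sum_{\omega \in \mathbb{Z}^m_+} e^{a^T\omega + b}\, p_\mu(\omega)\right) = b + \sum_{i=1}^m [e^{a_i} - 1]\, \mu_i$
  is concave in $\mu$.
The Poisson observation scheme is good!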
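The Poisson identity follows from factorizing the sum over the coordinates of $\omega$ and recognizing the exponential series in each factor. A short derivation, included here as a check rather than taken from the slides:

```latex
\begin{aligned}
\sum_{\omega \in \mathbb{Z}^m_+} e^{a^T\omega + b}\, p_\mu(\omega)
  &= e^{b} \prod_{i=1}^m \sum_{\omega_i = 0}^{\infty}
     \frac{(e^{a_i}\mu_i)^{\omega_i}}{\omega_i!}\, e^{-\mu_i}
   = e^{b} \prod_{i=1}^m \exp\{ e^{a_i}\mu_i - \mu_i \}, \\
\ln\Big( \sum_{\omega \in \mathbb{Z}^m_+} e^{a^T\omega + b}\, p_\mu(\omega) \Big)
  &= b + \sum_{i=1}^m [e^{a_i} - 1]\, \mu_i .
\end{aligned}
```

The right-hand side is affine in $\mu$, hence concave.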
Example: discrete case
Given a realization of a random variable $\omega$ taking values $1, \dots, m$ with probabilities $\mu_i := \mathrm{Prob}\{\omega = i\}$, make conclusions about $\mu$. The observation scheme is
• $(\Omega, P)$: $\{1, \dots, m\}$ with counting measure;
• $p_\mu(\omega) = \mu_\omega$, $\mu \in \mathcal{M} = \left\{\mu \in \mathbb{R}^m : \mu > 0,\ \sum_{\omega=1}^m \mu_\omega = 1\right\}$;
• $\mathcal{F} = \mathbb{R}(\Omega) = \mathbb{R}^m$, and
  $\ln\left(\sum_{\omega \in \Omega} e^{\phi(\omega)}\, p_\mu(\omega)\right) = \ln\left(\sum_{\omega=1}^m e^{\phi(\omega)}\, \mu_\omega\right)$
  is concave in $\mu$.
The discrete observation scheme is good!
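In the discrete scheme the function is the logarithm of a linear form in $\mu$, which is why it is concave. The sketch below spot-checks midpoint concavity at one pair of probability vectors; the function name `F` and the numbers are arbitrary assumptions, and a single check is of course not a proof.

```python
import math


def F(phi, mu):
    """F_phi(mu) = ln( sum_w exp(phi[w]) * mu[w] ) for the discrete scheme."""
    return math.log(sum(math.exp(p) * q for p, q in zip(phi, mu)))


phi = [1.0, -0.5, 0.3]          # an arbitrary detector, i.e. a vector in R^3
mu = [0.2, 0.5, 0.3]            # two points of the open probability simplex
nu = [0.6, 0.1, 0.3]
mid = [0.5 * (x + y) for x, y in zip(mu, nu)]

# Concavity implies F(midpoint) >= average of the endpoint values.
gap = F(phi, mid) - 0.5 * (F(phi, mu) + F(phi, nu))
```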
Simple test
Simple (Cramer’s) test: a simple test is specified by a detector $\phi(\cdot) \in \mathcal{F}$; given the observation $\omega$, it accepts $H_0$ if $\phi(\omega) \ge 0$ and accepts $H_1$ otherwise.
We can easily bound the risk of a simple test $\phi$: for $\mu \in M_0$ we have
  $\mathrm{Prob}_{\omega \sim P_\mu}(\phi(\omega) < 0) \le E_{\omega \sim P_\mu}(e^{-\phi(\omega)}) = \int_\Omega e^{-\phi(\omega)}\, p_\mu(\omega)\, P(d\omega)$,
and for $\nu \in M_1$,
  $\mathrm{Prob}_{\omega \sim P_\nu}(\phi(\omega) \ge 0) \le E_{\omega \sim P_\nu}(e^{\phi(\omega)}) = \int_\Omega e^{\phi(\omega)}\, p_\nu(\omega)\, P(d\omega)$.
We associate with $\phi(\cdot) \in \mathcal{F}$ and $[\mu; \nu] \in M_0 \times M_1$ the aggregate
  $\Phi(\phi, [\mu; \nu]) = \ln\left(\int_\Omega e^{-\phi(\omega)}\, p_\mu(\omega)\, P(d\omega)\right) + \ln\left(\int_\Omega e^{\phi(\omega)}\, p_\nu(\omega)\, P(d\omega)\right)$.
Key observation: in a good observation scheme, $\Phi(\phi, [\mu; \nu])$ is continuous on its domain, convex in $\phi(\cdot) \in \mathcal{F}$ and concave in $[\mu; \nu] \in M_0 \times M_1$.
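Both risk bounds are instances of Markov's inequality applied to $e^{\mp\phi(\omega)}$. The sketch below compares the exact error probability with the exponential bound for an affine detector in the one-dimensional Gaussian scheme; the function names and the values of $a$, $b$, $\mu$ are assumptions for illustration.

```python
import math


def type1_error_exact(a, b, mu):
    """Prob{ a*w + b < 0 } for w ~ N(mu, 1), assuming a > 0."""
    return 0.5 * math.erfc((mu + b / a) / math.sqrt(2.0))


def type1_error_bound(a, b, mu):
    """E exp(-phi(w)) = exp(-b - a*mu + a^2/2) for phi(w) = a*w + b."""
    return math.exp(-b - a * mu + 0.5 * a * a)


a, b, mu = 1.0, 0.0, 2.0                 # detector phi(w) = w, true mean 2
exact = type1_error_exact(a, b, mu)      # probability of wrongly rejecting H0
bound = type1_error_bound(a, b, mu)      # exponential upper bound from the slide
```

The bound is loose for an arbitrary detector; the point of the aggregate $\Phi$ is that it can be minimized over $\phi$ by convex optimization.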
Main result
Theorem 1
(i) $\Phi(\phi, [\mu; \nu])$ possesses a saddle point (min in $\phi$, max in $[\mu; \nu]$) $(\phi_*(\cdot), [\mu_*; \nu_*])$ on $\mathcal{F} \times (M_0 \times M_1)$ with the saddle value
  $\min_{\phi \in \mathcal{F}}\ \max_{[\mu; \nu] \in M_0 \times M_1} \Phi(\phi, [\mu; \nu]) := 2\ln(\varepsilon_*)$.
The risk of the simple test associated with the detector $\phi_*$ on the composite hypotheses $H_{M_0}$, $H_{M_1}$ is $\le \varepsilon_*$.
(ii) The detector $\phi_*$ is readily given by the $[\mu; \nu]$-component $[\mu_*; \nu_*]$ of the associated saddle point of $\Phi$; specifically,
  $\phi_*(\cdot) = \frac{1}{2} \ln\left[p_{\mu_*}(\cdot) / p_{\nu_*}(\cdot)\right]$.
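To make the theorem concrete, here is a minimal sketch for the Gaussian scheme, under assumptions that are not in the slides: $M_0$ and $M_1$ are taken to be two disjoint Euclidean balls so that the closest points are available in closed form. For $N(\mu, I)$ densities, $\int \sqrt{p_\mu p_\nu}\, d\omega = \exp\{-\|\mu - \nu\|^2/8\}$, so the saddle point pairs the closest points of the two sets, the detector is affine, and $\varepsilon_* = \exp\{-\|\mu_* - \nu_*\|^2/8\}$. The function name `gaussian_detector_for_balls` is hypothetical.

```python
import math


def gaussian_detector_for_balls(c0, r0, c1, r1):
    """Closest points mu_star in ball(c0, r0) and nu_star in ball(c1, r1),
    the affine detector phi_star, and eps_star = exp(-||mu_star - nu_star||^2 / 8)."""
    diff = [y - x for x, y in zip(c0, c1)]
    dist = math.sqrt(sum(d * d for d in diff))
    if dist <= r0 + r1:
        raise ValueError("the balls must be disjoint")
    u = [d / dist for d in diff]                    # unit vector from c0 to c1
    mu_star = [x + r0 * ui for x, ui in zip(c0, u)]
    nu_star = [y - r1 * ui for y, ui in zip(c1, u)]
    gap = dist - r0 - r1                            # equals ||mu_star - nu_star||
    eps_star = math.exp(-gap * gap / 8.0)
    mid = [0.5 * (p + q) for p, q in zip(mu_star, nu_star)]

    def phi_star(w):
        # 0.5 * ln(p_mu*(w) / p_nu*(w)) = 0.5 * (mu* - nu*)^T (w - (mu* + nu*)/2)
        return 0.5 * sum((p - q) * (wi - mi)
                         for p, q, wi, mi in zip(mu_star, nu_star, w, mid))

    return mu_star, nu_star, phi_star, eps_star


mu_star, nu_star, phi_star, eps_star = gaussian_detector_for_balls(
    [0.0, 0.0], 1.0, [6.0, 0.0], 1.0)
```

In this toy instance the detector is positive on the $M_0$ side of the separating hyperplane (accept $H_0$) and negative on the $M_1$ side. For general convex compact $M_0$, $M_1$ the closest-point problem would be handed to a convex solver instead.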