

  1. On Computing Optimal Thresholds in Decentralized Sequential Hypothesis Testing
Can Cui and Aditya Mahajan
Electrical and Computer Engineering Department, McGill University
54th Conference on Decision and Control (CDC 2015)

  2. Outline

  3. Introduction
Sequential hypothesis testing arises in sensor networks, intrusion detection, primary channel detection, quality control, clinical trials, etc.
Decentralized sequential hypothesis testing: decisions are made in a decentralized manner by multiple decision makers.
Motivation: various results establish the optimality of threshold-based strategies in different setups, but there are few results on how to compute the optimal thresholds.

  4. Problem Formulation: Model
Consider the decentralized sequential hypothesis testing problem investigated in Teneketzis and Ho (1987).
Decision makers: two decision makers DM^i, i ∈ {1, 2}.
Hypothesis: H ∈ {h_0, h_1} with prior probabilities P(H = h_0) = p and P(H = h_1) = 1 − p.
Observations: Y^i_t ∈ 𝒴^i; under hypothesis h_k, {Y^i_t}_{t≥1} are i.i.d. with PMF f^i_k, k ∈ {0, 1}; {Y^1_t}_{t≥1} and {Y^2_t}_{t≥1} are conditionally independent given H.
Strategy: U^i_t ∈ {h_0, h_1, C}, chosen according to U^i_t = g^i_t(Y^i_{1:t}), where C denotes continuing to take observations.

  5. Problem Formulation: Model
Stopping time: N^i = min{t ∈ ℤ_{>0} : U^i_t ∈ {h_0, h_1}}.
Observation cost: c^i for each observation at DM^i.
Stopping cost: ℓ(U^1, U^2, H), which satisfies:
- ℓ(U^1, U^2, H) cannot be decomposed as ℓ(U^1, H) + ℓ(U^2, H);
- for any m, n ∈ {h_0, h_1}, m ≠ n,
  ℓ(m, m, n) ≥ ℓ(n, m, n) ≥ c^i ≥ ℓ(n, n, n) and
  ℓ(m, m, n) ≥ ℓ(m, n, n) ≥ c^i ≥ ℓ(n, n, n).
Goal: given p, choose (g^1, g^2) to minimize J(g^1, g^2; p), where
  J(g^1, g^2; p) = E[c^1 N^1 + c^2 N^2 + ℓ(U^1, U^2, H)].

  6. Problem Formulation: Problem
Problem 1: Given the prior probability p, the observation PMFs f^i_0, f^i_1, the observation costs c^i, and the loss function ℓ, find a strategy (g^1, g^2) that minimizes the cost J(g^1, g^2; p).
Problem 2: Given the prior probability p, the observation PMFs f^i_0, f^i_1, the observation costs c^i, and the loss function ℓ, find a strategy (g^1, g^2) that is person-by-person optimal (PBPO).
A strategy (g^1, g^2) is person-by-person optimal if
  J(g^1, g^2) ≤ J(g̃^1, g^2), for all g̃^1 ∈ 𝒢^1, and
  J(g^1, g^2) ≤ J(g^1, g̃^2), for all g̃^2 ∈ 𝒢^2.

  7. Information State Process
For any i ∈ {1, 2}, let −i denote the other decision maker. For any realization y^i_{1:t} of Y^i_{1:t}, define
  π^i_t := P(H = h_0 | y^i_{1:t}).
In addition, define
  q^i(y^i_{t+1} | π^i_t) := π^i_t · f^i_0(y^i_{t+1}) + (1 − π^i_t) · f^i_1(y^i_{t+1}),   (1)
  φ^i(π^i_t, y^i_{t+1}) := π^i_t · f^i_0(y^i_{t+1}) / q^i(y^i_{t+1} | π^i_t).   (2)
The update of the information state is given by π^i_{t+1} = φ^i(π^i_t, y^i_{t+1}); {π^i_t}_{t≥1} is an information state process for DM^i.
For ease of notation, for any i ∈ {1, 2}, k ∈ {0, 1}, u^i ∈ {h_0, h_1}, and g^i ∈ 𝒢^i, define
  ξ^i_k(u^i, g^i; p) = P(U^i = u^i | H = h_k; g^i, p).
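The information-state update (1)-(2) is a one-step Bayes rule. A minimal sketch in Python (the function name and the binary-observation PMFs below are illustrative, not from the paper):

```python
import numpy as np

def belief_update(pi, y, f0, f1):
    """One-step Bayes update of the information state pi = P(H = h0 | y_{1:t}).

    f0, f1: arrays giving the observation PMF under h0 and h1.
    Implements eqs. (1)-(2): predictive probability q, then posterior phi.
    """
    q = pi * f0[y] + (1.0 - pi) * f1[y]   # q(y | pi), eq. (1)
    return pi * f0[y] / q                 # phi(pi, y), eq. (2)

# toy example: y = 0 is more likely under h0, so it should raise the belief
f0 = np.array([0.8, 0.2])
f1 = np.array([0.3, 0.7])
pi = 0.5
for y in [0, 0, 1, 0]:
    pi = belief_update(pi, y, f0, f1)
```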

  8. Structure of Optimal Decision Rules
Threshold-based strategy: a strategy of the above form is called threshold based if there exist thresholds α^i_t, β^i_t ∈ [0, 1], α^i_t ≤ β^i_t, such that for any π^i ∈ [0, 1],
  g^i_t(π^i) = h_1 if π^i < α^i_t,  C if α^i_t ≤ π^i ≤ β^i_t,  h_0 if π^i > β^i_t.
Time-invariant strategy: a strategy g^i = (g^i_1, g^i_2, ...) is called time invariant if for any π^i ∈ [0, 1], g^i_t(π^i) does not depend on t.
Theorem. For any i ∈ {1, 2} and any time-invariant, threshold-based strategy g^{−i} ∈ 𝒢^{−i}, there is no loss of optimality in restricting attention to time-invariant, threshold-based strategies at DM^i. Moreover, the best-response strategy at DM^i is given by the solution of a dynamic program.
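In code, a time-invariant threshold rule is a three-way comparison on the belief π (function name and string labels below are illustrative):

```python
def threshold_rule(pi, alpha, beta):
    """Threshold-based decision rule on the belief pi = P(H = h0 | data):
    low belief in h0 -> declare h1, high belief -> declare h0,
    in between -> continue sampling."""
    if pi < alpha:
        return "h1"   # belief in h0 is low
    if pi > beta:
        return "h0"   # belief in h0 is high
    return "C"        # continue taking observations
```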

  9. Dynamic Program
For any π^i ∈ [0, 1],
  V^i(π^i) = min{ W^i_0(π^i, g^{−i}), W^i_1(π^i, g^{−i}), W^i_C(π^i, g^{−i}) },   (3)
where for k ∈ {0, 1},
  W^1_k(π^1, g^2) = Σ_{u^2 ∈ {h_0, h_1}} [ ξ^2_0(u^2, g^2; π^1) · π^1 · ℓ(h_k, u^2, h_0) + ξ^2_1(u^2, g^2; π^1) · (1 − π^1) · ℓ(h_k, u^2, h_1) ],   (4)
W^2_k is defined similarly, and
  W^i_C(π^i, g^{−i}) = c^i + [B^i V^i](π^i),   (5)
where B^i is the Bellman operator given by
  [B^i V^i](π^i) = Σ_{y^i} q^i(y^i | π^i) · V^i(φ^i(π^i, y^i)),
and q^i(y^i | π^i) is given by (1).
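As a concrete illustration, the fixed point of (3)-(5) can be computed by value iteration on a discretized belief grid. The sketch below assumes the stopping costs W_0, W_1 have already been evaluated on the grid (they fold in the other DM's strategy through ξ); the function name and the toy 0-1 loss setup at the bottom are illustrative, not from the paper:

```python
import numpy as np

def solve_dp(f0, f1, c, W0, W1, m):
    """Value iteration for the dynamic program (3)-(5) on the belief grid
    {0, 1/m, ..., 1}. W0, W1: stopping costs on the grid. f0, f1: observation
    PMFs, assumed to have full support (so q > 0 everywhere)."""
    grid = np.arange(m + 1) / m
    V = np.minimum(W0, W1)                        # start from "stop now"
    for _ in range(10_000):
        BV = np.empty_like(V)
        for j, pi in enumerate(grid):
            q = pi * f0 + (1 - pi) * f1           # eq. (1), vector over y
            nxt = pi * f0 / q                     # eq. (2), updated beliefs
            BV[j] = q @ np.interp(nxt, grid, V)   # Bellman operator B
        Vnew = np.minimum(np.minimum(W0, W1), c + BV)   # eq. (3)
        if np.max(np.abs(Vnew - V)) < 1e-10:
            return Vnew
        V = Vnew
    return V

# toy symmetric example: 0-1 terminal loss, observation cost 0.05
m = 100
grid = np.arange(m + 1) / m
f0, f1 = np.array([0.8, 0.2]), np.array([0.2, 0.8])
W0, W1 = 1 - grid, grid   # cost of declaring h0 / h1 as a function of pi
V = solve_dp(f0, f1, 0.05, W0, W1, m)
```

The iteration is monotone decreasing from the "stop immediately" value, so it converges; the continuation region is where c + BV beats both stopping costs.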

  10. Algorithms for computing optimal thresholds
We propose two methods to compute the optimal thresholds.
Orthogonal search: iteratively solve
  ⟨α^1, β^1⟩ = D^1(⟨α^2, β^2⟩) and ⟨α^2, β^2⟩ = D^2(⟨α^1, β^1⟩).   (6)
Direct search: approximately compute J(⟨α^1, β^1⟩, ⟨α^2, β^2⟩; p) and search for the optimal ⟨α^1, β^1⟩, ⟨α^2, β^2⟩ using a derivative-free non-convex optimization method.

  11. Orthogonal search
The following procedure is used to solve the coupled dynamic programs:
1. Start with an arbitrary threshold-based strategy ⟨α^1(1), β^1(1)⟩, ⟨α^2(1), β^2(1)⟩.
2. Construct a sequence of strategies as follows:
   - For even n: ⟨α^1(n), β^1(n)⟩ = D^1(⟨α^2(n−1), β^2(n−1)⟩), and ⟨α^2(n), β^2(n)⟩ = ⟨α^2(n−1), β^2(n−1)⟩.
   - For odd n: ⟨α^1(n), β^1(n)⟩ = ⟨α^1(n−1), β^1(n−1)⟩, and ⟨α^2(n), β^2(n)⟩ = D^2(⟨α^1(n−1), β^1(n−1)⟩).
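The alternation above can be sketched as a generic coordinate-wise best-response loop; D1 and D2 are treated as abstract callables here, and the toy contraction maps used as a sanity check are illustrative, not the paper's best-response maps:

```python
def orthogonal_search(D1, D2, th1, th2, max_iter=200, tol=1e-6):
    """Alternating best-response iteration of the orthogonal search:
    even n updates DM1's thresholds via D1, odd n updates DM2's via D2,
    stopping once successive iterates change by less than tol."""
    for n in range(2, max_iter + 2):
        if n % 2 == 0:                      # even n: update DM1, freeze DM2
            new1, new2 = D1(th2), th2
        else:                               # odd n: freeze DM1, update DM2
            new1, new2 = th1, D2(th1)
        delta = max(abs(a - b) for a, b in zip(new1 + new2, th1 + th2))
        th1, th2 = new1, new2
        if delta < tol:
            break
    return th1, th2

# toy best-response maps: contractions with common fixed point (0.2, 0.8)
D = lambda th: (th[0] / 2 + 0.1, th[1] / 2 + 0.4)
a1, a2 = orthogonal_search(D, D, (0.0, 1.0), (0.0, 1.0))
```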

  12. Orthogonal search
Theorem. The orthogonal search procedure described above converges to a time-invariant, threshold-based strategy (g^1, g^2) that is person-by-person optimal.
Proof. Let (g^1(n), g^2(n)) denote the strategy at step n. By construction, J(g^1(n), g^2(n)) ≤ J(g^1(n−1), g^2(n−1)). Thus, {J(g^1(n), g^2(n))} is a non-increasing sequence bounded below by 0. Hence a limit exists, and the limiting strategy is PBPO.

  13. Preliminaries: Discretizing continuous-state Markov chains
For any m ∈ ℕ and any i ∈ {1, 2}, we approximate the [0, 1]-valued Markov process {π^i_t}_{t≥1} by a Markov chain on S_m = {0, 1/m, 2/m, ..., 1}.
Algorithm 1: Compute transition matrices
input: discretization size m, DM^i
output: P^i_0, P^i_1, P^i_*
forall s_p ∈ S_m:
  forall y ∈ 𝒴^i:
    let s^+ = φ^i(s_p, y)
    find s_q, s_{q+1} ∈ S_m such that s^+ ∈ [s_q, s_{q+1})
    find λ^y_q, λ^y_{q+1} ∈ [0, 1] such that λ^y_q + λ^y_{q+1} = 1 and s^+ = λ^y_q · s_q + λ^y_{q+1} · s_{q+1}
  forall q ∈ {0, 1, ..., m}:
    [P^i_0]_{pq} = Σ_y λ^y_q · f^i_0(y) · s_p
    [P^i_1]_{pq} = Σ_y λ^y_q · f^i_1(y) · (1 − s_p)
    [P^i_*]_{pq} = Σ_y λ^y_q · q^i(y | s_p)
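Algorithm 1 can be implemented directly: each updated belief s+ is split between its two neighbouring grid points by linear interpolation. A sketch (function name is illustrative; the PMFs in the sanity check are toy values):

```python
import numpy as np

def transition_matrices(m, f0, f1):
    """Algorithm 1: discretize the belief process onto S_m = {0, 1/m, ..., 1}.

    Returns P0, P1 (belief transitions weighted by the probability of each
    hypothesis at the current belief) and their sum P_*, the unconditional
    chain, which is row-stochastic."""
    S = np.arange(m + 1) / m
    P0 = np.zeros((m + 1, m + 1))
    P1 = np.zeros((m + 1, m + 1))
    for p, sp in enumerate(S):
        for y in range(len(f0)):
            q = sp * f0[y] + (1 - sp) * f1[y]       # q^i(y | s_p), eq. (1)
            splus = sp * f0[y] / q                  # phi^i(s_p, y), eq. (2)
            qi = min(int(np.floor(splus * m)), m - 1)   # left grid neighbour
            lam_right = splus * m - qi              # interpolation weights
            lam_left = 1.0 - lam_right
            for idx, lam in ((qi, lam_left), (qi + 1, lam_right)):
                P0[p, idx] += lam * f0[y] * sp
                P1[p, idx] += lam * f1[y] * (1 - sp)
    return P0, P1, P0 + P1

P0, P1, Pstar = transition_matrices(10, [0.8, 0.2], [0.3, 0.7])
```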

  14. Approximation with discrete-state Markov chain
Fix i and k. Given any threshold-based strategy g^i = ⟨α^i, β^i⟩ such that α^i, β^i ∈ S_m, define the sets A^i_0, A^i_1 ⊂ S_m as
  A^i_0 = {β^i + 1/m, ..., 1} and A^i_1 = {0, 1/m, ..., α^i}.
Then ξ^i_k(h_0, g^i; p) is approximated by the probability that the Markov chain with transition matrix P^i_k that starts in p gets absorbed in the set A^i_0 before it is absorbed in the set A^i_1.
Define θ^i_k(g^i; p) = E[N^i | H = h_k; g^i, p]. Then θ^i_k(g^i; p) can be approximated using the expected stopping time of the Markov chain, i.e., the expected time until the chain starting in p is absorbed in A^i_0 ∪ A^i_1.
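Both quantities reduce to standard first-step (linear) equations on the transient states of a finite absorbing chain. A generic sketch (function name is illustrative, and the gambler's-ruin matrix is only a sanity check, not from the paper):

```python
import numpy as np

def absorption_stats(P, A0, A1, start):
    """For a finite Markov chain with stochastic matrix P, compute
    (i) the probability of absorption in A0 before A1 and
    (ii) the expected time to absorption in A0 ∪ A1,
    starting from `start`, by solving the first-step linear equations."""
    absorbing = set(A0) | set(A1)
    if start in absorbing:
        return (1.0 if start in set(A0) else 0.0), 0.0
    T = [s for s in range(P.shape[0]) if s not in absorbing]
    Q = P[np.ix_(T, T)]                        # transient-to-transient block
    R0 = P[np.ix_(T, list(A0))].sum(axis=1)    # one-step probability into A0
    M = np.eye(len(T)) - Q
    h = np.linalg.solve(M, R0)                 # hit A0 before A1
    tau = np.linalg.solve(M, np.ones(len(T)))  # expected absorption time
    j = T.index(start)
    return h[j], tau[j]

# sanity check: symmetric gambler's ruin on {0, ..., 4}, absorbing at 0 and 4
P = np.zeros((5, 5))
P[0, 0] = P[4, 4] = 1.0
for s in (1, 2, 3):
    P[s, s - 1] = P[s, s + 1] = 0.5
prob, steps = absorption_stats(P, A0=[4], A1=[0], start=2)
```

From the middle state, symmetry gives absorption probability 1/2 and expected absorption time k(n − k) = 4.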
