Structure of optimal strategies for remote estimation over Gilbert-Elliott channel with feedback
Jhelum Chakravorty (joint work with Aditya Mahajan), McGill University
ISIT, June 27, 2017
Motivation
Sequential transmission of data; zero delay in reconstruction.
Motivation
Applications: smart grids; environmental monitoring and sensor networks; Internet of Things.
Salient features: sensing is cheap; transmission is expensive; the size of the data packet is not critical.
Motivation
We study the structure of optimal strategies for a fundamental trade-off between estimation accuracy and transmission cost.
The model
The remote-state estimation setup
[Block diagram: Markov process, Transmitter, Erasure channel, Receiver; signals $X_t$, $U_t$, $Y_t$, $\hat{X}_t$; ACK/NACK feedback.]
Source model
Generic: $X_t \in \mathcal{X}$, where $\mathcal{X}$ is finite or Borel-measurable.
Stylized: $X_{t+1} = a X_t + W_t$, with $X_t \in \mathcal{X}$ and $W_t$ i.i.d.
The remote-state estimation setup
Transmitter: $U_t = f_t(X_{0:t}, S_{0:t-1}, Y_{0:t-1}) \in \{0, 1\}$.
The remote-state estimation setup
Channel model: $S_t$ is Markovian with state transition matrix $Q$; $S_t = 1$: channel ON, $S_t = 0$: channel OFF.
Channel output:
$$Y_t = \begin{cases} X_t, & \text{if } U_t = 1 \text{ and } S_t = 1 \\ E_1, & \text{if } U_t = 0 \text{ and } S_t = 1 \\ E_0, & \text{if } S_t = 0. \end{cases}$$
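The channel model above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the string encodings of $E_0$ and $E_1$, the function names, and the convention that `Q[i][j]` is the probability of moving from state `i` to state `j` are all choices made here, not taken from the slides.

```python
import random

E0, E1 = "E0", "E1"  # hypothetical encodings of the two erasure symbols


def next_channel_state(s_prev, Q, rng):
    """Sample S_t given S_{t-1}; assumes Q[i][j] = P(S_t = j | S_{t-1} = i)."""
    return 1 if rng.random() < Q[s_prev][1] else 0


def channel_output(x, u, s):
    """Channel output Y_t for source value x, decision u and channel state s."""
    if s == 0:
        return E0                # channel OFF: nothing gets through
    return x if u == 1 else E1   # channel ON: the packet, or "no transmission"


def sample_states(s0, Q, horizon, seed=0):
    """Sample a path S_1, ..., S_horizon of the two-state Markov channel."""
    rng = random.Random(seed)
    path, s = [], s0
    for _ in range(horizon):
        s = next_channel_state(s, Q, rng)
        path.append(s)
    return path
```

Because the receiver can tell $E_0$ from $E_1$, the channel output reveals the channel state in every slot, whether or not a packet was sent.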
The remote-state estimation setup
Receiver: $\hat{X}_t = g_t(Y_{0:t})$.
Per-step distortion: $d(X_t - \hat{X}_t)$, where $d(\cdot)$ is even and quasi-convex.
Communication strategies: transmission strategy $f = \{f_t\}_{t=0}^{\infty}$ and estimation strategy $g = \{g_t\}_{t=0}^{\infty}$.
The infinite horizon optimization problem
Discounted setup: $\beta \in (0, 1)$
$$D_\beta(f, g) := (1 - \beta)\, \mathbb{E}^{(f,g)}\Big[ \sum_{t=0}^{\infty} \beta^t d(X_t - \hat{X}_t) \,\Big|\, X_0 = 0 \Big]$$
$$N_\beta(f, g) := (1 - \beta)\, \mathbb{E}^{(f,g)}\Big[ \sum_{t=0}^{\infty} \beta^t U_t \,\Big|\, X_0 = 0 \Big]$$
Long-term average setup: $\beta = 1$
$$D_1(f, g) := \limsup_{T \to \infty} \frac{1}{T}\, \mathbb{E}^{(f,g)}\Big[ \sum_{t=0}^{T-1} d(X_t - \hat{X}_t) \,\Big|\, X_0 = 0 \Big]$$
$$N_1(f, g) := \limsup_{T \to \infty} \frac{1}{T}\, \mathbb{E}^{(f,g)}\Big[ \sum_{t=0}^{T-1} U_t \,\Big|\, X_0 = 0 \Big]$$
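To make the normalization in these definitions concrete, here is a minimal sketch of the sample-path versions of the two criteria. The function names are hypothetical, and the sums are over one finite recorded path rather than an expectation over an infinite horizon.

```python
def discounted_average(values, beta):
    """(1 - beta) * sum_t beta**t * values[t] over a finite sample path."""
    total, weight = 0.0, 1.0
    for v in values:
        total += weight * v
        weight *= beta
    return (1.0 - beta) * total


def long_run_average(values):
    """(1 / T) * sum_t values[t] for a sample path of length T."""
    return sum(values) / len(values)
```

The factor $(1 - \beta)$ puts the discounted criterion on the same scale as the long-term average: a path that incurs a constant per-step value $c$ has discounted average close to $c$ once the path is long enough.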
The infinite horizon optimization problem
Problem: for $\beta \in (0, 1]$,
$$C^*_\beta(\lambda) := \inf_{(f, g)} D_\beta(f, g) + \lambda N_\beta(f, g).$$
Salient features
Multiple decision makers (transmitter and estimator): a decentralized control system.
Cooperative setup: minimization of a common objective function.
Modeled as a team problem, where a team is a group of multiple decision makers working to achieve a common goal.
Decentralized control systems
Pioneers of the theory of teams
Economics: Marschak, 1955; Radner, 1962.
Systems and control: Witsenhausen, 1971; Ho and Chu, 1972.
Remote-state estimation as a team problem
No packet drop: Marschak, 1954; Kushner, 1964; Åström and Bernhardsson, 2002; Xu and Hespanha, 2004; Imer and Başar, 2005; Lipsa and Martins, 2011; Molin and Hirche, 2012; Nayyar, Başar, Teneketzis and Veeravalli, 2013; D. Shi, L. Shi and Chen, 2015.
With packet drop: Ren, Wu, Johansson, G. Shi and L. Shi, 2016; Chen, Wang, D. Shi and L. Shi, 2017.
With noise: Gao, Akyol and Başar, 2015–2017.
Structural results
Structure of optimal strategies
Generic model: $\mathcal{X}$ is finite or Borel-measurable.
Belief states based on common information:
$$\pi^1_t(x) := \mathbb{P}^f(X_t = x \mid S_{0:t-1} = s_{0:t-1}, Y_{0:t-1} = y_{0:t-1}),$$
$$\pi^2_t(x) := \mathbb{P}^f(X_t = x \mid S_{0:t} = s_{0:t}, Y_{0:t} = y_{0:t}).$$
Theorem 1: structure of optimal strategies
$$U_t = f^*_t(X_t, S_{t-1}, \Pi^1_t), \qquad \hat{X}_t = g^*_t(\Pi^2_t).$$
POMDP-like dynamic programming formulation.
Structure of optimal strategies
Stylized model: $X_{t+1} = a X_t + W_t$; $W_t$ unimodal and symmetric.
Theorem 2: Optimal estimator (time homogeneous)
$$\hat{X}_t = \begin{cases} Y_t, & \text{if } Y_t \notin \{E_0, E_1\} \\ a \hat{X}_{t-1}, & \text{if } Y_t \in \{E_0, E_1\}. \end{cases}$$
Theorem 2: Optimal transmitter
For $X_t \in \mathbb{R}$, $U_t$ is a threshold-based action:
$$U_t = \begin{cases} 1, & \text{if } |X_t - a \hat{X}_{t-1}| \ge k(S_{t-1}) \\ 0, & \text{if } |X_t - a \hat{X}_{t-1}| < k(S_{t-1}). \end{cases}$$
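The two rules in Theorem 2 translate almost directly into code. The sketch below assumes scalar states, string encodings for the erasure symbols, and a threshold container `k` indexed by the previous channel state; these names and encodings are illustrative choices, not part of the slides.

```python
E0, E1 = "E0", "E1"  # hypothetical encodings of the two erasure symbols


def transmit(x, x_hat_prev, s_prev, a, k):
    """U_t = 1 iff |X_t - a * Xhat_{t-1}| >= k[S_{t-1}]."""
    return 1 if abs(x - a * x_hat_prev) >= k[s_prev] else 0


def estimate(y, x_hat_prev, a):
    """Xhat_t = Y_t on reception; a * Xhat_{t-1} on either erasure symbol."""
    if y == E0 or y == E1:
        return a * x_hat_prev
    return y
```

For example, with `a = 0.5`, previous estimate `1.0` and `k = {0: 1.0, 1: 2.0}`, a source value of `2.0` deviates from the predicted value `0.5` by `1.5`, so the transmitter sends when the previous channel state was 0 and stays silent when it was 1.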
Proof sketch: Theorem 1
- Use the notion of irrelevant information to show that $(X_t, S_{0:t-1}, Y_{0:t-1})$ is sufficient information at the transmitter.
- Identify the common information: $(S_{0:t-1}, Y_{0:t-1})$ at the transmitter and $(S_{0:t}, Y_{0:t})$ at the receiver.
- Local information: $X_t$ at the transmitter and $\emptyset$ at the receiver.
- Belief states: $\pi^1_t := \mathbb{P}(X_t \mid S_{0:t-1}, Y_{0:t-1})$ at the transmitter and $\pi^2_t := \mathbb{P}(X_t \mid S_{0:t}, Y_{0:t})$ at the receiver.
- Common information approach (Nayyar, Mahajan and Teneketzis, TAC'13): show that $(X_t, S_{t-1}, \pi^1_t)$ is a sufficient statistic at the transmitter and $\pi^2_t$ is a sufficient statistic at the receiver.
Proof sketch: Theorem 2
Change of variables to $E_t$, $E_t^+$, $\hat{E}_t$:
$$Z_t = \begin{cases} a Z_{t-1}, & \text{if } Y_t \in \{E_0, E_1\} \\ Y_t, & \text{if } Y_t \notin \{E_0, E_1\} \end{cases}$$
$$E_t := X_t - a Z_{t-1}, \qquad E_t^+ := X_t - Z_t, \qquad \hat{E}_t := \hat{X}_t - Z_t$$
Step 1: Use forward induction with majorization properties to show that the optimal $\hat{E}_t = 0$; this leads to the structure of the optimal estimator.
Step 2: Fix the optimal estimator. Show, by constructing a threshold-based prescription, that such a transmission strategy is optimal.
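The change of variables is useful because the error process evolves without reference to $X_t$ itself. The short derivation below is a sketch that uses only the definitions of $Z_t$, $E_t$, $E_t^+$ and the stylized source model $X_{t+1} = a X_t + W_t$; it is not reproduced from the slides.

```latex
\begin{align*}
E_{t+1} &= X_{t+1} - a Z_t = (a X_t + W_t) - a Z_t = a E_t^+ + W_t, \\
E_t^+ &= X_t - Z_t =
\begin{cases}
  0,   & \text{if } Y_t \notin \{E_0, E_1\} \quad (Z_t = Y_t = X_t), \\
  E_t, & \text{if } Y_t \in \{E_0, E_1\} \quad (Z_t = a Z_{t-1}).
\end{cases}
\end{align*}
```

So the error resets to zero on every successful reception and otherwise follows the same autoregressive recursion as the source.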
Computation of optimal performance: autoregressive model
Step 1: Computation of the performance of a threshold-based strategy
$$f^{(k)}(E_t, S_{t-1}) = \begin{cases} 1, & \text{if } S_{t-1} = 0 \ \&\ |E_t| \ge k(S_{t-1}) \\ 0, & \text{if } S_{t-1} = 0 \ \&\ |E_t| < k(S_{t-1}). \end{cases}$$
$\tau^{(k)}$: the time a packet was last received successfully.
Step 1: Computation of the performance of a threshold-based strategy
Till the first successful reception:
$$L^{(k)}_\beta := \mathbb{E}\Big[ \sum_{t=0}^{\tau^{(k)} - 1} \beta^t d(E_t) \,\Big|\, E_0 = 0, S_0 = 1 \Big]$$
$$M^{(k)}_\beta := \mathbb{E}\Big[ \sum_{t=0}^{\tau^{(k)} - 1} \beta^t \,\Big|\, E_0 = 0, S_0 = 1 \Big]$$
$$K^{(k)}_\beta := \mathbb{E}\Big[ \sum_{t=0}^{\tau^{(k)}} \beta^t U_t \,\Big|\, E_0 = 0, S_0 = 1 \Big]$$
Step 1: Computation of the performance of a threshold-based strategy
$E_t$ is a regenerative process.
Renewal relationships:
$$D^{(k)}_\beta := D_\beta(f^{(k)}, g^*) = \frac{L^{(k)}_\beta}{M^{(k)}_\beta}, \qquad N^{(k)}_\beta := N_\beta(f^{(k)}, g^*) = \frac{K^{(k)}_\beta}{M^{(k)}_\beta}.$$
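The renewal relationships suggest a simple simulation recipe: sample independent cycles between successful receptions, average the per-cycle quantities, and take ratios. The sketch below does this under several assumptions that are not in the slides: standard Gaussian noise $W_t$, squared-error distortion by default, a cycle that starts right after a reception with zero error and $S_0 = 1$, the threshold rule $|E_t| \ge k(S_{t-1})$ applied for both values of $S_{t-1}$, `Q[i][j]` read as the probability of moving from state `i` to state `j`, and a hypothetical `max_steps` cap that truncates very long cycles.

```python
import random


def simulate_cycle(a, k, Q, beta, d, rng, max_steps=10_000):
    """Simulate one cycle between successful receptions.

    Returns sample values of (L, M, K) for that cycle.
    """
    e_plus, s = 0.0, 1             # error and channel state right after a reception
    disc = 1.0                     # running value of beta**t
    L, M, K = d(e_plus), 1.0, 0.0  # t = 0 terms; no transmission counted at t = 0
    for _ in range(max_steps):
        disc *= beta
        e = a * e_plus + rng.gauss(0.0, 1.0)    # E_t = a * E_{t-1}^+ + W_{t-1}
        u = 1 if abs(e) >= k[s] else 0          # threshold depends on S_{t-1}
        s = 1 if rng.random() < Q[s][1] else 0  # draw S_t from row S_{t-1} of Q
        K += disc * u
        if u == 1 and s == 1:                   # successful reception ends the cycle
            break
        e_plus = e                              # no reception: error carries over
        L += disc * d(e_plus)
        M += disc
    return L, M, K


def renewal_estimates(n_cycles, a, k, Q, beta, d=lambda e: e * e, seed=0):
    """Estimate (D, N) as ratios of summed per-cycle quantities."""
    rng = random.Random(seed)
    sum_l = sum_m = sum_k = 0.0
    for _ in range(n_cycles):
        l, m, kk = simulate_cycle(a, k, Q, beta, d, rng)
        sum_l += l
        sum_m += m
        sum_k += kk
    return sum_l / sum_m, sum_k / sum_m
```

As a sanity check, with an always-ON channel and zero thresholds every cycle lasts one step, so the estimates reduce to $D = 0$ and $N = \beta$, which matches transmitting at every $t \ge 1$ in the discounted criterion.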
Step 2: Optimality condition (JC & AM: TAC'17, NecSys '16)
$D^{(k)}_\beta$, $N^{(k)}_\beta$ and $C^{(k)}_\beta$ are differentiable in $k$.
Theorem: If $(k, \lambda)$ satisfies $\nabla_k D^{(k)}_\beta + \lambda \nabla_k N^{(k)}_\beta = 0$, then $(f^{(k)}, g^*)$ is optimal for costly communication with cost $\lambda$.
$C^*_\beta(\lambda) := C_\beta(f^{(k)}, g^*; \lambda)$ is continuous, increasing and concave in $\lambda$.
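One way to check the stationarity condition numerically is to approximate both gradients by central differences and inspect the residual $\nabla_k D + \lambda \nabla_k N$. The sketch below is generic: `D` and `N` are hypothetical callables mapping a threshold vector to a number (for instance, smooth approximations of $D^{(k)}_\beta$ and $N^{(k)}_\beta$), and the step size `h` is an arbitrary choice.

```python
def stationarity_residual(D, N, k, lam, h=1e-5):
    """Central-difference estimate of grad_k D + lam * grad_k N at k."""
    residual = []
    for i in range(len(k)):
        k_plus, k_minus = list(k), list(k)
        k_plus[i] += h
        k_minus[i] -= h
        grad_d = (D(k_plus) - D(k_minus)) / (2.0 * h)
        grad_n = (N(k_plus) - N(k_minus)) / (2.0 * h)
        residual.append(grad_d + lam * grad_n)
    return residual
```

A pair $(k, \lambda)$ is a candidate when every component of the residual is close to zero. If `D` and `N` come from simulation, finite differences amplify the sampling noise, so this check is best applied to smooth or heavily averaged estimates.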
Step 2: Computation of optimal thresholds
Numerically compute $L^{(k)}_\beta$, $M^{(k)}_\beta$ and $K^{(k)}_\beta$, then use the renewal relationship to compute $C^{(k)}_\beta$. Analytical formulae are difficult to obtain.
Simulation-based approach (JC, JS & AM, ACC'17)
Two DP-based approaches: Monte Carlo (MC) and Temporal Difference (TD).
MC: high variance due to one sample path; low bias.
TD: low variance due to bootstrapping; high bias.
Exploit the regenerative property of the underlying state (error) process.
Renewal Monte Carlo (RMC): low variance (independent sample paths from renewal) and low bias (since it is MC).