Making Sense of Suppressions and Failures in Sensor Data: A Bayesian Approach


  1. Making Sense of Suppressions and Failures in Sensor Data: A Bayesian Approach. Adam Silberstein (Yahoo! Research); Jun Yang, Kamesh Munagala (Duke CS); Gavino Puggioni, Alan Gelfand (Duke ISDS). September 27, 2007. Silberstein, VLDB 2007

  2. Introduction. What is a sensor network? A collection of nodes. Node components: sensors (e.g. temperature), radio (wireless) communication, battery power. [Images: Crossbow Mica2, WiSARD]

  3. Duke Forest Deployment

  4. Getting All the Data. Scientists often want ALL the data! No aggregates (e.g. mean). Continuous reporting: repeatedly transmit readings to the root, explicitly construct a central DB, and use traditional processing techniques. But radio costs are too high: transmitting a bit over radio costs ~1000 times more than executing a machine instruction. So push processing into the network with suppression.

  5. Outline: 1. Suppression. 2. Failure! 3. Coping using redundancy. 4. BaySail: inference of missing readings and parameters.

  6. Suppression. Push-based communication: only report deviations from a model. Value-based temporal suppression, with model temp_t = temp_(t-1): if (curr_temp != last_sent_temp) { transmit(curr_temp); last_sent_temp = curr_temp; } In practice, include an error tolerance.
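The suppression rule on this slide can be sketched end to end; a minimal Python version with the error tolerance ε made explicit (the function name is illustrative, not from the paper):

```python
def temporal_suppress(readings, eps=0.3):
    """Value-based temporal suppression: report a reading only when it
    deviates from the last *reported* value by more than eps."""
    reports = []          # (time, value) pairs actually transmitted
    last_sent = None
    for t, x in enumerate(readings):
        if last_sent is None or abs(x - last_sent) > eps:
            reports.append((t, x))
            last_sent = x
    return reports

# With eps = 0.3, the reading 3.7 is suppressed (within 0.3 of 3.5):
print(temporal_suppress([2.5, 3.5, 3.7, 2.7]))  # [(0, 2.5), (1, 3.5), (3, 2.7)]
```

Note the comparison is always against the last reported value, not the previous reading, so a slow drift eventually triggers a report.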

  7. The Catch for Suppression. What about reports generated, but lost to failure? [Diagram] The environment produces y1, y2, y3, y4. The sensor's suppression scheme (possibly a spatio-temporal suppression scheme with intra-node communication) transmits y1, ?2, y3, y4 (y2 suppressed); the network delivers y1, ?2, y3, ?4 to the base station (y4 lost to failure). For non-reported values, the base station cannot distinguish failures from suppressions.

  8. Coping With Failure. Focus on simple temporal suppression; learn ALL missing values. Two coping strategies. System-level acks + re-transmissions: the sender re-sends until the receiver returns an acknowledgement, minimizing the chance a report is not received. Application-level redundancy: augment existing reports, minimizing the impact of a missing report.

  9. Redundancy. Temporal suppression with error tolerance: report only if the reading changes beyond ε since last reported. 5 report types (name: payload addition), in order of increasing payload and increasing info: Standard: node reading. Counter: incrementing report number. Timestamp: last n report times. Timestamp+D: last n report times + direction bits. History: last n report times + readings.

  10. TinyOS Implementation. Application-level redundancy is simple to implement: 40-50 lines of additional code on a tutorial example. Lower-level redundancy: activate "acks" in MAC-layer code; re-transmissions in application code. Failure rates are tied to distance, clearance, battery, etc., and independent over time; a 30% failure rate with a maximum of 2 re-transmissions gives a <3% effective failure rate.
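The quoted effective failure rate follows from independence of failures across attempts: with per-transmission failure probability p and up to r re-transmissions, a report is lost only if all r+1 attempts fail.

```python
def effective_failure_rate(p, retries):
    """Probability a report is lost: the original transmission and all
    `retries` re-transmissions must fail independently."""
    return p ** (retries + 1)

# 30% per-attempt failure, up to 2 re-transmissions:
print(effective_failure_rate(0.3, 2))  # ≈ 0.027, i.e. under 3%
```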

  11. Suppression-Aware Inference. Redundancy + knowledge of the suppression scheme ⇒ hard constraints on missing data. Temporal suppression with ε = 0.3, prediction = last reported. Actual: (x1, x2, x3, x4) = (2.5, 3.5, 3.7, 2.7). Base station receives: (2.5, nothing, nothing, 2.7). With Timestamp (r = 1): (2.5, failed, suppressed, 2.7); |x2 − 2.5| > 0.3; |x3 − x2| ≤ 0.3; |2.7 − x2| > 0.3. With Timestamp + Direction Bit (r = 1): (2.5, failed & increased, suppressed, 2.7 & decreased); x2 − 2.5 > 0.3; −0.3 ≤ x3 − x2 ≤ 0.3; x2 − 2.7 > 0.3. With Count: one suppression and one failure in x2 and x3, not sure which: a very hairy constraint! Posterior: p(X_mis, Θ | X_obs), with X_mis subject to constraints.
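The constraint sets on this slide can be written down directly; a sketch that checks whether a candidate (x2, x3) is consistent with each redundancy scheme in the slide's example (helper names are illustrative):

```python
EPS = 0.3  # suppression tolerance from the slide's example

def feasible_timestamp(x2, x3):
    """Constraints implied by Timestamp reports (r = 1):
    x2 was sent but lost, x3 was suppressed, 2.7 was sent after x2."""
    return (abs(x2 - 2.5) > EPS and      # x2 deviated from last report 2.5
            abs(x3 - x2) <= EPS and      # x3 suppressed relative to x2
            abs(2.7 - x2) > EPS)         # 2.7 deviated from last report x2

def feasible_direction(x2, x3):
    """Tighter constraints once direction bits say x2 increased and the
    final report 2.7 decreased."""
    return (x2 - 2.5 > EPS and
            abs(x3 - x2) <= EPS and
            x2 - 2.7 > EPS)

# The actual hidden readings (3.5, 3.7) satisfy both constraint sets:
print(feasible_timestamp(3.5, 3.7), feasible_direction(3.5, 3.7))  # True True
```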

  12. Using Redundancy. [Figure] Comparison of "just data" vs. Bayesian, model-based inference (AR(1) with uncertain parameter). BayBase, no knowledge of suppression: x2, x3 unconstrained. BaySail, knowledge of suppression + Timestamps: x2 ∉ [2.2, 3.0]; x3 ∈ [x2 − 0.3, x2 + 0.3]. BaySail, knowledge of suppression + Timestamps + Direction Bits: x2 > 3.0; x3 ∈ [x2 − 0.3, x2 + 0.3].

  13. BaySail Key Features. 1. Estimates missing readings/parameters. 2. The Bayesian approach provides posterior distributions, not just single point estimates. 3. Missing data are not generically missing: constrain the possible settings using the suppression scheme and redundancy. 4. Computing posteriors is hard: Gibbs sampling iteratively generates samples of the reading time series and of each parameter. 5. Combine simple, low-cost in-network reporting with efficient out-of-network inference.

  14. BaySail Experimental Example. Simple model of soil moisture: y_{s,t} = c_t + φ y_{s,t−1} + ε_{s,t}. c_t is a series of known precipitations; φ ∈ (0,1) controls how fast moisture escapes the soil. Cov(Y_{s,t}, Y_{s',t'}) = σ² (φ^{|t−t'|} / (1 − φ²)) exp(−τ ‖s − s'‖); τ controls the strength of spatial correlation over distance. Prior: 1/σ² ~ Gamma, φ ~ U(0,1), τ ~ Gamma. Joint posterior: p(Y_mis, φ, σ², τ | Y_obs), subject to constraints.
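A sketch of the temporal part of this model at a single site, simulating y_{s,t} = c_t + φ·y_{s,t−1} + ε_{s,t} with Gaussian noise (parameter values and function name are illustrative, not from the paper):

```python
import random

def simulate_site(c, phi=0.9, sigma=1.0, y0=0.0, seed=0):
    """Simulate y_t = c_t + phi * y_{t-1} + eps_t, eps_t ~ N(0, sigma^2).
    c is the series of known precipitation inputs."""
    rng = random.Random(seed)
    y, series = y0, []
    for c_t in c:
        y = c_t + phi * y + rng.gauss(0.0, sigma)
        series.append(y)
    return series

# One rain event, then dry rounds: moisture decays geometrically (phi < 1).
precip = [5.0] + [0.0] * 9
print(simulate_site(precip))
```

With sigma set to 0 the decay is exact: 5.0, 4.5, 4.05, … which is the qualitative behavior φ is meant to capture.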

  15. Why the Direction Bit? TS gives OR constraints: |x2 − x1| > ε, leading to inefficient rejection sampling. TS+D gives a linear constraint: x1 − x2 > ε, which allows for more efficient sampling [Rodriguez-Yam et al. 04]. >100× improvement: the major reason for the direction bit!
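The gap can be illustrated with a toy conditional: under the OR constraint most proposals are rejected, while the one-sided linear constraint admits direct truncated-normal sampling (here via the standard library's `statistics.NormalDist.inv_cdf`, a simplified stand-in for the method of Rodriguez-Yam et al.):

```python
import random
from statistics import NormalDist

random.seed(1)
EPS = 1.0
mu, sd = 5.0, 0.5   # Gibbs conditional centered near the last reported value
n = 100_000

# TS alone gives the OR constraint |x - mu| > EPS. The conditional puts
# most of its mass inside the forbidden band, so rejection sampling
# throws away the vast majority of draws.
kept = sum(1 for _ in range(n) if abs(random.gauss(mu, sd) - mu) > EPS)
print("rejection acceptance rate:", kept / n)   # roughly 0.05 (two small tails)

# TS+D gives the linear constraint x - mu > EPS. The truncated normal can
# be sampled directly by inverse-CDF: every draw satisfies the constraint.
nd = NormalDist(mu, sd)
lo = nd.cdf(mu + EPS)
xs = [nd.inv_cdf(random.uniform(lo, 1.0)) for _ in range(1_000)]
print("direct draws feasible:", all(x > mu + EPS - 1e-9 for x in xs))
```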

  16. 3 Missing Values Cluster. [Figure] BayBase: conditioning on model and endpoints. BaySail: conditioning on model, endpoints, and the fact that missing values are suppressions.

  17. Metrics. Compare posterior mean to actual? The mean is misleading for bimodal distributions. High density regions (hdr): given a percentage x, return the minimal-length range(s) of values such that x% of the sample's probability density is contained in the range(s); ensure the hdr covers the actual reading x% of the time. [Figure: 50% and 90% hdrs for readings r1-r4]
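For the single-interval case, the minimal-length region can be found by sliding a window over the sorted samples; a sketch (a true hdr may be a union of ranges for bimodal posteriors, which this simplification does not handle):

```python
def hdr_interval(samples, coverage=0.8):
    """Shortest interval containing `coverage` of the samples.
    (A full high-density region may be a union of intervals for
    bimodal posteriors; this sketch covers the unimodal case.)"""
    xs = sorted(samples)
    k = max(1, int(round(coverage * len(xs))))  # points the window must hold
    width, i = min((xs[i + k - 1] - xs[i], i) for i in range(len(xs) - k + 1))
    return xs[i], xs[i + k - 1]

# The 80% hdr of a uniform grid 0..99 is some width-79 window:
lo, hi = hdr_interval(range(100), 0.8)
print(hi - lo)  # 79
```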

  18. Cost vs. HDR Interval. Parameters induce a 60% suppression rate: σ² = 1.0, φ = 0.9, ε = 1.0. Failure rate 30%. 3 schemes. Samp(τ): fixed reporting every τ rounds. Supp/TD(r): timestamp + direction for last r reports. Supp/Ack(r): maximum r re-transmission attempts.

  19. Readings Interval, 80% hdr. ⇒ BaySail demonstrates significant improvement.

  20. Phi Interval, 80% hdr. ⇒ The choice of scheme has little effect for the process parameter.

  21. Spatial Inference. [Figure: 3×3 grid, nodes 1-9]

  22. Conclusion. Suppression is a viable technique only when made robust to failure. BaySail combines low-cost in-network redundancy with efficient out-of-network statistical inference, generating posterior distributions on raw missing values and process parameters. Future challenges: sophisticated spatio-temporal schemes (failure on in-network constraints; failure of model parameter transmission); storing query results.
