Data Caching in WSN
Mario A. Nascimento, Univ. of Alberta, Canada, http://www.cs.ualberta.ca/~mn
With R. Alencar and A. Brayner. Work partially supported by NSERC and CBIE (Canada) and CAPES (Brazil).
Outline
Motivation
Cache-Aware Query Processing
Cache-Aware Query Optimization
Query Partitioning
Cached Data Selection
Cache Maintenance
Experimental Results
(One) Application Scenario
[Figure: diagram with the labels Satellite, User (several), Base Station and WSN]
Using Previous (Cached) Queries
{P_i}: set of previous queries
Q: current query
Q’: Q “minus” {P_i}
[Figure: (a) Q overlapping the cached queries P_1, P_2 and P_3; (b) the remainder Q’]
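A minimal sketch of computing Q’ = Q “minus” {P_i}, assuming that Q and every P_i are axis-aligned rectangles. That restriction is an assumption made here for brevity; the slides do not limit query shapes. The names `subtract`, `remainder` and `area` are illustrative only.

```python
from typing import List, Tuple

# Assumption: a query region is an axis-aligned rectangle (x_min, y_min, x_max, y_max).
Rect = Tuple[float, float, float, float]


def subtract(q: Rect, p: Rect) -> List[Rect]:
    """Return the parts of q not covered by p, as disjoint rectangles."""
    qx1, qy1, qx2, qy2 = q
    px1, py1, px2, py2 = p
    # No overlap (touching edges count as no overlap): q survives whole.
    if px1 >= qx2 or px2 <= qx1 or py1 >= qy2 or py2 <= qy1:
        return [q]
    pieces = []
    if px1 > qx1:  # full-height strip to the left of p
        pieces.append((qx1, qy1, px1, qy2))
    if px2 < qx2:  # full-height strip to the right of p
        pieces.append((px2, qy1, qx2, qy2))
    # Middle column, clipped to the x-extent shared by q and p.
    mx1, mx2 = max(qx1, px1), min(qx2, px2)
    if py1 > qy1:  # piece below p
        pieces.append((mx1, qy1, mx2, py1))
    if py2 < qy2:  # piece above p
        pieces.append((mx1, py2, mx2, qy2))
    return pieces


def remainder(q: Rect, cached: List[Rect]) -> List[Rect]:
    """Q' as a list of disjoint rectangles: q minus every cached region."""
    pieces = [q]
    for p in cached:
        pieces = [r for piece in pieces for r in subtract(piece, p)]
    return pieces


def area(rects: List[Rect]) -> float:
    return sum((x2 - x1) * (y2 - y1) for x1, y1, x2, y2 in rects)


if __name__ == "__main__":
    q = (0, 0, 10, 10)
    print(remainder(q, [(0, 0, 5, 5), (5, 5, 10, 10)]))
```

In this rectangular special case the pieces returned by `remainder` already form one possible set of sub-queries; the split order (left and right strips first) is an arbitrary choice and is not claimed to minimize their number.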
Query Partitioning Overhead
[Figure: panels (a) and (b), each showing sensor nodes (SN) deployed around a base station (BS)]
Query processing: the query is forwarded, locally flooded, and the results are collected and shipped back.
Query processing cost is estimated through an analytical cost model.
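The analytical cost model itself is not given on the slide. The sketch below is a purely hypothetical stand-in, written only to illustrate why partitioning carries an overhead: every sub-query pays its own forwarding and return cost. The formula, the message counts and the parameters `base_station`, `density` and `hop_range` are all assumptions, not the authors' model.

```python
import math
from typing import Iterable, Tuple

Rect = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def subquery_cost(rect: Rect,
                  base_station: Tuple[float, float] = (0.0, 0.0),
                  density: float = 1.0,
                  hop_range: float = 1.0) -> float:
    """Hypothetical message count for one rectangular sub-query."""
    x1, y1, x2, y2 = rect
    cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
    # Assumed: hops from the base station to the centre of the region.
    hops = math.ceil(math.hypot(cx - base_station[0], cy - base_station[1]) / hop_range)
    # Assumed: expected number of sensors inside the region.
    nodes = density * (x2 - x1) * (y2 - y1)
    # Forward + ship back (2 * hops), local flood + collection (2 * nodes).
    return 2 * hops + 2 * nodes


def plan_cost(rects: Iterable[Rect], **params) -> float:
    """Cost of a plan is taken to be the sum over its sub-queries."""
    return sum(subquery_cost(r, **params) for r in rects)


if __name__ == "__main__":
    whole = [(2, 3, 4, 5)]
    halves = [(2, 3, 3, 5), (3, 3, 4, 5)]
    print(plan_cost(whole), plan_cost(halves))
```

Under these assumed numbers, covering the same area with two sub-queries costs more than covering it with one, because the forwarding path is paid twice.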
Overall Architecture
[Figure: the user sends the current query Q and receives the answer D(Q). The Base Station contains the Cache Manager, Cache Index, Query Optimizer and Query Processor; the Query Processor talks to the WSN. Flow labels: Q; P; Q, P’; P’’, Θ; Q, P’’, Θ; P’, D(P’); D(Θ)]
Relevant cached queries: non-stale subset of P and its dataset.
Optimizer output: subset of relevant queries and sub-queries (min: query cost).
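A minimal sketch of selecting the relevant cached queries, reading “relevant” as “overlaps Q” and “non-stale” as “collected no more than a validity time ago”. Both readings, the rectangular regions, and the names `CachedQuery`, `overlaps` and `relevant` are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple

Rect = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


@dataclass(frozen=True)
class CachedQuery:
    name: str
    region: Rect
    timestamp: int  # assumed: timestamp at which the cached results were collected


def overlaps(a: Rect, b: Rect) -> bool:
    """True if the two rectangles share interior area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def relevant(cache: List[CachedQuery], q: Rect, now: int, validity: int) -> List[CachedQuery]:
    """Cached queries that are still valid and intersect the current query."""
    return [p for p in cache
            if now - p.timestamp <= validity and overlaps(p.region, q)]


if __name__ == "__main__":
    cache = [
        CachedQuery("P1", (0, 0, 5, 5), 90),
        CachedQuery("P2", (4, 4, 8, 8), 50),      # overlaps Q but is too old
        CachedQuery("P3", (20, 20, 25, 25), 95),  # fresh but does not overlap Q
    ]
    print([p.name for p in relevant(cache, (3, 3, 10, 10), now=100, validity=20)])
```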
“Query Plan” Problem (QSP)
Fewer larger sub-queries vs. more smaller sub-queries.
For obtaining Q’ we used the General Polygon Clipper library.
For partitioning Q’ into the set of sub-queries Θ we used an O(v log v) algorithm which finds a sub-optimal solution (minimizing the number of sub-queries).
B+B (Heuristic) Solution to QSP
At each node, Q is “clipped” using a subset P’’ of P’, a set of sub-queries is generated and its cost is obtained. The search stops at a local minimum.
[Figure: search tree. Root: P’’ = P’. Level 1: P’’ = P’ \ {P’_1}, P’ \ {P’_2}, P’ \ {P’_3}, P’ \ {P’_4}. Level 2, under P’ \ {P’_2}: P’’ = P’ \ {P’_2, P’_1}, P’ \ {P’_2, P’_3}, P’ \ {P’_2, P’_4}]
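A simplified sketch of the descent suggested by the search tree: start with all relevant cached queries, evaluate every child obtained by dropping one cached query, move to the cheapest child while it improves the cost, and stop at a local minimum. This is one interpretation of the slide, not the authors' exact algorithm (in particular, no bounding rule is shown). The `cost` callback stands for “clip Q, generate the sub-queries, evaluate the cost model”; `toy_cost` and its numbers are made up for the example.

```python
from typing import Callable, FrozenSet, Iterable, Tuple


def descend(cached: Iterable[str],
            cost: Callable[[FrozenSet[str]], float]) -> Tuple[FrozenSet[str], float]:
    """Greedy descent over subsets of cached queries; stops at a local minimum."""
    current = frozenset(cached)
    current_cost = cost(current)
    while current:
        # Children of the current node: drop exactly one cached query.
        children = [current - {p} for p in current]
        best = min(children, key=cost)
        best_cost = cost(best)
        if best_cost >= current_cost:
            break  # no child improves on the current node
        current, current_cost = best, best_cost
    return current, current_cost


# Hypothetical cost: a fixed cost of 100 for answering Q without any cache,
# reduced (or increased) by a made-up per-query benefit.
BENEFIT = {"P1": 30, "P2": -5, "P3": 10, "P4": -8}


def toy_cost(subset: FrozenSet[str]) -> float:
    return 100 - sum(BENEFIT[p] for p in subset)


if __name__ == "__main__":
    plan, plan_cost = descend(BENEFIT, toy_cost)
    print(sorted(plan), plan_cost)
```

With these made-up benefits the descent drops P4, then P2, and stops because dropping either remaining query would raise the cost.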
Other Heuristic Solutions to QSP
In addition to the B+B we also used two more aggressive greedy heuristics: GrF (GrE) starts with all (no) cached queries, removing (inserting) the smallest (largest) cached query as long as there is some gain.
[Figure: the same search tree as for B+B, with the GrF “path” marked]
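The two greedy heuristics can be sketched as follows. The function names `gr_f` and `gr_e`, the `sizes` mapping (cached query to area) and the toy `cost` function are assumptions made for illustration; in the actual system the cost would come from clipping Q and evaluating the cost model.

```python
from typing import Callable, Dict, FrozenSet, Tuple

Cost = Callable[[FrozenSet[str]], float]


def gr_f(sizes: Dict[str, float], cost: Cost) -> Tuple[FrozenSet[str], float]:
    """GrF: start with the full cache, drop the smallest query while it pays off."""
    chosen = set(sizes)
    best = cost(frozenset(chosen))
    for p in sorted(sizes, key=sizes.get):  # smallest first
        trial = cost(frozenset(chosen - {p}))
        if trial >= best:
            break  # no gain: stop
        chosen.discard(p)
        best = trial
    return frozenset(chosen), best


def gr_e(sizes: Dict[str, float], cost: Cost) -> Tuple[FrozenSet[str], float]:
    """GrE: start with an empty cache, add the largest query while it pays off."""
    chosen = set()
    best = cost(frozenset())
    for p in sorted(sizes, key=sizes.get, reverse=True):  # largest first
        trial = cost(frozenset(chosen | {p}))
        if trial >= best:
            break  # no gain: stop
        chosen.add(p)
        best = trial
    return frozenset(chosen), best


# Made-up example data: areas of the cached queries and a toy cost function.
SIZES = {"P1": 9.0, "P2": 1.0, "P3": 4.0, "P4": 2.0}
BENEFIT = {"P1": 30, "P2": -5, "P3": 10, "P4": -8}


def cost(subset: FrozenSet[str]) -> float:
    return 100 - sum(BENEFIT[p] for p in subset)


if __name__ == "__main__":
    print(gr_f(SIZES, cost))
    print(gr_e(SIZES, cost))
```

Because each heuristic only ever accepts a step that lowers the cost, GrF can never end up worse than using the full cache and GrE can never end up worse than using no cache.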
Cache Maintenance
[Figure: Cache Manager internals, with a Cache Reader and a Cache Updater placed between the Query Processor and the Cache Index. Flow labels: Q; P; P’, D(P’); P’, P’’, Θ; P \ P’; Q, P \ P’, P’ \ P’’, Θ]
Cache Maintenance
[Figure: panels (a), (b) and (c) showing the cached queries P_1, P_2 and P_3 together with Q and Q’. P_1 is split into P_1,1 (used) and P_1,2 (dropped); one region is marked as data that can be used to refresh P_1’s data]
Losses wrt Optimal Solution
[Figure: histogram of frequency [%] per range of energy loss wrt OPT [%], for B+B, GrF and GrE]
B+B is the Branch-and-Bound heuristic. GrF (GrE) is an aggressive greedy heuristic, starting with all (no) cache and removing (inserting) the smallest (largest) cached queries available as long as there is some gain.
Gains wrt NOT Using Cache
[Figure: histogram of frequency [%] per range of energy savings wrt no cache [%], for B+B, GrF and GrE]
By design GrE cannot be any worse than not using any cache.
Gains wrt Using ALL Cache
[Figure: histogram of frequency [%] per range of energy savings wrt FC [%], for B+B, GrF and GrE]
By design GrF cannot be any worse than using all of the cache.
Detailed results or skip to main conclusions?
Detailed results
We investigate the performance of the proposed approach wrt efficiency (for finding the query plan) and effectiveness (cost of solution) when varying:
Number of sensors
Size of cache (number of cached queries)
Query size (wrt total area)
Validity time (of cached results)
Varying # of Sensors
[Figure: two plots against the number of sensors (x 1,000). Left: energy cost loss wrt OPT [%] for B+B, FC, GrF and GrE. Right: number of states explored (log scale) for GrE, GrF, B+B and OPT]
Varying Cache Size
[Figure: two plots against cache size [# queries]. Left: energy cost loss wrt OPT [%] for B+B, FC, GrF and GrE. Right: number of states explored (log scale) for GrE, GrF, B+B and OPT]
Varying Query Size
[Figure: two plots against query size [% of total area]. Left: energy cost loss wrt OPT [%] for B+B, FC, GrF and GrE. Right: number of states explored (log scale) for GrE, GrF, B+B and OPT]
Varying Query Validity Time
[Figure: two plots against validity time [number of timestamps]. Left: energy cost loss wrt OPT [%] for B+B, FC, GrF and GrE. Right: number of states explored (log scale) for GrE, GrF, B+B and OPT]
Conclusions
Cached query selection, query clipping and sub-query generation together amount to a fairly complex combinatorial problem.
Although a query cost model is needed, our proposal is orthogonal to it.
If nothing can be done your best shot is to use all of the cache, but …
Conclusions
The Branch-and-Bound heuristic:
Finds a “query plan” orders of magnitude faster than the exhaustive search
Is typically less than 2% more expensive than the optimal query cost
Is robust with respect to a number of different parameters
Next stop: Aggregation queries …
Thanks