Streaming Queries over Streaming Data
  Sirish Chandrasekaran (with Michael J. Franklin), UC Berkeley
  VLDB 2002, August 20, 2002
Motivation (1): Queries over Data Streams
  Queries over the past:
    Select all the Hondas that passed I-80 at Ashby Ave. between 2:15 and 2:45 pm today
  Queries over the future (Continuous Query):
    Landmark Window Query: Continuously select all the Hondas that pass I-80 at Ashby Ave., starting now
    Sliding Window Query: Continuously select all the Hondas that pass I-80 at Ashby Ave. in the latest half hour, starting now
  Hybrid Query:
    Continuously select all the Hondas that have passed and will pass I-80 at Ashby Ave. since 2:30 pm today
Conventional Databases
  Data is stored/indexed in the system
  Queries are applied to stored data as they “stream through”
  Only support queries over the past
  [Figure labels: Query, Index, Data, Result]
CQ Engines
  Queries are stored/indexed in the system
  Data is applied to stored queries as they “stream through”
  Only support landmark window queries over the future
  [Figure labels: Data, Index, Queries, Result]
PSoup Insight #1
  Queries and data are duals
    Store new queries, apply to data that arrived earlier
    Store new data, apply to queries that arrived earlier
  Multiquery processing = “join” of query and data
  Supports all three types of queries: queries over the past, continuous (landmark and sliding window), and hybrid
  [Figure labels: Result, Index, Index, Query, Data, Queries]
Motivation (2): Disconnected Operation
  Previous solutions stream out answers immediately
  Not feasible/suitable for all applications
    Intermittent connectivity: e.g., applications on hand-held devices (as in this morning’s keynote address)
    Even if connected: not always interested in streaming answers
PSoup Insight #2
  Separate computation from delivery
    Query answers continuously generated in background
    Apply windows on-demand to transmit “current” results
  Efficient support for disconnected operation
  Low response time; shared computation and storage across invocations
  [Figure labels: Query (ID, Predicate), Data (ID, R.a, R.b), Register, Invoke, Results Structure of T/F entries over Data and Queries]
Outline of Talk
  PSoup Overview
  Query Model
  Query Registration
  Background Processing
    Selection Queries, Join Queries
  Query Invocation
  PSoup in Telegraph
  Experiments and Results
  Conclusions and Future Work
PSoup Query Model
  SELECT select_list
  FROM from_list
  WHERE where_clause
  BEGIN begin_time
  END end_time

  WHERE clause: conjunction of boolean factors
  BEGIN-END clause: system clock or sequence numbers
  (begin_time, end_time):
    (constant, constant): snapshot query
    (constant, variable): landmark window query
    (variable, variable): sliding window query
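The mapping from the (begin_time, end_time) pair to a query type can be sketched in a few lines. This is only an illustration, not PSoup's implementation: the `Const` and `Var` classes, the encoding of a variable bound as an offset back from "now", and the function names are all assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Const:
    """A fixed bound: a timestamp or sequence number (hypothetical type)."""
    value: int


@dataclass(frozen=True)
class Var:
    """A moving bound, assumed here to mean NOW - offset (hypothetical type)."""
    offset: int


Bound = Union[Const, Var]


def window_kind(begin: Bound, end: Bound) -> str:
    """Classify a BEGIN-END clause using the three cases listed on the slide."""
    if isinstance(begin, Const) and isinstance(end, Const):
        return "snapshot"
    if isinstance(begin, Const) and isinstance(end, Var):
        return "landmark"
    if isinstance(begin, Var) and isinstance(end, Var):
        return "sliding"
    # (variable, constant) is not one of the cases the slide defines.
    raise ValueError("unsupported (begin_time, end_time) combination")


def resolve(bound: Bound, now: int) -> int:
    """Turn a bound into a concrete value at invocation time."""
    return bound.value if isinstance(bound, Const) else now - bound.offset


kinds = [
    window_kind(Const(215), Const(245)),  # fixed interval in the past
    window_kind(Const(230), Var(0)),      # fixed start, end tracks "now"
    window_kind(Var(30), Var(0)),         # both ends move with "now"
]
```

Resolving the bounds only at invocation time is what lets the same stored query describe a window that keeps moving.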
Query Registration
  SELECT select_list
  FROM from_list
  WHERE where_clause
    } Standing Query Clause (SQC): to the Symmetric Join
  BEGIN begin_time
  END end_time
    } to the Windows_Table
  QueryID: handle for future query invocations
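A minimal sketch of this registration step is shown here, assuming plain Python containers stand in for the two destinations. The names `sqc_queue`, `windows_table`, and `register`, the string encoding of the window bounds, and the starting ID of 20 are all hypothetical.

```python
import itertools

# Hypothetical stand-ins for the two destinations named on the slide.
sqc_queue = []       # Standing Query Clauses handed to the symmetric join
windows_table = {}   # QueryID -> (begin_time, end_time)

_next_id = itertools.count(20)  # arbitrary starting ID, chosen for illustration


def register(select_list, from_list, where_clause, begin_time, end_time):
    """Split a query into its SQC and its window, and return a QueryID."""
    query_id = next(_next_id)
    # SELECT-FROM-WHERE goes to the symmetric join.
    sqc_queue.append((query_id, select_list, from_list, where_clause))
    # BEGIN-END goes to the windows table, keyed by the same QueryID.
    windows_table[query_id] = (begin_time, end_time)
    return query_id


qid = register("*", "R", "R.a<=4 and R.b>=3", "NOW - 30", "NOW")
```

The returned `qid` is the only thing a client needs to keep in order to invoke the query later.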
Selections over Single Stream: Arrival of New Query Specification
  (a) Initial State

  Query Store
    ID  Predicate
    20  0<R.a<=5
    21  R.a>4 and R.b=3
    22  0>R.b>4
    23  R.a=4 and R.b=3

  Data Store
    ID  R.a  R.b
    48  4    3
    49  7    3
    50  3    8
    51  0    0
    52  8    4
Selections over Single Stream: Arrival of New Query Specification
  (b) Arrival of new Query
  New query: Select * From R Where R.a<=4 and R.b>=3

  Query Store
    ID  Predicate
    20  0<R.a<=5
    21  R.a>4 and R.b=3
    22  0>R.b>4
    23  R.a=4 and R.b=3

  Data Store
    ID  R.a  R.b
    48  4    3
    49  7    3
    50  3    8
    51  0    0
    52  8    4
Selections over Single Stream: Arrival of New Query Specification
  (c) Building Query Store
  BUILD: the new query is added to the Query Store as ID 24

  Query Store
    ID  Predicate
    20  0<R.a<=5
    21  R.a>4 and R.b=3
    22  0>R.b>4
    23  R.a=4 and R.b=3
    24  R.a<=4 and R.b>=3

  Data Store
    ID  R.a  R.b
    48  4    3
    49  7    3
    50  3    8
    51  0    0
    52  8    4
Selections over Single Stream: Arrival of New Query Specification
  (d) Probing Data Store
  PROBE: query 24 is applied to the Data Store

  Query Store
    ID  Predicate
    20  0<R.a<=5
    21  R.a>4 and R.b=3
    22  0>R.b>4
    23  R.a=4 and R.b=3
    24  R.a<=4 and R.b>=3

  Data Store
    ID  R.a  R.b
    48  4    3    match
    49  7    3
    50  3    8    match
    51  0    0
    52  8    4
Selections over Single Stream: Arrival of New Query Specification
  (e) Inserting Results

  Matching tuples from the Data Store
    ID  R.a  R.b
    48  4    3
    50  3    8

  Results Structure (rows: Data, columns: Queries 20 21 22 23 24), new column for query 24
    Data  24
    48    T
    49    F
    50    T
    51    F
    52    F
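Steps (a) through (e) can be sketched as follows, using the tuples and the new query from the slides. This is a simplified illustration: it probes the Data Store with a linear scan rather than an index, and the dictionary layout and the name `register_query` are assumptions made for the example.

```python
# Data Store contents from the slides: ID -> (R.a, R.b)
data_store = {48: (4, 3), 49: (7, 3), 50: (3, 8), 51: (0, 0), 52: (8, 4)}

query_store = {}   # QueryID -> predicate over (R.a, R.b)
results = {}       # (data ID, QueryID) -> True/False


def register_query(query_id, predicate):
    """BUILD the query into the Query Store, then PROBE the Data Store."""
    query_store[query_id] = predicate            # (c) build
    matches = []
    for data_id, (a, b) in data_store.items():   # (d) probe, linear scan
        ok = predicate(a, b)
        results[(data_id, query_id)] = ok        # (e) insert into results
        if ok:
            matches.append(data_id)
    return matches


# New query 24: Select * From R Where R.a<=4 and R.b>=3
matches = register_query(24, lambda a, b: a <= 4 and b >= 3)
print(matches)  # prints [48, 50]
```

The new query is evaluated only against data that had already arrived; later tuples reach it through the symmetric path, when new data probes the Query Store.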
Selections over Single Stream: Arrival of New Data
  (a) Initial State

  Query Store
    ID  Predicate
    20  0<R.a<=5
    21  R.a>4 and R.b=3
    22  0>R.b>4
    23  R.a=4 and R.b=3
    24  R.a<=4 and R.b>=3

  Data Store
    ID  R.a  R.b
    48  4    3
    49  7    3
    50  3    8
    51  0    0
    52  8    4
Selections over Single Stream: Arrival of New Data
  (b) Arrival of new Data
  New data: ID 53, R.a = 3, R.b = 6

  Query Store
    ID  Predicate
    20  0<R.a<=5
    21  R.a>4 and R.b=3
    22  0>R.b>4
    23  R.a=4 and R.b=3
    24  R.a<=4 and R.b>=3

  Data Store
    ID  R.a  R.b
    48  4    3
    49  7    3
    50  3    8
    51  0    0
    52  8    4
Selections over Single Stream: Arrival of New Data
  (c) Building Data Store
  BUILD: the new tuple is added to the Data Store as ID 53

  Query Store
    ID  Predicate
    20  0<R.a<=5
    21  R.a>4 and R.b=3
    22  0>R.b>4
    23  R.a=4 and R.b=3
    24  R.a<=4 and R.b>=3

  Data Store
    ID  R.a  R.b
    48  4    3
    49  7    3
    50  3    8
    51  0    0
    52  8    4
    53  3    6
Selections over Single Stream: Arrival of New Data
  (d) Probing Query Store
  PROBE: tuple 53 is applied to the Query Store

  Query Store
    ID  Predicate
    20  0<R.a<=5            match
    21  R.a>4 and R.b=3
    22  0>R.b>4
    23  R.a=4 and R.b=3
    24  R.a<=4 and R.b>=3   match

  Data Store
    ID  R.a  R.b
    48  4    3
    49  7    3
    50  3    8
    51  0    0
    52  8    4
    53  3    6
Selections over Single Stream: Arrival of New Data
  (e) Inserting Results

  Matching queries from the Query Store
    ID  Predicate
    20  0<R.a<=5
    24  R.a<=4 and R.b>=3

  Results Structure (rows: Data 48 to 53, columns: Queries), new row for tuple 53
    Data  20  21  22  23  24
    53    T   F   F   F   T
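The data-arrival path mirrors the query-arrival path, with the roles of the two stores swapped. The sketch below uses the predicates and tuples from the slides. The predicate for query 22 is transcribed exactly as it appears there, and the linear scan, the dictionary layout, and the name `insert_tuple` are assumptions made for illustration.

```python
# Query Store from the slides: QueryID -> predicate over (R.a, R.b)
query_store = {
    20: lambda a, b: 0 < a <= 5,
    21: lambda a, b: a > 4 and b == 3,
    22: lambda a, b: 0 > b > 4,   # as written on the slide; never true
    23: lambda a, b: a == 4 and b == 3,
    24: lambda a, b: a <= 4 and b >= 3,
}

# Data Store from the slides: ID -> (R.a, R.b)
data_store = {48: (4, 3), 49: (7, 3), 50: (3, 8), 51: (0, 0), 52: (8, 4)}

results = {}  # data ID -> {QueryID: True/False}


def insert_tuple(data_id, a, b):
    """BUILD the tuple into the Data Store, then PROBE the Query Store."""
    data_store[data_id] = (a, b)                                       # (c) build
    row = {qid: pred(a, b) for qid, pred in query_store.items()}       # (d) probe
    results[data_id] = row                                             # (e) insert
    return row


row = insert_tuple(53, 3, 6)
print([qid for qid, ok in row.items() if ok])  # prints [20, 24]
```

The resulting row for tuple 53 is T F F F T across queries 20 to 24, which agrees with the row shown in step (e).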
Query Invocation
  System returns the results corresponding to the current value of the BEGIN-END clause

  Results Structure (rows: Data, columns: Queries; only the entries shown on the slide)
    Data  20  21  22  23  24
    48                    T
    49                    F
    50                    T
    51                    F
    52                    F
    53    T   F   F   F   T

  Current Window: the rows between BEGIN begin_time and END end_time
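Invocation can be pictured as a filter over results that were already computed in the background. The sketch below uses the column for query 24 from the Results Structure and assumes that data IDs double as sequence numbers for the BEGIN and END bounds; the function name `invoke` and the dictionary layout are hypothetical.

```python
# Column of the Results Structure for query 24: data ID -> T/F
results_q24 = {48: True, 49: False, 50: True, 51: False, 52: False, 53: True}


def invoke(result_column, begin, end):
    """Return IDs of matching tuples that fall inside the current window.

    No predicate is evaluated here; the window is applied to stored results.
    """
    return [
        data_id
        for data_id, ok in sorted(result_column.items())
        if begin <= data_id <= end and ok
    ]


landmark = invoke(results_q24, 48, 53)   # fixed start at 48, end at latest tuple
sliding = invoke(results_q24, 51, 53)    # window over the three latest tuples
print(landmark)  # prints [48, 50, 53]
print(sliding)   # prints [53]
```

Because both calls read the same stored column, repeated invocations with different windows share the computation done when the data and the query first met.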
Joins over R and S: Arrival of New Query Specification
  (a) Initial State

  Query Store
    ID  Predicate
    20  R.a=5 and R.b<S.b
    21  R.a>4 and R.b<S.b and S.a<10
    22  R.b=4 and R.a+5>S.a and S.b>2

  R-Data Store
    ID  R.a  R.b
    10  2    5
    14  3    3
    31  4    1
    48  9    7

  S-Data Store
    ID  S.a  S.b
    21  2    2
    25  3    3
    36  4    4
    49  5    5
Joins over R and S: Arrival of New Query Specification
  (b) Arrival of new Query
  New query: 23  R.a<5 and R.a>S.a and S.b>1

  Query Store
    ID  Predicate
    20  R.a=5 and R.b<S.b
    21  R.a>4 and R.b<S.b and S.a<10
    22  R.b=4 and R.a+5>S.a and S.b>2

  R-Data Store
    ID  R.a  R.b
    10  2    5
    14  3    3
    31  4    1
    48  9    7

  S-Data Store
    ID  S.a  S.b
    21  2    2
    25  3    3
    36  4    4
    49  5    5
Joins over R and S: Arrival of New Query Specification
  (c) Building Query Store
  BUILD: query 23 is added to the Query Store

  Query Store
    ID  Predicate
    20  R.a=5 and R.b<S.b
    21  R.a>4 and R.b<S.b and S.a<10
    22  R.b=4 and R.a+5>S.a and S.b>2
    23  R.a<5 and R.a>S.a and S.b>1

  R-Data Store
    ID  R.a  R.b
    10  2    5
    14  3    3
    31  4    1
    48  9    7

  S-Data Store
    ID  S.a  S.b
    21  2    2
    25  3    3
    36  4    4
    49  5    5
Joins over R and S: Arrival of New Query Specification
  (d) Probing R-Data Store
  PROBE: query 23 is applied to the R-Data Store

  Query Store
    ID  Predicate
    20  R.a=5 and R.b<S.b
    21  R.a>4 and R.b<S.b and S.a<10
    22  R.b=4 and R.a+5>S.a and S.b>2
    23  R.a<5 and R.a>S.a and S.b>1

  R-Data Store
    ID  R.a  R.b
    10  2    5    }
    14  3    3    } Matches
    31  4    1    }
    48  9    7

  S-Data Store
    ID  S.a  S.b
    21  2    2
    25  3    3
    36  4    4
    49  5    5
Joins over R and S: Arrival of New Query Specification
  (e) Constructing Hybrid Structs

  Query Store
    ID  Predicate
    20  R.a=5 and R.b<S.b
    21  R.a>4 and R.b<S.b and S.a<10
    22  R.b=4 and R.a+5>S.a and S.b>2
    23  R.a<5 and R.a>S.a and S.b>1

  R-Data Store
    ID  R.a  R.b
    10  2    5    }
    14  3    3    } Matches
    31  4    1    }
    48  9    7

  Hybrid Structs
    R.ID  Q.ID  Q.Predicate
    10    23    2>S.a and S.b>1
    14    23    3>S.a and S.b>1
    31    23    4>S.a and S.b>1

  S-Data Store
    ID  S.a  S.b
    21  2    2
    25  3    3
    36  4    4
    49  5    5
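The hybrid structs in step (e) can be read as partial evaluations of the new query: the R-only factor is checked against each R tuple, and for every match the R value is substituted into the remaining factors, leaving a predicate over S alone. The sketch below hard-codes query 23 for clarity; the `Hybrid` class, its field names, and `build_hybrids` are hypothetical, and the real system is not claimed to represent hybrids this way.

```python
from dataclasses import dataclass
from typing import Callable

# R-Data Store from the slides: ID -> (R.a, R.b)
r_store = {10: (2, 5), 14: (3, 3), 31: (4, 1), 48: (9, 7)}


@dataclass
class Hybrid:
    r_id: int                              # matching R tuple
    q_id: int                              # query the hybrid came from
    text: str                              # residual predicate, for display
    residual: Callable[[int, int], bool]   # residual predicate over (S.a, S.b)


def build_hybrids(q_id, store):
    """Partially evaluate query 23: R.a<5 and R.a>S.a and S.b>1."""
    hybrids = []
    for r_id, (ra, _rb) in store.items():
        if ra < 5:  # R-only factor, checked while probing the R-Data Store
            hybrids.append(Hybrid(
                r_id,
                q_id,
                f"{ra}>S.a and S.b>1",
                # ra=ra binds the current value into the residual predicate
                lambda sa, sb, ra=ra: ra > sa and sb > 1,
            ))
    return hybrids


hybrids = build_hybrids(23, r_store)
for h in hybrids:
    print(h.r_id, h.q_id, h.text)
```

Each residual can then be applied to S tuples without looking at R again; for example, the hybrid for R tuple 14 accepts the S tuple (2, 2), while the hybrid for R tuple 10 rejects it.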