
West Coast Systems - Peter Dempsey



  1. West Coast Systems
     Peter Dempsey, Jonathan Abbett, Archana Joshi, Kevin Hoeschele

  2. Schedule
     • 1:40-2:25 CACQ (Peter)
     • 2:25-3:10 PSoup (Jonathan)
     • 3:10-3:20 Break
     • 3:20-4:05 STREAM (Kevin and Archana)
     • 4:05-4:30 Discussion

  3. Continuously Adaptive Continuous Queries over Streams (CACQ)
     Peter Dempsey

  4. References
     • S. Madden, M. Shah, J. Hellerstein, and V. Raman. Continuously adaptive continuous queries over streams. In Proc. ACM SIGMOD Intl. Conf. on Management of Data, pages 49-60, Madison, Wisconsin, June 2002.
     • J. Hellerstein. Query Processing and Network Infrastructures. Talk at M.I.T., November 2002.
     • R. Avnur and J. Hellerstein. Eddies: Continuously adaptive query processing. In ACM SIGMOD, Dallas, TX, May 2000.

  5. CACQ Introduction
     • A Data Stream Management System.
     • Leverages earlier work at Berkeley (Telegraph).
     • Adaptive approach to query processing.
     • Implements cross-query sharing of work and space.

  6. CACQ Overview
     • Eddies
     • Lineage
     • Predicate Index
     • SteMs

  7. Eddies
     • Developed for Telegraph.
     • An improvement on static query plans.
     • Provide continuous adaptivity.
     • Route tuples through operators.

  8. Static Query Plan

  9. Eddy

  10. Eddy

  11. Eddy

  12. Eddy

  13. Eddy

  14. Eddy

  15. Eddy
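
The routing idea from the Eddies slides can be sketched in a few lines. This is only an illustration under stated assumptions: the `Filter` class, the `eddy` function, and the greedy "lowest observed pass rate first" policy are hypothetical simplifications chosen here, not the routing policy of the cited papers. The point it shows is that the operator order is re-decided for every tuple from running statistics, instead of being fixed once as in a static plan.

```python
class Filter:
    """A selection operator that tracks how often it passes tuples (hypothetical helper)."""

    def __init__(self, name, predicate):
        self.name = name
        self.predicate = predicate
        self.seen = 0
        self.passed = 0

    def pass_rate(self):
        # Smoothed estimate so an operator with no history starts at 0.5.
        return (self.passed + 1) / (self.seen + 2)

    def apply(self, tup):
        self.seen += 1
        ok = bool(self.predicate(tup))
        self.passed += ok
        return ok


def eddy(tuples, operators):
    """Route each tuple through all operators, choosing the order per tuple."""
    output = []
    for tup in tuples:
        pending = list(operators)
        alive = True
        while pending and alive:
            # Assumed policy: visit the operator that currently drops the most tuples.
            op = min(pending, key=Filter.pass_rate)
            pending.remove(op)
            alive = op.apply(tup)
        if alive:
            output.append(tup)
    return output


tuples = [{"a": i, "b": i % 2} for i in range(10)]
ops = [
    Filter("a > 4", lambda t: t["a"] > 4),
    Filter("b == 1", lambda t: t["b"] == 1),
]
result = eddy(tuples, ops)
```

Whatever order the router picks, the set of output tuples is the same; only the amount of work per tuple changes as the observed pass rates drift.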

  16. CACQ Overview
     • Eddies
     • Lineage
     • Predicate Index
     • SteMs

  17. Lineage
     • Each tuple's path through the eddy is stored in the tuple.
     • Each operator can handle tuples with different lineages.

  18. When do we output a tuple?
     • Each query has a completionMask, which is simply a bitmask.
     • Each query checks its completionMask ANDed with a tuple's done bits.
     • If the result of that operation equals the completionMask, then the tuple is output to that query.
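
The completionMask test described above is a single bitwise comparison. The sketch below is a minimal illustration, assuming one done bit per operator and assuming a tuple only reaches the check after passing every operator it visited. The names `StreamTuple`, `Query`, `mark_done`, and `ready_queries` are hypothetical and not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class StreamTuple:
    values: dict
    done: int = 0  # bit i set means operator i has handled this tuple


@dataclass
class Query:
    name: str
    completion_mask: int  # bits of the operators this query requires


def mark_done(tup, operator_bit):
    """Record in the tuple's lineage that the given operator has been visited."""
    tup.done |= 1 << operator_bit


def ready_queries(tup, queries):
    """Return the names of queries whose completionMask is covered by the done bits."""
    return [
        q.name
        for q in queries
        if (tup.done & q.completion_mask) == q.completion_mask
    ]


queries = [Query("q1", 0b011), Query("q2", 0b110)]
tup = StreamTuple({"a": 7})
mark_done(tup, 0)
mark_done(tup, 1)
after_two = ready_queries(tup, queries)
mark_done(tup, 2)
after_three = ready_queries(tup, queries)
```

Here `q1` needs operators 0 and 1, and `q2` needs operators 1 and 2, so the tuple becomes deliverable to `q1` first and to `q2` only once operator 2 has been visited.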

  19. CACQ Overview
     • Eddies
     • Lineage
     • Predicate Index
     • SteMs

  20. Predicate Index: Grouped Filter
     • An index is maintained for each attribute that appears in a query.
     • Used to improve efficiency in overlapping range queries.
     • Allows the system to apply multiple selections at once.
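
To make "apply multiple selections at once" concrete, here is a minimal sketch of a grouped filter for a single attribute. It assumes every registered predicate has the form `attr > constant` and keeps the constants in a sorted list. The class name `GreaterThanGroupedFilter` and its methods are hypothetical, and the index described in the paper also covers other comparison operators that are left out here.

```python
import bisect


class GreaterThanGroupedFilter:
    """Indexes predicates of the form `attr > constant` from many queries."""

    def __init__(self):
        self._constants = []  # kept sorted in ascending order
        self._query_ids = []  # parallel to _constants

    def add(self, query_id, constant):
        """Register the predicate `attr > constant` on behalf of query_id."""
        i = bisect.bisect_right(self._constants, constant)
        self._constants.insert(i, constant)
        self._query_ids.insert(i, query_id)

    def matches(self, value):
        """Return every query whose predicate the value satisfies, in one lookup."""
        # All constants strictly below `value` sit to the left of index i.
        i = bisect.bisect_left(self._constants, value)
        return set(self._query_ids[:i])


index = GreaterThanGroupedFilter()
index.add("q1", 10)
index.add("q2", 20)
index.add("q3", 30)
hits = index.matches(25)
```

One binary search replaces three separate predicate evaluations, which is where the saving for overlapping range queries comes from.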

  21. Predicate Indexes Illustrated

  22. CACQ Overview
     • Eddies
     • Lineage
     • Predicate Index
     • SteMs

  23. SteMs: State Modules
     • Used to help compute joins.
     • An index that is built on the fly.
     • Joins are no longer a binary operation.
     • Enforces a window on the join.
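
A SteM can be pictured as a per-source hash index that supports two operations, build and probe. The sketch below is an assumption-laden illustration: it uses a count-based window (the most recent `window` tuples), and the names `SteM`, `build`, `probe`, and `symmetric_join` are chosen here for readability rather than taken from the paper.

```python
from collections import defaultdict, deque


class SteM:
    """An index over one source, built on the fly and bounded by a window."""

    def __init__(self, key, window):
        self.key = key
        self.window = window  # assumed: maximum number of tuples retained
        self._order = deque()  # arrival order, used for eviction
        self._index = defaultdict(list)

    def build(self, tup):
        """Insert a tuple and evict the oldest one if the window overflows."""
        self._order.append(tup)
        self._index[tup[self.key]].append(tup)
        if len(self._order) > self.window:
            old = self._order.popleft()
            bucket = self._index[old[self.key]]
            bucket.remove(old)
            if not bucket:
                del self._index[old[self.key]]

    def probe(self, value):
        """Return the stored tuples whose join key equals value."""
        return list(self._index.get(value, []))


def symmetric_join(events, key, window):
    """Join streams R and S: each arrival builds into its own SteM and probes the other."""
    stems = {"R": SteM(key, window), "S": SteM(key, window)}
    out = []
    for source, tup in events:
        other = "S" if source == "R" else "R"
        stems[source].build(tup)
        for match in stems[other].probe(tup[key]):
            out.append((tup, match) if source == "R" else (match, tup))
    return out


events = [
    ("R", {"k": 1, "r": "a"}),
    ("R", {"k": 2, "r": "b"}),
    ("S", {"k": 1, "s": "x"}),
]
wide = symmetric_join(events, "k", window=2)
narrow = symmetric_join(events, "k", window=1)
```

With a window of 2 the R tuple with `k = 1` is still stored when the S tuple arrives, so they join. With a window of 1 it has already been evicted, which is how the window limits the join.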

  24. SteMs: Illustrated

  25. A few extra details with joins
     • With k input sources there can be 2^k possible kinds of intermediate tuples (one per subset of sources).
     • A Virtual Source is used to encode a subset of sources.
     • The sourceId bitmap is used to denote the virtual source.
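
The sourceId bitmap can be illustrated with one bit per base source, so that joining two intermediate tuples ORs their bitmaps. This is a sketch under that assumption. The source names `R`, `S`, `T` and the helper functions are hypothetical.

```python
SOURCES = ["R", "S", "T"]  # assumed example sources, one bit each
BIT = {name: 1 << i for i, name in enumerate(SOURCES)}


def source_id(*names):
    """Encode a subset of sources (a virtual source) as a bitmap."""
    mask = 0
    for name in names:
        mask |= BIT[name]
    return mask


def join_source_ids(left, right):
    """The joined tuple's virtual source is the union of both inputs' sources."""
    if left & right:
        raise ValueError("inputs share a source")
    return left | right


def sources_in(mask):
    """Decode a bitmap back into the list of source names."""
    return [name for name in SOURCES if mask & BIT[name]]


rs = join_source_ids(source_id("R"), source_id("S"))
rst = join_source_ids(rs, source_id("T"))
```

With k = 3 sources the bitmap takes 2^3 = 8 distinct values, matching the count of possible subsets stated above.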

  26. Putting it all together: Examples
     • Single query without joins.
     • Multiple queries without joins.
     • Multiple queries with joins.

  27. Putting it all together: Examples
     • Single query without joins: Eddy.
     • Multiple queries without joins.
     • Multiple queries with joins.

  28. Putting it all together: Examples
     • Single query without joins: Eddy.
     • Multiple queries without joins: Eddy with Predicate Index.
     • Multiple queries with joins.

  29. Putting it all together: Examples
     • Single query without joins: Eddy.
     • Multiple queries without joins: Eddy with Predicate Index.
     • Multiple queries with joins: Eddy with SteMs.

  30. Putting it all together: Examples
     • Single query without joins: Eddy.
     • Multiple queries without joins: Eddy with Predicate Index.
     • Multiple queries with joins: Eddy with SteMs.

  31. Conclusions
     • The strict rules of query paths for continuous queries are broken.
     • Operators are shared much more aggressively than in other systems.
     • Multiple selections are computed at once.
     • CACQ improves on performance and space by utilizing shared resources.
