
View Invalidation for Dynamic Content Caching in Multitiered Architectures



  1. View Invalidation for Dynamic Content Caching in Multitiered Architectures. K. Selçuk Candan, Divyakant Agrawal, Wen-Syan Li, Oliver Po, Wang-Pin Hsiung. NEC USA, C&C Research Labs, CA, USA.

  2. Multi-tiered architectures. Clients do not access the database directly. Instead, they use applications which invoke DBMSs, or they access result caches: proxy cache (A), front-end cache (B), edge cache (C), user-side cache (D), middle-tier caches (E). Presented by K. Selçuk Candan, 12/3/2002.

  3. Problem. [Figure: Users]

  4. Result caches and consistency. Various view materialization and update management techniques have been proposed to deal with updates to the underlying data. These techniques guarantee that cached results are always consistent with the underlying data.

  5. Strong consistency requirements. [Figure: Data Warehouse, Data, Data]

  6. Strong consistency requirements. [Figure: Data Warehouse, Data, Data]

  7. Strong consistency requirements. [Figure: Queries, Data Warehouse, Data, Data]

  8. Result caches and consistency. Various view materialization and update management techniques have been proposed to deal with updates to the underlying data. These techniques guarantee that cached results are always consistent with the underlying data. Other applications do not require that caches reflect the database exactly at all times.

  9. Relaxed consistency requirements. [Figure: Queries, Data Warehouse, Middle-tier Cache, Misses, Data]

  10. Invalidation vs. view maintenance. Result caches need all outdated results to be invalidated in a timely fashion.

  11. Example. Page: http://www.autobuy.com/modelinfo?car=Toyota. Query: select maker, model, price from Car where maker = "Toyota"; The page is cached.

  12. Example (cont.). If a new tuple (Toyota; Avalon; 25000) is inserted into Car, then we can either recompute the new results of this query (preferably incrementally) and rerun the application to regenerate the page, or purge the corresponding page from the cache. After a purge, the request can still be served from the database!
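The purge option in the example above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `page_cache` dictionary, the `invalidate_on_insert` helper, and the second (Honda) page are hypothetical and do not come from the slides; only the Toyota URL and the inserted tuple do.

```python
# Hypothetical in-memory page cache keyed by URL. Each entry remembers the
# maker that its query selects on (an assumption made for this sketch).
page_cache = {
    "http://www.autobuy.com/modelinfo?car=Toyota": {"maker": "Toyota", "html": "<cached page>"},
    "http://www.autobuy.com/modelinfo?car=Honda": {"maker": "Honda", "html": "<cached page>"},
}


def invalidate_on_insert(cache, new_tuple):
    """Purge every cached page whose selection predicate matches the new Car tuple."""
    maker, _model, _price = new_tuple
    stale = [url for url, entry in cache.items() if entry["maker"] == maker]
    for url in stale:
        del cache[url]  # the next request for this URL falls through to the database
    return stale


purged = invalidate_on_insert(page_cache, ("Toyota", "Avalon", 25000))
```

Only the Toyota page is purged; the Honda page stays cached because the new tuple cannot change its query result.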

  13. Overinvalidation as a tool. Overinvalidation can be used if accurate invalidation is too expensive or not feasible in a given time frame. Underinvalidation is not acceptable! Invalidation is inherently cheaper than view maintenance: we do not need to compute all consequences of updates, and to reduce the invalidation delay, we can overinvalidate.
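One way to picture this trade-off is a decision function that falls back to overinvalidation when a precise check would not fit a time budget. The sketch below is an assumption-laden illustration, not the presenters' algorithm: `should_invalidate`, its parameters, and the millisecond budget are all hypothetical names introduced here.

```python
def should_invalidate(local_match, estimated_check_ms, budget_ms, precise_check):
    """Decide whether to invalidate a cached result after an update.

    local_match: True if the update satisfies the query's local predicates
                 (a necessary condition for the update to affect the query).
    precise_check: zero-argument callable that runs the accurate, costly check.
    """
    if not local_match:
        return False  # the update cannot affect the query, so nothing to do
    if estimated_check_ms > budget_ms:
        return True  # overinvalidate: safe, though possibly unnecessary
    return precise_check()  # accurate invalidation fits the time budget


# The precise check would have said "not affected", but it is too slow,
# so the result is invalidated anyway.
decision = should_invalidate(True, estimated_check_ms=50, budget_ms=10,
                             precise_check=lambda: False)
```

The function never keeps a result that the precise check would have invalidated, so it can overinvalidate but cannot underinvalidate.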

  14. Query and update streams. [Figure: updates up1, up2, up3; invalidations inv1, inv2, inv3; queries q1, q2, q3, q4, q5]

  15. Example. Query: select * from Car, Mileage where Car.maker = "Toyota" and Car.model = Mileage.model; New tuples: ("Mitsubishi", "Galant", 23000) (no additional information required); ("Toyota", "Avalon", 25000) (additional information required). For the second tuple, we need to check whether Car.model = Mileage.model can be satisfied using the data in the database (polling query).

  16. Polling queries (cont.). Polling query that has to be answered: select * from Mileage where "Avalon" = Mileage.model; If the result of the polling query is non-empty, then the newly inserted tuple affected the query. Key point: we only need to check for existence; we do not need to evaluate the polling query completely.
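The existence check described above might look like the following sketch, which uses Python's standard `sqlite3` module and stops at the first matching row via `LIMIT 1`. The `Mileage` column `epa`, the sample row, and the helper name `insert_affects_query` are assumptions made for illustration; the slides only give the table and the `model` column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed schema and made-up sample data; only Mileage.model appears in the slides.
conn.execute("CREATE TABLE Mileage (model TEXT, epa INTEGER)")
conn.execute("INSERT INTO Mileage VALUES ('Avalon', 28)")


def insert_affects_query(conn, new_car):
    """Return True if inserting new_car into Car can change the cached join result."""
    maker, model, _price = new_car
    if maker != "Toyota":
        return False  # local predicate fails: no additional information required
    # Polling query: we only need to know whether a matching row exists,
    # so fetch at most one row instead of evaluating the query completely.
    row = conn.execute(
        "SELECT 1 FROM Mileage WHERE model = ? LIMIT 1", (model,)
    ).fetchone()
    return row is not None


affected = insert_affects_query(conn, ("Toyota", "Avalon", 25000))
unaffected = insert_affects_query(conn, ("Mitsubishi", "Galant", 23000))
```

The Mitsubishi tuple is rejected without touching the database, while the Toyota tuple triggers exactly one polling query.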

  17. ?: the effect of updates on join views

  18. ?: the effect of updates on join views. No distinction between deleted or inserted tuples; no need to evaluate the entire ?.

  19. Challenges in calculating ?. [Figure label: available from the update logs]

  20. Challenges in calculating ?. [Figure labels: not available!; available from the update logs] Snapshot-based: a copy of the database is maintained. Synchronous: a single copy is maintained; the copy is locked during invalidation. Asynchronous: a single copy is maintained; no locking is used.

  21. Snapshot-based approach (new and old versions are available)

  22. Results. Snapshot-based approach: no over- or under-invalidation; replication overhead.

  23. Synchronous approach (only the new version is available). The old version of the database is not available, so overinvalidation can occur.

  24. Results. Snapshot-based approach: no over- or under-invalidation; replication overhead. Synchronous approach: when there are more than two relations, unrecoverable overinvalidation is possible; locking overhead.

  25. Asynchronous approach (neither old nor new is available)

  26. Results. Snapshot-based approach: no over- or under-invalidation; replication overhead. Synchronous approach: when there are more than two relations, unrecoverable overinvalidation is possible; locking overhead. Asynchronous approach: when there are more than two relations, unrecoverable underinvalidation is possible; no overhead.

  27. Efficiency: consolidated invalidation. [Figure: time axis]

  28. Consolidated invalidation

  29. Consolidated invalidation

  30. Consolidated invalidation

  31. Consolidated invalidation

  32. Consolidation versus individual invalidation. Individual invalidation: the cost is expressed in terms of the average top-1 retrieval cost and the number of queries. Consolidated invalidation: the cost is expressed in terms of the total size of ?. [Cost formulas not recoverable]
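The contrast above can be illustrated with a sketch of one plausible reading of consolidation: rather than issuing one existence query per cached query instance, the batched updates are loaded into a table and joined once against a registry of cached queries. Everything here is an assumption made for illustration, including the `CachedQuery` and `DeltaCar` tables, the `epa` column, the page paths, and the sample rows; the slides do not show how consolidation is implemented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Mileage (model TEXT, epa INTEGER);
    -- Hypothetical registry of cached query instances (one row per cached page).
    CREATE TABLE CachedQuery (page TEXT, maker TEXT);
    -- Hypothetical table holding the batched updates to Car.
    CREATE TABLE DeltaCar (maker TEXT, model TEXT, price INTEGER);

    INSERT INTO Mileage VALUES ('Avalon', 28), ('Civic', 35);
    INSERT INTO CachedQuery VALUES
        ('/modelinfo?car=Toyota', 'Toyota'),
        ('/modelinfo?car=Honda', 'Honda'),
        ('/modelinfo?car=Ford', 'Ford');
    INSERT INTO DeltaCar VALUES
        ('Toyota', 'Avalon', 25000),
        ('Mitsubishi', 'Galant', 23000),
        ('Honda', 'Civic', 18000),
        ('Ford', 'Focus', 15000);
""")


def consolidated_invalidation(conn):
    """Find all affected cached pages with a single query over the update batch."""
    rows = conn.execute("""
        SELECT DISTINCT q.page
        FROM CachedQuery AS q
        JOIN DeltaCar AS d ON d.maker = q.maker
        WHERE EXISTS (SELECT 1 FROM Mileage AS m WHERE m.model = d.model)
        ORDER BY q.page
    """).fetchall()
    return [page for (page,) in rows]


to_purge = consolidated_invalidation(conn)
```

In this sketch the work grows with the size of the update batch rather than with the number of cached query instances checked one by one, which is the trade-off the slide points at.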

  33. Polling query overhead

  34. Polling query overhead

  35. Overinvalidation vs. table sizes

  36. Overinvalidation vs. update rate

  37. Conclusions. Fast invalidation is key for caching in multi-tiered architectures. Hard consistency is not required by many applications: overinvalidation is acceptable; underinvalidation is not! View invalidation is inherently cheaper than view maintenance. View invalidation is feasible!
