2/10/2009 Answering Queries Using Views What is a view? • A stored query that can be queried like a relation A stored query that can be queried like a relation Answering Queries Using Answering Queries Using • – Materialized view: result set is stored – Virtual view: not stored Views Views • SQL view: SQL view: • CREATE VIEW Animation_Heros AS SELECT Main.Name FROM Main, Genre Paper by Alon Halevy Paper by Alon Halevy WHERE Main.Hero = Genre.movie AND Genre.Type = “Animation” Presentation by Oana Sandu Presentation by Oana Sandu • Datalog view: Datalog view: • movie-hero(hero) :- Main(hero, movie), Discussion by Doug Bateman Discussion by Doug Bateman Genre(movie, "Animation") Answering Queries Using Views Answering Queries Using Views Answering queries using views Data Management Problems • Answer a query in terms of views rather Answer a query in terms of views rather • We will discuss AQUV in 2 contexts: We will discuss AQUV in 2 contexts: than using the underlying base table than using the underlying base table • Query: Query: • 1. Query Optimization 1. Query Optimization q(hero) :- Main(hero, movie), – Speed up a query Genre(movie, “Animation”) • Materialized view: Materialized view: • 2. Data Integration 2. Data Integration hero-type(hero, type) :- Main(hero, movie), – Query over many local databases Genre(movie, type) • Rewriting using view: Rewriting using view: q(hero):-hero- type(hero, “Animation”) Answering Queries Using Views Answering Queries Using Views Query Optimization: Query Optimization Closed World • Rewrite query in terms of views Rewrite query in terms of views and and base base • Assumption: views are complete Assumption: views are complete • tables tables • We want an We want an equivalent query rewriting: equivalent query rewriting: retrieve retrieve • – Query: exactly the same answers as the original query exactly the same answers as the original query q(hero, tophero):- Main(hero, movie), Query: q(hero) : Query: Genre(movie, type), q(hero) :- Main(hero, movie), Genre(movie, “Action”) Main(hero, movie), Genre(movie, “Action”) TopCharacter(type, tophero) Views: Views: – View: hero hero-type(hero, type) : type(hero, type) :- Main(hero, movie), Genre(movie, type) Main(hero, movie), Genre(movie, type) hero-type(hero, type) :- Main(hero, movie), dreamworks-hero(hero,type) : dreamworks hero(hero,type) :- Main(hero, movie), Main(hero, movie), Genre(movie, type) Genre(movie,type), Producer(movie, “Dreamworks”) Genre(movie,type), Producer(movie, “Dreamworks”) Equivalent rewriting: Equivalent rewriting: – Rewriting: q(hero, tophero):- hero-type(hero, type), q(hero): q(hero):-hero hero- type(hero, “Action”) type(hero, “Action”) TopCharacter(type, tophero) Nonequivalent rewriting: Nonequivalent rewriting: q(hero): q(hero):- dreamworks dreamworks- hero(hero,”Action”) hero(hero,”Action”) 1
2/10/2009 Answering Queries Using Views Answering Queries Using Views Query Optimization: Discussion: Question 1 Incorporating Views • Fold views into System Fold views into System-R style R style • What sorts of traditional database What sorts of traditional database optimizer optimizer scenarios can you imagine using scenarios can you imagine using many materialized views? (Think many materialized views? (Think • Views used for alternate access Views used for alternate access about YOUR data.) about YOUR data.) paths paths • Determine which views are usable in Determine which views are usable in answering query q answering query q • Compare evaluation plans Compare evaluation plans – Some of q might be precomputed in views – Sometimes cheaper to use base tables Answering Queries Using Views Answering Queries Using Views Discussion: Question 2 Data Integration: • Imagine that you're building a query Imagine that you're building a query optimizer. Would you consider it optimizer. Would you consider it worth your while to use views when worth your while to use views when answering queries? Why or why not? answering queries? Why or why not? Would you try it only for certain Would you try it only for certain kinds of queries? Which ones? How kinds of queries? Which ones? How does this tradeoff compare with does this tradeoff compare with using bushy trees? using bushy trees? Answering Queries Using Views Answering Queries Using Views Data Integration: Open World Data Integration: Rewriting • Query written in terms of a Query written in terms of a mediated mediated • Local sources: Local sources: schema schema (not materialized!) (not materialized!) – maintained autonomously, can be incomplete • So need an algorithm to rewrite the So need an algorithm to rewrite the Expedia: info on Carribean beaches query in terms of the local schemas query in terms of the local schemas Fodor’s: info on US beaches • Assumption: Open World Assumption: Open World – local sources together need not • Success Criteria: answer includes as Success Criteria: answer includes as cover all tuples in the many correct tuples as can be many correct tuples as can be conceptual Beaches determined from local sources determined from local sources 2
2/10/2009 Answering Queries Using Views Answering Queries Using Views Data Integration: Rewriting Data Integration: Local As View • How do we rewrite the query? How do we rewrite the query? • Mediated Schema: Mediated Schema: Mediated Airport(code, city) Airport(code, city) Schema • Need some mapping between Need some mapping between Feature(city, attraction) Feature(city, attraction) mediated schema and local sources mediated schema and local sources … Beaches AirCanada • Local As View approach: Local As View approach: • Local Sources as Views: Local Sources as Views: – Local sources expressed as views over mediated schema AirCanada(code, city) : AirCanada(code, city) :- Airport(code, city) Airport(code, city) – Executing queries over mediated schema Beaches(code) :- Airport(code, city), Beaches(code) : Airport(code, city), reduces to AQUV Feature(city, “Beach”) Feature(city, “Beach”) • Can easily add new sources Can easily add new sources Answering Queries Using Views Answering Queries Using Views AQUV in Data Integration: Discussion: question 3 Maximally Contained Rewritings • What cases can you imagine using What cases can you imagine using • Query: Query: data integration? (Think about YOUR data integration? (Think about YOUR Dest(code) :- Airport(code, city), Feature(city, “Beach”) Dest(code) : Airport(code, city), Feature(city, “Beach”) • Sources/Views: Sources/Views: data.) data.) CAA CAA-Air(code, city) : Air(code, city) :- Airport(code, city) Airport(code, city) Fodors(city, POI) : Fodors(city, POI) :- Feature(city, POI) Feature(city, POI) • Rewriting: Rewriting: Dest(code):-CAA Dest(code): CAA- Air(code, city), Fodors(city, “Beach”) Air(code, city), Fodors(city, “Beach”) • Contained Rewriting: all answers obtained are Contained Rewriting: all answers obtained are valid answers to query valid answers to query • Maximally Contained: get all possible answers Maximally Contained: get all possible answers given the local sources given the local sources • Finding rewriting is NP Finding rewriting is NP-complete complete Answering Queries Using Views Answering Queries Using Views Data Integration: Bucket Algorithm Data Integration: Rewriting Algorithms • 1. Bucket Algorithm 1. Bucket Algorithm A,B,C C,D – breaks down query answering into answering Dest(code) : Dest(code) :- Airport(code, city), Feature(city, “Beach”) Airport(code, city), Feature(city, “Beach”) subgoals using views • Step 1: Step 1: • 2. MiniCon 2. MiniCon – Create a bucket for each query subgoal – considers subgoal interactions to reduce – For each bucket, place all relevant views search space • Step 2: Step 2: – Checks all rewritings: cross-product of buckets (A,B,C)X(C,D) = (A,C),(A,D),(B,C),(B,D),(C,C),(C,D) – Containment checking is NP-hard! 3
Recommend
More recommend