ROLEX Relational On-Line Exchange with XML Phil Bohannon, Sumit Ganguly, Hank Korth, PPS Narayan, Pradeep Shenoy Database Principles Research Department Information Sciences Research Center Bell Labs
ROLEX Evolution of Applications Evolution of Applications � In the beginning was client- server B2B � Many layers of software later, Client Application applications are portable and Web browser-based…but slow! Servers � So, cache some data, manage HTML XML consistency in application App. Servers � Newest applications speak Application XML cache cache Net J2EE � Supported by XML Publishing XML JDBC XML through View Queries ODBC SQL/ODBC Pub/Shred � Even slower, more caching DBMS Lucent Proprietary ROLEX 2
ROLEX Example: A ROLEX View Query Example: A ROLEX View Query � Relational Schema � Desired Hierarchical Structure Metroarea (metroid, metroname) <metroarea> Hotel (hotelid, hotelname, starrating, chain_id, metro_id, city, pool, gym) <hotel> Guestroom (r_id, roomnumber, type, rackrate) <confstat> <guestrooms> Confroom (c_id, chotel_id, croomnumber, capacity, rackrate) Availability (a_id, a_r_id, startdate, enddate, price) Lucent Proprietary ROLEX 3
ROLEX Example: Add “tag queries” Example: Add “tag queries” <metro> $m = SELECT metroid, metroname FROM metro <hotel> $h = SELECT hotelid, hotelname, starrating, state_id FROM hotel WHERE metro_id = $m.metroid AND starrating > 4 <confstat> <guestroom> Lucent Proprietary ROLEX 4
ROLEX Example: “binding variables” Example: “binding variables” <metro> $m = SELECT metroid, metroname FROM metro <hotel> $h = SELECT hotelid, hotelname, starrating, state_id FROM hotel WHERE metro_id = $m.metroid AND starrating > 4 <confstat> <guestroom> Lucent Proprietary ROLEX 5
ROLEX Our World View Our World View � In the beginning was client- server B2B � Many layers of software later, Client Application applications are portable and Web browser-based…but slow! Servers � So, cache some data, manage HTML XML consistency in application App. Servers � Newest applications speak Application XML cache cache Net J2EE � Even slower, more caching XML JDBC XML ODBC SQL/ODBC Pub/Shred � Worse, applications need to interoperate for OLTP DBMS Lucent Proprietary ROLEX 6
ROLEX Current XML Publishing Current XML Publishing XML DOM Logic Tree Parser Application View Query XML Text Issues Query Tagger Rewrite Publisher � Inherent inefficiencies Tuple- SQL Stream � DOM Tree = Cache? Execute Optimize � Updates? DBMS Lucent Proprietary ROLEX 7
ROLEX Six Steps to ROLEX: Step 1 Six Steps to ROLEX: Step 1 XML DOM Logic Tree Application Parser View Query XML Text “ROLEX” 1. Combine DBMS and Query Tagger Publisher (many Rewrite vendors) Tuple SQL Stream Optimize Execute DBMS Lucent Proprietary ROLEX 8
ROLEX Six Steps to ROLEX: Step 2 Six Steps to ROLEX: Step 2 XML DOM Logic Tree Application Parser View Query XML Text ROLEX 1. DBMS += Publisher Tagger Optimize Tree 2. Optimize to produce Tree tree results Query Execute DBMS Lucent Proprietary ROLEX 9
ROLEX Six Steps to ROLEX: Step 3 Six Steps to ROLEX: Step 3 XML DOM Logic Tree Application Parser View Query XML Text ROLEX 1. DBMS += Publisher Tagger Optimize Tree 2. Optimize to produce Tree trees Query Execute 3. Main-Memory DBMS TM DataBlitz DBMS Lucent Proprietary ROLEX 10
ROLEX Six Steps to ROLEX: Step 4 Six Steps to ROLEX: Step 4 View Logic Query DOM Application Tree Optimize Tree Tree ROLEX Query 4. Share memory, produce DOM tree directly, Execute eliminate parsing TM DataBlitz DBMS Lucent Proprietary ROLEX 11
ROLEX Six Steps to ROLEX: Step 5 Six Steps to ROLEX: Step 5 View Logic Query DOM Application Tree Optimize Tree Query 4. Share memory, produce DOM tree directly, eliminate parsing TM DataBlitz Execute ROLEX 5. Virtual DOM, lazy evaluation Lucent Proprietary ROLEX 12
ROLEX Six Steps to ROLEX: Step 6 Six Steps to ROLEX: Step 6 View Query + Navigation Profile Logic DOM Application Tree Optimize Tree Query 6. Optimize for the expected navigation profile Execute TM DataBlitz ROLEX Lucent Proprietary ROLEX 13
ROLEX Considering Expected Use Considering Expected Use � Recent interest in making use of user feedback during long queries � MQO takes advantage of co-incident query execution � What about consecutive execution? � OLTP interaction with DBMS very stylized, yet little work looks at pattern of related queries � Notable exception: [Florescu, Levy, Suciu, Yagoub, 1999] Lucent Proprietary ROLEX 14
ROLEX Navigational Profile Navigational Profile � Navigational Profile, an input to query optimization � Simple approach to expected use for a tree- based data model <metro> � Example Profile: 0.2 <hotel> Pr{hotel|metro} 1.0 0.1 Pr{confstat|hotel} <confstat> <guestrooms> Lucent Proprietary ROLEX 15
ROLEX Remainder of Talk Remainder of Talk � The Virtual DOM � “Document Object Model” � A Decorrelation Plan Space � A VOLCANO-Based Optimizer � Cost Estimation � Performance � Conclusion Lucent Proprietary ROLEX 16
ROLEX Notation for queries Notation for queries Schema-tree <metro> $m = SELECT metroid, metroname nodes FROM metro <hotel> $h = SELECT hotelid, hotelname, starrating, state_id FROM hotel WHERE metro_id = $m.metroid AND starrating > 4 Q h (m) <guestroom> <confstat> � Let the query for a schema-tree node be Q b (s1,..,sk) where s1, .., sk are parameters and b is a binding variable . Lucent Proprietary ROLEX 17
ROLEX The Virtual DOM The Virtual DOM � A query “sub-plan” and <metro> NI navigation index , NI, for each node � maps s1,…sk to query results or “ absent” <hotel> � supports sibling navigation � Allow support for DOM operations within DB � If s1,..,sk “absent” in NI, <guestrooms> execute sub-plan for node. Lucent Proprietary ROLEX 18
ROLEX Decorrelation Plan-Space Decorrelation Plan-Space � Usual strategy: decorrelate as much as possible to explore alternate join orders � ROLEX strategy: may want to leave sub- query correlated to avoid work when navigation probability is low � Approach: develop a decorrelation plan- space and let the optimizer decide what to do Lucent Proprietary ROLEX 19
ROLEX Decorrelation Decorrelation Q m () <metro> $m = SELECT metroid, metroname FROM metro $h = SELECT hotelid, hotelname, starrating, state_id FROM hotel, metro <hotel> WHERE metro_id = metroid AND starrating > 4 $h = SELECT hotelid, hotelname, GROUP BY hotelid starrating, state_id FROM hotel WHERE metro_id = $m.metroid Q hm () AND starrating > 4 Q h (m) Lucent Proprietary ROLEX 20
ROLEX Decorrelation Plan-Space Decorrelation Plan-Space Alternatives for Q v (m,a): <metroarea> Q m () a (h,m), {Q v (m,a), Q v Q h (m) <hotel> a,h (m), a,h,m ()} Q v Q v <hotel_available> Q a (h) <metro_available> Q v (m,a) Lucent Proprietary ROLEX 21
ROLEX Notes Notes � We always replace bottom query rather than top query � Keeps space from being exponential � Outer-join rules avoided � Cost of greater emphasis on CSE required by optimizer Lucent Proprietary ROLEX 22
ROLEX VOLCANO-Based Optimizer VOLCANO-Based Optimizer � VOLCANO [Graefe McKenna 1993]- style optimizer [Roy 2000] modified to support ROLEX optimization � New operators (NavNodes) added to model schema-tree nodes from the query � Decorrelation plan-space modeled as distinct but equivalent NavNodes � VOLCANO automatically collapses equivalent plan nodes across the tree Lucent Proprietary ROLEX 23
ROLEX Cost Model Cost Model � Goal: � estimate expected unique calls for each node given a Navigation Profile � Approach: � Estimate, for each schema-tree node, n , in query plan: � …expected # visits: EVis(n) � …expected output from one call: ESize(n) � …exp. unique parameter bindings: EUniqBind(s1,..,sk) � Unique calls = min(EVis(n),EUniqBind(s1,..,sk)) Lucent Proprietary ROLEX 24
ROLEX Materialization Materialization � For individual query operators, consider only the binding parameters present in that operator � Materialize {$m,$h} “opportunistically” as bindings increase: {$h} {$m} � Compute this sub- expression only when $m {$h} σ changes {$m} hotelid=$h.id σ metroid=$m.id � Generalizes [Rao Ross 98] � If EVis(n)>EUniqBind(s1,..,sk) for a node, then materialize that node’s result Lucent Proprietary ROLEX 25
ROLEX Experimental Setup Experimental Setup � Execution engine built on top of DataBlitz™ tuple-layer interface � Handles complex queries including group-by and aggregation � Virtual DOM under construction � Prototype for performance testing scans tree result � Obeys navigation probabilities � Does a local ‘sprintf’ to model consumption of a node Lucent Proprietary ROLEX 26
Recommend
More recommend