Query Containment for Conjunctive Queries With Regular Expressions
Daniela Florescu, Alon Levy, Dan Suciu
Dan Suciu, AT&T Labs, PODS'98
Introduction
Semistructured Data. Features:
• Does not fit into a pre-existing, fixed schema
• Missing attributes
• Attributes of unknown cardinalities
• Set of attributes unknown in advance
• Irregular nesting
Query Languages for Semistructured Data. Features:
• Label variables
• Regular path expressions: recursive queries
Motivation
Current applications of semistructured data:
• Data integration [Tsimmis]
• Web querying [WebSQL, WebOQL]
• General-purpose [Lore, UnQL]
• Web-site management [Strudel]
Query containment is needed in semistructured data for:
• Checking integrity constraints for Web sites [Strudel]
• Query rewriting to accommodate a variety of storage methods.
Previous Work on Query Containment and Equivalence
Non-recursive Conjunctive Queries:
• decidable for conjunctive queries [Chandra and Merlin 77]
• w/ union [Sagiv and Yannakakis 81]
• w/ order and inequalities [Klug 88]
• w/ nested relations [Levy and Suciu 97]
Results on Recursive Queries:
• undecidable for Datalog [Shmueli 93]
• recursive vs. nonrecursive [Chaudhuri and Vardi 92]
• several results on deciding boundedness of recursive queries
• No positive results for true recursive queries.
Outline
• Background (semistructured data and queries)
• Definitions
• Main results
• Containment for simple regular expressions.
• Containment for arbitrary regular expressions.
• Conclusions
Background: Semistructured Data
Definition: Graph Database = Graph with labeled edges
[Figure: example graph database whose edges are labeled tup, name, phone, a, b, c, d, with leaf values john, sue, joe, 2654, 1234, 5469]
Other possible variations: values attached to leaves.
Background: Query Languages for Semistructured Data
Languages: LOREL, UnQL, WebSQL, WebOQL, StruQL.
Definition: An atomic condition $X\ R\ Y$ where $X, Y$ = variables, $R$ = regular path expression.
Examples:
$X\ a\ Y$: simple label
$X\ (a.(b \mid c.d))\ Y$: $a$ followed by $b$ or by $c.d$
$X\ (a^{+}.b^{*})\ Y$: regular path expression
$X\ (L.(a.L)^{*})\ Y$: label variables
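The meaning of an atomic condition $X\ R\ Y$ can be illustrated with a small sketch. It assumes single-character edge labels, so that the slide's $a.(b \mid c.d)$ becomes the Python regex `a(b|cd)`, and it only explores paths up to a fixed length. The toy database `DB`, the function name `reachable`, and the `max_len` bound are all hypothetical, not part of the talk.

```python
import re
from collections import deque

# Hypothetical toy graph database: node -> list of (label, target) edges.
DB = {
    "root": [("a", "n1")],
    "n1": [("b", "n2"), ("c", "n3")],
    "n3": [("d", "n4")],
    "n2": [],
    "n4": [],
}

def reachable(db, start, pattern, max_len=6):
    """Return all nodes Y such that some path start -> Y with at most
    max_len edges spells a label word that fully matches `pattern`."""
    regex = re.compile(pattern)
    found = set()
    queue = deque([(start, "")])
    while queue:
        node, word = queue.popleft()
        if regex.fullmatch(word):
            found.add(node)
        if len(word) < max_len:  # one character per edge (assumption)
            for label, target in db.get(node, []):
                queue.append((target, word + label))
    return found

print(sorted(reachable(DB, "root", "a(b|cd)")))  # ['n2', 'n4']
```

A bounded search is enough for this acyclic example; a real evaluator would instead run the product of the graph with an automaton for $R$ so that cycles are handled.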
Conjunctive Queries with Regular Expressions
• Definition $Q$: $q(X)$ :- $Y_1\ R_1\ Z_1, \ldots, Y_n\ R_n\ Z_n$
• Meaning on database $DB$: a $k$-ary relation $Q(DB)$, where $k = |X|$.
Example:
$q(L, Y)$ :- $X\ (a.L)\ Y,\ Y\ L\ Z$
[Figure: example database DB, a tree with nodes u1 to u15 and edges labeled a, b, c, d, shown next to the result table Q(DB) with columns L and Y, whose Y entries are u5 and u8]
The Containment and Equivalence Problems
Containment: Given $Q_1, Q_2$, check if $\forall DB,\ Q_1(DB) \subseteq Q_2(DB)$
Equivalence: Given $Q_1, Q_2$, check if $\forall DB,\ Q_1(DB) = Q_2(DB)$
Example:
$Q_1$: $q_1(X, Z)$ :- $X\ a^{+}\ Z$
$Q_2$: $q_2(X, Z)$ :- $X\ L^{+}\ Z,\ Y\ a\ Z,\ X\ (a \mid (a.b))\ Z$ [the last expression also carries a + and a * superscript whose placement is not recoverable]
$Q_1 \subseteq Q_2$
Note: The containment and equivalence problem for regular expressions is PSPACE-complete [Stockmeyer and Meyer 1973].
Main Results
Theorem: Containment (and equivalence) of conjunctive queries with regular path expressions is decidable in exponential space.
Theorem: Containment (and equivalence) of conjunctive queries with simple regular path expressions is NP-complete.
Simple Regular Expressions
Definition: $R ::= r_1.r_2.\ldots.r_n$ where each $r_i$ is a constant or $*$
Examples:
$a.b.c$
$a.*.c.*.a$
$*.a.*.*.b$ $(= *.a.*.b)$
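A small sketch can make the last example concrete. It assumes that $*$ stands for an arbitrary (possibly empty) sequence of labels, which is the reading under which $*.a.*.*.b = *.a.*.b$ holds, and that labels are single characters. The helper names `simple_to_regex` and `matches` are hypothetical.

```python
import re
from itertools import product

def simple_to_regex(expr):
    """Translate a simple regular expression such as '*.a.*.b' into a
    compiled Python regex. Each component is a constant label or '*',
    read here as 'any sequence of labels' (an assumption)."""
    parts = []
    for component in expr.split("."):
        parts.append(".*" if component == "*" else re.escape(component))
    return re.compile("".join(parts))

def matches(expr, word):
    """True if the label word (one character per label) matches expr."""
    return simple_to_regex(expr).fullmatch(word) is not None

# Compare the two expressions from the slide on all words over {a, b}
# of length at most 6; this is a finite sanity check, not a proof.
words = ["".join(w) for n in range(7) for w in product("ab", repeat=n)]
same = all(matches("*.a.*.*.b", w) == matches("*.a.*.b", w) for w in words)
print(same)
```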
Containment of Conjunctive Queries with Simple Regular Expressions
Containment test: $Q \subseteq Q'$ iff there exists a query mapping $f : Q' \to Q$ s.t.
• $X'\ a\ Y'$ is mapped to a similar condition $X\ a\ Y$
• $X'\ *\ Y'$ is mapped to any chain $X\ r_1\ Z_1,\ Z_1\ r_2\ Z_2,\ \ldots,\ Z_{n-1}\ r_n\ Y$
Example:
$X\ c\ Y,\ Y\ (*.a)\ Z,\ Y\ (b.d)\ U \ \subseteq\ X'\ *\ Y',\ Y'\ a\ Z',\ X'\ *\ U',\ U'\ b\ W'$
[Figure: the query graphs of the two queries, with edges labeled a, b, c, d and *, illustrating the mapping]
Hence: NP-complete.
Does Not Work in General Case
One query mapping is not sufficient:
• $Q_1$: $q_1(X)$ :- $X\ a^{*}\ U,\ U\ b\ V,\ U\ (a.b)\ V$
• $Q_2$: $q_2(X)$ :- $X\ ((a.a)^{*}.b)\ Y$
$Q_1 \subseteq Q_2$, but it is "witnessed" by two query mappings.
[Figure: query graph of Q1 (node X, edges a*, b, a.b) and query graph of Q2 (node X, edge (a.a)*.b), with a "?" between them]
Need to consider several query mappings.
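Why two mappings are needed can be seen by expanding $a^{*}$ into $k$ edges labeled $a$: for even $k$ the $b$ branch of $Q_1$ completes a path in $(a.a)^{*}.b$, for odd $k$ the $a.b$ branch does. The following is only an illustrative sketch of that parity argument using Python's `re` module; the function name `witness` is hypothetical.

```python
import re

# The path expression (a.a)*.b of Q2, with one character per label.
Q2_PATH = re.compile(r"(aa)*b")

def witness(k):
    """For the expansion of `X a* U` into k a-edges, report which Q1
    branch from U to V completes a path matching Q2: 'b' or 'a.b'."""
    prefix = "a" * k
    if Q2_PATH.fullmatch(prefix + "b"):
        return "b"
    if Q2_PATH.fullmatch(prefix + "ab"):
        return "a.b"
    return None

for k in range(6):
    print(k, witness(k))
```

Neither branch alone covers every $k$, which is the sense in which the containment is witnessed by two mappings rather than one.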
Another Example
$q_1(X)$ :- $X\ (a.*)\ Y,\ Y\ (b.c)\ Z,\ Y\ c\ U$
$q_2(X)$ :- $X\ (a.*.c)\ W$
Need to map $W$ either to $Z$ or to $U$.
Canonical Databases
Definition: Given $Q_1$, a canonical database $DB$ is one obtained by "expanding" the query graph.
Example:
$Q_1$: $q_1(X)$ :- $X\ a^{*}\ U,\ U\ b\ V,\ U\ (a.b)\ V$
[Figure: the query graph of Q1 (node X, edges a*, b, a.b) next to one canonical database DB in which a* is replaced by a chain of a-edges]
In general there are infinitely many canonical $DB$ for $Q_1$.
Proposition: $Q_1 \subseteq Q_2$ iff $X \in Q_2(DB)$ for all canonical databases $DB$.
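The proposition can be spot-checked on the running example. The sketch below builds the canonical database of $Q_1$ for a given expansion length $k$ of $a^{*}$ and tests whether $X$ is an answer of $Q_2$: $q_2(X)$ :- $X\ ((a.a)^{*}.b)\ Y$ on it. The edge-list encoding, the node names, and the functions `canonical_db` and `answers_q2` are assumptions made for illustration; checking finitely many values of $k$ is not a decision procedure.

```python
import re

def canonical_db(k):
    """Canonical database of Q1 obtained by expanding `X a* U` into k
    a-edges. Nodes 0..k form the a-chain (X = 0, U = k); U reaches V
    directly by b and through an intermediate node by a then b."""
    edges = [(i, "a", i + 1) for i in range(k)]
    u, mid, v = k, "m", "V"
    edges += [(u, "b", v), (u, "a", mid), (mid, "b", v)]
    return edges

def answers_q2(edges, x, max_len):
    """True if some path from x with at most max_len edges spells a word
    in (a.a)*.b, i.e. x is an answer of Q2 on this database."""
    pattern = re.compile(r"(aa)*b")
    stack = [(x, "")]
    while stack:
        node, word = stack.pop()
        if pattern.fullmatch(word):
            return True
        if len(word) < max_len:
            stack.extend((t, word + l) for s, l, t in edges if s == node)
    return False

# Each canonical database is acyclic with longest path k + 2.
assert all(answers_q2(canonical_db(k), 0, k + 2) for k in range(8))
```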
Canonical Databases and Query Mappings
• Each mapping $f : Q_2 \to Q_1$ proves $X \in Q_2(DB)$ for a certain set $S_f$ of canonical $DB$'s
• There are only exponentially many mappings $f : Q_2 \to Q_1$.
• Hence: to check $Q_1 \subseteq Q_2$ it suffices to check that $\bigcup_f S_f$ = all canonical databases.
• How to compute $S_f$?