From relation algebra to semi-join algebra: an approach for graph query optimization
Jelle Hellings (1), Catherine L. Pilachowski (2), Dirk Van Gucht (2), Marc Gyssens (1), Yuqing Wu (3)
(1) Hasselt University, (2) Indiana University, (3) Pomona College
1/19
Graph queries: data model
[Figure: example graph with nodes Alice, Bob, Carol, Dan, Faythe, Grace, Peggy, Victor, Wendy and edges labeled WorksWith, FriendOf, ParentOf]
2/19
Graph queries: basic path queries
[Figure: the example graph with nodes Alice, Bob, Carol, Dan, Faythe, Grace, Peggy, Victor, Wendy and edges labeled WorksWith, FriendOf, ParentOf]
(WorksWith ∪ FriendOf) ◦ [ParentOf]+ ◦ FriendOf
3/19
Graph queries: node-tests and branching
[Figure: the example graph with nodes Alice, Bob, Carol, Dan, Faythe, Grace, Peggy, Victor, Wendy and edges labeled WorksWith, FriendOf, ParentOf]
π1[ParentOf ◦ ParentOf ◦ ParentOf] ◦ FriendOf
4/19
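Graph querying: relation algebra
Operators: +, id, ∪, ◦, ⌢ (converse), ∩, −, di, π, π̄ (coprojection)
[Diagram: query languages placed against these operators: RPQs; 2RPQs; Nested RPQs; Navigational XPath, Graph XPath; FO[3] + transitive closure]
5/19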
Relation algebra and query evaluation
Operators: +, id, ∪, ◦, ⌢, ∩, −, di, π, π̄
◮ Cheap (∪, ⌢, π, ∩, −): cost linearly upper bounded by the size of the operands
◮ In between (id, π̄): cost linearly upper bounded by #nodes
◮ Expensive (◦, +, di): worst-case cost quadratically lower bounded by #nodes
6/19
Naive query evaluation: an inefficient example
Return pairs of (great-grandparent, friend):
π1[ParentOf ◦ ParentOf ◦ ParentOf] ◦ FriendOf
1. Compute (grandparent, grandchild): X = ParentOf ◦ ParentOf
2. Compute (great-grandparent, great-grandchild): Y = ParentOf ◦ X
3. Throw away the great-grandchildren: Z = π1[Y]
4. Compute (great-grandparent, friend): Result = Z ◦ FriendOf
7/19
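The four naive steps can be sketched in Python over relations stored as sets of pairs. This is only an illustration: the helper names `compose` and `project_left` and the sample data are hypothetical and do not come from the slides.

```python
def compose(r, s):
    """r ◦ s: pairs (a, c) such that (a, b) is in r and (b, c) is in s."""
    by_source = {}
    for b, c in s:
        by_source.setdefault(b, set()).add(c)
    return {(a, c) for a, b in r for c in by_source.get(b, ())}


def project_left(r):
    """π1[r]: identity pairs (a, a) for every a that starts an r-pair."""
    return {(a, a) for a, _ in r}


# Hypothetical sample data, not the graph shown on the slides.
parent_of = {("Ann", "Ben"), ("Ben", "Cat"), ("Cat", "Dee"), ("Ben", "Eli")}
friend_of = {("Ann", "Zed"), ("Ben", "Zed")}

x = compose(parent_of, parent_of)   # step 1: (grandparent, grandchild)
y = compose(parent_of, x)           # step 2: (great-grandparent, great-grandchild)
z = project_left(y)                 # step 3: drop the great-grandchildren
result = compose(z, friend_of)      # step 4: (great-grandparent, friend)
print(result)
```

Steps 1 and 2 materialize full descendant pairs even though step 3 discards the second column, which is the inefficiency the slide points at.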
Optimize query evaluation: add specialized operators?
Return pairs of (great-grandparent, friend):
π1[ParentOf ◦ ParentOf ◦ ParentOf] ◦ FriendOf
1. Compute (grandparent, ???): X = ParentOf ⋉ ParentOf
2. Compute (great-grandparent, ???): Y = ParentOf ⋉ X
3. Throw away ???: Z = π1[Y]
4. Compute (great-grandparent, friend): Result = Z ⋊ FriendOf
Rewritten query: π1[ParentOf ⋉ (ParentOf ⋉ ParentOf)] ⋊ FriendOf
8/19
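A minimal sketch of the semi-join variant follows. The slides do not spell out the semantics of ⋉ and ⋊, so the code assumes the usual reading: r ⋉ s keeps the pairs of r whose target has an outgoing s-pair, and r ⋊ s keeps the pairs of s whose source has an incoming r-pair. Function names and sample data are hypothetical.

```python
def semijoin_left(r, s):
    """r ⋉ s (assumed semantics): pairs (a, b) of r such that b starts some s-pair."""
    sources = {a for a, _ in s}
    return {(a, b) for a, b in r if b in sources}


def semijoin_right(r, s):
    """r ⋊ s (assumed semantics): pairs (a, b) of s such that a ends some r-pair."""
    targets = {b for _, b in r}
    return {(a, b) for a, b in s if a in targets}


def project_left(r):
    """π1[r]: identity pairs (a, a) for every a that starts an r-pair."""
    return {(a, a) for a, _ in r}


# Hypothetical sample data, not the graph shown on the slides.
parent_of = {("Ann", "Ben"), ("Ben", "Cat"), ("Cat", "Dee"), ("Ben", "Eli")}
friend_of = {("Ann", "Zed"), ("Ben", "Zed")}

x = semijoin_left(parent_of, parent_of)  # parents whose child is also a parent
y = semijoin_left(parent_of, x)          # parents whose child is a grandparent
z = project_left(y)                      # keep only the great-grandparents
result = semijoin_right(z, friend_of)    # (great-grandparent, friend)
print(result)
```

Under these assumptions every intermediate result is a subset of one of its operands, so no step can grow beyond the size of the input relations.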
Simple idea: automatic query rewriting
◮ Rewrite composition into semi-joins
◮ Rewrite transitive closure into fixpoints
In such a way that the rewritten query is equivalent to the original query
9/19
When are expressions equivalent?
Definition. Queries q1 and q2 are
◮ path-equivalent if, for every graph G, [[q1]]_G = [[q2]]_G (denoted by q1 ≡path q2)
◮ left-projection-equivalent if, for every graph G, [[q1]]_G|1 = [[q2]]_G|1 (denoted by q1 ≡π1 q2)
◮ right-projection-equivalent if, for every graph G, [[q1]]_G|2 = [[q2]]_G|2 (denoted by q1 ≡π2 q2)
Example
◮ R ∩ S ≡path R − (R − S)
◮ R ◦ S ≡π1 R ⋉ S
◮ π1[R ◦ S] ≡path π1[R ⋉ S]
10/19
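The three example equivalences can be spot-checked on random relations. The sketch below is not a proof, since the definitions quantify over every graph, and it assumes that R ⋉ S keeps the pairs of R whose target starts some S-pair. All helper names are hypothetical.

```python
import random


def compose(r, s):
    """r ◦ s: pairs (a, c) such that (a, b) is in r and (b, c) is in s."""
    by_source = {}
    for b, c in s:
        by_source.setdefault(b, set()).add(c)
    return {(a, c) for a, b in r for c in by_source.get(b, ())}


def semijoin_left(r, s):
    """r ⋉ s (assumed semantics): pairs (a, b) of r such that b starts some s-pair."""
    sources = {a for a, _ in s}
    return {(a, b) for a, b in r if b in sources}


def left_column(r):
    """The restriction |1: the set of first components."""
    return {a for a, _ in r}


def project_left(r):
    """π1[r]: identity pairs (a, a) for every a that starts an r-pair."""
    return {(a, a) for a, _ in r}


def random_relation(rng, nodes, pairs):
    return {(rng.choice(nodes), rng.choice(nodes)) for _ in range(pairs)}


rng = random.Random(0)
nodes = list(range(6))
all_hold = True
for _ in range(200):
    r = random_relation(rng, nodes, 8)
    s = random_relation(rng, nodes, 8)
    path_ok = (r & s) == r - (r - s)
    left_ok = left_column(compose(r, s)) == left_column(semijoin_left(r, s))
    proj_ok = project_left(compose(r, s)) == project_left(semijoin_left(r, s))
    all_hold = all_hold and path_ok and left_ok and proj_ok
print(all_hold)

# R ◦ S and R ⋉ S agree on the left column only, not as sets of pairs.
r, s = {(1, 2)}, {(2, 3)}
print(compose(r, s), semijoin_left(r, s))
```

The last two lines show why the second example is stated with ≡π1 rather than ≡path: the composition returns (1, 3) while the semi-join returns (1, 2).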
The main result
[Diagram: the operators +, id, ∪, ◦, ⌢, di, ∩, −, π, π̄ on one side, linked by ≡π2, ≡π1, and ≡path to the operators ⋉, ⋊, fp, id, ∪, ⌢, di, ∩, −, π, π̄ (FO[2] + fixpoint) on the other]
◮ Collapse also holds for fragments (that include π)
◮ Example: Nested RPQs are projection-equivalent to expressions using only id, ∪, ⋉, ⋊, fp, ⌢, and π
11/19
Intersection ∩ and difference −
Issues when combining composition with ∩ or −, for example: (FriendOf ◦ FriendOf) ∩ FriendOf
◮ Restricting: use ∩ and − only on composition-free expressions
  ◮ Exact syntactic fragment of FO[3] + TC that is projection-equivalent to FO[2] + fixpoint.
◮ Data models: usage of ∩ and − is sometimes redundant
  ◮ Sibling-ordered trees: FO_tree ⪯π FO[2] + fixpoints.
  ◮ Downward queries on trees [DBPL 2015]
  ◮ ...
◮ Partial rewriting: keep compositions when necessary
12/19
The rewrite functions - partial rewriting
τ(e) ≡path e
τ_π1(e) ≡π1 e
τ_π2(e) ≡π2 e
τ_◦1(e; ε) ≡π1 e ⋉ ε
τ_◦2(e; ε) ≡π2 ε ⋊ e
Example
e = π1[((WorksOn ◦ WorksOn⌢) ∩ FriendOf) ◦ EditorOf] ◦ StudentOf
(abbreviating W = WorksOn, F = FriendOf, E = EditorOf, S = StudentOf)
τ(e) = τ_π2(π1[((W ◦ W⌢) ∩ F) ◦ E]) ⋊ τ(S)
     = π1[τ_π1(((W ◦ W⌢) ∩ F) ◦ E)] ⋊ S
     = π1[τ_◦1((W ◦ W⌢) ∩ F; τ_π1(E))] ⋊ S
     = π1[(τ(W ◦ W⌢) ∩ τ(F)) ⋉ E] ⋊ S
     = π1[((τ(W) ◦ τ(W⌢)) ∩ F) ⋉ E] ⋊ S
     = π1[((W ◦ W⌢) ∩ F) ⋉ E] ⋊ S.
13/19
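As a sanity check, the original expression and the final rewritten form of the example can be evaluated side by side. The sketch assumes the usual semi-join semantics (the slides do not define ⋉ and ⋊ formally) and uses hypothetical helper names and made-up sample data. Agreement on samples illustrates the claimed path-equivalence but does not prove it.

```python
import random


def compose(r, s):
    """r ◦ s: pairs (a, c) such that (a, b) is in r and (b, c) is in s."""
    by_source = {}
    for b, c in s:
        by_source.setdefault(b, set()).add(c)
    return {(a, c) for a, b in r for c in by_source.get(b, ())}


def converse(r):
    """r⌢: every pair reversed."""
    return {(b, a) for a, b in r}


def project_left(r):
    """π1[r]: identity pairs (a, a) for every a that starts an r-pair."""
    return {(a, a) for a, _ in r}


def semijoin_left(r, s):
    """r ⋉ s (assumed semantics): pairs (a, b) of r such that b starts some s-pair."""
    sources = {a for a, _ in s}
    return {(a, b) for a, b in r if b in sources}


def semijoin_right(r, s):
    """r ⋊ s (assumed semantics): pairs (a, b) of s such that a ends some r-pair."""
    targets = {b for _, b in r}
    return {(a, b) for a, b in s if a in targets}


def original(w, f, e, s):
    # π1[((W ◦ W⌢) ∩ F) ◦ E] ◦ S
    return compose(project_left(compose(compose(w, converse(w)) & f, e)), s)


def rewritten(w, f, e, s):
    # π1[((W ◦ W⌢) ∩ F) ⋉ E] ⋊ S; the inner composition W ◦ W⌢ is kept
    return semijoin_right(
        project_left(semijoin_left(compose(w, converse(w)) & f, e)), s)


# Hand-made sample (hypothetical data).
sample = ({("a", "p"), ("b", "p")}, {("a", "b")}, {("b", "book")}, {("a", "prof")})
print(original(*sample), rewritten(*sample))

# Random spot check.
rng = random.Random(1)
nodes = list(range(6))


def random_relation():
    return {(rng.choice(nodes), rng.choice(nodes)) for _ in range(10)}


agree = all(
    original(*rels) == rewritten(*rels)
    for rels in (tuple(random_relation() for _ in range(4)) for _ in range(200))
)
print(agree)
```

Only the compositions that feed into a projection or a projected operand are replaced; the composition under ∩ stays, which is the "partial" part of partial rewriting.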
Query optimization
◮ Cost of each operator ✓
◮ Input size of each operator ✓
  Example: let R = {(1, i) | 0 ≤ i ≤ m}. Consider R ◦ R⌢ ≡π1 R ⋉ R⌢.
  Solution: use single-column evaluation algorithms
◮ Number of necessary evaluation steps ✗
14/19
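The example can be made concrete by comparing result sizes. In the sketch below, the semi-join semantics and the reading of "single-column evaluation" (compute only the left column instead of the pairs) are assumptions, and the function names are hypothetical.

```python
def compose(r, s):
    """r ◦ s: pairs (a, c) such that (a, b) is in r and (b, c) is in s."""
    by_source = {}
    for b, c in s:
        by_source.setdefault(b, set()).add(c)
    return {(a, c) for a, b in r for c in by_source.get(b, ())}


def converse(r):
    """r⌢: every pair reversed."""
    return {(b, a) for a, b in r}


def semijoin_left(r, s):
    """r ⋉ s (assumed semantics): pairs (a, b) of r such that b starts some s-pair."""
    sources = {a for a, _ in s}
    return {(a, b) for a, b in r if b in sources}


def semijoin_left_column(r, s):
    """Left column of r ⋉ s, computed without materializing the pairs."""
    sources = {a for a, _ in s}
    return {a for a, b in r if b in sources}


m = 1000
r = {(1, i) for i in range(m + 1)}

composed = compose(r, converse(r))            # a single pair: (1, 1)
semi = semijoin_left(r, converse(r))          # all of R: m + 1 pairs
column = semijoin_left_column(r, converse(r)) # a single node: 1

print(len(composed), len(semi), len(column))
```

Here the semi-join result is larger than the composition it replaces, so a later operator would receive a bigger input; keeping only the needed column avoids that.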
Expressions and evaluation steps
Expression size: we denote the expression size of e by ‖e‖.
Evaluation size: we denote the evaluation size of e by eval-steps(e).
Example
e1 = ((R ◦ R) ◦ (R ◦ R)) ◦ ((R ◦ R) ◦ (R ◦ R))
e2 = R ⋉ (R ⋉ (R ⋉ (R ⋉ (R ⋉ (R ⋉ (R ⋉ R))))))
◮ e1 ≡π1 e2
◮ We have ‖e1‖ = 7 and eval-steps(e1) = 3:
  1. X = R ◦ R
  2. Y = X ◦ X
  3. Result = Y ◦ Y
◮ We have ‖e2‖ = 7 and eval-steps(e2) = 7.
15/19
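The two measures can be reproduced with a small sketch. It assumes that ‖e‖ counts operator occurrences and that eval-steps(e) counts distinct operator subexpressions, each evaluated once and reused. That reading matches the numbers in the example but is not stated on the slide, and the encoding of expressions as nested tuples is hypothetical.

```python
def comp(a, b):
    return ("◦", a, b)


def semi(a, b):
    return ("⋉", a, b)


def size(e):
    """Number of operator occurrences in e (assumed meaning of the expression size)."""
    if isinstance(e, str):  # an edge label such as "R"
        return 0
    return 1 + size(e[1]) + size(e[2])


def eval_steps(e):
    """Number of distinct operator subexpressions (assumed meaning of eval-steps)."""
    seen = set()

    def visit(node):
        if isinstance(node, str) or node in seen:
            return
        seen.add(node)
        visit(node[1])
        visit(node[2])

    visit(e)
    return len(seen)


rr = comp("R", "R")
e1 = comp(comp(rr, rr), comp(rr, rr))

e2 = "R"
for _ in range(7):
    e2 = semi("R", e2)

print(size(e1), eval_steps(e1))
print(size(e2), eval_steps(e2))
```

The balanced composition tree shares its subexpressions, while the right-nested semi-join chain has no repeated subexpression to share, which is why the rewritten form needs more evaluation steps despite having the same size.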