The Tractability Frontier of Well-designed SPARQL Queries



  1. The Tractability Frontier of Well-designed SPARQL Queries. Miguel Romero (University of Oxford). ACM PODS 2018, 12 June, Houston, USA

  2. Well-designed SPARQL. SPARQL: the standard query language for RDF graphs. Well-designed SPARQL (Perez, Arenas, Gutierrez 2006) • Evaluation is coNP-complete (PSPACE-complete for general SPARQL). This work: • Well-designed SPARQL restricted to AND, OPTIONAL, UNION

  3. Tractable evaluation. Evaluating well-designed SPARQL becomes tractable for some classes of queries • Most general known condition: local tractability (Letelier, Perez, Pichler, Skritek 2013; Barceló, Pichler, Skritek 2015). Main question: Which classes of well-designed SPARQL queries can be evaluated in polynomial time? Our contribution: The tractable classes are precisely those of bounded domination width.

  4. Well-designed Pattern Trees/Forests (Letelier, Perez, Pichler, Skritek 2013). Well-designed SPARQL queries with AND, OPTIONAL = well-designed pattern trees. Well-designed SPARQL queries with AND, OPTIONAL, UNION = well-designed pattern forests. In this talk: we focus on (well-designed) pattern forests.

  5. Basics of RDF graphs and pattern trees/forests

  6. RDF Graphs. Fix: a set of identifiers I and a set of variables V. RDF graph = finite set of triples from I × I × I. A triple (s, p, o) is drawn as an edge from s to o labelled p.

  7. Conjunctive Queries (CQs) over RDF graphs. Fix: a set of identifiers I and a set of variables V. Conjunctive query (CQ) = AND of triples from (I ∪ V) × (I ∪ V) × (I ∪ V), plus a tuple of free variables. Example: q(?y, ?z) = (?x, p, o) AND (?y, ?x, a) AND (o, ?z, ?y) AND (p, ?w, ?w). Answer of a CQ q(X) over an RDF graph G: q(G) = {h|_X : h is a homomorphism from q to G} • Full CQ = all variables are free (no projection)
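The definition of q(G) can be made concrete with a small backtracking evaluator. This is only a sketch under assumptions that are not in the slides: an RDF graph is a Python set of 3-tuples of strings, a CQ is a list of triple patterns, variables are the strings starting with `?`, and the names `is_var`, `homomorphisms` and `answers` are made up for illustration. It takes exponential time in general and is not the tractable procedure discussed later in the talk.

```python
def is_var(term):
    """Variables are written with a leading '?', as in the slides."""
    return term.startswith("?")


def homomorphisms(cq, graph, partial=None):
    """Yield every variable assignment that maps all triples of cq into graph."""
    partial = dict(partial or {})
    if not cq:
        yield partial
        return
    first, rest = cq[0], cq[1:]
    for triple in graph:
        h = dict(partial)
        # A variable must keep the value it already has; a constant must match exactly.
        if all((h.setdefault(t, v) == v) if is_var(t) else (t == v)
               for t, v in zip(first, triple)):
            yield from homomorphisms(rest, graph, h)


def answers(cq, free_vars, graph):
    """q(G): restrict each homomorphism to the free variables."""
    return {tuple(h[x] for x in free_vars) for h in homomorphisms(cq, graph)}


# The CQ from the slide, evaluated over a small made-up graph.
q = [("?x", "p", "o"), ("?y", "?x", "a"), ("o", "?z", "?y"), ("p", "?w", "?w")]
G = {("s", "p", "o"), ("b", "s", "a"), ("o", "c", "b"), ("p", "d", "d")}
print(answers(q, ["?y", "?z"], G))  # {('b', 'c')}
```

Here the only homomorphism sends ?x to s, ?y to b, ?z to c and ?w to d, and projecting onto (?y, ?z) gives the single answer.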

  8. Well-designed Pattern Trees. Well-designed pattern tree = (T, pat), where T is a rooted tree and pat is a function mapping each node of T to a full CQ such that • for each variable ?x, the set {t in T | ?x in pat(t)} is connected in T
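The connectedness condition is easy to check mechanically. The sketch below assumes a hypothetical representation (not from the slides): `parent` maps each node to its parent, with `None` for the root, and `pat` maps each node to a list of triple patterns. It relies on the fact that a set of nodes of a rooted tree is connected exactly when it has a single topmost node, that is, a single node whose parent lies outside the set.

```python
def variables(cq):
    """All variables (terms starting with '?') occurring in a list of triples."""
    return {t for triple in cq for t in triple if t.startswith("?")}


def is_well_designed(parent, pat):
    """Check that, for each variable, the nodes mentioning it are connected in T."""
    all_vars = set().union(*(variables(cq) for cq in pat.values()))
    for var in all_vars:
        nodes = {t for t, cq in pat.items() if var in variables(cq)}
        # Each connected component of `nodes` has exactly one topmost node.
        tops = [t for t in nodes if parent[t] not in nodes]
        if len(tops) != 1:
            return False
    return True


# Hypothetical three-node tree: root r with children a and b.
parent = {"r": None, "a": "r", "b": "r"}
good = {"r": [("?x", "p", "?y")], "a": [("?y", "q", "?z")], "b": [("?x", "q", "?w")]}
bad = {"r": [("?x", "p", "?y")], "a": [("?y", "q", "?z")], "b": [("?z", "q", "?x")]}
print(is_well_designed(parent, good), is_well_designed(parent, bad))  # True False
```

In `bad`, the variable ?z occurs in the two siblings a and b but not in their parent r, so its occurrences are not connected.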

  9. Well-designed Pattern Trees: semantics. Let P = (T, pat) and let G be an RDF graph. Subtree T’ of P = subtree of T containing the root. pat(T’) = AND of all the CQs in {pat(t) | t in T’}

  10. Well-designed Pattern Trees: semantics. Let P = (T, pat) and let G be an RDF graph. Subtree T’ of P = subtree of T containing the root. pat(T’) = AND of all the CQs in {pat(t) | t in T’}. Child of T’ = node not in T’ whose parent is in T’

  11. Well-designed Pattern Trees: semantics. Let P = (T, pat) and let G be an RDF graph. A mapping h is in P(G) iff there is a subtree T’ such that • h is a homomorphism from pat(T’) to G • for each child t of T’, h cannot be extended to a homomorphism from pat(T’) AND pat(t) to G
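The membership condition can be tested directly. The sketch below is a naive check, not the pebble-game procedure from the talk, and it rests on assumptions of mine: `children` maps a node to its list of children, `pat` maps nodes to lists of triple patterns, the graph is a set of 3-tuples, and all function names are hypothetical. It first grows the largest subtree T’ on which h is already a homomorphism (any witnessing subtree must coincide with it, since otherwise h would trivially extend to one of its children), then checks that T’ uses exactly the variables in dom(h) and that h extends to no child of T’. The extension test is plain backtracking and may take exponential time.

```python
def is_var(term):
    return term.startswith("?")


def extensions(cq, graph, partial):
    """Yield homomorphisms from cq to graph that agree with partial."""
    if not cq:
        yield dict(partial)
        return
    first, rest = cq[0], cq[1:]
    for triple in graph:
        h = dict(partial)
        if all((h.setdefault(t, v) == v) if is_var(t) else (t == v)
               for t, v in zip(first, triple)):
            yield from extensions(rest, graph, h)


def satisfied(cq, graph, h):
    """True if h is defined on every variable of cq and maps each triple into graph."""
    for triple in cq:
        if any(is_var(t) and t not in h for t in triple):
            return False
        if tuple(h[t] if is_var(t) else t for t in triple) not in graph:
            return False
    return True


def in_answer(children, pat, root, graph, h):
    """Decide whether h is in P(G) for the pattern tree P given by children/pat/root."""
    if not satisfied(pat[root], graph, h):
        return False
    subtree, frontier, stack = {root}, [], [root]
    while stack:  # grow the maximal subtree on which h is a homomorphism
        node = stack.pop()
        for child in children.get(node, []):
            if satisfied(pat[child], graph, h):
                subtree.add(child)
                stack.append(child)
            else:
                frontier.append(child)  # a child of the subtree
    covered = {t for n in subtree for triple in pat[n] for t in triple if is_var(t)}
    if covered != set(h):
        return False
    # h must not extend to any child of the subtree.
    return not any(next(extensions(pat[c], graph, h), None) is not None
                   for c in frontier)


# Made-up example: the e-mail address is optional.
children = {"r": ["t"]}
pat = {"r": [("?x", "knows", "?y")], "t": [("?y", "email", "?e")]}
G = {("a", "knows", "b"), ("c", "knows", "d"), ("b", "email", "m")}
print(in_answer(children, pat, "r", G, {"?x": "c", "?y": "d"}))  # True
print(in_answer(children, pat, "r", G, {"?x": "a", "?y": "b"}))  # False
```

The second mapping is rejected because it can be extended with ?e = m, so only the extended mapping counts as an answer.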

  12. Well-designed Pattern Forests. Well-designed pattern forest = union of well-designed pattern trees. Answer of F = {P_1, …, P_m} over an RDF graph G: F(G) = P_1(G) ∪ … ∪ P_m(G)

  13. The Evaluation Problem. Let C be a class of well-designed pattern forests. EVAL(C). Instance: a well-designed pattern forest F in C, an RDF graph G, a mapping h. Question: does h belong to F(G)?

  14. Domination width and main theorem

  15. Main Theorem. Theorem: Assume FPT ≠ W[1]. Let C be a recursively enumerable class of well-designed pattern forests. Then the following are equivalent: • EVAL(C) can be solved in polynomial time • C has bounded domination width. Proof based on the corresponding characterisation for conjunctive queries (Dalmau, Kolaitis, Vardi 2002; Grohe 2003). Treewidth of a CQ = measure of tree-likeness. ctw(q(X)) := treewidth of the core of q(X)

  16. The case of Conjunctive Queries. Theorem (Dalmau, Kolaitis, Vardi 2002; Grohe 2003): Assume FPT ≠ W[1]. Let C be a recursively enumerable class of conjunctive queries of bounded arity. Then the following are equivalent: • CQ-EVAL(C) can be solved in polynomial time • C has bounded ctw. Tractability part via the existential k-pebble game (Kolaitis, Vardi 1995): a relaxation for checking the existence of homomorphisms (complete, but not sound in general) • Always correct for conjunctive queries q with ctw(q) < k • Existence of a winning strategy for the Duplicator can be decided in polynomial time for fixed k. Hardness part via a reduction from the clique problem (W[1]-hardness)

  17. The case of Conjunctive Queries. Theorem (Dalmau, Kolaitis, Vardi 2002; Grohe 2003): Assume FPT ≠ W[1]. Let C be a recursively enumerable class of conjunctive queries of bounded arity. Then the following are equivalent: • CQ-EVAL(C) can be solved in polynomial time • C has bounded ctw. Can be extended to unions of CQs (UCQs) Q(X) = {q_1(X), …, q_m(X)}: ctw(Q(X)) = minimum k such that for every q_i(X), there is q_j(X) such that ctw(q_j(X)) is at most k and q_j(X) can be mapped to q_i(X) via a homomorphism

  18. Domination width. Given P = (T, pat), G and h: is h in P(G)? First, is h a “potential solution”? The subtree T’ can be computed in polynomial time.

  19. Domination width. Given P = (T, pat), G and h: is h in P(G)? Let T’ be the subtree for h, X := vars(T’), and let t_1, …, t_n be the children of T’. CQ q_{t_i}(X) := (pat(T’) AND pat(t_i))(X). UCQ Q_{T’}(X) := {q_{t_1}(X), …, q_{t_n}(X)}. h is not in P(G) iff h is in Q_{T’}(G)

  20. Domination width. Given P = (T, pat), G and h: is h in P(G)? Let T’ be the subtree for h, X := vars(T’), and let t_1, …, t_n be the children of T’. CQ q_{t_i}(X) := (pat(T’) AND pat(t_i))(X). UCQ Q_{T’}(X) := {q_{t_1}(X), …, q_{t_n}(X)}. h is not in P(G) iff h is in Q_{T’}(G). dw(P) := maximum ctw(Q_{T’}(X)), over all subtrees T’

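  21. Domination width. Let P = (T, pat) with dw(P) < k. CQ q_{t_i}(X) := (pat(T’) AND pat(t_i))(X). For every subtree T’ and every child t_i of T’, there is a child t_j of T’ such that q_{t_j}(X) can be mapped to q_{t_i}(X) via a homomorphism and ctw(q_{t_j}(X)) < k. dw(P) := maximum ctw(Q_{T’}(X)), over all subtrees T’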

  22. Domination width. Let P = (T, pat) with dw(P) < k. Is h in P(G)? Compute the subtree T’ for h and, for each child t_i of T’, run the existential k-pebble game on the CQ q_{t_i}(X) := (pat(T’) AND pat(t_i))(X). dw(P) := maximum ctw(Q_{T’}(X)), over all subtrees T’

  23. Domination width. Given F = {P_1, …, P_m}, G and h: is h in F(G)? Several subtrees T’, T’’ (from different trees of F) may match h; they are combined as T’ AND T’’, renaming new variables.

  24. Domination width. Given F = {P_1, …, P_m}, G and h: is h in F(G)? X := vars(T’) = vars(T’’) = dom(h). h is not in F(G) iff h is in Q_{T’,T’’}(G), where Q_{T’,T’’}(X) := {pat(T’) AND pat(T’’) + choice of children} (and renaming). dw(F) = maximum ctw(Q_S(X)), over all sets S of subtrees over the same set of variables X and satisfying a certain closure property

  25. Main Theorem. Theorem: Assume FPT ≠ W[1]. Let C be a recursively enumerable class of well-designed pattern forests. Then the following are equivalent: • EVAL(C) can be solved in polynomial time • C has bounded domination width. Tractability part: application of the existential k-pebble game, as for the case of conjunctive queries (Dalmau, Kolaitis, Vardi 2002). Hardness part: reduction from clique (Grohe 2003) + some basic properties of pattern forests with large dw

  26. The case of UNION-free queries (pattern trees)

  27. Branch Treewidth. Let P = (T, pat) with root r. Branch B_t of a node t (figure: the root r, a node t with its pattern pat(t), and the branch B_t of t)

  28. Branch Treewidth. Let P = (T, pat) with root r. Branch B_t of a node t, X := vars(B_t). CQ b_t(X) := (pat(B_t) AND pat(t))(X). bw(P) := maximum ctw(b_t(X)) over all nodes t of T. Proposition: For every well-designed pattern tree P, we have dw(P) = bw(P)
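Building the CQ b_t(X) is again a syntactic step, sketched below. The slides only show the branch in a figure, so this sketch assumes that B_t consists of the nodes on the path from the root r to the parent of t; that reading, the `parent`/`pat` dictionaries and the function names are assumptions of this example. As before, the ctw computation itself is left out.

```python
def branch(parent, t):
    """Nodes on the path from the root down to the parent of t (assumed B_t)."""
    nodes = []
    node = parent[t]
    while node is not None:
        nodes.append(node)
        node = parent[node]
    return list(reversed(nodes))


def branch_cq(parent, pat, t):
    """Return (X, b_t) with X = vars(B_t) and b_t = pat(B_t) AND pat(t)."""
    base = [triple for n in branch(parent, t) for triple in pat[n]]
    free = sorted({x for triple in base for x in triple if x.startswith("?")})
    return free, base + pat[t]


# Hypothetical path-shaped tree r -> a -> c.
parent = {"r": None, "a": "r", "c": "a"}
pat = {
    "r": [("?x", "p", "?y")],
    "a": [("?y", "q", "?z")],
    "c": [("?z", "s", "?u")],
}
print(branch(parent, "c"))          # ['r', 'a']
print(branch_cq(parent, pat, "c"))
```

Compared with the UCQs used for dw, only one CQ per non-root node has to be inspected, which is what makes the UNION-free case simpler to state.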

  29. Final Remarks. Characterisation of tractable classes of pattern forests (well-designed SPARQL restricted to AND, OPTIONAL, UNION) • Dichotomy: a class C is either tractable or W[1]-hard. The {AND, OPTIONAL, UNION} fragment is maximal with this property: • Dichotomy fails when we add FILTER (CQs with inequalities) and SELECT (Kroll, Pichler, Skritek 2016). Open problem: Characterise fixed-parameter tractable classes (running time f(|q|) · |G|^c) of queries with SELECT (recent characterisation for simple queries: Mengel, Skritek 2018). Thank you!
