Probabilistic Query Evaluation on Bounded-Treewidth Instances

SIGMOD/PODS Ph.D. Symposium, June 26, 2016, San Francisco
Mikaël Monet
Supervised by Pierre Senellart

Context
• Boolean queries (yes/no) on relational instances
• We want the answer to contain more information than just "yes/no":
  • Add uncertainty
  • Obtain provenance information
• We need restrictions for all of this to be tractable

A probabilistic database
Relation   Fact         Probability
R          (a, d)       0.2
R          (f, c)       0.9
R          (a, e)       0.7
R          (c, e)       0.23
S          (d, e)       0.005
S          (f, e)       0.7
S          (d, a)       0.13
S          (b, e)       0.81
Q          (c, e, f)    0.66
TID (tuple-independent database) model

Probability of a possible world
A possible world I: keep the facts S(f, e), S(d, a), R(c, e) and Q(c, e, f), and discard R(a, d), S(b, e), S(d, e), R(f, c) and R(a, e).
Probability Pr(I) of this possible world = 0.7 * 0.13 * 0.23 * 0.66 * (1 - 0.2) * (1 - 0.81) * (1 - 0.005) * (1 - 0.9) * (1 - 0.7)
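The product above can be checked numerically. The following is a minimal sketch, assuming the instance is stored as a dictionary from facts to probabilities; the names `tid`, `world` and `world_probability` are illustrative and do not come from the slides, and the assignment of rows to R and S follows one reading of the flattened table.

```python
from math import prod

# TID instance: each fact is kept independently with the given probability.
# (Assumed layout: relation name followed by the tuple's values.)
tid = {
    ("R", "a", "d"): 0.2,
    ("R", "f", "c"): 0.9,
    ("R", "a", "e"): 0.7,
    ("R", "c", "e"): 0.23,
    ("S", "d", "e"): 0.005,
    ("S", "f", "e"): 0.7,
    ("S", "d", "a"): 0.13,
    ("S", "b", "e"): 0.81,
    ("Q", "c", "e", "f"): 0.66,
}

# The possible world I: the set of facts that are kept.
world = {("S", "f", "e"), ("S", "d", "a"), ("R", "c", "e"), ("Q", "c", "e", "f")}


def world_probability(tid, world):
    """Multiply p for every kept fact and (1 - p) for every discarded fact."""
    return prod(p if fact in world else 1.0 - p for fact, p in tid.items())


pr_world = world_probability(tid, world)
print(f"Pr(I) = {pr_world:.3e}")
```

With these numbers the product is roughly 6.27e-5, which illustrates how small the probability of any single world becomes even on a nine-fact instance.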

Probabilistic query evaluation (PQE)
• Focus on Boolean queries (yes/no)
• Probability of a query Q on a probabilistic instance U, summing over the possible worlds J of U that satisfy Q:
  P(Q) = \sum_{J \subseteq U,\ J \models Q} \Pr(J)
• Problem: in general #P-hard
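The definition can be turned directly into a brute-force procedure that enumerates every possible world. This is only a sketch under assumptions: the three-fact instance and the query `q` below are hypothetical (they are not the example from the slides), and a Boolean query is modelled as a Python function from a set of facts to a truth value.

```python
from itertools import product


def pqe_brute_force(tid, query):
    """Sum Pr(J) over all possible worlds J of the TID instance that satisfy the query."""
    facts = list(tid)
    total = 0.0
    for keep in product([False, True], repeat=len(facts)):
        world = {f for f, kept in zip(facts, keep) if kept}
        pr = 1.0
        for f, kept in zip(facts, keep):
            pr *= tid[f] if kept else 1.0 - tid[f]
        if query(world):
            total += pr
    return total


# Hypothetical instance: three independent facts, each with probability 0.5.
tid = {("R", "a"): 0.5, ("S", "a", "b"): 0.5, ("T", "b"): 0.5}


def q(world):
    """Boolean CQ: there exist x, y with R(x), S(x, y) and T(y) all in the world."""
    return any(
        ("R", f[1]) in world and ("T", f[2]) in world
        for f in world
        if f[0] == "S"
    )


p = pqe_brute_force(tid, q)
print(p)
```

The loop visits 2^n worlds for n facts, so this is usable only on tiny instances; that exponential blow-up is what the three directions below try to avoid.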

3 possible directions
• Approximate
• Restrict queries
• Restrict instances

1) Approximate probability computation
• Monte-Carlo sampling
• Drawback: running time is quadratic in the desired precision
  ⇒ Not adequate for low probabilities
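A minimal sketch of Monte-Carlo sampling on a TID instance is given below. The instance, the query `q`, the function name and the fixed seed are all assumptions made for illustration. By Hoeffding's inequality, an additive error of ε requires on the order of 1/ε² samples, which is the quadratic cost mentioned above; when the true probability is itself tiny, ε must be even smaller for the estimate to be meaningful.

```python
import random


def pqe_monte_carlo(tid, query, samples, seed=0):
    """Estimate P(Q) as the fraction of sampled worlds that satisfy the query."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        # Sample a world: keep each fact independently with its probability.
        world = {f for f, p in tid.items() if rng.random() < p}
        if query(world):
            hits += 1
    return hits / samples


# Hypothetical instance and query (same shape as R(x), S(x, y), T(y)).
tid = {("R", "a"): 0.5, ("S", "a", "b"): 0.5, ("T", "b"): 0.5}


def q(world):
    """Boolean CQ: there exist x, y with R(x), S(x, y) and T(y) all in the world."""
    return any(
        ("R", f[1]) in world and ("T", f[2]) in world
        for f in world
        if f[0] == "S"
    )


estimate = pqe_monte_carlo(tid, q, samples=20000)
print(estimate)  # close to the exact value 0.125, up to sampling error
```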

2) Restricting the class of queries
• [Dalvi and Suciu 2012] shows the following dichotomy for any UCQ Q:
  • Either PQE for Q is in PTIME on all instances
  • Or PQE for Q is #P-hard over the class of all instances
• The simple conjunctive query ∃x,y R(x) ∧ S(x,y) ∧ T(y) is already #P-hard!
• Criterion is too crisp

3) Restricting the shape of the instances
• Bound the treewidth of instances by a constant
• Treewidth: a measure of how far a graph is from being a tree
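To make the notion concrete, the sketch below checks whether a candidate tree decomposition of a graph is valid and reports its width (largest bag size minus one); the treewidth is the smallest width over all valid decompositions. The function names and the dictionary-of-bags encoding are assumptions for illustration, the code takes for granted that `tree_edges` really forms a tree, and vertices are read off the edge list, so isolated vertices are ignored.

```python
def is_tree_decomposition(edges, bags, tree_edges):
    """Check the three tree-decomposition conditions (assumes tree_edges is a tree)."""
    vertices = {v for edge in edges for v in edge}
    # 1. Every vertex appears in at least one bag.
    if any(all(v not in bag for bag in bags.values()) for v in vertices):
        return False
    # 2. Both endpoints of every edge appear together in some bag.
    if any(all(not {u, v} <= bag for bag in bags.values()) for u, v in edges):
        return False
    # 3. For each vertex, the bags containing it form a connected subtree.
    for v in vertices:
        nodes = {n for n, bag in bags.items() if v in bag}
        seen, stack = set(), [next(iter(nodes))]
        while stack:
            n = stack.pop()
            if n in seen:
                continue
            seen.add(n)
            for x, y in tree_edges:
                if x == n and y in nodes:
                    stack.append(y)
                elif y == n and x in nodes:
                    stack.append(x)
        if seen != nodes:
            return False
    return True


def width(bags):
    """Width of a decomposition: size of the largest bag, minus one."""
    return max(len(bag) for bag in bags.values()) - 1


# A 4-cycle a-b-c-d-a is not a tree but is close to one: it has treewidth 2.
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")]
bags = {0: {"a", "b", "c"}, 1: {"a", "c", "d"}}
tree_edges = [(0, 1)]

print(is_tree_decomposition(edges, bags, tree_edges), width(bags))
```

For a relational instance, the graph in question is usually taken to be the one connecting elements that occur together in a fact.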

Treewidth
(Figure: the example instance with relations R, S and Q)
• Divide and conquer!

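(Diagram: pipeline from instance and query to probability)
• Instance I of treewidth k → tree decomposition T, in O(EXP(k)·|I|)
• Query q (MSO) and integer k → automaton A, in O(f(q, k))
• Tree decomposition T and automaton A → provenance circuit C, in O(|A|·|T|)
• Provenance circuit C → probability
• Two parts of the diagram are marked with "?"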

Provenance circuit C of query Q on instance I
• Boolean circuit (AND, OR, NOT gates)
• Inputs = the facts of I
• For every valuation ν : I → {true, false}, ν(I) ⊨ Q iff ν(C) = 1
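The defining property can be checked exhaustively on a small case. The sketch below hand-builds a circuit for the query ∃x,y R(x) ∧ S(x,y) ∧ T(y) on a hypothetical five-fact instance and compares it with direct query evaluation under every valuation. The gate encoding (a dictionary from gate names to `("input", fact)`, `("not", gate)`, `("and", gates)` or `("or", gates)`) is an assumption made for illustration, not the construction used in the talk.

```python
from itertools import product

# Hypothetical instance I: the input gates of the circuit are its facts.
facts = [("R", "a"), ("S", "a", "b"), ("T", "b"), ("S", "a", "c"), ("T", "c")]

# Hand-built circuit: R(a) AND ((S(a,b) AND T(b)) OR (S(a,c) AND T(c))).
circuit = {
    "r": ("input", ("R", "a")),
    "sb": ("input", ("S", "a", "b")),
    "tb": ("input", ("T", "b")),
    "sc": ("input", ("S", "a", "c")),
    "tc": ("input", ("T", "c")),
    "via_b": ("and", ["sb", "tb"]),
    "via_c": ("and", ["sc", "tc"]),
    "path": ("or", ["via_b", "via_c"]),
    "out": ("and", ["r", "path"]),
}


def evaluate(circuit, gate, valuation):
    """Evaluate a gate under a valuation mapping each fact to True or False."""
    kind, arg = circuit[gate]
    if kind == "input":
        return valuation[arg]
    if kind == "not":
        return not evaluate(circuit, arg, valuation)
    values = [evaluate(circuit, g, valuation) for g in arg]
    return all(values) if kind == "and" else any(values)


def q(world):
    """Direct evaluation of the query on the set of facts that are kept."""
    return any(
        ("R", f[1]) in world and ("T", f[2]) in world
        for f in world
        if f[0] == "S"
    )


# Check the provenance property for all 2^5 valuations.
agrees = all(
    evaluate(circuit, "out", dict(zip(facts, bits)))
    == q({f for f, kept in zip(facts, bits) if kept})
    for bits in product([False, True], repeat=len(facts))
)
print(agrees)
```

Here ν(I) is read as the sub-instance containing exactly the facts that ν maps to true.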

Tree automata
• A bottom-up deterministic tree automaton on {a, b}-trees is a tuple A = (Q, F, 𝛋, 𝛆) where:
  • Q: finite set of states
  • F ⊆ Q: accepting states
  • 𝛋 : {a, b} → Q, determining the state for the leaves
  • 𝛆 : {a, b} × Q² → Q, determining the state for internal nodes

Run of an automaton on a tree
(Figure: an example automaton and its run on a tree; the states were drawn as distinct symbols that are not recoverable here)
• Q: three states
• F: one accepting state
• 𝛋: assigns a state to each of the leaf labels a and b
• 𝛆: given as a table with columns lab, q1, q2, out
• Steps of the run: initialization of the leaves, then internal nodes, and so on
• This tree is in the language of A
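The run described above (leaves first, then internal nodes, up to the root) can be sketched as follows. Since the automaton on the slide is not recoverable, the one below is a hypothetical stand-in with two states that tracks the parity of the number of b-labelled nodes; `iota` and `delta` play the roles of the leaf and internal-node functions, and trees are encoded as nested tuples, which is an assumption made for illustration.

```python
from itertools import product

# Hypothetical automaton: the state is the parity of the number of b-labelled nodes.
STATES = {0, 1}
ACCEPTING = {1}  # accept trees with an odd number of b's
iota = {"a": 0, "b": 1}  # state assigned to a leaf, by label
delta = {
    (label, q1, q2): (q1 + q2 + (label == "b")) % 2
    for label, q1, q2 in product("ab", STATES, STATES)
}  # state of an internal node, from its label and its two children's states


def run(tree):
    """Return the state reached at the root; a leaf is (label,), a node is (label, left, right)."""
    label, *children = tree
    if not children:
        return iota[label]  # initialization of the leaves
    left, right = children
    return delta[(label, run(left), run(right))]  # internal nodes, bottom-up


def accepts(tree):
    """A tree is in the language when the root state is accepting."""
    return run(tree) in ACCEPTING


tree = ("a", ("b",), ("a", ("b",), ("b",)))
print(accepts(tree))
```

Because the automaton is deterministic, each node gets exactly one state, so the run takes time linear in the size of the tree.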

Major drawbacks
• In general, computing the automaton has non-elementary complexity in the query
• Exponential dependence on the instance treewidth
