A Dichotomy for Homomorphism-Closed Queries on Probabilistic Graphs
Antoine Amarilli (Télécom Paris) and İsmail İlkan Ceylan (University of Oxford)
September 16, 2020
Uncertain data management

In this talk, we manage data represented as a labeled graph:

WorksAt
Antoine          Télécom Paris
Antoine          Paris Sud
Benny            Paris Sud
Benny            Technion
İsmail           U. Oxford

MemberOf
Télécom Paris    ParisTech
Télécom Paris    IP Paris
Paris Sud        IP Paris
Paris Sud        Paris-Saclay
Technion         CESAER

[Figure: the same facts drawn as a graph, one edge per fact]

→ Problem: we are not certain about the true state of the data
Uncertain data model

• Uncertain data model: TID, for tuple-independent database
• Each fact (edge) carries a probability
• Each fact exists with its given probability
• All facts are independent
• Possible world W: subset of facts
• Probability of W:

\[
\Pr(W) = \prod_{F \in W} \Pr(F) \times \prod_{F \notin W} \bigl(1 - \Pr(F)\bigr)
\]

[Figure: the example graph (A., B., İ., Télécom Paris, Paris Sud, Technion, U. Oxford, ParisTech, IP Paris, Paris-Saclay, CESAER) with a probability between 10% and 100% on each edge]
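As a minimal sketch of the probability of a possible world, the product over kept and dropped facts can be computed directly. The dictionary `tid`, the function `world_probability`, and the probability values are assumptions made for illustration; they are not read off the figure.

```python
from math import prod

# Hypothetical TID: each fact (label, source, target) maps to its probability.
# The values are illustrative and are not taken from the slide's figure.
tid = {
    ("WorksAt", "Antoine", "Télécom Paris"): 0.8,
    ("WorksAt", "Benny", "Technion"): 0.5,
    ("MemberOf", "Technion", "CESAER"): 1.0,
}


def world_probability(tid, world):
    """Pr(W): product of Pr(F) over kept facts times product of 1 - Pr(F) over dropped facts."""
    kept = prod(p for fact, p in tid.items() if fact in world)
    dropped = prod(1 - p for fact, p in tid.items() if fact not in world)
    return kept * dropped


# A possible world that keeps two facts and drops the 50% fact.
world = {
    ("WorksAt", "Antoine", "Télécom Paris"),
    ("MemberOf", "Technion", "CESAER"),
}
print(world_probability(tid, world))
```

A world that drops a fact of probability 100% gets probability 0, since the factor 1 − Pr(F) vanishes.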
Homomorphism-closed queries

• Query: maps a graph (without probabilities) to YES/NO
• Conjunctive query (CQ): can I find a match of a pattern? e.g., [figure: a pattern on nodes x, y, z]
• Union of conjunctive queries (UCQ): does one of the CQs match?
→ Homomorphism-closed query Q: if G satisfies Q and G has a homomorphism to G′ then G′ also satisfies Q

They generalize CQs and UCQs, but also regular path queries (RPQs), Datalog, etc.
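To make the closure property concrete, here is a brute-force sketch of CQ matching as a homomorphism search. It assumes unlabeled directed edges and takes the pattern to be a path x → y → z; the figure's exact pattern is not recoverable, so that choice, like the names `has_homomorphism`, `G`, and `G_prime`, is hypothetical.

```python
from itertools import product


def has_homomorphism(pattern, graph):
    """True if some map from pattern nodes to graph nodes sends every pattern edge to a graph edge."""
    p_nodes = sorted({n for edge in pattern for n in edge})
    g_nodes = sorted({n for edge in graph for n in edge})
    # Try every assignment of pattern nodes to graph nodes (exponential, for illustration only).
    for image in product(g_nodes, repeat=len(p_nodes)):
        h = dict(zip(p_nodes, image))
        if all((h[u], h[v]) in graph for u, v in pattern):
            return True
    return False


# Assumed pattern: a directed path x -> y -> z.
pattern = {("x", "y"), ("y", "z")}
G = {("a", "b"), ("b", "c")}
# G maps into a single self-loop by sending every node to "n".
G_prime = {("n", "n")}

print(has_homomorphism(pattern, G))        # the CQ matches G
print(has_homomorphism(G, G_prime))        # G has a homomorphism to G_prime
print(has_homomorphism(pattern, G_prime))  # so the CQ matches G_prime as well
```

The third check succeeds because a match of the pattern in G composes with the homomorphism from G to G′, which is why every CQ is closed under homomorphisms.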
Problem statement: Probabilistic query evaluation (PQE)

Here is the problem PQE(Q):

• We fix a query Q: [figure: a pattern on nodes x, y, z]
• The input is a TID D: [figure: the example graph with a probability on each edge]
• The output is the probability that the query is true

→ Question: What is the complexity of PQE(Q) depending on the query Q?
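PQE(Q) can always be solved naively by summing the probabilities of the possible worlds that satisfy Q. The sketch below does this in time exponential in the number of facts, which is what makes the complexity question interesting. The query `path_of_length_two`, the function `pqe_brute_force`, and the small TID are assumptions for illustration, not the talk's example.

```python
from itertools import product


def path_of_length_two(edges):
    """Assumed example query: is there a directed path x -> y -> z?"""
    return any(v == u2 for (_, v) in edges for (u2, _) in edges)


def pqe_brute_force(tid, query):
    """Sum Pr(W) over every possible world W that satisfies the query."""
    facts = list(tid)
    total = 0.0
    # Each tuple of booleans selects which facts are kept in the world.
    for keep in product([False, True], repeat=len(facts)):
        world = {f for f, k in zip(facts, keep) if k}
        weight = 1.0
        for f, k in zip(facts, keep):
            weight *= tid[f] if k else 1 - tid[f]
        if query(world):
            total += weight
    return total


# Hypothetical TID on unlabeled directed edges.
tid = {("a", "b"): 0.5, ("b", "c"): 0.4, ("b", "d"): 0.9}
print(pqe_brute_force(tid, path_of_length_two))
```

On this input the query holds exactly when the edge a → b is present together with at least one of b → c and b → d, so the result can be checked by hand as 0.5 × (1 − 0.6 × 0.1).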
Results on PQE

Existing dichotomy on the unions of conjunctive queries (UCQs):

Theorem [Dalvi and Suciu, 2012]
• Some UCQs Q are safe and PQE(Q) is in PTIME
• All others are unsafe and PQE(Q) is #P-hard

We study PQE for homomorphism-closed queries and show:

Theorem [Amarilli and Ceylan, 2020]
For any query Q closed under homomorphisms:
• Either Q is equivalent to a safe UCQ and PQE(Q) is in PTIME
• In all other cases, PQE(Q) is #P-hard
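For intuition on the tractable side, consider the single-atom query "the graph has at least one edge", which is a safe CQ. This example is not from the slides, and the name `prob_some_edge` and the TID are hypothetical. Because facts are independent, its probability follows from one pass over the facts, with no enumeration of possible worlds.

```python
from math import prod


def prob_some_edge(tid):
    """Pr(at least one fact is present) = 1 - product of (1 - Pr(F)), by independence."""
    return 1 - prod(1 - p for p in tid.values())


# Hypothetical TID on unlabeled directed edges.
tid = {("a", "b"): 0.5, ("b", "c"): 0.4, ("b", "d"): 0.9}
print(prob_some_edge(tid))
```

This runs in time linear in the number of facts, in contrast with summing over all possible worlds.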