Weaker Forms of Monotonicity for Declarative Networking: a more fine-grained answer to the CALM-conjecture
Tom J Ameloot (1), Bas Ketsman (1), Frank Neven (1) and Daniel Zinn (2)
(1) Hasselt University, (2) LogicBlox, Inc

Overview
1. Introduction
2. CALM
3. CALM Revision 1
4. CALM Revision 2
5. Datalog
6. Conclusion

Introduction
◮ Declarative Networking: Datalog-based languages for parallel and distributed computing
◮ Cloud computing: a setting with asynchronous communication via messages, which can be arbitrarily delayed but not lost
◮ CALM-conjecture: No coordination = Monotonicity [Hellerstein, 2010] (CALM = Consistency And Logical Monotonicity)

Monotonicity
Definition: A query Q is monotone if Q(I) ⊆ Q(I ∪ J) for all database instances I and J.
Notation: M is the class of monotone queries.
Example:
◮ Q_∆: select triangles in a graph; Q_∆ ∈ M
◮ Q_<: select open triangles in a graph; Q_< ∉ M

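The two example queries can be made concrete with a minimal Python sketch. It assumes that a graph is a set of directed edge pairs and that an open triangle is a path x → y → z whose closing edge (z, x) is absent; the encoding and the function names are illustrative assumptions, not taken from the slides.

```python
def triangles(edges):
    """Q_triangle: all (x, y, z) with edges x->y, y->z and z->x."""
    return {(x, y, z)
            for (x, y) in edges
            for (y2, z) in edges
            if y2 == y and (z, x) in edges}

def open_triangles(edges):
    """Q_open (assumed reading): paths x->y->z whose closing edge z->x is absent."""
    return {(x, y, z)
            for (x, y) in edges
            for (y2, z) in edges
            if y2 == y and (z, x) not in edges}

I = {(1, 2), (2, 3)}
J = {(3, 1)}  # the edge that closes the path 1 -> 2 -> 3

# Monotone: adding facts never retracts a triangle.
triangles_preserved = triangles(I) <= triangles(I | J)

# Not monotone: the open triangle (1, 2, 3) disappears once (3, 1) arrives.
open_preserved = open_triangles(I) <= open_triangles(I | J)
```

On this instance `triangles_preserved` is true while `open_preserved` is false, so the pair (I, J) is a witness that the open-triangle query is not monotone.
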
CALM by Example
Q_∆: select all triangles; Q_∆ ∈ M
[Figure: input instance distributed over the nodes; each node writes to output]
Algorithm:
◮ broadcast all data
◮ periodically output local triangles
No coordination + eventually consistent

CALM by Example
Q_<: select all open triangles; Q_< ∉ M
[Figure: input instance]
Open triangle, or has the fact not yet arrived?
Requires global coordination

CALM-conjecture
No coordination = Monotonicity [Hellerstein, 2010]
◮ [Ameloot, Neven, Van den Bussche, 2011]: TRUE
  ◮ for a setting where nodes have no information about the distribution of facts
◮ [Zinn, Green, Ludäscher, 2012]: FALSE
  ◮ for settings where nodes have information about the distribution of facts
  ◮ TRUE when also refining monotonicity

Relational Transducer Networks [Ameloot, Neven, Van den Bussche, 2011]
◮ Network N = {x, y, u, z}
◮ Transducer Π
◮ Messages can be arbitrarily delayed but never get lost
Semantics are defined in terms of runs over a transition system.

Relational Transducer Networks [Ameloot, Neven, Van den Bussche, 2011]
Definition: A transducer Π computes a query Q if
◮ for all networks N (network independent),
◮ for all databases I,
◮ for all horizontal distributions H (data-distribution independent), and
◮ for every run of Π,
out(Π) = Q(I) (consistency requirement).

Coordination-free Algorithms
Q_∆: select all triangles
[Figure: input instance, with data-communication between the nodes]
Algorithm:
◮ broadcast all data
◮ output triangles whenever new data arrives

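The broadcast algorithm can be illustrated with a toy simulation. This is only a sketch under simplifying assumptions: it is not the transducer-network semantics of the cited work, the node names and the distribution `H` are made up, and message delay is modeled simply by delivering the broadcast messages in a random order.

```python
import random

def triangles(edges):
    """All (x, y, z) with edges x->y, y->z and z->x in the given edge set."""
    return {(x, y, z)
            for (x, y) in edges
            for (y2, z) in edges
            if y2 == y and (z, x) in edges}

def run(distribution, seed):
    """One simulated run: every node broadcasts its local facts, and the
    messages are delivered one by one in a random order (delayed, never lost)."""
    rng = random.Random(seed)
    known = {node: set(facts) for node, facts in distribution.items()}
    # One pending message per (fact, receiving node) pair.
    pending = [(fact, dst)
               for src, facts in distribution.items()
               for fact in sorted(facts)
               for dst in distribution if dst != src]
    rng.shuffle(pending)
    output = set()
    for facts in known.values():
        output |= triangles(facts)        # triangles already visible locally
    for fact, dst in pending:
        known[dst].add(fact)
        output |= triangles(known[dst])   # the output only ever grows
    return output

I = {(1, 2), (2, 3), (3, 1), (3, 4)}
H = {"x": {(1, 2), (3, 4)}, "y": {(2, 3)}, "z": {(3, 1)}}  # hypothetical distribution of I

# Every delivery order yields the same final output, namely Q(I).
results = {frozenset(run(H, seed)) for seed in range(20)}
```

Because the query is monotone, a node never has to retract a triangle it has already output, which is why no delivery order can lead to a wrong result.
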
Coordination-free Algorithms [Ameloot, Neven, Van den Bussche, 2011]
Definition: Π is coordination-free if for all inputs I there is a distribution on which Π computes Q(I) without any communication.
Goal: separate data-communication from coordination-communication

Example: Ideal Distribution
Q_∆: select all triangles
[Figure: input instance; each node writes to output]
No communication required
Algorithm:
◮ (broadcast all data)
◮ periodically output local triangles

CALM-conjecture [Ameloot, Neven, Van den Bussche, 2011]
A query has a coordination-free and eventually consistent execution strategy iff the query is monotone.
Theorem: F_0 = M
Definition: F_0 = set of queries that are distributedly computed by coordination-free transducers

Overview
1. Introduction
2. CALM
3. CALM Revision 1
4. CALM Revision 2
5. Datalog
6. Conclusion
Results (→ marks the result of the next section):
  F_0 = M
  ⊊
→ F_1 = M_distinct
  ⊊
  F_2 = M_disjoint

Policy-aware Transducers
[Figure: input instance distributed over the nodes by a “Distribution Policy”; labels: “knows about missing fact”, “not in active domain”]

Policy-aware Transducers [Zinn, Green, Ludäscher, 2012]
Definition: A distribution policy P for σ and N is a total function from facts(σ) to the power set of N.
Definition: A policy-aware transducer is a transducer with access to P restricted to its active domain.
Definition: F_1 = set of queries that are distributedly computed by policy-aware coordination-free transducers

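How a policy lets a node learn about missing facts can be sketched as follows. The policy below (edges with an even sum go to node "x", the others to node "y") and all function names are hypothetical; the sketch only illustrates the idea that a node which is responsible for a fact over its active domain, but does not hold it, can conclude that the fact is absent from the input.

```python
NODES = ["x", "y"]

def policy(fact):
    """Hypothetical policy P: maps an edge fact to the set of responsible nodes."""
    a, b = fact
    return {"x"} if (a + b) % 2 == 0 else {"y"}

def distribute(I):
    """Horizontal distribution induced by the policy."""
    return {node: {f for f in I if node in policy(f)} for node in NODES}

def deduced_absent(node, local):
    """Facts over the node's active domain that the policy assigns to this node
    but that are not stored locally, hence cannot be part of the input."""
    dom = {v for f in local for v in f}
    return {(a, b) for a in dom for b in dom
            if node in policy((a, b)) and (a, b) not in local}

I = {(1, 3), (3, 1), (1, 2)}
local = distribute(I)
absent_at_x = deduced_absent("x", local["x"])
absent_at_y = deduced_absent("y", local["y"])
```

Here node "x" holds (1, 3) and (3, 1) and can deduce that (1, 1) and (3, 3) are absent, without exchanging any message.
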
Domain-distinct-monotonicity
Definition: A fact f is domain distinct from instance I when adom(f) ⊈ adom(I).
Example: [Figure: instance I with example facts f and f′]

Domain-distinct-monotonicity
Definition: An instance J is domain distinct from instance I when every fact f ∈ J is domain distinct from I.
Example: [Figure: instances I and J]

Domain-distinct-monotonicity
Definition: A query Q is domain-distinct-monotone if Q(I) ⊆ Q(I ∪ J) for all I and J for which J is domain distinct from I.
Notation: M_distinct is the class of domain-distinct-monotone queries.
[Figure: M contained in M_distinct]
Remark: M_distinct is the class of queries preserved under extensions.

Domain-distinct-monotonicity
Example: Select open triangles in a graph ∈ M_distinct.
[Figure: instance I with output Q(I); a highlighted fact is labeled “Not domain-distinct from I”]

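The definitions of domain distinctness can be checked on small instances with a Python sketch. As before, it assumes graphs encoded as sets of directed edge pairs and reads an open triangle as a path x → y → z without the closing edge (z, x); the function names are illustrative.

```python
def adom(facts):
    """Active domain: all values occurring in a set of edge facts."""
    return {v for fact in facts for v in fact}

def is_domain_distinct(J, I):
    """True when every fact of J uses at least one value outside adom(I)."""
    dom = adom(I)
    return all(not set(fact) <= dom for fact in J)

def open_triangles(edges):
    """Paths x->y->z whose closing edge z->x is absent (assumed reading)."""
    return {(x, y, z)
            for (x, y) in edges
            for (y2, z) in edges
            if y2 == y and (z, x) not in edges}

I = {(1, 2), (2, 3)}
J_ok = {(3, 4), (5, 1)}   # each fact contains a new value (4 or 5)
J_bad = {(3, 1)}          # uses only values of adom(I)

preserved_ok = open_triangles(I) <= open_triangles(I | J_ok)
preserved_bad = open_triangles(I) <= open_triangles(I | J_bad)
```

The closing edge (3, 1) is the only fact that could destroy the open triangle (1, 2, 3), and it is not domain distinct from I, so domain-distinct additions such as `J_ok` leave the earlier output intact.
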
Revised CALM-conjecture
A query has a coordination-free and eventually consistent execution strategy under distribution policies iff the query is domain-distinct-monotone.
Theorem: F_1 = M_distinct
Definition: F_1 = set of queries that are distributedly computed by policy-aware coordination-free transducers

Proof of M_distinct ⊆ F_1
◮ Monotonicity: Q(J) ⊆ Q(I) for every J ⊆ I
◮ Domain-distinct-monotonicity: Let I be an instance and C ⊆ adom(I).
  Induced instance: I|_C = { f ∈ I | adom(f) ⊆ C }
  [Figure: instance I, a set C of values, and the induced instance I|_C]
  Every fact of I outside I|_C contains a value outside C, so it is domain distinct from I|_C.
  By domain-distinct-monotonicity: Q(I|_C) ⊆ Q(I)

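The inclusion Q(I|_C) ⊆ Q(I) can be checked exhaustively on a small instance. The sketch below is illustrative only: it assumes edge-set graphs and the open-triangle query read as "path x → y → z without closing edge (z, x)", and it enumerates every subset C of the active domain.

```python
from itertools import combinations

def adom(facts):
    """Active domain: all values occurring in a set of edge facts."""
    return {v for fact in facts for v in fact}

def induced(I, C):
    """I|_C: the facts of I whose values all lie in C."""
    return {fact for fact in I if set(fact) <= set(C)}

def open_triangles(edges):
    """Paths x->y->z whose closing edge z->x is absent (assumed reading)."""
    return {(x, y, z)
            for (x, y) in edges
            for (y2, z) in edges
            if y2 == y and (z, x) not in edges}

I = {(1, 2), (2, 3), (3, 1), (3, 4)}
values = sorted(adom(I))
subsets = [set(c) for r in range(len(values) + 1)
           for c in combinations(values, r)]

# Output computed on any induced instance is already correct for I.
holds = all(open_triangles(induced(I, C)) <= open_triangles(I) for C in subsets)
```

For example, `induced(I, {2, 3, 4})` yields the open triangle (2, 3, 4), which is also an open triangle of the full instance.
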
Proof of M_distinct ⊆ F_1
◮ F_1 setting: Let I be an instance and C ⊆ adom(I).
  C is complete at node x when x knows for every fact f with adom(f) ⊆ C whether f ∈ I or f ∉ I.
  Instance based on a complete set C = induced instance of I based on C
Algorithm:
◮ broadcast all present and deduced absent facts
◮ evaluate the query on complete sets

Overview
1. Introduction
2. CALM
3. CALM Revision 1
4. CALM Revision 2
5. Datalog
6. Conclusion
Results (→ marks the result of the next section):
  F_0 = M
  ⊊
  F_1 = M_distinct
  ⊊
→ F_2 = M_disjoint

Domain-guided Policies
[Figure: input instance distributed over the nodes by a “Distribution Policy”]

Domain-guided Policies [Zinn, Green, Ludäscher, 2012]
Definition: F_2 = set of queries that are distributedly computed under domain-guided distribution policies by policy-aware coordination-free transducers

Domain-disjoint-monotonicity
Definition: An instance J is domain disjoint from instance I when adom(I) ∩ adom(J) = ∅.
Example: [Figure: instance I with example instances J and J′]

Domain-disjoint-monotonicity
Definition: A query Q is domain-disjoint-monotone if Q(I) ⊆ Q(I ∪ J) for all I and J for which J is domain disjoint from I.
Notation: M_disjoint is the class of domain-disjoint-monotone queries.
[Figure: M contained in M_distinct, which is contained in M_disjoint]

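The difference between domain-disjoint and domain-distinct additions can be seen on a query that returns all pairs of distinct vertices not connected by a path. The Python sketch below is an illustration under assumptions (edge-set graphs, hypothetical function names); it shows one instance where a domain-disjoint addition preserves the output while a merely domain-distinct addition does not.

```python
def adom(facts):
    """Active domain: all values occurring in a set of edge facts."""
    return {v for fact in facts for v in fact}

def is_domain_disjoint(J, I):
    """True when I and J share no value at all."""
    return adom(I).isdisjoint(adom(J))

def transitive_closure(edges):
    """Naive fixpoint computation of the transitive closure."""
    tc = set(edges)
    while True:
        new = {(x, w) for (x, y) in tc for (z, w) in edges if y == z} - tc
        if not new:
            return tc
        tc |= new

def unreachable_pairs(edges):
    """Pairs of distinct active-domain values with no path from x to y."""
    dom, tc = adom(edges), transitive_closure(edges)
    return {(x, y) for x in dom for y in dom if x != y and (x, y) not in tc}

I = {(1, 2)}
J_disjoint = {(8, 9)}          # shares no value with I
J_distinct = {(2, 9), (9, 1)}  # each fact has the new value 9, but touches adom(I)

kept_disjoint = unreachable_pairs(I) <= unreachable_pairs(I | J_disjoint)
kept_distinct = unreachable_pairs(I) <= unreachable_pairs(I | J_distinct)
```

The pair (2, 1) is in the output on I. It survives the disjoint addition, but the two facts of `J_distinct` create the path 2 → 9 → 1 and remove it from the output.
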
Revised CALM-conjecture
A query has a coordination-free and eventually consistent execution strategy under domain-guided distribution policies iff the query is domain-disjoint-monotone.
Theorem: F_2 = M_disjoint
Definition: F_2 = set of queries that are distributedly computed under domain-guided distribution policies by policy-aware coordination-free transducers

Intermediate Summary
Coordination-freeness   Monotonicity     Datalog + value invention   Datalog
F_0                   = M              = wILOG(≠)                  ⊋ Datalog(≠)
⊊                       ⊊                ⊊                           ⊊
F_1                   = M_distinct     = SP-wILOG                  ⊋ SP-Datalog
⊊                       ⊊                ⊊                           ⊊
F_2                   = M_disjoint     = semicon-wILOG¬            ⊋ semicon-Datalog¬

Datalog Variants
Datalog(≠) ⊊ wILOG(≠) = M
◮ Datalog(≠) ⊊ M ∩ PTIME [Afrati, Cosmadakis, Yannakakis, 1994]
◮ wILOG(≠) = M [Cabibbo, 1998]
SP-Datalog ⊊ SP-wILOG = M_distinct
◮ SP-Datalog ⊊ M_distinct ∩ PTIME [Afrati, Cosmadakis, Yannakakis, 1994]
◮ SP-wILOG = M_distinct [Cabibbo, 1998]
Datalog variant of M_disjoint?

semicon-Datalog¬
semicon-Datalog¬ ⊊ semicon-wILOG¬ = M_disjoint
Connected Rules:
  O(x, y, z) ← E(x, y), E(y, z), E(z, x) is connected
  O(x, y, z) ← E(x, y), E(z, z) is not connected
Definition: A stratified Datalog program is semi-connected if all rules are connected except (possibly) those of the last stratum.
Example: complement of transitive closure
  TC(x, y) ← E(x, y)
  TC(x, y) ← E(x, z), TC(z, y)
  O(x, y) ← ¬TC(x, y), x ≠ y

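A connectedness test for rule bodies can be sketched in Python. The slides do not spell out the definition, so the sketch assumes a common reading: take the variables of the positive body atoms as nodes, link two variables when they occur in the same atom, and call the rule connected when this graph is connected. The rule encoding as (predicate, variables) pairs is also an assumption.

```python
def is_connected(positive_body):
    """positive_body: list of (predicate, variables) pairs for the positive
    body atoms of one rule. Returns True when the variable graph is connected."""
    variables = {v for _, vs in positive_body for v in vs}
    if not variables:
        return True
    start = next(iter(variables))
    seen, frontier = {start}, [start]
    while frontier:
        v = frontier.pop()
        for _, vs in positive_body:
            if v in vs:                      # atoms containing v link v to their variables
                for w in vs:
                    if w not in seen:
                        seen.add(w)
                        frontier.append(w)
    return seen == variables

triangle_rule = [("E", ("x", "y")), ("E", ("y", "z")), ("E", ("z", "x"))]
split_rule = [("E", ("x", "y")), ("E", ("z", "z"))]
tc_rule = [("E", ("x", "z")), ("TC", ("z", "y"))]

connected_flags = [is_connected(r) for r in (triangle_rule, split_rule, tc_rule)]
```

Under this reading the two TC rules of the example are connected, and the rule for O, whose body has no positive atom linking x and y, is only permitted because it sits in the last stratum.
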
Conclusion and Future Work
Conclusion:
◮ Coordination-free evaluation = (refined) monotonicity
◮ (semi-)connected Datalog
Can we put the CALM-conjecture to rest?
Future Work:
◮ Other settings / other distribution policies?
◮ Coordination-free + efficient evaluation?