Probabilistic Logic Programming & Knowledge Compilation
Wannes Meert, DTAI, Dept. of Computer Science, KU Leuven
Dagstuhl, 18 September 2017
In collaboration with Jonas Vlasselaer, Guy Van den Broeck, Anton Dries, Angelika Kimmig, Hendrik Blockeel, Jesse Davis and Luc De Raedt
StarAI
Statistical relational AI sits at the intersection of three concerns (shown on the slide as overlapping circles):
- Dealing with uncertainty: probability theory, graphical models
- Reasoning with relational data: logic, databases, programming
- Learning: parameters, structure
Also known as: statistical relational learning, probabilistic logic learning, probabilistic programming, ...
ProbLog
Uncertainty (multiple possible worlds):
    0.8::stress(ann).
    0.6::influences(ann,bob).
    0.2::influences(bob,carl).
Relational data (one world):
    stress(ann).
    influences(ann,bob).
    influences(bob,carl).
    smokes(X) :- stress(X).
    smokes(X) :- influences(Y,X), smokes(Y).
Learning (t(·) marks learnable parameters):
    t(0.8)::stress(ann).
    t(_)::influences(ann,bob).
    t(_)::influences(bob,carl).
Introduction to ProbLog
Example
Toss a (biased) coin and draw a ball from each urn; win if (heads and a red ball) or (two balls of the same color).

Probabilistic fact:
    0.4 :: heads.
Annotated disjunctions:
    0.3 :: col(1,red); 0.7 :: col(1,blue).
    0.2 :: col(2,red); 0.3 :: col(2,green); 0.5 :: col(2,blue).
Logical rules (background knowledge):
    win :- heads, col(_,red).
    win :- col(1,C), col(2,C).
Evidence:
    evidence(heads).
Query:
    query(win).
Example (continued)
    0.4 :: heads.
    0.3 :: col(1,red); 0.7 :: col(1,blue).
    0.2 :: col(2,red); 0.3 :: col(2,green); 0.5 :: col(2,blue).
    win :- heads, col(_,red).
    win :- col(1,C), col(2,C).
[Figure: three example possible worlds, with probabilities (1-0.4) × 0.3 × 0.3, 0.4 × 0.3 × 0.3 and (1-0.4) × 0.3 × 0.2; worlds in which win holds are marked W]
All possible worlds
Each world fixes the coin outcome and one color per urn; its probability is the product of the individual choices (worlds where win holds are marked W):

    ball choices                 heads        tails
    col(1,red),  col(2,red)      0.024 (W)    0.036 (W)
    col(1,red),  col(2,green)    0.036 (W)    0.054
    col(1,red),  col(2,blue)     0.060 (W)    0.090
    col(1,blue), col(2,red)      0.056 (W)    0.084
    col(1,blue), col(2,green)    0.084        0.126
    col(1,blue), col(2,blue)     0.140 (W)    0.210 (W)

The twelve world probabilities sum to 1.
P(win) = sum over the worlds in which win holds:
P(win) = 0.024 + 0.036 + 0.036 + 0.060 + 0.056 + 0.140 + 0.210 = 0.562
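This computation can be checked with the problog Python package (a minimal sketch, assuming problog is installed via pip; the evidence(heads) line is omitted so that the unconditional P(win) is computed rather than P(win | heads)):

    from problog.program import PrologString
    from problog import get_evaluatable

    model = """
    0.4 :: heads.
    0.3 :: col(1,red); 0.7 :: col(1,blue).
    0.2 :: col(2,red); 0.3 :: col(2,green); 0.5 :: col(2,blue).
    win :- heads, col(_,red).
    win :- col(1,C), col(2,C).
    query(win).
    """

    # Ground, compile and evaluate the program; returns a dict {query: probability}.
    result = get_evaluatable().create_from(PrologString(model)).evaluate()
    print(result)  # expected: {win: 0.562}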
Alternative view: CP-logic (probabilistic causal laws)

    throws(john).
    0.5::throws(mary).
    0.8 :: break :- throws(mary).
    0.6 :: break :- throws(john).

[Figure: probability tree. John throws (1.0) and the window breaks (0.6) or doesn't break (0.4); then Mary throws (0.5) or doesn't throw (0.5); if she throws, the window breaks (0.8) or doesn't break (0.2).]

P(break) = 0.6 × 0.5 × 0.8 + 0.6 × 0.5 × 0.2 + 0.6 × 0.5 + 0.4 × 0.5 × 0.8 = 0.76

[Vennekens et al. 2003, Meert and Vennekens 2014]
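For this particular program the ProbLog semantics coincides with the CP-logic computation, so the result can be reproduced directly (a sketch, assuming the problog package):

    from problog.program import PrologString
    from problog import get_evaluatable

    model = """
    throws(john).
    0.5::throws(mary).
    0.8::break :- throws(mary).
    0.6::break :- throws(john).
    query(break).
    """
    print(get_evaluatable().create_from(PrologString(model)).evaluate())  # {break: 0.76}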
Sato's distribution semantics

    P(Q) = \sum_{F \subseteq \mathbb{F},\ F \cup R \models Q} \ \prod_{f \in F} p(f) \ \prod_{f \in \mathbb{F} \setminus F} (1 - p(f))

The sum ranges over the possible worlds in which the query Q is true: F is a subset of the probabilistic facts \mathbb{F}, R is the set of Prolog rules, and each product is the probability of one possible world. [Sato, ICLP 95]
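The semantics can be made concrete by brute-force enumeration over the subsets F of the probabilistic facts (a sketch using the window example above; the two probabilistic rules are represented by auxiliary facts, following the usual transformation, and entails encodes the rules R):

    from itertools import product

    # Probabilistic facts F (throws(john) is deterministic, so it is left implicit).
    facts = {'throws(mary)': 0.5, 'break_by_mary': 0.8, 'break_by_john': 0.6}

    def entails(world):
        # Rules R: break is derivable from the facts that are true in this world.
        return world['break_by_john'] or (world['throws(mary)'] and world['break_by_mary'])

    names = list(facts)
    p_query = 0.0
    for truth_values in product([True, False], repeat=len(names)):
        world = dict(zip(names, truth_values))
        # Probability of this possible world: product over facts in and out of F.
        p_world = 1.0
        for f, is_true in world.items():
            p_world *= facts[f] if is_true else 1.0 - facts[f]
        if entails(world):
            p_query += p_world
    print(p_query)  # 0.76, matching the CP-logic computation above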
Examples from the tutorial
Try it yourself: https://dtai.cs.kuleuven.be/problog
    $ pip install problog
Tutorial: Bayes net
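The tutorial's Bayes-net example is along the following lines (a sketch reconstructed from the online ProbLog tutorial; the exact probabilities on the slide may differ):

    from problog.program import PrologString
    from problog import get_evaluatable

    # A CPT is encoded with one probabilistic rule per parent configuration.
    model = r"""
    0.7::burglary.
    0.2::earthquake.
    0.9::alarm :- burglary, earthquake.
    0.8::alarm :- burglary, \+earthquake.
    0.1::alarm :- \+burglary, earthquake.
    evidence(alarm, true).
    query(burglary).
    query(earthquake).
    """
    print(get_evaluatable().create_from(PrologString(model)).evaluate())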
Tutorial: Higher-order functions
Tutorial: As a Python library

    from problog.program import SimpleProgram
    from problog.logic import Constant, Var, Term, AnnotatedDisjunction
    from problog.formula import LogicFormula
    from problog.cnf_formula import CNF
    from problog.ddnnf_formula import DDNNF

    coin, heads, tails, win, query = \
        Term('coin'), Term('heads'), Term('tails'), Term('win'), Term('query')
    C = Var('C')

    p = SimpleProgram()
    p += coin(Constant('c1'))
    p += coin(Constant('c2'))
    p += AnnotatedDisjunction([heads(C, p=0.4), tails(C, p=0.6)], coin(C))
    p += (win << heads(C))
    p += query(win)

    lf = LogicFormula.create_from(p)   # ground the program
    cnf = CNF.create_from(lf)          # convert to CNF
    ddnnf = DDNNF.create_from(cnf)     # compile CNF to d-DNNF
    print(ddnnf.evaluate())
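The explicit ground-convert-compile chain can usually be left to the library; a shorter equivalent lets get_evaluatable() pick the best available compilation backend:

    from problog import get_evaluatable

    # Equivalent to the LogicFormula -> CNF -> DDNNF chain above.
    print(get_evaluatable().create_from(p).evaluate())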
Weighted model counting

The query probability

    P(Q) = \sum_{F \subseteq \mathbb{F},\ F \cup R \models Q} \ \prod_{f \in F} p(f) \ \prod_{f \notin F} (1 - p(f))

is reduced to weighted model counting over a propositional formula \varphi in conjunctive normal form (CNF), given by the ProbLog program and the query:

    WMC(\varphi) = \sum_{I \models \varphi} \ \prod_{l \in I} w(l)

The sum is over the interpretations I of the propositional variables (truth-value assignments, i.e. the possible worlds); w(l) is the weight of literal l: for a probabilistic fact p::f, w(f) = p and w(\neg f) = 1 - p.
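As an illustration, WMC(φ) can be computed naively by enumerating all interpretations (a sketch; real systems avoid this exponential loop by compiling φ into a circuit first). The example encodes x ⇔ (a ∨ b) for probabilistic facts 0.4::a and 0.3::b; the derived atom x gets neutral weights w(x) = w(¬x) = 1:

    from itertools import product

    def wmc(cnf, weights):
        """Naive WMC: sum the weight of every interpretation that satisfies the CNF.

        cnf: list of clauses, each a list of non-zero ints (DIMACS-style literals).
        weights: {var: (weight of positive literal, weight of negative literal)}.
        """
        variables = sorted(weights)
        total = 0.0
        for truth_values in product([True, False], repeat=len(variables)):
            interp = dict(zip(variables, truth_values))
            if all(any(interp[abs(l)] == (l > 0) for l in clause) for clause in cnf):
                weight = 1.0
                for v in variables:
                    weight *= weights[v][0] if interp[v] else weights[v][1]
                total += weight
        return total

    # Clauses for x <=> (a v b), plus the unit clause x for the query (a=1, b=2, x=3).
    cnf = [[-3, 1, 2], [3, -1], [3, -2], [3]]
    weights = {1: (0.4, 0.6), 2: (0.3, 0.7), 3: (1.0, 1.0)}
    print(wmc(cnf, weights))  # 1 - 0.6 * 0.7 = 0.58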
Encodings/Compilers for WMC

    usage: problog [--knowledge {sdd,bdd,nnf,ddnnf,kbest,fsdd,fbdd}] ...

Pipeline: ProbLog program → grounding → cycle breaking → CNF formula → BDD / d-DNNF / SDD (+ various tools). Tp-compilation goes directly from the grounding to a circuit, bypassing the CNF.
Also links to MaxSAT (decisions), Bayes net inference, ...
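The same backends can be selected on the command line; a usage sketch, with model.pl standing in for any model file:

    $ problog --knowledge sdd model.pl
    $ problog --knowledge ddnnf model.pl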
Impact of encoding

Noisy-OR: child X with parents Y_1, ..., Y_n, encoded as

    y(i) ⇔ p(1,i)     for i = 1, ..., n
    x ⇔ (y(1) ∧ p(2,1)) ∨ ... ∨ (y(n) ∧ p(2,n))

Compiling this encoding yields a circuit computing

    WMC(\neg x) = \prod_i w(\neg y_i) = \prod_i (1 - w(y_i))
    WMC(x) = w(y_1) + w(\neg y_1) \cdot w(y_2) + \ldots = \sum_i w(y_i) \cdot \prod_{j<i} (1 - w(y_j))

since w(y_i) + w(\neg y_i) = 1 and hence smooth(·) = 1.

[Figure: the compiled arithmetic circuit, a chain of + and × nodes over the y_i literals with smoothing nodes smooth(Y_1, ..., Y_n)]

[Van den Broeck 2014, Meert 2016]
Impact of encoding (continued)

Same noisy-OR encoding, but compiled so that the positive case reuses the negative one:

    WMC(\neg x) = \prod_i w(\neg y_i) = \prod_i (1 - w(y_i))
    WMC(x) = 1 - \prod_i w(\neg y_i) = 1 - \prod_i (1 - w(y_i))

again using w(y_i) + w(\neg y_i) = 1 and smooth(·) = 1.

[Figure: the compiled circuit, a single × node over the ¬y_i literals, complemented through an auxiliary variable r]

[Van den Broeck 2014, Meert 2016]
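Both circuits compute the same probability; writing $q_i = 1 - w(y_i)$, the chain from the previous slide telescopes (a one-line check, not on the slides):

    \[
      \sum_i w(y_i) \prod_{j<i} q_j
        = \sum_i \Bigl( \prod_{j<i} q_j - \prod_{j \le i} q_j \Bigr)
        = 1 - \prod_i q_i ,
    \]

so the two encodings differ only in the size and shape of the compiled circuit, not in the WMC value.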
Is KC just a toolbox for us?
Yes: it separates concerns, and we can conveniently use whatever compiler is available (and improve timings simply by waiting for compilers to get faster).
No: to tackle some types of problems we need to interact with the compiler while compiling or while performing inference.
Tp-compilation
- Forward inference
- Incremental compilation
Why Tp-compilation?
Domains with many cycles or long temporal chains: social networks, genes, webpages, sensor networks.
We encountered two problems:
1. It is not always feasible to compile the CNF.
2. It is not always feasible to even create the CNF.
Before: Grounding → loop breaking → CNF conversion, followed by one of:
- 'Exact' knowledge compilation, e.g. OBDD, d-DNNF, SDD [Fierens et al., TPLP '15]
- Horn approximation [Selman and Kautz, AAAI '91]
- "Approximate" compilation, e.g. via weighted partial MaxSAT [Renkens et al., AAAI '14]
- Sampling on the CNF, e.g. MC-SAT [Poon and Domingos, AAAI '06]
Tp-compilation
• Generalizes the Tp operator from logic programming to the probabilistic setting.
• Tp operator (forward reasoning):
  o Start with what is known.
  o Derive new knowledge by applying the rules.
  o Continue until fixpoint (the interpretation is unchanged).
• Tp-compilation (a minimal sketch of the classical fixpoint follows below):
  o Start with an empty formula (an SDD) for each probabilistic fact.
  o Construct new formulas by applying the rules (the SDD Apply operator).
  o Continue until fixpoint (the formulas remain equivalent, detected with an equivalence check).
• Bounds are available at every iteration.
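A minimal sketch of the classical (non-probabilistic) Tp fixpoint in Python; Tp-compilation replaces the Boolean truth values below by SDDs and the set operations by the SDD Apply operator (atom and rule names are illustrative):

    def tp(rules, interpretation):
        """One application of the Tp operator: add every head whose body holds."""
        derived = set(interpretation)
        for head, body in rules:
            if all(atom in interpretation for atom in body):
                derived.add(head)
        return derived

    def tp_fixpoint(rules, facts):
        """Iterate Tp until the interpretation no longer changes."""
        current = set(facts)
        while True:
            updated = tp(rules, current)
            if updated == current:
                return current
            current = updated

    # The smokers program from the introduction, fully grounded.
    rules = [
        ('smokes(ann)',  ['stress(ann)']),
        ('smokes(bob)',  ['influences(ann,bob)', 'smokes(ann)']),
        ('smokes(carl)', ['influences(bob,carl)', 'smokes(bob)']),
    ]
    facts = {'stress(ann)', 'influences(ann,bob)', 'influences(bob,carl)'}
    print(tp_fixpoint(rules, facts))  # the three smokes atoms are derived in successive iterations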
Really a problem?
• Fully connected graph with 10 nodes (90 edges)
• The CNF contains more than 25k variables and more than 100k clauses
• Tp-compilation only requires 90 variables
• [Figure: results on the Alzheimer network]
Continuous observations
- Sensor measurements
- Circuit interacts with other representations
Ongoing work: continuous sensor measurements

    normal(0.2,0.1)::vibration(X) :- op1(X).
    normal(0.6,0.2)::vibration(X) :- op2(X).
    normal(3.1,1.1)::vibration(X) :- fault(X).
    0.2::fault(X) :- connected(X,Y), fault(Y).

Restricted setting:
- Sensor measurements are always available
- Continuous distributions are only used in the head
Ongoing work: continuous values

Gaussian mixture model with learnable parameters (the circuit weights become functions):

    t(0.5)::c(ID).
    t(normal(1, 10))::f(ID) :- c(ID).
    t(normal(10,10))::f(ID) :- \+c(ID).

    evidence(f(1), 10).   evidence(f(2), 12).   evidence(f(3), 8).
    evidence(f(4), 11).   evidence(f(5), 7).    evidence(f(6), 13).
    evidence(f(7), 20).   evidence(f(8), 21).   evidence(f(9), 22).
    evidence(f(10), 18).  evidence(f(11), 19).  evidence(f(12), 19).
    evidence(f(13), 19).  evidence(f(14), 23).  evidence(f(15), 21).

Learned model:

    0.40::c(ID).
    normal(10.16,2.11)::f(ID) :- c(ID).
    normal(20.22,1.54)::f(ID) :- \+c(ID).

[Figure: histogram of the observations, and the compiled circuit with parameter literals θ₁, ..., θ₆]
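For the discrete t(·) parameters, ProbLog ships an EM-based learning-from-interpretations (LFI) tool; a usage sketch, assuming a model file and an evidence file (exact flags may differ per version; the continuous weights above are the ongoing extension and not part of the standard tool):

    $ problog lfi model.pl evidence.pl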
Resource-aware circuits
- Memory and energy
- Circuits that are 'hardware-friendly'
Ongoing work: why resource aware?
• Previous work: decision trees.
• Integrate AI and hardware to achieve dynamic attention-scalability: adapt the hardware dynamically, depending on the system's operating mode, to extract the maximum of relevant information under a limited computational bandwidth.
• Resource-aware inference and fusion algorithms.
• Resource-scalable inference processors.
• State-of-the-art sensor fusion combines sensory information streams to improve the sensory information.
[Figure: schematic and decision-tree algorithm for a mixed-signal processor, quoted from previous work]