  1. Readings: K&F: 3.3, 3.4
  BN Semantics 3 – Now it's personal!
  Graphical Models – 10-708
  Carlos Guestrin, Carnegie Mellon University
  September 22nd, 2008
  10-708 – © Carlos Guestrin 2006–2008

  Independencies encoded in BN
  - We said: all you need is the local Markov assumption (X_i ⊥ NonDescendants_{X_i} | Pa_{X_i})
  - But then we talked about other (in)dependencies, e.g., explaining away
  - What are the independencies encoded by a BN?
    - The only assumption is local Markov
    - But many others can be derived using the algebra of conditional independencies!

  2. Understanding independencies in BNs – BNs with 3 nodes
  Local Markov Assumption: a variable X is independent of its non-descendants given its parents (and only its parents).
  - Indirect causal effect: X → Z → Y
  - Indirect evidential effect: X ← Z ← Y
  - Common cause: X ← Z → Y
  - Common effect: X → Z ← Y

  Understanding independencies in BNs – Some examples
  [Figure: example DAG over nodes A, B, C, D, E, F, G, H, I, J, K]
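The common-effect case ("explaining away") is the one that trips people up: X and Y are marginally independent, but observing their common child Z makes them dependent. A minimal numeric sketch, with an assumed toy network (X and Y fair independent coins, Z their deterministic OR – my construction, not from the slides):

```python
from itertools import product

# Joint over a common-effect ("v-structure") network X -> Z <- Y:
# X and Y are independent fair coins, Z is the deterministic OR of X and Y.
def joint(x, y, z):
    return 0.25 if z == (x or y) else 0.0

def prob(pred):
    return sum(joint(x, y, z)
               for x, y, z in product([0, 1], repeat=3) if pred(x, y, z))

# Marginally, X and Y are independent: P(X=1, Y=1) = P(X=1) P(Y=1).
assert prob(lambda x, y, z: x == 1 and y == 1) == \
       prob(lambda x, y, z: x == 1) * prob(lambda x, y, z: y == 1)

# But observing the common effect Z couples them ("explaining away"):
p_x_given_z = prob(lambda x, y, z: x == 1 and z == 1) / \
              prob(lambda x, y, z: z == 1)
p_x_given_zy = prob(lambda x, y, z: x == 1 and z == 1 and y == 1) / \
               prob(lambda x, y, z: z == 1 and y == 1)
print(p_x_given_z, p_x_given_zy)  # 2/3 vs 1/2: learning Y=1 "explains away" Z=1
```

Seeing Z=1 raises our belief in X=1 from 1/2 to 2/3; additionally learning Y=1 drops it back to 1/2, because Y=1 already explains the alarm.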

  3. Understanding independencies in BNs – Some more examples
  [Figure: the same example DAG over nodes A–K]

  An active trail – Example
  [Figure: DAG over A, B, C, D, E, F, G, H, with additional nodes F′ and F′′]
  When are A and H independent?

  4. Active trails formalized
  - A trail X_1 – X_2 – ··· – X_k is an active trail when variables O ⊆ {X_1,…,X_n} are observed if, for each consecutive triplet in the trail, one of the following holds:
    - X_{i-1} → X_i → X_{i+1}, and X_i is not observed (X_i ∉ O)
    - X_{i-1} ← X_i ← X_{i+1}, and X_i is not observed (X_i ∉ O)
    - X_{i-1} ← X_i → X_{i+1}, and X_i is not observed (X_i ∉ O)
    - X_{i-1} → X_i ← X_{i+1}, and X_i (or one of its descendants) is observed

  Active trails and independence?
  - Theorem: variables X_i and X_j are independent given Z ⊆ {X_1,…,X_n} if there is no active trail between X_i and X_j when the variables in Z are observed.
  [Figure: the example DAG over nodes A–K]
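The four triplet rules above can be checked mechanically. A sketch of such a check, assuming the DAG is given as a dict mapping each node to its set of children (the function name and representation are my choices, not the course's):

```python
def triplet_active(graph, x, z, y, observed):
    """Check whether the consecutive trail triplet x - z - y is active,
    given `graph` as a dict {node: set of children} and `observed` as the
    set of observed variables (the slide's O)."""
    def descendants(node):
        out, stack = set(), [node]
        while stack:
            for c in graph.get(stack.pop(), set()):
                if c not in out:
                    out.add(c)
                    stack.append(c)
        return out

    causal        = z in graph.get(x, set()) and y in graph.get(z, set())  # x -> z -> y
    evidential    = z in graph.get(y, set()) and x in graph.get(z, set())  # x <- z <- y
    common_cause  = x in graph.get(z, set()) and y in graph.get(z, set())  # x <- z -> y
    common_effect = z in graph.get(x, set()) and z in graph.get(y, set())  # x -> z <- y

    if causal or evidential or common_cause:
        return z not in observed          # blocked by observing the middle node
    if common_effect:
        # v-structure: active iff z or one of its descendants is observed
        return z in observed or bool(descendants(z) & observed)
    return False

# Chain A -> B -> C: the triplet A-B-C is active iff B is unobserved.
chain = {"A": {"B"}, "B": {"C"}, "C": set()}
assert triplet_active(chain, "A", "B", "C", observed=set())
assert not triplet_active(chain, "A", "B", "C", observed={"B"})
```

A full trail is active exactly when every consecutive triplet along it passes this check.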

  5. More generally: Soundness of d-separation
  - Given BN structure G
  - Set of independence assertions obtained by d-separation: I(G) = {(X ⊥ Y | Z) : d-sep_G(X; Y | Z)}
  - Theorem (soundness of d-separation): if P factorizes over G, then I(G) ⊆ I(P)
    - Interpretation: d-separation only captures true independencies
    - Proof discussed when we talk about undirected models

  Existence of dependency when not d-separated
  - Theorem: if X and Y are not d-separated given Z, then X and Y are dependent given Z under some P that factorizes over G
  - Proof sketch:
    - Choose an active trail between X and Y given Z
    - Make this trail dependent
    - Make everything else uniform (independent) to avoid "canceling out" the influence
  [Figure: the example DAG over nodes A–K]

  6. More generally: Completeness of d-separation
  - Theorem (completeness of d-separation): for "almost all" distributions P that factorize over G, we have I(G) = I(P)
    - "Almost all" distributions: except for a set of measure zero of parameterizations of the CPTs (assuming no finite set of parameterizations has positive measure)
    - This means that for all sets X and Y that are not d-separated given Z, ¬(X ⊥ Y | Z)
  - Proof sketch for a very simple case:

  Interpretation of completeness
  - Theorem (completeness of d-separation): for "almost all" distributions P that factorize over G, we have I(G) = I(P)
  - The BN graph is usually sufficient to capture all independence properties of the distribution!
  - But only for complete independence: P ⊨ (X = x ⊥ Y = y | Z = z), ∀ x ∈ Val(X), y ∈ Val(Y), z ∈ Val(Z)
  - Often we have context-specific independence (CSI): ∃ x ∈ Val(X), y ∈ Val(Y), z ∈ Val(Z): P ⊨ (X = x ⊥ Y = y | Z = z)
    - Many factors may affect your grade
    - But if you are a frequentist, all other factors are irrelevant ☺

  7. Algorithm for d-separation
  - How do I check if X and Y are d-separated given Z?
    - There can be exponentially many trails between X and Y
  - A two-pass linear-time algorithm finds all d-separations for X:
    1. Upward pass: mark Z and its ancestors (the nodes with a descendant in Z)
    2. Breadth-first traversal from X
       - Stop the traversal at a node if the trail is "blocked"
    - (Some tricky details apply – see reading)
  [Figure: the example DAG over nodes A–K]

  What you need to know
  - d-separation and independence
    - A sound procedure for finding independencies
    - Existence of distributions with these independencies
    - (Almost) all independencies can be read directly from the graph without looking at the CPTs
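The two-pass idea above can be sketched as a reachability test over (node, direction) states, in the spirit of the reading's algorithm. This is my own Python rendering under the same DAG-as-dict representation as before, not the course's code; the "tricky details" the slide mentions are exactly the direction bookkeeping below:

```python
from collections import deque

def d_separated(children, x, y, z):
    """Return True iff x and y are d-separated given the set z.
    `children` maps each node to the set of its children."""
    parents = {n: set() for n in children}
    for p, cs in children.items():
        for c in cs:
            parents[c].add(p)

    # Pass 1: mark z and all its ancestors (nodes with an observed descendant);
    # these are the nodes whose v-structures are activated.
    anc, stack = set(z), list(z)
    while stack:
        for p in parents[stack.pop()]:
            if p not in anc:
                anc.add(p)
                stack.append(p)

    # Pass 2: breadth-first traversal from x over (node, direction) states.
    visited, reachable = set(), set()
    frontier = deque([(x, "up")])
    while frontier:
        node, direction = frontier.popleft()
        if (node, direction) in visited:
            continue
        visited.add((node, direction))
        if node not in z:
            reachable.add(node)
        if direction == "up" and node not in z:
            # Arrived from a child: trail may continue to parents and children.
            frontier.extend((p, "up") for p in parents[node])
            frontier.extend((c, "down") for c in children[node])
        elif direction == "down":
            if node not in z:
                frontier.extend((c, "down") for c in children[node])
            if node in anc:
                # v-structure activated by an observed descendant.
                frontier.extend((p, "up") for p in parents[node])
    return y not in reachable

# Chain A -> B -> C: observing B blocks the only trail.
g = {"A": {"B"}, "B": {"C"}, "C": set()}
assert not d_separated(g, "A", "C", set())
assert d_separated(g, "A", "C", {"B"})
```

Both passes touch each edge a constant number of times, which is where the linear running time comes from.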

  8. Announcements
  - Homework 1:
    - Due next Wednesday – beginning of class!
    - It's hard – start early, ask questions
  - Audit policy: no sitting in, official auditors only, see course website

  Building BNs from independence properties
  - From d-separation we learned:
    - Starting from the local Markov assumptions, we obtain all independence assumptions encoded by the graph
    - For most P's that factorize over G, I(G) = I(P)
    - All of this discussion was for a given G that is an I-map for P
  - Now, given a P, how can I get a G?
    - i.e., give me the independence assumptions entailed by P
    - Many G's are "equivalent" – how do I represent this?
  - Most of this discussion is not about practical algorithms, but about useful concepts that will be used by practical algorithms
    - Practical algorithms next time

  9. Minimal I-maps
  - One option:
    - G is an I-map for P
    - G is as simple as possible
  - G is a minimal I-map for P if deleting any edge from G makes it no longer an I-map

  Obtaining a minimal I-map
  Example variables: Flu, Allergy, SinusInfection, Headache
  - Given a set of variables and conditional independence assumptions:
    - Choose an ordering on the variables, e.g., X_1, …, X_n
    - For i = 1 to n:
      - Add X_i to the network
      - Define the parents of X_i, Pa_{X_i}, as the minimal subset of {X_1,…,X_{i-1}} such that the local Markov assumption holds – X_i independent of the rest of {X_1,…,X_{i-1}} given Pa_{X_i}
      - Define/learn the CPT P(X_i | Pa_{X_i})
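The construction above can be sketched directly, assuming access to an independence oracle for P. Everything here is illustrative: the oracle below hand-codes the independencies one would expect in the slide's Flu/Allergy/Sinus/Headache example (Flu ⊥ Allergy; Headache independent of the rest given Sinus), and the brute-force subset search is exponential, a sketch rather than a practical algorithm:

```python
from itertools import combinations

def build_minimal_imap(order, independent):
    """`order` is the chosen variable ordering X_1..X_n; `independent(x, rest,
    parents)` is an assumed oracle answering whether x ⊥ rest | parents in P.
    Returns a dict mapping each variable to its chosen parent set."""
    parents = {}
    for i, x in enumerate(order):
        predecessors = order[:i]
        pa = set(predecessors)  # fallback: all predecessors always works
        # Smallest subset of predecessors making the local Markov assumption hold.
        for k in range(len(predecessors) + 1):
            for cand in combinations(predecessors, k):
                rest = set(predecessors) - set(cand)
                if independent(x, rest, set(cand)):
                    pa = set(cand)
                    break
            else:
                continue
            break
        parents[x] = pa
    return parents

# Hypothetical oracle encoding the example's independencies.
def oracle(x, rest, given):
    if not rest:
        return True
    if x == "Allergy":
        return rest == {"Flu"} and not given   # Flu and Allergy are independent
    if x == "Headache":
        return "Sinus" in given                # Headache ⊥ rest | Sinus
    return False

g = build_minimal_imap(["Flu", "Allergy", "Sinus", "Headache"], oracle)
assert g == {"Flu": set(), "Allergy": set(),
             "Sinus": {"Flu", "Allergy"}, "Headache": {"Sinus"}}
```

With this ordering we recover the natural structure; as the next slide notes, a different ordering would generally give a different (possibly denser) minimal I-map.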

  10. Minimal I-map not unique (or minimum)
  Example variables: Flu, Allergy, SinusInfection, Headache
  - Running the same construction as above with a different variable ordering can yield a different minimal I-map – minimal I-maps are neither unique nor guaranteed to have the fewest edges

  Perfect maps (P-maps)
  - I-maps are not unique and often not simple enough
  - Define the "simplest" G that is an I-map for P:
    - A BN structure G is a perfect map for a distribution P if I(P) = I(G)
  - Our goal: find a perfect map!
    - Must address equivalent BNs

  11. Inexistence of P-maps 1
  - XOR (this is a hint for the homework)

  Inexistence of P-maps 2
  - (Slightly un-PC) swinging couples example
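To see why XOR is a useful hint, here is a worked instance (my construction; the slide only names XOR): X and Y are fair independent coins and Z = X xor Y. Every pair of variables is marginally independent, yet any one variable is a deterministic function of the other two, and no DAG over three nodes encodes exactly this set of independencies:

```python
from itertools import product

# XOR construction: X, Y fair independent coins, Z = X xor Y.
def p(x, y, z):
    return 0.25 if z == x ^ y else 0.0

def prob(pred):
    return sum(p(x, y, z)
               for x, y, z in product([0, 1], repeat=3) if pred(x, y, z))

# Every pair is marginally independent, e.g. X ⊥ Z and Y ⊥ Z:
assert prob(lambda x, y, z: x == 1 and z == 1) == \
       prob(lambda x, y, z: x == 1) * prob(lambda x, y, z: z == 1)
assert prob(lambda x, y, z: y == 1 and z == 1) == \
       prob(lambda x, y, z: y == 1) * prob(lambda x, y, z: z == 1)

# ...but the three are not jointly independent: full independence would
# give P(X=1, Y=1, Z=1) = 1/8, while XOR forces it to 0.
assert prob(lambda x, y, z: x == 1 and y == 1 and z == 1) == 0.0
```

Any candidate DAG either asserts an independence that fails here (too sparse) or misses one of the pairwise independencies (too dense), so I(P) = I(G) is unachievable.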

  12. Obtaining a P-map
  - Given the independence assertions that are true for P:
    - Assume that there exists a perfect map G*
    - We want to find G*
  - Many structures may encode the same independencies as G* – when are we done?
    - Find all equivalent structures simultaneously!
