
Graphical Models: Model Estimation and Validation, by Marco Scutari



  1. Graphical Models: Model Estimation and Validation. Marco Scutari (m.scutari@ucl.ac.uk), Genetics Institute, University College London. September 27, 2011.

  2. Graphical Models

  3. Graphical Models
     Graphical models are defined by:
     • a network structure, $G = (V, E)$, either an undirected graph (Markov networks, gene association networks, correlation networks, etc.) or a directed acyclic graph (Bayesian networks). Each node $v_i \in V$ corresponds to a random variable $X_i$;
     • a global probability distribution over $X = \{X_1, \ldots, X_p\}$, which can be factorised into a small set of local probability distributions according to the edges $e_{ij} \in E$ present in the graph.
     This combination allows a compact representation of the joint distribution of large numbers of random variables and simplifies inference on the resulting parameter space.

  4. A Simple Bayesian Network: Watson’s Lawn

     P(RAIN)
       TRUE    FALSE
       0.2     0.8

     P(SPRINKLER | RAIN)
       RAIN     SPRINKLER = TRUE   SPRINKLER = FALSE
       FALSE    0.4                0.6
       TRUE     0.01               0.99

     P(GRASS WET | SPRINKLER, RAIN)
       SPRINKLER   RAIN     GRASS WET = TRUE   GRASS WET = FALSE
       FALSE       FALSE    0.0                1.0
       FALSE       TRUE     0.8                0.2
       TRUE        FALSE    0.9                0.1
       TRUE        TRUE     0.99               0.01
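
The tables on this slide are enough to answer simple queries by brute-force enumeration. The following is a minimal sketch, assuming the edges RAIN → SPRINKLER, RAIN → GRASS WET and SPRINKLER → GRASS WET implied by the conditioning sets of the tables; the function names are illustrative and not part of the slides.

```python
from itertools import product

# Conditional probability tables copied from the slide.
P_RAIN = {True: 0.2, False: 0.8}
# P(SPRINKLER | RAIN), indexed as [rain][sprinkler].
P_SPRINKLER = {
    False: {True: 0.4, False: 0.6},
    True: {True: 0.01, False: 0.99},
}
# P(GRASS WET | SPRINKLER, RAIN), indexed as [(sprinkler, rain)][wet].
P_WET = {
    (False, False): {True: 0.0, False: 1.0},
    (False, True): {True: 0.8, False: 0.2},
    (True, False): {True: 0.9, False: 0.1},
    (True, True): {True: 0.99, False: 0.01},
}


def joint(rain, sprinkler, wet):
    """Joint probability as the product of the three local distributions."""
    return (P_RAIN[rain]
            * P_SPRINKLER[rain][sprinkler]
            * P_WET[(sprinkler, rain)][wet])


def prob_rain_given_wet():
    """P(RAIN = TRUE | GRASS WET = TRUE), by summing out SPRINKLER."""
    numerator = sum(joint(True, s, True) for s in (True, False))
    denominator = sum(joint(r, s, True)
                      for r, s in product((True, False), repeat=2))
    return numerator / denominator


# The eight joint probabilities must sum to one.
total = sum(joint(r, s, w) for r, s, w in product((True, False), repeat=3))
print(round(total, 10))
print(round(prob_rain_given_wet(), 4))
```

Under these tables the probability of rain given wet grass comes out at roughly 0.36. Enumeration is only practical for very small networks, since the number of terms grows exponentially with the number of variables.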

  5. Graphical Separation and Independence
     The main role of the graph structure is to express the conditional independence relationships among the variables in the model, thus specifying the factorisation of the global distribution.
     Different classes of graphs express these relationships with different semantics, which have in common the principle that graphical separation of two (sets of) nodes implies the conditional independence of the corresponding (sets of) random variables. For the networks considered here, separation is defined as:
     • (u-)separation in Markov networks;
     • d-separation in Bayesian networks.

  6. Graphical Separation
     [Figure: separation in an undirected graph on nodes A, B, C; d-separation in three directed acyclic graphs on nodes A, B, C.]
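
To make d-separation concrete, here is a minimal sketch of one standard way to test it: restrict the DAG to the ancestors of the nodes involved, moralise that subgraph, delete the conditioning nodes and check whether the two nodes are still connected. The slides do not prescribe this algorithm, and the three example graphs (chain, fork, collider) are an assumption about what the figure shows; the dictionary-of-parents encoding and the function names are illustrative.

```python
def ancestors(parents, nodes):
    """Return `nodes` together with all of their ancestors in the DAG."""
    seen, stack = set(), list(nodes)
    while stack:
        v = stack.pop()
        if v not in seen:
            seen.add(v)
            stack.extend(parents.get(v, ()))
    return seen


def d_separated(parents, x, y, given):
    """True if x and y are d-separated by `given`.

    `parents` maps each node to a list of its parents; x, y and `given`
    are assumed to be disjoint.
    """
    given = set(given)
    keep = ancestors(parents, {x, y} | given)
    # Moralise the ancestral subgraph: drop directions, marry co-parents.
    adj = {v: set() for v in keep}
    for child in keep:
        pa = list(parents.get(child, ()))
        for p in pa:
            adj[child].add(p)
            adj[p].add(child)
        for i, p in enumerate(pa):
            for q in pa[i + 1:]:
                adj[p].add(q)
                adj[q].add(p)
    # Look for an undirected path from x to y that avoids `given`.
    seen, stack = set(), [x]
    while stack:
        v = stack.pop()
        if v == y:
            return False
        if v in seen:
            continue
        seen.add(v)
        stack.extend(adj[v] - given)
    return True


# Hypothetical three-node examples.
chain = {"B": ["A"], "C": ["B"]}      # A -> B -> C
fork = {"A": ["B"], "C": ["B"]}       # A <- B -> C
collider = {"B": ["A", "C"]}          # A -> B <- C

for name, dag in (("chain", chain), ("fork", fork), ("collider", collider)):
    print(name, d_separated(dag, "A", "C", []), d_separated(dag, "A", "C", ["B"]))
```

In the chain and the fork, conditioning on B separates A from C; in the collider the opposite holds, since A and C are separated marginally and become connected once B is given.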

  7. Maps and Independence
     A graph $G$ is a dependency map (or D-map) of the probabilistic dependence structure $P$ of $X$ if there is a one-to-one correspondence between the random variables in $X$ and the nodes $V$ of $G$, such that for all disjoint subsets $A$, $B$, $C$ of $X$ we have
       $A \perp\!\!\!\perp_P B \mid C \implies A \perp\!\!\!\perp_G B \mid C$.
     Similarly, $G$ is an independency map (or I-map) of $P$ if
       $A \perp\!\!\!\perp_P B \mid C \impliedby A \perp\!\!\!\perp_G B \mid C$.
     $G$ is said to be a perfect map of $P$ if it is both a D-map and an I-map, that is
       $A \perp\!\!\!\perp_P B \mid C \iff A \perp\!\!\!\perp_G B \mid C$,
     and in this case $P$ is said to be isomorphic to $G$.
     Graphical models are formally defined as I-maps under the respective definitions of graphical separation.

  8. Bayesian Networks, Equivalence Classes and Moral Graphs
     Following the definitions of separation and maps given above, the graph associated with a Bayesian network has three useful transforms:
     • the skeleton: the undirected graph underlying a Bayesian network, i.e. the graph we get if we disregard the direction of the edges;
     • the equivalence class: the graph in which the only directed edges are those that are part of a v-structure (i.e. $A \to C \leftarrow B$) and/or might result in one. All valid combinations of the other edges’ directions result in networks representing the same dependence structure $P$;
     • the moral graph: the graph obtained by disregarding the direction of the edges and joining the two parents in each v-structure with an edge. This is essentially a way to transform a Bayesian network into a Markov network.
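
The transforms above can be sketched in a few lines. The example below is a hedged illustration rather than the author's implementation: the DAG is hypothetical, it is encoded as a dictionary mapping each node to its parents, and the equivalence check relies on the standard result that two DAGs represent the same dependence structure exactly when they share the same skeleton and the same v-structures.

```python
from itertools import combinations


def skeleton(parents):
    """Undirected edges obtained by dropping the direction of every arc."""
    return {frozenset((p, c)) for c, pa in parents.items() for p in pa}


def v_structures(parents):
    """Triples (a, c, b) with a -> c <- b and a, b not adjacent."""
    skel = skeleton(parents)
    return {(a, c, b)
            for c, pa in parents.items()
            for a, b in combinations(sorted(pa), 2)
            if frozenset((a, b)) not in skel}


def moral_graph(parents):
    """Skeleton plus an edge between every pair of parents of a common child."""
    edges = skeleton(parents)
    for pa in parents.values():
        edges |= {frozenset(pair) for pair in combinations(pa, 2)}
    return edges


def same_equivalence_class(dag1, dag2):
    """Same skeleton and same v-structures."""
    return (skeleton(dag1) == skeleton(dag2)
            and v_structures(dag1) == v_structures(dag2))


# Hypothetical DAG: A -> C <- B, C -> D.
dag = {"C": ["A", "B"], "D": ["C"]}
print(v_structures(dag))
print(frozenset(("A", "B")) in moral_graph(dag))

# A -> B -> C and A <- B <- C are equivalent; A -> B <- C is not.
chain = {"B": ["A"], "C": ["B"]}
reversed_chain = {"B": ["C"], "A": ["B"]}
collider = {"B": ["A", "C"]}
print(same_equivalence_class(chain, reversed_chain))
print(same_equivalence_class(chain, collider))
```

Because co-parents that are already adjacent need no extra edge, adding an edge for every pair of parents gives the same moral graph as adding one only for v-structures.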

  9. Equivalence Classes
     [Figure: networks over the nodes STAT, ANL, ALG, MECH and VECT illustrating equivalence classes.]

  10. Factorisation into Local Distributions
      The most important consequence of defining graphical models as I-maps is the factorisation of the global distribution into local distributions:
      • in Markov networks, local distributions are associated with the cliques $C_i$ (maximal subsets of nodes in which each element is adjacent to all the others) in the graph,
        $P(X) = \prod_{i=1}^{k} \psi_i(C_i)$,
        and the $\psi_i$ functions are called potentials;
      • in Bayesian networks, each local distribution is associated with a single node $X_i$ and depends only on its parents $\Pi_{X_i}$:
        $P(X) = \prod_{i=1}^{p} P(X_i \mid \Pi_{X_i})$.
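
As a worked instance of the Bayesian network factorisation, consider the Watson's Lawn network from slide 4, writing $R$, $S$ and $G$ for RAIN, SPRINKLER and GRASS WET (abbreviations introduced here for brevity). The parent sets are read off the conditioning variables of the tables on that slide.

```latex
\begin{align*}
P(R, S, G) &= P(R)\, P(S \mid R)\, P(G \mid S, R), \\
P(R = \text{TRUE},\, S = \text{FALSE},\, G = \text{TRUE})
  &= 0.2 \times 0.99 \times 0.8 = 0.1584 .
\end{align*}
```

The three tables hold 1 + 2 + 4 = 7 free parameters, the same number a full table over three binary variables would need; the savings appear only when nodes have fewer parents than in this fully connected example.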

  11. A Note About Potentials
      Potentials are non-negative functions representing the relative mass of probability of each clique $C_i$. They are proper probability or density functions only when the graph is decomposable or triangulated, that is, when it contains no induced cycles other than triangles.
      With any other type of graph inference becomes very hard, if possible at all, because $\psi_1, \psi_2, \ldots, \psi_k$ have no direct statistical interpretation.
      When the graph is decomposable, the global distribution factorises again according to the chain rule and can be written as
        $P(X) = \dfrac{\prod_{i=1}^{k} P(C_i)}{\prod_{i=1}^{k} P(S_i)}$    (1)
      where $S_i$ are the nodes of $C_i$ which are also part of any other clique up to $C_{i-1}$.
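
Equation (1) can be checked numerically on the smallest interesting decomposable graph, the undirected chain A - B - C, whose cliques are {A, B} and {B, C} with separators $S_1 = \emptyset$ (so $P(S_1) = 1$) and $S_2 = \{B\}$. The sketch below uses made-up probabilities chosen only so that A and C are independent given B; none of the numbers come from the slides.

```python
from itertools import product

# Hypothetical binary distribution that is Markov with respect to A - B - C.
p_a = {0: 0.3, 1: 0.7}
p_b_given_a = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.4, 1: 0.6}}
p_c_given_b = {0: {0: 0.5, 1: 0.5}, 1: {0: 0.2, 1: 0.8}}

# Full joint table, keyed by (a, b, c).
joint = {(a, b, c): p_a[a] * p_b_given_a[a][b] * p_c_given_b[b][c]
         for a, b, c in product((0, 1), repeat=3)}


def marginal(keep):
    """Marginal distribution of the variables at the positions in `keep`."""
    out = {}
    for values, p in joint.items():
        key = tuple(values[i] for i in keep)
        out[key] = out.get(key, 0.0) + p
    return out


p_ab = marginal((0, 1))   # clique C_1 = {A, B}
p_bc = marginal((1, 2))   # clique C_2 = {B, C}
p_b = marginal((1,))      # separator S_2 = {B}

# Largest gap between the joint and P(A, B) P(B, C) / P(B).
max_error = max(abs(joint[a, b, c] - p_ab[a, b] * p_bc[b, c] / p_b[(b,)])
                for a, b, c in product((0, 1), repeat=3))
print(max_error < 1e-12)
```

The check passes only because the joint was built to satisfy the conditional independence encoded by the graph; for an arbitrary joint over three variables the right-hand side of (1) would generally differ from the left.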

  12. Neighbourhoods and Markov Blankets
      Furthermore, for each node $X_i$ two sets are defined:
      • the neighbourhood, the set of nodes that are adjacent to $X_i$. These nodes cannot be made independent from $X_i$;
      • the Markov blanket, the set of nodes that completely separates $X_i$ from the rest of the graph. Generally speaking, it is the set of nodes that includes all the knowledge needed to do inference on $X_i$, from estimation to hypothesis testing to prediction, because all the other nodes are conditionally independent from $X_i$ given its Markov blanket.
      These sets are related in Markov and Bayesian networks; in particular, Markov blankets can be shown to be the same using a moral graph.

  13. Neighbourhoods and Markov Blankets
      [Figure: a Bayesian network and the corresponding Markov network on nodes A, B, C, D, E, F, G, H, K, L. Legend: parents, children, children's other parents, Markov blanket, neighbours.]
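
The legend of the figure suggests how the Markov blanket of a node in a Bayesian network is assembled: parents, children and the children's other parents. The following minimal sketch computes it for a hypothetical DAG (not the one in the figure, whose edges are not recoverable here) and confirms that it coincides with the node's neighbourhood in the moral graph.

```python
def markov_blanket(parents, node):
    """Parents, children and children's other parents of `node` in a DAG."""
    children = {c for c, pa in parents.items() if node in pa}
    spouses = {p for c in children for p in parents[c]}
    return (set(parents.get(node, ())) | children | spouses) - {node}


def moral_neighbours(parents, node):
    """Neighbours of `node` in the moral graph of the DAG.

    In the moral graph each node is adjacent to every other member of any
    family (a child together with its parents) that it belongs to.
    """
    neighbours = set()
    for child, pa in parents.items():
        family = set(pa) | {child}
        if node in family:
            neighbours |= family
    return neighbours - {node}


# Hypothetical DAG: A -> C <- B, C -> D <- E, D -> F.
dag = {"C": ["A", "B"], "D": ["C", "E"], "F": ["D"]}

print(sorted(markov_blanket(dag, "C")))
print(markov_blanket(dag, "C") == moral_neighbours(dag, "C"))
```

For node C the blanket contains the parents A and B, the child D and the other parent of D, namely E, while F is left out because it is separated from C once D is known.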

  14. Markov networks vs Bayesian networks
      Markov networks and Bayesian networks do not appear to be closely related, as they are so different in construction and interpretation.
      • There are indeed dependency models that have an undirected perfect map but not a directed acyclic one, and vice versa.
      • However, it can be shown that every dependency structure that can be expressed by a decomposable graph can be modelled both by a Markov network and a Bayesian network.
      • It can also be shown that every dependency model expressible by an undirected graph is also expressible by a directed acyclic graph, with the addition of some auxiliary nodes.
      These two results indicate that there is a significant overlap between Markov and Bayesian networks, and that in many cases both can be used to the same effect.

  15. Probability Distributions: Discrete and Continuous
      Data used in graphical modelling should respect the following assumptions:
      • if all the variables $X_i$ are discrete, both the global and the local distributions are assumed to be multinomial. Local distributions are described using conditional probability tables;
      • if all the variables $X_i$ are continuous, the global distribution is assumed to be a multivariate Gaussian distribution, and the local distributions are univariate or multivariate Gaussian distributions. Local distributions are described using partial correlation coefficients;
      • if both continuous and discrete variables are present, we can assume a mixture or conditional Gaussian distribution, discretise continuous attributes or use a nonparametric approach.
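
For the Gaussian case, a partial correlation measures the linear association between two variables once a third has been accounted for, and under multivariate normality a zero partial correlation corresponds to conditional independence. The sketch below uses the textbook first-order formula with made-up correlation values; it is an illustration of the quantity named on the slide, not a procedure taken from it.

```python
from math import sqrt


def partial_correlation(r_xy, r_xz, r_yz):
    """First-order partial correlation of X and Y given Z."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical correlations: here r_xy equals r_xz * r_yz, so all of the
# association between X and Y passes through Z.
print(abs(partial_correlation(0.3, 0.6, 0.5)) < 1e-12)

# With a stronger marginal correlation some dependence remains given Z.
print(round(partial_correlation(0.5, 0.6, 0.5), 4))
```

In the first case the corresponding Gaussian Markov network would have no edge between X and Y; in the second the edge would be retained.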

  16. Other Distributional Assumptions
      Other fundamental distributional assumptions are:
      • observations must be independent. If some form of temporal or spatial dependence is present, it must be specifically accounted for in the definition of the network (as in dynamic Bayesian networks);
      • if the model will be used as a causal graphical model, that is, to infer cause-effect relationships from experimental or (more frequently) observational data, there must be no latent or hidden variables that influence the dependence structure of the model;
      • all the relationships between the variables in the network must be conditional independencies, because they are by definition the only ones that can be expressed by graphical models.

  17. A Gaussian Markov Network (MARKS)
      [Figure: Markov network over the nodes mechanics, vectors, algebra, analysis and statistics.]

  18. A Discrete Bayesian Network (ASIA)
      [Figure: Bayesian network over the nodes "visit to Asia?", "smoking?", "tuberculosis?", "lung cancer?", "bronchitis?", "either tuberculosis or lung cancer?", "positive X-ray?" and "dyspnoea?".]
