

  1. Ehsan Nazerfard nazerfard@eecs.wsu.edu October 11, 2011

  2.  Introduction: Graphical Models
   Tutorial: Bayesian Networks Structure Learning Approaches
   RAI: Recursive Autonomy Identification for Bayesian Network Structure Learning
   Summary and Further Studies
   Analysis
   Next Steps
   Discussion Topics

  3. Bayesian Networks Structure Learning: Introduction | Tutorial | RAI
   Three main types of Graphical Models
   Represent a joint probability distribution.
  o Nodes: random variables
  o Edges: statistical dependencies

  4. Bayesian Networks Structure Learning: Introduction | Tutorial | RAI
   Three main types of Graphical Models
   Represent a joint probability distribution.
  o Nodes: random variables
  o Edges: direct influence in directed graphs

  5. Bayesian Networks Structure Learning: Introduction | Tutorial | RAI
   Why do we need Graphical Models?
  o An intuitive way to represent the relations between variables
  o Abstract out the conditional independence relations between variables
   Conditional independence
  o “Is A dependent on B, given the value of C?”
  o A ⊥ B | C  ⇔  P(A | B, C) = P(A | C)
  o A ⊥ B | C  ⇔  P(A, B | C) = P(A | C) P(B | C)
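The two characterizations of conditional independence above are equivalent, which can be checked numerically. A minimal sketch, assuming a toy joint distribution constructed so that A ⊥ B | C (all probability values are illustrative, not from the slides):

```python
import itertools

# Toy joint built so that A ⊥ B | C: p(a, b, c) = p(c) p(a|c) p(b|c)
p_c = {0: 0.4, 1: 0.6}
p_a_c = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.2, 1: 0.8}}   # p_a_c[c][a] = P(A=a | C=c)
p_b_c = {0: {0: 0.5, 1: 0.5}, 1: {0: 0.9, 1: 0.1}}   # p_b_c[c][b] = P(B=b | C=c)

joint = {(a, b, c): p_c[c] * p_a_c[c][a] * p_b_c[c][b]
         for a, b, c in itertools.product([0, 1], repeat=3)}

def p(**fixed):
    """Marginal probability of the partial assignment given by a=, b=, c=."""
    return sum(pr for (a, b, c), pr in joint.items()
               if all(dict(a=a, b=b, c=c)[k] == v for k, v in fixed.items()))

for a, b, c in itertools.product([0, 1], repeat=3):
    # P(A | B, C) = P(A | C)
    assert abs(p(a=a, b=b, c=c) / p(b=b, c=c) - p(a=a, c=c) / p(c=c)) < 1e-12
    # P(A, B | C) = P(A | C) P(B | C)
    assert abs(p(a=a, b=b, c=c) / p(c=c)
               - p(a=a, c=c) / p(c=c) * (p(b=b, c=c) / p(c=c))) < 1e-12
```

Both assertions pass for every assignment, as the definition requires.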

  6. Bayesian Networks Structure Learning: Introduction | Tutorial | RAI
   Given X = (X_1, ..., X_n), a Bayesian network is an annotated DAG that represents a unique JPD over X:
      p(X_1, ..., X_n) = ∏_i p(X_i | Pa(X_i))
   Each node is annotated with a CPT that represents p(X_i | Pa(X_i))
  DAG: Directed Acyclic Graph; JPD: Joint Probability Distribution; CPT: Conditional Probability Table
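The factorization can be made concrete with the smallest possible example. A sketch for a two-node network A → B, with CPT values that are illustrative only:

```python
import itertools

# p(X1,...,Xn) = prod_i p(Xi | Pa(Xi)); here Pa(A) = {} and Pa(B) = {A}.
cpt_a = {0: 0.3, 1: 0.7}                                # P(A)
cpt_b = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.4, 1: 0.6}}      # cpt_b[a][b] = P(B=b | A=a)

def joint(a, b):
    # The product of the two CPT entries gives the joint probability.
    return cpt_a[a] * cpt_b[a][b]

# The product of CPTs defines a valid JPD: it sums to one.
total = sum(joint(a, b) for a, b in itertools.product([0, 1], repeat=2))
assert abs(total - 1.0) < 1e-12
```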

  7. Bayesian Networks Structure Learning: Introduction | Tutorial | RAI
   Structure Learning
  o Find a structure of the Bayesian Network (BN) that best describes the observed data.
   Parameter Learning
  o Learning the parameters when the structure is known.
   Generally, parameter learning is a part of structure learning.

  8. Bayesian Networks Structure Learning: Introduction | Tutorial | RAI
   Goal
  o Find the network structure of the BN that best describes the observed data.
   NP-Complete !!
  o Naïve Bayes …
  o Using domain knowledge
  o Assumptions to make the problem tractable
  o The textbooks (generally) assume the network is already known.
  o …

  9. Bayesian Networks Structure Learning: Introduction | Tutorial | RAI
   Two main categories:
  1) Score and Search-Based (S&S) approach
  o Learning the network structures
  2) Constraint-Based (CB) approach
  o Learning the edges composing a structure

  10. Bayesian Networks Structure Learning: Introduction | Tutorial | RAI
   Three main issues with the S&S approach:
  1. Search space
  2. Search strategy
  3. Model selection criterion

  11. Bayesian Networks Structure Learning: Introduction | Tutorial | RAI
   Number of possible DAGs containing n nodes:
      f(n) = Σ_{i=1}^{n} (−1)^{i+1} · C(n, i) · 2^{i(n−i)} · f(n−i),   f(0) = 1
   Curse of dimensionality
  # of variables | # of possible DAGs
   1  | 1
   2  | 3
   3  | 25
   …  | …
   8  | 783,702,329,343
   9  | 1,213,442,454,842,881
  10  | 4,175,098,976,430,598,100

  12. Bayesian Networks Structure Learning: Introduction | Tutorial | RAI
   Any search method from Artificial Intelligence:
  o DFS, BFS, Best-First Search, Simulated Annealing
  o A-star and IDA-star
  o …
   How is the neighborhood defined?
  o Current structure + adding, deleting or reversing an arc
  o No cycle is allowed
   K2 algorithm [3]
  o Greedy search (total ordering is known)

  13. Bayesian Networks Structure Learning: Introduction | Tutorial | RAI
   Scoring function
  o Evaluates how well a given network G matches the data D.
  o The best BN is the one that maximizes the scoring function.
  o Based on ML: G* = arg max_G p(D | Θ, G)
   Most frequently used: Bayesian Information Criterion (BIC) [4]
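BIC penalizes the maximized log-likelihood by model complexity: BIC(G) = log p(D | Θ̂, G) − (k/2) · log N, where k is the number of free parameters of G and N the sample size. A minimal sketch for the degenerate case of a single binary node (so k = 1); the function name and setup are ours, for illustration only:

```python
from math import log

def bic_single_binary(data):
    """BIC score for a one-node 'network' over binary data (illustrative)."""
    n1 = sum(data)
    n0 = len(data) - n1
    theta = n1 / len(data)                          # ML parameter estimate
    loglik = n1 * log(theta) + n0 * log(1 - theta)  # log p(D | theta_hat)
    k = 1                                           # one free parameter
    return loglik - (k / 2) * log(len(data))        # penalized likelihood
```

In a full structure learner the same penalized score is computed per candidate graph, with k growing with the number and size of its CPTs.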

  14. Bayesian Networks Structure Learning: Introduction | Tutorial | RAI
   Input: Observational data set
   Output: The resulting Bayesian network
  Score and Search-Based approach – Pseudo code
  1 Generate the initial BN (random or from domain knowledge), evaluate it and set it as the current network.
  2 Evaluate the neighbors of the current BN.
  3 If the best score of the neighbors is better than the score of the current BN, set the neighbor with the best score as the current network and go to step 2.
  4 Else stop the learning process.
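The pseudocode above can be sketched as a self-contained greedy hill-climber. The neighborhood is the add/delete/reverse-one-arc move set with an acyclicity check; the scoring function here is a toy stand-in (a real implementation would use BIC or another criterion from the previous slide), and all names are illustrative:

```python
from itertools import permutations

def acyclic(edges, n):
    """DFS cycle check over nodes 0..n-1 (white/grey/black coloring)."""
    color = [0] * n
    def dfs(u):
        color[u] = 1
        for x, v in edges:
            if x == u and (color[v] == 1 or (color[v] == 0 and dfs(v))):
                return True
        color[u] = 2
        return False
    return not any(color[u] == 0 and dfs(u) for u in range(n))

def neighbors(edges, n):
    """Neighborhood: add, delete, or reverse one arc; no cycle allowed."""
    out = []
    for u, v in permutations(range(n), 2):
        if (u, v) in edges:
            out.append(edges - {(u, v)})                  # delete arc
            out.append((edges - {(u, v)}) | {(v, u)})     # reverse arc
        elif (v, u) not in edges:
            out.append(edges | {(u, v)})                  # add arc
    return [e for e in out if acyclic(e, n)]

def hill_climb(n, score):
    current = frozenset()                        # step 1: initial (empty) network
    while True:
        cands = neighbors(current, n)            # step 2: evaluate neighbors
        if not cands:
            return current
        best = max(cands, key=score)             # step 3: best-scoring neighbor
        if score(best) <= score(current):
            return current                       # step 4: no improvement -> stop
        current = frozenset(best)

# Toy score: reward arcs of a hypothetical "true" structure, penalize others.
true_edges = {(0, 1), (1, 2)}
score = lambda e: len(e & true_edges) - len(e - true_edges)
learned = hill_climb(3, score)
assert learned == true_edges
```

Note the loop can only stop at a local optimum of the score, which is why restarts or simulated annealing are listed among the search strategies.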

  15. Bayesian Networks Structure Learning: Introduction | Tutorial | RAI
   Two main categories:
  1) Score and Search-Based (S&S) approach
  o Learning the network structures
  2) Constraint-Based (CB) approach
  o Learning the edges composing a structure

  16. Bayesian Networks Structure Learning: Introduction | Tutorial | RAI
   Learning the edges of a structure
  o Discovering the conditional independence (CI) relations from the data
  o Inferring the structure from the learned relations

  17. Bayesian Networks Structure Learning: Introduction | Tutorial | RAI
   “Is A dependent on B, given the value of C?”
   Examples *
  o child’s genes ⊥ grandparents’ genes | parents’ genes
  o amount of speeding fine ⊥ type of car | speed
  o lung cancer ⊥ yellow teeth | smoker
  o …
  * Borrowed from Dr. Zoubin Ghahramani’s GM Tutorial

  18. Bayesian Networks Structure Learning: Introduction | Tutorial | RAI
   Example
  o Child’s genes and his grandparents’ genes
  o A ⊥ D | B
   Variable B d-separates A and D

  19. Bayesian Networks Structure Learning: Introduction | Tutorial | RAI
   Example: Rolling two dice …
  o B ⊥ C | ∅
  o B ⊥̸ C | D  → V-Structure
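The v-structure behavior can be checked by simulation. A sketch assuming B and C are the two independent die rolls and D = B + C is their common effect (the slide's figure is not available, so this reading is an assumption):

```python
import random

random.seed(0)
rolls = [(random.randint(1, 6), random.randint(1, 6)) for _ in range(200_000)]

# Marginal independence: P(B=6, C=6) ≈ P(B=6) * P(C=6) = 1/36
p_b6 = sum(b == 6 for b, _ in rolls) / len(rolls)
p_c6 = sum(c == 6 for _, c in rolls) / len(rolls)
p_b6c6 = sum(b == 6 and c == 6 for b, c in rolls) / len(rolls)
assert abs(p_b6c6 - p_b6 * p_c6) < 0.01

# Conditioning on the common effect D = 7 couples them: C = 7 - B exactly.
given_7 = [(b, c) for b, c in rolls if b + c == 7]
p_c6_d7 = sum(c == 6 for _, c in given_7) / len(given_7)          # ≈ 1/6
n_b1 = sum(b == 1 for b, _ in given_7)
p_c6_d7_b1 = sum(c == 6 for b, c in given_7 if b == 1) / n_b1     # exactly 1
assert p_c6_d7_b1 == 1.0 and abs(p_c6_d7 - 1/6) < 0.02
```

P(C=6 | D=7) ≈ 1/6 but P(C=6 | D=7, B=1) = 1, so B and C are dependent given D even though they are marginally independent.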

  20. Bayesian Networks Structure Learning: Introduction | Tutorial | RAI
   Example: Dice example …
  o C: random numbers are in [1, 6]
  o D ⊥ E | C

  21. Bayesian Networks Structure Learning: Introduction | Tutorial | RAI
   Is B conditionally independent of C, given E?
  o B ⊥ C | E ?

  22. Bayesian Networks Structure Learning: Introduction | Tutorial | RAI
   Conditioning on no single variable makes C and D independent.
  o C ⊥ D | { B, E }

  23. Bayesian Networks Structure Learning: Introduction | Tutorial | RAI
   Differences between CB algorithms
  o Completeness and complexity
   Algorithms (not limited to)
  o TPDA: Three Phase Dependency Analysis, 1997
  o SC: Sparse Candidate, 1999
  o IC: Inductive Causation, 2000
  o PC: Peter Spirtes and Clark Glymour, 2000
  o MMHC: Max-Min Hill-Climbing, 2006
  o RAI: Recursive Autonomy Identification, 2009

  24. Bayesian Networks Structure Learning: Introduction | Tutorial | RAI
  Title: Bayesian Network Structure Learning by Recursive Autonomy Identification
  Authors: Raanan Yehezkel and Boaz Lerner
  Journal of Machine Learning Research (2009), pp. 1527-1570

  25. Bayesian Networks Structure Learning: Introduction | Tutorial | RAI
   Conditional independence tests
   Edge direction (orientation rule)
   Structure decomposition
  o Diminishes the curse of dimensionality problem

  26. Bayesian Networks Structure Learning: Introduction | Tutorial | RAI
   d-separation resolution (X, Y)
  o Size of the smallest condition set that d-separates X and Y
   d-separation resolution (G)
  o The highest d-separation resolution in the graph

  27. Bayesian Networks Structure Learning: Introduction | Tutorial | RAI
   Given G, any two non-adjacent nodes in an autonomous sub-structure G_A are d-separated given nodes either included in G_A or its exogenous causes.
   Formally: for all non-adjacent X, Y in G_A, ∃ S ⊆ { V_A ∪ V_ex } s.t. X ⊥ Y | S

  28. Bayesian Networks Structure Learning: Introduction | Tutorial | RAI
   Input: Observational data set
   Output: Partial DAG representing the Markov equivalence class
  RAI algorithm – Pseudo code
  0 Start from a complete undirected graph.
  * Repeat steps 1 to 3 from low to high graph d-separation resolution, until a stopping criterion is met (e.g. CI test threshold):
  1 Test CI between nodes, followed by removal of the edges related to independence.
  2 Direct edges according to orientation rules (not always possible).
  3 Decompose the graph into autonomous sub-structures.
  * For each sub-structure, apply RAI recursively (steps 1 to 3), while increasing the order of CI testing.
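Step 1's order-increasing CI testing can be sketched in isolation. This is not the full RAI algorithm (no orientation or decomposition), and the `ci_test` oracle and all names are hypothetical; it amounts to a PC-style skeleton pass, starting from the complete undirected graph and growing the condition-set size:

```python
from itertools import combinations

def skeleton(nodes, ci_test):
    """Drop edge X--Y as soon as some condition set S of the current order
    renders X and Y independent; ci_test(x, y, S) returns True iff X ⊥ Y | S."""
    edges = {frozenset(pair) for pair in combinations(nodes, 2)}
    for order in range(len(nodes) - 1):        # condition-set size, low to high
        for e in list(edges):
            x, y = sorted(e)
            # candidate separators: current neighbours of x, excluding y
            adj = {n for f in edges if x in f for n in f - {x}} - {y}
            for S in combinations(sorted(adj), order):
                if ci_test(x, y, set(S)):      # X ⊥ Y | S  ->  remove the edge
                    edges.discard(e)
                    break
    return edges

# Hypothetical CI oracle for the chain A -> B -> C: the only conditional
# independence is A ⊥ C | B.
def chain_ci(x, y, S):
    return {x, y} == {"A", "C"} and "B" in S

assert skeleton(["A", "B", "C"], chain_ci) == {frozenset("AB"), frozenset("BC")}
```

RAI's contribution is to interleave this testing with orientation and recursive decomposition into autonomous sub-structures, so high-order CI tests are only run inside small sub-graphs.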

  29. Bayesian Networks Structure Learning: Introduction | Tutorial | RAI
