Ehsan Nazerfard nazerfard@eecs.wsu.edu October 11, 2011
o Introduction: Graphical Models
o Tutorial: Bayesian Networks Structure Learning Approaches
o RAI: Recursive Autonomy Identification for Bayesian Network Structure Learning
o Summary and Further Studies
o Analysis
o Next Steps
o Discussion Topics
Bayesian Networks Structure Learning: Introduction | Tutorial | RAI
Three main types of Graphical Models
Represent a joint probability distribution.
o Nodes: random variables
o Edges: statistical dependencies (direct influence in directed graphs)
Why do we need Graphical Models?
o An intuitive way of representing the relations between variables
o Abstract out the conditional independence relations between variables
Conditional independence
o "Is A dependent on B, given the value of C?"
o $A \perp B \mid C \iff P(A \mid B, C) = P(A \mid C)$
o $A \perp B \mid C \iff P(A, B \mid C) = P(A \mid C)\,P(B \mid C)$
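The factorization form of conditional independence can be checked numerically on a small joint table. The sketch below is only an illustration: the function name `is_cond_independent`, the two toy distributions and the tolerance are assumptions, not part of the slides.

```python
from itertools import product

def is_cond_independent(joint, tol=1e-9):
    """joint maps (a, b, c) -> P(A=a, B=b, C=c).
    Returns True if P(a, b | c) == P(a | c) * P(b | c) for every a, b, c."""
    a_vals = {a for a, _, _ in joint}
    b_vals = {b for _, b, _ in joint}
    c_vals = {c for _, _, c in joint}
    for c in c_vals:
        p_c = sum(joint.get((a, b, c), 0.0) for a in a_vals for b in b_vals)
        if p_c == 0:
            continue  # conditioning on an impossible value is skipped
        for a, b in product(a_vals, b_vals):
            p_ab = joint.get((a, b, c), 0.0) / p_c
            p_a = sum(joint.get((a, bb, c), 0.0) for bb in b_vals) / p_c
            p_b = sum(joint.get((aa, b, c), 0.0) for aa in a_vals) / p_c
            if abs(p_ab - p_a * p_b) > tol:
                return False
    return True

# Toy case 1 (assumed numbers): A and B each depend only on C.
p_a1 = {0: 0.2, 1: 0.9}  # P(A=1 | C=c)
p_b1 = {0: 0.1, 1: 0.7}  # P(B=1 | C=c)
joint_ci = {}
for a, b, c in product((0, 1), repeat=3):
    pa = p_a1[c] if a == 1 else 1 - p_a1[c]
    pb = p_b1[c] if b == 1 else 1 - p_b1[c]
    joint_ci[(a, b, c)] = 0.5 * pa * pb  # P(C=c) = 0.5

# Toy case 2 (assumed numbers): B is always equal to A, regardless of C.
joint_dep = {(a, a, c): 0.25 for a in (0, 1) for c in (0, 1)}

print(is_cond_independent(joint_ci))
print(is_cond_independent(joint_dep))
```

In the first table the check succeeds because the joint was built as P(C) P(A | C) P(B | C); in the second, knowing A fixes B even after C is given, so the check fails.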
Given $X = (X_1, \dots, X_n)$, a Bayesian network is an annotated DAG that represents a unique JPD over $X$:
$$p(X_1, \dots, X_n) = \prod_i p(X_i \mid Pa(X_i))$$
Each node is annotated with a CPT that represents $p(X_i \mid Pa(X_i))$.
o DAG: Directed Acyclic Graph
o JPD: Joint Probability Distribution
o CPT: Conditional Probability Table
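As a rough illustration of this factorization, the sketch below multiplies CPT entries to get one joint probability. The three-node network (Rain, Sprinkler, WetGrass) and all CPT numbers are made up for the example and do not come from the slides.

```python
from itertools import product

# Hypothetical DAG: Rain -> WetGrass <- Sprinkler
parents = {"Rain": [], "Sprinkler": [], "WetGrass": ["Rain", "Sprinkler"]}

# cpt[node][parent values][node value] = p(node value | parent values)
cpt = {
    "Rain": {(): {True: 0.2, False: 0.8}},
    "Sprinkler": {(): {True: 0.3, False: 0.7}},
    "WetGrass": {
        (True, True): {True: 0.99, False: 0.01},
        (True, False): {True: 0.9, False: 0.1},
        (False, True): {True: 0.8, False: 0.2},
        (False, False): {True: 0.0, False: 1.0},
    },
}

def joint_probability(assignment):
    """p(X_1..X_n) as the product of p(X_i | Pa(X_i)) over all nodes."""
    p = 1.0
    for node, pa in parents.items():
        key = tuple(assignment[q] for q in pa)
        p *= cpt[node][key][assignment[node]]
    return p

example = {"Rain": True, "Sprinkler": False, "WetGrass": True}
print(joint_probability(example))  # 0.2 * 0.7 * 0.9

# The factorized joint sums to 1 over all assignments.
total = sum(
    joint_probability(dict(zip(parents, values)))
    for values in product((True, False), repeat=len(parents))
)
print(total)
```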
Structure Learning
o Find the Bayesian network (BN) structure that best describes the observed data.
Parameter Learning
o Learn the parameters when the structure is known.
Generally, parameter learning is a part of structure learning.
Goal
o Find the BN structure that best describes the observed data.
NP-complete!
o Naïve Bayes …
o Using domain knowledge
o Assumptions to make the problem tractable
o Textbooks (generally) assume the network is already known.
o …
Two main categories:
1) Score and Search-Based (S&S) approach
o Learning the network structures
2) Constraint-Based (CB) approach
o Learning the edges composing a structure
Three main issues with the S&S approach:
1. Search space
2. Search strategy
3. Model selection criterion
Number of possible DAGs containing $n$ nodes:
$$f(n) = \sum_{i=1}^{n} (-1)^{i+1} \binom{n}{i} 2^{i(n-i)} f(n-i)$$
Curse of dimensionality

# of variables   # of possible DAGs
1                1
2                3
3                25
…                …
8                783,702,329,343
9                1,213,442,454,842,881
10               4,175,098,976,430,598,143
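The recurrence can be evaluated directly with exact integer arithmetic. This is a minimal sketch that assumes the usual base case f(0) = 1, which the slide does not state; the function name `count_dags` is chosen for the example.

```python
from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def count_dags(n):
    """Number of labeled DAGs on n nodes, using the recurrence above."""
    if n == 0:
        return 1  # assumed base case: one (empty) graph on zero nodes
    return sum(
        (-1) ** (i + 1) * comb(n, i) * 2 ** (i * (n - i)) * count_dags(n - i)
        for i in range(1, n + 1)
    )

for n in range(1, 11):
    print(n, count_dags(n))
```

Python integers are arbitrary precision, so the counts for larger n are not rounded the way they would be in floating point.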
Any search method from Artificial Intelligence:
o DFS, BFS, Best-First Search, Simulated Annealing
o A-star and IDA-star
o …
How is the neighborhood defined?
o Current structure + adding, deleting or reversing an arc
o No cycle is allowed
K2 algorithm [3]
o Greedy search (total ordering is known)
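The neighborhood definition above (add, delete or reverse one arc, with no cycles) could be sketched as follows. The representation of a structure as a set of `(parent, child)` pairs and the helper names `is_acyclic` and `neighbors` are assumptions made for this illustration.

```python
from itertools import permutations

def is_acyclic(nodes, edges):
    """Kahn-style check: a graph is a DAG if every node can be removed
    in topological order."""
    indegree = {n: 0 for n in nodes}
    for _, child in edges:
        indegree[child] += 1
    ready = [n for n in nodes if indegree[n] == 0]
    removed = 0
    while ready:
        u = ready.pop()
        removed += 1
        for parent, child in edges:
            if parent == u:
                indegree[child] -= 1
                if indegree[child] == 0:
                    ready.append(child)
    return removed == len(nodes)

def neighbors(nodes, edges):
    """All DAGs reachable by adding, deleting or reversing one arc."""
    edges = frozenset(edges)
    result = []
    for u, v in permutations(nodes, 2):
        if (u, v) in edges:
            result.append(edges - {(u, v)})            # delete u -> v
            reversed_arc = (edges - {(u, v)}) | {(v, u)}
            if is_acyclic(nodes, reversed_arc):        # reverse u -> v
                result.append(reversed_arc)
        elif (v, u) not in edges:
            added = edges | {(u, v)}                   # add u -> v
            if is_acyclic(nodes, added):
                result.append(added)
    return result

nodes = ["A", "B", "C"]
current = {("A", "B"), ("B", "C")}
for g in neighbors(nodes, current):
    print(sorted(g))
```

For the chain A -> B -> C this yields five neighbors; adding C -> A is rejected because it would close a cycle.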
Scoring function
o Evaluates how well a given network $G$ matches the data $D$.
o The best BN is the one that maximizes the scoring function.
o Based on ML: $\hat{\theta}_G = \arg\max_{\theta_G} p(D \mid \theta_G, G)$
Most frequently used: Bayesian Information Criterion (BIC) [4]
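One common form of the BIC score for discrete data is the maximized log-likelihood minus (k/2) log N, where k is the number of free parameters and N the number of samples. The sketch below assumes that form, complete data given as a list of dicts, and a hypothetical `arity` map with the number of states per variable; none of these details are taken from the slides.

```python
from collections import Counter
from math import log

def bic_score(data, parents, arity):
    """BIC = log-likelihood at the ML parameters - (k / 2) * log(N)."""
    n_samples = len(data)
    log_likelihood = 0.0
    n_params = 0
    for node, pa in parents.items():
        # Counts of (parent configuration, node value) and of parent configuration.
        joint = Counter((tuple(row[p] for p in pa), row[node]) for row in data)
        marginal = Counter(tuple(row[p] for p in pa) for row in data)
        for (pa_values, _), count in joint.items():
            # The ML estimate of p(value | parents) is count / marginal count.
            log_likelihood += count * log(count / marginal[pa_values])
        n_configs = 1
        for p in pa:
            n_configs *= arity[p]
        n_params += (arity[node] - 1) * n_configs
    return log_likelihood - 0.5 * n_params * log(n_samples)

# Toy data (made up): B always copies A.
data = [{"A": 0, "B": 0}] * 50 + [{"A": 1, "B": 1}] * 50
arity = {"A": 2, "B": 2}

with_edge = bic_score(data, {"A": [], "B": ["A"]}, arity)
without_edge = bic_score(data, {"A": [], "B": []}, arity)
print(with_edge, without_edge)
```

On this toy data the structure with the arc A -> B scores higher, since the gain in likelihood outweighs the penalty for the extra parameter.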
Input: Observational data set
Output: The resulting Bayesian network
Score and Search-Based approach: pseudo code
1 Generate the initial BN (random or from domain knowledge), evaluate it and set it as the current network.
2 Evaluate the neighbors of the current BN.
3 If the best score of the neighbors is better than the score of the current BN, set the neighbor with the best score as the current network and go to step 2.
4 Else stop the learning process.
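The four steps above amount to greedy hill climbing, which can be sketched generically. In the example below, `toy_neighbors` and `toy_score` are stand-ins invented for the demonstration (toggle one of three candidate arcs; score by closeness to a hypothetical target structure); a real learner would plug in a data-driven score and a neighborhood with a cycle check.

```python
def hill_climb(initial, neighbors, score):
    """Greedy search: move to the best neighbor while it improves the score."""
    current, current_score = initial, score(initial)       # step 1
    while True:
        candidates = neighbors(current)                     # step 2
        if not candidates:
            return current, current_score
        best = max(candidates, key=score)
        best_score = score(best)
        if best_score <= current_score:                     # step 4
            return current, current_score
        current, current_score = best, best_score           # step 3

# Toy setup (all names and values below are illustrative assumptions).
CANDIDATE_ARCS = [("A", "B"), ("B", "C"), ("A", "C")]
TARGET = frozenset({("A", "B"), ("B", "C")})

def toy_neighbors(edges):
    # Toggle each candidate arc: add it if absent, delete it if present.
    return [edges ^ {arc} for arc in CANDIDATE_ARCS]

def toy_score(edges):
    # Higher is better: minus the number of arcs that differ from TARGET.
    return -len(edges ^ TARGET)

best, best_score = hill_climb(frozenset(), toy_neighbors, toy_score)
print(sorted(best), best_score)
```

Like any greedy search, this stops at the first structure with no better neighbor, which may be a local optimum rather than the best network.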
Constraint-Based approach: learning the edges of a structure
o Discover the conditional independence (CI) relations from the data
o Infer the structure from the learned relations
"Is A dependent on B, given the value of C?"
Examples *
o child's genes $\perp$ grandparents' genes $\mid$ parents' genes
o amount of speeding fine $\perp$ type of car $\mid$ speed
o lung cancer $\perp$ yellow teeth $\mid$ smoker
o …
* Borrowed from Dr. Zoubin Ghahramani's GM Tutorial
Example
o Child's genes and his grandparents' genes
o $A \perp D \mid B$
Variable B d-separates A and D.
Example: Rolling two dice …
o $B \perp C \mid \emptyset$
o $B \not\perp C \mid D$
V-structure
Example: Dice example …
o C: random numbers are in [1,6]
o $D \perp E \mid C$
Is B conditionally independent of C, given E?
o $B \perp C \mid E$ ?
No single conditioning variable makes C and D independent.
o $C \perp D \mid \{B, E\}$
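Statements like these can be checked mechanically once a DAG is fixed. The sketch below uses the standard moralized-ancestral-graph test for d-separation. The two small graphs (a chain A -> B -> D and a collider B -> D <- C) are assumed shapes for the genes and dice examples, since the slide figures are not available, and the function names are illustrative.

```python
def ancestors_of(parents, nodes):
    """The given nodes together with all of their ancestors."""
    seen = set(nodes)
    stack = list(nodes)
    while stack:
        n = stack.pop()
        for p in parents.get(n, ()):
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen

def d_separated(parents, x, y, given):
    """True if `given` d-separates x and y in the DAG described by `parents`."""
    given = set(given)
    keep = ancestors_of(parents, {x, y} | given)
    # Moralize the ancestral subgraph: link each child to its parents,
    # link co-parents to each other, and drop edge directions.
    adjacent = {n: set() for n in keep}
    for child in keep:
        pa = list(parents.get(child, ()))
        for p in pa:
            adjacent[child].add(p)
            adjacent[p].add(child)
        for i, p in enumerate(pa):
            for q in pa[i + 1:]:
                adjacent[p].add(q)
                adjacent[q].add(p)
    # x and y are d-separated iff no path connects them that avoids `given`.
    stack, seen = [x], {x}
    while stack:
        n = stack.pop()
        if n == y:
            return False
        for m in adjacent[n]:
            if m not in seen and m not in given:
                seen.add(m)
                stack.append(m)
    return True

chain = {"B": ["A"], "D": ["B"]}   # assumed: A -> B -> D
collider = {"D": ["B", "C"]}       # assumed: B -> D <- C

print(d_separated(chain, "A", "D", {"B"}))
print(d_separated(collider, "B", "C", set()))
print(d_separated(collider, "B", "C", {"D"}))
```

In the chain, conditioning on the middle node blocks the path; in the collider, the two causes are independent until their common effect is observed.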
Differences between CB algorithms
o Completeness and complexity
Algorithms (not limited to)
o TPDA: Three Phase Dependency Analysis, 1997
o SC: Sparse Candidate, 1999
o IC: Inductive Causation, 2000
o PC: Peter Spirtes and Clark Glymour, 2000
o MMHC: Max-Min Hill-Climbing, 2006
o RAI: Recursive Autonomy Identification, 2009
Title
o Bayesian Network Structure Learning by Recursive Autonomy Identification
Authors
o Raanan Yehezkel and Boaz Lerner
Journal of Machine Learning Research (2009), pp. 1527-1570
Conditional independence tests
Edge direction (orientation rule)
Structure decomposition
o Diminishes the curse of dimensionality problem
d-separation resolution (X, Y)
o Size of the smallest condition set that d-separates X and Y
d-separation resolution (G)
o The highest d-separation resolution in the graph
Given an autonomous sub-structure $G^A$ of a graph $G$, any two non-adjacent nodes in $G^A$ are d-separated given nodes either included in $G^A$ or its exogenous causes.
Formally: ... $\exists\, S \subset \{V^A \cup V_{ex}\}$ s.t. $X \perp Y \mid S$
Input: Observational data set
Output: Partial DAG representing the Markov equivalence class
RAI algorithm: pseudo code
0 Start from a complete undirected graph.
* Repeat steps 1 to 3 from low to high graph d-separation resolution, until the stopping criterion is met (e.g. CI test threshold).
1 Test CI between nodes, then remove the edges related to independence.
2 Direct edges according to orientation rules (not always possible).
3 Decompose the graph into autonomous sub-structures.
* For each sub-structure, apply RAI recursively (steps 1 to 3), while increasing the order of CI testing.
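To make step 0 and step 1 concrete, here is a deliberately simplified sketch of CI-driven edge removal with an increasing conditioning-set size. It is not the RAI algorithm: orientation (step 2), decomposition (step 3) and the recursion are left out, conditioning sets are drawn from the current neighbors of either endpoint as a simplification, and the `independent` argument is a hypothetical oracle standing in for a statistical CI test on data.

```python
from itertools import combinations

def prune_edges(nodes, independent, max_order):
    """Start from a complete undirected graph and drop the edge x - y as soon
    as some conditioning set of the current order makes x and y independent."""
    adjacent = {n: set(nodes) - {n} for n in nodes}     # step 0
    for order in range(max_order + 1):                  # low to high order
        for x, y in combinations(nodes, 2):
            if y not in adjacent[x]:
                continue
            candidates = (adjacent[x] | adjacent[y]) - {x, y}
            for cond in combinations(sorted(candidates), order):
                if independent(x, y, set(cond)):        # step 1
                    adjacent[x].discard(y)
                    adjacent[y].discard(x)
                    break
    return {frozenset((x, y)) for x in nodes for y in adjacent[x]}

# Hypothetical oracle for a chain A -> B -> C: the only independence
# is between A and C once B is given.
def chain_oracle(x, y, cond):
    return {x, y} == {"A", "C"} and "B" in cond

skeleton = prune_edges(["A", "B", "C"], chain_oracle, max_order=1)
print(sorted(sorted(edge) for edge in skeleton))
```

With the chain oracle, the edge A - C survives the order-0 tests and is removed at order 1, leaving the undirected skeleton A - B - C.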