CS 331: Bayesian Networks 2

Bayesian Networks
• You’ve heard about how Bayesian networks have revolutionized AI
• You’ve seen what they are
• There are two nagging questions:
  1. How do you come up with a Bayesian network structure?
  2. How do you do inference on Bayesian networks?
• We will deal with the first one today…

Bayesian Network Topology
• So how do you come up with the Bayesian network structure?
• Two options:
  1. Design by hand
  2. Learn it from data

Designing Bayesian Networks By Hand

Getting an Expert to Design the Network by Hand
• Could get a domain expert to help design the Bayesian network
• Need the domain expert to come up with:
  1. Network Topology
  2. Parameters (i.e. probabilities) in the conditional probability tables

Designing the Network Topology
• Key point: a Bayesian network exploits conditional independence to produce a compact representation of the full joint distribution
• Compactness is due to the fact that a Bayesian network is a locally structured system
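To make the compactness claim concrete, here is a small worked count. It assumes n Boolean variables in which every node has at most k parents; the symbols n and k and the numbers 30 and 5 are chosen here for illustration and are not from the slides.

```latex
% Factorization that a Bayesian network encodes
P(x_1, \dots, x_n) = \prod_{i=1}^{n} P\bigl(x_i \mid \mathrm{parents}(X_i)\bigr)

% Independent numbers needed for n Boolean variables
\text{full joint table: } 2^{n} - 1
\qquad
\text{Bayesian network with at most } k \text{ parents per node: at most } n \cdot 2^{k}

% Illustrative values n = 30, k = 5
2^{30} - 1 = 1\,073\,741\,823
\qquad \text{versus} \qquad
30 \cdot 2^{5} = 960
```

The saving comes entirely from k being small, which is what "locally structured" means in the slide above.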
Locally Structured Systems

What If The Network is Densely Connected?
• Then your representation can’t take advantage of conditional independence for compactness
• Possible but unlikely
• Could drop a few links (sacrifice accuracy for compactness)

Constructing a Locally Structured Bayesian Network
• Needs:
  1. Each variable to be directly influenced by only a few others
  2. Parents are the direct influences of a node
• Process:
  – Add “root causes” first
  – Then the variables they influence
  – Keep going until you reach the “leaves”, which do not have a direct causal influence on the other variables

Choosing the Wrong Order
• What happens if you add nodes in the wrong order?
[Figure: two networks over Burglary, Earthquake, Alarm, JohnCalls and MaryCalls. Compact network: Burglary and Earthquake above Alarm, with JohnCalls and MaryCalls below it. Not-So-Compact Network: JohnCalls and MaryCalls above Alarm, with Burglary and Earthquake below it.]
• The not-so-compact network has two more links
• Some links result in conditional probability tables that require unnatural/difficult probability judgments, e.g. P(Earthquake | Burglary, Alarm)
• Note: Both networks can represent the same joint probability distribution. The problem is that the not-so-compact network (on the right in the figure) doesn’t represent all the conditional independence relationships, and some of its links need not be there
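The cost of a bad node ordering can be counted directly. The sketch below assumes all five variables are Boolean and takes both parent sets from the standard textbook version of the burglary example (the figure itself did not survive here, so the exact links are an assumption, chosen to agree with the "two more links" remark and with the P(Earthquake | Burglary, Alarm) table mentioned above). The helper names `num_links` and `num_parameters` are hypothetical.

```python
# ASSUMPTION: parent sets follow the standard textbook burglary example.
# Compact network: nodes added in causal order (root causes first).
compact = {
    "Burglary": [],
    "Earthquake": [],
    "Alarm": ["Burglary", "Earthquake"],
    "JohnCalls": ["Alarm"],
    "MaryCalls": ["Alarm"],
}

# ASSUMPTION: nodes added in the order MaryCalls, JohnCalls, Alarm,
# Burglary, Earthquake, which forces extra links between the "symptoms".
not_so_compact = {
    "MaryCalls": [],
    "JohnCalls": ["MaryCalls"],
    "Alarm": ["MaryCalls", "JohnCalls"],
    "Burglary": ["Alarm"],
    "Earthquake": ["Burglary", "Alarm"],
}


def num_links(parents):
    """Total number of directed links in the network."""
    return sum(len(p) for p in parents.values())


def num_parameters(parents):
    """Independent CPT entries for Boolean nodes: one per parent setting."""
    return sum(2 ** len(p) for p in parents.values())


for name, net in [("compact", compact), ("not-so-compact", not_so_compact)]:
    print(name, "links:", num_links(net), "parameters:", num_parameters(net))

# Full joint table over 5 Boolean variables, for comparison.
print("full joint parameters:", 2 ** 5 - 1)
```

Under these assumptions both networks are still far smaller than the full joint table, but the wrong order pays for its extra links with extra, harder-to-elicit CPT entries.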
Diagnostic versus Causal models
• Build causal models, i.e. a link from Node X to Node Y indicates X causes Y
• Don’t build diagnostic models, i.e. links that go from symptoms to causes
• Diagnostic models result in additional dependencies between otherwise independent causes
• Causal models result in fewer parameters, and parameters that are easier to come up with

Designing the Parameters in the Bayesian Network
• As was mentioned previously, make sure the probabilities in the CPT are natural and easy for an expert to come up with
• E.g. P(Earthquake | Burglary, Alarm) is not natural but P(Alarm | Burglary, Earthquake) is
• In general, coming up with these probabilities can be tricky
• E.g. a physician can’t tell you exactly what P(Headache | Flu) is.

Designing the Parameters of the Bayesian Network
• Possible solutions:
  – Specify a range of values for that probability
  – Specify a distribution for the probability with a known form
  – Could get the expert to encode relative relationships, e.g. “This value is twice as likely as the other one”
  – Get probabilities from studies or census data

Example
• Monty Hall problem
  – What does the Bayes net look like?
  – What do the CPTs look like?

Learning Bayesian Network Structure From Data

Learning Structure From Data
• You can think of the structure and parameters of the Bayesian network as representing causal knowledge about the domain
• If you don’t have an expert, you can learn both the structure and parameters from data
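One possible answer to the Monty Hall exercise posed above, as a minimal sketch rather than the intended solution: assume three variables, Prize and Guest (the guest's first pick) with no parents, and Monty (the door Monty opens) with both as parents. The variable names, the uniform priors, and the assumption that Monty picks at random when he has two legal doors are all choices made here for illustration.

```python
from fractions import Fraction

DOORS = (1, 2, 3)


def p_prize(p):
    """CPT for Prize: the prize is equally likely behind each door (assumed)."""
    return Fraction(1, 3)


def p_guest(g):
    """CPT for Guest: the first pick is uniform and independent of Prize (assumed)."""
    return Fraction(1, 3)


def p_monty(m, p, g):
    """CPT for Monty given Prize and Guest.

    Monty never opens the prize door or the guest's door; if that leaves
    two doors (guest picked the prize), he chooses between them uniformly.
    """
    if m == p or m == g:
        return Fraction(0)
    return Fraction(1, 2) if p == g else Fraction(1)


def joint(p, g, m):
    """Joint probability from the factorization P(Prize) P(Guest) P(Monty | Prize, Guest)."""
    return p_prize(p) * p_guest(g) * p_monty(m, p, g)


def posterior_prize(g, m):
    """P(Prize | Guest=g, Monty=m) by enumerating and normalizing the joint."""
    unnormalized = {p: joint(p, g, m) for p in DOORS}
    total = sum(unnormalized.values())
    return {p: value / total for p, value in unnormalized.items()}


# Guest picks door 1, Monty opens door 3.
print(posterior_prize(g=1, m=3))
```

With the guest on door 1 and Monty opening door 3, the posterior puts 1/3 on door 1 and 2/3 on door 2, which is the usual "switch" result. Note that every CPT here is in the causal direction, so each entry is easy to justify.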
Learning Structure From Data
• There are other good reasons for learning the structure/parameters from data
• The actual causal model may be unavailable or unknown
• The actual causal model may be subject to dispute (maybe because of a subjective bias by the domain expert)

Learning the Structure from Data
Two cases:
  1. Complete data
  2. Incomplete data
We will describe what these mean!

Complete Data
• Your domain is fully observable (i.e. you can observe the values of all the random variables in the data)
• Your data has no missing values

No missing values                Has 3 missing values
Age     Gender   Home Zip        Age     Gender   Home Zip
50-60   Male     97330           ?       Male     97330
20-30   Female   97333           20-30   Female   ?
40-50   Female   97331           ?       Female   97331

Parameter Learning From Complete Data
• Let’s first assume that the Bayesian network structure is fixed
• Learning the parameters from complete data is easy (will say more in the naïve Bayes context next time)
• We won’t deal with incomplete data in this class

Learning the Structure
• Involves a search over possible directed acyclic graph structures to find the best fitting one
• However, for n nodes, the number of possible structures is [Robinson, 1973]: $O\left(n!\,2^{\binom{n}{2}}\right)$, where $\binom{n}{2} = n(n-1)/2$

Learning the Structure
• An exhaustive search for the optimal structure is clearly infeasible
• Need to resort to local search methods, e.g. hill-climbing, simulated annealing
• We’ll illustrate this using a 3 node example.
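To get a feel for how quickly the search space grows, the sketch below counts labeled directed acyclic graphs exactly, using the recurrence usually attributed to Robinson. The function names `count_dags` and `loose_upper_bound` are hypothetical, and the bound is just the simple argument "pick an ordering, then any subset of forward links".

```python
from functools import lru_cache
from math import comb, factorial


@lru_cache(maxsize=None)
def count_dags(n: int) -> int:
    """Number of labeled DAGs on n nodes (inclusion-exclusion recurrence).

    k ranges over the number of nodes with no incoming links; each of the
    remaining n - k nodes may or may not receive a link from each of them.
    """
    if n == 0:
        return 1
    return sum(
        (-1) ** (k + 1) * comb(n, k) * 2 ** (k * (n - k)) * count_dags(n - k)
        for k in range(1, n + 1)
    )


def loose_upper_bound(n: int) -> int:
    """n! orderings times 2^(n choose 2) subsets of forward links."""
    return factorial(n) * 2 ** comb(n, 2)


for n in range(1, 11):
    print(n, count_dags(n), loose_upper_bound(n))
```

For the 3 node example there are 25 possible structures, small enough to enumerate by hand; the printed table shows why that stops being an option after a handful of nodes.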
Local Search Methods
• Initial state, either:
  – Start with no links
  – Start with a random set of links
• Neighborhood of the current state, the structures reached by one of:
  – Add a link
  – Remove a link
  – Reverse a link
[Figure: the three-node example with nodes A, B and C, showing an initial state, a current state, and its neighbors.]

Things to Watch Out For
• Need to avoid introducing cycles
• Need to re-estimate parameters every time you modify a link in the Bayes net
  – Do you need to re-estimate the parameters for all nodes?
  – No, just the ones that are affected by the modified link
• Lots of local optima problems. Use random restarts.

The Evaluation Function
• How do we know if a Bayes net structure is good?
• Two types of evaluation functions:
  1. Evaluate if conditional independence relationships in the learned network match those in the data
  2. Evaluate how well the learned network explains the data (in the probabilistic sense).

Example: Citizen scientists may confuse two species of finch
Purple Finch
• Habitat: Mixed and coniferous woodlands; ornamental conifers in gardens.
House Finch
• Habitat: cities and residential areas; coastal valleys that have become suburban.
Photo credits: Chris Wood
[Figure: occupancy diagram with the labels Environmental variables, Detection conditions, True occupancy status and Observations, shown for Purple Finch and for House Finch.]
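The local search procedure described earlier (add, remove or reverse a link, avoid cycles, score, repeat) can be sketched as greedy hill-climbing. This is a minimal illustration under stated assumptions, not the course's implementation: variables are Boolean, the evaluation function is a BIC-style score (log-likelihood minus a size penalty, one instance of "how well the network explains the data"), and the toy data set and all function names are made up here. For brevity it rescores every node at each step, although only the nodes touched by the modified link actually change.

```python
import math
from collections import Counter
from itertools import permutations


def is_acyclic(nodes, edges):
    """Repeatedly peel off nodes with no incoming link from remaining nodes."""
    remaining = set(nodes)
    while remaining:
        roots = {v for v in remaining
                 if not any(c == v and p in remaining for p, c in edges)}
        if not roots:
            return False  # every remaining node has a parent: there is a cycle
        remaining -= roots
    return True


def family_score(data, child, parents):
    """BIC-style score of one node: max log-likelihood minus a penalty."""
    parents = tuple(sorted(parents))
    joint = Counter((tuple(r[p] for p in parents), r[child]) for r in data)
    marg = Counter(tuple(r[p] for p in parents) for r in data)
    loglik = sum(n * math.log(n / marg[pa]) for (pa, _), n in joint.items())
    n_params = 2 ** len(parents)  # Boolean child: one number per parent setting
    return loglik - 0.5 * math.log(len(data)) * n_params


def score(data, nodes, edges):
    return sum(family_score(data, v, [p for p, c in edges if c == v])
               for v in nodes)


def neighbors(nodes, edges):
    """All structures one link-change away (may include cyclic ones)."""
    for a, b in permutations(nodes, 2):
        if (a, b) in edges:
            yield edges - {(a, b)}                # remove a link
            yield (edges - {(a, b)}) | {(b, a)}   # reverse a link
        elif (b, a) not in edges:
            yield edges | {(a, b)}                # add a link


def hill_climb(data, nodes, edges=frozenset()):
    current = frozenset(edges)
    current_score = score(data, nodes, current)
    while True:
        scored = [(score(data, nodes, e), e)
                  for e in neighbors(nodes, current) if is_acyclic(nodes, e)]
        best_score, best = max(scored, key=lambda pair: pair[0])
        if best_score <= current_score + 1e-9:    # no strict improvement: stop
            return current, current_score
        current, current_score = best, best_score


# Toy data (made up): B always equals A, and C is independent of both.
nodes = ["A", "B", "C"]
data = [{"A": a, "B": a, "C": c}
        for a in (0, 1) for c in (0, 1) for _ in range(10)]
learned, learned_score = hill_climb(data, nodes)
print(sorted(learned), learned_score)
```

Starting from no links, the search on this toy data ends with a single link between A and B and leaves C unconnected. The direction of that link is arbitrary because both directions score the same, and wrapping `hill_climb` in a loop over random initial link sets would give the random restarts suggested above.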
Solution: Multi-species occupancy modeling
[Figure: diagram with the labels Environmental variables and Detection conditions.]

Result: Species confused by eBirders
Photo credits: Chris Wood

What You Need To Know
• How to get an expert to design a Bayesian network by hand
• Briefly describe how you would use local search to learn the structure of a Bayesian network