

1. An Introduction to Bayesian Network Inference using Variable Elimination
Jhonatan Oliveira, Department of Computer Science, University of Regina

2. Outline • Introduction • Background • Bayesian networks • Variable Elimination • Repeated Computation • Conclusions

3. Introduction Bayesian networks are probabilistic graphical models used for reasoning under uncertainty.

4–6. Uncertainty (figure, built up over three slides: family out and bowel problem are each causes of dog out) • Conflicting information • Missing information


  7. Real World Applications

  8. Real World Applications TrueSkill™

  9. Real World Applications Turbo Codes

  10. Real World Applications Mars Exploration Rover

11. Background Probability theory: introducing the joint probability distribution, the chain rule, and conditional independence

12. Joint Probability Distribution • A multivariate function over a finite set of variables • Assigns a real number between 0 and 1 to each configuration (combination of the variables' values) • Summing all assigned numbers yields 1
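
A minimal sketch of this definition in Python (not from the slides; the uniform values are placeholders): a joint distribution over the five binary variables is a table with 2^5 rows, the rows sum to 1, and a query is answered by summing the matching rows.

    import itertools

    # Joint distribution over (L, F, D, B, H): one probability per configuration.
    # Placeholder values: the uniform distribution, 1/32 per row.
    variables = ("L", "F", "D", "B", "H")
    joint = {row: 1 / 32 for row in itertools.product((0, 1), repeat=5)}

    assert abs(sum(joint.values()) - 1.0) < 1e-12   # all entries sum to 1

    # A query sums the rows consistent with it, e.g. P(D = 1, H = 1):
    d, h = variables.index("D"), variables.index("H")
    print(sum(p for row, p in joint.items() if row[d] == 1 and row[h] == 1))  # 0.25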

13. Joint Probability Distribution (first rows of the 32-row table)

    Family Out  Bowel Problem  Lights On  Dog Out  Hear Bark | P(L,F,D,B,H)
        0             0            0         0         0     |    0.01
        0             0            0         0         1     |    0.25
        0             0            0         1         0     |    0.08
        0             0            1         0         0     |    0.19

14. Joint Probability Distribution A first query reads a single row of the table; a second query sums matching rows.

    Family Out  Bowel Problem  Lights On  Dog Out  Hear Bark | P(L,F,D,B,H)
        0             0            0         0         0     |    0.01   ← 1st query
        0             0            0         0         1     |    0.25
        0             0            0         1         0     |    0.08   ← 2nd query (+)
        0             0            1         0         0     |    0.19   ← 2nd query (+)

15. Joint Probability Distribution The size issue: five binary variables already require 2^5 = 32 probabilities.

16. Chain Rule P(L,F,D,B,H) = P(L) P(F|L) P(D|L,F) P(B|L,F,D) P(H|L,F,D,B), a product of conditional probability tables (CPTs)

17. Chain Rule The size issue: the CPTs together hold 2 + 4 + 8 + 16 + 32 = 62 probabilities.
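
Both counts are quick to verify; a minimal sanity check, assuming all five variables are binary:

    # Full joint over n binary variables: 2^n entries (slide 15).
    n = 5
    print(2 ** n)                                  # 32

    # Chain rule: the k-th CPT conditions on k - 1 predecessors, so it has
    # 2^k entries; the total matches slide 17.
    print(sum(2 ** k for k in range(1, n + 1)))    # 2 + 4 + 8 + 16 + 32 = 62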

18. Conditional Independence Given: family out → dog out → hear bark (figure of the chain)

19. Conditional Independence Given the chain family out → dog out → hear bark, the independence I(family out, dog out, hear bark) holds: family out is independent of hear bark given dog out.

20. Conditional Independence
• Given I(X,Y,Z): P(X|Y,Z) = P(X|Y)
• Here, given I(D,F,L): P(D|L,F) = P(D|F)

21. Chain Rule & Conditional Independence
P(L,F,D,B,H)
= P(L) P(F|L) P(D|L,F) P(B|L,F,D) P(H|L,F,D,B)   (chain rule)
= P(L) P(F|L) P(D|F) P(B|L,F,D) P(H|L,F,D,B)     (by I(D,F,L))
= P(L) P(F|L) P(D|F) P(B|L,D) P(H|L,F,D,B)       (by I(B,{L,D},F))
= ?

  22. Bayesian network A graphical interpretation of probability theory

23. Directed Acyclic Graph (figure: DAG over Family out, Bowel problem, Lights on, Dog out, Hear bark)

24. Testing Independences (figure: the DAG) A set of variables X is d-separated from a set of variables Y given a third set Z if all paths from X to Y are blocked by Z.

25. Testing Independences (figure: the DAG) Is F d-separated from H given D? Yes; that is, I(F,D,H) holds in P(L,F,D,B,H)
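
This test can be automated. A minimal sketch, not code from the talk (the helper names d_separated and ancestors and the parent-map encoding are this write-up's own), using the standard reduction: X is d-separated from Y given Z iff X and Y are disconnected in the moral graph of the ancestral subgraph over X ∪ Y ∪ Z.

    from collections import deque

    def ancestors(dag, nodes):
        """`nodes` plus every node with a directed path into `nodes`."""
        seen, stack = set(nodes), list(nodes)
        while stack:
            for p in dag.get(stack.pop(), ()):     # dag maps node -> parents
                if p not in seen:
                    seen.add(p)
                    stack.append(p)
        return seen

    def d_separated(dag, xs, zs, ys):
        """True iff xs is d-separated from ys given zs in the DAG."""
        keep = ancestors(dag, set(xs) | set(ys) | set(zs))
        # Moralize the ancestral subgraph: link co-parents, drop directions.
        adj = {n: set() for n in keep}
        for n in keep:
            ps = [p for p in dag.get(n, ()) if p in keep]
            for p in ps:
                adj[n].add(p); adj[p].add(n)
            for i in range(len(ps)):
                for j in range(i + 1, len(ps)):
                    adj[ps[i]].add(ps[j]); adj[ps[j]].add(ps[i])
        # Delete the conditioning set, then test reachability from xs to ys.
        frontier = deque(x for x in xs if x not in zs)
        seen = set(frontier)
        while frontier:
            n = frontier.popleft()
            if n in ys:
                return False
            for m in adj[n] - set(zs):
                if m not in seen:
                    seen.add(m)
                    frontier.append(m)
        return True

    # The dog-out DAG from the slides, as node -> parents.
    dag = {"F": set(), "B": set(), "L": {"F"}, "D": {"B", "F"}, "H": {"D"}}
    print(d_separated(dag, {"F"}, {"D"}, {"H"}))   # True: I(F,D,H) holds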

26. Testing Independences (figure: the DAG annotated with its CPTs P(F), P(B), P(L|F), P(D|B,F), P(H|D)) The size issue = 18 probabilities

27. Bayesian Network (figure: the DAG with CPTs P(F), P(B), P(L|F), P(D|B,F), P(H|D)) A directed acyclic graph B and a set of conditional probability tables, one P(v | Pa(v)) for each variable v in B, where Pa(v) denotes the parents of v; the joint distribution is P(U) = Π P(v | Pa(v)) over all v in B
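
A minimal data-structure sketch of this definition (not code from the talk): a parent map plus one CPT per variable, with the joint recovered as the product of the conditionals. The slides do not give numeric CPTs, so every number below is invented for illustration.

    import itertools

    # node -> tuple of parents (the dog-out DAG from the slides)
    parents = {"F": (), "B": (), "L": ("F",), "D": ("B", "F"), "H": ("D",)}

    # CPTs keyed by (value of v, values of Pa(v)); all numbers are made up.
    cpts = {
        "F": {(0,): 0.85, (1,): 0.15},
        "B": {(0,): 0.99, (1,): 0.01},
        "L": {(0, 0): 0.95, (1, 0): 0.05, (0, 1): 0.40, (1, 1): 0.60},
        "D": {(0, 0, 0): 0.90, (1, 0, 0): 0.10, (0, 0, 1): 0.20, (1, 0, 1): 0.80,
              (0, 1, 0): 0.15, (1, 1, 0): 0.85, (0, 1, 1): 0.05, (1, 1, 1): 0.95},
        "H": {(0, 0): 0.80, (1, 0): 0.20, (0, 1): 0.30, (1, 1): 0.70},
    }

    def joint_prob(assignment):
        """P(U) = product of P(v | Pa(v)) evaluated at a full assignment (a dict)."""
        p = 1.0
        for v, pa in parents.items():
            p *= cpts[v][(assignment[v],) + tuple(assignment[u] for u in pa)]
        return p

    # Sanity check: the 32 joint probabilities sum to 1.
    print(round(sum(joint_prob(dict(zip("FBLDH", c)))
                    for c in itertools.product((0, 1), repeat=5)), 10))   # 1.0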

28. Bayesian Network (figure: the DAG) P(L,F,D,B,H) = P(L|F) P(F) P(B) P(D|B,F) P(H|D)

29. Inference Computing P(L) from P(L,F,D,B,H): only part of the factorization P(L|F) P(F) P(B) P(D|B,F) P(H|D) is relevant, namely P(L|F) and P(F).

30. Inference P(L,F) = P(L|F) × P(F) (multiplication), then P(L) = Σ_F P(L,F) (marginalization)

31. Inference: Multiplication

    L F | P(L|F)        F | P(F)         L F | P(L,F)
    0 0 |  0.8          0 |  0.8         0 0 |  0.64
    0 1 |  0.3     X    1 |  0.3    =    0 1 |  0.09
    1 0 |  0.2                           1 0 |  0.16
    1 1 |  0.7                           1 1 |  0.21
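
The multiplication step can be written out directly; a minimal sketch using the slide's numbers, with factors as (variables, table) pairs and binary variables assumed. The P(F) entries are kept exactly as printed on the slide.

    import itertools

    def multiply(f1, f2):
        """Pointwise product of two factors over the union of their variables."""
        (v1, t1), (v2, t2) = f1, f2
        out_vars = v1 + tuple(v for v in v2 if v not in v1)
        table = {}
        for row in itertools.product((0, 1), repeat=len(out_vars)):
            a = dict(zip(out_vars, row))
            table[row] = t1[tuple(a[v] for v in v1)] * t2[tuple(a[v] for v in v2)]
        return out_vars, table

    p_l_given_f = (("L", "F"), {(0, 0): 0.8, (0, 1): 0.3, (1, 0): 0.2, (1, 1): 0.7})
    p_f = (("F",), {(0,): 0.8, (1,): 0.3})    # entries exactly as printed on the slide
    print(multiply(p_l_given_f, p_f)[1])
    # {(0,0): 0.64, (0,1): 0.09, (1,0): 0.16, (1,1): 0.21}, up to float rounding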

32. Inference: Marginalization (summing F out of P(L,F))

    L F | P(L,F)         L | P(L)
    0 0 |  0.2           0 |  0.5
    0 1 |  0.3     =     1 |  0.5
    1 0 |  0.4
    1 1 |  0.1
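
And the matching marginalization step, again as a minimal sketch with the slide's numbers: summing out a variable merges the rows that agree on everything else.

    def marginalize(factor, var):
        """Sum `var` out of a factor, merging rows that agree elsewhere."""
        vars_, table = factor
        i = vars_.index(var)
        out = {}
        for row, p in table.items():
            key = row[:i] + row[i + 1:]
            out[key] = out.get(key, 0.0) + p
        return vars_[:i] + vars_[i + 1:], out

    p_lf = (("L", "F"), {(0, 0): 0.2, (0, 1): 0.3, (1, 0): 0.4, (1, 1): 0.1})
    print(marginalize(p_lf, "F"))
    # (('L',), {(0,): 0.5, (1,): 0.5}), the P(L) shown on the slide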

33. Inference Algorithms Shafer-Shenoy, Lauritzen-Spiegelhalter, Hugin, Lazy Propagation, Variable Elimination

  34. Variable Elimination Eliminates all variables that are not in the query

35. Variable Elimination Algorithm
Input: factorization F, elimination ordering L, query X, evidence Y
Output: P(X|Y)
For each variable v in L:
    multiply all CPTs in F involving v, yielding CPT P1
    marginalize v out of P1
    remove all CPTs involving v from F
    append P1 to F
Multiply all remaining CPTs in F, yielding P(X,Y)
Return P(X|Y) = P(X,Y) / P(Y)
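
This pseudocode maps almost line for line onto Python. A minimal sketch, repeating multiply and marginalize from the earlier sketches so the block stands alone; restrict (evidence handling, left implicit on the slide) and the final normalization, which plays the role of the division by P(Y), are this sketch's own additions. Binary variables assumed.

    import itertools

    def multiply(f1, f2):
        """Pointwise product of two factors over the union of their variables."""
        (v1, t1), (v2, t2) = f1, f2
        out_vars = v1 + tuple(v for v in v2 if v not in v1)
        table = {}
        for row in itertools.product((0, 1), repeat=len(out_vars)):
            a = dict(zip(out_vars, row))
            table[row] = t1[tuple(a[v] for v in v1)] * t2[tuple(a[v] for v in v2)]
        return out_vars, table

    def marginalize(factor, var):
        """Sum `var` out of a factor."""
        vars_, table = factor
        i = vars_.index(var)
        out = {}
        for row, p in table.items():
            key = row[:i] + row[i + 1:]
            out[key] = out.get(key, 0.0) + p
        return vars_[:i] + vars_[i + 1:], out

    def restrict(factor, var, value):
        """Fix evidence var = value: keep matching rows, drop the variable."""
        vars_, table = factor
        if var not in vars_:
            return factor
        i = vars_.index(var)
        return (vars_[:i] + vars_[i + 1:],
                {r[:i] + r[i + 1:]: p for r, p in table.items() if r[i] == value})

    def ve(factors, ordering, evidence):
        """Variable Elimination over a list of factors; returns the normalized
        factor over the variables that are neither eliminated nor evidence."""
        for var, val in evidence.items():
            factors = [restrict(f, var, val) for f in factors]
        for v in ordering:
            related = [f for f in factors if v in f[0]]      # CPTs involving v
            factors = [f for f in factors if v not in f[0]]  # remove them from F
            if not related:
                continue
            prod = related[0]
            for f in related[1:]:
                prod = multiply(prod, f)                     # multiply, yielding P1
            factors.append(marginalize(prod, v))             # append P1 to F
        result = factors[0]
        for f in factors[1:]:
            result = multiply(result, f)                     # P(X, Y)
        z = sum(result[1].values())                          # P(Y)
        return result[0], {k: p / z for k, p in result[1].items()}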

36. Variable Elimination Algorithm Query: P(H | L)? (figure: the DAG) P(L,F,D,B,H) = P(L|F) P(F) P(B) P(D|B,F) P(H|D)

37. Variable Elimination Algorithm
Input factorization: P(L|F) P(F) P(B) P(D|B,F) P(H|D)
Query variable: H
Evidence variable: L = 1
Elimination ordering: B, F, D

38. Variable Elimination Algorithm
Eliminating B: P(B,D|F) = P(B) P(D|B,F); P(D|F) = marginalize B from P(B,D|F)
Factorization: P(L|F) P(F) P(H|D) P(D|F)
Eliminating F: P(D,F,L) = P(L|F) P(F) P(D|F); P(D,L) = marginalize F from P(D,F,L)
Factorization: P(H|D) P(D,L)

39. Variable Elimination Algorithm
Eliminating D: P(D,H,L) = P(H|D) P(D,L); P(H,L) = marginalize D from P(D,H,L)
Factorization: P(H,L)
Output: P(L) = marginalize H from P(H,L); P(H|L) = P(H,L) / P(L)
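
The trace above can be reproduced with the earlier sketches. A usage example, assuming the parents and cpts tables from the slide-27 sketch and the ve helper from the slide-35 sketch are in scope; the printed distribution depends on the invented CPT numbers, and the sketch folds the final division by P(L) into a normalization.

    # Each CPT is already a factor: its variables are (v,) + parents of v.
    factors = [((v,) + parents[v], cpts[v]) for v in parents]

    # Slide 37's query: P(H | L = 1), eliminating B, then F, then D.
    query_vars, dist = ve(factors, ordering=["B", "F", "D"], evidence={"L": 1})
    print(query_vars, dist)   # (('H',), {(0,): ..., (1,): ...}), i.e. P(H | L = 1)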

  40. Repeated Computation Variable Elimination can perform repeated computation

41. Variable Elimination Algorithm Query: P(H | F)? (figure: the DAG) P(L,F,D,B,H) = P(L|F) P(F) P(B) P(D|B,F) P(H|D)

42. Variable Elimination Algorithm
Input factorization: P(L|F) P(F) P(B) P(D|B,F) P(H|D)
Query variable: H
Evidence variable: F = 1
Elimination ordering: L, B, D

43. Variable Elimination Algorithm
Eliminating L: 1(F) = marginalize L from P(L|F), the trivial all-ones factor, since Σ_L P(L|F) = 1
Factorization: P(F) P(B) P(D|B,F) P(H|D)
Eliminating B: P(B,D|F) = P(B) P(D|B,F); P(D|F) = marginalize B from P(B,D|F)
Factorization: P(F) P(H|D) P(D|F)

44. Variable Elimination Algorithm
Eliminating D: P(D,H|F) = P(H|D) P(D|F); P(H|F) = marginalize D from P(D,H|F)
Factorization: P(F) P(H|F)
Multiply all: P(F,H) = P(F) P(H|F)
Output: P(F) = marginalize H from P(F,H); P(H|F) = P(F,H) / P(F)

45. Repeated Computation
Answering P(H|L): Eliminating B: P(B,D|F) = P(B) P(D|B,F); P(D|F) = marginalize B from P(B,D|F); factorization: P(L|F) P(F) P(H|D) P(D|F)
Answering P(H|F): Eliminating B: P(B,D|F) = P(B) P(D|B,F); P(D|F) = marginalize B from P(B,D|F); factorization: P(F) P(H|D) P(D|F)
The elimination of B is computed identically in both queries.

  46. Repeated Computation • Store past computation • Find relevant computation for new query • Retrieve computation that can be reused
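
One way to realize these three bullets, offered here only as a sketch (the talk leaves the actual mechanism to future work): memoize each elimination step on the variable and the exact factors it consumed, so a later query that performs the identical step retrieves the stored factor instead of recomputing it. Assumes the multiply and marginalize helpers from the earlier sketches.

    def freeze(factor):
        """Hashable form of a (variables, table) factor, usable as a cache key."""
        vars_, table = factor
        return vars_, tuple(sorted(table.items()))

    def eliminate_cached(factors, v, cache):
        """One elimination step, memoized on (variable, consumed factors)."""
        related = [f for f in factors if v in f[0]]
        rest = [f for f in factors if v not in f[0]]
        key = (v, tuple(sorted(freeze(f) for f in related)))   # find relevant work
        if key not in cache:
            prod = related[0]
            for f in related[1:]:
                prod = multiply(prod, f)
            cache[key] = marginalize(prod, v)                  # store computation
        return rest + [cache[key]]                             # retrieve and reuse

On slide 45, eliminating B consumes only P(B) and P(D|B,F) under both P(H|L) and P(H|F), so the second query would hit the cache.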

47. Variable Elimination as a Join Tree (figure: a join tree with clusters D,B,F; D,F,L; D,H,L; and H,L, holding the CPTs P(B) and P(D|B,F), P(L|F) and P(F), and P(H|D); the messages P(D|F), P(D,L), and P(H,L) flow between clusters) Answering P(H|L)

48. Variable Elimination as a Join Tree (figure: the same join tree, with the same messages, now used for the second query) Answering P(H|F)

49. Conclusions • Bayesian networks are useful probabilistic graphical models • Inference can be performed by Variable Elimination • Future work will investigate how to avoid repeated computation during Variable Elimination

50. References
• Bonaparte Project: http://www.bonaparte-dvi.com/
• McEliece, R. J., MacKay, D. J. C., & Cheng, J.-F. (1998). Turbo decoding as an instance of Pearl's "belief propagation" algorithm. IEEE Journal on Selected Areas in Communications, 16(2), 140–152. doi:10.1109/49.661103. ISSN 0733-8716.
• Microsoft TrueSkill: http://research.microsoft.com/en-us/projects/trueskill/
• Serrano, N. (2006). A Bayesian framework for landing site selection during autonomous spacecraft descent. In Proc. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Beijing, pp. 5112–5117.
• Koller, D., & Friedman, N. (2009). Probabilistic Graphical Models: Principles and Techniques. MIT Press.
• Darwiche, A. (2009). Modeling and Reasoning with Bayesian Networks (1st ed.). Cambridge University Press.
• Shafer, G., & Shenoy, P. P. (1989). Probability propagation.
• Charniak, E. (1991). Bayesian networks without tears. AI Magazine, 12(4), 50–63.
