1. Two applications of Bayesian networks
Jiří Vomlel
Laboratory for Intelligent Systems, University of Economics, Prague
Institute of Information Theory and Automation, Academy of Sciences of the Czech Republic
This presentation is available at: http://www.utia.cas.cz/vomlel/

2. Contents:
• Bayesian networks as a model for reasoning with uncertainty
• Building probabilistic models
• Building "good" strategies using the models
• Application 1: Adaptive testing
• Application 2: Decision-theoretic troubleshooting

3. An example of a Bayesian network
Nodes X1, ..., X9 with conditional probability tables P(X1), P(X2), P(X3 | X1), P(X4 | X2), P(X5 | X1), P(X6 | X3, X4), P(X7 | X5), P(X8 | X7, X6), P(X9 | X6); the joint distribution thus factorizes as
P(X1, ..., X9) = P(X1) · P(X2) · P(X3 | X1) · P(X4 | X2) · P(X5 | X1) · P(X6 | X3, X4) · P(X7 | X5) · P(X8 | X7, X6) · P(X9 | X6).
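This factorization maps directly onto code. Below is a minimal sketch of the same network using the pgmpy library; the edges follow the slide, but every probability value is an invented placeholder, since the slides give no numeric tables.

from pgmpy.models import BayesianNetwork
from pgmpy.factors.discrete import TabularCPD
from pgmpy.inference import VariableElimination

# Edges follow the factorization on the slide.
model = BayesianNetwork([
    ("X1", "X3"), ("X1", "X5"),
    ("X2", "X4"),
    ("X3", "X6"), ("X4", "X6"),
    ("X5", "X7"),
    ("X6", "X8"), ("X7", "X8"),
    ("X6", "X9"),
])

# All numbers are arbitrary placeholders (binary variables throughout).
cpds = [
    TabularCPD("X1", 2, [[0.7], [0.3]]),
    TabularCPD("X2", 2, [[0.6], [0.4]]),
    TabularCPD("X3", 2, [[0.9, 0.2], [0.1, 0.8]],
               evidence=["X1"], evidence_card=[2]),
    TabularCPD("X4", 2, [[0.8, 0.3], [0.2, 0.7]],
               evidence=["X2"], evidence_card=[2]),
    TabularCPD("X5", 2, [[0.7, 0.4], [0.3, 0.6]],
               evidence=["X1"], evidence_card=[2]),
    TabularCPD("X6", 2, [[0.95, 0.6, 0.5, 0.1], [0.05, 0.4, 0.5, 0.9]],
               evidence=["X3", "X4"], evidence_card=[2, 2]),
    TabularCPD("X7", 2, [[0.8, 0.3], [0.2, 0.7]],
               evidence=["X5"], evidence_card=[2]),
    TabularCPD("X8", 2, [[0.9, 0.5, 0.6, 0.2], [0.1, 0.5, 0.4, 0.8]],
               evidence=["X7", "X6"], evidence_card=[2, 2]),
    TabularCPD("X9", 2, [[0.85, 0.25], [0.15, 0.75]],
               evidence=["X6"], evidence_card=[2]),
]
model.add_cpds(*cpds)
assert model.check_model()

# Reason with uncertainty: posterior of X1 after observing X9.
print(VariableElimination(model).query(["X1"], evidence={"X9": 1}))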

4. Building Bayesian network models: three basic approaches
• Discussions with domain experts: expert knowledge is used to get the structure and parameters of the model.
• A dataset of records is collected, and a machine-learning method is used to construct a model and estimate its parameters.
• A combination of the previous two: e.g., experts help with the structure, and data are used to estimate the parameters (see the sketch below).
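As an illustration of the combined approach, here is a sketch using pgmpy: an expert supplies the edges, and the parameters are estimated from a dataset of records. The file name records.csv and the three-node structure are hypothetical.

import pandas as pd
from pgmpy.models import BayesianNetwork
from pgmpy.estimators import BayesianEstimator

# Expert-provided structure (a hypothetical three-node fragment).
model = BayesianNetwork([("X1", "X3"), ("X2", "X3")])

# A dataset of records, one column per variable (hypothetical file name).
data = pd.read_csv("records.csv")

# Estimate all CPTs from the data; the BDeu prior smooths sparse counts
# that plain maximum-likelihood estimation would leave at zero.
model.fit(data, estimator=BayesianEstimator, prior_type="BDeu")

for cpd in model.get_cpds():
    print(cpd)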

5. An example of a strategy
[Figure: a decision tree over three fraction-comparison questions: X1: "1/5 < 2/5?", X2: "1/5 < 1/4?", X3: "1/4 < 2/5?". Each answer (yes/no) determines the next question asked.]
X3 is a more difficult question than X2, which is more difficult than X1.

6. Building strategies using the models
For all terminal nodes ℓ ∈ L(s) of a strategy s we define:
• the steps that were performed to get to that node (e.g., questions answered in a certain way), called the collected evidence e_ℓ;
• using the probabilistic model of the domain, we can compute the probability P(e_ℓ) of reaching a terminal node;
• during the process, once we have collected certain evidence e, we can update the probability of reaching a terminal node, which now corresponds to the conditional probability P(e_ℓ | e).

7. Building strategies using the models
For all terminal nodes ℓ ∈ L(s) of a strategy s we have also defined:
• an evaluation function f : ∪_{s ∈ S} L(s) → R.
For each strategy we can compute:
• the expected value of the strategy: E_f(s) = Σ_{ℓ ∈ L(s)} P(e_ℓ) · f(e_ℓ).
The goal:
• find a strategy that maximizes (or minimizes) its expected value, as in the sketch below.
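The expected value is just a probability-weighted sum over a strategy's terminal nodes. A minimal, dependency-free sketch; the two strategies and all numbers are hypothetical.

from dataclasses import dataclass

@dataclass
class Leaf:
    probability: float  # P(e_l), computed from the probabilistic model
    value: float        # f(e_l), the evaluation function at the leaf

def expected_value(strategy):
    """E_f(s) = sum over terminal nodes l of P(e_l) * f(e_l)."""
    return sum(leaf.probability * leaf.value for leaf in strategy)

# Two hypothetical strategies, each represented only by its leaves
# (the leaf probabilities within one strategy sum to one).
s1 = [Leaf(0.25, 1.0), Leaf(0.75, 0.4)]
s2 = [Leaf(0.50, 0.9), Leaf(0.50, 0.3)]

best = min([s1, s2], key=expected_value)  # minimise, e.g. when f is a cost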

8. Using entropy as an information measure
"The lower the entropy of a probability distribution, the more we know."
H(P(S)) = − Σ_s P(S = s) · log P(S = s)
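The formula translates directly into code; a small sketch:

import math

def entropy(dist):
    """H(P(S)) = -sum_s P(S = s) * log P(S = s); zero-probability
    states contribute nothing to the sum."""
    return -sum(p * math.log(p) for p in dist if p > 0.0)

print(entropy([0.5, 0.5]))    # ~0.693: maximal uncertainty over two states
print(entropy([0.99, 0.01]))  # ~0.056: nearly certain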

9. Entropy in node n: H(e_n) = H(P(S | e_n)).
Expected entropy at the end of test t: E_H(t) = Σ_{ℓ ∈ L(t)} P(e_ℓ) · H(e_ℓ).
Let T be the set of all possible tests (e.g., of a given length). A test t* is optimal iff t* = argmin_{t ∈ T} E_H(t). A greedy one-step version of this criterion is sketched below.
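Searching over all tests t ∈ T is exponential in the test length, so a common practical shortcut is the myopic step: ask next the question whose answer minimises the expected entropy of the skill posterior. The sketch below assumes a hypothetical model object with methods answers(), prob(), and skill_posterior() (none of these names come from the slides) and reuses entropy() from the previous sketch.

# Hypothetical interface:
#   model.answers(q)            -> possible answers to question q
#   model.prob(q, a, evidence)  -> P(q = a | evidence)
#   model.skill_posterior(ev)   -> P(S | ev) as a list of probabilities

def expected_entropy(question, evidence, model):
    """One-step E_H: sum_a P(a | e) * H(P(S | e, a))."""
    return sum(
        model.prob(question, a, evidence)
        * entropy(model.skill_posterior({**evidence, question: a}))
        for a in model.answers(question)
    )

def next_question(questions, evidence, model):
    """Myopic adaptive choice: the question minimising expected entropy."""
    return min(questions, key=lambda q: expected_entropy(q, evidence, model))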

10. Application 1: Adaptive test of basic operations with fractions
Examples of tasks:
T1: (3/4 · 5/6) − 1/8 = 15/24 − 1/8 = 5/8 − 1/8 = 4/8 = 1/2
T2: 1/6 + 1/12 = 2/12 + 1/12 = 3/12 = 1/4
T3: 1/4 · 1½ = 1/4 · 3/2 = 3/8
T4: (1/2 · 1/3) + (1/4 · 2/3) = 1/6 + 2/12 = 1/6 + 1/6 = 2/6 = 1/3

11. Elementary and operational skills

Label | Skill | Example
CP | Comparison (common numerator or denominator) | 1/2 > 1/3, 2/3 > 1/3
AD | Addition (common denominator) | 1/7 + 2/7 = (1+2)/7 = 3/7
SB | Subtraction (common denominator) | 2/5 − 1/5 = (2−1)/5 = 1/5
MT | Multiplication | 1/2 · 3/5 = 3/10
CD | Common denominator | (1/2, 2/3) = (3/6, 4/6)
CL | Cancelling out | 4/6 = (2·2)/(2·3) = 2/3
CIM | Conversion to mixed numbers | 7/2 = (3·2+1)/2 = 3½
CMI | Conversion to improper fractions | 3½ = (3·2+1)/2 = 7/2

12. Misconceptions

Label | Description | Occurrence
MAD | a/b + c/d = (a+c)/(b+d) | 14.8%
MSB | a/b − c/d = (a−c)/(b−d) | 9.4%
MMT1 | a/b · c/b = (a·c)/b | 14.1%
MMT2 | a/b · c/b = (a+c)/(b·b) | 8.1%
MMT3 | a/b · c/d = (a·d)/(b·c) | 15.4%
MMT4 | a/b · c/d = (a·c)/(b+d) | 8.1%
MC | a b/c = (a·b)/c | 4.0%

13. Student model
[Figure: a Bayesian network over the skill nodes HV1, ACL, ACMI, ACIM, ACD, CP, MT, CL, CMI, CIM, CD, AD, SB and the misconception nodes MMT1, MMT2, MMT3, MMT4, MC, MAD, MSB.]

14. Evidence model for task T1
T1: (3/4 · 5/6) − 1/8 = 15/24 − 1/8 = 5/8 − 1/8 = 4/8 = 1/2
T1 ⇔ MT & CL & ACL & SB & ¬MMT3 & ¬MMT4 & ¬MSB
[Figure: task node T1 with parents ACL, CL, MT, SB, MMT3, MMT4, MSB; the observed answer X1 depends on T1 through P(X1 | T1).]
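The equivalence above is a deterministic conjunction: the task is solved exactly when the required skills are present and the interfering misconceptions absent. A sketch, with hypothetical slip and guess rates standing in for P(X1 | T1), which the slide does not give numerically:

# Deterministic evidence model for task T1 from the slide:
# T1 <=> MT & CL & ACL & SB & not MMT3 & not MMT4 & not MSB.
def t1_solved(skills: dict) -> bool:
    return (skills["MT"] and skills["CL"] and skills["ACL"] and skills["SB"]
            and not skills["MMT3"] and not skills["MMT4"]
            and not skills["MSB"])

# The observed answer X1 is a noisy copy of T1; slip/guess rates are
# invented placeholders.
SLIP, GUESS = 0.1, 0.2

def p_answer_correct(skills: dict) -> float:
    return (1.0 - SLIP) if t1_solved(skills) else GUESS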

15. Skill prediction quality
[Figure: quality of skill predictions (74 to 92) plotted against the number of answered questions (0 to 20) for four question orderings: adaptive, average, descending, and ascending.]

16. Application 2: Troubleshooting - light print problem
[Figure: a troubleshooting model linking the problem node, fault nodes F1 to F4, action nodes A1 to A3, and question node Q1.]
• Problems: F1 Distribution problem, F2 Defective toner, F3 Corrupted dataflow, and F4 Wrong driver setting.
• Actions: A1 Remove, shake and reseat toner; A2 Try another toner; and A3 Cycle power.
• Questions: Q1 Is the configuration page printed light?

17. Troubleshooting strategy
[Figure: a decision tree rooted at question Q1; if Q1 = no, try action A1 and then, if the problem persists, A2; if Q1 = yes, try A2 first and then A1.]
The task is to find a strategy s ∈ S minimising the expected cost of repair
E_CR(s) = Σ_{ℓ ∈ L(s)} P(e_ℓ) · (t(e_ℓ) + c(e_ℓ)).
A simplified sketch of this computation follows.
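Below is a minimal sketch for the simplest case: a fixed sequence of independent actions under a single-fault assumption, ignoring questions and the t(e_ℓ) term. All probabilities and costs are invented placeholders.

from itertools import permutations

# Hypothetical success probabilities and costs for actions A1..A3.
p = {"A1": 0.40, "A2": 0.50, "A3": 0.20}   # P(action fixes the problem)
c = {"A1": 5.0,  "A2": 15.0, "A3": 3.0}    # cost of performing the action

def expected_cost(sequence):
    """ECR of performing actions in order until one succeeds: each
    action's cost is paid only if all earlier actions failed."""
    ecr, p_unsolved = 0.0, 1.0
    for a in sequence:
        ecr += p_unsolved * c[a]
        p_unsolved *= 1.0 - p[a]
    return ecr

best = min(permutations(p), key=expected_cost)
print(best, expected_cost(best))

Under these simplifying assumptions, ordering actions by decreasing ratio p/c is known to be optimal, so the brute-force search above is only needed when the assumptions (independence, single fault, no questions) are dropped.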

18. Going commercial...
• Hugin Expert A/S. Software product: Hugin, a Bayesian network tool. http://www.hugin.com/
• Educational Testing Service (ETS), the world's largest private educational testing organization. In 2000/2001, more than 3 million students took the ETS's largest exam, the SAT. A research unit does research on adaptive tests using Bayesian networks: http://www.ets.org/research/
• SACSO Project (Systems for Automatic Customer Support Operations), a research project of Hewlett-Packard and Aalborg University. The troubleshooter is offered as DezisionWorks by Dezide Ltd. http://www.dezide.com/
