Two applications of Bayesian networks
Jiří Vomlel
Laboratory for Intelligent Systems, University of Economics, Prague
Institute of Information Theory and Automation, Academy of Sciences of the Czech Republic
This presentation is available at: http://www.utia.cas.cz/vomlel/
Contents:
• Bayesian networks as a model for reasoning with uncertainty
• Building probabilistic models
• Building “good” strategies using the models
• Application 1: Adaptive testing
• Application 2: Decision-theoretic troubleshooting
An example of a Bayesian network:
[Figure: a directed acyclic graph over variables X1, ..., X9 with conditional probability tables P(X1), P(X2), P(X3 | X1), P(X4 | X2), P(X5 | X1), P(X6 | X3, X4), P(X7 | X5), P(X8 | X7, X6), P(X9 | X6).]
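The conditional probability tables listed in the figure determine the joint distribution through the standard Bayesian network factorization:
P(X1, ..., X9) = P(X1) · P(X2) · P(X3 | X1) · P(X4 | X2) · P(X5 | X1) · P(X6 | X3, X4) · P(X7 | X5) · P(X8 | X7, X6) · P(X9 | X6)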
Building Bayesian network models: three basic approaches
• Discussions with domain experts: expert knowledge is used to define the structure and parameters of the model.
• A dataset of records is collected and a machine learning method is used to construct a model and estimate its parameters.
• A combination of the previous two: e.g. experts help with the structure, and data are used to estimate the parameters.
An example of a strategy:
[Figure: a decision-tree strategy over three comparison questions, X1: “Is 1/5 < 2/5?”, X2: “Is 1/5 < 1/4?”, and X3: “Is 1/4 < 2/5?”; which question is asked next depends on the answers given so far.]
X3 is a more difficult question than X2, which is more difficult than X1.
Building strategies using the models
For all terminal nodes ℓ ∈ L(s) of a strategy s we define:
• the steps that were performed to get to that node (e.g. questions answered in a certain way); this is called the collected evidence e_ℓ.
• Using the probabilistic model of the domain we can compute the probability P(e_ℓ) of reaching a terminal node.
• During the process, after we have collected some evidence e, we can update the probability of reaching a terminal node, which now corresponds to the conditional probability P(e_ℓ | e).
Building strategies using the models
For all terminal nodes ℓ ∈ L(s) of a strategy s we have also defined:
• an evaluation function f : ∪_{s ∈ S} L(s) → R.
For each strategy we can compute:
• the expected value of the strategy: E_f(s) = Σ_{ℓ ∈ L(s)} P(e_ℓ) · f(e_ℓ)
The goal:
• find a strategy that maximizes (minimizes) its expected value.
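As a concrete illustration, here is a minimal Python sketch of this computation. The terminal nodes, probabilities, and values below are hypothetical placeholders; in a real application P(e_ℓ) would be obtained by inference in the Bayesian network model.

def expected_value(terminals, prob, value):
    """E_f(s) = sum over terminal nodes l in L(s) of P(e_l) * f(e_l)."""
    return sum(prob(e) * value(e) for e in terminals)

# Hypothetical two-question strategy with four terminal evidence sets.
terminals = [("X1=yes", "X2=yes"), ("X1=yes", "X2=no"),
             ("X1=no", "X2=yes"), ("X1=no", "X2=no")]
P = {terminals[0]: 0.4, terminals[1]: 0.2, terminals[2]: 0.1, terminals[3]: 0.3}
f = {terminals[0]: 2.0, terminals[1]: 1.0, terminals[2]: 1.0, terminals[3]: 0.0}
print(expected_value(terminals, P.get, f.get))  # 0.4*2.0 + 0.2*1.0 + 0.1*1.0 = 1.1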
Using entropy as an information measure
“The lower the entropy of a probability distribution, the more we know.”
H(P(S)) = − Σ_s P(S = s) · log P(S = s)
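For concreteness, a short Python sketch of this entropy computation (base-2 logarithm chosen here; the distributions are hypothetical examples):

import math

def entropy(dist):
    """H(P(S)) = -sum_s P(S=s) * log P(S=s); zero-probability terms contribute 0."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

print(entropy({"mastered": 0.5, "not mastered": 0.5}))  # 1.0  (maximal uncertainty)
print(entropy({"mastered": 0.9, "not mastered": 0.1}))  # ~0.47 (lower entropy, more known)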
Entropy in node n: H(e_n) = H(P(S | e_n))
Expected entropy at the end of test t: EH(t) = Σ_{ℓ ∈ L(t)} P(e_ℓ) · H(e_ℓ)
T ... the set of all possible tests (e.g. of a given length)
A test t⋆ is optimal iff t⋆ = arg min_{t ∈ T} EH(t).
[Figure: a strategy tree over questions X1, X2, X3 illustrating the entropies in its nodes.]
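The slide defines optimality over whole tests; in practice an adaptive test is often built greedily, choosing one question at a time. Below is a hedged Python sketch of that myopic selection rule (my own illustration, not taken from the source). The functions p_answer and posterior_entropy are placeholders for inference in the student model, computing P(X = a | e) and H(P(S | e, X = a)).

def expected_entropy_after(question, answers, evidence, p_answer, posterior_entropy):
    """One-step expected entropy: sum_a P(X=a | e) * H(P(S | e, X=a))."""
    return sum(p_answer(question, a, evidence)
               * posterior_entropy(evidence + [(question, a)])
               for a in answers)

def next_question(candidates, answers, evidence, p_answer, posterior_entropy):
    """Greedy (myopic) choice: ask the question that minimizes expected entropy."""
    return min(candidates,
               key=lambda q: expected_entropy_after(q, answers, evidence,
                                                    p_answer, posterior_entropy))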
Application 1: Adaptive test of basic operations with fractions
Examples of tasks:
T1: (3/4 · 5/6) − 1/8 = 15/24 − 1/8 = 5/8 − 1/8 = 4/8 = 1/2
T2: 1/6 + 1/12 = 2/12 + 1/12 = 3/12 = 1/4
T3: 1/4 · 1½ = 1/4 · 3/2 = 3/8
T4: (1/2 · 1/3) + (1/4 · 2/3) = 1/6 + 2/12 = 1/6 + 1/6 = 2/6 = 1/3
Elementary and operational skills:
CP   Comparison (common numerator or denominator)   1/2 > 1/3, 2/3 > 1/3
AD   Addition (common denominator)                  1/7 + 2/7 = (1+2)/7 = 3/7
SB   Subtraction (common denominator)               2/5 − 1/5 = (2−1)/5 = 1/5
MT   Multiplication                                 1/2 · 3/5 = 3/10
CD   Common denominator                             (1/2, 2/3) = (3/6, 4/6)
CL   Cancelling out                                 4/6 = (2·2)/(2·3) = 2/3
CIM  Conversion to mixed numbers                    7/2 = (3·2+1)/2 = 3½
CMI  Conversion to improper fractions               3½ = (3·2+1)/2 = 7/2
Misconceptions
Label   Description                      Occurrence
MAD     a/b + c/d = (a+c)/(b+d)          14.8%
MSB     a/b − c/d = (a−c)/(b−d)           9.4%
MMT1    a/b · c/b = (a·c)/b              14.1%
MMT2    a/b · c/b = (a+c)/(b·b)           8.1%
MMT3    a/b · c/d = (a·d)/(b·c)          15.4%
MMT4    a/b · c/d = (a·c)/(b+d)           8.1%
MC      a/b · c = (a·b)/c                 4.0%
Student model
[Figure: the student model, a Bayesian network over skill nodes (CP, MT, CL, CMI, CIM, CD, AD, SB), misconception nodes (MMT1, MMT2, MMT3, MMT4, MC, MAD, MSB), and auxiliary nodes (HV1, ACL, ACMI, ACIM, ACD).]
Evidence model for task T1
T1: (3/4 · 5/6) − 1/8 = 15/24 − 1/8 = 5/8 − 1/8 = 4/8 = 1/2
T1 ⇔ MT & CL & ACL & SB & ¬MMT3 & ¬MMT4 & ¬MSB
[Figure: the evidence model. The nodes ACL, CL, MT, SB, MMT3, MMT4, MSB are parents of the task node T1; the observed answer X1 depends on T1 through P(X1 | T1).]
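To make the evidence model concrete, here is a small hedged Python sketch: the function below evaluates the logical condition above as a deterministic node, and a hypothetical table P(X1 | T1) then allows for slips and lucky guesses in the observed answer X1 (the 0.95 and 0.10 values are illustrative, not from the source).

def t1(s):
    """T1 is solved correctly iff all required skills are present and no
    interfering misconception is active."""
    return (s["MT"] and s["CL"] and s["ACL"] and s["SB"]
            and not s["MMT3"] and not s["MMT4"] and not s["MSB"])

# Hypothetical observation model P(X1 = correct | T1) with slip/guess probabilities.
P_X1_CORRECT = {True: 0.95, False: 0.10}

skills = {"MT": True, "CL": True, "ACL": True, "SB": True,
          "MMT3": False, "MMT4": False, "MSB": False}
print(P_X1_CORRECT[t1(skills)])  # 0.95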
Skill Prediction Quality
[Figure: quality of skill predictions versus the number of answered questions (0 to 20), comparing four question orderings: adaptive, average, descending, and ascending.]
Application 2: Troubleshooting - Light print problem
[Figure: the troubleshooting model with fault nodes F1-F4, the problem node F, action nodes A1-A3, and question node Q1.]
• Problems: F1 Distribution problem, F2 Defective toner, F3 Corrupted dataflow, and F4 Wrong driver setting.
• Actions: A1 Remove, shake and reseat toner, A2 Try another toner, and A3 Cycle power.
• Questions: Q1 Is the configuration page printed light?
Troubleshooting strategy
[Figure: a troubleshooting strategy. The question Q1 is asked first; depending on its answer, the actions A1 and A2 are tried in one order or the other until the problem is fixed.]
The task is to find a strategy s ∈ S minimising the expected cost of repair
E_CR(s) = Σ_{ℓ ∈ L(s)} P(e_ℓ) · ( t(e_ℓ) + c(e_ℓ) ).
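A minimal Python sketch of this expected-cost computation follows; the strategy, probabilities, and costs are hypothetical, and interpreting c(e_ℓ) as the cost of a final service call is an assumption made only for this example.

def expected_cost_of_repair(terminals, prob, t, c):
    """E_CR(s) = sum over terminal nodes l in L(s) of P(e_l) * (t(e_l) + c(e_l))."""
    return sum(prob(e) * (t(e) + c(e)) for e in terminals)

# Hypothetical strategy: try A1, then A2 if A1 did not solve the problem.
terminals = ["A1=yes", "A1=no,A2=yes", "A1=no,A2=no"]
P = {"A1=yes": 0.5, "A1=no,A2=yes": 0.3, "A1=no,A2=no": 0.2}
T = {"A1=yes": 5.0, "A1=no,A2=yes": 15.0, "A1=no,A2=no": 15.0}   # cost of the steps performed
C = {"A1=yes": 0.0, "A1=no,A2=yes": 0.0, "A1=no,A2=no": 100.0}   # assumed cost of calling service
print(expected_cost_of_repair(terminals, P.get, T.get, C.get))   # 2.5 + 4.5 + 23.0 = 30.0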
Going commercial...
• Hugin Expert A/S. Software product: Hugin, a Bayesian network tool. http://www.hugin.com/
• Educational Testing Service (ETS), the world's largest private educational testing organization. In 2000/2001 more than 3 million students took the ETS's largest exam, the SAT. A research unit does research on adaptive tests using Bayesian networks: http://www.ets.org/research/
• SACSO Project (Systems for Automatic Customer Support Operations), a research project of Hewlett Packard and Aalborg University. The troubleshooter is offered as DezisionWorks by Dezide Ltd. http://www.dezide.com/