Decision support methods – revisited
Anders Ringgaard Kristensen
Markov decision processes – aka “dynamic programming”
• A truly dynamic method – may run “forever”
• Handles the combinatorial explosion very efficiently
The most important limitations are:
• The Markov property
• Full observability of the state space
• The curse of dimensionality
Advanced Quantitative Methods in Herd Management
The Markov property
Formal definition: Let i_t be the state at stage t. The Markov property is satisfied if and only if
P(i_{t+1} | i_t, i_{t-1}, …, i_1) = P(i_{t+1} | i_t)
In words: the distribution of the state at the next stage depends only on the present state – previous states are not relevant. This property is crucial in Markov decision processes.
[Figure: a chain of milk yield observations, Milk → Milk → Milk → Milk; red edges skipping back to earlier stages violate the Markov property.]
If our biological knowledge implies that the red edges should be there, we need to take this into account by
1. Memory variables
2. Bayesian updating techniques
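To make the violation concrete, here is a minimal sketch (with made-up numbers) of a yield process whose next state depends on both the current and the previous state – exactly the situation the red edges describe. A process with this structure is not Markov in the single-stage state:

```python
import numpy as np

# Hypothetical second-order process (illustrative numbers, not from the
# lecture).  States: 0 = low milk yield, 1 = high milk yield.
# T[prev, cur, :] = distribution of the next state.
T = np.array([
    [[0.8, 0.2],   # prev=low,  cur=low
     [0.5, 0.5]],  # prev=low,  cur=high
    [[0.6, 0.4],   # prev=high, cur=low
     [0.2, 0.8]],  # prev=high, cur=high
])

# The Markov property would require P(next | cur) to be the same no
# matter what prev was.  Here it is not: given cur = low, the next-state
# distribution still depends on prev.
print(T[0, 0])  # P(next | prev=low,  cur=low)
print(T[1, 0])  # P(next | prev=high, cur=low)
```

Because the two printed distributions differ, conditioning on the current state alone is not enough – which is why memory variables or Bayesian updating are needed.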
The Markov property – how to compensate I
[Figure: milk yields where each observation also depends on the previous one.]
If the “biological truth” is as shown above, we may include memory variables in the state space: each state is extended with the previous milk yield, (Milk, prev), so that the relevant part of the past is carried along in the state itself.
The trick has been used in numerous dairy cow (and sow) replacement studies.
Advanced Quantitative Methods in Herd Management
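The memory-variable trick can be sketched directly: take a second-order process (illustrative numbers assumed below) and define the augmented state as the pair (prev, cur). Transitions then go from (prev, cur) to (cur, next), and the result is an ordinary first-order Markov chain:

```python
import numpy as np

# Assumed second-order model, T[prev, cur, next] (illustrative numbers).
T = np.array([
    [[0.8, 0.2], [0.5, 0.5]],
    [[0.6, 0.4], [0.2, 0.8]],
])

n = 2                          # number of yield levels
P = np.zeros((n * n, n * n))   # transition matrix over augmented states
for prev in range(n):
    for cur in range(n):
        for nxt in range(n):
            # augmented state index: prev * n + cur
            P[prev * n + cur, cur * n + nxt] = T[prev, cur, nxt]

# P is a valid first-order transition matrix: every row sums to 1, and
# the next augmented state depends only on the current augmented state.
print(P.round(2))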
What is this? Our sheep litter size model from the mandatory report:
Y_n = μ_n + A + ε_n
Does the model make sense?
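A quick simulation shows what the permanent animal effect A does in this model (variances and means below are illustrative assumptions, not values from the report). Because A is shared across parities within a ewe, repeated litter sizes of the same ewe are correlated – so earlier observations do carry information about later ones:

```python
import numpy as np

# Illustrative simulation of Y_n = mu_n + A + eps_n.
# Assumed: A ~ N(0, 1) per ewe (permanent effect), eps_n ~ N(0, 1).
rng = np.random.default_rng(42)
n_ewes = 10_000
mu = np.array([1.6, 2.0])          # assumed parity-specific means

A = rng.normal(0.0, 1.0, n_ewes)   # shared across parities within a ewe
Y1 = mu[0] + A + rng.normal(0.0, 1.0, n_ewes)
Y2 = mu[1] + A + rng.normal(0.0, 1.0, n_ewes)

# In theory corr(Y1, Y2) = var(A) / (var(A) + var(eps)) = 0.5 here.
print(round(float(np.corrcoef(Y1, Y2)[0, 1]), 2))
```

This correlation is precisely the kind of dependence on the past that a plain Markov model ignores – and that Bayesian updating of a latent potential (next slide) captures.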
The Markov property – how to compensate II
A more general approach is to introduce a latent (unobservable) variable interpreted as the milk production potential (MPP) of the cow. Each time a new milk yield observation is made, the MPP is re-estimated using Bayesian updating. The estimated milk production potential, eMPP, is included in the state space. It works because eMPP is observable!
[Figure: at each stage the observed Milk updates eMPP, which is carried forward as a state variable.]
The trick is used in several newer dairy cow replacement studies.
Advanced Quantitative Methods in Herd Management
Changing to an MDP (Figure 12.8 in the textbook)
Dynamic linear model:
• Y_n = μ_n + A_n + ε_n
• A_n = A_{n-1}
Kalman filter: updates the estimate of A_n each time a new litter size is observed.
Advanced Quantitative Methods in Herd Management
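Because the system equation is simply A_n = A_{n-1} (no system noise), the Kalman filter for this model reduces to a scalar conjugate normal update. A minimal sketch, with assumed illustrative variances and observations:

```python
# Scalar Kalman filter for Y_n = mu_n + A_n + eps_n, A_n = A_(n-1).
# Prior A ~ N(m, C); observation noise eps ~ N(0, V).  All numbers
# below are assumed illustrative values, not taken from the report.

def kalman_update(m, C, y, mu, V):
    """One Kalman step; returns posterior mean and variance of A."""
    K = C / (C + V)                # Kalman gain
    m_new = m + K * (y - mu - m)   # shift towards the observation error
    C_new = (1.0 - K) * C          # uncertainty shrinks with each update
    return m_new, C_new

m, C = 0.0, 1.0                    # prior: A ~ N(0, 1)
V = 1.0                            # assumed observation variance
for y, mu in [(2.4, 2.0), (1.9, 2.0), (2.6, 2.0)]:
    m, C = kalman_update(m, C, y, mu, V)
print(round(m, 3), round(C, 3))    # posterior estimate of A after 3 obs.
```

Note how C only decreases: each litter size observation makes the estimate of the ewe's permanent potential more certain, and it is this (observable) estimate that enters the MDP state space.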
The dynamic case – when time matters
[Figure: a dynamic network over times t, t+1, t+2, t+3 with nodes for mastitis (Mast), milk yield (Milk) and age (Age) at each time, decision nodes (Dec: replace, treat, keep) and utility nodes (Util: gross margin).]
Advanced Quantitative Methods in Herd Management
The curse of dimensionality – and how to avoid it
When several state variables (cow traits) are considered at a realistic number of levels, the state space grows to prohibitive dimensions; dairy cow replacement models often have millions of state combinations.
The solution is to decompose the state space according to time and build a hierarchical model. This has a tremendous effect on computational performance – even models with millions of state combinations can be solved. The technique has been used in numerous dairy cow replacement models.
Implemented in:
• The MLHMP software system, as used in this course
• The MDP package for R
• SIMBA – the Israeli MLHMP project
Advanced Quantitative Methods in Herd Management
A hierarchical Markov decision process with three levels
BI: breeding index; ePP: estimated permanent potential; eTP: estimated temporary potential.
[Figure: founder level – the herd life of the cow, with state variable BI; lactation level – 1st and 2nd lactation, with state variable ePP; month/week/day level – with state variable eTP, decisions (Dec) and utilities (Util). Other decisions and utilities are omitted for clarity.]
Advanced Quantitative Methods in Herd Management
Markov decision processes – summary
In a Markov decision process we have:
• A structure
• Time as stages
• A state being observed at each stage – often defined by the values of several state variables
• An action being taken when the state is known
• A numerical content
• Rewards depending on state and action (and outputs of various kinds …)
• Transition probabilities from state i to state j, depending on the action
• The Markov property
Algorithms:
• Value iteration: finite time horizon
• Policy iteration: infinite time horizon
Advanced Quantitative Methods in Herd Management
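The finite-horizon case can be sketched in a few lines of backward induction (value iteration). The 2-state, 2-action model below is purely illustrative – think of a low/high yield cow and a keep/replace decision – with made-up rewards and transition probabilities:

```python
import numpy as np

# Illustrative finite-horizon MDP (all numbers assumed).
n_actions, horizon = 2, 3

# rewards[s, a]: immediate reward for action a in state s
rewards = np.array([[0.0, -1.0],    # low yield:  keep / replace
                    [2.0, -1.0]])   # high yield: keep / replace

# P[a, s, s']: transition probabilities under action a
P = np.array([
    [[0.7, 0.3],   # keep: a low-yield cow tends to stay low
     [0.4, 0.6]],
    [[0.1, 0.9],   # replace: a fresh cow likely starts in the high state
     [0.1, 0.9]],
])

V = np.zeros(2)                  # terminal values
policy = []
for t in range(horizon):         # backward induction over stages
    Q = rewards + np.array([P[a] @ V for a in range(n_actions)]).T
    policy.append(Q.argmax(axis=1))
    V = Q.max(axis=1)

print(V.round(3))    # optimal expected total reward per starting state
print(policy[-1])    # optimal first-stage action (0 = keep, 1 = replace)
```

For the infinite-horizon case, policy iteration would instead alternate between solving the value equations for a fixed policy and improving the policy greedily until it stops changing.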
Hierarchical Markov processes
In a hierarchical model, each level is modeled by separate Markov decision processes. The uppermost level is called the founder; lower levels are called children (child levels). They all have the usual properties of an MDP:
• Structure (stages, states, actions)
• Numerical content (rewards, outputs, transition probabilities)
But:
• The numerical content is only specified at the lowest level
• Higher levels calculate their parameters from their children
Advanced Quantitative Methods in Herd Management
Hierarchical Markov processes
In a state of a process ρ at child level n we know:
• The stage and state of process ρ
• The stage, state and action of the process at level n−1
• …
• The state and action of the founder
[Figure: sow example with levels Sow → Parity → Phase, where ρ is a phase-level process.]
Advanced Quantitative Methods in Herd Management