The Kalman filter - and other methods
Anders Ringgaard Kristensen
Slide 1
Outline
Filtering techniques applied to monitoring of daily gain in slaughter pigs:
• Introduction
• Basic monitoring
• Shewhart control charts
• DLM and the Kalman filter
• Simple case
• Seasonality
• Online monitoring
• Used as input to decision support
Slide 2
"E-kontrol", slaughter pigs
Quarterly calculated production results
Presented as a table
One result for each of the most recent quarters, plus an aggregated result
Sometimes a comparison with expected (target) values
Offered by two companies:
• Dansk Landbrugsrådgivning, Landscentret (as shown)
• AgroSoft A/S
One of the most important key figures: Average daily gain
Slide 3
Average daily gain, slaughter pigs
We have:
• 4 quarterly results
• 1 annual result
• 1 target value
How do we interpret the results?
Question 1: How is the figure calculated?
Slide 4
How is the figure calculated?
The basic principles are:
• Total (live) weight of pigs delivered: xxxx *
• Total weight of piglets inserted: −xxxx **
• Valuation weight at end of the quarter: +xxxx ***
• Valuation weight at beginning of the quarter: −xxxx ***
• Total gain during the quarter: yyyy
Daily gain = (Total gain)/(Days in feed)
Registration sources?
• * Slaughter house: rather precise
• ** Scale: very precise
• *** ???: anything from very precise to very uncertain
Slide 5
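The calculation above can be sketched in a few lines of Python. This is only an illustration: the function name, the argument names, the use of kg as input unit and every number in the example are assumptions, not figures from the report.

```python
def daily_gain_g(delivered_kg, inserted_kg, valuation_end_kg,
                 valuation_start_kg, days_in_feed):
    """Average daily gain in grams per pig per day for one quarter."""
    # Total gain = delivered - inserted + end valuation - start valuation.
    total_gain_kg = (delivered_kg - inserted_kg
                     + valuation_end_kg - valuation_start_kg)
    return 1000.0 * total_gain_kg / days_in_feed


# Hypothetical quarter; all figures are invented for illustration.
example_gain = daily_gain_g(
    delivered_kg=120_000,        # slaughter house weights (*)
    inserted_kg=36_000,          # scale weights (**)
    valuation_end_kg=60_000,     # valuation weight (***)
    valuation_start_kg=58_000,   # valuation weight (***)
    days_in_feed=112_000,        # total pig-days in the quarter
)
print(round(example_gain, 1))
```

With these invented inputs the total gain is 86,000 kg, which gives roughly 768 g per day. Any error in the two valuation weights enters the total gain directly.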
First finding: Observation error
All measurements are subject to uncertainty (error), but the error is largest for the valuation weights. We define a (very simple) model: $\kappa = \tau + e_o$, where:
• $\kappa$ is the calculated daily gain (as it appears in the report)
• $\tau$ is the true daily gain (which we wish to estimate)
• $e_o$ is the observation error, which we assume is normally distributed: $e_o \sim N(0, \sigma_o^2)$
The structure of the model (qualitative knowledge) is the equation.
The parameter (quantitative knowledge) is the value of $\sigma_o$ (the standard deviation of the observation error). It depends on the observation method.
Slide 6
Observation error
$\kappa = \tau + e_o$, $e_o \sim N(0, \sigma_o^2)$
[Diagram: $\tau \to \kappa$]
What we measure is $\kappa$. What we wish to know is $\tau$. The difference between the two variables is undesired noise.
We wish to filter the noise away, i.e. we wish to estimate $\tau$ from $\kappa$.
Slide 7
Second finding: Randomness
The true daily gains $\tau$ vary at random. Even if we produce under exactly the same conditions in two successive quarters, the results will differ. We shall call this phenomenon the “sample error”. We have $\tau = \theta + e_s$, where
• $e_s$ is the sample error expressing random variation. We assume $e_s \sim N(0, \sigma_s^2)$
• $\theta$ is the underlying permanent (and true) value
This supplementary qualitative knowledge should be reflected in the structure of the model: $\kappa = \tau + e_o = \theta + e_s + e_o$
The parameters of the model are now $\sigma_s$ and $\sigma_o$.
Slide 8
Sample error and measurement error
[Diagram: $\theta \to \tau \to \kappa$]
What we measure is $\kappa$. What we wish to know is $\theta$. The difference between the two variables is undesired noise:
• Sample noise
• Observation noise
We wish to filter the noise away, i.e. we wish to estimate $\theta$ from $\kappa$.
Slide 9
The model in practice: Preconditions
The model is necessary for any meaningful interpretation of calculated production results.
The standard deviation of the sample error, $\sigma_s$, depends on the natural individual variation between pigs in a herd and on the herd size.
The standard deviation of the observation error, $\sigma_o$, depends on the method used to measure the valuation weights.
For the interpretation of the calculated results, it is the total uncertainty, $\sigma$, that matters ($\sigma^2 = \sigma_s^2 + \sigma_o^2$).
Qualified guesses of the value of $\sigma$ using different observation methods (1250 pigs):
• Weighing of all pigs: $\sigma$ = 3 g
• Stratified sample: $\sigma$ = 7 g
• Random sample: $\sigma$ = 20 g
• Visual assessment: $\sigma$ = 29 g
Slide 10
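A small simulation can illustrate why the two error sources combine as $\sigma^2 = \sigma_s^2 + \sigma_o^2$. It is a sketch under stated assumptions: the two errors are taken to be independent and normally distributed, and the values of `SIGMA_S`, `SIGMA_O` and `THETA` are hypothetical, not estimates from any herd.

```python
import math
import random
import statistics

SIGMA_S = 15.0   # hypothetical sample-error standard deviation (g)
SIGMA_O = 20.0   # hypothetical observation-error standard deviation (g)
THETA = 775.0    # hypothetical underlying level (g)

rng = random.Random(0)  # fixed seed so the run is repeatable


def simulate_kappa():
    """Draw one calculated daily gain: kappa = theta + e_s + e_o."""
    tau = THETA + rng.gauss(0.0, SIGMA_S)   # true gain = level + sample error
    return tau + rng.gauss(0.0, SIGMA_O)    # add the observation error


kappas = [simulate_kappa() for _ in range(50_000)]

sigma_total = math.sqrt(SIGMA_S**2 + SIGMA_O**2)   # theoretical total, 25 g
sigma_empirical = statistics.stdev(kappas)         # spread of simulated results
print(sigma_total, round(sigma_empirical, 1))
```

The empirical standard deviation should land close to the theoretical 25 g, which is larger than either component alone.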
Different observation methods
[Figure: $\theta$, $\tau$ and four observed values $\kappa$, one for each observation method ($\sigma$ = 3 g, 7 g, 20 g and 29 g)]
Slide 11
The model in practice: Interpretation
The calculated daily gain in a herd was 750 g, whereas the expected target value was 775 g. Should we be worried?
It depends on the observation method!
A lower control limit (LCL) is the target minus 2 times the standard deviation, i.e. $775 - 2\sigma$.
Using each of the 4 observation methods, we obtain the following LCLs:
• Weighing of all pigs: 775 g − 2 × 3 g = 769 g
• Stratified sample: 775 g − 2 × 7 g = 761 g
• Random sample: 775 g − 2 × 20 g = 735 g
• Visual assessment: 775 g − 2 × 29 g = 717 g
Slide 12
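The comparison can be made explicit with a short script. The target, the observed value and the four standard deviations are taken from the slides; the function and variable names are only illustrative.

```python
TARGET_G = 775.0     # expected (target) daily gain
OBSERVED_G = 750.0   # calculated daily gain

# Total standard deviations (g) quoted for the four observation methods.
SIGMA_BY_METHOD = {
    "Weighing of all pigs": 3.0,
    "Stratified sample": 7.0,
    "Random sample": 20.0,
    "Visual assessment": 29.0,
}


def lower_control_limit(target, sigma, k=2.0):
    """Target minus k standard deviations (k = 2 as on the slide)."""
    return target - k * sigma


results = {}
for method, sigma in SIGMA_BY_METHOD.items():
    lcl = lower_control_limit(TARGET_G, sigma)
    below = OBSERVED_G < lcl
    results[method] = (lcl, below)
    print(f"{method}: LCL = {lcl:.0f} g, observed below LCL: {below}")
```

The observed 750 g falls below the LCL when all pigs are weighed (769 g) or a stratified sample is used (761 g), but not with a random sample (735 g) or visual assessment (717 g). The same figure is alarming under a precise method and unremarkable under an imprecise one.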
Third finding: Dynamics, time
Daily gain in a herd over 4 years. Is this good or bad?
[Figure: Daily gain, slaughter pigs (g, axis from 600 to 950), by quarter from 2nd quarter 1997 to 2nd quarter 2001]
Slide 13
Modeling dynamics
We extend our model to include time. At time $n$ we model the calculated result as follows: $\kappa_n = \tau_n + e_{on} = \theta + e_{sn} + e_{on}$
The only change from before is that we know we have a new result each quarter.
We can calculate control limits for each quarter and plot everything in a diagram: a Shewhart control chart.
[Diagram: $\theta \to \tau_1, \tau_2, \tau_3, \tau_4, \ldots$ and $\tau_n \to \kappa_n$]
Slide 14
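A minimal sketch of such a chart, assuming a constant target and limits at plus or minus 2 standard deviations as in the LCL example. The quarterly series is hypothetical, and the function only lists the out-of-limit quarters rather than drawing a plot.

```python
def shewhart_chart(observations, target, sigma, k=2.0):
    """Return (LCL, UCL, indices of observations outside the limits)."""
    lcl = target - k * sigma
    ucl = target + k * sigma
    outside = [i for i, x in enumerate(observations) if x < lcl or x > ucl]
    return lcl, ucl, outside


# Hypothetical quarterly daily gains (g); not data from the herd shown.
quarterly_gain = [781.0, 768.0, 752.0, 790.0, 806.0, 771.0]

# sigma = 7 g corresponds to the stratified-sample guess.
lcl, ucl, flagged = shewhart_chart(quarterly_gain, target=775.0, sigma=7.0)
print(lcl, ucl, flagged)
```

Here the limits are 761 g and 789 g, and the quarters at index 2, 3 and 4 fall outside them. Rerunning with `sigma=29.0` widens the limits to 717 g and 833 g, and nothing is flagged.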
A simple Shewhart control chart: Weighing all pigs
[Figure: Daily gain, slaughter pigs (g, axis from 600 to 950), by period from 2nd quarter 1997 to 2nd quarter 2001. Series: observed gain, expected, upper control limit, lower control limit]
Slide 15
Simple Shewhart control chart: Visual assessment
[Figure: Daily gain, slaughter pigs (g, axis from 600 to 950), by period from 2nd quarter 1997 to 2nd quarter 2001. Series: observed gain, expected, upper control limit, lower control limit]
Slide 16
Interpretation: Conclusion
Something is wrong! Possible explanations:
• The pig farmer has serious problems with fluctuating daily gains.
• Something is wrong with the model:
  • Structure: our qualitative knowledge
  • Parameters: the quantitative knowledge (standard deviations)
Slide 17
More findings: $\kappa_n = \theta + e_{sn} + e_{on}$
The true underlying daily gain in the herd, $\theta$, may change over time:
• Trend
• Seasonal variation
The sample error $e_{sn}$ may be autocorrelated:
• Temporary influences
The observation error $e_{on}$ is obviously autocorrelated:
• The valuation weight at the end of Quarter $n$ is the same as the valuation weight at the start of Quarter $n+1$
Slide 18
"Dynamisk e-kontrol"
Developed and described by Madsen & Ruby (2000). Principles:
• Avoid labor-intensive valuation weighing.
• Calculate a new daily gain every time pigs have been sent to slaughter (typically weekly).
• Use a simple Dynamic Linear Model to monitor daily gain:
  • $\kappa_n = \theta_n + e_{sn} + e_{on} = \theta_n + v_n$, where $v_n \sim N(0, \sigma_v^2)$
  • $\theta_n = \theta_{n-1} + w_n$, where $w_n \sim N(0, \sigma_w^2)$
• The calculated results are filtered by the Kalman filter in order to remove random noise (sample error + observation error).
Slide 19
"Dynamisk E-kontrol", results
Raw data to the left, filtered data to the right.
Figures from:
• Madsen & Ruby (2000). An application for early detection of growth rate changes in the slaughter pig production unit. Computers and Electronics in Agriculture 25, 261-270.
Still: Results only available after slaughter
Slide 20
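The Dynamic Linear Model (DLM)
Example
Observation equation: $\kappa_n = \theta_n + v_n$, $v_n \sim N(0, \sigma_v^2)$
System equation: $\theta_n = \theta_{n-1} + w_n$, $w_n \sim N(0, \sigma_w^2)$
[Diagram: $\theta_1 \to \theta_2 \to \theta_3 \to \theta_4$, with $\theta_n \to \tau_n \to \kappa_n$ for each $n$]
Slide 21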
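For this simple model the Kalman filter reduces to a few scalar updates. The sketch below uses the standard recursions for a local-level DLM; the prior mean `m0`, the prior variance `c0`, the two variances and the observation series are hypothetical values chosen for illustration, not the settings of Madsen & Ruby (2000).

```python
def kalman_local_level(observations, m0, c0, var_v, var_w):
    """Filter kappa_n = theta_n + v_n, theta_n = theta_{n-1} + w_n.

    Returns one tuple per observation:
    (filtered mean, filtered variance, forecast, forecast variance).
    """
    m, c = m0, c0
    steps = []
    for kappa in observations:
        r = c + var_w          # prior variance of theta_n
        f = m                  # one-step forecast of kappa_n
        q = r + var_v          # forecast variance
        e = kappa - f          # forecast error
        a = r / q              # adaptive coefficient (Kalman gain)
        m = m + a * e          # filtered mean of theta_n
        c = r - a * a * q      # filtered variance of theta_n
        steps.append((m, c, f, q))
    return steps


# Hypothetical weekly results (g) and hypothetical variances (g^2).
weekly_gain = [750.0, 782.0, 768.0, 741.0, 779.0]
filtered = kalman_local_level(weekly_gain, m0=775.0, c0=75.0,
                              var_v=400.0, var_w=25.0)
for mean, variance, forecast, forecast_var in filtered:
    print(round(mean, 1), round(variance, 1))
```

With these numbers the first observation of 750 g moves the estimate only from 775 g to 770 g, because the gain is 100/500 = 0.2. Most of the 25 g deviation is treated as noise. A larger `var_w` relative to `var_v` would make the filter follow the observations more closely.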
Extending the model
$F_n \theta_n$ is the true level described as a vector product.
A general level, $\theta_{0n}$, and 4 seasonal effects $\theta_{1n}$, $\theta_{2n}$, $\theta_{3n}$ and $\theta_{4n}$ are included in the model.
From the model we are able to predict the expected daily gain for next quarter.
As long as the forecast errors are small, production is in control (no large change in true underlying level)!
Slide 22
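One way to read the vector product is sketched below. The text here does not spell out $F_n$, so the 0/1 layout (a 1 for the general level and a 1 for the current quarter's effect) is an assumption, as are the parameter estimates, the forecast standard deviation and the 2-standard-deviation threshold.

```python
def design_vector(quarter):
    """Assumed F_n: 1 for the general level, 1 for this quarter's effect (1..4)."""
    f = [1.0, 0.0, 0.0, 0.0, 0.0]
    f[quarter] = 1.0
    return f


def forecast(theta, quarter):
    """Expected daily gain F_n * theta_n for the given quarter."""
    return sum(f_i * t_i for f_i, t_i in zip(design_vector(quarter), theta))


# Hypothetical estimates: general level followed by 4 seasonal effects (g).
theta_hat = [770.0, 10.0, -5.0, -15.0, 10.0]

predicted = forecast(theta_hat, quarter=3)   # level plus effect of quarter 3
observed = 750.0                             # hypothetical calculated result
forecast_error = observed - predicted

FORECAST_SD = 12.0                           # hypothetical forecast std. dev. (g)
in_control = abs(forecast_error) <= 2.0 * FORECAST_SD
print(predicted, forecast_error, in_control)
```

With these invented numbers the prediction for the third quarter is 755 g, so an observed 750 g gives a forecast error of only −5 g. The result is judged against the seasonally adjusted expectation rather than a fixed target.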
Observed and predicted
[Figure: Daily gain (g, axis from 600 to 950) by quarter, from 2nd quarter 1997 to 4th quarter 2000. Blue: observed. Pink: predicted]
Slide 23