Slide 1

Advanced Herd Management Probabilities and distributions

Anders Ringgaard Kristensen

Slide 2

Outline

  • Probabilities
  • Conditional probabilities
  • Bayes’ theorem
  • Distributions
    • Discrete
    • Continuous
  • Distribution functions
  • Sampling from distributions
    • Estimation
    • Hypotheses
    • Confidence intervals

Slide 3

Probabilities: Basic concepts

The probability concept is used in daily language. What do we mean when we say:

  • The probability of the outcome ”5” when rolling a die is 1/6?
  • The probability that cow no. 543 is pregnant is 0.40?
  • The probability that the USA will attack North Korea within 5 years is 0.05?

Slide 4

Interpretations of probabilities

At least 3 different interpretations are observed:

  • A “frequentist” interpretation: the probability expresses how frequently we will observe a given outcome if exactly the same experiment is repeated a “large” number of times. The value is rather objective.
  • An objective belief interpretation: the probability expresses our belief in a certain (unobservable) state or event. The belief may be based on an underlying frequentist interpretation of similar cases and thus be rather objective.
  • A subjective belief interpretation: the probability expresses our belief in a certain unobservable (or not yet observed) event.

Slide 5

”Experiments”

An experiment may be anything creating an outcome we can observe. The sample space S is the set of all possible outcomes. An event A is a subset of S, i.e. A ⊆ S. Two events A1 and A2 are called disjoint, if they have no common outcomes, i.e. if A1 ∩ A2 = ∅.

Slide 6

Example of experiment

Rolling a die:

  • The sample space is S = {1, 2, 3, 4, 5, 6}
  • Examples of events:
    • A1 = {1}
    • A2 = {1, 5}
    • A3 = {4, 5, 6}
  • Since A1 ∩ A3 = ∅, A1 and A3 are disjoint.
  • A1 and A2 are not disjoint, because A1 ∩ A2 = {1} ≠ ∅

Slide 7

A simplified definition

Let S be the sample space of an experiment. A probability distribution P on S is a function such that:

  • P(S) = 1
  • For any event A ⊆ S: 0 ≤ P(A) ≤ 1
  • For any two disjoint events A1 and A2: P(A1 ∪ A2) = P(A1) + P(A2)

Slide 8

Example: Rolling a die

Like before: S = {1, 2, 3, 4, 5, 6}. A valid probability function on S is, for A ⊆ S:

  • P(A) = |A|/6, where |A| is the size of A (i.e. the number of elements it contains)
  • P({1}) = P({2}) = P({3}) = P({4}) = P({5}) = P({6}) = 1/6
  • P({1, 5}) = 2/6 = 1/3
  • P({1, 2, 3}) = 3/6 = 1/2

Notice that many other valid probability functions could be defined (even though the one above is the only one that makes sense from a frequentist point of view).
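
As a small illustration (not part of the original slides), this probability function can be checked in a few lines of Python; the function name P is chosen only to mirror the notation:

```python
from fractions import Fraction

# Sample space for one roll of a fair die
S = frozenset({1, 2, 3, 4, 5, 6})

def P(A):
    """P(A) = |A| / 6 for any event A that is a subset of S."""
    A = frozenset(A)
    assert A <= S, "A must be a subset of the sample space"
    return Fraction(len(A), len(S))

print(P({1}))         # 1/6
print(P({1, 5}))      # 1/3
print(P({1, 2, 3}))   # 1/2
print(P(S))           # 1  (axiom: P(S) = 1)
```

Using exact fractions avoids rounding and makes the axioms easy to verify directly.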

Slide 9

Independence

If two events A and B are independent, then P(A ∩ B) = P(A)P(B).

Example: Rolling two dice

  • S = {(1, 1), (1, 2), …, (1, 6), …, (6, 6)}
  • For any A ⊆ S: P(A) = |A|/36
  • A = {(6, 1), (6, 2), …, (6, 6)} ⇒ P(A) = 6/36 = 1/6
  • B = {(1, 6), (2, 6), …, (6, 6)} ⇒ P(B) = 6/36 = 1/6
  • A ∩ B = {(6, 6)} and P(A ∩ B) = (1/6)(1/6) = 1/36
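
The two-dice example can be verified by enumerating the sample space; this sketch is my own addition, not from the slides:

```python
from fractions import Fraction
from itertools import product

# Sample space for rolling two dice: 36 equally likely outcomes
S = set(product(range(1, 7), repeat=2))

def P(A):
    """P(A) = |A| / 36 for any event A contained in S."""
    return Fraction(len(A & S), len(S))

A = {(6, j) for j in range(1, 7)}   # first die shows 6
B = {(i, 6) for i in range(1, 7)}   # second die shows 6

assert P(A) == P(B) == Fraction(1, 6)
assert P(A & B) == P(A) * P(B) == Fraction(1, 36)   # A and B are independent
print("P(A ∩ B) =", P(A & B))
```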

Slide 10

Conditional probabilities

Let A and B be two events, where P(B) > 0. The probability of A given B is written P(A | B), and it is by definition

  • P(A | B) = P(A ∩ B) / P(B)

Slide 11

Example: Rolling a die

Again, let S = {1, 2, 3, 4, 5, 6} and P(A) = |A|/6. Define B = {1, 2, 3} and A = {2}. Then A ∩ B = {2}, and

  • P(A | B) = P(A ∩ B)/P(B) = (1/6)/(1/2) = 1/3

The logical result: if you know the outcome is 1, 2 or 3, it is reasonable to

Slide 12

Conditional sum rule

Let A1, A2, …, An be pairwise disjoint events so that A1 ∪ A2 ∪ … ∪ An = S. Let B be an event so that P(B) > 0. Then

  • P(B) = P(B | A1)P(A1) + P(B | A2)P(A2) + … + P(B | An)P(An)


Slide 13

Sum rule: Dice example

Define the 3 disjoint events A1 = {1, 2}, A2 = {3, 4}, A3 = {5, 6}. Thus A1 ∪ A2 ∪ A3 = S. Define B = {1, 3, 5} (we know that P(B) = ½).

  • P(B | A1) = P(B ∩ A1)/P(A1) = (1/6)/(1/3) = ½
  • P(B | A2) = P(B ∩ A2)/P(A2) = (1/6)/(1/3) = ½
  • P(B | A3) = P(B ∩ A3)/P(A3) = (1/6)/(1/3) = ½

Thus P(B) = P(B | A1)P(A1) + P(B | A2)P(A2) + P(B | A3)P(A3) = ½·⅓ + ½·⅓ + ½·⅓ = ½
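
The sum rule on the die example can be checked mechanically; this snippet is an illustration of mine, not part of the slides:

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}

def P(A):
    """Uniform probability on the die's sample space."""
    return Fraction(len(set(A) & S), len(S))

A = [{1, 2}, {3, 4}, {5, 6}]   # pairwise disjoint, union is S
B = {1, 3, 5}

# Conditional sum rule: P(B) = sum_i P(B | A_i) P(A_i)
total = sum((P(B & Ai) / P(Ai)) * P(Ai) for Ai in A)
assert total == P(B) == Fraction(1, 2)
print("P(B) =", total)
```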

Slide 14

Bayes’ theorem

Let A1, A2, …, An be pairwise disjoint events so that A1 ∪ A2 ∪ … ∪ An = S. Let B be an event so that P(B) > 0. Then

  • P(Ai | B) = P(B | Ai)P(Ai) / (P(B | A1)P(A1) + … + P(B | An)P(An))

Bayes’ theorem is extremely important in all kinds of reasoning under uncertainty: updating of belief.

Slide 15

Updating of belief, I

In a dairy herd, the conception rate is known to be 0.40. Define M as the event ”mating” for a cow. Define Π+ as the event ”pregnant” for the same cow, and Π- as the event ”not pregnant”. Thus P(Π+ | M) = 0.40 is a conditional probability: given that the cow has been mated, the probability of pregnancy is 0.40. Correspondingly, P(Π- | M) = 0.60.

After 3 weeks the farmer observes the cow for heat. The farmer’s heat detection rate is 0.55. Define H+ as the event that the farmer detects heat. Thus P(H+ | Π-) = 0.55, and P(H- | Π-) = 0.45. There is a slight risk that the farmer erroneously observes a pregnant cow to be in heat. We assume that P(H+ | Π+) = 0.01.

Notice that all probabilities are figures that make sense and are estimated on a routine basis (except P(H+ | Π+), which is a guess).

Slide 16

Updating of belief, II

Now, let us assume that the farmer observes the cow and concludes that it is not in heat. Thus, we have observed the event H-, and we would like to know the probability that the cow is pregnant, i.e. we wish to calculate P(Π+ | H-). We apply Bayes’ theorem:

  • P(Π+ | H-) = P(H- | Π+)P(Π+ | M) / (P(H- | Π+)P(Π+ | M) + P(H- | Π-)P(Π- | M))

We know all probabilities in the formula, and get

  • P(Π+ | H-) = (0.99 × 0.40) / (0.99 × 0.40 + 0.45 × 0.60) = 0.396/0.666 ≈ 0.59

In other words, our belief in the event ”pregnant” increases from 0.40 to 0.59 based on a negative heat observation result.
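
The belief update can be reproduced with a few lines of Python (the variable names are mine; the numbers are those given on the slides):

```python
# Bayes update for the pregnancy example
p_preg = 0.40            # P(pregnant | mated), prior belief
p_heat_open = 0.55       # P(heat detected | not pregnant), detection rate
p_heat_preg = 0.01       # P(heat detected | pregnant), a guess per the slides

p_noheat_preg = 1 - p_heat_preg   # P(no heat | pregnant)     = 0.99
p_noheat_open = 1 - p_heat_open   # P(no heat | not pregnant) = 0.45

# P(pregnant | no heat observed)
posterior = (p_noheat_preg * p_preg) / (
    p_noheat_preg * p_preg + p_noheat_open * (1 - p_preg)
)
print(round(posterior, 2))  # 0.59
```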

Slide 17

Summary of probabilities

Probabilities may be interpreted

  • As frequencies
  • As objective or subjective beliefs in certain events

The belief interpretation enables us to represent uncertain knowledge in a concise way. Bayes’ theorem lets us update our belief (knowledge) as new observations are made.

Slide 18

Discrete distributions

In some cases the probability is defined by a certain function defined over the sample space. In those cases, we say that the outcome is drawn from a standard distribution. There exist standard distributions for many natural phenomena. If the sample space is a countable set, we denote the corresponding distribution as discrete.


Slide 19

Discrete distributions

If X is the random variable representing the outcome, the expected value of a discrete distribution is defined as

  • E(X) = Σx x · P(X = x)

The variance is defined as

  • Var(X) = E((X − E(X))²) = Σx (x − E(X))² · P(X = x)

We shall look at two important discrete distributions:

  • The binomial distribution
  • The Poisson distribution.

Slide 20

The binomial distribution I

Consider an experiment with binary outcomes: Success (s) or failure (f)

  • Mating of a sow → Pregnant (s), not pregnant (f)
  • Tossing a coin → Heads (s), tails (f)
  • Testing for a disease → Present (s), not present (f)

Assume that the probability of success is p and that the experiment is repeated n times. Let K be the total number of successes observed in the n experiments. The sample space of the compound experiment is S = {0, 1, 2, …, n}. The variable K is then said to be binomially distributed with parameters n and p.

Slide 21

The binomial distribution II

The probability function P(K = k) is (by objective frequentist interpretation) given by

  • P(K = k) = C(n, k) · p^k · (1 − p)^(n−k)

where C(n, k) = n! / (k!(n − k)!) is the binomial coefficient, which may be calculated or looked up in a table.
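
A minimal sketch of this probability function in Python (an addition of mine, using the standard library's binomial coefficient):

```python
from math import comb

def binom_pmf(k, n, p):
    """P(K = k) for K binomially distributed with parameters n and p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# The probabilities sum to 1 over the sample space {0, 1, ..., n}
assert abs(sum(binom_pmf(k, 10, 0.5) for k in range(11)) - 1) < 1e-12

print(round(binom_pmf(3, 10, 0.5), 4))  # 0.1172
```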

Slide 22

The binomial distribution III

The mean (expected value) of a binomial distribution is simply E(K) = np. The variance is Var(K) = np(1 − p). The binomial distribution is one of the most frequently used distributions for natural phenomena.

Slide 23

The binomial distribution IV

Three binomial distributions, where n = 10 and p = 0.2, 0.5 and 0.8, respectively.

Slide 24

The Poisson distribution I

If a certain phenomenon occurs at random with a constant intensity (but independently of each other), the total number of occurrences in a time interval of a given length (or in a space of a given area) is Poisson distributed with parameter λ. Examples:

  • Number of (non-infectious) disease cases per month
  • Number of feeding system failures per year
  • Number of labor incidents per year

Slide 25

The Poisson distribution II

The sample space for K is {0, 1, 2, …}. The probability function P(K = k) is (by objective frequentist interpretation) given by

  • P(K = k) = λ^k · e^(−λ) / k!

The expected value is E(K) = λ. The variance is Var(K) = λ. The Poisson distribution may be used as an approximation for a binomial distribution with ”small” p and ”large” n (taking λ = np).
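
A small Python sketch (my addition) of the Poisson probability function, including a check of the binomial approximation mentioned above:

```python
from math import comb, exp, factorial

def poisson_pmf(k, lam):
    """P(K = k) for K Poisson distributed with parameter lam."""
    return lam**k * exp(-lam) / factorial(k)

def binom_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Poisson(n*p) approximates Bin(n, p) for small p and large n
n, p = 1000, 0.002
for k in range(5):
    assert abs(binom_pmf(k, n, p) - poisson_pmf(k, n * p)) < 1e-3

print(round(poisson_pmf(2, 2.0), 4))  # 0.2707
```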

Slide 26

The Poisson distribution III

Three Poisson distributions with λ = 2, 6 and 12, respectively.


Slide 27

Continuous distributions

In some cases, the sample space S of a distribution is not countable. If, furthermore, S is an interval on R, the random variable X taking values in S is said to have a continuous distribution. For any x ∈ S, we have P(X = x) = 0. Thus, no probability function exists for a continuous distribution. Instead, the distribution is defined by a density function f(x).

Slide 28

Density functions

The density function f has the following properties (for a, b ∈ R and a ≤ b):

  • f(x) ≥ 0 for all x
  • ∫−∞^∞ f(x) dx = 1
  • P(a ≤ X ≤ b) = ∫a^b f(x) dx

Thus, for a continuous distribution, f can only be interpreted as a probability when integrated over an interval.

Slide 29

Continuous distributions

For a continuous distribution, the expected value E(X) is defined as

  • E(X) = ∫−∞^∞ x f(x) dx

and the variance is (just like in the discrete case)

  • Var(X) = E((X − E(X))²)

We shall here look at 3 important distributions:

  • The uniform distribution
  • The normal distribution
  • The exponential distribution

Slide 30

The uniform distribution

If S = [a; b], and the random variable X has a uniform distribution on S, then the density function is

  • f(x) = 1/(b − a) for x ∈ [a; b] (and f(x) = 0 otherwise)

The expected value and the variance are

  • E(X) = (a + b)/2
  • Var(X) = (b − a)²/12

[Figure: density function of a uniform distribution]


Slide 31

The normal distribution I

If S = R, and the random variable X has a normal distribution on S, then the density function is

  • f(x) = 1/(σ√(2π)) · exp(−(x − µ)² / (2σ²))

The expected value and the variance simply turn out to be E(X) = µ and Var(X) = σ². We say that X is N(µ, σ²), or X ∼ N(µ, σ²).

Slide 32

The normal distribution II

The normal distribution may be used to represent almost all kinds of random outcomes on the continuous scale in the real world. Exceptions are phenomena that are bounded in some sense (e.g. the waiting time to be served in a queue cannot be negative). It can be shown (central limit theorems) that if X1, X2, …, Xn are random variables of (more or less) any kind, then the sum Y = X1 + X2 + … + Xn is approximately normally distributed for sufficiently large n. The normal distribution is the cornerstone among statistical distributions.
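
The central limit effect is easy to see by simulation; this sketch (my own, not from the slides) sums uniform variables, whose sum has mean n/2 and variance n/12:

```python
import random

random.seed(1)

# Sum of n independent Uniform(0, 1) variables: E = n/2 = 24, Var = n/12 = 4.
# By the central limit theorem the sum is approximately normal.
n, reps = 48, 20_000
sums = [sum(random.random() for _ in range(n)) for _ in range(reps)]

mean = sum(sums) / reps
var = sum((s - mean) ** 2 for s in sums) / reps
print(round(mean, 1), round(var, 1))  # close to 24 and 4
```

A histogram of `sums` would show the familiar bell shape, even though each term is uniform.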

Slide 33

Normal distributions III

Three normal distributions with (µ = 0, σ = 3), (µ = −5, σ = 1) and (µ = 0, σ = 1), respectively.

[Figure: density functions of the three normal distributions]

Slide 34

Normal distributions IV

The normal distribution with µ = 0 and σ = 1 is called the standard normal distribution. A random variable being standard normally distributed is often denoted Z, i.e. Z ∼ N(0, 1). The density function of the standard

Slide 35

Normal distributions V

Let X1 ∼ N(µ1, σ1²), X2 ∼ N(µ2, σ2²), and let X1 and X2 be independent. Define Y1 = X1 + X2 and Y2 = X1 − X2. Then

  • Y1 ∼ N(µ1 + µ2, σ1² + σ2²)
  • Y2 ∼ N(µ1 − µ2, σ1² + σ2²)

Let a and b be arbitrary real numbers, and let X ∼ N(µ, σ²). Define Y = aX + b. Then Y ∼ N(aµ + b, a²σ²).

Slide 36

Normal distributions VI

From the previous slide it follows in particular that if X ∼ N(µ, σ²), then

  • Z = (X − µ)/σ ∼ N(0, 1)

So, if f is the density function of X ∼ N(µ, σ²), then

  • f(x) = (1/σ) · φ((x − µ)/σ)

Thus, we can calculate the value of any density function for a normal distribution from the density function of the standard normal distribution.
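
The standardization identity can be verified numerically; this check is my addition:

```python
from math import exp, pi, sqrt

def phi(z):
    """Density of the standard normal distribution."""
    return exp(-z**2 / 2) / sqrt(2 * pi)

def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2), written out directly."""
    return exp(-((x - mu) ** 2) / (2 * sigma**2)) / (sigma * sqrt(2 * pi))

# Identity: f(x) = phi((x - mu) / sigma) / sigma
mu, sigma, x = 2.0, 3.0, 4.5
assert abs(normal_pdf(x, mu, sigma) - phi((x - mu) / sigma) / sigma) < 1e-12
print("identity holds")
```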


Slide 37

Exponential distribution I

If S = R+ = ]0; ∞[, and the random variable X has an exponential distribution on S, then the density function is

  • f(x) = λ · e^(−λx)

The expected value and the variance are E(X) = λ⁻¹ and Var(X) = λ⁻². We say that X is exponentially distributed with parameter λ.

Slide 38

Exponential distribution II

The exponential distribution is in many ways complementary to the Poisson distribution. If something happens at random with constant intensity, the number of events within a fixed time interval is Poisson distributed, and the waiting time between two events is exponentially distributed. It is less frequently used in herd management.

Slide 39

Exponential distribution III


Three exponential distributions with mean 1, 2 and 5, respectively.

Slide 40

Distribution functions I

The distributions presented have all been defined by their probability functions (discrete distributions) or density functions (continuous distributions). We might just as well have used the distribution function F, which is defined in the same way for both classes of distributions:

  • F(x) = P(X ≤ x)

Slide 41

Distribution functions II

Even though the definition is the same, the value of the distribution function is calculated in different ways for the two classes of distributions:

  • For discrete distributions: F(x) = Σ_{k ≤ x} P(X = k)
  • For continuous distributions: F(x) = ∫−∞^x f(t) dt

Slide 42

Distribution functions III

It follows directly that for a continuous distribution, F′(x) = f(x). The distribution function of the standard normal distribution is often denoted Φ, and naturally Φ′(z) = φ(z). No closed form (formula) exists for Φ; it must be looked up in tables. For discrete distributions, the distribution function most often doesn’t have a closed form either, so it must be looked up in tables.
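
In practice Φ is evaluated numerically rather than from tables; a common route (my addition, not from the slides) is via the error function in Python's standard library:

```python
from math import erf, sqrt

def Phi(z):
    """Distribution function of the standard normal, via the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

print(round(Phi(0), 3))     # 0.5
print(round(Phi(1.96), 3))  # 0.975, the familiar 95% quantile value
```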


Slide 43

Distribution functions IV

Any distribution function F has the following two properties:

  • F(x) → 0 for x → −∞
  • F(x) → 1 for x → ∞

Slide 44

Distribution function, Binomial

[Figures: probability and distribution functions of three binomial distributions with n = 10 and p = 0.2, 0.5 and 0.8]

Probability functions to the left, distribution functions to the right.

Slide 45

Distribution function, Poisson

[Figures: probability and distribution functions of three Poisson distributions with λ = 2, 6 and 12]

Probability functions to the left, distribution functions to the right.

Slide 46

Distribution function, uniform

[Figures: density and distribution function of a uniform distribution]

Density function to the left, distribution function to the right.

Slide 47

Distribution function, normal

[Figures: density and distribution functions of three normal distributions with (µ = 0, σ = 3), (µ = −5, σ = 1) and (µ = 0, σ = 1)]

Density functions to the left, distribution functions to the right.

Slide 48

Distribution function, exponential

[Figures: density and distribution functions of three exponential distributions]

Density functions to the left, distribution functions to the right.


Slide 49

Sampling from a distribution

Assume that X1, X2, …, Xn are sampled independently from the same distribution having the known expectation µ and the known standard deviation σ. Then the mean of the sample

  • X̄ = (X1 + X2 + … + Xn) / n

has the expected value µ and the standard deviation σ/√n. In particular, if the Xi’s are N(µ, σ²), then the sample mean is N(µ, σ²/n).
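
The σ/√n behaviour of the sample mean can be demonstrated by simulation; this sketch and its parameter values are my own illustration:

```python
import random
from math import sqrt

random.seed(2)

# The mean of n draws from N(mu, sigma^2) has std deviation sigma / sqrt(n)
mu, sigma, n, reps = 10.0, 4.0, 25, 20_000
means = [sum(random.gauss(mu, sigma) for _ in range(n)) / n for _ in range(reps)]

m = sum(means) / reps
sd = sqrt(sum((x - m) ** 2 for x in means) / reps)
print(round(m, 2), round(sd, 2))  # close to mu = 10 and sigma/sqrt(n) = 0.8
```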

Slide 50

Sampling from a normal distribution

Assume that X1, X2, …, Xn are sampled independently from the same normal distribution N(µ, σ²), where µ is unknown and σ is known. For some reason we expect (hope) that µ has a certain value µ0, and we would therefore like to test the following hypothesis:

  • H0: µ = µ0

How can we do that? Well, we know that the sample mean is N(µ, σ²/n).

Slide 51

Hypothesis testing, normal dist. I

Observations close to the mean are far more likely than distant observations. From the distribution function we can calculate the probability that an observation falls within the interval µ ± σ, and the probability that an observation falls within the interval µ ± 2σ.

Rule of thumb: 2/3 of the observations fall within ±σ, and 95% within ±2σ.

[Figures: a normal distribution with µ = 0 and σ = 3; density function and distribution function]

Slide 52

Hypothesis testing, normal dist. II

We can test our hypothesis H0 for instance by calculating a confidence interval for the mean. A 95% confidence interval for the sample mean (distributed as N(µ, σ²/n)) under H0 is calculated as

  • µ0 ± 1.96 σ/√n

If the sample mean is included in the interval, we accept H0; otherwise we reject. If neither µ nor σ is known, the sample mean becomes Student’s t distributed (with n − 1 degrees of freedom) instead. Then the confidence interval becomes wider as a consequence of the uncertainty on σ. For large n the Student’s t distribution converges towards a standard normal distribution.
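
The known-σ test can be written as a small helper function; this sketch and its example numbers are my own:

```python
from math import sqrt

def accept_h0(sample_mean, mu0, sigma, n, z=1.96):
    """Accept H0: mu = mu0 if the sample mean lies in the 95% interval."""
    half_width = z * sigma / sqrt(n)
    return mu0 - half_width <= sample_mean <= mu0 + half_width

# sigma = 2, n = 16  ->  half width = 1.96 * 2 / 4 = 0.98
print(accept_h0(10.5, 10.0, 2.0, 16))  # True  (|10.5 - 10.0| <= 0.98)
print(accept_h0(11.2, 10.0, 2.0, 16))  # False (|11.2 - 10.0| >  0.98)
```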

Slide 53

Hypothesis testing, binomial

Assume that we have observed the outcome of k successes out of n in a binomial trial. We would like to test the hypothesis:

  • H0: p = p0

Under H0, the expected number of successes is E0(K) = np0 and the variance is Var0(K) = np0(1 − p0). How likely is it that the observed value of k is drawn from a binomial distribution with parameters p0 and n? Basically two approaches may be used:

  • Approximate with the normal distribution N(np0, Var0(K)). This is a reasonable approach if n is big. Remember, though, that we now have only one observation from the distribution.
  • Use the distribution function of the binomial distribution directly. The only valid approach for small n.
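
The exact (second) approach only needs the binomial distribution function; a sketch of mine, with illustrative numbers:

```python
from math import comb

def binom_cdf(k, n, p):
    """Distribution function P(K <= k) for K binomially distributed (n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

# Exact tail probability of observing k = 2 or fewer successes in
# n = 20 trials under H0: p = 0.40
p_tail = binom_cdf(2, 20, 0.40)
print(round(p_tail, 4))  # 0.0036 -- H0 would be rejected at the 5% level
```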

Slide 54

Other distributions

Used as hyper distributions for parameters of other distributions in order to represent uncertainty:

  • The Gamma distribution (hyper distribution for the mean and variance of a Poisson distribution)
  • The Beta distribution (hyper distribution for the parameter of a binomial distribution)

These will be discussed briefly under advanced topics. Distributions for statistical tests:

  • The χ2 distribution
  • The Student’s t distribution
  • The F distribution

These distributions will not be discussed very much in this course. Many other distributions are described in the literature …


Slide 55

What distribution …

… can I use to represent:

  • Litter size in sheep?
  • Litter size in sows?
  • Number of cows/sows conceiving after first service?
  • Time to first estrus?
  • Milk yield of dairy cows?
  • Daily gain of slaughter pigs?