Probability and Random Processes
Lecture 5
• Probability and random variables
• The law of large numbers

Mikael Skoglund, Probability and random processes

Why Measure Theoretic Probability?
• Stronger limit theorems
• Conditional probability/expectation
• Proper theory for continuous and mixed random variables

Probability Space
• A probability space is a measure space $(\Omega, \mathcal{A}, P)$
  • the sample space $\Omega$ is the 'universe', i.e. the set of all possible outcomes
  • the event class $\mathcal{A}$ is a $\sigma$-algebra of measurable sets called events
  • the probability measure $P$ is a measure on events in $\mathcal{A}$ with the property $P(\Omega) = 1$

Interpretation
• A random experiment generates an outcome $\omega \in \Omega$
• For each $A \in \mathcal{A}$, either $\omega \in A$ or $\omega \notin A$
• An event $A \in \mathcal{A}$ occurs if $\omega \in A$; this happens with probability $P(A)$
  • since $\mathcal{A}$ is a $\sigma$-algebra, all 'reasonable' combinations of events and sequences of events are again measurable, i.e. have probabilities

With Probability One
• An event $E \in \mathcal{A}$ occurs with probability one if $P(E) = 1$
  • also called: almost everywhere, almost certainly, almost surely, ...

Independence
• $E$ and $F$ in $\mathcal{A}$ are independent if $P(E \cap F) = P(E) P(F)$
• The events in a collection $A_1, \ldots, A_n$ are
  • pairwise independent if $A_i$ and $A_j$ are independent for $i \neq j$
  • mutually independent if for any $\{i_1, i_2, \ldots, i_k\} \subseteq \{1, 2, \ldots, n\}$
    $$P(A_{i_1} \cap A_{i_2} \cap \cdots \cap A_{i_k}) = P(A_{i_1}) P(A_{i_2}) \cdots P(A_{i_k})$$
• An infinite collection is mutually independent if any finite subset of events is mutually independent
• 'mutually' ⇒ 'pairwise' but not vice versa
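
The claim that pairwise independence does not imply mutual independence can be checked on a small example. The sketch below uses a standard counterexample that is not taken from the slides: two fair coin flips, with the hypothetical events `A` (first flip is heads), `B` (second flip is heads) and `C` (both flips agree). Probabilities are computed exactly by enumeration.

```python
from fractions import Fraction
from itertools import product

# Sample space: two fair coin flips, each outcome has probability 1/4.
omega = list(product("HT", repeat=2))

def prob(event):
    """Probability of an event (a set of outcomes) under the uniform measure."""
    return Fraction(len(event), len(omega))

A = {w for w in omega if w[0] == "H"}    # first flip is heads
B = {w for w in omega if w[1] == "H"}    # second flip is heads
C = {w for w in omega if w[0] == w[1]}   # both flips agree

# Pairwise independence: P(E ∩ F) = P(E) P(F) for each pair.
pairwise = all(prob(E & F) == prob(E) * prob(F)
               for E, F in [(A, B), (A, C), (B, C)])

# Mutual independence additionally needs the product rule for the triple.
triple = prob(A & B & C) == prob(A) * prob(B) * prob(C)

print(pairwise, triple)  # True False
```

Here $P(A \cap B \cap C) = 1/4$ while $P(A) P(B) P(C) = 1/8$, so the three events are pairwise but not mutually independent.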

Eventually and Infinitely Often
• Given a probability space $(\Omega, \mathcal{A}, P)$ and an infinite sequence of events $\{A_n\}$, define
  $$\liminf A_n = \bigcup_{n=1}^{\infty} \bigcap_{k=n}^{\infty} A_k, \qquad \limsup A_n = \bigcap_{n=1}^{\infty} \bigcup_{k=n}^{\infty} A_k$$
• $\omega \in \liminf A_n$ iff there is an $N$ such that $\omega \in A_n$ for all $n > N$, that is, the event $\liminf A_n$ occurs eventually, $\{A_n \text{ eventually}\}$
• $\omega \in \limsup A_n$ iff for any $N$ there is an $n > N$ such that $\omega \in A_n$, that is, the event $\limsup A_n$ occurs infinitely often, $\{A_n \text{ i.o.}\}$

Borel–Cantelli
• The Borel–Cantelli lemma: Given a probability space $(\Omega, \mathcal{A}, P)$ and an infinite sequence of events $\{A_n\}$,
  1 if $\sum_n P(A_n) < \infty$, then $P(\{A_n \text{ i.o.}\}) = 0$
  2 if the events $\{A_n\}$ are mutually independent and $\sum_n P(A_n) = \infty$, then $P(\{A_n \text{ i.o.}\}) = 1$
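
The mechanism behind each half of the lemma can be illustrated numerically. The sketch below is not a proof, and the probabilities $P(A_k) = 1/k^2$ (summable) and $P(A_k) = 1/k$ (divergent) are assumptions chosen only for illustration; the function names are hypothetical. In the summable case the union bound on the tail $\bigcup_{k \ge n} A_k$ shrinks with $n$. In the divergent independent case the probability that none of $A_n, \ldots, A_m$ occurs tends to 0 as $m$ grows.

```python
from fractions import Fraction

def tail_sum(n, m):
    """sum_{k=n}^{m} 1/k^2: a union bound on P(A_n ∪ ... ∪ A_m) when P(A_k) = 1/k^2."""
    return sum(1.0 / (k * k) for k in range(n, m + 1))

def prob_none_occur(n, m):
    """P(no A_k occurs for n <= k <= m), independent events with P(A_k) = 1/k."""
    p = Fraction(1)
    for k in range(n, m + 1):
        p *= 1 - Fraction(1, k)  # independence: complement probabilities multiply
    return p

# Part 1 (summable case): the tail bound shrinks as n grows.
print(tail_sum(10, 10**5), tail_sum(1000, 10**5))

# Part 2 (divergent case): the product telescopes to (n - 1) / m.
print(prob_none_occur(10, 1000))  # 9/1000
```

Since $(n-1)/m \to 0$ as $m \to \infty$ for every fixed $n$, at least one $A_k$ with $k \ge n$ occurs with probability one, which is what part 2 requires for each $n$.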

Random Variables
• Given a probability space $(\Omega, \mathcal{A}, P)$, a real-valued function $X(\omega)$ on $\Omega$ is called a random variable if it is measurable w.r.t. $(\Omega, \mathcal{A})$
• Recall: measurable ⇒ $X^{-1}(O) \in \mathcal{A}$ for any open $O \subset \mathbb{R}$ ⇐⇒ $X^{-1}(B) \in \mathcal{A}$ for any $B \in \mathcal{B}$ (the Borel sets)
• Notation:
  • the event $\{\omega : X(\omega) \in B\}$ → '$X \in B$'
  • $P(\{X \in A\} \cap \{X \in B\})$ → '$P(X \in A, X \in B)$', etc.

Distributions
• $X$ is measurable ⇒ $P(X \in B)$ is well-defined for any $B \in \mathcal{B}$
• The distribution of $X$ is the function $\mu_X(B) = P(X \in B)$, for $B \in \mathcal{B}$
  • $\mu_X$ is a probability measure on $(\mathbb{R}, \mathcal{B})$
• The probability distribution function of $X$ is the real-valued function
  $$F_X(x) = P(\{\omega : X(\omega) \le x\}) = \text{(notation)} = P(X \le x)$$
  • $F_X$ is (obviously) the distribution function of the finite measure $\mu_X$ on $(\mathbb{R}, \mathcal{B})$, i.e. $F_X(x) = \mu_X((-\infty, x])$

Independence
• Two random variables $X$ and $Y$ are pairwise independent if the events $\{X \in A\}$ and $\{Y \in B\}$ are independent for any $A$ and $B$ in $\mathcal{B}$
• A collection of random variables $X_1, \ldots, X_n$ is mutually independent if the events $\{X_i \in B_i\}$ are mutually independent for all $B_i \in \mathcal{B}$

Expectation
• For a random variable $X$ on $(\Omega, \mathcal{A}, P)$, the expectation of $X$ is defined as
  $$E[X] = \int_{\Omega} X(\omega) \, dP(\omega)$$
• For any Borel-measurable real-valued function $g$
  $$E[g(X)] = \int g(x) \, dF_X(x) = \int g(x) \, d\mu_X(x)$$
  in particular
  $$E[X] = \int x \, d\mu_X(x)$$
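
For a finite sample space both integrals reduce to sums, so the change-of-variables identity $E[g(X)] = \int g \, d\mu_X$ can be verified exactly. The sketch below assumes a hypothetical setup that is not in the slides: two fair dice, $X$ the sum of the faces, and $g(x) = x^2$.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

# Sample space: two fair dice, probability 1/36 per outcome.
omega = list(product(range(1, 7), repeat=2))
P = {w: Fraction(1, 36) for w in omega}

def X(w):
    """Random variable: the sum of the two faces."""
    return w[0] + w[1]

def g(x):
    """A Borel-measurable function of the value of X."""
    return x * x

# Distribution mu_X({x}) = P(X = x): push P forward through X.
mu_X = defaultdict(Fraction)
for w in omega:
    mu_X[X(w)] += P[w]

# E[g(X)] as a sum over Omega with respect to P ...
e_omega = sum(g(X(w)) * P[w] for w in omega)
# ... and as a sum over the values of X with respect to mu_X.
e_dist = sum(g(x) * m for x, m in mu_X.items())

print(e_omega, e_dist)  # 329/6 329/6
```

The second sum never touches $\Omega$: once $\mu_X$ is known, expectations of functions of $X$ depend only on the distribution.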

Variance
• The variance of $X$: $\mathrm{Var}(X) = E[(X - E[X])^2]$
• Chebyshev's inequality: For any $\varepsilon > 0$,
  $$P(|X - E[X]| \ge \varepsilon) \le \frac{\mathrm{Var}(X)}{\varepsilon^2}$$
• Kolmogorov's inequality: For mutually independent random variables $\{X_k\}_{k=1}^{n}$ with $\mathrm{Var}(X_k) < \infty$, set $S_j = \sum_{k=1}^{j} X_k$, $1 \le j \le n$; then for any $\varepsilon > 0$
  $$P\left( \max_{j} |S_j - E[S_j]| \ge \varepsilon \right) \le \frac{\mathrm{Var}(S_n)}{\varepsilon^2}$$
  ($n = 1$ ⇒ Chebyshev)

The Law of Large Numbers
• A sequence $\{X_n\}$ is iid if the random variables $X_n$ all have the same distribution and are mutually independent
• For any iid sequence $\{X_n\}$ with finite mean $\mu = E[X_n]$, the event
  $$\lim_{n \to \infty} \frac{1}{n} \sum_{k=1}^{n} X_k = \mu$$
  occurs with probability one
• Toward the end of the course, we will generalize this result to stationary and ergodic random processes...
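
A simulation cannot establish convergence with probability one, but it shows what the statement looks like along a single sample path. The sketch below assumes iid Uniform(0, 1) samples (so $\mu = 0.5$), an arbitrary fixed seed, and a hypothetical helper `running_means`; none of these come from the slides.

```python
import random

def running_means(samples):
    """Return the sequence of sample means (1/n) * sum_{k<=n} x_k."""
    means, total = [], 0.0
    for n, x in enumerate(samples, start=1):
        total += x
        means.append(total / n)
    return means

rng = random.Random(0)                       # fixed seed, for reproducibility only
xs = [rng.random() for _ in range(100_000)]  # iid Uniform(0, 1), so mu = 0.5
means = running_means(xs)

# Distance of the sample mean from mu at a few values of n.
for n in (10, 1_000, 100_000):
    print(n, abs(means[n - 1] - 0.5))
```

Each run with a different seed corresponds to a different outcome $\omega$; the law of large numbers says the set of outcomes for which the printed distance fails to tend to 0 has probability zero.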

• $S_n = n^{-1} \sum_{k=1}^{n} X_k \to \mu$ with probability one ⇒ $S_n \to \mu$ in probability, i.e.,
  $$\lim_{n \to \infty} P(\{|S_n - \mu| \ge \varepsilon\}) = 0$$
  for each $\varepsilon > 0$
• in general 'in probability' does not imply 'with probability one' (convergence in measure does not imply convergence a.e.)

The Law of Large Numbers: Proof
• Lemma 1: For a nonnegative random variable $X$
  $$\sum_{n=1}^{\infty} P(X \ge n) \le E[X] \le \sum_{n=0}^{\infty} P(X \ge n)$$
• Lemma 2: For mutually independent random variables $\{X_n\}$ with $\sum_n \mathrm{Var}(X_n) < \infty$ it holds that $\sum_n (X_n - E[X_n])$ converges with probability one
• Lemma 3 (Kronecker's Lemma): Given a sequence $\{a_n\}$ with $0 \le a_1 \le a_2 \le \cdots$ and $\lim a_n = \infty$, and another sequence $\{x_k\}$ such that $\lim_n \sum_{k=1}^{n} x_k$ exists, then
  $$\lim_{n \to \infty} \frac{1}{a_n} \sum_{k=1}^{n} a_k x_k = 0$$
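
Lemma 1 can be checked exactly on a small discrete distribution. The sketch below assumes a hypothetical random variable taking the four values 1/2, 6/5, 27/10 and 4 with probability 1/4 each; the values are chosen to be non-integer so that both inequalities are strict.

```python
from fractions import Fraction

# Hypothetical distribution: four equally likely nonnegative values.
values = [Fraction(1, 2), Fraction(6, 5), Fraction(27, 10), Fraction(4)]
prob = Fraction(1, 4)

def tail(n):
    """P(X >= n)."""
    return sum(prob for v in values if v >= n)

expectation = sum(v * prob for v in values)

# P(X >= n) = 0 for every n >= n_max, so the infinite sums are finite sums.
n_max = int(max(values)) + 1
lower = sum(tail(n) for n in range(1, n_max + 1))
upper = sum(tail(n) for n in range(0, n_max + 1))

print(lower, expectation, upper)  # 7/4 21/10 11/4
```

The two bounds differ by exactly $P(X \ge 0) = 1$, so Lemma 1 pins $E[X]$ down to within 1 using tail probabilities at the integers only. This is how it is used in the proof: $E[|X_1|] < \infty$ forces $\sum_n P(|X_1| \ge n) < \infty$.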

• Assume without loss of generality (why?) that $\mu = 0$
• Lemma 1 ⇒
  $$\sum_{n=1}^{\infty} P(|X_n| \ge n) = \sum_{n=1}^{\infty} P(|X_1| \ge n) \le E[|X_1|] < \infty$$
• Let $E = \{|X_k| \ge k \text{ i.o.}\}$; Borel–Cantelli ⇒ $P(E) = 0$ ⇒ we can concentrate on $\omega \in E^c$
• Let $Y_n = X_n \chi_{\{|X_n| < n\}}$; if $\omega \in E^c$ then there is an $N$ such that $Y_n(\omega) = X_n(\omega)$ for $n \ge N$, thus for $\omega \in E^c$
  $$\lim_{n \to \infty} \frac{1}{n} \sum_{k=1}^{n} X_k = 0 \iff \lim_{n \to \infty} \frac{1}{n} \sum_{k=1}^{n} Y_k = 0$$
• Note that $E[Y_n] \to \mu = 0$ as $n \to \infty$

• Letting $Z_n = n^{-1} Y_n$, it can be shown that $\sum_{n=1}^{\infty} \mathrm{Var}(Z_n) < \infty$ (requires some work). Hence, according to Lemma 2 the limit
  $$Z = \lim_{n \to \infty} \sum_{k=1}^{n} (Z_k - E[Z_k])$$
  exists with probability one.
• Furthermore, by Lemma 3
  $$\frac{1}{n} \sum_{k=1}^{n} (Y_k - E[Y_k]) = \frac{1}{n} \sum_{k=1}^{n} k (Z_k - E[Z_k]) \to 0$$
  where also
  $$\frac{1}{n} \sum_{k=1}^{n} E[Y_k] \to 0$$
  since $E[Y_k] \to E[X_k] = E[X_1] = 0$

Proof of Lemma 2
• Assume w.o. loss of generality that $E[X_n] = 0$, set $S_n = \sum_{k=1}^{n} X_k$
• For $E_n \in \mathcal{A}$ with $E_1 \subset E_2 \subset \cdots$ it holds that
  $$P\left( \bigcup_n E_n \right) = \lim_{n \to \infty} P(E_n)$$
  Therefore, for any $m \ge 0$
  $$P\left( \bigcup_{k=1}^{\infty} \{|S_{m+k} - S_m| \ge \varepsilon\} \right) = \lim_{n \to \infty} P\left( \bigcup_{k=1}^{n} \{|S_{m+k} - S_m| \ge \varepsilon\} \right) = \lim_{n \to \infty} P\left( \max_{1 \le k \le n} |S_{m+k} - S_m| \ge \varepsilon \right)$$

• Let $Y_k = X_{m+k}$ and
  $$T_k = \sum_{j=1}^{k} Y_j = S_{m+k} - S_m,$$
  then Kolmogorov's inequality implies
  $$P\left( \max_{1 \le k \le n} |T_k - E[T_k]| \ge \varepsilon \right) = P\left( \max_{1 \le k \le n} |S_{m+k} - S_m| \ge \varepsilon \right) \le \frac{\mathrm{Var}(S_{m+n} - S_m)}{\varepsilon^2} = \frac{1}{\varepsilon^2} \sum_{k=m+1}^{m+n} \mathrm{Var}(X_k)$$
• Hence
  $$P\left( \bigcup_{k=1}^{\infty} \{|S_{m+k} - S_m| \ge \varepsilon\} \right) \le \frac{1}{\varepsilon^2} \sum_{k=m+1}^{\infty} \mathrm{Var}(X_k)$$
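
• Since $\sum_n \mathrm{Var}(X_n) < \infty$, we get for any $\varepsilon > 0$
  $$\lim_{m \to \infty} P\left( \bigcup_{k=1}^{\infty} \{|S_{m+k} - S_m| \ge \varepsilon\} \right) = 0$$
• Now, let $E = \{\omega : \{S_n(\omega)\} \text{ does not converge}\}$. Then $\omega \in E$ iff $\{S_n(\omega)\}$ is not a Cauchy sequence ⇒ there is an $r$ such that for any $n$ there is a $k$ with $|S_{n+k} - S_n| \ge r^{-1}$. Hence, equivalently,
  $$E = \bigcup_{r=1}^{\infty} \bigcap_{n} \bigcup_{k} \left\{ |S_{n+k} - S_n| \ge \frac{1}{r} \right\}$$
• For $F_1 \supset F_2 \supset F_3 \supset \cdots$, $P(\cap_k F_k) = \lim P(F_k)$, hence for any $r > 0$
  $$P\left( \bigcap_{n=1}^{\infty} \bigcup_{k} \left\{ |S_{n+k} - S_n| \ge \frac{1}{r} \right\} \right) = P\left( \bigcap_{n=1}^{\infty} \bigcap_{\ell=1}^{n} \bigcup_{k} \left\{ |S_{\ell+k} - S_\ell| \ge \frac{1}{r} \right\} \right)$$
  $$= \lim_{n \to \infty} P\left( \bigcap_{\ell=1}^{n} \bigcup_{k} \left\{ |S_{\ell+k} - S_\ell| \ge \frac{1}{r} \right\} \right) \le \lim_{n \to \infty} P\left( \bigcup_{k} \left\{ |S_{n+k} - S_n| \ge \frac{1}{r} \right\} \right)$$
• By the first bullet with $\varepsilon = 1/r$, the last limit is 0 for every $r$; since $E$ is a countable union over $r$ of such sets, $P(E) = 0$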
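
Lemma 2 can be illustrated with a random series whose variances are summable. The sketch below assumes $X_n = \pm 1/n$ with independent fair signs, so that $E[X_n] = 0$ and $\mathrm{Var}(X_n) = 1/n^2$; the seed, the truncation at `N` terms and the helper name `spread_after` are all hypothetical. It measures how far the partial sums move after index $m$, which is the quantity $\max_k |S_{m+k} - S_m|$ controlled by Kolmogorov's inequality in the proof.

```python
import random

rng = random.Random(1)  # fixed seed, for reproducibility only
N = 100_000

# X_n = (+1 or -1) / n with equal probability: E[X_n] = 0, Var(X_n) = 1/n^2.
partial_sums, s = [], 0.0
for n in range(1, N + 1):
    s += rng.choice((-1.0, 1.0)) / n
    partial_sums.append(s)

def spread_after(m):
    """max over k of |S_{m+k} - S_m|, restricted to the simulated range m + k <= N."""
    s_m = partial_sums[m - 1]
    return max(abs(t - s_m) for t in partial_sums[m:])

# Cauchy-type behaviour along this sample path: the spread shrinks as m grows.
for m in (100, 10_000):
    print(m, spread_after(m))
```

A finite simulation only shows one path up to `N` terms; the lemma is the statement that the spread tends to 0 as $m \to \infty$ for almost every path.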