Chebyshev’s inequality

Define $Y := (X - \mathrm{E}(X))^2$. By Markov’s inequality,
$$\mathrm{P}\big(|X - \mathrm{E}(X)| \geq a\big) = \mathrm{P}\big(Y \geq a^2\big) \leq \frac{\mathrm{E}(Y)}{a^2} = \frac{\mathrm{Var}(X)}{a^2}.$$
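A quick numerical check of the bound (not from the slides): a minimal NumPy sketch comparing the empirical tail probability of an exponential variable, chosen only for illustration, against $\mathrm{Var}(X)/a^2$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Exponential variable with rate 1: E(X) = 1, Var(X) = 1.
x = rng.exponential(scale=1.0, size=100_000)
a = 2.0

empirical = np.mean(np.abs(x - x.mean()) >= a)  # P(|X - E(X)| >= a)
chebyshev = x.var() / a**2                      # Var(X) / a^2

print(empirical, chebyshev)  # the empirical probability stays below the bound
```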
Age of students at NYU

Mean: 20 years, standard deviation: 3 years. How many are younger than 30?
$$\mathrm{P}(A \geq 30) \leq \mathrm{P}\big(|A - 20| \geq 10\big) \leq \frac{\mathrm{Var}(A)}{10^2} = \frac{9}{100}.$$
At least 91% are younger than 30.
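The same bound as a short computation (variable names are just illustrative):

```python
# Chebyshev bound for the student-age example: mean 20, std 3, threshold 30.
var_a = 3**2          # Var(A)
a = 30 - 20           # deviation from the mean
bound = var_a / a**2  # P(A >= 30) <= P(|A - 20| >= 10) <= 9/100

print(f"P(A >= 30) <= {bound}")                               # 0.09
print(f"at least {1 - bound:.0%} are younger than 30")        # 91%
```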
Outline

Expectation operator
Mean and variance
Covariance
Conditional expectation
Covariance

The covariance of $X$ and $Y$ is
$$\begin{aligned}
\mathrm{Cov}(X, Y) &:= \mathrm{E}\big((X - \mathrm{E}(X))(Y - \mathrm{E}(Y))\big) \\
&= \mathrm{E}\big(XY - Y\,\mathrm{E}(X) - X\,\mathrm{E}(Y) + \mathrm{E}(X)\,\mathrm{E}(Y)\big) \\
&= \mathrm{E}(XY) - \mathrm{E}(X)\,\mathrm{E}(Y).
\end{aligned}$$
If $\mathrm{Cov}(X, Y) = 0$, $X$ and $Y$ are uncorrelated.
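A sketch checking that both expressions for the covariance agree on simulated data (the particular correlated pair below is an arbitrary choice):

```python
import numpy as np

rng = np.random.default_rng(0)

# Correlated samples: Y = X + noise, so Cov(X, Y) = Var(X) = 1.
x = rng.normal(size=100_000)
y = x + 0.5 * rng.normal(size=100_000)

centered = np.mean((x - x.mean()) * (y - y.mean()))  # E((X-E(X))(Y-E(Y)))
shortcut = np.mean(x * y) - x.mean() * y.mean()      # E(XY) - E(X)E(Y)
print(centered, shortcut)                            # both close to 1
```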
[Figure: scatter plots of samples of $X$ and $Y$ with $\mathrm{Cov}(X, Y) = 0.5$, $0.9$, $0.99$ (top row) and $0$, $-0.9$, $-0.99$ (bottom row).]
Variance of the sum

$$\begin{aligned}
\mathrm{Var}(X + Y) &= \mathrm{E}\big((X + Y - \mathrm{E}(X + Y))^2\big) \\
&= \mathrm{E}\big((X - \mathrm{E}(X))^2\big) + \mathrm{E}\big((Y - \mathrm{E}(Y))^2\big) + 2\,\mathrm{E}\big((X - \mathrm{E}(X))(Y - \mathrm{E}(Y))\big) \\
&= \mathrm{Var}(X) + \mathrm{Var}(Y) + 2\,\mathrm{Cov}(X, Y).
\end{aligned}$$

If $X$ and $Y$ are uncorrelated, then $\mathrm{Var}(X + Y) = \mathrm{Var}(X) + \mathrm{Var}(Y)$.
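A simulation check of the identity, with a correlated pair chosen arbitrarily for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=100_000)
y = x + rng.normal(size=100_000)  # correlated with x: Cov(X, Y) = 1

lhs = np.var(x + y)
rhs = np.var(x) + np.var(y) + 2 * np.cov(x, y, bias=True)[0, 1]
print(lhs, rhs)                   # both close to 1 + 2 + 2*1 = 5
```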
Independence implies uncorrelation

If $X$ and $Y$ are independent, then $\mathrm{E}(XY) = \mathrm{E}(X)\,\mathrm{E}(Y)$, so
$$\mathrm{Cov}(X, Y) = \mathrm{E}(XY) - \mathrm{E}(X)\,\mathrm{E}(Y) = \mathrm{E}(X)\,\mathrm{E}(Y) - \mathrm{E}(X)\,\mathrm{E}(Y) = 0.$$
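A quick sanity check on independent draws (the two distributions are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=100_000)       # independent of y
y = rng.exponential(size=100_000)

print(np.mean(x * y) - x.mean() * y.mean())  # sample covariance, close to 0
```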
Uncorrelation does not imply independence

$X$, $Y$ are independent Bernoulli random variables with parameter $\frac{1}{2}$. Let $U = X + Y$ and $V = X - Y$. Are $U$ and $V$ independent? Are they uncorrelated?
$$\begin{aligned}
p_U(0) &= \mathrm{P}(X = 0, Y = 0) = \tfrac{1}{4}, \\
p_V(0) &= \mathrm{P}(X = 1, Y = 1) + \mathrm{P}(X = 0, Y = 0) = \tfrac{1}{2}, \\
p_{U,V}(0, 0) &= \mathrm{P}(X = 0, Y = 0) = \tfrac{1}{4} \neq p_U(0)\,p_V(0) = \tfrac{1}{8},
\end{aligned}$$
so $U$ and $V$ are not independent.
However,
$$\begin{aligned}
\mathrm{Cov}(U, V) &= \mathrm{E}(UV) - \mathrm{E}(U)\,\mathrm{E}(V) \\
&= \mathrm{E}\big((X + Y)(X - Y)\big) - \mathrm{E}(X + Y)\,\mathrm{E}(X - Y) \\
&= \mathrm{E}(X^2) - \mathrm{E}(Y^2) - \mathrm{E}^2(X) + \mathrm{E}^2(Y) \\
&= \mathrm{E}(X^2) - \mathrm{E}(Y^2) \qquad \text{(since } \mathrm{E}(X) = \mathrm{E}(Y)\text{)} \\
&= 0,
\end{aligned}$$
since $X$ and $Y$ have the same distribution. So $U$ and $V$ are uncorrelated but not independent.
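Simulating the example confirms both conclusions; a minimal NumPy sketch:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.integers(0, 2, size=100_000)  # Bernoulli(1/2)
y = rng.integers(0, 2, size=100_000)
u, v = x + y, x - y

# Uncorrelated: sample covariance close to 0.
print(np.mean(u * v) - u.mean() * v.mean())

# Not independent: p_{U,V}(0,0) = 1/4 but p_U(0) * p_V(0) = 1/8.
print(np.mean((u == 0) & (v == 0)))
print(np.mean(u == 0) * np.mean(v == 0))
```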
Correlation coefficient

The Pearson correlation coefficient of $X$ and $Y$ is
$$\rho_{X,Y} := \frac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y}.$$
It is the covariance between $X/\sigma_X$ and $Y/\sigma_Y$.
With $\sigma_X = 1$:

$\sigma_Y = 1$, $\mathrm{Cov}(X, Y) = 0.9$: $\rho_{X,Y} = 0.9$
$\sigma_Y = 3$, $\mathrm{Cov}(X, Y) = 0.9$: $\rho_{X,Y} = 0.3$
$\sigma_Y = 3$, $\mathrm{Cov}(X, Y) = 2.7$: $\rho_{X,Y} = 0.9$
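A sketch illustrating these numbers: rescaling $Y$ changes the covariance but not $\rho$ (the construction of the correlated pair is an assumption for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=100_000)
y = 0.9 * x + np.sqrt(1 - 0.9**2) * rng.normal(size=100_000)  # rho = 0.9

for scale in (1.0, 3.0):
    cov = np.cov(x, scale * y, bias=True)[0, 1]
    rho = cov / (x.std() * (scale * y).std())
    print(scale, cov, rho)  # cov scales with sigma_Y; rho stays ~0.9
```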
Cauchy-Schwarz inequality

For any $X$ and $Y$,
$$|\mathrm{E}(XY)| \leq \sqrt{\mathrm{E}(X^2)\,\mathrm{E}(Y^2)},$$
and
$$\mathrm{E}(XY) = \sqrt{\mathrm{E}(X^2)\,\mathrm{E}(Y^2)} \iff Y = \sqrt{\frac{\mathrm{E}(Y^2)}{\mathrm{E}(X^2)}}\,X,$$
$$\mathrm{E}(XY) = -\sqrt{\mathrm{E}(X^2)\,\mathrm{E}(Y^2)} \iff Y = -\sqrt{\frac{\mathrm{E}(Y^2)}{\mathrm{E}(X^2)}}\,X.$$
Applying this to $X - \mathrm{E}(X)$ and $Y - \mathrm{E}(Y)$, we have
$$\mathrm{Cov}(X, Y) \leq \sigma_X \sigma_Y,$$
and equivalently $|\rho_{X,Y}| \leq 1$. In addition,
$$|\rho_{X,Y}| = 1 \iff Y = cX + d,$$
where
$$c := \begin{cases} \dfrac{\sigma_Y}{\sigma_X} & \text{if } \rho_{X,Y} = 1, \\[2ex] -\dfrac{\sigma_Y}{\sigma_X} & \text{if } \rho_{X,Y} = -1, \end{cases} \qquad d := \mathrm{E}(Y) - c\,\mathrm{E}(X).$$
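A numeric illustration of the equality case, with the constants $c = -2$ and $d = 5$ chosen arbitrarily:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=100_000)
y = -2.0 * x + 5.0                   # Y = cX + d with c < 0

cov = np.cov(x, y, bias=True)[0, 1]
rho = cov / (x.std() * y.std())
print(rho)                           # -1: perfect linear relationship
print(cov <= x.std() * y.std())      # Cauchy-Schwarz bound holds: True
```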
Covariance matrix of a random vector

The covariance matrix of $\vec{X}$ is defined as
$$\Sigma_{\vec{X}} = \begin{pmatrix}
\mathrm{Var}(X_1) & \mathrm{Cov}(X_1, X_2) & \cdots & \mathrm{Cov}(X_1, X_n) \\
\mathrm{Cov}(X_2, X_1) & \mathrm{Var}(X_2) & \cdots & \mathrm{Cov}(X_2, X_n) \\
\vdots & \vdots & \ddots & \vdots \\
\mathrm{Cov}(X_n, X_1) & \mathrm{Cov}(X_n, X_2) & \cdots & \mathrm{Var}(X_n)
\end{pmatrix}
= \mathrm{E}\big(\vec{X}\vec{X}^T\big) - \mathrm{E}\big(\vec{X}\big)\,\mathrm{E}\big(\vec{X}\big)^T.$$
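A sketch estimating the covariance matrix from samples, using the $\mathrm{E}(\vec{X}\vec{X}^T) - \mathrm{E}(\vec{X})\mathrm{E}(\vec{X})^T$ form (the correlated third coordinate is an arbitrary choice):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100_000, 3))  # rows are i.i.d. samples of a 3-vector
X[:, 2] = X[:, 0] + X[:, 1]        # introduce correlation

# Sample version of E(X X^T) - E(X) E(X)^T.
mu = X.mean(axis=0)
sigma = X.T @ X / len(X) - np.outer(mu, mu)
print(sigma)
print(np.cov(X.T, bias=True))      # numpy's estimate matches
```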
Covariance matrix after a linear transformation

$$\begin{aligned}
\Sigma_{A\vec{X} + \vec{b}} &= \mathrm{E}\Big(\big(A\vec{X} + \vec{b}\big)\big(A\vec{X} + \vec{b}\big)^T\Big) - \mathrm{E}\big(A\vec{X} + \vec{b}\big)\,\mathrm{E}\big(A\vec{X} + \vec{b}\big)^T \\
&= A\,\mathrm{E}\big(\vec{X}\vec{X}^T\big)A^T + A\,\mathrm{E}\big(\vec{X}\big)\vec{b}^T + \vec{b}\,\mathrm{E}\big(\vec{X}\big)^T A^T + \vec{b}\vec{b}^T \\
&\quad - A\,\mathrm{E}\big(\vec{X}\big)\,\mathrm{E}\big(\vec{X}\big)^T A^T - A\,\mathrm{E}\big(\vec{X}\big)\vec{b}^T - \vec{b}\,\mathrm{E}\big(\vec{X}\big)^T A^T - \vec{b}\vec{b}^T \\
&= A\Big(\mathrm{E}\big(\vec{X}\vec{X}^T\big) - \mathrm{E}\big(\vec{X}\big)\,\mathrm{E}\big(\vec{X}\big)^T\Big)A^T \\
&= A\,\Sigma_{\vec{X}}\,A^T.
\end{aligned}$$
Variance in a fixed direction

For any unit vector $\vec{u}$,
$$\mathrm{Var}\big(\vec{u}^T\vec{X}\big) = \vec{u}^T\,\Sigma_{\vec{X}}\,\vec{u}.$$
Direction of maximum variance

To find the direction of maximum variance we must solve
$$\arg\max_{\|\vec{u}\|_2 = 1} \vec{u}^T\,\Sigma_{\vec{X}}\,\vec{u}.$$
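A sketch verifying $\mathrm{Var}(\vec{u}^T\vec{X}) = \vec{u}^T\Sigma_{\vec{X}}\vec{u}$ on simulated data, with an arbitrary covariance matrix and direction:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.multivariate_normal([0, 0], [[3.0, 1.0], [1.0, 1.0]], size=100_000)

u = np.array([1.0, 1.0]) / np.sqrt(2)  # unit vector
proj = X @ u                           # samples of u^T X

sigma = np.cov(X.T, bias=True)
print(proj.var(), u @ sigma @ u)       # both close to 3
```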
Linear algebra

Symmetric matrices have orthogonal eigenvectors:
$$\Sigma_{\vec{X}} = U \Lambda U^T = \begin{pmatrix} \vec{u}_1 & \vec{u}_2 & \cdots & \vec{u}_n \end{pmatrix} \begin{pmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{pmatrix} \begin{pmatrix} \vec{u}_1 & \vec{u}_2 & \cdots & \vec{u}_n \end{pmatrix}^T.$$
For a symmetric matrix $A$ with eigenvalues $\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_n$,
$$\lambda_1 = \max_{\|\vec{u}\|_2 = 1} \vec{u}^T A \vec{u}, \qquad \vec{u}_1 = \arg\max_{\|\vec{u}\|_2 = 1} \vec{u}^T A \vec{u},$$
$$\lambda_k = \max_{\|\vec{u}\|_2 = 1,\ \vec{u} \perp \vec{u}_1, \ldots, \vec{u}_{k-1}} \vec{u}^T A \vec{u}, \qquad \vec{u}_k = \arg\max_{\|\vec{u}\|_2 = 1,\ \vec{u} \perp \vec{u}_1, \ldots, \vec{u}_{k-1}} \vec{u}^T A \vec{u}.$$
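In practice this maximization is solved by an eigendecomposition; a sketch using NumPy, with an arbitrary example matrix:

```python
import numpy as np

# Arbitrary symmetric covariance matrix (assumption for illustration).
sigma = np.array([[3.0, 1.0],
                  [1.0, 1.0]])

# eigh returns the eigenvalues of a symmetric matrix in ascending order.
eigvals, eigvecs = np.linalg.eigh(sigma)
lambda_1, u_1 = eigvals[-1], eigvecs[:, -1]

print(lambda_1, u_1)     # largest eigenvalue, direction of maximum variance
print(u_1 @ sigma @ u_1)  # the quadratic form attains lambda_1 at u_1
```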
[Figure: directions of maximum variance for three datasets, with $\sqrt{\lambda_1} = 1.22$, $\sqrt{\lambda_2} = 1$; $\sqrt{\lambda_1} = 1.38$, $\sqrt{\lambda_2} = 0.71$; $\sqrt{\lambda_1} = 1$, $\sqrt{\lambda_2} = 0.32$.]
Coloring

Goal: Transform uncorrelated samples with unit variance so that they have a prescribed covariance matrix $\Sigma$.

1. Compute the eigendecomposition $\Sigma = U \Lambda U^T$.
2. Set $\vec{y} := U \sqrt{\Lambda}\,\vec{x}$, where
$$\sqrt{\Lambda} := \begin{pmatrix} \sqrt{\lambda_1} & 0 & \cdots & 0 \\ 0 & \sqrt{\lambda_2} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sqrt{\lambda_n} \end{pmatrix}.$$
This works because
$$\begin{aligned}
\Sigma_{\vec{Y}} &= U \sqrt{\Lambda}\,\Sigma_{\vec{X}}\,\sqrt{\Lambda}^T U^T \\
&= U \sqrt{\Lambda}\, I \,\sqrt{\Lambda}^T U^T \\
&= U \Lambda U^T = \Sigma.
\end{aligned}$$
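A minimal sketch of the coloring transform in NumPy; the prescribed covariance matrix below is an arbitrary example:

```python
import numpy as np

rng = np.random.default_rng(0)

# Prescribed covariance matrix (arbitrary example).
sigma = np.array([[3.0, 1.0],
                  [1.0, 1.0]])

# Uncorrelated samples with unit variance, one sample per row.
x = rng.normal(size=(100_000, 2))

# Steps 1-2: eigendecomposition, then y = U sqrt(Lambda) x.
eigvals, U = np.linalg.eigh(sigma)
coloring = U @ np.diag(np.sqrt(eigvals))
y = x @ coloring.T

print(np.cov(y.T, bias=True))  # close to the prescribed sigma
```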