SPRING 2020 CE 311S: PROBABILITY AND STATISTICS
Week 8 – Class 2, 03/11/2020
PRIYADARSHAN PATIL, Teaching Assistant, The University of Texas at Austin
Administrative stuff
⚫ Online assignment 4 is due tomorrow
⚫ Spring break
Agenda
⚫ Jointly distributed random variables
⚫ Multiple discrete random variables
⚫ Multiple continuous random variables
⚫ Covariance and correlation
Learning goals
By the end of this class, you should be able to:
⚫ Understand a joint PMF (PDF) and CDF
⚫ Calculate marginal PMFs (PDFs)
⚫ Calculate expected values for the RVs and for functions of the RVs
⚫ Compute covariance and the correlation coefficient
Introduction to joint random variables
⚫ Random variables are often linked with each other
⚫ Examples: years in college and credits completed, years of work experience and salary, auto and renters insurance
⚫ We are interested in understanding how random variables behave when studied together
Example: Insurance
• Some customers purchase both auto and homeowner's insurance from the same company.
• Let X and Y represent the deductibles of the auto and homeowner's policies for a randomly selected customer. X and Y follow the joint PMF shown in the table:

              Y
          0     50    150
      0   0.25  0.06  0.15
  X 100   0.07  0.15  0.04
    200   0.14  0.05  0.09
Joint random variables
⚫ In general, the PMF p_XY(x, y) is the probability that X = x and Y = y
⚫ For a valid PMF, p_XY(x, y) ≥ 0 for all (x, y) and Σ_x Σ_y p_XY(x, y) = 1
⚫ The marginal PMF of X gives us the distribution of X when we aren't concerned with Y:
    p_X(x) = Σ_{y ∈ S_Y} p_XY(x, y)
Things to note:
• The sum of all entries equals 1
• Each value is non-negative
• The sum of all values in the first row is P(X = 0) when not considering Y; the same applies to all rows and columns
• The joint CDF is written as F_XY(x, y) = P(X ≤ x ∩ Y ≤ y)

              Y
          0     50    150   Sum
      0   0.25  0.06  0.15  0.46
  X 100   0.07  0.15  0.04  0.26
    200   0.14  0.05  0.09  0.28
   Sum    0.46  0.26  0.28  1
Example: Insurance
• Calculate the marginal PMFs of X and Y:
• p_X(0) = 0.46, p_X(100) = 0.26, p_X(200) = 0.28
• p_Y(0) = 0.46, p_Y(50) = 0.26, p_Y(150) = 0.28

              Y
          0     50    150   Sum
      0   0.25  0.06  0.15  0.46
  X 100   0.07  0.15  0.04  0.26
    200   0.14  0.05  0.09  0.28
   Sum    0.46  0.26  0.28  1
Independence
• Two RVs are independent if p_XY(x, y) = p_X(x) p_Y(y) for all x, y
• Are X and Y independent?

              Y
          0     50    150   Sum
      0   0.25  0.06  0.15  0.46
  X 100   0.07  0.15  0.04  0.26
    200   0.14  0.05  0.09  0.28
   Sum    0.46  0.26  0.28  1
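The independence check on this slide can be sketched numerically: compare the joint PMF table to the outer product of the marginals. (This code is an illustration, not part of the slides; the variable names are my own.)

```python
import numpy as np

x_vals = [0, 100, 200]       # auto deductibles (rows)
y_vals = [0, 50, 150]        # homeowner deductibles (columns)
p = np.array([[0.25, 0.06, 0.15],
              [0.07, 0.15, 0.04],
              [0.14, 0.05, 0.09]])

p_x = p.sum(axis=1)          # marginal PMF of X (row sums)
p_y = p.sum(axis=0)          # marginal PMF of Y (column sums)

# X and Y are independent only if p(x, y) = p_X(x) * p_Y(y) everywhere
independent = np.allclose(p, np.outer(p_x, p_y))
print(independent)  # False: e.g. p(0, 0) = 0.25 but p_X(0) * p_Y(0) = 0.46 * 0.46 ≈ 0.2116
```

A single mismatched cell is enough to rule out independence, which is why checking every entry at once with `np.allclose` is convenient.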
Expected value
• The expected value of any function h(X, Y) is Σ_x Σ_y h(x, y) p_XY(x, y)
• What is the expected value of the total deductible (X + Y)?

              Y
          0     50    150   Sum
      0   0.25  0.06  0.15  0.46
  X 100   0.07  0.15  0.04  0.26
    200   0.14  0.05  0.09  0.28
   Sum    0.46  0.26  0.28  1
Expected value
• Create two tables, one with the p_XY(x, y) values and one with the h(x, y) values
• Take the product of corresponding values and add
• E[h(X, Y)] = 0 · 0.25 + 50 · 0.06 + …
• E[h(X, Y)] = 137

  p_XY(x, y):                h(x, y) = x + y:
              Y                          Y
          0     50    150            0    50   150
      0   0.25  0.06  0.15       0   0    50   150
  X 100   0.07  0.15  0.04   X 100  100  150   250
    200   0.14  0.05  0.09     200  200  250   350
Expected value
• Two ways to calculate E[X] and E[Y]:
• Take the product of corresponding values in the p_XY(x, y) table and the h(x, y) = x (or y) table and add
• Solve using the marginal PMFs
• E[X] = 82 and E[Y] = 55

  p_XY(x, y):                h(x, y) = x:
              Y                          Y
          0     50    150            0    50   150
      0   0.25  0.06  0.15       0   0    0    0
  X 100   0.07  0.15  0.04   X 100  100  100   100
    200   0.14  0.05  0.09     200  200  200   200
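The two-table procedure above can be sketched in a few lines: evaluate h(x, y) on the grid of outcomes, multiply elementwise by the joint PMF, and sum. (An illustration with my own variable names, not part of the slides.)

```python
import numpy as np

x_vals = np.array([0, 100, 200])
y_vals = np.array([0, 50, 150])
p = np.array([[0.25, 0.06, 0.15],
              [0.07, 0.15, 0.04],
              [0.14, 0.05, 0.09]])

# h(x, y) = x + y evaluated on the grid of outcomes (broadcasting)
h = x_vals[:, None] + y_vals[None, :]
E_h = (h * p).sum()              # E[h(X, Y)] = sum of h(x, y) * p(x, y)

# E[X] and E[Y] via the marginal PMFs
E_x = x_vals @ p.sum(axis=1)
E_y = y_vals @ p.sum(axis=0)
print(round(E_h, 6), round(E_x, 6), round(E_y, 6))  # 137.0 82.0 55.0
```

Note that E[X + Y] = E[X] + E[Y] = 82 + 55 = 137 here, which holds for any pair of RVs regardless of independence.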
Joint continuous random variables
⚫ All the concepts we studied apply to continuous distributions
⚫ Similar changes as applied to single random variables
⚫ Mass changes to density, summation to integration, etc.
Joint continuous random variables
⚫ The joint density function f_XY(x, y) is valid if f_XY(x, y) ≥ 0 for all x, y and if ∫_{−∞}^{∞} ∫_{−∞}^{∞} f_XY(x, y) dy dx = 1
⚫ The marginal density functions are:
    f_X(x) = ∫_{−∞}^{∞} f_XY(x, y) dy and f_Y(y) = ∫_{−∞}^{∞} f_XY(x, y) dx
Joint continuous random variables
⚫ X and Y are independent if f_XY(x, y) = f_X(x) f_Y(y) for all x, y
⚫ E[h(X, Y)] = ∫_{−∞}^{∞} ∫_{−∞}^{∞} h(x, y) f_XY(x, y) dy dx
Example: Column lifetime
⚫ A test column you built for your materials class can fail either by the rebars rusting or by the concrete flaking off.
⚫ Let X be the years before the rebars rust to failure, and Y be the years before the concrete flakes off.
⚫ The joint pdf is: f_XY(x, y) = c e^{−x} e^{−2y} for x ≥ 0, y ≥ 0
Example: Column lifetime
⚫ f_XY(x, y) = 2 e^{−x} e^{−2y} for x ≥ 0, y ≥ 0
⚫ What is the marginal distribution of X? This is the pdf for years until rebar rusting
⚫ What is the marginal distribution of Y? This is the pdf for years until concrete flaking
Example: Column lifetime
⚫ f_XY(x, y) = 2 e^{−x} e^{−2y} for x ≥ 0, y ≥ 0
⚫ X and Y are independent if f_XY(x, y) = f_X(x) f_Y(y) for all x, y
⚫ Are X and Y independent?
Example: Column lifetime
⚫ f_XY(x, y) = 2 e^{−x} e^{−2y} for x ≥ 0, y ≥ 0
⚫ What is the expected time until the rebars rust to failure?
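The three questions posed for this example (marginals, independence, E[X]) can be checked symbolically. This is a sketch using sympy; the slides presumably work the integrals by hand.

```python
import sympy as sp

x, y = sp.symbols('x y', nonnegative=True)
f_xy = 2 * sp.exp(-x) * sp.exp(-2*y)       # joint pdf for x >= 0, y >= 0

f_x = sp.integrate(f_xy, (y, 0, sp.oo))    # marginal of X: exp(-x), Exponential(1)
f_y = sp.integrate(f_xy, (x, 0, sp.oo))    # marginal of Y: 2*exp(-2*y), Exponential(2)

# Independence: the joint pdf factors into the product of the marginals
print(sp.simplify(f_xy - f_x * f_y) == 0)  # True

# Expected years until rebar rusting: E[X] = integral of x * f_X(x)
print(sp.integrate(x * f_x, (x, 0, sp.oo)))  # 1
```

The factored form makes independence immediate, and E[X] = 1 year matches the mean of an Exponential(1) distribution.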
Covariance and correlation
⚫ When two RVs are not independent, we require a measure of how dependent they are.
⚫ The covariance of RVs X and Y is defined as
    Cov(X, Y) = E[(X − E[X])(Y − E[Y])]
⚫ Equivalently,
    Cov(X, Y) = E[XY] − E[X] E[Y]
Covariance
⚫ Recall, E[X] = 82 and E[Y] = 55
⚫ E[XY] = 4550
⚫ Cov(X, Y) = 4550 − 82 · 55 = 40

  p_XY(x, y):                h(x, y) = xy:
              Y                          Y
          0     50    150            0    50      150
      0   0.25  0.06  0.15       0   0    0       0
  X 100   0.07  0.15  0.04   X 100   0    5000    15000
    200   0.14  0.05  0.09     200   0    10000   30000
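The same table-multiplication idea gives E[XY], and then the shortcut formula gives the covariance. (An illustration, not part of the slides.)

```python
import numpy as np

x_vals = np.array([0, 100, 200])
y_vals = np.array([0, 50, 150])
p = np.array([[0.25, 0.06, 0.15],
              [0.07, 0.15, 0.04],
              [0.14, 0.05, 0.09]])

E_x = x_vals @ p.sum(axis=1)                 # 82
E_y = y_vals @ p.sum(axis=0)                 # 55
E_xy = (np.outer(x_vals, y_vals) * p).sum()  # E[XY] = 4550

# Shortcut formula: Cov(X, Y) = E[XY] - E[X] E[Y]
cov = E_xy - E_x * E_y
print(round(cov, 6))   # 40.0
```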
Covariance
⚫ Interpretation:
⚫ If covariance is positive, when X is above average, Y usually is too; and when X is below average, Y usually is too.
⚫ If covariance is negative, when X is above average, Y is usually below average, and vice versa.
⚫ If X and Y are independent, their covariance is zero. (The converse is not true.)
⚫ The magnitude does not mean much (it depends on the units of X and Y)
Correlation
⚫ To gain more insight from the magnitude, we define the correlation coefficient as follows:
    ρ_XY = Cov(X, Y) / (σ_X σ_Y)
⚫ The correlation coefficient is always between −1 and +1
⚫ It quantifies the strength of the linear relationship between X and Y
Correlation
⚫ If ρ_XY = 1, then Y = aX + b for some a > 0
⚫ If ρ_XY = −1, then Y = aX + b for some a < 0
⚫ If ρ_XY = 0, there is no linear relationship between X and Y
⚫ ρ_XY = 0 does not imply that X and Y are independent
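For the insurance example, the correlation coefficient can be computed from the same joint PMF; the slides do not work this particular number, so treat it as an illustrative calculation.

```python
import numpy as np

x_vals = np.array([0, 100, 200])
y_vals = np.array([0, 50, 150])
p = np.array([[0.25, 0.06, 0.15],
              [0.07, 0.15, 0.04],
              [0.14, 0.05, 0.09]])

p_x, p_y = p.sum(axis=1), p.sum(axis=0)
E_x, E_y = x_vals @ p_x, y_vals @ p_y
var_x = (x_vals**2) @ p_x - E_x**2           # Var(X) = E[X^2] - E[X]^2
var_y = (y_vals**2) @ p_y - E_y**2
cov = (np.outer(x_vals, y_vals) * p).sum() - E_x * E_y   # 40

# rho = Cov(X, Y) / (sigma_X * sigma_Y), always in [-1, +1]
rho = cov / np.sqrt(var_x * var_y)
print(round(rho, 4))   # approximately 0.0076
```

The covariance of 40 sounded like a meaningful number, but ρ ≈ 0.008 shows the linear relationship between the two deductibles is actually very weak.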
Covariance - properties
⚫ Cov(X, X) = Var(X)
⚫ If X and Y are independent, Cov(X, Y) = 0
⚫ Cov(X, Y) = Cov(Y, X)
⚫ Cov(aX, Y) = a Cov(X, Y)
⚫ Cov(X + c, Y) = Cov(X, Y)
⚫ Cov(X + Y, Z) = Cov(X, Z) + Cov(Y, Z)
Covariance – special formulae
⚫ Cov(Σ_{i=1}^{m} a_i X_i, Σ_{j=1}^{n} b_j Y_j) = Σ_{i=1}^{m} Σ_{j=1}^{n} a_i b_j Cov(X_i, Y_j)
⚫ Var(aX + bY) = a² Var(X) + b² Var(Y) + 2ab Cov(X, Y)
Covariance - examples
⚫ Cov(X_1 + 2X_2, 3Y_1 + 4Y_2) = 3 Cov(X_1, Y_1) + 6 Cov(X_2, Y_1) + 4 Cov(X_1, Y_2) + 8 Cov(X_2, Y_2)
⚫ Let X and Y be independent standard normal random variables. What is Cov(1 + X + XY², 1 + X)?
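A Monte Carlo sanity check of the second example (an illustration, not part of the slides): using the properties above, Cov(1 + X + XY², 1 + X) = Cov(X, X) + Cov(XY², X) = Var(X) + E[X²Y²] − E[XY²]E[X] = 1 + E[X²]E[Y²] − 0 = 2.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2_000_000
X = rng.standard_normal(n)   # independent standard normals
Y = rng.standard_normal(n)

U = 1 + X + X * Y**2
V = 1 + X
# Sample covariance via the shortcut formula E[UV] - E[U]E[V]
cov_est = np.mean(U * V) - np.mean(U) * np.mean(V)
print(cov_est)   # close to the analytic value of 2
```

Simulation like this is a cheap way to catch sign or coefficient errors when expanding covariances of composite expressions.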
Summary
⚫ Joint discrete (continuous) random variables have a joint PMF (PDF) and CDF
⚫ Marginal distributions for each of the RVs can be calculated by summing (integrating) over the other random variable
⚫ Expected values for functions of joint random variables are like expected values for single random variables
⚫ Covariance and the correlation coefficient are measures for determining the linear relation between two RVs
Any Questions?
⚫ Thank you for attending
⚫ Have a fun (and safe) spring break