Covariance and correlation
PRACTICING STATISTICS INTERVIEW QUESTIONS IN R
Zuzanna Chmielewska, Actuary

Covariance

Formula for a sample:
$$\operatorname{cov}(X, Y) = \frac{\sum_{i=1}^{n} (x_i - \bar{x}) \cdot (y_i - \bar{y})}{n - 1}$$

Formula for a population:
$$\operatorname{cov}(X, Y) = \frac{\sum_{i=1}^{n} (x_i - \bar{x}) \cdot (y_i - \bar{y})}{n}$$

Covariance - numerical example

$$x_1 = 3, \quad x_2 = 5, \quad x_3 = 7, \quad \bar{x} = 5$$
$$y_1 = 6, \quad y_2 = 11, \quad y_3 = 13, \quad \bar{y} = 10$$

$$(x_1 - \bar{x}) \cdot (y_1 - \bar{y}) = 8$$
$$(x_2 - \bar{x}) \cdot (y_2 - \bar{y}) = 0$$
$$(x_3 - \bar{x}) \cdot (y_3 - \bar{y}) = 6$$

$$\sum_{i=1}^{n} (x_i - \bar{x}) \cdot (y_i - \bar{y}) = 14$$
$$\frac{\sum_{i=1}^{n} (x_i - \bar{x}) \cdot (y_i - \bar{y})}{n - 1} = 7$$
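
The numerical example can be reproduced in R. This is a minimal sketch using the three data points from the slide; the variable names (`products`, `sample_cov`, `population_cov`) are chosen here for illustration. Note that base R's `cov()` uses the sample formula (denominator n - 1).

```r
# Data from the numerical example
x <- c(3, 5, 7)
y <- c(6, 11, 13)
n <- length(x)

# Products of deviations from the means: 8, 0, 6
products <- (x - mean(x)) * (y - mean(y))

# Sample covariance (denominator n - 1)
sample_cov <- sum(products) / (n - 1)

# Population covariance (denominator n)
population_cov <- sum(products) / n

# Built-in function, which uses the sample formula
builtin_cov <- cov(x, y)

print(c(sample = sample_cov, population = population_cov, builtin = builtin_cov))
```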

Correlation coefficient

$$\operatorname{corr}(X, Y) = \frac{\operatorname{cov}(X, Y)}{\sigma_x \cdot \sigma_y}$$
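
As a small illustration of the formula, the correlation coefficient can be computed by hand and compared with `cor()`. The data are the three points from the covariance example; `sd()` and `cov()` both use the sample (n - 1) versions, so the denominators cancel consistently.

```r
x <- c(3, 5, 7)
y <- c(6, 11, 13)

# Correlation as covariance divided by the product of standard deviations
manual_corr <- cov(x, y) / (sd(x) * sd(y))

# Built-in Pearson correlation
builtin_corr <- cor(x, y)

print(c(manual = manual_corr, builtin = builtin_corr))
```

Unlike the covariance, the result is unit-free and always lies between -1 and 1.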

Nonlinear relationships
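
A short sketch of why nonlinear relationships matter for the correlation coefficient: below, `y` is fully determined by `x`, yet the Pearson correlation is zero because the relationship is not linear. The data are made up for illustration.

```r
# y depends perfectly on x, but through a symmetric, nonlinear function
x <- -5:5
y <- x^2

# Pearson correlation only measures linear association
r <- cor(x, y)
print(r)
```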

Correlation does not imply causation!

Summary
- covariance
- correlation coefficient

Let's practice!

Linear regression model

Linear regression model

$$y_i = \beta_0 + \beta_1 \cdot x_{i1} + \dots + \beta_p \cdot x_{ip} + e_i$$

where:
$y_i$: dependent variable,
$x_{ij}$: independent variables,
$\beta_j$: parameters,
$e_i$: error.

Linear predictor function

$$\hat{y}_i = \beta_0 + \beta_1 \cdot x_{i1} + \dots + \beta_p \cdot x_{ip}$$

With a single independent variable:

$$\hat{y}_i = \beta_0 + \beta_1 \cdot x_i$$

Log-transformation

Examples:
$$\hat{y}_i = \beta_0 + \beta_1 \cdot \ln(x_{i1}) + \dots + \beta_p \cdot x_{ip}$$
$$\ln(\hat{y}_i) = \beta_0 + \beta_1 \cdot x_{i1} + \dots + \beta_p \cdot x_{ip}$$
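
A log-transformed response can be fitted directly in the model formula. The sketch below assumes the built-in `cars` data set (all `dist` values are positive, so the logarithm is defined); the names `log_model` and `pred_dist` are illustrative.

```r
# Fit ln(dist) as a linear function of speed
log_model <- lm(log(dist) ~ speed, data = cars)

# Predictions are on the log scale
new_car <- data.frame(speed = 17.5)
pred_log <- predict(log_model, newdata = new_car)

# Back-transform to the original scale of dist
pred_dist <- exp(pred_log)
print(pred_dist)
```

A transformed predictor works the same way, for example `lm(dist ~ log(speed), data = cars)`.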

Assumptions
- Linear relationship
- Normally distributed errors
- Homoscedastic errors
- Independent observations
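
One way to probe the normality assumption numerically is a Shapiro-Wilk test on the model residuals. This is only a sketch on the built-in `cars` data, not a complete diagnostic workflow; the other assumptions are usually assessed visually with residual plots.

```r
model <- lm(dist ~ speed, data = cars)

# Residuals are the estimates of the error terms e_i
res <- residuals(model)

# Shapiro-Wilk test: the null hypothesis is that the residuals are normal
sw <- shapiro.test(res)
print(sw$p.value)
```

A small p-value suggests the residuals deviate from normality.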

Linear model in R

    model <- lm(dist ~ speed, data = cars)
    print(model)

    Call:
    lm(formula = dist ~ speed, data = cars)

    Coefficients:
    (Intercept)        speed
        -17.579        3.932

Linear model in R

    model <- lm(dist ~ speed, data = cars)
    new_car <- data.frame(speed = 17.5)
    predict(model, newdata = new_car)

           1
    51.23806
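
The prediction is simply the linear predictor function evaluated at the new value. As a minimal check, the same number can be computed by hand from the fitted coefficients; `manual_pred` is a name introduced here for illustration.

```r
model <- lm(dist ~ speed, data = cars)
new_car <- data.frame(speed = 17.5)

# Extract the estimated parameters beta_0 and beta_1
beta0 <- coef(model)[["(Intercept)"]]
beta1 <- coef(model)[["speed"]]

# Linear predictor: beta_0 + beta_1 * x
manual_pred <- beta0 + beta1 * 17.5

# Same value as predict()
builtin_pred <- unname(predict(model, newdata = new_car))
print(c(manual = manual_pred, builtin = builtin_pred))
```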

Diagnostic plots

    model <- lm(dist ~ speed, data = cars)
    plot(model)

Summary
- linear regression model
- linear predictor function
- lm() in R
- diagnostic plots

Let's practice!

Logistic regression model

Logistic regression's application

Logistic function

$$f(x) = \frac{1}{1 + e^{-x}} \in (0, 1)$$
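
The logistic function is easy to write by hand in R, and it matches the built-in `plogis()`. The sketch below defines a helper called `logistic` (a name chosen here) and evaluates it at a few points to show that the output stays between 0 and 1.

```r
# Logistic function: maps any real number to the interval (0, 1)
logistic <- function(x) 1 / (1 + exp(-x))

values <- logistic(c(-10, 0, 10))
print(values)

# Base R provides the same function as plogis()
same_as_builtin <- isTRUE(all.equal(logistic(2), plogis(2)))
print(same_as_builtin)
```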

Logistic regression model

Probability prediction:
$$p_i = P(y_i = 1) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 \cdot x_{i1} + \dots + \beta_p \cdot x_{ip})}}$$

Logit prediction:
$$l_i = \ln\left(\frac{p_i}{1 - p_i}\right) = \beta_0 + \beta_1 \cdot x_{i1} + \dots + \beta_p \cdot x_{ip}$$
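
The two forms are inverses of each other, which can be checked numerically. The coefficients below (`beta0 = -1`, `beta1 = 0.5`) are hypothetical values picked for illustration, not estimates from any data.

```r
# Hypothetical parameters and predictor values
beta0 <- -1
beta1 <- 0.5
x <- c(0, 2, 4)

# Linear predictor: -1, 0, 1
eta <- beta0 + beta1 * x

# Probability prediction via the logistic function
p <- 1 / (1 + exp(-eta))

# Logit prediction recovers the linear predictor
logit <- log(p / (1 - p))

print(data.frame(x = x, eta = eta, p = p, logit = logit))
```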

Logistic regression in R

    model <- glm(y ~ x, data = df, family = "binomial")
    predict(model, newdata = new_df, type = "response")
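
Since `df` and `new_df` are placeholders, here is a runnable sketch that swaps in the built-in `mtcars` data: the binary response is `am` (transmission type) and the predictor is `wt` (weight). The data set, the new weights, and the 0.5 cut-off are all assumptions made for this example.

```r
# Fit a logistic regression on a built-in data set
model <- glm(am ~ wt, data = mtcars, family = "binomial")

# Hypothetical new observations
new_df <- data.frame(wt = c(2.5, 3.5))

# type = "response" returns probabilities, type = "link" returns logits
probs <- predict(model, newdata = new_df, type = "response")
logits <- predict(model, newdata = new_df, type = "link")

# Turn probabilities into class labels with a 0.5 threshold
predicted_class <- ifelse(probs > 0.5, 1, 0)

print(data.frame(new_df, prob = probs, logit = logits, class = predicted_class))
```

Without `type = "response"`, `predict()` on a `glm` object returns values on the logit scale.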

Summary
- logistic regression model
- prediction of a binary response variable
- logistic regression in R with glm()

Let's practice!

Model evaluation

Cross-validation
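
A minimal sketch of k-fold cross-validation in base R, assuming the `cars` data, k = 5 folds, and RMSE as the evaluation metric. The seed and the fold assignment are arbitrary choices for illustration; dedicated packages offer more complete tooling.

```r
set.seed(42)
k <- 5

# Randomly assign each observation to one of k folds
folds <- sample(rep(1:k, length.out = nrow(cars)))

# For each fold: train on the other folds, evaluate on the held-out fold
rmse_per_fold <- sapply(1:k, function(i) {
  train <- cars[folds != i, ]
  test <- cars[folds == i, ]
  fit <- lm(dist ~ speed, data = train)
  pred <- predict(fit, newdata = test)
  sqrt(mean((test$dist - pred)^2))
})

# Average the out-of-sample error across folds
cv_rmse <- mean(rmse_per_fold)
print(cv_rmse)
```

Every observation is used for testing exactly once, so the averaged error is less dependent on a single train/test split.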

Confusion matrix

Classification metrics

$$\text{accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$
$$\text{precision} = \frac{TP}{TP + FP}$$
$$\text{recall} = \frac{TP}{TP + FN}$$
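
The metrics can be read off a confusion matrix built with `table()`. The `actual` and `predicted` vectors below are made-up labels for illustration (1 = positive class, 0 = negative class).

```r
# Hypothetical true labels and model predictions
actual    <- c(1, 1, 1, 1, 0, 0, 0, 0, 0, 0)
predicted <- c(1, 1, 1, 0, 1, 0, 0, 0, 0, 0)

# Confusion matrix: rows are predictions, columns are true labels
cm <- table(Predicted = predicted, Actual = actual)
print(cm)

TP <- cm["1", "1"]  # predicted 1, actually 1
FP <- cm["1", "0"]  # predicted 1, actually 0
FN <- cm["0", "1"]  # predicted 0, actually 1
TN <- cm["0", "0"]  # predicted 0, actually 0

accuracy  <- (TP + TN) / (TP + TN + FP + FN)
precision <- TP / (TP + FP)
recall    <- TP / (TP + FN)

print(c(accuracy = accuracy, precision = precision, recall = recall))
```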

Regression metrics

Root Mean Squared Error (high weight to large errors):
$$RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}$$

Mean Absolute Error (straightforward interpretation):
$$MAE = \frac{1}{n} \sum_{i=1}^{n} |y_i - \hat{y}_i|$$
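
Both metrics take one line each in R. The observed and predicted values below are invented for illustration, with one deliberately large error to show how RMSE reacts more strongly than MAE.

```r
# Hypothetical observed and predicted values; the last prediction is far off
y     <- c(3, 5, 7, 9)
y_hat <- c(2, 5, 8, 13)

errors <- y - y_hat  # 1, 0, -1, -4

# Root Mean Squared Error: squaring gives large errors more weight
rmse <- sqrt(mean(errors^2))

# Mean Absolute Error: average size of the errors
mae <- mean(abs(errors))

print(c(RMSE = rmse, MAE = mae))
```

Here the single error of 4 pulls the RMSE above the MAE.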

Summary
- validation set approach
- cross-validation
- confusion matrix
- classification metrics
- regression metrics

Let's practice!

Wrapping up

Congratulations!

Chapter 1
Probability distributions:
- discrete distributions
- continuous distributions
- central limit theorem

Chapter 2
Exploratory Data Analysis:
- descriptive statistics
- categorical data
- time-series
- principal component analysis

Chapter 3
Statistical tests:
- normality tests
- inference for a mean
- comparing two means
- ANOVA