Visualizing bivariate relationships
CORRELATION AND REGRESSION IN R
Ben Baumer
Assistant Professor at Smith College

Bivariate relationships
- Both variables are numerical
- Response variable
  - a.k.a. y, dependent
- Explanatory variable
  - Something you think might be related to the response
  - a.k.a. x, independent, predictor

Graphical representations
- Put response on vertical axis
- Put explanatory on horizontal axis

Scatterplot
    ggplot(data = possum, aes(y = totalL, x = tailL)) +
      geom_point()

Scatterplot
    ggplot(data = possum, aes(y = totalL, x = tailL)) +
      geom_point() +
      scale_x_continuous("Length of Possum Tail (cm)") +
      scale_y_continuous("Length of Possum Body (cm)")

Bivariate relationships
- Can think of boxplots as scatterplots…
  - …but with discretized explanatory variable
- cut() function discretizes
  - Choose appropriate number of "boxes"

Scatterplot
    ggplot(data = possum, aes(y = totalL, x = cut(tailL, breaks = 5))) +
      geom_point()

Boxplot
    ggplot(data = possum, aes(y = totalL, x = cut(tailL, breaks = 5))) +
      geom_boxplot()

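To see what cut() does before it reaches the plot, it can help to run it on a plain numeric vector. The sketch below uses base R only; the vector tail_cm holds made-up values chosen for illustration and is not the possum data.

```r
# Hypothetical tail lengths in cm (illustrative values, not the possum data)
tail_cm <- c(32.0, 35.5, 36.0, 37.0, 38.5, 39.0, 40.5, 41.0, 42.5, 43.0)

# With a single number for `breaks`, cut() splits the observed range
# into that many equal-width intervals and returns a factor
tail_bins <- cut(tail_cm, breaks = 5)

# Each factor level is one "box"; table() counts the observations per box
levels(tail_bins)
table(tail_bins)
```

Changing breaks changes how many boxes the boxplot draws, which is the "appropriate number of boxes" choice mentioned above.
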
Let's practice!

Characterizing bivariate relationships

Characterizing bivariate relationships
- Form (e.g. linear, quadratic, non-linear)
- Direction (e.g. positive, negative)
- Strength (how much scatter/noise?)
- Outliers

Sign legibility

NIST

NIST 2

Non-linear

Fan shape

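One way to get a feel for form, direction, and strength is to simulate data where each is known in advance. The following is a minimal sketch on simulated data, not the datasets behind the figures; the names sim and pattern, the seed, and the noise levels are all assumptions picked for illustration.

```r
library(ggplot2)

set.seed(1)  # arbitrary seed, only so the simulated noise is reproducible
n <- 200
x <- runif(n, min = 0, max = 10)

sim <- data.frame(
  x = rep(x, times = 3),
  y = c(
    2 * x + rnorm(n, sd = 1),   # positive, linear, little scatter
    2 * x + rnorm(n, sd = 8),   # same form and direction, much more scatter
    2 * x + rnorm(n, sd = x)    # scatter grows with x: a fan shape
  ),
  pattern = rep(c("strong linear", "weak linear", "fan shape"), each = n)
)

# One panel per simulated pattern
p <- ggplot(data = sim, aes(x = x, y = y)) +
  geom_point(alpha = 0.5) +
  facet_wrap(~ pattern)
p
```

All three panels share the same positive linear trend; only the amount and shape of the scatter differ.
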
Let's practice!

Outliers

Outliers
    ggplot(data = mlbBat10, aes(x = SB, y = HR)) +
      geom_point()

Add transparency
    ggplot(data = mlbBat10, aes(x = SB, y = HR)) +
      geom_point(alpha = 0.5)

Add some jitter
    ggplot(data = mlbBat10, aes(x = SB, y = HR)) +
      geom_point(alpha = 0.5, position = "jitter")

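position = "jitter" uses default jitter amounts and a fresh random shift on each run. If more control is wanted, ggplot2's position_jitter() takes explicit width, height, and seed arguments. The sketch below assumes a made-up data frame counts of integer-valued columns (not mlbBat10), and the jitter amounts are arbitrary.

```r
library(ggplot2)

# Hypothetical integer-valued counts (not the mlbBat10 data);
# many points land on exactly the same coordinates
set.seed(42)
counts <- data.frame(
  sb = rpois(300, lambda = 2),
  hr = rpois(300, lambda = 3)
)

# width/height bound the random shift in each direction (in data units);
# seed makes the same jittered plot come out on every run
p <- ggplot(data = counts, aes(x = sb, y = hr)) +
  geom_point(alpha = 0.5,
             position = position_jitter(width = 0.2, height = 0.2, seed = 1))
p
```

Keeping the jitter below 0.5 on integer data leaves each point visibly closer to its true value than to a neighboring one.
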
Identify the outliers
    mlbBat10 %>%
      filter(SB > 60 | HR > 50) %>%
      select(name, team, position, SB, HR)

            name team position SB HR
    1   J Pierre  CWS       OF 68  1
    2 J Bautista  TOR       OF  9 54

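The filter above uses | (or), so a player is kept if either count is extreme. A small sketch on a hypothetical data frame (the name batting and every row in it are invented for illustration) shows how that differs from & (and), assuming dplyr is installed.

```r
library(dplyr)

# Hypothetical players and counts, for illustration only
batting <- data.frame(
  name = c("Player A", "Player B", "Player C", "Player D"),
  SB   = c(68, 9, 12, 30),
  HR   = c(1, 54, 20, 10)
)

# `|` keeps a row when either condition holds,
# so both kinds of extreme player are returned
either <- batting %>%
  filter(SB > 60 | HR > 50) %>%
  select(name, SB, HR)

# `&` requires both conditions at once; no player here is extreme on both
both <- batting %>%
  filter(SB > 60 & HR > 50)

either
nrow(both)
```

With & the filter would have missed both outliers on the slide, since each is extreme on only one axis.
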
Let's practice!