Scatter plots (Introduction to Data Visualization with ggplot2) - PowerPoint PPT Presentation

  1. Scatter plots. Introduction to Data Visualization with ggplot2. Rick Scavetta, Founder, Scavetta Academy

  2. 48 geometries (geom_*):
     abline, contour, dotplot, jitter, pointrange, ribbon, spoke,
     area, count, errorbar, label, polygon, rug, step,
     bar, crossbar, errorbarh, line, qq, segment, text,
     bin2d, curve, freqpoly, linerange, qq_line, sf, tile,
     blank, density, hex, map, quantile, sf_label, violin,
     boxplot, density2d, histogram, path, raster, sf_text, vline,
     col, density_2d, hline, point, rect, smooth

  3. Common plot types
     Plot type     | Possible geoms
     Scatter plots | points, jitter, abline, smooth, count

  4. Scatter plots
         library(ggplot2)  # needed once; the later slides assume it is attached
         ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width)) +
           geom_point()
     Each geom can accept specific aesthetic mappings, e.g. geom_point():
     Essential: x, y

  5. Scatter plots
         ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width, col = Species)) +
           geom_point()
     Each geom can accept specific aesthetic mappings, e.g. geom_point():
     Essential: x, y
     Optional: alpha, color, fill, shape, size, stroke

  6. Geom-specific aesthetic mappings
         # These result in the same plot!
         ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width, col = Species)) +
           geom_point()
         ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width)) +
           geom_point(aes(col = Species))
     Control aesthetic mappings of each layer independently.
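
The two calls on this slide only differ once a second layer is involved. The sketch below is an illustration and not part of the original slides: it assumes ggplot2 is installed and adds a `geom_smooth()` layer purely to show the contrast between a per-layer mapping and a global one.

```r
library(ggplot2)

# Colour is mapped only inside geom_point(), so it applies to that layer alone.
# geom_smooth() inherits just x and y from ggplot() and therefore fits one
# trend line for the whole data set.
p_layer <- ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width)) +
  geom_point(aes(col = Species)) +
  geom_smooth(method = "lm", se = FALSE)

# For contrast: mapping colour in ggplot() passes it on to every layer,
# so the smoother is fitted and drawn once per species.
p_global <- ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width, col = Species)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE)
```

Printing `p_layer` and `p_global` side by side shows one trend line in the first plot and three in the second.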

  7. head(iris, 3)  # Raw data
       Species Sepal.Length Sepal.Width Petal.Length Petal.Width
     1  setosa          5.1         3.5          1.4         0.2
     2  setosa          4.9         3.0          1.4         0.2
     3  setosa          4.7         3.2          1.3         0.2
     library(dplyr)  # provides %>%, group_by() and summarise_all()
     iris %>%
       group_by(Species) %>%
       summarise_all(mean) -> iris.summary
     iris.summary  # Summary statistics
     # A tibble: 3 x 5
       Species    Sepal.Length Sepal.Width Petal.Length Petal.Width
       <fct>             <dbl>       <dbl>        <dbl>       <dbl>
     1 setosa             5.01        3.43         1.46       0.246
     2 versicolor         5.94        2.77         4.26       1.33
     3 virginica          6.59        2.97         5.55       2.03

  8. ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width, col = Species)) +
       # Inherits both data and aes from ggplot()
       geom_point() +
       # Different data, but inherited aes
       geom_point(data = iris.summary, shape = 15, size = 5)

  9. Shape attribute values

  10. Example
          ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width, col = Species)) +
            geom_point() +
            geom_point(data = iris.summary, shape = 21, size = 5,
                       fill = "black", stroke = 2)

  11. On-the-fly stats by ggplot2
      See the second course for the stats layer.
      Note: Avoid plotting only the mean without a measure of spread, e.g. the standard deviation.
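
One way to act on the note above is to draw each species mean together with its standard deviation. The following is a minimal sketch and not the stats-layer approach the slide defers to the second course: it assumes ggplot2 is installed, computes the summaries with base R `aggregate()`, and uses the hypothetical names `iris_spread` and `sd_width`.

```r
library(ggplot2)

# Per-species means of both sepal measurements, plus the SD of Sepal.Width.
iris_means <- aggregate(cbind(Sepal.Length, Sepal.Width) ~ Species,
                        data = iris, FUN = mean)
iris_sd <- aggregate(Sepal.Width ~ Species, data = iris, FUN = sd)
names(iris_sd)[2] <- "sd_width"            # hypothetical column name
iris_spread <- merge(iris_means, iris_sd, by = "Species")

p_spread <- ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width, col = Species)) +
  geom_point(alpha = 0.4) +
  # Species means as large squares, as on the earlier slide
  geom_point(data = iris_spread, shape = 15, size = 5) +
  # Vertical bars spanning mean - 1 SD to mean + 1 SD of Sepal.Width
  geom_errorbar(data = iris_spread,
                aes(ymin = Sepal.Width - sd_width,
                    ymax = Sepal.Width + sd_width),
                width = 0.05)
```

Only the vertical spread is shown here; the spread in `Sepal.Length` could be added in the same way with a horizontal error bar.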

  12. position = "jitter"
          ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width, col = Species)) +
            geom_point(position = "jitter")

  13. geom_jitter()
      A shortcut to geom_point(position = "jitter")
          ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width, col = Species)) +
            geom_jitter()

  14. Don't forget to adjust alpha
      Combine jittering with alpha-blending if necessary.
          ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width, col = Species)) +
            geom_jitter(alpha = 0.6)

  15. Hollow circles also help
      shape = 1 is a hollow circle.
      Not necessary to also use alpha-blending.
          ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width, col = Species)) +
            geom_jitter(shape = 1)
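
The jitter slides rely on the default amount of random noise, which changes on every redraw. As a hedged sketch (not shown on the slides), `position_jitter()` lets you set the noise explicitly and fix a seed; the values 0.1 and 42 below are arbitrary illustrations, and ggplot2 is assumed to be installed.

```r
library(ggplot2)

# position_jitter() exposes the amount of noise and a random seed.
# width and height are in data units and apply in both directions,
# so each point moves by at most 0.1 along either axis.
p_jitter <- ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width, col = Species)) +
  geom_point(position = position_jitter(width = 0.1, height = 0.1, seed = 42),
             alpha = 0.6)
```

With a seed set, rebuilding the plot places the points in the same positions each time, which helps when a figure has to be reproducible.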

  16. Let's practice!

  17. Histograms

  18. Common plot types
      Plot type     | Possible geoms
      Scatter plots | points, jitter, abline, smooth, count
      Bar plots     | histogram, bar, col, errorbar
      Line plots    | line, path

  19. Histograms
          ggplot(iris, aes(x = Sepal.Width)) +
            geom_histogram()
      A plot of binned values, i.e. a statistical function.
      `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

  20. Default of 30 even bins
          ggplot(iris, aes(x = Sepal.Width)) +
            geom_histogram()
      A plot of binned values, i.e. a statistical function.
          # Default bin width:
          diff(range(iris$Sepal.Width))/30
          [1] 0.08
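
The 0.08 on this slide can be reproduced step by step in base R. This is the slide's rule of thumb spelled out; ggplot2's own break placement may not match this figure exactly, so treat it as an approximation. The variable names are hypothetical.

```r
# Range of the variable being binned
width_range <- range(iris$Sepal.Width)      # smallest and largest sepal width

# Total span divided by the default number of bins (30)
approx_binwidth <- diff(width_range) / 30

width_range
approx_binwidth
```

A width of 0.08 does not line up with measurements recorded to one decimal place, which is the motivation for choosing `binwidth` by hand.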

  21. Intuitive and meaningful bin widths
          ggplot(iris, aes(x = Sepal.Width)) +
            geom_histogram(binwidth = 0.1)
      Always set a meaningful bin width for your data.
      No spaces between bars.

  22. Re-position tick marks
          ggplot(iris, aes(x = Sepal.Width)) +
            geom_histogram(binwidth = 0.1, center = 0.05)
      Always set a meaningful bin width for your data.
      No spaces between bars.
      X-axis labels are between bars.
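
To see what `center = 0.05` does, it can help to inspect the computed bins rather than the picture. The sketch below assumes ggplot2 is installed and uses `layer_data()` to pull out the bin edges; `p_hist` and `bins` are hypothetical names.

```r
library(ggplot2)

p_hist <- ggplot(iris, aes(x = Sepal.Width)) +
  geom_histogram(binwidth = 0.1, center = 0.05)

# layer_data() returns the data computed by the stat,
# including the left (xmin) and right (xmax) edge of every bin.
bins <- layer_data(p_hist)
head(bins[, c("xmin", "xmax", "count")])
```

Because one bin is centred on 0.05 and every bin is 0.1 wide, the edges fall on multiples of 0.1. Axis ticks at round values therefore sit on the boundaries between bars instead of under their centres.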

  23. Different Species
          ggplot(iris, aes(x = Sepal.Width, fill = Species)) +
            geom_histogram(binwidth = .1, center = 0.05)

  24. Default position is "stack"
          ggplot(iris, aes(x = Sepal.Width, fill = Species)) +
            geom_histogram(binwidth = .1, center = 0.05, position = "stack")

  25. position = "dodge"
          ggplot(iris, aes(x = Sepal.Width, fill = Species)) +
            geom_histogram(binwidth = .1, center = 0.05, position = "dodge")

  26. position = "fill"
          ggplot(iris, aes(x = Sepal.Width, fill = Species)) +
            geom_histogram(binwidth = .1, center = 0.05, position = "fill")
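
The practical difference between "stack" and "fill" is the scale of the y-axis: counts in one case, proportions in the other. The sketch below is an illustration under the assumption that ggplot2 is installed; it compares the tops of the stacks via `layer_data()`, and the object names are hypothetical.

```r
library(ggplot2)

base_plot <- ggplot(iris, aes(x = Sepal.Width, fill = Species))

p_stack <- base_plot +
  geom_histogram(binwidth = 0.1, center = 0.05, position = "stack")
p_fill <- base_plot +
  geom_histogram(binwidth = 0.1, center = 0.05, position = "fill")

# Top of the tallest stack: a count for "stack", a proportion for "fill".
# Empty bins give 0/0 under "fill", hence na.rm = TRUE.
top_stack <- max(layer_data(p_stack)$ymax, na.rm = TRUE)
top_fill  <- max(layer_data(p_fill)$ymax, na.rm = TRUE)
```

With "fill", every non-empty bin is rescaled to a total height of 1, so the plot shows the share of each species per bin and no longer shows how many observations a bin holds.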

  27. Final Slide

  28. Bar plots

  29. Bar plots, with a categorical x-axis
      Use geom_bar() or geom_col()
      Geom       | Stat       | Action
      geom_bar() | "count"    | Counts the number of cases at each x position
      geom_col() | "identity" | Plot actual values
      All positions from before are available.
      Two types: absolute counts, distributions

  30. Bar plots, with a categorical x-axis
      Use geom_bar() or geom_col()
      Geom       | Stat       | Action
      geom_bar() | "count"    | Counts the number of cases at each x position
      geom_col() | "identity" | Plot actual values

  31. Bar plots, with a categorical x-axis
      Use geom_bar() or geom_col()
      Geom       | Stat       | Action
      geom_bar() | "count"    | Counts the number of cases at each x position
      geom_col() | "identity" | Plot actual values
      All positions from before are available.
      Two types: absolute counts, distributions

  32. Habits of mammals
          str(sleep)
          'data.frame': 76 obs. of 3 variables:
           $ vore : Factor w/ 4 levels "carni","herbi",..: 1 4 2 4 2 2 1 1 2 2 ...
           $ total: num 12.1 17 14.4 14.9 4 14.4 8.7 10.1 3 5.3 ...
           $ rem  : num NA 1.8 2.4 2.3 0.7 2.2 1.4 2.9 NA 0.6 ...

  33. Bar plot
          ggplot(sleep, aes(vore)) +
            geom_bar()
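
The difference between `geom_bar()` and `geom_col()` is easiest to see by building the same bar plot both ways. The slides do not say where `sleep` comes from, so this sketch makes an assumption: it uses ggplot2's built-in `msleep` data with the rows lacking a diet category removed as a stand-in. The names `sleep_like` and `vore_counts` are hypothetical.

```r
library(ggplot2)

# Stand-in for the slide's `sleep` data (an assumption, not from the slides):
# msleep without the rows where the diet category `vore` is missing.
sleep_like <- msleep[!is.na(msleep$vore), ]

# geom_bar(): stat = "count" tallies the rows in each category for us.
p_bar <- ggplot(sleep_like, aes(vore)) +
  geom_bar()

# geom_col(): stat = "identity", so we tally first and plot the totals as-is.
vore_counts <- as.data.frame(table(vore = sleep_like$vore))
p_col <- ggplot(vore_counts, aes(x = vore, y = Freq)) +
  geom_col()
```

Both plots show the same bar heights. `geom_col()` is the better fit when the values to plot have already been computed, as on the error-bar slide later in this section.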

  34. Plotting distributions instead of absolute counts
          # Calculate Descriptive Statistics:
          iris %>%
            select(Species, Sepal.Width) %>%
            gather(key, value, -Species) %>%
            group_by(Species) %>%
            summarise(avg = mean(value),
                      stdev = sd(value)) -> iris_summ_long
          iris_summ_long
          Species     avg   stdev
          setosa      3.43  0.38
          versicolor  2.77  0.31
          virginica   2.97  0.32
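
The pipeline on this slide needs both dplyr and tidyr, and `gather()` has since been superseded in tidyr by `pivot_longer()`. For a single variable, the reshaping step is not strictly required. As an alternative sketch (not the author's method), the same table can be computed with base R alone; `iris_summ` is a hypothetical name.

```r
# Mean and standard deviation of Sepal.Width per species, base R only.
iris_summ <- aggregate(Sepal.Width ~ Species, data = iris,
                       FUN = function(v) c(avg = mean(v), stdev = sd(v)))

# aggregate() stores the two statistics in a single matrix column;
# flatten it into ordinary columns named avg and stdev.
iris_summ <- data.frame(Species = iris_summ$Species, iris_summ$Sepal.Width)

# Rounded to two decimals for comparison with the table on the slide
round(iris_summ[, c("avg", "stdev")], 2)
```

The resulting data frame has the columns `Species`, `avg` and `stdev`, so it can be passed to the `geom_col()` and `geom_errorbar()` call on the next slide in place of `iris_summ_long`.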

  35. Plotting distributions
          ggplot(iris_summ_long, aes(x = Species, y = avg)) +
            geom_col() +
            geom_errorbar(aes(ymin = avg - stdev, ymax = avg + stdev),
                          width = 0.1)

  36. Let's practice!

  37. Line plots

  38. Common plot types
      Plot type     | Possible geoms
      Scatter plots | points, jitter, abline, smooth, count
      Bar plots     | histogram, bar, col, errorbar
      Line plots    | line, path
