
Data Visualization with ggplot2: Case Study I, Bag Plot


  1. DATA VISUALIZATION WITH GGPLOT2 Case Study I Bag Plot

  2. ggplot2 2.0
     ● Write your own extensions
     ● Extremely flexible
     ● Create bag plot
     ● John Tukey (box plots)
     ● 2D box plot

  3. Data set
     > dim(df)
     [1] 202   2
     > head(df)
       type     Value
     1    1  99.43952
     2    1  99.76982
     3    1 101.55871
     4    1 100.07051
     5    1 100.12929
     6    1 101.71506

  4. 2 box plots
     > ggplot(df, aes(x = type, Value)) +
         geom_boxplot() +
         facet_wrap(~type, ncol = 2, scales = "free")
     [Figure: box plots of Value, one facet per type (1 and 2), each with its own y-axis scale]

  5. Slope plot
     > df$ID <- seq_len(nrow(df) / 2)
     > ggplot(df, aes(x = type, Value, group = ID)) +
         geom_line(alpha = 0.3)
     [Figure: one line per ID joining its Value at type 1 to its Value at type 2]

  6. Distribution of slope
     Box plot?
     [Figure: distribution of the slopes; y-axis: slope]
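
The question on this slide can be tried out directly. The following is a minimal sketch, assuming simulated stand-in data (the means and sample size are assumptions, not the course data) and taking the slope of each line as the change in Value from type 1 to type 2, since the two types sit one unit apart on the x-axis.

```r
library(ggplot2)

# Simulated stand-in for df (assumption: not the data shown on the slides)
set.seed(1)
n <- 100
df <- data.frame(
  type  = rep(c(1, 2), each = n),
  Value = c(rnorm(n, mean = 100), rnorm(n, mean = 150)),
  ID    = rep(seq_len(n), times = 2)
)

# One slope per ID. Rows are ordered by ID within each type, so the two
# vectors line up element by element.
v1 <- df$Value[df$type == 1]
v2 <- df$Value[df$type == 2]
slopes <- data.frame(ID = seq_len(n), slope = v2 - v1)

# A single box plot summarising the distribution of the slopes
p_slope <- ggplot(slopes, aes(x = "", y = slope)) +
  geom_boxplot() +
  labs(x = NULL)
```

A box plot of the slopes summarises the pairs in one dimension, which motivates treating the two groups as two variables in the next slide.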

  7. 2 distinct variables
     > head(dat)
          group1   group2
     1  99.43952 149.2896
     2  99.76982 150.2569
     3 101.55871 149.7533
     4 100.07051 149.6525
     5 100.12929 149.0484
     6 101.71506 149.9550
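
The slides do not show how `dat` is derived from `df`. One possible route, sketched here with simulated stand-in values (an assumption), is to split `Value` by `type` and place the two pieces side by side.

```r
# Simulated long-format data (assumption: stand-in for the df on the slides)
set.seed(1)
n <- 100
df <- data.frame(
  type  = rep(c(1, 2), each = n),
  Value = c(rnorm(n, mean = 100), rnorm(n, mean = 150))
)

# Split Value by type, then bind the two equally long pieces as columns
dat <- as.data.frame(split(df$Value, df$type))
names(dat) <- c("group1", "group2")
head(dat)
```

This only works because both types have the same number of rows; unbalanced data would need an explicit ID to pair observations.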

  8. Scatter plot
     > ggplot(dat, aes(x = group1, y = group2)) +
         geom_point()
     [Figure: scatter plot of group2 against group1]

  9. 2D density plot
     > library(viridis)
     > ggplot(dat, aes(x = group1, y = group2)) +
         stat_density_2d(geom = "tile", aes(fill = ..density..),
                         contour = FALSE) +
         scale_fill_viridis()
     [Figure: tiled 2D density of group2 against group1; fill legend: density]

  10. Bag plot
      > library(aplpack)
      > bagplot(dat[1:2])
      [Figure: bag plot of group2 against group1, with the hull, bag and loop labelled]

  11. aplpack
      > library(aplpack)
      > plot_data <- compute.bagplot(x = dat$group1, y = dat$group2)
      > names(plot_data)
       [1] "center"      "hull.center" "hull.bag"    "hull.loop"
       [5] "pxy.bag"     "pxy.outer"   "pxy.outlier" "hdepths"
       [9] "is.one.dim"  "prdata"      "xy"          "xydata"
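
The components returned by `compute.bagplot()` can be drawn with ordinary ggplot2 layers. The sketch below assumes simulated stand-in data and assumes that `hull.bag` and `hull.loop` are two-column matrices of polygon vertices; the helper `as_xy()` and the fill colours are illustrative choices, not part of either package.

```r
library(aplpack)
library(ggplot2)

# Simulated stand-in for dat (assumption: not the data shown on the slides)
set.seed(1)
dat <- data.frame(group1 = rnorm(100, mean = 100),
                  group2 = rnorm(100, mean = 150))

plot_data <- compute.bagplot(x = dat$group1, y = dat$group2)

# Hypothetical helper: turn a two-column vertex matrix into a data frame
as_xy <- function(m) data.frame(x = m[, 1], y = m[, 2])
hull_bag  <- as_xy(plot_data$hull.bag)
hull_loop <- as_xy(plot_data$hull.loop)

# Loop first, bag on top of it, raw points last
p_bag <- ggplot(dat, aes(x = group1, y = group2)) +
  geom_polygon(data = hull_loop, aes(x = x, y = y), fill = "grey80") +
  geom_polygon(data = hull_bag,  aes(x = x, y = y), fill = "grey50") +
  geom_point()
```

Drawing the layers by hand like this is the step that a custom stat would later wrap up.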

  12. ggplot2
      > ggplot(dat, aes(x = group1, y = group2)) +
          geom_point()
      [Figure: scatter plot of group2 against group1]

  13. ggplot2
      > ggplot(dat, aes(x = group1, y = group2)) +
          stat_bag(alpha = 0.2)
      [Figure: semi-transparent bag plot polygons of group2 against group1]
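
`stat_bag()` is not part of ggplot2 itself; it is a user-written extension. To show the general shape of such an extension without reproducing the course's implementation, the sketch below defines a simpler stand-in: a stat that draws the convex hull of the points using base R's `chull()`. This is a convex hull, not a true bag, and the names `StatHullSketch` and `stat_hull_sketch()` are hypothetical.

```r
library(ggplot2)

# Stat object: keep only the rows that lie on the convex hull of each group
StatHullSketch <- ggproto("StatHullSketch", Stat,
  required_aes = c("x", "y"),
  compute_group = function(data, scales) {
    data[chull(data$x, data$y), , drop = FALSE]
  }
)

# Layer function: wires the stat to a polygon geom by default
stat_hull_sketch <- function(mapping = NULL, data = NULL, geom = "polygon",
                             position = "identity", na.rm = FALSE,
                             show.legend = NA, inherit.aes = TRUE, ...) {
  layer(stat = StatHullSketch, data = data, mapping = mapping, geom = geom,
        position = position, show.legend = show.legend,
        inherit.aes = inherit.aes, params = list(na.rm = na.rm, ...))
}

# Simulated stand-in for dat (assumption)
set.seed(1)
dat <- data.frame(group1 = rnorm(100, mean = 100),
                  group2 = rnorm(100, mean = 150))

p_hull <- ggplot(dat, aes(x = group1, y = group2)) +
  geom_point() +
  stat_hull_sketch(alpha = 0.2)
```

A bag plot stat would follow the same pattern, with the hull computation replaced by the bag, loop and outlier calculations.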

  14. Remarks
      ● Useful but not popular
      ● Poorly understood
      ● Learn to use ggplot2 extensions

  15. Let’s practice!

  16. Case Study II: Weather (Part 1)

  17. Weather
      Source: http://www.edwardtufte.com/

  18. present
      > dim(present)
      [1] 153   5
      > head(present, n = 4)
        month day year temp new_day
      1     1   1 2016   41       1
      2     1   2 2016   37       2
      3     1   3 2016   40       3
      4     1   4 2016   33       4
      > tail(present, n = 4)
          month day year temp new_day
      148     5  28 2016   79     148
      149     5  29 2016   80     149
      150     5  30 2016   73     150
      151     5  31 2016   76     151

  19. Time series
      > ggplot(present, aes(x = new_day, y = temp)) +
          geom_line()
      [Figure: line plot of temp against new_day for the current year]

  20. past
      > str(past)
      'data.frame': 7645 obs. of 11 variables:
       $ month    : num 1 1 1 1 1 1 1 1 1 1 ...
       $ day      : num 1 2 3 4 5 6 7 8 9 10 ...
       $ year     : num 1995 1995 1995 1995 1995 ...
       $ temp     : num 44 41 28 31 21 27 42 35 34 29 ...
       $ new_day  : int 1 2 3 4 5 6 7 8 9 10 ...
       $ upper    : num 51 48 57 55 56 62 52 57 54 47 ...
       $ lower    : num 17 15 16 15 21 14 14 12 21 8.5 ...
       $ avg      : num 35.6 35.4 34.9 35.1 35.9 ...
       $ se       : num 2.19 1.83 2.46 2.53 1.92 ...
       $ avg_upper: num 40.2 39.2 40 40.5 39.9 ...
       $ avg_lower: num 31 31.5 29.7 29.8 31.9 ...
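
The slides show the summary columns of `past` but not how they are computed. The sketch below is one plausible way to derive them per `new_day`, using simulated temperatures (an assumption). The interval around the daily mean is taken as the mean plus or minus a t quantile times the standard error; that definition is an assumption, although it is consistent with the values printed above.

```r
# Simulated daily temperatures over several years (assumption: stand-in data)
set.seed(1)
years <- 1995:2015
past <- expand.grid(new_day = 1:365, year = years)
past$temp <- 50 - 25 * cos(2 * pi * past$new_day / 365) +
  rnorm(nrow(past), sd = 8)

# Per-day summaries across all years
by_day <- split(past$temp, past$new_day)
summ <- data.frame(
  new_day = as.integer(names(by_day)),
  upper   = sapply(by_day, max),    # record high for that day
  lower   = sapply(by_day, min),    # record low for that day
  avg     = sapply(by_day, mean),
  se      = sapply(by_day, function(x) sd(x) / sqrt(length(x)))
)

# 95% interval around the daily mean (assumed definition)
t_crit <- qt(0.975, df = length(years) - 1)
summ$avg_upper <- summ$avg + t_crit * summ$se
summ$avg_lower <- summ$avg - t_crit * summ$se

# Attach the per-day summaries to every observation
past <- merge(past, summ, by = "new_day")
```

Storing the summaries on every row is redundant, but it lets each ggplot2 layer read what it needs from a single data frame.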

  21. Each year separately
      > ggplot(past, aes(x = new_day, y = temp, group = year)) +
          geom_line(alpha = 0.2)
      [Figure: one semi-transparent line of temp against new_day per year]

  22. present + past
      > ggplot(past, aes(x = new_day, y = temp, group = year)) +
          geom_line(alpha = 0.4) +
          geom_line(data = present, aes(group = 1), col = "red")
      [Figure: past years as semi-transparent lines with the current year overlaid in red]

  23. present + past
      > ggplot(past, aes(x = new_day, y = temp, group = year)) +
          geom_line(alpha = 0.4) +
          geom_line(data = present, aes(group = 1), col = "red")
      [Figure: past years as semi-transparent lines with the current year overlaid in red]

  24. Linerange
      [Figure: vertical line ranges of temp against new_day]
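
No code accompanies this slide. A minimal sketch of the idea, assuming a hypothetical per-day summary with the same column names as `past` (`lower`, `upper`, `avg_lower`, `avg_upper`), draws one vertical segment per day for the full range and a second, narrower one for the interval around the mean. The data and colours are illustrative assumptions.

```r
library(ggplot2)

# Hypothetical per-day summary using the column names from `past`
new_day <- 1:365
avg <- 50 - 25 * cos(2 * pi * new_day / 365)
past_summary <- data.frame(
  new_day   = new_day,
  lower     = avg - 20, upper     = avg + 20,
  avg_lower = avg - 4,  avg_upper = avg + 4
)

p_range <- ggplot(past_summary, aes(x = new_day)) +
  # Full historical range, one vertical segment per day
  geom_linerange(aes(ymin = lower, ymax = upper), colour = "wheat2") +
  # Narrower interval around the daily mean, drawn on top
  geom_linerange(aes(ymin = avg_lower, ymax = avg_upper), colour = "wheat4") +
  labs(y = "temp")
```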

  25. Records
      [Figure: line ranges of temp against new_day with points marking record days]
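
Finding the record days amounts to comparing each day of the current year with the historical extremes for the same day. The sketch below uses small hypothetical data frames (the values are made up for illustration) and flags a day only when it strictly exceeds the past high or falls strictly below the past low.

```r
# Hypothetical stand-ins for the past summary and the current year
past_summary <- data.frame(new_day = 1:5,
                           lower   = c(10, 12, 11,  9, 13),
                           upper   = c(50, 52, 51, 49, 53))
present <- data.frame(new_day = 1:5,
                      temp    = c(30, 55,  8, 49, 20))

# Pair each current-year day with the historical extremes for that day
both <- merge(present, past_summary, by = "new_day")

# Label new records; days inside the historical range stay NA
both$record <- ifelse(both$temp > both$upper, "new record high",
               ifelse(both$temp < both$lower, "new record low", NA))

# Keep only the record days, ready to be added as a point layer
records <- both[!is.na(both$record), ]
```

A day that merely ties the past extreme is not counted as a new record under this rule.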

  26. Custom legend
      [Figure: temp against new_day with a custom legend; entries: New record high, past record high, 95% CI range, Current year, past record low, New record low]

  27. Let’s practice!

  28. Case Study II: Weather (Part 2)

  29. Up to now
      [Figure: temp against new_day with a custom legend; entries: New record high, past record high, 95% CI range, Current year, past record low, New record low]

  30. Situation
      ● Many data frames
      ● Plot summary data frame as a layer
      ● stat_summary()
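
As a reminder of how `stat_summary()` removes the need for a separate summary data frame, here is a minimal sketch. The data are simulated and the helper `day_range()` is a hypothetical name; it returns the `ymin` and `ymax` columns that a linerange geom expects.

```r
library(ggplot2)

# Simulated temperatures for 30 days over 10 years (assumption)
set.seed(1)
my_data <- expand.grid(new_day = 1:30, year = 2000:2009)
my_data$temp <- 40 + my_data$new_day / 3 + rnorm(nrow(my_data), sd = 5)

# Hypothetical summary function: range of the values at one x position
day_range <- function(y) {
  data.frame(ymin = min(y), ymax = max(y))
}

# The summary is computed inside the layer, once per value of new_day
p_summary <- ggplot(my_data, aes(x = new_day, y = temp)) +
  stat_summary(fun.data = day_range, geom = "linerange")
```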

  31. stat_historical()
      > ggplot(my_data, aes(x = new_day, y = temp, fill = year)) +
          stat_historical()
      [Figure: historical ranges of temp against new_day]
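
`stat_historical()` is a custom stat whose definition is not shown on the slides. The sketch below is not that implementation; it is a simplified, hypothetically named `stat_historical_sketch()` that only computes the per-day range and draws it as line ranges. It assumes no discrete grouping aesthetic is mapped, so all years fall into one group.

```r
library(ggplot2)

StatHistoricalSketch <- ggproto("StatHistoricalSketch", Stat,
  required_aes = c("x", "y"),
  # y is replaced by ymin/ymax, so declare it as intentionally dropped
  dropped_aes = "y",
  compute_group = function(data, scales) {
    xs <- sort(unique(data$x))
    data.frame(
      x    = xs,
      ymin = vapply(xs, function(v) min(data$y[data$x == v]), numeric(1)),
      ymax = vapply(xs, function(v) max(data$y[data$x == v]), numeric(1))
    )
  }
)

stat_historical_sketch <- function(mapping = NULL, data = NULL,
                                   geom = "linerange", position = "identity",
                                   na.rm = FALSE, show.legend = NA,
                                   inherit.aes = TRUE, ...) {
  layer(stat = StatHistoricalSketch, data = data, mapping = mapping,
        geom = geom, position = position, show.legend = show.legend,
        inherit.aes = inherit.aes, params = list(na.rm = na.rm, ...))
}

# Simulated temperatures for 30 days over 10 years (assumption)
set.seed(1)
my_data <- expand.grid(new_day = 1:30, year = 2000:2009)
my_data$temp <- 40 + my_data$new_day / 3 + rnorm(nrow(my_data), sd = 5)

p_hist <- ggplot(my_data, aes(x = new_day, y = temp)) +
  stat_historical_sketch()
```

Packaging the summary as a stat means the same call works on any data frame with the expected columns, which is what makes the later faceted version possible.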

  32. stat_present()
      > ggplot(my_data, aes(x = new_day, y = temp, fill = year)) +
          stat_historical() +
          stat_present()
      [Figure: historical ranges of temp against new_day with the current year overlaid]

  33. stat_extremes()
      > ggplot(my_data, aes(new_day, temp, fill = year)) +
          stat_historical() +
          stat_present() +
          stat_extremes(aes(colour = ..record..))
      [Figure: historical ranges, current year and points marking record days]

  34. Specific layers
      > ggplot(my_data, aes(new_day, temp, fill = year)) +
          stat_historical() +
          # stat_present() +
          stat_extremes(aes(colour = ..record..))
      [Figure: historical ranges and record points, without the current-year line]

  35. Faceting
      [Figure: the weather plot of temp against new_day in four facets: PARIS, REYKJAVIK, NEW YORK, LONDON]

  36. Let’s practice!

  37. Wrap-up

  38. Statistics                Design
      Graphical Data Analysis   Communication & Perception

  39. Explore                   Explain
      Confirm and Analyse       Inform and Persuade

  40. Element      Description
      Data         The dataset being plotted.
      Aesthetics   The scales onto which we map our data.
      Geometries   The visual elements used for our data.
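
The three elements in the table map directly onto the parts of a ggplot2 call. A minimal sketch with simulated stand-in data (an assumption):

```r
library(ggplot2)

# Data: the dataset being plotted (simulated here as a stand-in)
set.seed(1)
dat <- data.frame(group1 = rnorm(100, mean = 100),
                  group2 = rnorm(100, mean = 150))

p <- ggplot(data = dat,                     # Data
            aes(x = group1, y = group2)) +  # Aesthetics: variables mapped to scales
  geom_point()                              # Geometries: the visual elements
```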
