
CSSS 569 Visualizing Data and Models Lab 4: Advanced ggplot2

Kai Ping (Brian) Leung, Department of Political Science, UW. January 30, 2020.


  1. CSSS 569 Visualizing Data and Models Lab 4: Advanced ggplot2 Kai Ping (Brian) Leung Department of Political Science, UW January 30, 2020

  2. Introduction
      ◮ Recap of what we’ve covered last week
      ◮ Making a scatterplot from scratch in ggplot2 (from Chris’s slides)
         1. Decide on dimensions: aspect ratio, axis limits
         2. Add axis labels, plot titles
         3. Choose data markers: points, symbols, text
         4. Scaling & transformation, add ticks if needed
         5. Choose a color palette
         6. Add annotations: labels, arrows, notes
         7. Add best-fit line(s) & confidence intervals
         8. Add extra plots (e.g., rugs) to make a confection
         9. Repeat as small multiples (facet_grid and facet_wrap)
      ◮ Next week we’ll implement them using tile
      ◮ Unpack the inner workings of ggplot2
         ◮ data, aes(...), geom(..., inherit.aes = TRUE)
      ◮ Customized theme: theme_cavis.R
      ◮ Exercise to reproduce a graph
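The data / aes(...) / geom(..., inherit.aes = TRUE) recap can be illustrated with a minimal sketch. It uses the built-in mtcars data rather than the lab data, and the object names p and p_pooled are made up for this illustration.

```r
library(ggplot2)

# Plot-level aesthetics are inherited by every layer by default
# (inherit.aes = TRUE), so the smoother draws one fit line per cyl group.
p <- ggplot(mtcars, aes(x = wt, y = mpg, color = factor(cyl))) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE)

# With inherit.aes = FALSE the layer ignores the plot-level mapping and
# needs its own aes(); without the color grouping it draws one pooled line.
p_pooled <- ggplot(mtcars, aes(x = wt, y = mpg, color = factor(cyl))) +
  geom_point() +
  geom_smooth(aes(x = wt, y = mpg),
              method = "lm", formula = y ~ x, se = FALSE,
              color = "black", inherit.aes = FALSE)
```

Calling print(p) and print(p_pooled) side by side shows the difference between the grouped and the pooled fit.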

  3. Roadmap for today
      Today’s lab is structured around three exercises:
      [Figure: predicted probability of voting, in panels for Clinton, Perot, and Bush; x-axis: ideological self-placement (from very liberal to very conservative); lines for White and Non-white]

  4. Roadmap for today
      Today’s lab is structured around three exercises:
      [Figure: ropeladder plot of the first difference in predicted probabilities of winning the Cy Young award; rows: era, walks, strikeout, innings, winpct; x-axis from −75% to 50%; markers for Model 1 and Model 2]

  5. Roadmap for today
      Today’s lab is structured around three exercises:
      [Figure: heatmap titled "Incidence of Measles in the US"; rows: states from Alabama to Wyoming; x-axis: years 1930 to 2000; fill legend "Cases per 100,000 people" with bins 0, 0−1, 1−10, 10−100, 100−500, 500−1000, >1000, and NA]
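A heatmap with a binned fill legend like the one described above could be sketched roughly as follows. This is not the lab's own code: the toy data frame and its values are made up, and only the bin labels follow the legend in the figure.

```r
library(ggplot2)

# Hypothetical toy data: incidence per 100,000 for three states and four years.
toy <- expand.grid(year = 1960:1963,
                   state = c("Alabama", "Alaska", "Arizona"),
                   stringsAsFactors = FALSE)
toy$incidence <- c(0, 0.5, 5, 50, 150, 700, 1200, NA, 3, 30, 300, 0)

# Bin the continuous incidence into ordered categories.
# Intervals are closed on the right, so exactly 0 gets its own bin.
toy$bin <- cut(toy$incidence,
               breaks = c(-Inf, 0, 1, 10, 100, 500, 1000, Inf),
               labels = c("0", "0-1", "1-10", "10-100",
                          "100-500", "500-1000", ">1000"))

# One tile per state-year; a sequential palette for the bins, grey for NA.
p_heat <- ggplot(toy, aes(x = year, y = state, fill = bin)) +
  geom_tile(color = "white") +
  scale_fill_brewer(palette = "YlOrRd", na.value = "grey80", drop = FALSE,
                    name = "Cases per\n100,000 people")
```

Binning before plotting keeps the legend discrete; drop = FALSE keeps bins in the legend even when no tile falls into them.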

  15. Roadmap for today
      1. Last exercise: 1992 Presidential Election
         ◮ Use of scale_{...}
         ◮ Use of facet_grid and facet-specific labels
      2. Ropeladder exercise: Cy Young award
         ◮ pivot_longer and pivot_wider
         ◮ Sorting using fct_reorder
         ◮ Use of scale_{...}
      3. Heatmap exercise: Measles in US
         ◮ Use of geom_tile and various ways to scale_color/fill_{...}
      4. Highlight ggplot2 extension packages (See more here)
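As a preview of the reshaping and sorting steps in the ropeladder exercise, here is a minimal sketch. The covariate names come from the roadmap figure, but the numbers and the object names (fd_wide, fd_long, fd_back, p_rope) are hypothetical and do not come from the lab data.

```r
library(tidyverse)

# Hypothetical first differences, one row per covariate (wide format).
fd_wide <- tibble(
  covariate = c("era", "walks", "strikeout", "innings", "winpct"),
  model1 = c(-0.60, -0.10, 0.15, 0.20, 0.35),
  model2 = c(-0.55, -0.05, 0.10, 0.25, 0.30)
)

fd_long <- fd_wide %>%
  # Stack the model columns so that model can be mapped to an aesthetic.
  pivot_longer(cols = c(model1, model2),
               names_to = "model", values_to = "pe") %>%
  # Order covariate levels by the median estimate; the level order
  # determines the order of the rows on the axis.
  mutate(covariate = fct_reorder(covariate, pe))

# pivot_wider reverses the reshaping.
fd_back <- fd_long %>%
  pivot_wider(names_from = model, values_from = pe)

p_rope <- ggplot(fd_long, aes(x = pe, y = covariate, color = model)) +
  geom_point() +
  scale_x_continuous(labels = scales::percent)
```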

  16. Last exercise: 1992 Presidential Election
      [Figure: predicted probability of voting, in panels for Clinton, Perot, and Bush; x-axis: ideological self-placement (from very liberal to very conservative); lines for White and Non-white]

  18. Last exercise: motivation
      ◮ There are many ways to do small multiples:
         ◮ plot + facet_grid(nonwhite ~ vote92)
      [Figure: 2 x 3 grid of panels; rows: Non-white, White; columns: Clinton, Perot, Bush; y-axis: Predicted prob. of voting; x-axis: Ideological self-placement (from very liberal to very conservative)]
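The two layouts being compared can be sketched side by side. The toy data frame below only borrows the column names of presVoteEV; its values are made up, and the object names are hypothetical.

```r
library(ggplot2)

# Hypothetical stand-in for presVoteEV: same column names, made-up values.
toy <- expand.grid(rlibcon = 1:7,
                   nonwhite = factor(c(0, 1)),
                   vote92 = factor(c("Clinton", "Perot", "Bush"),
                                   levels = c("Clinton", "Perot", "Bush")))
toy$pe <- seq(0.1, 0.9, length.out = nrow(toy))

base <- ggplot(toy, aes(x = rlibcon, y = pe)) + geom_line()

# Option 1: one panel per race-by-candidate combination (2 rows x 3 columns).
p_grid <- base + facet_grid(nonwhite ~ vote92)

# Option 2: one panel per candidate, with race overlaid as color in each panel.
p_overlay <- base + aes(color = nonwhite) + facet_grid(~ vote92)
```

Option 2 puts both groups on the same axes within each panel, which makes overlap between them directly visible.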

  20. Last exercise: motivation
      ◮ Thoughtful juxtaposition facilitates meaningful comparison and provokes further inquiry
      ◮ Sometimes, data overlapping might be the interesting phenomenon...
      [Figure: predicted probability of voting, in panels for Clinton, Perot, and Bush; x-axis: ideological self-placement (from very liberal to very conservative); lines for White and Non-white]

  21. Last exercise: 1992 Presidential Election
      # Prerequisite
      # Load package
      library(tidyverse)
      library(RColorBrewer)
      # Load data
      presVoteEV <- read_csv("data/presVoteEV.csv")
      # Load theme
      source("theme/theme_cavis.R")
      # Get nice color
      brewer <- brewer.pal(9, "Set1")
      blue <- brewer[2]
      orange <- brewer[5]

  22. Last exercise: 1992 Presidential Election
      # Factorize variables
      presVoteEV <- presVoteEV %>%
        mutate(
          nonwhite = factor(nonwhite),
          vote92 = factor(vote92, levels = c("Clinton", "Perot", "Bush"))
        )

  23. Last exercise: 1992 Presidential Election
      p <- ggplot(presVoteEV,
                  aes(x = rlibcon, y = pe, ymin = lower, ymax = upper,
                      color = nonwhite, fill = nonwhite)) +
        facet_grid(~ vote92) +
        geom_line() +
        theme_cavis_hgrid
      print(p)
      [Figure: panels Clinton, Perot, Bush; x-axis: rlibcon; y-axis: pe; legend levels 0 and 1]

  24. Last exercise: 1992 Presidential Election
      p <- p +
        scale_color_manual(values = c(blue, orange),
                           labels = c("White", "Non-white"))
      print(p)
      [Figure: panels Clinton, Perot, Bush; x-axis: rlibcon; y-axis: pe; legend: White, Non-white]

  25. Last exercise: 1992 Presidential Election
      p +
        geom_ribbon(alpha = 0.5, show.legend = FALSE) +
        scale_fill_manual(values = c(blue, orange))
      [Figure: panels Clinton, Perot, Bush; x-axis: rlibcon; y-axis: pe; legend: White, Non-white]

  26. Last exercise: 1992 Presidential Election
      p +
        geom_ribbon(alpha = 0.5, linetype = 0, show.legend = FALSE) +
        scale_fill_manual(values = c(blue, orange))
      [Figure: panels Clinton, Perot, Bush; x-axis: rlibcon; y-axis: pe; legend: White, Non-white]

  27. Last exercise: 1992 Presidential Election
      p <- p +
        geom_ribbon(alpha = 0.5, linetype = 0, show.legend = FALSE) +
        scale_fill_manual(values = c(blue, NA))
      print(p)
      [Figure: panels Clinton, Perot, Bush; x-axis: rlibcon; y-axis: pe; legend: White, Non-white]

  28. Last exercise: 1992 Presidential Election
      p +
        geom_line(aes(y = upper)) +
        geom_line(aes(y = lower))
      [Figure: panels Clinton, Perot, Bush; x-axis: rlibcon; y-axis: pe; legend: White, Non-white]

  29. Last exercise: 1992 Presidential Election
      p <- p +
        geom_line(aes(y = upper, linetype = nonwhite), show.legend = FALSE) +
        geom_line(aes(y = lower, linetype = nonwhite), show.legend = FALSE) +
        scale_linetype_manual(values = c(0, 2))  # 0 = blank; 2 = dashed
      print(p)
      [Figure: panels Clinton, Perot, Bush; x-axis: rlibcon; y-axis: pe; legend: White, Non-white]

  30. Last exercise: 1992 Presidential Election
      p <- p +
        scale_x_continuous(breaks = 1:7) +
        scale_y_continuous(breaks = seq(0, 1, 0.2),
                           limits = c(0, 1),
                           expand = c(0, 0))
      print(p)
      [Figure: panels Clinton, Perot, Bush; x-axis: rlibcon with ticks 1 to 7; y-axis: pe with ticks 0.0 to 1.0 in steps of 0.2; legend: White, Non-white]

  31. Last exercise: 1992 Presidential Election
      p <- p +
        theme(legend.position = c(0.06, 0.13),
              legend.key.size = unit(0.2, "cm")) +
        labs(y = "Predicted prob. of voting",
             x = "Ideological self-placement")
      print(p)
      [Figure: panels Clinton, Perot, Bush; x-axis: Ideological self-placement; y-axis: Predicted prob. of voting; legend (White, Non-white) placed inside the lower left of the first panel]
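A note for readers on newer ggplot2 releases: since version 3.5.0, passing a numeric vector to legend.position is deprecated in favor of legend.position = "inside" together with legend.position.inside. The sketch below is one way to write the legend placement so it works on either side of that change. It uses mtcars as stand-in data, and the names p_demo and legend_theme are made up for this illustration.

```r
library(ggplot2)

p_demo <- ggplot(mtcars, aes(x = wt, y = mpg, color = factor(am))) +
  geom_point()

# Pick the legend-placement syntax that matches the installed ggplot2 version.
if (packageVersion("ggplot2") >= "3.5.0") {
  legend_theme <- theme(legend.position = "inside",
                        legend.position.inside = c(0.06, 0.13))
} else {
  legend_theme <- theme(legend.position = c(0.06, 0.13))
}

p_demo <- p_demo +
  legend_theme +
  theme(legend.key.size = unit(0.2, "cm"))
```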
