INTRODUCTION TO DATA ANALYSIS: DATA VISUALIZATION - PowerPoint PPT Presentation


  1. INTRODUCTION TO DATA ANALYSIS DATA VISUALIZATION

  2. LEARNING GOALS
     ▸ obtain a basic understanding of better/worse plotting
     ▸ understand the idea of hypothesis-driven visualization
     ▸ develop a basic understanding of the 'grammar of graphs'
     ▸ get familiar with frequent visualization strategies
       ▸ barplots, densities, violins, error bars etc.
     ▸ be able to fine-tune graphs for better visualization

  3. Motivation

  4. WHY VISUALIZE?
     ▸ a picture can be worth a million words (and numbers)
     ▸ every data analysis should start with a ‘getting to know the data’ phase
     ▸ visualizing different aspects of the data is key to getting intimate with the data
     ▸ data visualization as a means of communication (with others)
     ▸ hypothesis-driven visualization: obtain visual (suggestive) evidence regarding a research question of relevance

  5. WHY VISUALIZE?
     ▸ a picture can be worth a million words (and numbers)
     ▸ summary statistics can be misleading (because of information loss)
     ▸ every data analysis should start with a ‘getting to know the data’ phase
     ▸ use extensive visualization to get intimate with the data
     ▸ data visualization as a means of communication (with others / with yourself)
     ▸ hypothesis-driven visualization: obtain visual (suggestive) evidence regarding a research question of relevance

  6. BEYOND SUMMARY STATISTICS

  7. MOTIVATING EXAMPLE :: ANSCOMBE’S QUARTET
     ▸ famous data set, ships with core R
     (code callouts: messy start; tidy up; nice!)

  8. MOTIVATING EXAMPLE :: ANSCOMBE’S QUARTET
     (code callouts: input data; summarise; all four groups look very similar!)

  9. MOTIVATING EXAMPLE :: ANSCOMBE’S QUARTET
     ▸ quite different patterns despite similar correlation
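The point of the Anscombe example can be checked numerically. The slides work in R, where the data set is built in; the sketch below is a plain-Python stand-in (standard library only) that types in the published quartet values and computes the same kind of per-group summary. The helper names `pearson` and `summarise` are illustrative, not taken from the slides.

```python
from math import sqrt
from statistics import mean

# Anscombe's quartet; sets I-III share the same x values.
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
quartet = {
    "I": (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def summarise(xs, ys):
    """Per-group summary statistics (hypothetical helper for this sketch)."""
    return {"mean_x": mean(xs), "mean_y": mean(ys), "cor": pearson(xs, ys)}


summaries = {name: summarise(xs, ys) for name, (xs, ys) in quartet.items()}

for name, s in summaries.items():
    print(f"{name:>3}: mean_x={s['mean_x']:.2f} mean_y={s['mean_y']:.2f} cor={s['cor']:.3f}")
```

All four groups come out with nearly identical means and correlations, which is why only a plot reveals how different the patterns are.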

  10. The good, the bad and the info-graphic

  11. PRINCIPLES OF GOOD VISUALIZATION
      ▸ maximize data-ink ratio (Tufte 1983)
      ▸ maximize information, minimize ink
      ▸ contra chart junk
      ▸ ink vs. processing effort
      ▸ analogy to language
      ▸ information flow
      ▸ ease of processing
      ▸ bound by conventional rules
      ▸ hypothesis-driven visualization
      ▸ relevance of information

  12. EXAMPLE OF UNINFORMATIVE PLOTTING

  13. EXAMPLE OF INFORMATIVE HYPOTHESIS-DRIVEN PLOTTING

  14. EXAMPLE OF UNINFORMATIVE PLOTTING

  15. EXAMPLE OF (STILL) UNINFORMATIVE PLOTTING

  16. EXAMPLE OF INFORMATIVE HYPOTHESIS-DRIVEN PLOTTING

  17. INFOGRAPHICS
      ▸ ≠ hypothesis-driven visualization
      ▸ purposes:
        ▸ memorability
        ▸ eye-catchiness
        ▸ persuasion
        ▸ …

  18. Basics of ggplot

  19. BASICS OF GGPLOT
      ▸ “grammar of layered graphs”
      ▸ incremental composition
      ▸ layers
      ▸ system of rich convenience functions & defaults
      ▸ grouping
      ▸ multiple ways of customization

  20. INCREMENTAL COMPOSITION
      (code callouts: create a plot; display the plot; output 😊)

  21. INCREMENTAL COMPOSITION
      (code callout: output)

  22. INCREMENTAL COMPOSITION
      ▸ piping data into the 1st argument slot
      ▸ declaring the mapping globally for all subsequent calls to `geom_` functions
      (code callout: output)

  23. FULL EXAMPLE

  24. FULL EXAMPLE
      (plot callouts: title; subtitle; legend for group distinction; y-axis label; grid lines; data points; y-axis tick labels; linear regression lines)

  25. FULL EXAMPLE :: CODE

  26. LAYERED GRAMMAR OF GRAPHS
      (code callout: output equivalent)
      ▸ `geom_` functions are wrappers
      ▸ default statistical transformation, position, axis type etc.
      ▸ defaults can be overridden

  27. Layers

  28. LAYERS

  29. LAYER ORDER

  30. OPACITY

  31. DIFFERENT DATA FOR DIFFERENT LAYERS

  32. Grouping

  33. GROUPING
      ▸ group information for uniform display in terms of color, shape, etc.

  34. GLOBAL GROUPING
      ▸ global grouping applies to all subsequent layers

  35. OVERWRITING GROUPING INFORMATION
      ▸ overwriting grouping information locally

  36. DIFFERENT GROUPING IN DIFFERENT LAYERS
      ▸ each layer has its own grouping information

  37. Geoms & plot types

  38. SCATTER PLOTS

  39. CURVE AND LINE FITS

  40. LINE PLOTS

  41. BAR PLOTS

  42. BAR PLOTS

  43. BAR PLOTS CAN BE UNDERINFORMATIVE
      ▸ suboptimal data-ink ratio
      ▸ lacks distributional information

  44. BAR PLOTS CAN BE OKAY
      ▸ choice proportions
      ▸ with 95% bootstrapped CIs
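The slide does not say how the bootstrapped intervals were computed. One common approach is the percentile bootstrap, sketched here in plain Python rather than the R used in the slides. The function name `bootstrap_ci`, the made-up binary choice data, the number of resamples and the seed are all assumptions for illustration.

```python
import random


def bootstrap_ci(outcomes, n_resamples=2000, level=0.95, seed=1):
    """Percentile-bootstrap confidence interval for a proportion of 0/1 outcomes."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(outcomes)
    # Resample with replacement and record the proportion in each resample.
    props = sorted(
        sum(rng.choices(outcomes, k=n)) / n for _ in range(n_resamples)
    )
    alpha = (1 - level) / 2
    lower = props[int(alpha * n_resamples)]
    upper = props[int((1 - alpha) * n_resamples) - 1]
    return lower, upper


# Made-up data: 30 of 50 participants chose option A (coded as 1).
outcomes = [1] * 30 + [0] * 20
proportion = sum(outcomes) / len(outcomes)
lower, upper = bootstrap_ci(outcomes)
print(f"proportion = {proportion:.2f}, 95% CI = [{lower:.2f}, {upper:.2f}]")
```

The bar would show `proportion`, and the error bar would span `lower` to `upper`.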

  45. HISTOGRAMS
      ▸ fix bins
      ▸ count number of data points in each bin
      ▸ plot as bar
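The three steps above can be sketched without any plotting library. This is a plain-Python illustration, not the ggplot code from the slides; `histogram_counts`, the equal-width binning and the sample values are assumptions made for the example.

```python
def histogram_counts(values, lower, upper, n_bins):
    """Count how many values fall into each of n_bins equal-width bins on [lower, upper]."""
    width = (upper - lower) / n_bins
    counts = [0] * n_bins
    for v in values:
        if not lower <= v <= upper:
            continue  # values outside the binning range are ignored
        # Bins are left-closed; the right edge of the range goes into the last bin.
        index = min(int((v - lower) / width), n_bins - 1)
        counts[index] += 1
    return counts


values = [1.2, 1.9, 2.4, 2.5, 3.1, 3.3, 3.8, 4.0, 4.6, 5.0]  # made-up data
counts = histogram_counts(values, lower=1.0, upper=5.0, n_bins=4)

# "Plot as bar": one text bar per bin.
for i, count in enumerate(counts):
    left = 1.0 + i * 1.0
    print(f"[{left:.1f}, {left + 1.0:.1f}) {'#' * count}")
```

Changing `n_bins` changes the picture considerably, so bin width is a choice worth trying out.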

  46. BOX PLOTS
      ▸ visualize common summary statistics
        ▸ median
        ▸ 25% & 75% quantile
        ▸ …
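The statistics behind the box itself can be computed directly. The sketch below uses Python's standard `statistics` module instead of R. The `inclusive` interpolation method is an assumption, since plotting packages differ in how they define quartiles, and the data are made up.

```python
from statistics import median, quantiles

data = [7, 1, 9, 3, 5, 2, 8, 4, 6]  # made-up data

# quantiles(..., n=4) returns the three cut points that split the data into quarters.
q1, q2, q3 = quantiles(data, n=4, method="inclusive")
iqr = q3 - q1  # interquartile range: the height of the box

print(f"25% quantile: {q1}")
print(f"median:       {median(data)}")
print(f"75% quantile: {q3}")
print(f"IQR:          {iqr}")
```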

  47. DENSITY PLOTS
      ▸ “generalized histogram”
      ▸ uses kernel density estimation to obtain a smoothed curve
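Kernel density estimation places a small bump (the kernel) on every data point and averages the bumps. A minimal plain-Python sketch, assuming a Gaussian kernel and a hand-picked bandwidth of 0.5; the slides do not state which kernel or bandwidth their plots use, and `gaussian_kde` is a hypothetical helper name.

```python
from math import exp, pi, sqrt


def gaussian_kde(data, bandwidth):
    """Return a function that evaluates a Gaussian kernel density estimate."""
    norm = 1.0 / (len(data) * bandwidth * sqrt(2 * pi))

    def density(x):
        # Sum of Gaussian bumps, one centred on each data point.
        return norm * sum(exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in data)

    return density


data = [1.2, 1.9, 2.4, 2.5, 3.1, 3.3, 3.8, 4.0, 4.6, 5.0]  # made-up data
density = gaussian_kde(data, bandwidth=0.5)

# Evaluate on a grid; the area under the curve should be close to 1.
step = 0.01
grid = [-3.0 + i * step for i in range(1201)]
area = sum(density(x) for x in grid) * step

print(f"density at 3.0: {density(3.0):.3f}")
print(f"area under curve: {area:.3f}")
```

A larger bandwidth gives a smoother curve and a smaller one follows the data more closely, much as bin width does for a histogram.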

  48. VIOLIN PLOTS
      ▸ “mirrored density plots”
      ▸ good for multi-group comparisons

  49. RUG PLOTS
      ▸ show data points near axis

  50. RUG PLOTS
      ▸ show data points near axis

  51. ANNOTATION

  52. ANNOTATION

  53. Faceting

  54. FACET GRID

  55. FACET WRAP

  56. Bells & whistles

  57. READY-MADE THEMES

  58. TWEAKING AN EXISTING THEME
