
Graphics in R STAT 133 Gaston Sanchez Department of Statistics



  1. Graphics in R STAT 133 Gaston Sanchez Department of Statistics, UC–Berkeley gastonsanchez.com github.com/gastonstat/stat133 Course web: gastonsanchez.com/stat133

  2. R Graphics 2

  3. Understanding Graphics in R 2 main graphics systems "graphics" & "grid" 3

  4. Basics of Graphics in R Graphics Systems ◮ "graphics" and "grid" are the two main graphics systems in R ◮ "graphics" is the traditional system, also referred to as base graphics ◮ "grid" provides low-level functions for programming plotting functions

  5. Basics of Graphics in R Graphics Engine ◮ Underneath "graphics" and "grid" there is the package "grDevices" ◮ "grDevices" is the graphics engine in R ◮ It provides the graphics devices and support for colors and fonts 5
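As a rough illustration of what "grDevices" contributes, the sketch below opens a file-based device, draws into it, and closes it. The file name `squares.pdf` and the chosen color are assumptions made for this example, not part of the slides.

```r
# Sketch: open a graphics device, draw on it, then close it.
# pdf(), dev.off(), rgb() and colors() all come from grDevices.
out_file <- file.path(tempdir(), "squares.pdf")  # hypothetical output file

pdf(file = out_file, width = 5, height = 5)  # open a PDF device
point_col <- rgb(0.2, 0.4, 0.8)              # build a color from RGB components
plot((1:10)^2, col = point_col, pch = 19)    # drawing goes to the open device
dev.off()                                    # closing the device writes the file

file.exists(out_file)
head(colors())  # some of the color names the engine knows about
```

Any high-level or low-level plotting call issued between `pdf()` and `dev.off()` is rendered into that file rather than on screen.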

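  6. [Diagram: the "grDevices" engine at the center; the "graphics" and "grid" systems built on it; "maps", "diagram" and "plotrix" built on "graphics"; "lattice" and "ggplot2" built on "grid"; "JavaGD", "Cairo" and "tikzDevice" as graphics device packages]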

  7. Basics of Graphics in R Package "graphics" The package "graphics" is the traditional system; it provides functions for complete plots, as well as low-level facilities. Many other graphics packages, such as "maps", "diagram", and "pixmap", are built on top of "graphics".

  8. Understanding Graphics in R Package "grid" The "grid" package does not provide functions for drawing complete plots. "grid" is not used directly to produce statistical plots. Instead, it is used to build other graphics packages like "lattice" or "ggplot2".
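To give a feel for the low-level building blocks that "grid" offers, here is a minimal sketch that draws a few primitives inside a viewport. The output file name and the shapes are assumptions chosen for illustration; nothing here is a statistical plot.

```r
library(grid)

grid_file <- file.path(tempdir(), "grid_demo.pdf")  # hypothetical output file
pdf(grid_file)

grid.newpage()
# A viewport is a rectangular region of the page to draw in
pushViewport(viewport(width = 0.8, height = 0.8))
grid.rect(gp = gpar(col = "gray40"))                   # border of the viewport
grid.circle(x = 0.5, y = 0.5, r = 0.2,
            gp = gpar(fill = "lightblue"))             # a filled circle
grid.text("drawn with grid", y = 0.9)                  # a text label
popViewport()

dev.off()
```

Packages such as "lattice" and "ggplot2" assemble complete plots out of primitives like these.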

  9. In this course ◮ In this course we’ll focus on the packages "graphics" and "ggplot2" ◮ "graphics" is the traditional plotting system in R, and many functions and packages are built on top of it. ◮ "ggplot2" excels at visualizing multivariate data sets (in data.frame format), while taking care of many details that make for superior visual displays.

  10. R Graphics by Paul Murrell 10

  11. Some Resources ◮ R Graphics by Paul Murrell book and webpage ◮ R Graphics Cookbook by Winston Chang http://www.cookbook-r.com/Graphs/ ◮ ggplot2: Elegant Graphics for Data Analysis by Hadley Wickham ◮ R Graphs Cookbook by Hrishi Mittal ◮ Graphics for Statistics and Data Analysis with R by Kevin Keen 11

  12. Traditional (Base) Graphics 12

  13. Base Graphics in R Types of graphics functions Graphics functions can be divided into two main types: ◮ high-level functions produce complete plots, e.g. barplot(), boxplot(), dotchart() ◮ low-level functions add further output to an existing plot, e.g. text(), points(), legend() 13
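The split between the two kinds of functions can be sketched as follows: one high-level call sets up the plot, and low-level calls then add to it. The data and labels are made up for the example.

```r
x <- 1:10
y <- x^2

# High-level call: creates the plot region, axes and labels
# (type = "n" sets everything up without drawing the points)
plot(x, y, type = "n", main = "High-level call, then low-level additions")

# Low-level calls: add output to the existing plot
points(x, y, pch = 19, col = "steelblue")
text(x = 2, y = 90, labels = "y = x^2", pos = 4)
legend("bottomright", legend = "squares", pch = 19, col = "steelblue")
```

Calling a low-level function such as `points()` before any high-level call fails, because there is no plot to add to.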

  14. The plot() function ◮ plot() is the most important high-level function in traditional graphics ◮ The first argument to plot() provides the data to plot ◮ The provided data can take different forms: e.g. vectors, factors, matrices, data frames. ◮ To be more precise, plot() is a generic function ◮ You can create your own plot() method function 14
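Because plot() is a generic function, a method for a new class can be written as plot.<class>. The sketch below assumes a hypothetical S3 class called "temps", invented only for this example.

```r
# "temps" is a hypothetical class made up for this illustration
temps <- structure(
  list(day = 1:7, celsius = c(12, 14, 13, 17, 19, 18, 15)),
  class = "temps"
)

# plot() method for objects of class "temps"
plot.temps <- function(x, ...) {
  plot(x$day, x$celsius, type = "b",
       xlab = "Day", ylab = "Temperature (C)", ...)
  invisible(x)  # return the object invisibly
}

# plot() dispatches on the class of its first argument
plot(temps, main = "Dispatch to plot.temps()")
```

Running `methods(plot)` lists the plot() methods that are currently available in the session.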

  15. Basic Plots with plot() In its basic form, we can use plot() to make graphics of: ◮ one single variable ◮ two variables ◮ multiple variables 15

  16. Plots of One Variable 16

  17. High-level graphics of a single variable
      Function   Data        Description
      plot()     numeric     scatterplot
      plot()     factor      barplot
      plot()     1-D table   barplot
      numeric can be either a vector or a 1-D array (e.g. row or column from a matrix)

  18. One variable objects: vector or factor; 1-D table; row or column of a matrix; row or column of a data.frame

  19. plot() of one variable
      # plot numeric vector
      num_vec <- (c(1:10))^2
      plot(num_vec)
      # plot factor
      set.seed(4)
      abc <- factor(sample(c('A', 'B', 'C'), 20, replace = TRUE))
      plot(abc)
      # plot 1D-table
      abc_table <- table(abc)
      plot(abc_table)

  20. plot() of one variable [Figure: three panels. Scatterplot of num_vec against Index; barplot of abc with levels A, B, C; plot of abc_table with levels A, B, C]

  21. More high-level graphics of a single variable
      Function       Data      Description
      barplot()      numeric   barplot
      pie()          numeric   pie chart
      dotchart()     numeric   dotplot
      boxplot()      numeric   boxplot
      hist()         numeric   histogram
      stripchart()   numeric   1-D scatterplot
      stem()         numeric   stem-and-leaf plot

  22. Plots of one variable
      # barplot numeric vector
      barplot(num_vec)
      # pie chart
      pie(1:3)
      # dot plot
      dotchart(num_vec)

  23. Plots of one variable [Figure: three panels. Barplot of num_vec; pie chart with slices 1, 2, 3; dot chart of num_vec]

  24. Plots of one variable
      # boxplot
      boxplot(num_vec)
      # histogram
      hist(num_vec)
      # strip chart
      stripchart(num_vec)
      # stem-and-leaf
      stem(num_vec)

  25. boxplot()
      # boxplot
      boxplot(iris$Sepal.Length)
      [Figure: boxplot of iris$Sepal.Length; y-axis from 4.5 to 8.0]
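As a rough check of what the box and whiskers represent, the sketch below compares the statistics that boxplot() computes with the output of fivenum(). The two agree for this variable because no points fall beyond the whiskers; with outliers present, the whisker ends would differ from the minimum and maximum.

```r
x <- iris$Sepal.Length

# Tukey's five-number summary: min, lower hinge, median, upper hinge, max
five <- fivenum(x)

# boxplot() can return its statistics without drawing anything
bp <- boxplot(x, plot = FALSE)

five
bp$stats   # whisker ends, hinges and median used for the drawing
bp$out     # points drawn individually as outliers, if any
```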

  26. hist()
      # histogram
      hist(iris$Sepal.Length)
      [Figure: "Histogram of iris$Sepal.Length"; x-axis iris$Sepal.Length from 4 to 8; y-axis Frequency from 0 to 30]
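hist() also returns an object describing the bins, which can be inspected without drawing. The sketch below is only an illustration of that return value; the variable name `h` is an arbitrary choice.

```r
# Compute the histogram without plotting it
h <- hist(iris$Sepal.Length, plot = FALSE)

h$breaks   # bin boundaries
h$counts   # number of observations in each bin
h$density  # bar heights on the density scale

# On the density scale, bar areas (height * width) add up to 1
sum(h$density * diff(h$breaks))
```

Passing `freq = FALSE` to hist() draws the bars on this density scale instead of as counts.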

  27. Test your knowledge Which option does not apply to histograms? A) adjacent bars (no gaps) B) area of bars indicates proportions C) bins of equal length D) bars can be reordered

  28. stripchart()
      # strip-chart (1-D scatter plot)
      # (for small sample sizes)
      stripchart(num_vec)
      [Figure: strip chart of num_vec; x-axis from 0 to 100]

  29. stem()
      # stem-and-leaf plot
      # (for small sample sizes)
      stem(num_vec)
      ##
      ##   The decimal point is 1 digit(s) to the right of the |
      ##
      ##    0 | 1496
      ##    2 | 56
      ##    4 | 9
      ##    6 | 4
      ##    8 | 1
      ##   10 | 0

  30. Kernel Density Curve ◮ Surprisingly, R does not have a specific function to plot density curves ◮ R does have the density() function which computes a kernel density estimate ◮ We can pass a "density" object to plot() in order to get a density curve. 30

  31. Kernel Density Curve
      # kernel density curve
      dens <- density(num_vec)
      plot(dens)
      [Figure: density curve titled "density.default(x = num_vec)"; y-axis Density; x-axis from -50 to 150, labeled "N = 10 Bandwidth = 19.41"]
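A "density" object can also be added to an existing plot with a low-level function. The following sketch overlays the curve on a histogram drawn on the density scale; the color and title are arbitrary choices made for the example.

```r
num_vec <- (1:10)^2

# density() returns an object of class "density"
dens <- density(num_vec)
class(dens)
dens$bw   # bandwidth chosen by the default rule

# High-level call on the density scale, then add the curve on top
hist(num_vec, freq = FALSE, main = "Histogram with density curve")
lines(dens, col = "tomato", lwd = 2)
```

`freq = FALSE` matters here: with counts on the y-axis the curve would be drawn on a different scale than the bars.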

  32. Test your knowledge What type of plot is based on the five-number summary? A) bar chart B) box plot C) histogram D) scatterplot

  33. Plots of Two Variables 33

  34. High-level graphics of two variables
      Function   Data                          Description
      plot()     numeric, numeric              scatterplot
      plot()     numeric, factor               stripcharts
      plot()     factor, numeric               boxplots
      plot()     factor, factor                spineplot
      plot()     2-column numeric matrix       scatterplot
      plot()     2-column numeric data.frame   scatterplot
      plot()     2-D table                     mosaicplot

  35. Two variable objects: 2 numeric vectors; numeric vector and factor; factor and numeric vector; 2 factors; 2-D table (frequency or crosstable); 2-column numeric matrix; 2-column numeric data.frame

  36. Plots of two variables
      # plot numeric, numeric
      plot(iris$Petal.Length, iris$Sepal.Length)
      # plot numeric, factor
      plot(iris$Petal.Length, iris$Species)
      # plot factor, numeric
      plot(iris$Species, iris$Petal.Length)
      # plot factor, factor
      plot(iris$Species, iris$Species)

  37. Plots of two variables
      # plot numeric, numeric
      plot(iris$Petal.Length, iris$Sepal.Length)
      [Figure: scatterplot; x-axis iris$Petal.Length from 1 to 7; y-axis iris$Sepal.Length from 4.5 to 7.5]

  38. Plots of two variables
      # plot numeric, factor
      plot(iris$Petal.Length, iris$Species)
      [Figure: three horizontal strips of points; x-axis iris$Petal.Length from 1 to 7; y-axis iris$Species coded from 1.0 to 3.0]

  39. Plots of two variables
      # plot factor, numeric
      plot(iris$Species, iris$Petal.Length)
      [Figure: side-by-side boxplots for setosa, versicolor and virginica; y-axis from 1 to 7]

  40. Plots of two variables
      # plot factor, factor
      plot(iris$Species, iris$Species)
      [Figure: spineplot; x and y levels setosa, versicolor, virginica; proportion axis from 0.0 to 1.0]

  41. Plots of two variables
      # some fake data
      set.seed(1)
      # hair color
      hair <- factor(
        sample(c('blond', 'black', 'brown'), 100, replace = TRUE))
      # eye color
      eye <- factor(
        sample(c('blue', 'brown', 'green'), 100, replace = TRUE))
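The transcript ends at this slide. One plausible way to use the two factors, following the table of two-variable plots on slide 34, is sketched below; the object name `hair_eye` and the plot titles are assumptions, not taken from the slides.

```r
# Same fake data as on the slide
set.seed(1)
hair <- factor(sample(c('blond', 'black', 'brown'), 100, replace = TRUE))
eye <- factor(sample(c('blue', 'brown', 'green'), 100, replace = TRUE))

# Cross-tabulate the two factors into a 2-D table
hair_eye <- table(hair, eye)
hair_eye

# plot() of a 2-D table produces a mosaic plot
plot(hair_eye, main = "Hair color by eye color")

# plot() of two factors produces a spineplot
plot(hair, eye, xlab = "hair", ylab = "eye")
```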
