More ggplot (Steve Bagley, somgen223.stanford.edu)

SLIDE 1

More ggplot

Steve Bagley

somgen223.stanford.edu

SLIDE 2

More ggplot

SLIDE 3

Histograms

library(tidyverse)  # loads ggplot2 and the %>% pipe used on these slides
iris %>% ggplot(aes(Petal.Length)) + geom_histogram()

[Figure: histogram of Petal.Length; x-axis: Petal.Length, y-axis: count]

  • A histogram only needs a single variable (column). It divides the range of that variable into bins (30 by default, unless you set bins or binwidth) and counts the number of observations in each bin.
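One way to see what the default binning does is to look at the data that ggplot2 computes for the histogram layer. The sketch below assumes only that ggplot2 is installed. It uses layer_data() from ggplot2, and the variable names are illustrative.

```r
library(ggplot2)

# Histogram with default settings; ggplot2 prints a message suggesting
# that you choose a binwidth yourself.
p_default <- ggplot(iris, aes(Petal.Length)) + geom_histogram()

# layer_data() returns the data frame computed for one layer of the plot:
# one row per bin, with the bin edges (xmin, xmax) and the count.
bins_default <- layer_data(p_default, 1)

n_bins  <- nrow(bins_default)        # how many bins were used
n_total <- sum(bins_default$count)   # every row of iris falls in some bin

head(bins_default[, c("xmin", "xmax", "count")])
```

Looking at n_bins and the xmin/xmax columns is a quick check on whether the default bins suit the data before choosing a binwidth.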

SLIDE 4

Adjusting the binwidth

iris %>% ggplot(aes(Petal.Length)) + geom_histogram(binwidth = 1)

[Figure: histogram of Petal.Length with binwidth = 1; x-axis: Petal.Length, y-axis: count]

SLIDE 5

Adjusting the binwidth

iris %>% ggplot(aes(Petal.Length)) + geom_histogram(binwidth = 0.1)

[Figure: histogram of Petal.Length with binwidth = 0.1; x-axis: Petal.Length, y-axis: count]

  • This histogram includes all the measurements from all the species.
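The effect of binwidth can also be compared numerically. This is a rough sketch, not part of the original slides: bin_summary is a hypothetical helper that reports the number of bins and the tallest bar for a given binwidth.

```r
library(ggplot2)

# Hypothetical helper: summarize the bins produced by a given binwidth.
bin_summary <- function(width) {
  p <- ggplot(iris, aes(Petal.Length)) + geom_histogram(binwidth = width)
  d <- layer_data(p, 1)  # one row per bin
  c(bins = nrow(d), max_count = max(d$count), total = sum(d$count))
}

wide   <- bin_summary(1)    # few, tall bars
narrow <- bin_summary(0.1)  # many, short bars

rbind(wide, narrow)
```

Wider bins smooth out detail and give taller bars, while narrower bins show more structure but are noisier. The total count is the same in both cases.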

SLIDE 6

Histogram by species with facets

iris %>% ggplot(aes(Petal.Length)) + geom_histogram() + facet_wrap(vars(Species))

[Figure: three histogram panels, one per species (setosa, versicolor, virginica); x-axis: Petal.Length, y-axis: count]

SLIDE 7

Histogram by species with overlays

iris %>% ggplot(aes(Petal.Length, fill = Species)) + geom_histogram(color = "black", alpha = 0.5, position = "identity")

[Figure: overlaid, semi-transparent histograms of Petal.Length, filled by Species (setosa, versicolor, virginica); x-axis: Petal.Length, y-axis: count]

  • position = "identity" draws each species' bars from the baseline, so the three histograms overlap instead of being stacked (the default).
  • color = "black" makes the border of each bar black. fill is the color of the interior of the bar.
  • alpha = 0.5 makes the histogram bars partially transparent. Note that the colors in overlapping regions blend.
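To make the difference between stacking and position = "identity" concrete, the sketch below compares where the bars start in each case. It assumes ggplot2 is installed, and the object names (base, stacked, overlaid) are illustrative.

```r
library(ggplot2)

base <- ggplot(iris, aes(Petal.Length, fill = Species))

# Default position for geom_histogram(): bars of different species
# are stacked on top of each other.
stacked <- layer_data(base + geom_histogram(bins = 30), 1)

# position = "identity": every bar starts at zero, so bars can overlap.
overlaid <- layer_data(
  base + geom_histogram(bins = 30, alpha = 0.5, position = "identity"), 1
)

any(stacked$ymin > 0)    # are some stacked bars raised above the baseline?
all(overlaid$ymin == 0)  # do all overlaid bars start at the baseline?
```

With stacking, the height of a colored segment is hard to read because its base moves. With overlays, each species can be read against the same baseline, at the cost of bars hiding each other unless alpha is lowered.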

SLIDE 8

Frequency polygon (better for overlapping distributions)

iris %>% ggplot(aes(Petal.Length, color = Species)) + geom_freqpoly()

[Figure: frequency polygons of Petal.Length, one colored line per Species (setosa, versicolor, virginica); x-axis: Petal.Length, y-axis: count]

SLIDE 9

Changing the color scale

  • Scales control the mapping from the data to some visual property of the graph.
  • In ggplot, color generally refers to the color of a point or line, or the color of the boundary of a solid object, such as the bar of a histogram.
  • fill refers to the color filling a solid object.
  • Some scales apply to discrete-valued data. Some scales apply to continuous-valued data.
  • alpha is a number between 0 and 1 that describes the opacity of a geom, with 0 being completely transparent and 1 being completely opaque.
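As an illustration of discrete versus continuous scales, the sketch below maps color first to a factor and then to a numeric column, and counts the distinct colors that result. The choice of Sepal.Length for the continuous case is an assumption made only for this example.

```r
library(ggplot2)

base <- ggplot(iris, aes(Petal.Width, Petal.Length))

# Species is a factor, so color gets a discrete scale: one color per level.
p_discrete <- base + geom_point(aes(color = Species))

# Sepal.Length is numeric, so color gets a continuous gradient.
# alpha is set outside aes(), so it is a fixed value, not a scale.
p_continuous <- base + geom_point(aes(color = Sepal.Length), alpha = 0.6)

# Count the distinct colors actually used to draw the points.
n_discrete   <- length(unique(layer_data(p_discrete, 1)$colour))
n_continuous <- length(unique(layer_data(p_continuous, 1)$colour))
```

The discrete plot uses exactly as many colors as there are species, while the continuous plot uses many shades along a gradient.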

SLIDE 10

Changing the bar color

BOD %>% ggplot(aes(Time, demand)) + geom_col()

[Figure: bar plot of demand against Time for the BOD data; x-axis: Time, y-axis: demand]

  • geom_col makes a bar plot when you already have the y-value for each bar. (Compare geom_bar, which counts the rows in each group.)
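The contrast between geom_bar and geom_col can be sketched as follows. The use of iris for the geom_bar case is an assumption made for illustration; the geom_col case uses the BOD data from the slide.

```r
library(ggplot2)

# geom_bar(): only x is given; the bar height is the number of rows per group.
p_bar <- ggplot(iris, aes(Species)) + geom_bar()
bar_heights <- layer_data(p_bar, 1)$count

# geom_col(): the bar height is taken directly from the y column.
p_col <- ggplot(BOD, aes(Time, demand)) + geom_col()
col_heights <- layer_data(p_col, 1)$y
```

Here bar_heights holds the row counts per species, while col_heights simply reproduces the demand column of BOD.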

SLIDE 11

geom_col with color

BOD %>% ggplot(aes(Time, demand)) + geom_col(color = "red")

[Figure: bar plot of demand against Time with red bar outlines; x-axis: Time, y-axis: demand]

SLIDE 12

geom_col with fill

BOD %>% ggplot(aes(Time, demand)) + geom_col(fill = "red")

[Figure: bar plot of demand against Time with red-filled bars; x-axis: Time, y-axis: demand]

SLIDE 13

geom_col with fill and color

BOD %>% ggplot(aes(Time, demand)) + geom_col(fill = "red", color = "black")

[Figure: bar plot of demand against Time with red-filled bars and black outlines; x-axis: Time, y-axis: demand]

SLIDE 14

Colors for discrete scales

  • For data with a discrete number of values (and usually not a large number), you can use the default colors, or change them.

  • scale_color_brewer has a number of different color palettes.
  • They are documented here: R color brewer palettes
  • The khroma package can help with palettes for those with color blindness.
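As one further option for discrete color scales, the sketch below uses scale_color_viridis_d(), which ships with ggplot2. Note that this is not the khroma package mentioned above; it is swapped in here only because it needs no extra installation, and the viridis palettes are designed to remain distinguishable under common forms of color blindness.

```r
library(ggplot2)

# scale_color_viridis_d() is a discrete palette built into ggplot2.
p_viridis <- ggplot(iris, aes(Petal.Width, Petal.Length)) +
  geom_point(aes(color = Species)) +
  scale_color_viridis_d()

# The three species should map to three distinct colors.
viridis_colors <- unique(layer_data(p_viridis, 1)$colour)
```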

SLIDE 15

Example using default colors

ggplot(iris, aes(Petal.Width, Petal.Length)) + geom_point(aes(color = Species))

[Figure: scatter plot of Petal.Length against Petal.Width, colored by Species (setosa, versicolor, virginica) with the default colors; x-axis: Petal.Width, y-axis: Petal.Length]

SLIDE 16

Example using color brewer

ggplot(iris, aes(Petal.Width, Petal.Length)) + geom_point(aes(color = Species)) + scale_color_brewer(palette = "Set1")

[Figure: scatter plot of Petal.Length against Petal.Width, colored by Species (setosa, versicolor, virginica) with the "Set1" brewer palette; x-axis: Petal.Width, y-axis: Petal.Length]

SLIDE 17

Example using your own list of colors

ggplot(iris, aes(Petal.Width, Petal.Length)) + geom_point(aes(color = Species)) + scale_color_manual(values = c("purple", "red", "orange"))

[Figure: scatter plot of Petal.Length against Petal.Width, colored by Species (setosa purple, versicolor red, virginica orange); x-axis: Petal.Width, y-axis: Petal.Length]
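A variation on the manual scale, offered as a sketch rather than as the approach used on the slide: scale_color_manual also accepts a named vector, which ties each color to a level by name instead of by position. The name species_colors is illustrative.

```r
library(ggplot2)

# Naming the values ties each color to a level explicitly, so the mapping
# does not depend on the order of the factor levels.
species_colors <- c(setosa = "purple", versicolor = "red", virginica = "orange")

p_manual <- ggplot(iris, aes(Petal.Width, Petal.Length)) +
  geom_point(aes(color = Species)) +
  scale_color_manual(values = species_colors)

# Check which color the first factor level (setosa, group 1) received.
d_manual <- layer_data(p_manual, 1)
setosa_color <- unique(d_manual$colour[d_manual$group == 1])
```

With an unnamed vector, colors are assigned in the order of the factor levels, so reordering the levels silently changes which species gets which color.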

SLIDE 18

Changing scale of an axis

ggplot(iris, aes(Petal.Width, log10(Petal.Length))) + geom_point(aes(color = Species)) + scale_x_log10()

[Figure: scatter plot of log10(Petal.Length) against Petal.Width on a log-scaled x-axis, colored by Species (setosa, versicolor, virginica); x-axis: Petal.Width, y-axis: log10(Petal.Length)]

  • x-axis: original data, log scale
  • y-axis: log of original data
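The two approaches place the points identically and differ only in how the axis is labeled. The sketch below checks this by inspecting the positions ggplot2 computes. It assumes ggplot2 is installed, and the names p_log and d_log are illustrative.

```r
library(ggplot2)

# x: the scale does the transformation; axis labels stay in original units.
# y: the data are transformed by hand; axis labels show log10 values.
p_log <- ggplot(iris, aes(Petal.Width, log10(Petal.Length))) +
  geom_point() +
  scale_x_log10()

d_log <- layer_data(p_log, 1)

# In both cases the points are positioned at log10 of the measurement.
x_matches <- isTRUE(all.equal(sort(d_log$x), sort(log10(iris$Petal.Width))))
y_matches <- isTRUE(all.equal(sort(d_log$y), sort(log10(iris$Petal.Length))))
```

Using scale_x_log10() or scale_y_log10() is usually easier for readers, because the tick labels show the original measurement units.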

SLIDE 19

Add annotations

  • Annotations are text or graphics that are added to the graph, usually to highlight some point or region.
  • geom_text adds a text label at every data point.
  • annotate adds text (or other geoms) at a small number of positions that you specify directly.
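The difference between the two can be sketched by counting the labels each one draws. The label choice (the demand value) and the nudge_y offset are assumptions made for illustration.

```r
library(ggplot2)

base <- ggplot(BOD, aes(Time, demand)) + geom_point()

# geom_text(): one label per row of the data, here the demand value,
# nudged upward so it does not sit on top of the point.
p_text <- base + geom_text(aes(label = demand), nudge_y = 0.7)

# annotate(): a single label at coordinates given directly.
p_note <- base +
  annotate("text", x = 3, y = 18.5, label = "outlier?", color = "red")

# Layer 2 is the text layer in both plots (layer 1 is geom_point).
n_text_labels <- nrow(layer_data(p_text, 2))
n_note_labels <- nrow(layer_data(p_note, 2))
```

geom_text draws as many labels as there are rows in the data, while annotate draws only what you ask for.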

SLIDE 20

Annotation example

BOD %>% ggplot(aes(Time, demand)) + geom_point() + annotate("text", x = 3, y = 18.5, label = "outlier?", color = "red")

[Figure: scatter plot of demand against Time for the BOD data, with the red text label "outlier?" next to one point; x-axis: Time, y-axis: demand]

SLIDE 21

Reading

  • Read: 10 Mastering the grammar | ggplot2: Elegant Graphics for Data Analysis
  • Skim: as much of the rest of the book as you can
  • Skim: 28 Graphics for communication | R for Data Science
  • Interesting: Top 50 ggplot2 Visualizations - The Master List (With Full R Code)
