An introduction to
Data visualization and graphics
WS 2018/2019

Dr. Sonja Grath
Dr. Eliza Argyridou

Special thanks to Dr. Benedikt Holtmann for sharing slides for this lecture

What you should know after day 6
● Review: Rearranging and manipulating data
● Graphics with base R
    ● Histograms
    ● Scatterplots
    ● Boxplots
● Saving plots
● Graphics with ggplot2

Review: Reshaping data

Package: tidyr
● gather()
● spread()

Review: Combining datasets

Functions to combine data sets in dplyr

Function                        Description
left_join(a, b, by = "x1")      Joins matching rows from b to a
right_join(a, b, by = "x1")     Joins matching rows from a to b
inner_join(a, b, by = "x1")     Returns all rows from a where there are matching values in b
full_join(a, b, by = "x1")      Joins data and returns all rows and columns

[Figure: three example data sets (Fish survey, Water characteristics, GPS) with columns including Site, Month, Transect, Species, Water temp., O2 content, Latitude, Longitude]

Review: Adding new variables

Three ways of adding a new variable (example: log of FID)

a) Using $

    Bird_Behaviour$log_FID <- log(Bird_Behaviour$FID)

b) Using the [ ] operator

    Bird_Behaviour[ , "log_FID"] <- log(Bird_Behaviour$FID)

c) Using the function mutate() from the dplyr package

    Bird_Behaviour <- mutate(Bird_Behaviour, log_FID = log(FID))

Review: Adding new variables

● Split one column into two using separate() from the tidyr package
● Combine two columns using unite() from the tidyr package

tidyr::separate() and tidyr::unite()

Original        After separate()        After unite()
X1   X2         X1   X2.1   X2.2        X1   X2
A    1_1        A    1      1           A    1_1
B    1_2        B    1      2           B    1_2
A    2_1        A    2      1           A    2_1
B    2_2        B    2      2           B    2_2
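The join and separate()/unite() functions reviewed above can be tried on toy data. The following is a minimal sketch, assuming the dplyr and tidyr packages are installed; the data frames a, b and df are hypothetical and are not the course data sets.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data (not the course data sets)
a <- data.frame(x1 = c("A", "B", "C"), x2 = c(1, 2, 3))
b <- data.frame(x1 = c("A", "B", "D"), x3 = c(TRUE, FALSE, TRUE))

# left_join keeps every row of a; x3 is NA where b has no match ("C")
left <- left_join(a, b, by = "x1")

# inner_join keeps only the keys present in both a and b ("A", "B")
inner <- inner_join(a, b, by = "x1")

# full_join keeps every key from either table ("A", "B", "C", "D")
full <- full_join(a, b, by = "x1")

# separate() splits X2 at the underscore; unite() glues the parts back
df <- data.frame(X1 = c("A", "B", "A", "B"),
                 X2 = c("1_1", "1_2", "2_1", "2_2"))
split_df  <- separate(df, X2, into = c("X2.1", "X2.2"), sep = "_")
united_df <- unite(split_df, "X2", X2.1, X2.2, sep = "_")
```

Comparing nrow(left), nrow(inner) and nrow(full) is a quick way to see how the joins differ in which rows they keep.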
Review: Subsetting data

Subsetting data:
● Using the [ ] operator
● Using subset()
● With functions from the dplyr package

Operator    Description
>           greater than
>=          greater than or equal to
<           less than
<=          less than or equal to
==          equal to
!=          not equal to
x & y       x and y
x | y       x or y

    # selects the rows 1 to 3 and columns 1 to 4
    Bird_Behaviour[1:3, 1:4]

    # selects all rows with males
    Bird_Behaviour[Bird_Behaviour$Sex == "male", ]

    # selects all rows that have a value of FID greater than 10 or less than 15;
    # we keep only the Ind, Sex and Year columns
    subset(Bird_Behaviour, FID > 10 | FID < 15,
           select = c(Ind, Sex, Year))

Note: FID > 10 | FID < 15 is TRUE for every non-missing FID, so this call keeps all such rows. Use & instead of | to keep only values between 10 and 15.

Review: Subsetting data with functions from the dplyr package

● You can subset by rows using slice() and filter()
● You can subset by columns with select()

    # selects rows 3-5
    Bird_Behaviour.slice <- slice(Bird_Behaviour, 3:5)

    # selects rows that meet certain criteria
    Bird_Behaviour.filter <- filter(Bird_Behaviour, FID < 5)

    # selects the columns Ind, Sex, and Fledglings
    Bird_Behaviour_col <- select(Bird_Behaviour, Ind, Sex, Fledglings)

    # excludes the variable Disturbance
    Bird_Behaviour_reduced <- select(Bird_Behaviour, -Disturbance)

Graphics with base R

Simple graphics using plotting functions in the graphics package
● Base R, installed by default
● Easy and quick to type
● Wide variety of functions

Function      Description
hist()        Histograms
plot()        Scatterplots, etc.
boxplot()     Box-and-whisker plots
barplot()     Bar and column charts
dotchart()    Cleveland dot plots
contour()     Contour of a surface (2D)
pie()         Circular pie chart
…

Creating a histogram with hist()

Example 1:

    hist(Sparrows$Tarsus)

[Figure: "Histogram of Sparrows$Tarsus"; x-axis Sparrows$Tarsus, y-axis Frequency]

Example 2: Alter colour and the number of bins

    hist(Sparrows$Tarsus,
         col = "grey",
         breaks = 50)

[Figure: "Histogram of Sparrows$Tarsus"; x-axis Sparrows$Tarsus, y-axis Frequency]

Example 3: Add density curve

    hist(Sparrows$Tarsus,
         col = "grey",
         breaks = 50,
         freq = FALSE)

With freq = FALSE the y-axis shows densities instead of counts, so a density curve can be drawn on the same scale.

[Figure: "Histogram of Sparrows$Tarsus"; x-axis Sparrows$Tarsus, y-axis Density]
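To see what breaks and freq = FALSE change, it can help to inspect the object that hist() returns. This is a minimal sketch on simulated values; the vector tarsus is a hypothetical stand-in for Sparrows$Tarsus, which is not available here.

```r
set.seed(42)

# Hypothetical stand-in for Sparrows$Tarsus (simulated values)
tarsus <- rnorm(500, mean = 21.5, sd = 0.9)

# plot = FALSE returns the histogram object without drawing it
h <- hist(tarsus, breaks = 50, plot = FALSE)

# A single number for breaks is only a suggestion, so the number of
# bins actually used may differ from 50
n_bins <- length(h$counts)

# Densities are scaled so that the bar areas sum to 1
total_area <- sum(h$density * diff(h$breaks))

# Counts still add up to the number of observations
total_count <- sum(h$counts)
```

Because the bar areas sum to 1, a curve from density() sits on the same scale as a histogram drawn with freq = FALSE.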
Creating a histogram with hist()

Example 3: Add density curve

    hist(Sparrows$Tarsus,
         col = "grey",
         breaks = 50,
         freq = FALSE)

    lines(density(Sparrows$Tarsus),
          col = "blue",
          lwd = 2)

[Figure: "Histogram of Sparrows$Tarsus" with density curve; x-axis Sparrows$Tarsus, y-axis Density]

Example 4: Plot only males

    hist(Sparrows[Sparrows$Sex == "Male", ]$Tarsus,
         col = "grey",
         breaks = 50)

[Figure: "Histogram of Sparrows[Sparrows$Sex == "Male", ]$Tarsus"; y-axis Frequency]

Creating a scatterplot with plot()

➔ Relationship between two continuous variables

Example 1:

    plot(Sparrows$Wing, Sparrows$Tarsus)

[Figure: scatterplot; x-axis Sparrows$Wing, y-axis Sparrows$Tarsus]

Example 2: Alter axis limits and shape of symbols

    plot(Sparrows$Wing,
         Sparrows$Tarsus,
         xlim = c(50, 70),
         pch = 15,
         col = "blue")

Try yourself: ?pch

[Figure: scatterplot; x-axis Sparrows$Wing (50 to 70), y-axis Sparrows$Tarsus]

Example 3: Alter the size of plotting symbols

    plot(Sparrows$Wing,
         Sparrows$Tarsus,
         xlim = c(50, 70),
         cex = 1.5)

[Figure: scatterplot; x-axis Sparrows$Wing (50 to 70), y-axis Sparrows$Tarsus]

Creating line graphs with plot()

Examples:

    data("pressure")
    plot(pressure$temperature, pressure$pressure)
    plot(pressure$temperature, pressure$pressure, type = "l")

[Figure: two panels, points and line; x-axis pressure$temperature, y-axis pressure$pressure]
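The arguments from the scatterplot and line-graph examples can be combined in one plot. The following is a minimal sketch using the built-in pressure data set; the axis labels and the choice to overlay points with points() are illustrative assumptions, not part of the original slides.

```r
# Built-in data set also used in the line-graph examples
data("pressure")

# Draw the curve first, then overlay the observed points
plot(pressure$temperature, pressure$pressure,
     type = "l", lwd = 2, col = "blue",
     xlab = "Temperature", ylab = "Pressure")
points(pressure$temperature, pressure$pressure,
       pch = 15, cex = 1.5)

# type = "b" draws points connected by lines in a single call
plot(pressure$temperature, pressure$pressure,
     type = "b", pch = 15)
```

Drawing the line first and adding points afterwards lets the two layers use different colours and sizes.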
Use the type argument to specify the type of plot

Possible types
"p"    points
"l"    lines
"b"    points connected by lines
"o"    points overlaid by lines
"h"    vertical lines from points to the zero axis
"s"    steps
"n"    nothing, only the axes

Creating a boxplot with boxplot()

➔ Relationship between continuous and categorical variables

Example 1:

    boxplot(Wing ~ Sex, data = Sparrows)

[Figure: boxplots of Wing for Female and Male]

Example 2:

    boxplot(Wing ~ Sex, data = Sparrows,
            xlab = "Sex",                 # Adds label to x-axis
            ylab = "Wing length (mm)",    # Adds label to y-axis
            col = c("red", "blue"),       # Adds colour
            ylim = c(50, 70),             # Changes axis limits
            main = "Boxplot")             # Adds title

[Figure: "Boxplot"; x-axis Sex (Female, Male), y-axis Wing length (mm)]

Example 3: Multiple grouping variables

    boxplot(Wing ~ Sex + Species, data = Sparrows,
            xlab = "Species and Sex",
            ylab = "Wing length (mm)",
            col = c("red", "blue"),
            ylim = c(50, 70),
            main = "")

[Figure: boxplots for Female.SESP, Male.SESP, Female.SSTS, Male.SSTS; x-axis Species and Sex, y-axis Wing length (mm)]

Common parameters in graphics

Parameter    Description
main         title of the plot
xlab         label of x-axis
ylab         label of y-axis
xlim         range/limits of x-axis
ylim         range/limits of y-axis
col          colour of the points, bars, etc.; can be a character string or a hexadecimal colour (e.g. #RRGGBB)
breaks       number of bins (hist() treats a single number as a suggestion)
pch          shape of symbol
cex          size of symbols
lty          line type
lwd          line width

Multiple plots on one page

The par() function:
● Comes with an extensive list of graphical parameters you can change (see ?par)
● Some options are helpful; others you may never use

To plot multiple charts within the same window, you can use the mfcol or mfrow parameter.

For example, par(mfrow = c(2, 2)) divides the graphic window into four panels (two rows and two columns).
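A 2 x 2 layout with par(mfrow = c(2, 2)) might look like the following minimal sketch. The data frame toy is a simulated, hypothetical stand-in for the Sparrows data; saving and restoring the previous settings through old_par is a common habit, not something the slides prescribe.

```r
set.seed(1)

# Hypothetical stand-in for the Sparrows data (simulated values)
toy <- data.frame(
  Sex  = rep(c("Female", "Male"), each = 50),
  Wing = c(rnorm(50, mean = 57, sd = 2), rnorm(50, mean = 59, sd = 2))
)
toy$Tarsus <- 10 + 0.2 * toy$Wing + rnorm(100, sd = 0.5)

# par() returns the previous settings, so they can be restored later
old_par <- par(mfrow = c(2, 2))

# With mfrow the four panels are filled row by row
hist(toy$Tarsus, col = "grey", main = "Histogram")
plot(toy$Wing, toy$Tarsus, pch = 15, col = "blue", main = "Scatterplot")
boxplot(Wing ~ Sex, data = toy, col = c("red", "blue"), main = "Boxplot")
plot(pressure$temperature, pressure$pressure, type = "l", main = "Line graph")

# Restore the single-panel layout
par(old_par)
```

With mfcol instead of mfrow the same four plots would be filled column by column.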