Introduction to Tidy Data

Introduction to Tidy Data. Working with Data in the Tidyverse. Alison Hill, Professor & Data Scientist. The Great British Bake Off, Series 8.


  1. Introduction to Tidy Data. Working with Data in the Tidyverse. Alison Hill, Professor & Data Scientist

  4. The Great British Bake Off, Series 8

  6. Tame but un-tidy

     juniors_untidy
     # A tibble: 4 x 4
       baker  cinnamon_1 cardamom_2 nutmeg_3
       <chr>       <int>      <int>    <int>
     1 Emma            1          0        1
     2 Harry           1          1        1
     3 Ruby            1          0        1
     4 Zainab          0         NA        0

  7. Tidy data

     juniors_tidy
     # A tibble: 12 x 4
        baker  spice    order correct
        <chr>  <chr>    <int>   <int>
      1 Emma   cinnamon     1       1
      2 Harry  cinnamon     1       1
      3 Ruby   cinnamon     1       1
      4 Zainab cinnamon     1       0
      5 Emma   cardamom     2       0
      6 Harry  cardamom     2       1
      7 Ruby   cardamom     2       0
      8 Zainab cardamom     2      NA
      9 Emma   nutmeg       3       1
     10 Harry  nutmeg       3       1
     11 Ruby   nutmeg       3       1
     12 Zainab nutmeg       3       0

  8. Who won? Count it!

     juniors_tidy %>%
       count(baker, wt = correct)
     # A tibble: 4 x 2
       baker      n
       <chr>  <int>
     1 Emma       2
     2 Harry      3
     3 Ruby       2
     4 Zainab     0

  9. Who won? Plot it!

     ggplot(juniors_tidy, aes(baker, correct)) +
       geom_col()

  10. Which spice was the hardest to guess? Count it!

      juniors_tidy %>%
        count(spice, wt = correct)

  11. Which spice was the hardest to guess? Plot it!

      ggplot(juniors_tidy, aes(spice, correct)) +
        geom_col()

  12. Insert title here ...

  13. Let's get to work!
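
The two questions from the opening slides (who won, and which spice was hardest to guess) can be sketched end to end as below. This is a minimal sketch, assuming the `tibble` and `dplyr` packages are installed; the `juniors_tidy` values are retyped from slide 7, and the names `by_baker` and `by_spice` are illustrative rather than taken from the slides.

```r
library(tibble)
library(dplyr)

# Tidy data retyped from slide 7: one row per baker-spice guess.
juniors_tidy <- tibble(
  baker   = rep(c("Emma", "Harry", "Ruby", "Zainab"), times = 3),
  spice   = rep(c("cinnamon", "cardamom", "nutmeg"), each = 4),
  order   = rep(1:3, each = 4),
  correct = c(1L, 1L, 1L, 0L,   # cinnamon
              0L, 1L, 0L, NA,   # cardamom (Zainab's guess is missing)
              1L, 1L, 1L, 0L)   # nutmeg
)

# Who won? Sum the correct guesses per baker.
# The slide 8 output (Zainab = 0) indicates the missing guess is ignored.
by_baker <- juniors_tidy %>%
  count(baker, wt = correct)

# Which spice was hardest? Same call, grouped by spice instead.
by_spice <- juniors_tidy %>%
  count(spice, wt = correct)

print(by_baker)
print(by_spice)
```

Because each row is a single guess, switching the question from bakers to spices only means changing the grouping column.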

  14. Gather

  15. The `tidyr` package

      ¹ http://tidyr.tidyverse.org

  16. Gather: usage

      ?gather

  17. Gather: arguments

      ?gather

  18. Gathering juniors

  19. Gathering what you have into what you want

  20. The key column

  21. The key column

  22. The key column

  23. The value column

  24. The value column

  25. The value column

  26. A little trick

  27. Let's get to work!
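
The gather step described in slides 14 to 26 can be sketched as follows. This is a minimal sketch, assuming the `tibble` and `tidyr` packages are installed; the data are retyped from slide 6, and `juniors_long` is an illustrative name that does not appear in the slides.

```r
library(tibble)
library(tidyr)

# Untidy data retyped from slide 6: one column per spice.
juniors_untidy <- tribble(
  ~baker,   ~cinnamon_1, ~cardamom_2, ~nutmeg_3,
  "Emma",   1L,          0L,          1L,
  "Harry",  1L,          1L,          1L,
  "Ruby",   1L,          0L,          1L,
  "Zainab", 0L,          NA,          0L
)

# Gather every column except `baker`:
#   key   = name of the new column holding the old column names
#   value = name of the new column holding the old cell values
juniors_long <- gather(juniors_untidy,
                       key = "spice", value = "correct", -baker)

print(juniors_long)
```

The result has one row per baker and spice column (4 bakers x 3 columns = 12 rows), with the old column names such as `cinnamon_1` stored in `spice`.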

  28. Separate

  29. Gathering the juniors data

  30. Separate: usage

      ?separate

  31. Separate: arguments

      ?separate

  32. Separating what you have into what you want

  33. Separate `spice`

  34. Reminder: pre-separate

      juniors_untidy %>%
        gather(key = spice, value = correct, -baker)
      # A tibble: 12 x 3
         baker  spice      correct
         <chr>  <chr>        <int>
       1 Emma   cinnamon_1       1
       2 Harry  cinnamon_1       1
       3 Ruby   cinnamon_1       1
       4 Zainab cinnamon_1       0
       5 Emma   cardamom_2       0
       6 Harry  cardamom_2       1
       7 Ruby   cardamom_2       0
       8 Zainab cardamom_2      NA
       9 Emma   nutmeg_3         1
      10 Harry  nutmeg_3         1
      11 Ruby   nutmeg_3         1
      12 Zainab nutmeg_3         0

  35. Gather and separate

      juniors_untidy %>%
        gather(key = "spice", value = "correct", -baker) %>%
        separate(spice, into = c("spice", "order"))
      # A tibble: 12 x 4
         baker  spice    order correct
         <chr>  <chr>    <chr>   <int>
       1 Emma   cinnamon 1           1
       2 Harry  cinnamon 1           1
       3 Ruby   cinnamon 1           1
       4 Zainab cinnamon 1           0
       5 Emma   cardamom 2           0
       6 Harry  cardamom 2           1
       7 Ruby   cardamom 2           0
       8 Zainab cardamom 2          NA
       9 Emma   nutmeg   3           1
      10 Harry  nutmeg   3           1
      11 Ruby   nutmeg   3           1
      12 Zainab nutmeg   3           0

  36. Gather, separate, and convert types

      juniors_untidy %>%
        gather(key = "spice", value = "correct", -baker) %>%
        separate(spice, into = c("spice", "order"), convert = TRUE)
      # A tibble: 12 x 4
         baker  spice    order correct
         <chr>  <chr>    <int>   <int>
       1 Emma   cinnamon     1       1
       2 Harry  cinnamon     1       1
       3 Ruby   cinnamon     1       1
       4 Zainab cinnamon     1       0
       5 Emma   cardamom     2       0
       6 Harry  cardamom     2       1
       7 Ruby   cardamom     2       0
       8 Zainab cardamom     2      NA
       9 Emma   nutmeg       3       1
      10 Harry  nutmeg       3       1
      11 Ruby   nutmeg       3       1
      12 Zainab nutmeg       3       0

  37. Before and after separate

      # A tibble: 12 x 3
         baker  spice      correct
         <chr>  <chr>        <int>
       1 Emma   cinnamon_1       1
       2 Harry  cinnamon_1       1
       3 Ruby   cinnamon_1       1
       4 Zainab cinnamon_1       0
       5 Emma   cardamom_2       0
       6 Harry  cardamom_2       1
       7 Ruby   cardamom_2       0
       8 Zainab cardamom_2      NA
       9 Emma   nutmeg_3         1
      10 Harry  nutmeg_3         1
      11 Ruby   nutmeg_3         1
      12 Zainab nutmeg_3         0

      # A tibble: 12 x 4
         baker  spice    order correct
         <chr>  <chr>    <int>   <int>
       1 Emma   cinnamon     1       1
       2 Harry  cinnamon     1       1
       3 Ruby   cinnamon     1       1
       4 Zainab cinnamon     1       0
       5 Emma   cardamom     2       0
       6 Harry  cardamom     2       1
       7 Ruby   cardamom     2       0
       8 Zainab cardamom     2      NA
       9 Emma   nutmeg       3       1
      10 Harry  nutmeg       3       1
      11 Ruby   nutmeg       3       1
      12 Zainab nutmeg       3       0

  38. The `sep` argument

      ?separate

  39. Let's practice!
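
The `sep` argument from slide 38 can be made explicit, as sketched below. This is a minimal sketch, assuming the `tibble` and `tidyr` packages are installed; the gathered data are retyped from slide 34, and the names `juniors_long` and `juniors_sep` are illustrative. Passing `sep = "_"` is an assumption about what the slide demonstrates: the slides call `separate()` without `sep` and rely on its default.

```r
library(tibble)
library(tidyr)

# Gathered data retyped from slide 34: spice name and order share one column.
juniors_long <- tibble(
  baker   = rep(c("Emma", "Harry", "Ruby", "Zainab"), times = 3),
  spice   = rep(c("cinnamon_1", "cardamom_2", "nutmeg_3"), each = 4),
  correct = c(1L, 1L, 1L, 0L,
              0L, 1L, 0L, NA,
              1L, 1L, 1L, 0L)
)

# Split `spice` at the underscore into two columns.
# convert = TRUE turns the new `order` column from character into integer.
juniors_sep <- separate(juniors_long, spice,
                        into = c("spice", "order"),
                        sep = "_", convert = TRUE)

print(juniors_sep)
```

Naming the separator explicitly is useful when values contain other punctuation that should not be treated as a split point.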

  40. Spread

  41. Gather

  42. Spread

  43. Spread

  44. Spread: usage

      ?spread

  45. Spread: arguments

      ?spread

  46. Using spread

      juniors_jumbled
      # A tibble: 12 x 3
         baker  key     value
         <chr>  <chr>   <chr>
       1 Emma   age     11
       2 Harry  age     10
       3 Ruby   age     11
       4 Zainab age     10
       5 Emma   outcome finalist
       6 Harry  outcome winner
       7 Ruby   outcome finalist
       8 Zainab outcome finalist
       9 Emma   spices  2
      10 Harry  spices  3
      11 Ruby   spices  2
      12 Zainab spices  0

      juniors_jumbled %>%
        spread(key = key, value = value)
      # A tibble: 4 x 4
        baker  age   outcome  spices
        <chr>  <chr> <chr>    <chr>
      1 Emma   11    finalist 2
      2 Harry  10    winner   3
      3 Ruby   11    finalist 2
      4 Zainab 10    finalist 0

  47. Spread and convert

      juniors_jumbled
      # A tibble: 12 x 3
         baker  key     value
         <chr>  <chr>   <chr>
       1 Emma   age     11
       2 Harry  age     10
       3 Ruby   age     11
       4 Zainab age     10
       5 Emma   outcome finalist
       6 Harry  outcome winner
       7 Ruby   outcome finalist
       8 Zainab outcome finalist
       9 Emma   spices  2
      10 Harry  spices  3
      11 Ruby   spices  2
      12 Zainab spices  0

      juniors_jumbled %>%
        spread(key = key, value = value, convert = TRUE)
      # A tibble: 4 x 4
        baker    age outcome  spices
        <chr>  <int> <chr>     <int>
      1 Emma      11 finalist      2
      2 Harry     10 winner        3
      3 Ruby      11 finalist      2
      4 Zainab    10 finalist      0

  48. Spread review

  49. Let's practice!
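
The spread step from slides 46 and 47 can be reproduced with the sketch below. It is a minimal sketch, assuming the `tibble` and `tidyr` packages are installed; `juniors_jumbled` is retyped from slide 46, and `juniors_wide` is an illustrative name that does not appear in the slides.

```r
library(tibble)
library(tidyr)

# Jumbled data retyped from slide 46: three different variables
# (age, outcome, spices) are stacked in one key/value pair of columns.
juniors_jumbled <- tibble(
  baker = rep(c("Emma", "Harry", "Ruby", "Zainab"), times = 3),
  key   = rep(c("age", "outcome", "spices"), each = 4),
  value = c("11", "10", "11", "10",
            "finalist", "winner", "finalist", "finalist",
            "2", "3", "2", "0")
)

# Spread the key column into one column per variable.
# convert = TRUE re-guesses each new column's type, so `age` and `spices`
# become integers while `outcome` stays character.
juniors_wide <- spread(juniors_jumbled,
                       key = key, value = value, convert = TRUE)

print(juniors_wide)
```

Without `convert = TRUE`, every spread column keeps the character type of the original `value` column, as slide 46 shows.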

  50. Tidy multiple sets of columns
