DataCamp Interactive Data Visualization with rbokeh
Data Formats
Omayma Said, Data Scientist
hdi_cpi_2015 Data
Data Format (hdi_cpi_wide)

> str(hdi_cpi_wide)
Classes ‘tbl_df’, ‘tbl’ and 'data.frame': 121 obs. of 8 variables:
 $ country                    : chr "Afghanistan" "Albania" "Algeria" "Angola"
 $ year                       : int 2015 2015 2015 2015 2015 2015 2015 2015 201
 $ human_development_index    : num 0.479 0.764 0.745 0.533 0.827 0.939 0.893 0
 $ country_code               : chr "AFG" "ALB" "DZA" "AGO" ...
 $ cpi_rank                   : int 166 88 88 163 106 13 16 50 139 15 ...
 $ region                     : chr "AP" "ECA" "MENA" "SSA" ...
 $ corruption_perception_index: int 11 36 36 15 32 79 76 51 25 77 ...
 $ continent
Data Format (hdi_cpi_long)

> str(hdi_cpi_long)
Classes ‘tbl_df’, ‘tbl’ and 'data.frame': 242 obs. of 7 variables:
 $ country     : chr "Afghanistan" "Afghanistan" "Albania" "Albania" ...
 $ country_code: chr "AFG" "AFG" "ALB" "ALB" ...
 $ cpi_rank    : int 166 166 88 88 88 88 163 163 106 106 ...
 $ region      : chr "AP" "AP" "ECA" "ECA" ...
 $ continent   : chr "Asia" "Asia" "Europe" "Europe" ...
 $ index       : chr "human_development_index" "corruption_perception_index" "h
 $ value       : num 0.479 11 0.764 36 0.745 36 0.533 15 0.827 32 ...
gather() and spread()
Long to Wide

hdi_cpi_wide <- hdi_cpi_long %>%
  spread(key = index, value = value)

str(hdi_cpi_wide)
Classes ‘tbl_df’, ‘tbl’ and 'data.frame': 121 obs. of 8 variables:
 $ country                    : chr "Afghanistan" "Albania" "Algeria" "Angola"
 $ year                       : int 2015 2015 2015 2015 2015 2015 2015 2015 201
 $ human_development_index    : num 0.479 0.764 0.745 0.533 0.827 0.939 0.893 0
 $ country_code               : chr "AFG" "ALB" "DZA" "AGO" ...
 $ cpi_rank                   : int 166 88 88 163 106 13 16 50 139 15 ...
 $ region                     : chr "AP" "ECA" "MENA" "SSA" ...
 $ corruption_perception_index: int 11 36 36 15 32 79 76 51 25 77 ...
 $ continent                  : chr "Asia" "Europe" "Africa" "Africa" ...
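
The long-to-wide step can be tried on a small made-up table. This is only a sketch: `toy_long`, its two countries, and the short index names `hdi` and `cpi` are hypothetical stand-ins for the course data, and only `tidyr::spread()` is taken from the slide.

```r
library(tidyr)

# Hypothetical toy data in long format: one row per (country, index) pair.
toy_long <- data.frame(
  country = c("A", "A", "B", "B"),
  index   = c("hdi", "cpi", "hdi", "cpi"),
  value   = c(0.5, 20, 0.9, 80),
  stringsAsFactors = FALSE
)

# spread() turns each distinct value of `index` into its own column,
# filled with the matching entries of `value`.
toy_wide <- spread(toy_long, key = index, value = value)
print(toy_wide)
```

The wide result has one row per country, which is why the row count halves (242 to 121) in the slide when two indices are spread into two columns.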
Data Format (hdi_data_wide)

> hdi_data_wide
# A tibble: 188 x 27
   country       `1990` `1991` `1992` `1993` `1994` `1995` `1996` `1997` `1998`
   <chr>          <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>
 1 Afghanistan    0.295  0.3    0.309  0.305  0.3    0.324  0.328  0.332  0.335
 2 Albania        0.635  0.618  0.603  0.608  0.616  0.628  0.637  0.636  0.646
 3 Algeria        0.577  0.581  0.587  0.591  0.595  0.6    0.609  0.617  0.627
 4 Andorra       NA     NA     NA     NA     NA     NA     NA     NA     NA
 5 Angola        NA     NA     NA     NA     NA     NA     NA     NA     NA
 6 Antigua and ~ NA     NA     NA     NA     NA     NA     NA     NA     NA
 7 Argentina      0.705  0.713  0.72   0.725  0.728  0.731  0.738  0.746  0.753
 8 Armenia        0.634  0.628  0.595  0.593  0.597  0.603  0.609  0.618  0.632
 9 Australia      0.866  0.867  0.871  0.874  0.876  0.885  0.888  0.891  0.894
10 Austria        0.794  0.798  0.804  0.806  0.812  0.816  0.819  0.823  0.833
# ... with 178 more rows, and 14 more variables: `2002` <dbl>, `2003` <dbl>, `20
#   `2005` <dbl>, `2006` <dbl>, `2007` <dbl>, `2008` <dbl>, `2009` <dbl>, `2010`
#   `2011` <dbl>, `2012` <dbl>, `2013` <dbl>, `2014` <dbl>, `2015` <dbl>
Wide to Long

hdi_data_long <- hdi_data_wide %>%
  gather(key = year, value = human_development_index, -country)

> hdi_data_long
# A tibble: 4,888 x 3
   country              year human_development_index
   <chr>               <int>                   <dbl>
 1 Afghanistan          1990                   0.295
 2 Albania              1990                   0.635
 3 Algeria              1990                   0.577
 4 Andorra              1990                  NA
 5 Angola               1990                  NA
 6 Antigua and Barbuda  1990                  NA
 7 Argentina            1990                   0.705
 8 Armenia              1990                   0.634
 9 Australia            1990                   0.866
10 Austria              1990                   0.794
# ... with 4,878 more rows
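
A small sketch of the wide-to-long step on made-up data. `toy_wide` and its values are hypothetical. By default `gather()` stores the former column names as character strings, so the example passes `convert = TRUE`, which is one way to end up with an integer `year` column like the `<int>` in the printed output; whether the course code did this is an assumption.

```r
library(tidyr)

# Hypothetical toy data in wide format: one column per year.
toy_wide <- data.frame(
  country = c("A", "B"),
  `1990`  = c(0.30, 0.64),
  `1991`  = c(0.31, NA),
  check.names = FALSE,       # keep "1990"/"1991" as column names
  stringsAsFactors = FALSE
)

# Stack every column except `country` into key/value pairs.
# convert = TRUE runs type conversion on the key column, so "1990" becomes 1990L.
toy_long <- gather(toy_wide, key = year, value = human_development_index,
                   -country, convert = TRUE)
print(toy_long)
```

Each of the 2 countries contributes one row per year column, so the result has 2 x 2 = 4 rows, the same arithmetic that gives 188 x 26 = 4,888 rows in the slide.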
Let's practice!
More rbokeh Layers
Omayma Said, Data Scientist
Scatter Plot + Regression Line
Scatter Plot + Regression Line
First: create scatter plot

## filter data
dat_90_13 <- bechdel %>%
  filter(between(year, 1990, 2013))

## create scatter plot
p_scatter <- figure() %>%
  ly_points(x = log(budget_2013), y = log(intgross_2013),
            data = dat_90_13, size = 5, alpha = 0.4)
Scatter Plot + Regression Line
Second: fit linear regression model

## fit linear regression model
lin_reg <- lm(log(intgross_2013) ~ log(budget_2013), data = dat_90_13)

> summary(lin_reg)

Call:
lm(formula = log(intgross_2013) ~ log(budget_2013), data = dat_90_13)

Residuals:
    Min      1Q  Median      3Q     Max
-9.9518 -0.5414  0.1304  0.7083  4.8586

Coefficients:
                 Estimate Std. Error t value Pr(>|t|)
(Intercept)       2.43003    0.38987   6.233 5.84e-10 ***
log(budget_2013)  0.90739    0.02253  40.269  < 2e-16 ***

Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.258 on 1605 degrees of freedom
  (8 observations deleted due to missingness)
Multiple R-squared:  0.5026, Adjusted R-squared:  0.5023
F-statistic:  1622 on 1 and 1605 DF,  p-value: < 2.2e-16
Scatter Plot + Regression Line
Third: add regression line

## add regression line
p_scatter %>%
  ly_abline(lin_reg)
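
The line added in the last step is determined by just two numbers from the fitted model, the intercept and the slope. The sketch below shows how to read them from an `lm` object. It uses R's built-in `cars` data as a stand-in for `bechdel`, and it assumes that `ly_abline()` takes its line from these coefficients in the same way base R's `abline()` does.

```r
# Built-in `cars` data stands in for the bechdel data used above.
fit <- lm(log(dist) ~ log(speed), data = cars)

# A simple linear fit is fully described by two coefficients.
coefs     <- coef(fit)
intercept <- unname(coefs[1])
slope     <- unname(coefs[2])

# Predicted log(dist) at speed = 10, computed by hand from the line equation...
by_hand <- intercept + slope * log(10)

# ...agrees with what predict() returns for the same input.
by_model <- unname(predict(fit, newdata = data.frame(speed = 10)))

print(c(intercept = intercept, slope = slope))
```

Because both axes are on the log scale, the slope reads as an elasticity: in the slide's model, a slope of about 0.91 means a 1% larger budget goes with roughly a 0.91% larger international gross.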
Now it is your turn!
Interaction Tools
Omayma Said, Data Scientist
Interaction Tools
Interaction Tools (Default)

figure(tools = c("pan", "wheel_zoom", "box_zoom", "reset", "save", "help"),
       toolbar_location = "right")
Interaction Tools (All)

tools: "pan", "wheel_zoom", "box_zoom", "resize", "crosshair", "box_select", "lasso_select", "reset", "save", "help"
toolbar_location: 'above', 'below', 'left', 'right', NULL
Interaction Tools (Custom)

figure(tools = c("pan", "wheel_zoom", "box_zoom"),
       toolbar_location = "above",
       legend_location = "bottom_right",
       ylim = c(0, 100)) %>%
  ly_points(x = gdpPercap, y = lifeExp, data = gapminder_2002,
            color = continent, size = 6, alpha = 0.7)
Saving rbokeh Figures

plot_scatter <- figure(title = "Life Expectancy Vs. GDP per Capita in 2002",
                       legend_location = "bottom_right") %>%
  ly_points(x = gdpPercap, y = lifeExp, data = gapminder_2002)

png
## save figure as png
widget2png(p = plot_scatter, file = "plot_scatter.png")

html
## save figure as html
rbokeh2html(fig = plot_scatter, file = "plot_scatter_interactive.html")
## open saved html
browseURL("plot_scatter_interactive.html")
Time to Practice