TrelliscopeJS: Modern Approaches to Data Exploration with Trellis Display
Ryan Hafen
Hafen Consulting, LLC; Purdue University
@hafenstats
http://bit.ly/trelliscopejs1
All examples in this talk are reproducible after installing and loading the following packages:

    install.packages(c("tidyverse", "gapminder", "rbokeh", "visNetwork", "plotly"))
    devtools::install_github("hafen/trelliscopejs")

    library(tidyverse)
    library(gapminder)
    library(rbokeh)
    library(visNetwork)
    library(trelliscopejs)
TrelliscopeJS is an htmlwidget
TrelliscopeJS is a layout engine for collections of htmlwidgets
TrelliscopeJS is a framework for creating interactive displays of small multiples
Small Multiples
A series of similar plots, usually each based on a different slice of data, arranged in a grid.
This idea was formalized and popularized in S/S-PLUS, and subsequently in R, with the trellis and lattice packages.
"For a wide range of problems in data presentation, small multiples are the best design solution." Edward Tufte (Envisioning Information)
Advantages of Small Multiple Displays
Avoid overplotting.
Work with big or high-dimensional data.
Seeing multiple things at once is often critical to discovering a new insight.
Our brains are good at perceiving simple visual features like color, shape, or size, and they do it amazingly fast without any conscious effort.
We can tell immediately when a part of an image is different from the rest, without really having to focus on it.
In my experience, small multiples are much more effective than flashier things like animation, linked brushing, or custom interactive visualizations.
Trelliscope: Interactive Small Multiple Display
Small multiple displays are useful when visualizing data in detail.
But the number of panels in a display can be very large, too large to view all at once.
It can also be difficult to specify a meaningful order in which panels are displayed.
Trelliscope is a general solution that allows small multiple displays to come alive by providing the ability to interactively sort and filter the panels based on summary statistics, called cognostics, that are automatically computed for each panel.
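As a rough illustration of what per-panel summary statistics might look like, the sketch below computes a few values for each country in the gapminder data with dplyr. The statistics chosen and the names country_cogs, mean_lifeexp, min_lifeexp, and lifeexp_change are assumptions made for this example; they are not necessarily the cognostics Trelliscope computes automatically.

```r
library(dplyr)
library(gapminder)

# One row per would-be panel (country); each extra column is a candidate
# cognostic. The statistics chosen here are illustrative assumptions.
country_cogs <- gapminder %>%
  group_by(country, continent) %>%
  summarise(
    mean_lifeexp   = mean(lifeExp),
    min_lifeexp    = min(lifeExp),
    # change in life expectancy between the first and last observed year
    lifeexp_change = lifeExp[which.max(year)] - lifeExp[which.min(year)],
    .groups = "drop"
  )

# Filtering and sorting on these columns is a static stand-in for what
# an interactive viewer lets you do with the panels
country_cogs %>%
  filter(continent == "Africa") %>%
  arrange(lifeexp_change) %>%
  head()
```

Each row corresponds to one panel, so sorting or filtering the rows amounts to sorting or filtering the panels.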
TrelliscopeJS

JavaScript Library: trelliscopejs-lib
    Built using React
    Pure JavaScript
    Interface agnostic

R Package: trelliscopejs
    htmlwidget interface to trelliscopejs-lib
    Evolved from CRAN "trelliscope" package (part of DeltaRho project)
Gapminder Example
https://www.gapminder.org/

    glimpse(gapminder)

    Observations: 1,704
    Variables: 6
    $ country   <fctr> Afghanistan, Afghanistan, Afghanistan, Afghanistan, Afgh...
    $ continent <fctr> Asia, Asia, Asia, Asia, Asia, Asia, Asia, Asia, Asia, As...
    $ year      <int> 1952, 1957, 1962, 1967, 1972, 1977, 1982, 1987, 1992, 199...
    $ lifeExp   <dbl> 28.801, 30.332, 31.997, 34.020, 36.088, 38.438, 39.854, 4...
    $ pop       <int> 8425333, 9240934, 10267083, 11537966, 13079460, 14880372,...
    $ gdpPercap <dbl> 779.4453, 820.8530, 853.1007, 836.1971, 739.9811, 786.113...

Suppose we want to understand mortality over time for each country.
    qplot(year, lifeExp, data = gapminder, color = country, geom = "line")

Yikes! There are a lot of countries...
    qplot(year, lifeExp, data = gapminder, color = continent, group = country, geom = "line")

I can't see what's going on...
    qplot(year, lifeExp, data = gapminder, color = continent, group = country, geom = "line") +
      facet_wrap(~ continent, nrow = 1)

That helped a little...
    p <- qplot(year, lifeExp, data = gapminder, color = continent, group = country, geom = "line") +
      facet_wrap(~ continent, nrow = 1)
    plotly::ggplotly(p)

This helps, but there is still too much overplotting (and hovering for additional info is too much work, and we can only see more info for one line at a time).
    qplot(year, lifeExp, data = gapminder) +
      xlim(1948, 2011) + ylim(10, 95) + theme_bw() +
      facet_wrap(~ country + continent)
From ggplot2 Faceting to Trelliscope
Turning a ggplot2 faceted display into a Trelliscope display is as easy as changing
    facet_wrap() or facet_grid()
to
    facet_trelliscope()
    qplot(year, lifeExp, data = gapminder) +
      xlim(1948, 2011) + ylim(10, 95) + theme_bw() +
      facet_trelliscope(~ country + continent, nrow = 2, ncol = 7, width = 300)
    qplot(year, lifeExp, data = gapminder) +
      xlim(1948, 2011) + ylim(10, 95) + theme_bw() +
      facet_trelliscope(~ country + continent, nrow = 2, ncol = 7, width = 300, as_plotly = TRUE)
Plotting in the Tidyverse
Gapminder Example from "R for Data Science"

    country_model <- function(df)
      lm(lifeExp ~ year, data = df)

    by_country <- gapminder %>%
      group_by(country, continent) %>%
      nest() %>%
      mutate(
        model = map(data, country_model),
        resid_mad = map_dbl(model, function(x) mad(resid(x))))

    by_country

The result has one row per group, with the per-group data and models stored as "list-columns". Printing by_country shows a tibble with 142 rows and 5 columns: country <fctr>, continent <fctr>, data <list> (each element a 12 × 4 tibble), model <list> (each element an S3 lm object), and resid_mad <dbl>. The first ten rows are Afghanistan (Asia), Albania (Europe), Algeria (Africa), Angola (Africa), Argentina (Americas), Australia (Oceania), Austria (Europe), Bahrain (Asia), Bangladesh (Asia), and Belgium (Europe), followed by 132 more rows.
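To make the list-column idea concrete, here is a minimal sketch that rebuilds the nested table and pulls out the contents of its first row. It assumes only the tidyverse and gapminder packages; the names first_data and first_model are introduced here for illustration, and ungroup() is added so the result behaves the same across tidyr versions.

```r
library(tidyverse)
library(gapminder)

country_model <- function(df) lm(lifeExp ~ year, data = df)

by_country <- gapminder %>%
  group_by(country, continent) %>%
  nest() %>%
  ungroup() %>%
  mutate(model = map(data, country_model))

# Each element of a list-column is a complete R object
first_data  <- by_country$data[[1]]   # tibble of one country's observations
first_model <- by_country$model[[1]]  # the lm fit for that same country

dim(first_data)
coef(first_model)
```

Because the data and the model live in the same row, any function applied across rows always sees a matching pair.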
Plotting the Fit for Each Country Excerpt from "R for Data Science"
Plotting the Data and Model Fit for a Group
We'll use the rbokeh package to make a plot function and apply it to the first row of our data.

    country_plot <- function(data, model) {
      figure(xlim = c(1948, 2011), ylim = c(10, 95), tools = NULL) %>%
        ly_points(year, lifeExp, data = data, hover = data) %>%
        ly_abline(model)
    }

    country_plot(by_country$data[[1]], by_country$model[[1]])
Let's Apply This Function to Every Row!

    by_country <- by_country %>%
      mutate(plot = map2_plot(data, model, country_plot))

    by_country

by_country is now a tibble with 142 rows and 6 columns: country <fctr>, continent <fctr>, data <list>, model <list>, resid_mad <dbl>, and the new plot <list> column, where each element is an S3 rbokeh object.

Plots as list-columns!!!
    by_country %>%
      trelliscope(name = "by_country_lm", nrow = 2, ncol = 4)
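The value of carrying a summary such as resid_mad alongside each plot is that it can be used to order the panels. As a static stand-in for that kind of sorting, the sketch below ranks countries by resid_mad so the poorest linear fits come first. It assumes only the tidyverse and gapminder packages, does not call trelliscopejs, and the name worst_fits is introduced here for illustration.

```r
library(tidyverse)
library(gapminder)

country_model <- function(df) lm(lifeExp ~ year, data = df)

worst_fits <- gapminder %>%
  group_by(country, continent) %>%
  nest() %>%
  ungroup() %>%
  mutate(
    model = map(data, country_model),
    # median absolute deviation of the residuals, as in the slides
    resid_mad = map_dbl(model, function(x) mad(resid(x)))
  ) %>%
  select(country, continent, resid_mad) %>%
  arrange(desc(resid_mad))

# Countries whose life expectancy departs most from a straight-line trend
head(worst_fits, 8)
```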