BootcampR AN INTRODUCTION TO R Jason A. Heppler, PhD University of Nebraska at Omaha March 10, 2020 @jaheppler
Hi. I'm Jason. I like to gesture at screens.
Digital Engagement Librarian, University of Nebraska at Omaha
Mentor, Mozilla Open Leaders
Researcher, Humanities+Design, Stanford University
Schedule
March 17: 1:30-3 Making Maps in CL 112
March 31: 1:30-3 Clustering and Classifying in CL 112
Today's plan
• Introduction to networks
• Intro to ggraph and tidygraph
• Hands-on!
Open up RStudio. We'll start doing a few things together soon.
"The bad news is that whenever you learn a new skill you’re going to suck. It’s going to be frustrating. The good news is that is typical and happens to everyone and it is only temporary. You can’t go from knowing nothing to becoming an expert without going through a period of great frustration and great suckiness." -- Hadley Wickham
Networks
Kabbalistic tree of life Athanasius Kircher Oedipus Ægyptiacus, 1652-55
Tree of Life Charles Darwin On the Origin of Species by Means of Natural Selection, 1859
Robust Action and the Rise of the Medici, 1400-1434 John Padgett and Christopher Ansell American Journal of Sociology 98:6 (May 1993)
O Say Can You See: Early Washington, D.C., Law & Family William G. Thomas III, Jennifer Guiliano, Trevor Muñoz http://earlywashingtondc.org/
ORBIS: The Stanford Geospatial Network Model of the Roman World Walter Scheidel and Elijah Meeks http://orbis.stanford.edu/
Shakespeare Tragedy Martin Grandjean http://www.martingrandjean.ch
What is a network?
Network = graph
Humanities scholars call these networks; mathematicians and network scientists call them graphs. Graphs are defined as
• a set of nodes (or vertices), and
• a set of edges (or links) that connect nodes
Nodes and edges
What do nodes and edges mean?
Nodes          Edges
People         Letters
People         Membership
Publications   Citations
Cities         Railways
Cities         Imports/exports
Directed vs. undirected
Directed networks have asymmetrical relationships: each edge runs from one node to another and is usually drawn as an arrow. Undirected networks, by contrast, have symmetrical relationships: an edge simply connects two nodes, with no direction.
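One way to see the difference is in an adjacency matrix. The sketch below is a minimal base R illustration on a made-up three-node example (the node names and edges are assumptions, not workshop data): a directed network gives an asymmetric matrix, and dropping direction makes it symmetric.

```r
# Hypothetical toy network: A -> B and B -> C.
nodes <- c("A", "B", "C")
edges <- data.frame(
  from = c("A", "B"),
  to   = c("B", "C"),
  stringsAsFactors = FALSE
)

# Directed: a 1 in row i, column j means an edge from i to j.
directed <- matrix(0, nrow = 3, ncol = 3, dimnames = list(nodes, nodes))
directed[cbind(edges$from, edges$to)] <- 1

# Undirected: an edge counts in both directions.
undirected <- pmax(directed, t(directed))

isSymmetric(directed)    # FALSE: A -> B does not imply B -> A
isSymmetric(undirected)  # TRUE
```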
What does network data look like?
Nodes table
names  size  color
A      12    red
B      10    blue
C      4     red
D      18    red
E      15    blue
What does network data look like?
Edges table
source  target  weight
A       B       3
B       C       6
C       D       9
C       A       4
D       B       4
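The two tables can be typed into R as data frames and combined into a graph object. This is a sketch rather than the workshop's own script: it assumes the tidygraph package is installed, and it renames the columns to `name`, `from`, and `to`, which is what `tbl_graph()` looks for when edges refer to nodes by name. Treating the edges as directed is also an assumption.

```r
library(tidygraph)

# Nodes table: one row per node, plus its attributes.
nodes <- data.frame(
  name  = c("A", "B", "C", "D", "E"),
  size  = c(12, 10, 4, 18, 15),
  color = c("red", "blue", "red", "red", "blue"),
  stringsAsFactors = FALSE
)

# Edges table: one row per edge (source/target renamed to from/to).
edges <- data.frame(
  from   = c("A", "B", "C", "C", "D"),
  to     = c("B", "C", "D", "A", "B"),
  weight = c(3, 6, 9, 4, 4),
  stringsAsFactors = FALSE
)

# Combine both tables into a single graph object.
graph <- tbl_graph(nodes = nodes, edges = edges, directed = TRUE)
graph
```

Printing `graph` shows the node data and the edge data side by side. Node E appears in the graph even though no edge touches it, which is one reason to keep a separate nodes table.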
How do we interpret networks?
Spaghetti plot
Problems
• Networks are often incomplete (e.g., ego networks)
• Networks are difficult to visualize
• Networks are hard to scale
• Layouts are imposed, not inherent: graphs can be topologically identical yet be laid out in entirely different ways
These graphs are the same
...so are these
Network measures
Degree measures
• Degree: how many edges does a node have?
Network measures - Degree
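Degree can be counted by hand from an edge table. The base R sketch below uses the example edges table from earlier in the deck, ignores edge direction, and lists the node names explicitly so that isolated nodes are included with a degree of zero.

```r
# Example edge table (source/target columns as on the slide).
edges <- data.frame(
  source = c("A", "B", "C", "C", "D"),
  target = c("B", "C", "D", "A", "B"),
  stringsAsFactors = FALSE
)
node_names <- c("A", "B", "C", "D", "E")

# Every appearance of a node at either end of an edge adds 1 to its degree.
endpoints <- factor(c(edges$source, edges$target), levels = node_names)
degree <- table(endpoints)
degree
# A: 2, B: 3, C: 3, D: 2, E: 0
```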
Network measures
Degree measures
• Degree: how many edges does a node have?
• Strength/weighted degree: degree taking into account weights of edges
Centrality measures
• Betweenness centrality: how often a node sits on the shortest paths between other nodes; high scores mark likely hubs or bridges
• Closeness centrality: how near a node is to every other node; high scores mark the center of the graph
• Eigenvector centrality: nodes connected to other central nodes (e.g., PageRank)
Centrality A: Betweenness centrality B: Closeness centrality C: Eigenvector centrality D: Degree centrality E: Harmonic centrality F: Katz centrality
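Network measures
Degree measures
• Degree: how many edges does a node have?
• Strength/weighted degree: degree taking into account weights of edges
Centrality measures
• Betweenness centrality: how often a node sits on the shortest paths between other nodes; high scores mark likely hubs or bridges
• Closeness centrality: how near a node is to every other node; high scores mark the center of the graph
• Eigenvector centrality: nodes connected to other central nodes (e.g., PageRank)
Community
• Modularity/community: groups of nodes that are more densely connected to each other than to the rest of the graph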
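In tidygraph, each of these measures is a function used inside `mutate()`. The sketch below assumes tidygraph and dplyr are installed and reuses the example edges table, treated as undirected. Louvain is just one of several community detection algorithms tidygraph wraps, chosen here for illustration.

```r
library(tidygraph)
library(dplyr)

# Example edge table from earlier, with columns named from/to.
edges <- data.frame(
  from   = c("A", "B", "C", "C", "D"),
  to     = c("B", "C", "D", "A", "B"),
  weight = c(3, 6, 9, 4, 4),
  stringsAsFactors = FALSE
)

# Build an undirected graph straight from the edge list.
graph <- as_tbl_graph(edges, directed = FALSE)

# Add one column per measure to the node table.
measures <- graph %>%
  activate(nodes) %>%
  mutate(
    degree      = centrality_degree(),
    strength    = centrality_degree(weights = weight),
    betweenness = centrality_betweenness(),
    closeness   = centrality_closeness(),
    eigen       = centrality_eigen(),
    community   = group_louvain()
  ) %>%
  as_tibble()

measures
```

Because the measures are ordinary columns, they can be sorted, filtered, or mapped to node size and color like any other variable.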
Network measures - modularity
Bipartite Networks
Bipartite networks
Most basic networks support only one type of node. Think of a network of students who are connected when they take the same course: it cannot hold both students and courses as nodes in the same network. Network theory assumes that the nodes in a network are all of the same type.
Bipartite networks
Bipartite (bimodal) networks support two node types, but edges may only connect nodes of different types, never nodes of the same type.
• Bipartite networks have two kinds of nodes
• Bipartite networks can be projected into unipartite networks with only one type of node
• Each bipartite network will have two projections, one for each type of node
Bipartite networks
Bipartite projected to students
Bipartite projected to courses
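Both projections can be computed with plain matrix multiplication. The base R sketch below uses made-up enrollment data (the student and course names are hypothetical): multiplying the student-by-course incidence matrix by its transpose counts shared courses between students, and the reverse product counts shared students between courses.

```r
# Hypothetical enrollment data: which student takes which course.
enrollment <- data.frame(
  student = c("Ana", "Ana", "Ben", "Ben", "Cal"),
  course  = c("History", "Math", "Math", "Art", "Art"),
  stringsAsFactors = FALSE
)

# Incidence matrix: rows are students, columns are courses.
incidence <- unclass(table(enrollment$student, enrollment$course))

# Projection onto students: entry [i, j] = number of courses i and j share.
students <- incidence %*% t(incidence)
diag(students) <- 0  # drop self-connections

# Projection onto courses: entry [i, j] = number of students i and j share.
courses <- t(incidence) %*% incidence
diag(courses) <- 0

students
courses
```

Here Ana and Ben are linked through Math and Ben and Cal through Art, while Ana and Cal share no course and stay unconnected in the student projection.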
Networks in R
What's hard about networks in R?
• It's a completely different data concept
• It's kind of messy and very untidy
• It makes impressive-looking plots
• It has its own semantics and algorithms
The network workflow
The network workflow
• Import: readr
• Convert to graph object: tidygraph
• Transform: tidygraph
• Visualize: ggraph
• Model
• Communicate
tidygraph
• An adaptation and extension of dplyr verbs for working with network data
• Tidyfication of (almost) all algorithms provided by igraph
• Unified API for all relational data structures
• igraph underneath
ggraph
• An adaptation of ggplot2 to relational data -- not just node-link diagrams
• Layouts: everything from igraph and more
• Dedicated geoms for nodes and edges
• New facets, guides, and themes
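A plot in ggraph is built in layers, the same way as in ggplot2: pick a layout, then add edge and node geoms. The sketch below assumes tidygraph and ggraph are installed and uses the example edges table. The layout choice (`"fr"`, Fruchterman-Reingold), the sizes, and the font family are arbitrary illustrative settings, not the workshop's.

```r
library(tidygraph)
library(ggraph)

# Example edge table, converted to a graph object.
edges <- data.frame(
  from   = c("A", "B", "C", "C", "D"),
  to     = c("B", "C", "D", "A", "B"),
  weight = c(3, 6, 9, 4, 4),
  stringsAsFactors = FALSE
)
graph <- as_tbl_graph(edges, directed = FALSE)

# Force-directed layouts start from random positions; fix the seed to reproduce.
set.seed(2020)

p <- ggraph(graph, layout = "fr") +
  geom_edge_link(aes(width = weight), alpha = 0.6) +  # edges, width by weight
  geom_node_point(size = 6) +                         # nodes
  geom_node_text(aes(label = name), repel = TRUE) +   # node labels
  theme_graph(base_family = "sans")                   # blank network theme

p
```

Swapping `layout = "fr"` for another layout such as `"circle"` redraws the same graph with different node positions, which is a quick way to see that layouts are imposed rather than inherent.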
Let's make this graph together
Let's make this graph together https://tinyurl.com/unogot
Questions? Troubleshooting?
Next workshop: March 17, 1:30p-3p: Making Maps (CL 112)