  1. BootcampR: An Introduction to R. Jason A. Heppler, PhD, University of Nebraska at Omaha. March 10, 2020. @jaheppler

  2. Hi. I'm Jason. I like to gesture at screens. Digital Engagement Librarian, University of Nebraska at Omaha; Mentor, Mozilla Open Leaders; Researcher, Humanities+Design, Stanford University

  3. Schedule • March 17, 1:30-3: Making Maps in CL 112 • March 31, 1:30-3: Clustering and Classifying in CL 112

  4. Today's plan • Introduction to networks • Intro to ggraph and tidygraph • Hands-on! Open up RStudio. We'll start doing a few things together soon.

  5. "The bad news is that whenever you learn a new skill you’re going to suck. It’s going to be frustrating. The good news is that is typical and happens to everyone and it is only temporary. You can’t go from knowing nothing to becoming an expert without going through a period of great frustration and great suckiness." -- Hadley Wickham

  6. Networks

  7. Kabbalistic tree of life. Athanasius Kircher, Oedipus Ægyptiacus, 1652-55

  8. Tree of Life. Charles Darwin, On the Origin of Species by Means of Natural Selection, 1859

  9. Robust Action and the Rise of the Medici, 1400-1434. John Padgett and Christopher Ansell, American Journal of Sociology 98:6 (May 1993)

  10. O Say Can You See: Early Washington, D.C., Law & Family. William G. Thomas III, Jennifer Guiliano, Trevor Muñoz. http://earlywashingtondc.org/

  11. ORBIS: The Stanford Geospatial Network Model of the Roman World. Walter Scheidel and Elijah Meeks. http://orbis.stanford.edu/

  12. Shakespeare Tragedy. Martin Grandjean. http://www.martingrandjean.ch

  13. What is a network?

  14. Network = graph. Humanities scholars call these networks; mathematicians and network scientists call them graphs. A graph is defined as • a set of nodes (or vertices), and • a set of edges (or links) that connect nodes

  15. Nodes and edges

  16. What do nodes and edges mean?
       Nodes          Edges
       People         Letters
       People         Membership
       Publications   Citations
       Cities         Railways
       Cities         Imports/exports

  17. Directed vs. undirected. Directed networks have asymmetrical relationships: each edge runs from one node to another and is usually drawn as an arrow (or a double-headed arrow when the relationship runs both ways). Undirected networks have symmetrical relationships, so their edges carry no direction.
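To make the distinction concrete, here is a minimal base R sketch. The three nodes and two edges are hypothetical and chosen only for illustration. A directed network produces an asymmetric adjacency matrix; ignoring direction makes the matrix symmetric.

```r
# Hypothetical example: two directed edges, A -> B and B -> C.
node_names <- c("A", "B", "C")
from <- c("A", "B")
to   <- c("B", "C")

# Directed adjacency matrix: row = sender, column = receiver.
directed <- matrix(0, nrow = 3, ncol = 3,
                   dimnames = list(node_names, node_names))
directed[cbind(from, to)] <- 1

# Undirected version: an edge in either direction counts in both.
undirected <- pmax(directed, t(directed))

isSymmetric(directed)    # FALSE: A -> B exists, but B -> A does not
isSymmetric(undirected)  # TRUE
```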

  18. What does network data look like? Nodes table:
       names   size   color
       A       12     red
       B       10     blue
       C       4      red
       D       18     red
       E       15     blue

  19. What does network data look like? Edges table:
       source   target   weight
       A        B        3
       B        C        6
       C        D        9
       C        A        4
       D        B        4
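As a rough sketch, the two tables above could be typed into R as data frames and combined into a graph object with tidygraph. The column names `name`, `from`, and `to` are an assumption made here to match what `tbl_graph()` looks for by default; the slides label them `names`, `source`, and `target`. Treating the edges as directed is also an assumption.

```r
library(tidygraph)

# Nodes table from the slide (column renamed from "names" to "name").
nodes <- data.frame(
  name  = c("A", "B", "C", "D", "E"),
  size  = c(12, 10, 4, 18, 15),
  color = c("red", "blue", "red", "red", "blue"),
  stringsAsFactors = FALSE
)

# Edges table from the slide (columns renamed from source/target to from/to).
edges <- data.frame(
  from   = c("A", "B", "C", "C", "D"),
  to     = c("B", "C", "D", "A", "B"),
  weight = c(3, 6, 9, 4, 4),
  stringsAsFactors = FALSE
)

# Combine both tables into one graph object; edge endpoints are
# matched against the "name" column of the nodes table.
graph <- tbl_graph(nodes = nodes, edges = edges, directed = TRUE)
graph
```

Printing `graph` shows the node data and the edge data together, with edge endpoints stored as row positions in the node table.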

  20. How do we interpret networks?

  21. Spaghetti plot

  22. Problems • Networks are often incomplete (e.g., ego networks) • Networks are difficult to visualize • Networks are hard to scale • Layouts are imposed, not inherent: graphs can be topologically identical yet be laid out entirely differently

  23. These graphs are the same

  24. ...so are these

  25. Network measures Degree measures • Degree: how many edges does a node have?
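Degree can be counted by hand from an edge list. The sketch below uses the edges table from slide 19 and base R only. Splitting the count into in-degree and out-degree assumes the edges are directed, which the slides do not state.

```r
# Edge list from the edges table (source -> target).
from <- c("A", "B", "C", "C", "D")
to   <- c("B", "C", "D", "A", "B")
all_nodes <- c("A", "B", "C", "D", "E")

# Total degree: how often each node appears at either end of an edge.
# Using a factor keeps node E, which has no edges, in the result.
degree     <- table(factor(c(from, to), levels = all_nodes))
out_degree <- table(factor(from, levels = all_nodes))  # edges leaving a node
in_degree  <- table(factor(to,   levels = all_nodes))  # edges arriving at a node

degree
```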

  26. Network measures - Degree

  27. Network measures Degree measures • Degree: how many edges does a node have? • Strength/weighted degree: degree taking into account weights of edges Centrality measures • Betweenness centrality: nodes that sit on many shortest paths between other nodes and so could act as hubs or bridges • Closeness centrality: nodes nearest the center of the graph • Eigenvector centrality: nodes connected to other central nodes (e.g., PageRank)
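One possible way to compute these measures is with igraph, the package tidygraph builds on. This sketch reuses the edges table from slide 19 and treats the graph as undirected, which is an assumption made to keep the numbers easy to check by hand. Passing `weights = NA` tells igraph to ignore the `weight` column, since igraph would otherwise read weights as distances in path-based measures.

```r
library(igraph)

edges <- data.frame(
  from   = c("A", "B", "C", "C", "D"),
  to     = c("B", "C", "D", "A", "B"),
  weight = c(3, 6, 9, 4, 4),
  stringsAsFactors = FALSE
)
g <- graph_from_data_frame(edges, directed = FALSE)

deg <- degree(g)                                # number of edges per node
str <- strength(g)                              # sum of edge weights per node
btw <- betweenness(g, weights = NA)             # share of shortest paths through a node
cls <- closeness(g, weights = NA)               # inverse of total distance to other nodes
eig <- eigen_centrality(g, weights = NA)$vector # connected to well-connected nodes
pr  <- page_rank(g)$vector                      # PageRank, a related measure

data.frame(degree = deg, strength = str, betweenness = btw,
           closeness = cls, eigenvector = eig, pagerank = pr)
```

In this small graph B and C lie between A and D, so they are the only nodes with non-zero betweenness.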

  28. Centrality A: Betweenness centrality B: Closeness centrality C: Eigenvector centrality D: Degree centrality E: Harmonic centrality F: Katz centrality

  29. Network measures Degree measures • Degree: how many edges does a node have? • Strength/weighted degree: degree taking into account weights of edges Centrality measures • Betweenness centrality: nodes that sit on many shortest paths between other nodes and so could act as hubs or bridges • Closeness centrality: nodes nearest the center of the graph • Eigenvector centrality: nodes connected to other central nodes (e.g., PageRank) Community • Modularity/community: groups of nodes more densely connected to each other than to the rest of the graph
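Community detection can be sketched with igraph as well. The graph below is hypothetical: two triangles joined by a single bridging edge, so the expected groups are obvious. The fast greedy algorithm is one of several options igraph provides and is used here only because it is deterministic; the slides do not say which algorithm the workshop uses.

```r
library(igraph)

# Hypothetical graph: triangle a-b-c, triangle d-e-f, and one bridge c-d.
edges <- data.frame(
  from = c("a", "a", "b", "d", "d", "e", "c"),
  to   = c("b", "c", "c", "e", "f", "f", "d"),
  stringsAsFactors = FALSE
)
g <- graph_from_data_frame(edges, directed = FALSE)

# Greedy modularity optimisation: merge groups while modularity improves.
communities <- cluster_fast_greedy(g)
groups <- membership(communities)   # named vector: node -> community id
groups

# Modularity score of the detected grouping (higher = clearer separation).
modularity(communities)
```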

  30. Network measures - modularity

  31. Bipartite networks

  32. Bipartite networks. Most basic networks support only one type of node: think of connections among students who take courses together. Such a network cannot, however, hold both courses and students as nodes. Network theory generally assumes that all nodes in a network are of the same type.

  33. Bipartite networks. Bipartite (bimodal) networks support two node types, but edges may only connect nodes of different types, never nodes of the same type. • Bipartite networks have two kinds of nodes • Bipartite networks can be projected into unipartite networks with only one type of node • Each bipartite network will have two projections, one for each type of node
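Projection can be illustrated with plain matrix multiplication in base R. The students and courses below are hypothetical. Rows of the incidence matrix are students, columns are courses, and a 1 means the student takes the course. igraph also provides `bipartite_projection()` for graph objects; the matrix form is used here only because it shows the arithmetic.

```r
# Hypothetical enrolment data: 1 = student takes the course.
enrolment <- matrix(
  c(1, 1,   # Ana takes History and Math
    1, 0,   # Ben takes History only
    0, 1),  # Cy takes Math only
  nrow = 3, byrow = TRUE,
  dimnames = list(c("Ana", "Ben", "Cy"), c("History", "Math"))
)

# Projection to students: entry [i, j] counts courses shared by i and j.
students <- enrolment %*% t(enrolment)

# Projection to courses: entry [i, j] counts students shared by i and j.
courses <- t(enrolment) %*% enrolment

students
courses
```

The diagonal of each projection holds a node's own total (courses per student, students per course) and is usually dropped before treating the matrix as a network.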

  34. Bipartite networks

  35. Bipartite projected to students

  36. Bipartite projected to courses

  37. Networks in R

  38. What's hard about networks in R? • It's a completely different data concept • It's kind of messy and very untidy • It makes impressive-looking plots • It has its own semantics and algorithms

  39. The network workflow

  40. The network workflow • Import: readr • Convert to graph object: tidygraph • Transform: tidygraph • Visualize: ggraph • Model • Communicate

  41. tidygraph • An adaptation and extension of dplyr verbs for working with network data • Tidyfication of (almost) all algorithms provided by igraph • Unified API for all relational data structures • igraph underneath
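A short sketch of what the dplyr-style verbs look like in practice, using the node and edge tables from slides 18 and 19. The column names `name`, `from`, and `to` and the choice of a directed graph are assumptions. `activate()` selects whether the verbs that follow apply to the node table or the edge table.

```r
library(tidygraph)
library(dplyr)

nodes <- data.frame(
  name = c("A", "B", "C", "D", "E"),
  size = c(12, 10, 4, 18, 15),
  stringsAsFactors = FALSE
)
edges <- data.frame(
  from   = c("A", "B", "C", "C", "D"),
  to     = c("B", "C", "D", "A", "B"),
  weight = c(3, 6, 9, 4, 4),
  stringsAsFactors = FALSE
)
graph <- tbl_graph(nodes = nodes, edges = edges, directed = TRUE)

result <- graph %>%
  activate(nodes) %>%                                  # work on the node table
  mutate(degree = centrality_degree(mode = "all")) %>% # add a degree column
  activate(edges) %>%                                  # switch to the edge table
  filter(weight > 3)                                   # keep the heavier edges

# Pull each table back out as an ordinary tibble.
node_table <- result %>% activate(nodes) %>% as_tibble()
edge_table <- result %>% activate(edges) %>% as_tibble()
node_table
```

Because degree is computed before the edge filter runs, it reflects the full graph, not the filtered one.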

  42. ggraph • An adaptation of relational data to ggplot2 -- not just node-link diagrams • Layouts: everything from igraph and more • Dedicated geoms for nodes and edges • New facets, guides, and themes
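A minimal plotting sketch, again built on the tables from slides 18 and 19. This is not the graph built in the hands-on portion of the workshop. The layout choice (`"fr"`, a force-directed layout from igraph), the aesthetic mappings, and the `base_family = "sans"` font setting are all assumptions made for illustration.

```r
library(tidygraph)
library(ggraph)

nodes <- data.frame(
  name  = c("A", "B", "C", "D", "E"),
  size  = c(12, 10, 4, 18, 15),
  color = c("red", "blue", "red", "red", "blue"),
  stringsAsFactors = FALSE
)
edges <- data.frame(
  from   = c("A", "B", "C", "C", "D"),
  to     = c("B", "C", "D", "A", "B"),
  weight = c(3, 6, 9, 4, 4),
  stringsAsFactors = FALSE
)
graph <- tbl_graph(nodes = nodes, edges = edges, directed = TRUE)

set.seed(1)  # force-directed layouts start from random positions
p <- ggraph(graph, layout = "fr") +
  geom_edge_link(aes(width = weight), alpha = 0.5) +   # edge geom
  geom_node_point(aes(size = size, colour = color)) +  # node geom
  geom_node_text(aes(label = name), repel = TRUE) +    # node labels
  scale_colour_identity() +        # use the colour names in the data as-is
  theme_graph(base_family = "sans")

p
```

Swapping `layout = "fr"` for another layout name changes only where nodes are drawn, not the underlying graph.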

  43. Let's make this graph together

  44. Let's make this graph together https://tinyurl.com/unogot

  45. Questions? Troubleshooting? Next workshop: March 17, 1:30p-3p: Making Maps (CL 112)
