csci 210: Data Structures Graphs and social networks
Social networks • Active area of research • motivated in part by growing public fascination with the complex “connectedness” of modern society • Social networks have grown in complexity • technological advances, global travel, global communication • Networks come up in many other contexts • information networks • economic networks • trade networks • biological networks • Necessary to study and understand the structure and behavior of complex networks
Networks • How should we think about networks? • Structurally, networks are graphs • we can find structural properties of the underlying graph • But structure is not enough. We also need a framework for reasoning about behavior and interaction in complex networks. • Traditionally, networks have been studied across many disciplines • sociology, economics, ... • now coming together with computer science, math, physics. • From a computer science point of view, we are primarily interested in • the structural properties of the network • modeling networks as classes of graphs and studying their properties
Graphs and social networks • Paths, cycles, connectivity • Are two people connected? • Are any two people connected? • Are there cycles of friends? • What is the size of the largest group of friends? • Distances • What is the distance of person A to person B? • What is the average distance between two people in a network? • What is the diameter of the network? • Centrality • What is the center of the network? • Clustering • What are the underlying clusters in the network?
Graphs and social networks • Paths, cycles, connectivity • Traverse the graph (BFS or DFS) • Time: O(V+E)
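The connectivity questions can be answered with a single traversal. Below is a minimal sketch using BFS, assuming the network is stored as an adjacency-list dictionary of an undirected graph; the names `friends`, `bfs_reachable` and `connected_components` are made up for this example.

```python
from collections import deque

def bfs_reachable(adj, source):
    """Return the set of vertices reachable from source (BFS, O(V+E))."""
    seen = {source}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return seen

def connected_components(adj):
    """Return a list of vertex sets, one per connected component."""
    seen = set()
    comps = []
    for s in adj:
        if s not in seen:
            comp = bfs_reachable(adj, s)
            seen |= comp
            comps.append(comp)
    return comps

# Hypothetical friendship network: one triangle and one separate pair.
friends = {
    "ann": ["bob", "cat"],
    "bob": ["ann", "cat"],
    "cat": ["ann", "bob"],
    "dan": ["eve"],
    "eve": ["dan"],
}

# Are two people connected?
ann_knows_cat = "cat" in bfs_reachable(friends, "ann")
# Are any two people connected? (true only if there is one component)
groups = connected_components(friends)
everyone_connected = len(groups) == 1
# Size of the largest group of friends.
largest_group = max(len(g) for g in groups)
```

Each vertex is enqueued once and each adjacency list is scanned once, which is where the O(V+E) bound comes from.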
Graphs and social networks • Distances • Compute the distance between two vertices in a graph • BFS (from one vertex until reaching the other) • Time: O(V+E) • Compute the average distance between two vertices in a graph • BFS from all vertices in G and record all distances • Time: O(V(V+E)) = O(V^2 + VE) • Compute the diameter of a network • diameter = the length of the longest shortest path between two vertices. • Meaning: gives an idea of the time it takes to spread something over the network. • To compute: • need to compute shortest paths between all pairs of vertices • run BFS from every vertex as source and record pairwise distances • this takes O(V(V+E)) time and O(V^2) space.
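The distance computations above can be sketched as follows, assuming an unweighted, undirected graph stored as an adjacency-list dictionary. The function names and the example graph `path` are illustrative. This sketch only averages over pairs that are actually reachable from each other, which is an assumption the slides do not spell out.

```python
from collections import deque

def bfs_distances(adj, source):
    """Return {vertex: distance from source} for all reachable vertices."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def average_distance_and_diameter(adj):
    """Run BFS from every vertex: O(V(V+E)) time overall."""
    total = 0
    pairs = 0
    diameter = 0
    for s in adj:
        for t, d in bfs_distances(adj, s).items():
            if t != s:
                total += d          # every ordered pair (s, t) is seen once
                pairs += 1
                diameter = max(diameter, d)
    average = total / pairs if pairs else 0.0
    return average, diameter

# Hypothetical example: the path a - b - c - d.
path = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
average, diameter = average_distance_and_diameter(path)
```

Because the running totals are updated one source at a time, this version needs only O(V) extra space; storing the full table of pairwise distances is what costs O(V^2) space.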
Graphs and social networks • Centrality • Degree centrality • degree(v) = the number of edges incident to v • the center is the node with the largest degree • time to compute: ? • Closeness centrality • closeness(v) = the average path length from v to all vertices that are reachable from v • the center is the node with the lowest closeness • time to compute: ? • Betweenness centrality • idea: vertices that occur on many shortest paths have higher betweenness than those that do not. • time to compute: ? • Various other measures of centrality, depending on the specific application
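As an illustration of the first two measures, here is a small sketch of degree and closeness centrality as defined above, assuming an undirected adjacency-list dictionary. The names `star`, `degree_centrality`, `closeness` and the two `center_by_*` helpers are invented for this example. Giving an isolated vertex a closeness of infinity is a choice made here, not something the slides specify.

```python
from collections import deque

def degree_centrality(adj):
    """degree(v) = number of edges incident to v."""
    return {v: len(adj[v]) for v in adj}

def closeness(adj, v):
    """Average BFS distance from v to the vertices reachable from v."""
    dist = {v: 0}
    queue = deque([v])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    others = [d for t, d in dist.items() if t != v]
    # Assumption: a vertex that reaches nobody is never chosen as center.
    return sum(others) / len(others) if others else float("inf")

def center_by_degree(adj):
    return max(adj, key=lambda v: len(adj[v]))

def center_by_closeness(adj):
    return min(adj, key=lambda v: closeness(adj, v))

# Hypothetical example: a star with hub "h" and three leaves.
star = {"h": ["a", "b", "c"], "a": ["h"], "b": ["h"], "c": ["h"]}
degrees = degree_centrality(star)
hub_closeness = closeness(star, "h")
leaf_closeness = closeness(star, "a")
```

With adjacency lists, all degrees can be read off in O(V+E) time, while closeness for every vertex needs one BFS per vertex, so O(V(V+E)) in total.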
Clustering (graph partitioning) • Goal: Identify the underlying clusters in a graph • [figure from http://projects.si.umich.edu/netlearn/GUESS/betweennessclust.html]
Clustering (graph partitioning) • Goal: Identify the underlying clusters in a graph • A wide range of methods • in social networks: clustering using betweenness • Betweenness of an edge e: • the total number of times the edge appears on a shortest path between vertices in the graph • How to compute betweenness values for all edges? • Time?
Clustering (graph partitioning) • Goal: Identify the underlying clusters in a graph • A wide range of methods • in social networks: clustering using betweenness • Betweenness of an edge e: • the total number of times the edge appears on a shortest path between vertices in the graph • How to compute betweenness values for all edges? • compute shortest paths between all pairs of vertices, keep track of the edges on the shortest paths, and update the frequency of each edge • Time: • O(V^2 + VE)
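One way to realize this counting within the stated time bound is sketched below. It rests on two assumptions that the slide leaves open: when several shortest paths tie, only the one recorded in the BFS tree is counted, and each pair of vertices is counted once per direction (so totals are over ordered pairs). The function name `edge_path_counts` and the example graph `two_triangles` are made up for this example.

```python
from collections import deque

def edge_path_counts(adj):
    """For each edge, count the BFS-tree shortest paths that use it."""
    counts = {}
    for s in adj:
        # BFS from s, remembering each vertex's parent in the BFS tree.
        parent = {s: None}
        order = []
        queue = deque([s])
        while queue:
            u = queue.popleft()
            order.append(u)
            for w in adj[u]:
                if w not in parent:
                    parent[w] = u
                    queue.append(w)
        # The tree edge (parent[v], v) lies on the path from s to every
        # vertex in v's subtree, so add the subtree size to that edge.
        subtree = {v: 1 for v in order}
        for v in reversed(order):
            p = parent[v]
            if p is not None:
                edge = frozenset((p, v))
                counts[edge] = counts.get(edge, 0) + subtree[v]
                subtree[p] += subtree[v]
    return counts

# Hypothetical example: two triangles joined by the bridge c - d.
two_triangles = {
    "a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"],
    "d": ["c", "e", "f"], "e": ["d", "f"], "f": ["d", "e"],
}
counts = edge_path_counts(two_triangles)
busiest_edge = max(counts, key=counts.get)
```

Processing vertices in reverse BFS order lets each source be handled in O(V+E) time, which gives O(V^2 + VE) over all sources. Walking every shortest path edge by edge would also work but can cost more.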
Clustering (graph partitioning) • Goal: Identify the underlying clusters in a graph • A wide range of methods • in social networks: clustering using betweenness • Betweenness of an edge e: • the total number of times the edge appears on a shortest path between vertices in the graph • Clustering using betweenness: • repeatedly remove the edge of highest betweenness. • [demo]: http://projects.si.umich.edu/netlearn/GUESS/betweennessclust.html
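The removal loop can be sketched as below. Several details are assumptions rather than something the slides state: the graph is simple and undirected, tied shortest paths share credit equally (a common convention for betweenness), betweenness is recomputed after every removal, and the loop stops once a requested number of clusters `k` is reached. All function names are illustrative.

```python
from collections import deque

def edge_betweenness(adj):
    """Edge betweenness; tied shortest paths split their credit equally."""
    bet = {frozenset((u, v)): 0.0 for u in adj for v in adj[u]}
    for s in adj:
        dist = {s: 0}
        sigma = {v: 0 for v in adj}      # number of shortest paths from s
        sigma[s] = 1
        preds = {v: [] for v in adj}     # predecessors on shortest paths
        order, queue = [], deque([s])
        while queue:
            u = queue.popleft()
            order.append(u)
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
                if dist[w] == dist[u] + 1:
                    sigma[w] += sigma[u]
                    preds[w].append(u)
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):        # farthest vertices first
            for u in preds[w]:
                share = sigma[u] / sigma[w] * (1.0 + delta[w])
                bet[frozenset((u, w))] += share
                delta[u] += share
    # Each unordered pair was counted from both endpoints.
    return {e: b / 2.0 for e, b in bet.items()}

def components(adj):
    """Connected components as sorted lists of vertices."""
    seen, comps = set(), []
    for s in adj:
        if s in seen:
            continue
        seen.add(s)
        comp, queue = [s], deque([s])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    comp.append(w)
                    queue.append(w)
        comps.append(sorted(comp))
    return comps

def cluster_by_betweenness(adj, k):
    """Remove the highest-betweenness edge until there are k clusters."""
    adj = {u: set(vs) for u, vs in adj.items()}   # work on a copy
    while len(components(adj)) < k:
        bet = edge_betweenness(adj)
        if not bet:                               # no edges left to remove
            break
        u, v = max(bet, key=bet.get)
        adj[u].discard(v)
        adj[v].discard(u)
    return components(adj)

# Hypothetical example: two triangles joined by the bridge c - d.
two_triangles = {
    "a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"],
    "d": ["c", "e", "f"], "e": ["d", "f"], "f": ["d", "e"],
}
clusters = cluster_by_betweenness(two_triangles, 2)
```

In this example the bridge carries every shortest path between the two triangles, so it is removed first and the two triangles come out as the clusters.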