CS 310 – Advanced Data Structures and Algorithms
Weighted Graphs
Tong Wang, UMass Boston
July 20, 2017
Weighted Graphs
- Each edge has a weight (cost): edge-weighted graphs
- Mostly we consider only positive weights
- Topics: minimum spanning trees, shortest paths
Minimum Spanning Tree
- Given a connected undirected graph, we often want the cheapest way to connect all the nodes
- This is a minimum-cost spanning tree
- A spanning tree is a tree that contains all the nodes in the graph
- The total cost of a spanning tree is the sum of its edge costs
- The minimum spanning tree (MST) is the spanning tree whose total cost is minimal among all spanning trees
- The trees involved here have no particular root node; such trees are called free trees, because they are not tied down by a root
MST Example
(Figure: a point set in the plane with three geometric spanning trees.)
- The middle one is the minimum spanning tree
- The one on the right has shortest paths to the central node
MST Example
(Figure from CLRS, Section 23.1.)
Prim’s Algorithm for MST
- A greedy algorithm: decide what to do next by selecting the immediately best option, without regard to the global structure
- Pseudocode:
  1. Choose an arbitrary vertex s as the first tree node
  2. Add s to PrimTree
  3. While there are still nontree nodes:
     a. Choose the edge (u, v) of minimum weight between a tree node u and a nontree node v
     b. Add (u, v) to TreeEdge and v to PrimTree
Prim’s Algorithm for MST
- Choose an arbitrary node s
- Initialize tree T = {s}
- Initialize edge set E = {}
- Repeat until all nodes are in T:
  - Choose an edge (u, v) with minimum weight such that u is in T and v is not
  - Add v to T and (u, v) to E
- Output (T, E)
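A minimal Python sketch of the pseudocode above. The input representation (a vertex list plus a dict mapping each undirected edge (u, v) to its weight) and all names are illustrative assumptions, not from the slides; the graph is assumed connected.

```python
def prim_naive(vertices, edges):
    """Naive Prim's MST: scan every edge on each iteration.

    vertices: iterable of vertex labels
    edges: dict mapping (u, v) -> weight, each undirected edge listed once
    Returns the list of tree edges as (u, v, weight) triples.
    """
    vertices = list(vertices)
    tree = {vertices[0]}            # start from an arbitrary vertex s
    tree_edges = []
    while len(tree) < len(vertices):
        # Find the cheapest edge with exactly one endpoint in the tree
        best = None
        for (u, v), w in edges.items():
            if (u in tree) != (v in tree):
                if best is None or w < best[0]:
                    best = (w, u, v)
        w, u, v = best
        tree.add(u if v in tree else v)   # add the nontree endpoint
        tree_edges.append((u, v, w))
    return tree_edges
```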
Correctness of Prim’s Algorithm
- Prim’s algorithm creates a spanning tree: no cycle can be introduced by adding edges between tree and nontree vertices
- Why does the tree have minimal weight? Proof by contradiction
- Assume Prim’s algorithm does not return an MST
- Then there is an edge (x, y) chosen by Prim’s that does not belong to the MST
- In the MST, (x, y) is replaced by a different edge (v1, v2): to maintain the spanning property, when we remove an edge we must add another edge
- The cost of (x, y), however, is not greater than the cost of (v1, v2)
- Thus Prim’s tree is still minimal after all
Correctness of Prim’s Algorithm
- If we remove (x, y), a different edge must be added to keep the tree spanning
- In the new spanning tree, let (v1, v2) be the first edge along the path from x to y that is not in the tree generated by Prim’s algorithm
- Cost(x, y) ≤ Cost(v1, v2): Prim’s algorithm picked (x, y) at a point when both edges connected a tree node to a nontree node
- Substitute (x, y) for (v1, v2) in the tree; the total cost does not increase
- Repeat until the tree built by Prim’s algorithm is restored
Naive Implementation of Prim’s Algorithm
- Prim’s algorithm grows the MST in stages, one vertex at a time
- Thus there are n − 1 iterations
- During each iteration, search all edges and find the minimum-weight edge with one vertex in the tree and the other vertex not in the tree
- Thus each iteration takes O(m) time
- O(mn) total time
Better Implementation of Prim’s Algorithm
- Maintain an array Dist that stores, for each nontree node, the distance to the nearest tree node
- Initialize Dist to MAXINT
- When vertex u is added to the tree, loop through its adjacency list and consider each edge (u, v):
  - If v is already a tree node, ignore it
  - If v is not a tree node and Cost(u, v) < Dist[v], set Dist[v] = Cost(u, v) and parent[v] = u
- After updating Dist, find the nontree node with the minimum distance and make it the next tree node
- O(n^2) total time
- Further improvement: use a priority queue, so that the node with the minimum distance can be found in O(log n) time, for a total time of O(m log n)
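A minimal sketch of the priority-queue version using Python's heapq. Instead of decreasing Dist[v] in place as described above, this sketch pushes duplicate heap entries and skips stale ones on extraction (lazy deletion), which gives the same O(m log n) bound. The adjacency-list format graph[u] = [(v, w), ...] is an assumption.

```python
import heapq

def prim_pq(graph, start):
    """Prim's MST with a priority queue (lazy deletion), O(m log n) overall.

    graph: dict mapping each vertex to a list of (neighbor, weight) pairs
    Returns a list of MST edges (tree_vertex, new_vertex, weight);
    assumes the graph is connected.
    """
    in_tree = {start}
    tree_edges = []
    # The heap holds candidate edges as (weight, tree_vertex, nontree_vertex)
    heap = [(w, start, v) for v, w in graph[start]]
    heapq.heapify(heap)
    while heap and len(in_tree) < len(graph):
        w, u, v = heapq.heappop(heap)
        if v in in_tree:
            continue                      # stale entry: v was added earlier
        in_tree.add(v)
        tree_edges.append((u, v, w))
        for x, wx in graph[v]:
            if x not in in_tree:
                heapq.heappush(heap, (wx, v, x))
    return tree_edges
```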
Kruskal’s Algorithm
- This algorithm works like building the Huffman code tree: start with all the nodes separate and then join them incrementally, using a greedy approach that turns out to give the optimal solution
- Set up a partition, a set of sets of nodes, starting with one node in each set
- Then find the minimal edge that joins two different sets, use it as a tree edge, and join the sets
- Each set in the partition is a connected component
Example
(Figures: a step-by-step run of Kruskal’s algorithm on a sample weighted graph with vertices v0–v6; in each panel the next-cheapest edge that connects two different components is added, until the spanning tree is complete.)
Running Example
The component numbers for the first few steps:

    Node           V0  V1  V2  V3  V4  V5  V6
    Initial set     0   1   2   3   4   5   6
    After V0–V3     0   1   2   0   4   5   6   (3 joins 0)
    After V5–V6     0   1   2   0   4   5   5   (6 joins 5)
    After V0–V1     0   0   2   0   4   5   5
    After V2–V3     0   0   0   0   4   5   5
    ... and so on

- Edges are used in cost order
- If an edge has both its endpoints in the same component (equal component numbers), it is skipped
- The resulting tree is defined by the edges used
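A small Python sketch of the array-based component relabeling shown in the table above (the vertex labeling 0..n-1 and the (weight, u, v) edge format are assumptions). Relabeling a whole component costs O(n) per merge, which is fine for illustration but is exactly what the Union-Find structure later replaces.

```python
def kruskal_array(n, edges):
    """Kruskal's MST with simple array-based component labels.

    n: number of vertices, labeled 0..n-1
    edges: list of (weight, u, v) tuples
    """
    component = list(range(n))        # component[i] = label of i's component
    tree_edges = []
    for w, u, v in sorted(edges):     # consider edges in cost order
        if component[u] == component[v]:
            continue                  # same component: edge would form a cycle
        tree_edges.append((u, v, w))
        old, new = component[v], component[u]
        for i in range(n):            # relabel v's whole component (v joins u)
            if component[i] == old:
                component[i] = new
        if len(tree_edges) == n - 1:
            break
    return tree_edges
```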
Correctness of Kruskal’s Algorithm
- The algorithm adds n − 1 edges without creating a cycle, so it must create a spanning tree
- Why does the tree have minimal weight? Proof by contradiction
- Assume Kruskal’s algorithm does not return an MST; there is a real MST T_min
- Then there is an edge (x, y) in the tree built by Kruskal’s that is not in T_min
- Insert (x, y) into T_min; this creates a cycle with the path P from x to y
- When Kruskal’s adds the edge (x, y), x and y are in different components
- That is, one of the edges on path P has not been added yet; let it be (v1, v2)
- This means Cost(x, y) ≤ Cost(v1, v2)
Correctness of Kruskal’s Algorithm
- If we add (x, y) to T_min, a cycle is formed
- The edge (v1, v2) would be processed after (x, y), so Cost(x, y) ≤ Cost(v1, v2)
- By replacing (v1, v2) with (x, y), we do not increase the cost
- Repeat until T_min is transformed into the tree built by Kruskal’s algorithm
Kruskal’s Algorithm
- Note that several edges may be left unprocessed in the edge list; these are the high-cost edges we want to avoid using
- Pseudocode:
  1. Put the edges in a priority queue ordered by edge weight
  2. Initialize the tree edge set T to the empty set
  3. Repeat until T contains n − 1 edges:
     a. Get the next edge (u, v) from the PQ
     b. If component[u] != component[v], add (u, v) to T and merge component[u] and component[v]
- In each iteration, we test the connectivity of two trees plus an edge
- This can be done with BFS or DFS on a sparse graph with at most n edges and n vertices, so each iteration takes O(n) time, for O(mn) total time
- If we can determine the connectivity of two trees plus an edge in O(log n) time, then the total time for Kruskal’s algorithm is O(m log n)
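A minimal Python sketch of this pseudocode. A sorted edge list stands in for the priority queue, and component tests and merges use a small inline union-find (find with path halving); the (weight, u, v) edge format and vertex labels 0..n-1 are assumptions.

```python
def kruskal(n, edge_list):
    """Kruskal's MST: process edges in increasing weight order,
    using a disjoint-set (union-find) structure to test components.

    n: number of vertices, labeled 0..n-1
    edge_list: list of (weight, u, v) tuples
    Returns the list of MST edges; assumes the graph is connected.
    """
    parent = list(range(n))

    def find(x):                            # follow parent pointers to the root
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    tree = []
    for w, u, v in sorted(edge_list):       # sorted list plays the role of the PQ
        ru, rv = find(u), find(v)
        if ru != rv:                        # different components: safe to add
            parent[ru] = rv                 # merge the two components
            tree.append((u, v, w))
            if len(tree) == n - 1:
                break
    return tree
```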
Data Structures for Kruskal’s Algorithm
- Need two operations:
  1. SameComponent(u, v): returns true or false
  2. MergeComponent(C1, C2): returns the merged component
- Best data structure: Union-Find
The Union-Find Data Structure
- Let n be the number of elements in a set
- A partition divides the elements into a collection of disjoint subsets
- Examples: SCCs, vertex coloring, vertex components in Kruskal’s algorithm
- Each subset in the partition is represented by one of its elements
- The operation Find(u) returns the representative of the subset that u belongs to
- The operation Union(u, v) joins the subsets containing u and v into a new subset
The Union-Find Data Structure
- Each subset is represented by a “backwards” tree, where each child node has a pointer to its parent
- The rank of a tree is (without path compression) its height
- Initially, each vertex of the graph is in its own tree, and its parent pointer points to itself
- Example state:

    vertex  1  2  3  4  5  6  7
    parent  1  4  3  4  3  4  2
    rank    0  1  1  2  0  0  0
    size    1  2  2  4  1  1  1

- Find(i): follow the parent pointers upwards until reaching a node that points to itself; return the label of this node as the label of the subset

    find(x)
        if (parent(x) == x) return x
        else return find(parent(x))

- Union(i, j): use the root of the higher-ranked tree as the root of the union – it becomes the parent of the root of the lower-ranked tree
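A minimal Python sketch of this structure with union by rank. Path compression is also included in find (the rank definition above assumes it is off, so with compression rank is only an upper bound on the height); the class and method names are illustrative.

```python
class UnionFind:
    """Disjoint-set forest with union by rank and path compression."""

    def __init__(self, n):
        self.parent = list(range(n))   # each element starts as its own root
        self.rank = [0] * n            # upper bound on tree height
        self.size = [1] * n            # number of elements in each tree

    def find(self, x):
        # Follow parent pointers to the root, compressing the path as we go
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])
        return self.parent[x]

    def same_component(self, u, v):
        return self.find(u) == self.find(v)

    def union(self, u, v):
        ru, rv = self.find(u), self.find(v)
        if ru == rv:
            return ru
        # The root of the higher-ranked tree becomes the parent of the other root
        if self.rank[ru] < self.rank[rv]:
            ru, rv = rv, ru
        self.parent[rv] = ru
        self.size[ru] += self.size[rv]
        if self.rank[ru] == self.rank[rv]:
            self.rank[ru] += 1
        return ru
```

Union by rank alone already bounds each find at O(log n), which is enough for the O(m log n) total quoted on the Kruskal’s slide.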