CSE 326: Data Structures
Disjoint Sets – Union/Find
Hal Perkins
Spring 2007
Lectures 19-21

Disjoint Union - Find
• Maintain a set of pairwise disjoint sets.
  – {3,5,7}, {4,2,8}, {9}, {1,6}
• Required operations
  – Union – merge two sets to create their union (original sets need not be preserved)
  – Find – determine which set a given item appears in (in particular, be able to quickly tell whether two items are in the same set)

Set Representation
• Maintain a set of pairwise disjoint sets.
  – {3,5,7}, {4,2,8}, {9}, {1,6}
• Each set has a unique name, one of its members
  – {3,5,7}, {4,2,8}, {9}, {1,6} (in the examples that follow, the names are 5, 8, 9, and 1)

Union
• Union(x,y) – take the union of two sets named x and y
  – {3,5,7}, {4,2,8}, {9}, {1,6}
  – Union(5,1)
    {3,5,7,1,6}, {4,2,8}, {9}
Find
• Find(x) – return the name of the set containing x.
  – {3,5,7,1,6}, {4,2,8}, {9}
  – Find(1) = 5
  – Find(4) = 8

An Example: Building Mazes
• Build a random maze by erasing edges.

Building Mazes (2)
• Pick Start and End
[Figure: grid of cells with Start and End marked]

Building Mazes (3)
• Repeatedly pick random edges to delete.
[Figure: grid with some edges deleted, Start and End marked]
Desired Properties
• None of the boundary is deleted
• Every cell is reachable from every other cell.
• Only one path from any one cell to another (There are no cycles – no cell can reach itself by a path unless it retraces some part of the path.)

A Cycle
[Figure: maze containing a cycle, Start and End marked]

A Good Solution
[Figure: maze with no cycles, Start and End marked]

A Hidden Tree
[Figure: the same maze with its underlying tree of cells drawn, Start and End marked]
Basic Algorithm
• S = set of sets of connected cells
• E = set of edges not yet examined
• Maze = set of maze edges (initially empty)

  While there is more than one set in S {
    pick a random edge (x,y) and remove from E
    u := Find(x); v := Find(y);
    if u ≠ v then
      // removing edge (x,y) connects previously non-connected
      // cells x and y - leave this edge removed!
      Union(u,v)
    else
      // cells x and y were already connected, add this edge
      // to set of edges that will make up final maze.
      add edge (x,y) to Maze
  }
  All remaining members of E together with Maze form the maze

Number the Cells
We have disjoint sets S = { {1}, {2}, {3}, {4}, … {36} }; each cell is a set by itself.
Also a set of all possible edges E = { (1,2), (1,7), (2,8), (2,3), … }; 60 edges total.

  Start   1   2   3   4   5   6
          7   8   9  10  11  12
         13  14  15  16  17  18
         19  20  21  22  23  24
         25  26  27  28  29  30
         31  32  33  34  35  36   End

Example Step
Pick (8,14)
S = {1,2,7,8,9,13,19}, {3}, {4}, {5}, {6}, {10}, {11,17}, {12}, {14,20,26,27}, {15,16,21}, …, {22,23,24,29,30,32,33,34,35,36}

Example
Find(8) = 7
Find(14) = 20
Union(7,20)
S before: {1,2,7,8,9,13,19}, {3}, {4}, {5}, {6}, {10}, {11,17}, {12}, {14,20,26,27}, {15,16,21}, …, {22,23,24,29,30,32,33,34,35,36}
S after: {1,2,7,8,9,13,19,14,20,26,27}, {3}, {4}, {5}, {6}, {10}, {11,17}, {12}, {15,16,21}, …, {22,23,24,29,30,32,33,34,35,36}
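The Basic Algorithm above can be sketched in C roughly as follows. This is only an illustration under stated assumptions: the names `Edge`, `build_maze`, `find`, and `unite` are hypothetical, the union/find used is the simplest possible parent-array version (0 marks a root), and `rand()` stands in for "pick a random edge".

```c
#include <assert.h>
#include <stdlib.h>

#define ROWS 6
#define COLS 6
#define CELLS (ROWS * COLS)
/* Interior edges of the grid: 60 for the 6x6 example. */
#define MAX_EDGES (ROWS * (COLS - 1) + COLS * (ROWS - 1))

typedef struct { int x, y; } Edge;   /* hypothetical edge type */

int up[CELLS + 1];   /* cells are 1..CELLS; up[c] == 0 means c is a root */

int find(int x) {
    while (up[x] != 0) x = up[x];
    return x;
}

void unite(int u, int v) {           /* u and v must be distinct roots */
    assert(up[u] == 0 && up[v] == 0 && u != v);
    up[v] = u;
}

/* Fills wall[] with the edges that remain in the final maze
   (Maze plus the leftover members of E) and returns how many there are. */
int build_maze(unsigned seed, Edge wall[MAX_EDGES]) {
    Edge e[MAX_EDGES];               /* E: edges not yet examined */
    int n_e = 0, n_wall = 0, sets = CELLS;

    for (int c = 1; c <= CELLS; c++) up[c] = 0;   /* each cell is its own set */
    for (int r = 0; r < ROWS; r++) {
        for (int c = 0; c < COLS; c++) {
            int cell = r * COLS + c + 1;
            if (c + 1 < COLS) e[n_e++] = (Edge){cell, cell + 1};
            if (r + 1 < ROWS) e[n_e++] = (Edge){cell, cell + COLS};
        }
    }

    srand(seed);
    while (sets > 1) {
        int k = rand() % n_e;        /* pick a random edge ... */
        Edge picked = e[k];
        e[k] = e[--n_e];             /* ... and remove it from E */
        int u = find(picked.x), v = find(picked.y);
        if (u != v) {
            unite(u, v);             /* edge stays removed */
            sets--;
        } else {
            wall[n_wall++] = picked; /* already connected: edge goes into Maze */
        }
    }
    for (int k = 0; k < n_e; k++) wall[n_wall++] = e[k];  /* leftover E */
    return n_wall;
}
```

Exactly 35 edges are removed on the 6x6 grid (one per successful Union), so 25 of the 60 edges remain as walls regardless of the seed.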
Example
Pick (19,20)
S = {1,2,7,8,9,13,19,14,20,26,27}, {3}, {4}, {5}, {6}, {10}, {11,17}, {12}, {15,16,21}, …, {22,23,24,29,30,32,33,34,35,36}
Cells 19 and 20 are already in the same set, so Find(19) = Find(20) and edge (19,20) is added to Maze.

Example at the End
S = { {1,2,3,4,5,6,7,… 36} }
[Figure: numbered grid showing the remaining edges of E and Maze, Start and End marked]

Implementing the DS ADT
• n elements, Total Cost of: m finds, ≤ n-1 unions (can there be more unions?)
• Target complexity: O(m + n), i.e. O(1) amortized
• O(1) worst-case for find as well as union would be great, but…
  Known result: both find and union cannot be done in worst-case O(1) time

Attempt #1
• Hash elements to a hashtable
• Store set identifier for each element as data
runtime for find:
runtime for union:
runtime for m finds, n-1 unions:
Attempt #2
• Hash elements to a hashtable
• Store set identifier for each element as data
• Link all elements in the same set together
runtime for find:
runtime for union:
runtime for m finds, n-1 unions:

Attempt #3
• Hash elements to a hashtable
• Store set identifier for each element as data
• Link all elements in the same set together
• Always update identifiers of smaller set
runtime for find:
runtime for union:
runtime for m finds, n-1 unions:

Up-Tree for Disjoint Union/Find [Read section 8.2]
Initial state: 1 2 3 4 5 6 7
After several Unions:
[Figure: up-trees with roots 1, 3, 7; 2 points to 1; 5 and 4 point to 7; 6 points to 5]
Roots are the names of each set.

Find Operation
Find(x) - follow x to the root and return the root
[Figure: the same up-trees, with the path 6, 5, 7 followed]
Find(6) = 7
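One way Attempt #3 might look in code is sketched below. It is an assumption-laden illustration, not the lecture's implementation: plain arrays indexed by element stand in for the hashtable, and the names `set_id`, `next_in_set`, `head`, `tail`, `size`, `init`, and `unite` are all hypothetical. The point it shows is that find is a single lookup, while union walks only the smaller set to update identifiers.

```c
#include <assert.h>

#define N 9

int set_id[N + 1];       /* set identifier stored with each element      */
int next_in_set[N + 1];  /* link to the next element of the same set, 0 ends the list */
int head[N + 1];         /* per set identifier: first element of the set */
int tail[N + 1];         /* per set identifier: last element of the set  */
int size[N + 1];         /* per set identifier: number of elements       */

void init(void) {
    for (int x = 1; x <= N; x++) {
        set_id[x] = x;
        next_in_set[x] = 0;
        head[x] = tail[x] = x;
        size[x] = 1;
    }
}

int find(int x) {
    return set_id[x];    /* one lookup */
}

/* a and b are identifiers of two different sets; returns the surviving identifier. */
int unite(int a, int b) {
    assert(a != b && size[a] > 0 && size[b] > 0);
    if (size[a] < size[b]) { int t = a; a = b; b = t; }   /* make b the smaller set */
    for (int x = head[b]; x != 0; x = next_in_set[x]) {
        set_id[x] = a;   /* always update identifiers of the smaller set */
    }
    next_in_set[tail[a]] = head[b];   /* splice b's list onto a's */
    tail[a] = tail[b];
    size[a] += size[b];
    size[b] = 0;
    return a;
}
```

Because an element's identifier changes only when its set at least doubles in size, each element is relabeled at most log2 n times over any sequence of unions.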
Union Operation
Union(x,y) - assuming x and y are roots, point y to x.
Union(1,7)
[Figure: before, up-trees with roots 1, 3, 7; after, 7 points to 1, leaving roots 1 and 3]

Simple Implementation
• Array of indices
  Up[x] = 0 means x is a root.

        1  2  3  4  5  6  7
  up    0  1  0  7  7  5  0

Implementation
  int Find(int x) {
    while(up[x] != 0) {
      x = up[x];
    }
    return x;
  }

  void Union(int x, int y) {
    up[y] = x;
  }

runtime for Union():
runtime for Find():
runtime for m Finds and n-1 Unions:

Now this doesn't look good
Can we do better? Yes!
1. Improve union so that find only takes Θ(log n)
  • Union-by-size
  • Reduces complexity to Θ(m log n + n)
2. Improve find so that it becomes even better!
  • Path compression
  • Reduces complexity to almost Θ(m + n)
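To make the Simple Implementation concrete, here is a small sketch that loads the `up` array from the slide and runs the same operations as the figures. The function is named `unite` because `union` is a reserved word in C, and the `assert` on the precondition is an addition for illustration.

```c
#include <assert.h>

#define N 7

/* up[x] == 0 means x is a root; index 0 is unused.
   Values for 1..7 are the ones shown on the slide. */
int up[N + 1] = {0, 0, 1, 0, 7, 7, 5, 0};

int find(int x) {
    while (up[x] != 0) {
        x = up[x];       /* follow x to the root */
    }
    return x;
}

void unite(int x, int y) {
    /* both arguments must be distinct roots */
    assert(up[x] == 0 && up[y] == 0 && x != y);
    up[y] = x;           /* point y to x */
}
```

With this array, `find(6)` follows 6, 5, 7 and returns 7; after `unite(1, 7)` the same call returns 1.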
A Bad Case
Initial state: 1 2 3 … n
Union(2,1)
Union(3,2)
:
Union(n,n-1)
[Figure: a single chain with root n; n-1 points to n, …, 2 points to 3, 1 points to 2]
Find(1) n steps!!

Weighted Union
• Weighted Union
  – Always point the smaller (total # of nodes) tree to the root of the larger tree
W-Union(1,7)
[Figure: up-trees with roots 1 (weight 2), 3 (weight 1), 7 (weight 4); after the union, 1 points to 7]

Analysis of Weighted Union
With weighted union an up-tree of height h has weight at least 2^h.
• Proof by induction
  – Basis: h = 0. The up-tree has one node, 2^0 = 1
  – Inductive step: Assume true for all h' < h.
    Let T be a minimum weight up-tree of height h formed by weighted unions, created by pointing the root of T2 (height h-1) to the root of T1.
    W(T1) ≥ W(T2) ≥ 2^(h-1) (weighted union, then induction hypothesis)
    W(T) = W(T1) + W(T2) ≥ 2^(h-1) + 2^(h-1) = 2^h

Example Again
Initial state: 1 2 3 … n
W-Union(2,1)
W-Union(3,2)
:
W-Union(n,2)
[Figure: root 2 with children 1, 3, …, n]
Find(1) constant time
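The contrast between A Bad Case and Example Again can be checked with a short experiment. The sketch below is illustrative only: `reset`, `plain_union`, `w_union`, `links_to_root`, `bad_case`, and `weighted_case` are hypothetical names, and n is fixed at 16. It counts parent links followed, so the chain reports n-1 links (n nodes visited, the slide's "n steps").

```c
#include <assert.h>

#define N 16

int up[N + 1];       /* up[x] == 0 means x is a root               */
int weight[N + 1];   /* number of nodes, meaningful only at roots  */

void reset(void) {
    for (int i = 1; i <= N; i++) { up[i] = 0; weight[i] = 1; }
}

void plain_union(int x, int y) {   /* x and y are roots: point y to x */
    up[y] = x;
}

void w_union(int i, int j) {       /* i and j are roots */
    int wi = weight[i], wj = weight[j];
    if (wi < wj) { up[i] = j; weight[j] = wi + wj; }
    else         { up[j] = i; weight[i] = wi + wj; }
}

int links_to_root(int x) {         /* parent links followed by a find on x */
    int steps = 0;
    while (up[x] != 0) { x = up[x]; steps++; }
    return steps;
}

int bad_case(void) {               /* Union(2,1), Union(3,2), ..., Union(n,n-1) */
    reset();
    for (int i = 2; i <= N; i++) plain_union(i, i - 1);
    return links_to_root(1);
}

int weighted_case(void) {          /* W-Union(2,1), W-Union(3,2), ..., W-Union(n,2) */
    reset();
    w_union(2, 1);
    for (int i = 3; i <= N; i++) w_union(i, 2);
    return links_to_root(1);
}
```

Here `bad_case()` follows N-1 links from node 1, while `weighted_case()` follows just one.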
Analysis of Weighted Union (cont)
Let T be an up-tree of weight n formed by weighted union. Let h be its height.
  n ≥ 2^h
  log2 n ≥ h
• Find(x) in tree T takes O(log n) time.
  – Can we do better?

Worst Case for Weighted Union
n/2 Weighted Unions
n/4 Weighted Unions
[Figure: pairs of equal-weight trees merged in successive rounds]

Example of Worst Case (cont')
After n/2 + n/4 + … + 1 Weighted Unions:
[Figure: a single up-tree of height log2 n]
If there are n = 2^k nodes then the longest path from leaf to root has length k.

Array Implementation
[Figure: up-trees with roots 1 (weight 2), 3 (weight 1), 7 (weight 4)]

            1   2   3   4   5   6   7
  up       -1   1  -1   7   7   5  -1
  weight    2       1               4
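The worst case described above (n/2 unions of singletons, then n/4 unions of pairs, and so on) can be reproduced with a few lines of C. This is a sketch under assumptions: n = 2^K with K = 4, roots are marked with -1 as in the Array Implementation slide, and `build_worst_case` and `depth` are hypothetical helper names.

```c
#include <assert.h>

#define K 4
#define N (1 << K)   /* n = 2^k nodes */

int up[N + 1];       /* up[x] == -1 means x is a root */
int weight[N + 1];   /* number of nodes, meaningful only at roots */

void w_union(int i, int j) {       /* i and j are roots */
    int wi = weight[i], wj = weight[j];
    if (wi < wj) { up[i] = j; weight[j] = wi + wj; }
    else         { up[j] = i; weight[i] = wi + wj; }
}

int depth(int x) {                 /* parent links from x to its root */
    int d = 0;
    while (up[x] != -1) { x = up[x]; d++; }
    return d;
}

/* Performs n/2 + n/4 + ... + 1 weighted unions of equal-weight trees
   and returns the largest node depth in the resulting tree. */
int build_worst_case(void) {
    for (int i = 1; i <= N; i++) { up[i] = -1; weight[i] = 1; }
    /* In each round the current roots are 1, 1 + stride, 1 + 2*stride, ... */
    for (int stride = 1; stride < N; stride *= 2) {
        for (int i = 1; i + stride <= N; i += 2 * stride) {
            w_union(i, i + stride);
        }
    }
    int max_depth = 0;
    for (int x = 1; x <= N; x++) {
        int d = depth(x);
        if (d > max_depth) max_depth = d;
    }
    return max_depth;
}
```

With equal-weight merges every round adds one level to half of the nodes, so the deepest node ends up exactly K = log2 n links from the root.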
Weighted Union
  W-Union(i,j : index){ //i and j are roots
    wi := weight[i];
    wj := weight[j];
    if wi < wj then
      up[i] := j;
      weight[j] := wi + wj;
    else
      up[j] := i;
      weight[i] := wi + wj;
  }
new runtime for Union():
new runtime for Find():
runtime for m finds and n-1 unions =

Union-by-size: Find Analysis
• Complexity of Find: O(max node depth)
• All nodes start at depth 0
• Node depth increases:
  – Only when it is part of smaller tree in a union
  – Only by one level at a time
Result: tree size at least doubles when node depth increases by 1
Find runtime = O(node depth) =
runtime for m finds and n-1 unions =

Nifty Storage Trick
• Use the same array representation as before
• Instead of storing –1 for a root, simply store –size
[Read section 8.4, page 276]

How about Union-by-height?
• Can still guarantee O(log n) worst case depth
  Left as an exercise!
• Problem: Union-by-height doesn't combine very well with the new find optimization technique we'll see next
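Combining W-Union with the Nifty Storage Trick, a single array can hold both parents and sizes. The following is a minimal sketch, assuming elements 1..7 and the example trees from the earlier slides (weights 2, 1, 4 at roots 1, 3, 7); the names `find` and `w_union` and the precondition `assert` are illustrative choices.

```c
#include <assert.h>

#define N 7

/* up[x] > 0: parent of x.  up[x] < 0: x is a root and -up[x] is its size.
   Index 0 is unused. */
int up[N + 1] = {0, -2, 1, -1, 7, 7, 5, -4};

int find(int x) {
    while (up[x] > 0) {
        x = up[x];
    }
    return x;
}

void w_union(int i, int j) {
    /* i and j must be distinct roots */
    assert(up[i] < 0 && up[j] < 0 && i != j);
    if (-up[i] < -up[j]) {   /* tree i is smaller: point i to j */
        up[j] += up[i];      /* sizes are negative, so adding them combines the counts */
        up[i] = j;
    } else {                 /* otherwise point j to i */
        up[i] += up[j];
        up[j] = i;
    }
}
```

Calling `w_union(1, 7)` on this array points 1 to 7 and leaves `up[7] == -6`, matching the W-Union(1,7) figure.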
Path Compression
• On a Find operation point all the nodes on the search path directly to the root.
PC-Find(3)
[Figure: an up-tree on nodes 1 through 10, before and after PC-Find(3); afterwards every node on the path from 3 points directly to the root]

Your Turn
Draw the result of Find(e):
[Figure: an up-tree on nodes a through i]

Self-Adjustment Works
PC-Find(x)
[Figure: a long path from x to the root, before and after PC-Find(x)]
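A two-pass version of PC-Find might look like the sketch below. It assumes the -size root convention from the Nifty Storage Trick, and the five-node tree in the array is a made-up example rather than the one in the slide's figure.

```c
#include <assert.h>

#define N 5

/* Hypothetical example tree: 1 is the root (size 5),
   2 points to 1, 3 points to 2, 4 and 5 point to 3.  Index 0 is unused. */
int up[N + 1] = {0, -5, 1, 2, 3, 3};

int pc_find(int x) {
    /* First pass: locate the root. */
    int root = x;
    while (up[root] > 0) {
        root = up[root];
    }
    /* Second pass: point every node on the search path directly to the root. */
    while (up[x] > 0) {
        int parent = up[x];
        up[x] = root;
        x = parent;
    }
    return root;
}
```

After `pc_find(4)`, nodes 4, 3, and 2 all point directly to 1; node 5 is not on the search path, so it still points to 3, and the root's stored size is unchanged.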