disjoint set data structure

Disjoint-set data structure (Union-Find), CS 5633, Spring 2006



  1. Union-Find Data Structures
     Carola Wenk. Slides courtesy of Charles Leiserson with small changes by Carola Wenk.
     3/30/06, CS 5633 Analysis of Algorithms, Spring 2006

     Disjoint-set data structure (Union-Find)
     Problem:
     • Maintain a dynamic collection of pairwise-disjoint sets S = {S_1, S_2, …, S_r}.
     • Each set S_i has one element distinguished as the representative element, rep[S_i].
     • Must support 3 operations:
       • MAKE-SET(x): adds new set {x} to S with rep[{x}] = x (for any x ∉ S_i for all i).
       • UNION(x, y): replaces sets S_x, S_y with S_x ∪ S_y in S (for any x, y in distinct sets S_x, S_y).
       • FIND-SET(x): returns representative rep[S_x] of set S_x containing element x.

     Disjoint-set data structure (Union-Find) II
     • In all operations the elements x, y are given (as pointers or references, for example).
     • Hence, we do not need to first search for the element in the data structure.
     • Let n denote the overall number of elements (equivalently, the number of MAKE-SET operations).

     Simple linked-list solution
     Store each set S_i = {x_1, x_2, …, x_k} as an (unordered) doubly linked list. Define the representative element rep[S_i] to be the front of the list, x_1.
     [Figure: list S_i: x_1, x_2, …, x_k, with rep[S_i] = x_1 at the front.]
     • MAKE-SET(x) initializes x as a lone node. Cost: Θ(1).
     • FIND-SET(x) walks left in the list containing x until it reaches the front of the list. Cost: Θ(n).
     • UNION(x, y) calls FIND-SET on x and y and concatenates the lists containing x and y, leaving FIND-SET(x) as the representative. Cost: Θ(n).
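The simple linked-list solution can be sketched in Python as follows. This is only an illustration under assumptions: the names `Node`, `make_set`, `find_set`, and `union` are hypothetical and do not come from the slides, and elements are passed around as object references, as the slides suggest.

```python
class Node:
    """One element of a doubly linked list (hypothetical helper class)."""

    def __init__(self, value):
        self.value = value
        self.prev = None  # neighbour towards the front of the list
        self.next = None  # neighbour towards the back of the list


def make_set(value):
    # A lone node is a one-element list and its own representative: Theta(1).
    return Node(value)


def find_set(x):
    # Walk left until the front of the list is reached: Theta(n) worst case.
    while x.prev is not None:
        x = x.prev
    return x


def union(x, y):
    # Find both fronts, then append y's list after the last node of x's list.
    front_x, front_y = find_set(x), find_set(y)
    if front_x is front_y:
        return front_x  # already in the same set
    last = x
    while last.next is not None:  # walk right to the end of x's list
        last = last.next
    last.next = front_y
    front_y.prev = last
    return front_x  # the front of x's list stays the representative
```

Both `find_set` and `union` may traverse a whole list, which matches the Θ(n) bounds stated above.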

  2. Simple balanced-tree solution
     Store each set S_i = {x_1, x_2, …, x_k} as a balanced tree (ignoring keys). Define the representative element rep[S_i] to be the root of the tree. (Slide annotation on "balanced": maintain how?)
     [Figure: S_i = {x_1, x_2, x_3, x_4, x_5} as a tree with root rep[S_i] = x_1, children x_4 and x_3, and leaves x_2 and x_5.]
     • MAKE-SET(x) initializes x as a lone node. Cost: Θ(1).
     • FIND-SET(x) walks up the tree containing x until reaching the root. Cost: Θ(log n).
     • UNION(x, y) calls FIND-SET on x and y and concatenates the trees containing x and y, changing the representative of x or y. Cost: Θ(log n).

     Plan of attack
     • We will build a simple disjoint-union data structure that, in an amortized sense, performs significantly better than Θ(log n) per operation, even better than Θ(log log n), Θ(log log log n), ..., but not quite Θ(1).
     • To reach this goal, we will introduce two key tricks. Each trick converts a trivial Θ(n) solution into a simple Θ(log n) amortized solution. Together, the two tricks yield a much better solution.
     • The first trick arises in an augmented linked list. The second trick arises in a tree structure.

     Augmented linked-list solution
     Store S_i = {x_1, x_2, …, x_k} as an unordered doubly linked list.
     Augmentation: Each element x_j also stores a pointer rep[x_j] to rep[S_i] (which is the front of the list, x_1).
     [Figure: list S_i: x_1, x_2, …, x_k, each element with a rep pointer to rep[S_i] = x_1.]
     • FIND-SET(x) returns rep[x]. Cost: Θ(1).
     • UNION(x, y) concatenates the lists containing x and y and updates the rep pointers for all elements in the list containing y. Cost: Θ(n).

     Example of augmented linked-list solution
     Each element x_j stores a pointer rep[x_j] to rep[S_i].
     UNION(x, y)
     • concatenates the lists containing x and y, and
     • updates the rep pointers for all elements in the list containing y.
     [Figure: before the union, S_x: x_1, x_2 with rep[S_x] and S_y: y_1, y_2, y_3 with rep[S_y].]
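The augmented linked-list solution might look like the following Python sketch. It rests on assumptions not found in the slides: the class name `Element` is hypothetical, the list is singly linked for brevity (the slides use a doubly linked list), and the representative keeps a `tail` pointer so that concatenation itself takes constant time.

```python
class Element:
    """List element that also stores a pointer to its representative."""

    def __init__(self, value):
        self.value = value
        self.rep = self    # pointer to the front of the list
        self.next = None   # next element in the list
        self.tail = self   # last element; only meaningful on the representative


def make_set(value):
    return Element(value)


def find_set(x):
    # Theta(1): just follow the stored rep pointer.
    return x.rep


def union(x, y):
    rep_x, rep_y = find_set(x), find_set(y)
    if rep_x is rep_y:
        return rep_x
    # Concatenate: y's list is appended to the end of x's list.
    rep_x.tail.next = rep_y
    rep_x.tail = rep_y.tail
    # Update the rep pointer of every element of y's list: Theta(length of y's list).
    node = rep_y
    while node is not None:
        node.rep = rep_x
        node = node.next
    return rep_x
```

With lists S_x = (x_1, x_2) and S_y = (y_1, y_2, y_3), `union` on any pair of elements from the two lists rewrites three rep pointers and leaves x_1 as the representative of all five elements.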

  3. Example of augmented linked-list solution (continued)
     Each element x_j stores a pointer rep[x_j] to rep[S_i].
     UNION(x, y)
     • concatenates the lists containing x and y, and
     • updates the rep pointers for all elements in the list containing y.
     [Figure, two animation steps: the lists x_1, x_2 and y_1, y_2, y_3 are joined into S_x ∪ S_y, and the rep pointers of y_1, y_2, y_3 are redirected so that rep[S_x] becomes rep[S_x ∪ S_y].]

     Alternative concatenation
     UNION(x, y) could instead
     • concatenate the lists containing y and x, and
     • update the rep pointers for all elements in the list containing x.
     [Figure, two animation steps: S_x: x_1, x_2 with rep[S_x] and S_y: y_1, y_2, y_3 with rep[S_y], before and after the lists are joined into S_x ∪ S_y.]

  4. Alternative concatenation (continued)
     UNION(x, y) could instead
     • concatenate the lists containing y and x, and
     • update the rep pointers for all elements in the list containing x.
     [Figure: the joined list S_x ∪ S_y after the rep pointers of x_1 and x_2 have been updated to rep[S_x ∪ S_y].]

     Trick 1: Smaller into larger (weighted-union heuristic)
     To save work, concatenate the smaller list onto the end of the larger list. Cost = Θ(length of smaller list). Augment each list to store its weight (number of elements).
     • Let n denote the overall number of elements (equivalently, the number of MAKE-SET operations).
     • Let m denote the total number of operations.
     • Let f denote the number of FIND-SET operations.
     Theorem: Cost of all UNIONs is O(n log n).
     Corollary: Total cost is O(m + n log n).

     Analysis of Trick 1 (weighted-union heuristic)
     Theorem: Total cost of UNIONs is O(n log n).
     Proof.
     • Monitor an element x and the set S_x containing it.
     • After the initial MAKE-SET(x), weight[S_x] = 1.
     • Each time S_x is united with a set S_y with weight[S_y] ≥ weight[S_x],
       • pay 1 to update rep[x], and
       • weight[S_x] at least doubles (it increases by weight[S_y]).
     • Each time S_x is united with a smaller set S_y,
       • pay nothing, and
       • weight[S_x] only increases.
     Thus pay ≤ log n for x.

     Disjoint set forest: Representing sets as trees
     Store each set S_i = {x_1, x_2, …, x_k} as an unordered, potentially unbalanced, not necessarily binary tree, storing only parent pointers. rep[S_i] is the tree root.
     [Figure: S_i = {x_1, x_2, x_3, x_4, x_5, x_6} as a tree with root rep[S_i] = x_1, children x_4 and x_3, and leaves x_2, x_5, x_6.]
     • MAKE-SET(x) initializes x as a lone node. Cost: Θ(1).
     • FIND-SET(x) walks up the tree containing x until it reaches the root. Cost: Θ(depth[x]).
     • UNION(x, y) concatenates the trees containing x and y …
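The weighted-union heuristic and its charging argument can be tried out with a small Python sketch. Everything named here is an assumption made for illustration: `Element`, `union`, and `merge_in_rounds` are hypothetical names, the list is singly linked with a `tail` pointer on the representative, and `union` returns the number of rep pointers it rewrote so the total can be compared with n log n.

```python
import math


class Element:
    def __init__(self, value):
        self.value = value
        self.rep = self    # pointer to the representative
        self.next = None
        self.tail = self   # maintained on the representative only
        self.weight = 1    # number of elements; maintained on the representative only


def make_set(value):
    return Element(value)


def find_set(x):
    return x.rep


def union(x, y):
    """Append the smaller list to the larger; return how many rep pointers changed."""
    big, small = find_set(x), find_set(y)
    if big is small:
        return 0
    if big.weight < small.weight:
        big, small = small, big
    big.tail.next = small
    big.tail = small.tail
    big.weight += small.weight
    updated = 0
    node = small
    while node is not None:   # cost: length of the smaller list
        node.rep = big
        node = node.next
        updated += 1
    return updated


def merge_in_rounds(n):
    """Unite n singletons pairwise in rounds (n a power of two); return total updates."""
    elems = [make_set(i) for i in range(n)]
    total, step = 0, 1
    while step < n:
        for i in range(0, n, 2 * step):
            total += union(elems[i], elems[i + step])
        step *= 2
    return total
```

For n = 16, merging equal-sized sets in four rounds rewrites 8 rep pointers per round, 32 in total, which is within the n log n = 64 bound of the theorem. Each element's rep pointer changes at most log n = 4 times, as in the proof.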

  5. Trick 1 adapted to trees
     • UNION(x, y) can use a simple concatenation strategy: make the root FIND-SET(y) a child of the root FIND-SET(x). Afterwards, FIND-SET(y) = FIND-SET(x).
     • Adapt Trick 1 to this context:
       Union-by-weight: Merge the tree with smaller weight into the tree with larger weight.
     • Variant of Trick 1 (see book):
       Union-by-rank: rank of a tree = its height.
     [Figure: a tree with nodes x_1, …, x_6 rooted at x_1 and a tree with nodes y_1, …, y_5 rooted at y_1.]

     Trick 1 adapted to trees (union-by-weight)
     • The height of a tree is at most logarithmic in its weight: height(T) ≤ log weight(T). Proof by induction on the weight.
     • A tree T is obtained by uniting two trees T_1, T_2, where the root of the lighter tree T_2 becomes a child of the root of T_1.
     • Inductively, the heights of T_1, T_2 are at most the logs of their weights.
     • height(T) = max(height(T_1), height(T_2) + 1). Since weight(T_2) ≤ weight(T)/2, we get height(T_2) + 1 ≤ log weight(T_2) + 1 ≤ log weight(T).
     • Thus the total cost is O(m log n).

     Trick 2: Path compression
     When we execute a FIND-SET operation and walk up a path p to the root, we know the representative for all the nodes on path p.
     Path compression makes all of those nodes direct children of the root.
     Cost of FIND-SET(x) is still Θ(depth[x]).
     [Figure, two animation steps: FIND-SET(y_2) on a tree with nodes x_1, …, x_6 and y_1, …, y_5 rooted at x_1, before and after path compression.]
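Both tricks can be combined in a disjoint-set forest roughly as in the Python sketch below. It is an illustration under assumptions rather than the slides' own code: the class `Forest` and its dictionary-based `parent` and `weight` maps are hypothetical, elements are assumed to be hashable values instead of pointers, and a root is marked by being its own parent.

```python
class Forest:
    """Disjoint-set forest with union-by-weight and path compression (sketch)."""

    def __init__(self):
        self.parent = {}  # parent[x] == x marks x as a root (assumed convention)
        self.weight = {}  # number of nodes in the tree; valid for roots only

    def make_set(self, x):
        self.parent[x] = x
        self.weight[x] = 1

    def find_set(self, x):
        # First pass: walk up to the root, Theta(depth[x]).
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Second pass (path compression): make every node on the path
        # a direct child of the root.
        while self.parent[x] != root:
            above = self.parent[x]
            self.parent[x] = root
            x = above
        return root

    def union(self, x, y):
        root_x, root_y = self.find_set(x), self.find_set(y)
        if root_x == root_y:
            return root_x
        # Union-by-weight: the lighter tree's root becomes a child of the heavier root.
        if self.weight[root_x] < self.weight[root_y]:
            root_x, root_y = root_y, root_x
        self.parent[root_y] = root_x
        self.weight[root_x] += self.weight[root_y]
        return root_x
```

For example, after `union(0, 1)`, `union(2, 3)`, and `union(0, 2)` on singletons 0 to 3, element 3 sits two levels below root 0. A single `find_set(3)` returns 0 and reattaches 3 directly under the root, so later lookups of 3 take one step.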
