Are p and q connected?
Network connectivity Yes, they are connected!
Network connectivity ◮ Problem: Given a set of nodes N and a set of links L between pairs of nodes, determine whether node p and node q are connected (p ∈ N, q ∈ N).
Real World Application
Kruskal
The union-find data structure. Zhengtian Xu, Xiaoqing Geng, Lihua Qian, Ruxuan Zhang, Chen Feng. Department of Computer Science and Engineering, Shanghai Jiao Tong University. 8th December 2016
Outline ◮ Brief Introduction to the Union-Find Data Structure ◮ Improvement ◮ Time Complexity Analysis
Union-Find data structure type Goal. Support three operations on a set of elements: ◮ MAKE-SET( x ). Create a new set containing only element x . ◮ FIND( x ). Return a canonical element in the set containing x . ◮ UNION( x , y ). Merge the sets containing x and y .
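The three operations are enough to answer the connectivity question from the opening slides. The following is a minimal sketch, not the authors' code: the function name `connected` and the dictionary-based storage are assumptions made for illustration, and no balancing is applied yet.

```python
def connected(nodes, links, p, q):
    """Return True if p and q lie in the same connected component."""
    # MAKE-SET for every node: each node starts as its own canonical element.
    parent = {x: x for x in nodes}

    def find(x):
        # FIND: follow parent pointers until reaching a canonical element.
        while parent[x] != x:
            x = parent[x]
        return x

    # UNION for every link: merge the sets containing the two endpoints.
    for u, v in links:
        parent[find(u)] = find(v)

    return find(p) == find(q)


# Hypothetical example network with six nodes and three links.
nodes = range(6)
links = [(0, 1), (1, 2), (3, 4)]
```

Here `connected(nodes, links, 0, 2)` holds because the links 0-1 and 1-2 put both nodes in one set, while nodes 0 and 3 stay in different sets.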
Union-Find example ◮ Initial sets: b = {2, 7, 9, 3}, c = {4, 5, 8}, a = {6}. ◮ FIND(9) = 2. ◮ MAKE-SET(1) creates a new set d = {1}. ◮ UNION(2,4) merges b and c into e = {2, 7, 9, 3, 4, 5, 8}, leaving a = {6} and d = {1}.
Union-Find data structure Representation Represent each set as a tree of elements. ◮ Each element has a parent pointer in the tree. ◮ The root serves as the canonical element. ◮ FIND(x). Find the root of the tree containing x. ◮ UNION(x, y). Make the root of one tree point to the root of the other tree. [Figure: example tree over nodes a, b, c, d, e, f; labels mark the root and note that the parent of e is c.]
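The tree representation can be sketched as a dictionary of parent pointers. This is an illustrative sketch only: the class name `Forest`, the convention that a root is its own parent, and the arbitrary choice of which root survives a union are assumptions, not details taken from the slides.

```python
class Forest:
    """Parent-pointer forest; every tree is one set, its root the canonical element."""

    def __init__(self):
        self.parent = {}

    def make_set(self, x):
        # Assumed convention: a root points to itself.
        self.parent[x] = x

    def find(self, x):
        # Walk up the parent pointers until the root is reached.
        while self.parent[x] != x:
            x = self.parent[x]
        return x

    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx != ry:
            # Arbitrary choice here: the root of y's tree points to the root of x's tree.
            self.parent[ry] = rx


# Rebuild two of the sets from the example slide: {2, 7, 9, 3} and {4, 5, 8}.
forest = Forest()
for v in [2, 7, 9, 3, 4, 5, 8, 6]:
    forest.make_set(v)
for v in (7, 9, 3):
    forest.union(2, v)
for v in (5, 8):
    forest.union(4, v)
```

With this construction `forest.find(9)` returns 2, matching FIND(9) = 2 in the example; after `forest.union(2, 4)` the element 5 also resolves to 2.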
Find operation Representation FIND(x). Find the root of the tree containing x. [Figure: tree over nodes a, b, c, d, e, f, g; the animation follows the parent pointers step by step, annotated with FIND(g) and FIND(d).]
Union operation ◮ Maintain an integer rank for each node, initially 0. ◮ Link root of smaller rank to root of larger rank; if tie, increase rank of new root by 1. Note. For now, rank = height. [Figure: union(d, g) applied to two trees, one with rank = 1 and one with rank = 2.]
Union by rank ◮ Maintain an integer rank for each node, initially 0. ◮ Link root of smaller rank to root of larger rank; if tie, increase rank of new root by 1. [Figure: union(d, g) on trees with rank = 1 and rank = 2 gives a root with rank = 2; union(d, g) on two trees that both have rank = 2 gives a root with rank = 3.]
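The linking rule can be sketched as follows. This is a hedged illustration rather than the authors' implementation: the module-level dictionaries `parent` and `rank` and the decision to return the new root from `union` are assumptions made for the example.

```python
parent = {}
rank = {}


def make_set(x):
    parent[x] = x
    rank[x] = 0  # every node starts with rank 0


def find(x):
    while parent[x] != x:
        x = parent[x]
    return x


def union(x, y):
    """Link the root of smaller rank under the root of larger rank."""
    rx, ry = find(x), find(y)
    if rx == ry:
        return rx
    if rank[rx] < rank[ry]:
        rx, ry = ry, rx          # make rx the root of larger (or equal) rank
    parent[ry] = rx
    if rank[rx] == rank[ry]:
        rank[rx] += 1            # tie: the rank of the new root grows by 1
    return rx


for v in "abcd":
    make_set(v)
union("a", "b")          # tie between rank 0 and rank 0: new root has rank 1
union("c", "d")          # tie again: new root has rank 1
root = union("a", "c")   # tie between rank 1 and rank 1: new root has rank 2
```

Linking a fresh singleton to this tree afterwards leaves the root's rank at 2, since the ranks differ and no tie occurs.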
Union by rank: analysis Lemma 1. Using union by rank, for every root node r, $size(r) \ge 2^{rank(r)}$. Proof [by induction on number of links]. ◮ Base case: singleton tree has size 1 and rank 0. ◮ Inductive hypothesis: assume true after first i links.
Union by rank: analysis Proof. ◮ Case 1. [rank(r) > rank(s)] or [rank(r) < rank(s)]: say r is the root of larger rank, so r stays the root and its rank is unchanged. Then $size'(r) \ge size(r) \ge 2^{rank(r)} = 2^{rank'(r)}$. [Figure labels: r: size = 8 (rank = 2); s: size = 3 (rank = 1).]
Union by rank: analysis Proof. ◮ Case 2. [rank(r) = rank(s)]: applying the inductive hypothesis to both roots, $size'(r) = size(r) + size(s) \ge 2^{rank(r)} + 2^{rank(s)} = 2 \times 2^{rank(r)} = 2^{rank(r)+1} = 2^{rank'(r)}$. [Figure labels: r: size = 6 (rank = 2); s: size = 3 (rank = 1).]
Union by rank: analysis Lemma 2. There are at most $n / 2^k$ elements of rank k. Proof. By Lemma 1, a node of rank k is the root of a subtree with at least $2^k$ elements. Ranks strictly increase toward the root, so the subtrees of two distinct rank-k nodes are disjoint. With n elements in total, at most $n / 2^k$ nodes can have rank k.
Union by rank: analysis Theorem. Using union by rank, any FIND operation takes $O(\log_2 n)$ time in the worst case, where n is the number of elements; any UNION operation takes constant time once the two roots are known. Proof. ◮ The running time of each operation is bounded by the tree height. ◮ Since rank = height, Lemma 1 gives $n \ge size(r) \ge 2^{height}$ for every root r, so height ≤ ⌊log₂ n⌋.
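The height bound can be checked empirically on a case where every link is a tie. The sketch below is an assumption-laden illustration: the helper names `build_worst_case` and `depth`, and the choice of 16 elements, are not from the slides.

```python
import math


def build_worst_case(k):
    """Union 2**k singletons in rounds, always pairing two roots of equal rank."""
    n = 2 ** k
    parent = list(range(n))
    rank = [0] * n
    roots = list(range(n))
    while len(roots) > 1:
        next_roots = []
        for r, s in zip(roots[0::2], roots[1::2]):
            parent[s] = r    # ranks tie, so either root may become the parent
            rank[r] += 1     # tie: the rank of the new root grows by 1
            next_roots.append(r)
        roots = next_roots
    return parent, rank


def depth(parent, x):
    """Number of parent pointers followed from x to its root."""
    d = 0
    while parent[x] != x:
        x = parent[x]
        d += 1
    return d


parent, rank = build_worst_case(4)
n = len(parent)
height = max(depth(parent, x) for x in range(n))
```

For these 16 elements the tree reaches height 4, which meets the bound ⌊log₂ n⌋ with equality, so the bound cannot be improved for union by rank alone.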
Improvement Observation ◮ It is the height of the tree that affects the running time. ◮ When we’re trying to find the root of the tree containing a given node, we’re touching all the nodes on the path from that node to the root. So... ◮ Why not make each of those just point to the root? ◮ That’s the idea of path compression !
Path compression ◮ Just after computing the root of the target node, set the parent of each examined node to point to that root. [Figure: find(j) on a tree of height = 4 rooted at a, over nodes a to j; the animation re-points the nodes on the path from j directly to a, leaving a tree of height = 2.]
Path compression: benefits ◮ The resulting tree is much flatter. ◮ If the target node is very deep, path compression may dramatically decrease the height of the tree. ◮ Speed up future operations on all the nodes on the path and on those referencing them, directly or indirectly.
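A two-pass version of find with path compression might look like the sketch below. It is one possible formulation, not necessarily the one the authors had in mind; the four-node chain used as input is a hypothetical example.

```python
# Hypothetical starting point: a chain d -> c -> b -> a, with a as the root.
parent = {"a": "a", "b": "a", "c": "b", "d": "c"}


def find(x):
    # First pass: locate the root of the tree containing x.
    root = x
    while parent[root] != root:
        root = parent[root]
    # Second pass: point every examined node directly at the root.
    while parent[x] != root:
        next_node = parent[x]
        parent[x] = root
        x = next_node
    return root


root = find("d")
```

After this single call, b, c and d all point directly at a, so later finds on any of them follow one pointer.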
Path compression: rank vs. height ◮ The rank of a tree does not change during path compression. ◮ ...but the height of a tree may decrease. ◮ Now it is possible that rank ≠ height!
Path compression: rank vs. height ◮ Example: Apply the following operations on the forest below. ◮ Union(a,g) ◮ Find(f) ◮ Find(j) [Figure: forest over nodes a to j with per-node ranks; Union(a,g) links g under a, then Find(f) and Find(j) compress their paths. At the end rank(a) = 3 but height(a) = 2, so height(a) ≠ rank(a).]
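The same effect shows up on a much smaller tree than the one in the slide. The sketch below combines union by rank with a recursive path-compressing find; the four-element example and the helper `height` are assumptions chosen to keep the demonstration short.

```python
parent, rank = {}, {}


def make_set(x):
    parent[x], rank[x] = x, 0


def find(x):
    if parent[x] != x:
        parent[x] = find(parent[x])  # path compression; ranks are not touched
    return parent[x]


def union(x, y):
    rx, ry = find(x), find(y)
    if rx == ry:
        return
    if rank[rx] < rank[ry]:
        rx, ry = ry, rx
    parent[ry] = rx
    if rank[rx] == rank[ry]:
        rank[rx] += 1


def height():
    """Longest parent-pointer path; assumes all elements are in one tree."""
    def depth(x):
        d = 0
        while parent[x] != x:
            x, d = parent[x], d + 1
        return d
    return max(depth(x) for x in parent)


for v in "abcd":
    make_set(v)
union("a", "b")
union("c", "d")
union("a", "c")            # tree: a is the root, d -> c -> a, b -> a
height_before = height()   # equals rank of a at this point
find("d")                  # compresses the path d -> c -> a
height_after = height()
```

Before the find, the height and the rank of a are both 2; afterwards the height drops to 1 while the rank of a is still 2.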
Time Complexity Analysis ◮ Without path compression: O(log n) per find operation in the worst case. ◮ With path compression: O(log log n) per find operation, amortized over a sequence of find operations.
Time Complexity Analysis Lemma. ◮ There are at most $n / 2^k$ nodes with rank k. ◮ rank(parent(x)) > rank(x). ◮ rank(root) ≤ log₂ n.
Time Complexity Analysis Definition. ◮ $f(t) = 2 \times t$. ◮ Happy node, long edge: rank(a) > f(rank(b)), where a is the parent of b. Example: rank(a) = 6, rank(b) = 2. ◮ Sad node, short edge: rank(a) ≤ f(rank(b)). Example: rank(a) = 4, rank(b) = 2.
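Time Complexity Analysis Observation. In one find operation, at most log log n long edges are traversed. Proof. On the find path up to the root r, ranks start at ≥ 1 and end at ≤ log n, and every long edge more than doubles the rank, so at most log log n long edges fit on the path. Observation. x is sad for at most rank(x) find operations. Proof. ◮ rank(parent(x)) > rank(x). ◮ With path compression, a find operation through x re-points x to an ancestor of larger rank, so rank(parent(x)) increases per find operation. ◮ After rank(x) find ops, rank(parent(x)) > rank(x) + rank(x) = f(rank(x)), so x is happy from then on.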
Time Complexity Analysis
$$\text{cost of all find operations} = \sum_{\text{find ops}} \left(\#\text{long edges} + \#\text{short edges}\right)$$
$$\sum_{\text{find ops}} \#\text{long edges} \le \log\log n \times \#\text{find ops}$$
$$\sum_{\text{find ops}} \#\text{short edges} = \sum_{e} \left(\#\text{find operations when } e \text{ is short}\right) \le \sum_{x} rank(x) \le \sum_{k=0}^{\log n} \left(k \times \#\text{rank-}k\text{ nodes}\right) \le \sum_{k=0}^{\infty} \frac{k \times n}{2^k} \le 2n$$
With m find operations, the total cost is therefore at most $m \log\log n + 2n$.
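The final step of the bound on short edges relies on the series $\sum_{k \ge 0} k / 2^k$ summing to 2. The slides do not spell this out; one standard way to see it, offered here as a supporting derivation, is to write each k as a sum of k ones and swap the order of summation:

```latex
\sum_{k=1}^{\infty} \frac{k}{2^k}
  = \sum_{k=1}^{\infty} \sum_{j=1}^{k} \frac{1}{2^k}
  = \sum_{j=1}^{\infty} \sum_{k=j}^{\infty} \frac{1}{2^k}
  = \sum_{j=1}^{\infty} \frac{1}{2^{\,j-1}}
  = 2
\qquad\Longrightarrow\qquad
\sum_{k=0}^{\infty} \frac{k \times n}{2^k} = 2n .
```

The k = 0 term contributes nothing, so starting the sum at 0 or at 1 gives the same value.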