
Amortized Analysis of Union/Find Operations (CPSC 320, 2012W T1)



  1. CPSC 320: Intermediate Algorithm Design and Analysis. Amortized Analysis of Union/Find operations. CPSC 320 – 2012W T1

  2. Potential Function. Relationships between a node N and its parent P:
     N is close if $\lfloor \log_2 \mathrm{rank}(N) \rfloor = \lfloor \log_2 \mathrm{rank}(P) \rfloor$.
     N is far if N is not close, but $\mathrm{rank}(N)$ and $\mathrm{rank}(P)$ belong to the same interval.
     N is really far if $\mathrm{rank}(N)$ and $\mathrm{rank}(P)$ belong to different intervals.

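  3. Potential Function. Potential of a node N with parent P:
     $$\phi(N) = \begin{cases} 3\,\mathrm{rank}(N) & \text{if } N \text{ is a root} \\ 3\,\mathrm{rank}(N) - \mathrm{rank}(P) & \text{if } N \text{ is close} \\ \mathrm{rank}(N) + \lfloor \log_2 \mathrm{rank}(N) \rfloor - \lfloor \log_2 \mathrm{rank}(P) \rfloor & \text{if } N \text{ is far} \\ 0 & \text{otherwise} \end{cases}$$
     Observe that $\phi(N)$ is always an integer. The potential of $D_i$ is $\Phi(D_i) = \sum_{N \in D_i} \phi(N)$.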
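
The definition of the potential can be sketched in Python as follows. This is only an illustration, not code from the course: the helper names (`floor_log2`, `interval_index`, `classify`, `phi`) are hypothetical, ranks are assumed to be at least 1 so that the logarithm is defined, and the rank intervals are assumed to be the ranges from k+1 to 2^k with boundaries k = 0, 1, 2, 4, 16, 65536, ... (the slides only state the k+1 to 2^k shape).

```python
def floor_log2(r: int) -> int:
    """Exact floor(log2(r)) for an integer r >= 1."""
    return r.bit_length() - 1


def interval_index(rank: int) -> int:
    """Index of the interval k+1..2**k containing rank.

    ASSUMPTION: boundaries k = 0, 1, 2, 4, 16, 65536, ... (each is 2**previous).
    """
    k, idx = 0, 0
    while rank > 2 ** k:
        k = 2 ** k
        idx += 1
    return idx


def classify(rank_n: int, rank_p: int) -> str:
    """Relationship between a node of rank rank_n and its parent of rank rank_p."""
    if floor_log2(rank_n) == floor_log2(rank_p):
        return "close"
    if interval_index(rank_n) == interval_index(rank_p):
        return "far"
    return "really far"


def phi(rank_n: int, rank_p=None) -> int:
    """Potential of a node; rank_p=None means the node is a root."""
    if rank_p is None:
        return 3 * rank_n
    kind = classify(rank_n, rank_p)
    if kind == "close":
        return 3 * rank_n - rank_p
    if kind == "far":
        return rank_n + floor_log2(rank_n) - floor_log2(rank_p)
    return 0  # really far


# A node of rank 5 under parents of increasing rank.
examples = [phi(5), phi(5, 6), phi(5, 9), phi(5, 17)]
```

With these assumed intervals, a parent of rank 6 is close, a parent of rank 9 is far (both ranks lie in 5..16), and a parent of rank 17 is really far.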

  4. Observations.
     Observation 1: if N is close then $\mathrm{rank}(P) < 2\,\mathrm{rank}(N)$.
     Proof: if $\mathrm{rank}(P) \ge 2\,\mathrm{rank}(N)$ then $\log_2 \mathrm{rank}(P) \ge \log_2 \mathrm{rank}(N) + 1$ and so $\lfloor \log_2 \mathrm{rank}(P) \rfloor > \lfloor \log_2 \mathrm{rank}(N) \rfloor$.
     Observation 2: if N and P are both in the interval from $k+1$ to $2^k$ then $\lfloor \log_2 \mathrm{rank}(P) \rfloor - \lfloor \log_2 \mathrm{rank}(N) \rfloor < \mathrm{rank}(N)$.
     Proof: $\lfloor \log_2 \mathrm{rank}(P) \rfloor - \lfloor \log_2 \mathrm{rank}(N) \rfloor \le k - \lfloor \log_2 (k+1) \rfloor < k < \mathrm{rank}(N)$.
     Corollary: for every node N, $\phi(N) \ge 0$.
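
As a quick sanity check of both observations, here is a worked instance. The concrete numbers are chosen purely for illustration: the interval with k = 4 (ranks 5 through 16) and a node with rank(N) = 5.

```latex
\begin{align*}
% Observation 1: N is close, rank(N) = 5
\lfloor \log_2 \mathrm{rank}(P) \rfloor = \lfloor \log_2 5 \rfloor = 2
  &\;\Longrightarrow\; \mathrm{rank}(P) \le 7 < 10 = 2\,\mathrm{rank}(N) \\
% Observation 2: k = 4, rank(N) = 5, rank(P) = 16
\lfloor \log_2 16 \rfloor - \lfloor \log_2 5 \rfloor = 4 - 2 = 2
  &\;\le\; k - \lfloor \log_2 (k+1) \rfloor = 4 - 2 = 2 \;<\; k = 4 \;<\; \mathrm{rank}(N) = 5
\end{align*}
```

In the second case N is far, so its potential is 5 + 2 - 4 = 3, which is nonnegative as the corollary claims.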

  5. Proof (part 1). For makeSet(x): $\mathrm{cost}_{\mathrm{real}}(\mathrm{makeSet}) \in \Theta(1)$. Potential difference $= 3\,\mathrm{rank}(\text{new node}) = 0$. Hence $\mathrm{cost}_{\mathrm{am}}(\mathrm{makeSet}) \in \Theta(1)$.

  6. Proof (part 2). For find(x): suppose the algorithm goes through $x = x_0, x_1, \ldots, x_{l-1}, x_l$ (the root), so $\mathrm{cost}_{\mathrm{real}}(\mathrm{find}) = l + 1$. How does $\phi(N)$ change for $N = x_j$? We will look at every node $x_0, x_1, \ldots, x_{l-1}, x_l$ and show that the potential $\phi(x_j)$ goes down by at least 1 for every $j$, except for $O(\log^* n)$ of the $x_j$'s.
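
To make the path $x_0, \ldots, x_l$ and the real cost $l + 1$ concrete, here is a minimal sketch of find with path compression on a plain parent array. The function name `find_with_path` and the convention that a root is its own parent are assumptions made for this example, not details taken from the slides.

```python
def find_with_path(parent: list, x: int):
    """Return (root, path) with path = [x_0, ..., x_l]; compress the path in place.

    ASSUMPTION: a root r is marked by parent[r] == r.
    """
    path = [x]
    while parent[path[-1]] != path[-1]:
        path.append(parent[path[-1]])
    root = path[-1]
    # Path compression: every node on the path now points directly at the root.
    for node in path:
        parent[node] = root
    return root, path


# A chain 0 -> 1 -> 2 -> 3 -> 4, where 4 is the root.
parent = [1, 2, 3, 4, 4]
before = list(parent)
root, path = find_with_path(parent, 0)
real_cost = len(path)  # l + 1 nodes visited, here l = 4
```

In this run the last two nodes on the path, 3 and 4, keep the same parent they had before the call, while nodes 0, 1 and 2 are re-attached to the root.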

  7. Proof (part 2). For find(x) (continued):
     Case 1: N is $x_{l-1}$ or $x_l$. The nodes $x_{l-1}$ and $x_l$ do not move. Hence neither $\phi(x_{l-1})$ nor $\phi(x_l)$ changes. There are only two such nodes. ⇒ exceptions
     Observe that in all other cases $\mathrm{rank}(x_l) > \mathrm{rank}(P)$, where P is the parent of N before path compression.
     Case 2: N was close before path compression, N is really far after path compression. $\phi(N)$ was at least 1, and is now 0. So it has decreased.

  8. Proof (part 2). For find(x) (continued):
     Case 3: N was far before path compression, N is really far after path compression. $\phi(N)$ was at least 1, and is now 0. So it has decreased.
     Case 4: N was really far before path compression, N is really far after path compression. $\phi(N)$ has not changed: it was and is still 0. There is at most one such node per interval. Hence there are $O(\log^* n)$ such nodes. ⇒ exceptions

  9. Proof (part 2). For find(x) (continued):
     Case 5: N was close before path compression, N is close after path compression. $\mathrm{rank}(x_l) > \mathrm{rank}(P)$. Hence $3\,\mathrm{rank}(N) - \mathrm{rank}(x_l) < 3\,\mathrm{rank}(N) - \mathrm{rank}(P)$. This means that $\phi(N)$ has decreased.

  10. Proof (part 2). For find(x) (continued):
      Case 6: N was far before path compression, N is far after path compression.
      Before path compression $\phi(N) = \mathrm{rank}(N) + \lfloor \log_2 \mathrm{rank}(N) \rfloor - \lfloor \log_2 \mathrm{rank}(P) \rfloor$.
      After path compression $\phi(N) = \mathrm{rank}(N) + \lfloor \log_2 \mathrm{rank}(N) \rfloor - \lfloor \log_2 \mathrm{rank}(x_l) \rfloor$.
      Is $\lfloor \log_2 \mathrm{rank}(x_l) \rfloor > \lfloor \log_2 \mathrm{rank}(P) \rfloor$? Maybe not for the far node closest to the root on the path $x_0, x_1, \ldots, x_{l-1}, x_l$. ⇒ one more exception
      But it is true for every other far node N on this path.

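  11. Proof (part 2). For find(x) (continued):
      Case 7: N was close before path compression, N is far after path compression.
      Before path compression $\phi(N) = 3\,\mathrm{rank}(N) - \mathrm{rank}(P)$, which is bigger than $\mathrm{rank}(N)$ by Observation 1.
      After path compression $\phi(N) = \mathrm{rank}(N) + \lfloor \log_2 \mathrm{rank}(N) \rfloor - \lfloor \log_2 \mathrm{rank}(x_l) \rfloor$, which is smaller than $\mathrm{rank}(N)$ because N is no longer close, so $\lfloor \log_2 \mathrm{rank}(x_l) \rfloor > \lfloor \log_2 \mathrm{rank}(N) \rfloor$ (Observation 2 guarantees it stays nonnegative).
      Since $\phi(N)$ is an integer, it has decreased by at least 2.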

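  12. Proof (part 2). For find(x) (continued): the exceptions are the two nodes of Case 1, the $O(\log^* n)$ nodes of Case 4 and the one node of Case 6, so the potential decreases by at least $l + 1 - (\log^* n + 3)$. Therefore $\mathrm{cost}_{\mathrm{am}}(\mathrm{find}) \le l + 1 - (l + 1 - (\log^* n + 3)) = \log^* n + 3$. This means $\mathrm{cost}_{\mathrm{am}}(\mathrm{find}) \in O(\log^* n)$.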
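
To get a feel for how slowly $\log^* n$ grows, here is a small sketch that computes it with exact integer arithmetic. The function name `log_star` is hypothetical, and the definition used is the common one: the number of times $\log_2$ must be applied to n before the result is at most 1.

```python
def log_star(n: int) -> int:
    """Iterated logarithm: smallest i such that a tower of i twos is >= n.

    Uses only integers, so there is no floating-point rounding.
    Not meant for n above 2**65536, since the next tower is too large to build.
    """
    count, tower = 0, 1
    while tower < n:
        tower = 2 ** tower  # 1, 2, 4, 16, 65536, 2**65536, ...
        count += 1
    return count


values = {n: log_star(n) for n in (1, 2, 16, 17, 65536)}
huge = log_star(2 ** 65536)
```

Even for n = 2^65536 the value is only 5, which is why a bound of $O(\log^* n)$ per operation is constant for all practical purposes.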

  13. Proof (part 3). For union(x,y): each of the two calls find(x), find(y) has amortized cost in $O(\log^* n)$. What of the union operation itself? Assume that $x_l$ becomes a child of $y_l$. $\mathrm{cost}_{\mathrm{real}}(\mathrm{union}) \in \Theta(1)$. $\phi(x_l)$ was $3\,\mathrm{rank}(x_l)$ before the union, and after the union it is either $3\,\mathrm{rank}(x_l) - \mathrm{rank}(y_l)$, or $\mathrm{rank}(x_l) + \lfloor \log_2 \mathrm{rank}(x_l) \rfloor - \lfloor \log_2 \mathrm{rank}(y_l) \rfloor$, or 0. So $\phi(x_l)$ has decreased.

  14. Proof (part 3). For union(x,y) (continued): what of the union operation itself? If $\mathrm{rank}(y_l)$ increases, then $\phi(y_l)$ will increase by 3. The potential $\phi(z)$ associated with a child z of $y_l$ may decrease (since its parent's rank has increased). So the potential difference is in $\Theta(1)$, and hence $\mathrm{cost}_{\mathrm{am}}(\mathrm{union}) \in O(\log^* n)$.
      Therefore we can conclude that a sequence of m operations on an initially empty union/find data structure with a total of n elements runs in $O(m \log^* n)$ time in the worst case.
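
For reference, the data structure being analyzed can be sketched as below. This is a minimal illustration assuming union by rank with path compression, with a root stored as its own parent. The class and method names are hypothetical, and the course's own implementation may differ in details.

```python
class UnionFind:
    """Disjoint sets with union by rank and path compression (illustrative sketch)."""

    def __init__(self):
        self.parent = {}
        self.rank = {}

    def make_set(self, x):
        # A new node is a root of rank 0.
        if x not in self.parent:
            self.parent[x] = x
            self.rank[x] = 0

    def find(self, x):
        # First pass: walk up to the root.
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Second pass: path compression, re-attach each node to the root.
        while self.parent[x] != root:
            nxt = self.parent[x]
            self.parent[x] = root
            x = nxt
        return root

    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return rx
        # Make the root of smaller rank (rx) a child of the other root (ry).
        if self.rank[rx] > self.rank[ry]:
            rx, ry = ry, rx
        self.parent[rx] = ry
        if self.rank[rx] == self.rank[ry]:
            self.rank[ry] += 1  # the only case in which a rank increases
        return ry


uf = UnionFind()
for i in range(5):
    uf.make_set(i)
uf.union(0, 1)
uf.union(2, 3)
uf.union(0, 2)
same = uf.find(0) == uf.find(3)
separate = uf.find(4) != uf.find(0)
```

Here the root that becomes a child plays the role of $x_l$ and the surviving root plays the role of $y_l$, whose rank increases only when both ranks were equal.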
