Amortized Analysis and Union-Find 02283, Inge Li Gørtz
Today • Amortized analysis • 3 different methods • 2 examples • Union-Find data structures • Worst-case complexity • Amortized complexity
Amortized Analysis • Amortized analysis. • Average running time per operation over a worst-case sequence of operations. • The time required to perform a sequence of data structure operations is averaged over all the operations performed. • Motivation: traditional worst-case-per-operation analysis can give an overly pessimistic bound if the only way to trigger an expensive operation is to perform many cheap ones before it. • Different from average-case analysis: the average is over time, not over the input.
Amortized Analysis • Methods. • Aggregate method • Accounting method • Potential method
Aggregate method • Aggregate. • Determine total cost. • Amortized cost = total cost/#operations.
Dynamic Tables • Doubling strategy. • Start with empty array of size 1. • Insert: If the array is full, create a new array of double the size and reinsert all elements. • Analysis: n insert operations. Assume n is a power of 2. • Number of insertions (including reinsertions): 1 + 2 + 4 + ... + 2^{log n} = 2n - 1 = O(n). • Total cost: O(n). • Amortized cost per insert: O(1).
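A minimal Python sketch of the doubling strategy, to make the aggregate count concrete. The names DynamicTable, cost, and total_cost are illustrative assumptions and not part of the slides; the cost model charges one unit per element written.

```python
class DynamicTable:
    """Array that doubles its capacity when full and counts element writes."""

    def __init__(self):
        self.capacity = 1
        self.slots = [None] * self.capacity
        self.size = 0
        self.cost = 0  # assumed cost model: 1 unit per insert or reinsert

    def insert(self, item):
        if self.size == self.capacity:
            # Table is full: allocate double the space and reinsert everything.
            self.capacity *= 2
            new_slots = [None] * self.capacity
            for i in range(self.size):
                new_slots[i] = self.slots[i]
                self.cost += 1
            self.slots = new_slots
        self.slots[self.size] = item
        self.size += 1
        self.cost += 1


def total_cost(n):
    """Total number of element writes for n consecutive inserts."""
    table = DynamicTable()
    for i in range(n):
        table.insert(i)
    return table.cost
```

Under this cost model, total_cost(8) gives 15 = 2n - 1, and the total stays below 3n for every n, so the amortized cost per insert is O(1).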
Accounting method • Accounting. • Some types of operations are overcharged. • Credit allocated with elements in the data structure is used to pay for subsequent operations. • Total credit non-negative at all times -> total amortized cost is an upper bound on the actual cost.
Dynamic Tables • Amortized costs: • Amortized cost of insertion: 3 • 1 for its own insertion. • 1 for its first reinsertion. • 1 to pay for reinsertion of one of the items that have already been reinserted once.
Dynamic Tables • Analysis: keep 2 credits on each element in the array that is beyond the middle. • Table not full: insert costs 1, and we have 2 credits to save. • Table full, i.e., doubling: half of the elements have 2 credits each. Use these to pay for reinsertion of all elements in the new array. • Amortized cost per operation: 3. • [Figure: array before and after doubling; each element beyond the middle carries 2 credits.]
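The accounting argument can be checked with a small simulation. This is a sketch under stated assumptions: the function name lowest_balance and the single pooled "bank" (instead of credits stored on individual elements) are simplifications introduced here, not taken from the slides.

```python
def lowest_balance(n, charge=3):
    """Simulate n inserts into a doubling table, charging `charge` per insert.

    Returns the lowest credit balance seen (0 is the initial balance).
    A non-negative result means the charges always covered the actual cost.
    """
    capacity, size = 1, 0
    bank = 0
    lowest = 0
    for _ in range(n):
        actual = 1  # writing the new element
        if size == capacity:
            actual += size  # reinsert every existing element
            capacity *= 2
        size += 1
        bank += charge - actual
        lowest = min(lowest, bank)
    return lowest
```

With a charge of 3 the balance never drops below zero, while a charge of 2 runs into debt at the fifth insert, which suggests 3 is the smallest constant charge that works in this model.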
Example: Stack with MultiPop • Stack with MultiPop. • Push(e): push element e onto stack. • MultiPop(k): pop top k elements from the stack. • Worst case: Implement via linked list or array. • Push: O(1). • MultiPop: O(k). • Amortized cost per operation: 2.
Stack: Aggregate Analysis • Amortized analysis. Sequence of n Push and MultiPop operations. • Each object popped at most once for each time it is pushed. • #pops on non-empty stack ≤ #Push operations ≤ n. • Total time O(n). • Amortized cost per operation: 2n/n = 2.
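A short Python sketch of the stack with MultiPop, instrumented to count actual cost. The class name MultiPopStack and the cost model (1 per Push, 1 per popped element, 1 for a MultiPop that pops nothing) are assumptions made for illustration.

```python
class MultiPopStack:
    """Stack with MultiPop that tracks the actual cost of all operations."""

    def __init__(self):
        self.items = []
        self.cost = 0

    def push(self, e):
        self.items.append(e)
        self.cost += 1

    def multipop(self, k):
        popped = []
        while self.items and len(popped) < k:
            popped.append(self.items.pop())
        # One unit per popped element; a call that pops nothing still costs 1.
        self.cost += max(1, len(popped))
        return popped


stack = MultiPopStack()
for e in range(1, 6):
    stack.push(e)                    # 5 pushes
first = stack.multipop(3)            # pops 5, 4, 3
second = stack.multipop(10)          # pops 2, 1
third = stack.multipop(1)            # stack already empty
operations = 8
```

The eight operations cost 11 units in total, which is within the 2n = 16 bound from the aggregate argument.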
Stack: Accounting Method • Amortized analysis. Sequence of n Push and MultiPop operations. • Pay 2 credits for each Push. • Keep 1 credit on each element on the stack. • Amortized cost per operation: • Push: 2 • MultiPop: 1 (to pay for pop on empty stack).
Potential method • Potential functions. • Prepaid credit (potential) associated with the data structure (money in the bank). • Can be used to pay for future operations. • Ensure there is always enough “money in the bank”. • Amortized cost of an operation: actual cost plus increase in potential due to the operation. • D_i: data structure after i operations. • Potential function Φ(D_i) maps D_i onto a real value. • amortized cost = actual cost + Δ(D_i) = actual cost + Φ(D_i) - Φ(D_{i-1}).
Potential Functions • Amortized cost: • amortized cost = actual cost + Δ(D_i) = actual cost + Φ(D_i) - Φ(D_{i-1}). • Stack. • Φ(D_i) = #elements on the stack. • amortized cost of Push = 1 + Δ(D_i) = 2. • amortized cost of MultiPop(k), where k' = min(k, |S|) elements are popped: • if S ≠ ∅: amortized cost = k' + Φ(D_i) - Φ(D_{i-1}) = k' - k' = 0. • if S = ∅: amortized cost = 1 + Δ(D_i) = 1.
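The stack potential can be evaluated mechanically. The following sketch computes actual cost + Φ(D_i) - Φ(D_{i-1}) with Φ = number of elements on the stack; the function name amortized_costs and the tuple encoding of operations are hypothetical, and k ≥ 1 is assumed for every MultiPop.

```python
def amortized_costs(ops):
    """Return the amortized cost of each operation, using phi = stack size.

    ops is a list of ("push", element) or ("multipop", k) pairs with k >= 1.
    """
    stack = []
    result = []
    for name, arg in ops:
        phi_before = len(stack)
        if name == "push":
            stack.append(arg)
            actual = 1
        else:
            k_eff = min(arg, len(stack))       # k' = min(k, |S|)
            del stack[len(stack) - k_eff:]     # remove the top k' elements
            actual = max(1, k_eff)             # empty-stack call costs 1
        result.append(actual + len(stack) - phi_before)
    return result


costs = amortized_costs([
    ("push", "a"), ("push", "b"), ("push", "c"),
    ("multipop", 2), ("multipop", 5), ("multipop", 1),
])
```

Here costs is [2, 2, 2, 0, 0, 1]: every Push has amortized cost 2, a MultiPop on a non-empty stack costs 0, and a MultiPop on an empty stack costs 1.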
Potential Functions • Amortized cost: • amortized cost = actual cost + Δ(D_i) = actual cost + Φ(D_i) - Φ(D_{i-1}). • Dynamic tables: • Φ(D_i) = 2(k - L/2) if k ≥ L/2, and Φ(D_i) = 0 otherwise. • L = current array size, k = number of elements in array. • amortized cost of insertion: • Array not full: amortized cost ≤ 1 + 2 = 3. • Array full (doubling): actual cost = L + 1, Φ(D_{i-1}) = L, Φ(D_i) = 2: amortized cost = L + 1 + (2 - L) = 3.
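As a sanity check, the dynamic-table potential can be evaluated insert by insert. This is a sketch only: the function names potential and amortized_insert_costs are introduced here, and the table is assumed to start empty with size 1 as in the doubling strategy.

```python
def potential(k, L):
    """Phi = 2(k - L/2) if k >= L/2, else 0 (kept in integers as 2k - L)."""
    return 2 * k - L if 2 * k >= L else 0


def amortized_insert_costs(n):
    """Amortized cost (actual + change in potential) of each of n inserts."""
    L, k = 1, 0
    costs = []
    for _ in range(n):
        phi_before = potential(k, L)
        actual = 1
        if k == L:
            actual += k   # reinsert all k elements into the new array
            L *= 2
        k += 1
        costs.append(actual + potential(k, L) - phi_before)
    return costs
```

Starting from the empty table, amortized_insert_costs(5) gives [2, 3, 3, 3, 3]: the very first insert is cheaper because the potential starts at 0, and every later insert, doubling or not, has amortized cost exactly 3.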
Amortized Cost vs Actual Cost • Total cost: • ∑ amortized cost = ∑ (actual cost + Δ(D_i)) = ∑ actual cost + Φ(D_n) - Φ(D_0). • ∑ actual cost = ∑ amortized cost + Φ(D_0) - Φ(D_n). • If the potential is always nonnegative and Φ(D_0) = 0, then ∑ actual cost ≤ ∑ amortized cost.
Potential Method • Summary: 1. Pick a potential function, Φ, that will work (art). 2. Use the potential function to bound the amortized cost of the operations you're interested in. 3. Bound Φ(D_0) - Φ(D_final). • Techniques to find potential functions: if the actual cost of an operation is high, then the decrease in potential due to this operation must be large, to keep the amortized cost low.
Union-Find Data Structures
Union-Find Data Structure • Union-Find data structure: • Makeset(x): Create a singleton set containing x and return its identifier. • Union(A,B): Combine the sets identified by A and B into a new set, destroying the old sets. Return the identifier of the new set. • Find(x): Return the identifier of the set containing x. • Only requirement for identifier: find(x) = find(y) iff x and y are in the same set. • Applications: Connectivity, Kruskal’s algorithm for MST, ...
A Simple Union-Find Data Structure • Quick-Union: • Each set represented by a tree. Elements are represented by nodes. Root is also identifier. • Make-Set(x): Create a new node x. Set p(x) = x. • Find(x): Follow parent pointers to the root. Return the root. • Union(A,B): Make root(B) a child of root(A).
A Simple Union-Find Data Structure • Quick-Union: • Union(A,B): Make root(B) a child of root(A). • [Figure: Quick-Union forest on elements 1..9 after each of Union(7,5), Union(3,1), Union(7,8), Union(3,7). At the end, 3 is the root of a tree with children 1 and 7, and 7 has children 5 and 8; elements 2, 4, 6, 9 remain singletons.]
A Simple Union-Find Data Structure • Quick-Union analysis: • Make-Set(x) and Union(A,B): O(1). • Find(x): O(h), where h is the height of the tree containing x. Worst case O(n).
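A minimal Python sketch of Quick-Union as described on the slides. The class name QuickUnion and the dictionary-based parent map are assumptions; union expects the identifiers (roots) of two different sets, which is what keeps it O(1).

```python
class QuickUnion:
    """Each set is a tree of parent pointers; the root identifies the set."""

    def __init__(self):
        self.parent = {}

    def make_set(self, x):
        self.parent[x] = x  # p(x) = x marks x as a root
        return x

    def find(self, x):
        # Follow parent pointers to the root: O(h) for a tree of height h.
        while self.parent[x] != x:
            x = self.parent[x]
        return x

    def union(self, a, b):
        # Assumes a and b are roots of distinct sets, as in the slides.
        self.parent[b] = a
        return a


uf = QuickUnion()
for x in range(1, 10):
    uf.make_set(x)
for a, b in [(7, 5), (3, 1), (7, 8), (3, 7)]:
    uf.union(a, b)
```

After this sequence, find(5) follows 5 -> 7 -> 3, so elements 1, 3, 5, 7, 8 all report 3 as their identifier, while 2, 4, 6, 9 remain singletons.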
A Simple Union-Find Data Structure • Quick-Find: • Each set represented by a tree of height at most one. Elements are represented by nodes. Root is also identifier. • Make-Set(x): Create a new node x. Set p(x) = x and size(x) = 1. • Find(x): Follow parent pointer to root. Return root. • Union(A,B): Move all elements from the smaller set to the larger set (change parent pointers). I.e., if B is the smaller set, set p(y) = A for every element y in B and size(A) = size(A) + size(B).
A Simple Union-Find Data Structure • Quick-Find: • Union(A,B): Move all elements from the smaller set to the larger set (change parent pointers). • [Figure: Quick-Find forest on elements 1..9 after each of Union(7,5), Union(3,1), Union(7,8), Union(3,7). At the end, 7 is the root with children 1, 3, 5, 8; elements 2, 4, 6, 9 remain singletons.]
A Simple Union-Find Data Structure • Quick-Find analysis: • Make-Set(x) and Find(x): O(1). • Union(A,B): O(n) in the worst case.
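A minimal Python sketch of Quick-Find with the smaller set moved into the larger one. The class name QuickFind, the per-root member lists (used in place of a size field so the elements to move can be enumerated), and the updates counter are assumptions introduced for illustration; union expects the identifiers of two different sets.

```python
class QuickFind:
    """Trees of height at most one: every element points directly at its root."""

    def __init__(self):
        self.parent = {}
        self.members = {}   # root -> list of all elements in its set
        self.updates = 0    # number of parent pointers changed by unions

    def make_set(self, x):
        self.parent[x] = x
        self.members[x] = [x]
        return x

    def find(self, x):
        return self.parent[x]  # O(1): one pointer lookup

    def union(self, a, b):
        # Make a the larger set; ties keep the first argument as the root.
        if len(self.members[a]) < len(self.members[b]):
            a, b = b, a
        for y in self.members[b]:
            self.parent[y] = a
            self.updates += 1
        self.members[a].extend(self.members.pop(b))
        return a


qf = QuickFind()
for x in range(1, 10):
    qf.make_set(x)
for a, b in [(7, 5), (3, 1), (7, 8), (3, 7)]:
    qf.union(a, b)
```

In the last step the set {3, 1} is smaller than {7, 5, 8}, so 3 and 1 are re-pointed to 7. The four unions change 1 + 1 + 1 + 2 = 5 parent pointers in total.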
Amortized Complexity of Quick-Find • Amortized analysis: Consider a sequence of k Unions. • Observation 1: How many elements can be touched by the k Unions? • Consider an element x: • What can we say about the size of the set containing x before and after a union that changes x’s parent pointer? • How large can the set containing x be after the k Unions? • How many times can x’s parent pointer be changed?
Amortized Complexity of Quick-Find • Amortized analysis: • Each time x’s parent pointer changes, the size of the set containing it at least doubles. • At most 2k elements can be touched by k unions. • The size of the set containing x after k unions is at most 2k. • x’s parent pointer is updated at most lg(2k) times. • In total O(k log k) parent pointers are updated in a sequence of k unions. • Amortized time per union: O(log k). • Lemma. Using the Quick-Find data structure, a Find operation takes worst-case time O(1), a Make-Set operation takes time O(1), and a sequence of n Union operations takes time O(n log n).
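The bound can be probed empirically. The sketch below merges n singletons pairwise in rounds, so that every union joins two equal-sized sets, and counts parent-pointer changes. The function name count_pointer_updates and the choice of this particular union sequence are assumptions, and n is assumed to be a power of two.

```python
import math


def count_pointer_updates(n):
    """Merge n singletons pairwise in rounds; return the number of pointer changes."""
    parent = list(range(n))
    members = {i: [i] for i in range(n)}
    roots = list(range(n))
    updates = 0
    while len(roots) > 1:
        next_roots = []
        for a, b in zip(roots[0::2], roots[1::2]):
            if len(members[a]) < len(members[b]):
                a, b = b, a
            for y in members[b]:      # move the smaller set into the larger
                parent[y] = a
                updates += 1
            members[a].extend(members.pop(b))
            next_roots.append(a)
        roots = next_roots
    return updates


n = 1024
k = n - 1                              # number of unions performed
updates = count_pointer_updates(n)
bound = 2 * k * math.log2(2 * k)       # at most 2k elements, lg(2k) updates each
```

For this sequence the count is (n/2) lg n, e.g. 12 for n = 8 and 5120 for n = 1024, which stays below the 2k lg(2k) figure obtained by combining the two observations above.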