The Probabilistic Method: Proof Through a Probabilistic Argument



  1. The Probabilistic Method: Proof Through a Probabilistic Argument
     • Compute $\sum_{i=0}^{n} i \binom{n}{i} \left(\frac{1}{2}\right)^{n}$

  2. Proof Through a Probabilistic Argument
     • Compute
       $$\sum_{i=0}^{n} i \binom{n}{i} \left(\frac{1}{2}\right)^{n}
         = \sum_{i=1}^{n} i\,\frac{n!}{i!\,(n-i)!} \left(\frac{1}{2}\right)^{n}
         = \sum_{i=1}^{n} \frac{n\,(n-1)!}{(i-1)!\,(n-i)!} \left(\frac{1}{2}\right)^{n}
         = \frac{n}{2} \sum_{i=1}^{n} \frac{(n-1)!}{(i-1)!\,(n-i)!} \left(\frac{1}{2}\right)^{n-1}
         = \frac{n}{2} \sum_{i=0}^{n-1} \frac{(n-1)!}{i!\,(n-i-1)!} \left(\frac{1}{2}\right)^{n-1}
         = \frac{n}{2},$$
       where the last step uses $\sum_{i=0}^{n-1} \binom{n-1}{i} \left(\frac{1}{2}\right)^{n-1} = 1$.

  3. Proof Through a Probabilistic Argument
     • Compute $\sum_{i=0}^{n} i \binom{n}{i} \left(\frac{1}{2}\right)^{n}$
     • Let $X \sim B(n, 1/2)$; write $X = \sum_{i=1}^{n} X_i$, where the $X_i$ are independent random variables with $\Pr(X_i = 1) = \Pr(X_i = 0) = 1/2$.
     • $\sum_{i=0}^{n} i \binom{n}{i} \left(\frac{1}{2}\right)^{n} = E[X] = \sum_{i=1}^{n} E[X_i] = \frac{n}{2}$
     • We prove a deterministic statement using a probabilistic argument!
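
     As a quick numerical sanity check (not part of the original slides), a short Python snippet confirms the identity; the helper name `binomial_identity_lhs` is illustrative.

     ```python
     from math import comb

     def binomial_identity_lhs(n: int) -> float:
         """Evaluate sum_{i=0}^{n} i * C(n, i) * (1/2)^n directly."""
         return sum(i * comb(n, i) * 0.5 ** n for i in range(n + 1))

     for n in (1, 5, 10, 20):
         # Both columns agree: the sum is E[X] = n/2 for X ~ B(n, 1/2).
         print(n, binomial_identity_lhs(n), n / 2)
     ```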

  4. Theorem. Given any graph G = (V, E) with n vertices and m edges, there is a partition of V into two disjoint sets A and B such that at least m/2 edges connect a vertex in A to a vertex in B.
     Proof. Construct the sets A and B by randomly assigning each vertex to one of the two sets, independently and uniformly. The probability that a given edge connects A to B is 1/2, so the expected number of such edges is m/2. Some outcome must achieve at least this expectation, so such a partition exists.
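
     To make the random experiment concrete, here is a small Python sketch (the helper name `random_cut` and the 4-vertex example graph are illustrative choices, not from the slides): repeating the random assignment quickly exhibits a partition meeting the m/2 bound.

     ```python
     import random

     def random_cut(n, edges):
         """Assign each vertex to side A (True) or B (False) uniformly at random
         and count the edges crossing the partition; the expected count is m/2."""
         side = [random.random() < 0.5 for _ in range(n)]
         return sum(1 for u, v in edges if side[u] != side[v])

     # Hypothetical example: 4 vertices, m = 4 edges.
     edges = [(0, 1), (1, 2), (0, 2), (2, 3)]
     best = max(random_cut(4, edges) for _ in range(100))
     print(best >= len(edges) / 2)   # essentially always True after 100 trials
     ```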

  5. Sample and Modify
     An independent set in a graph G is a set of vertices with no edges between them. Finding the largest independent set in a graph is an NP-hard problem.
     Theorem. Let G = (V, E) be a graph on n vertices with dn/2 edges. Then G has an independent set with at least n/(2d) vertices.
     Algorithm:
     1. Delete each vertex of G (together with its incident edges) independently with probability 1 − 1/d.
     2. For each remaining edge, remove it and one of its adjacent vertices.
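
     A minimal Python sketch of this two-step procedure (the function name and the cycle-graph usage example are illustrative assumptions, not from the slides):

     ```python
     import random

     def sample_and_modify_independent_set(n, edges, d):
         """Step 1: keep each vertex independently with probability 1/d.
         Step 2: for every edge with both endpoints still alive, drop one endpoint.
         The surviving vertices form an independent set; when the graph has dn/2
         edges, its expected size is at least n/(2d)."""
         alive = {v for v in range(n) if random.random() < 1.0 / d}
         for u, v in edges:
             if u in alive and v in alive:
                 alive.discard(v)
         return alive

     # Example: a cycle on 12 vertices has 12 = dn/2 edges with d = 2,
     # so the expected output size is at least 12 / (2 * 2) = 3.
     n = 12
     cycle = [(i, (i + 1) % n) for i in range(n)]
     print(len(sample_and_modify_independent_set(n, cycle, 2)))
     ```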

  6. Let X be the number of vertices that survive the first step of the algorithm:
     $E[X] = \frac{n}{d}$.
     Let Y be the number of edges that survive the first step. An edge survives if and only if both of its endpoints survive:
     $E[Y] = \frac{nd}{2} \left(\frac{1}{d}\right)^{2} = \frac{n}{2d}$.
     The second step of the algorithm removes all the remaining edges and at most Y vertices. Expected size of the output independent set:
     $E[X - Y] = \frac{n}{d} - \frac{n}{2d} = \frac{n}{2d}$.

  7. Conditional Expectation
     Definition.
     $E[Y \mid Z = z] = \sum_{y} y \Pr(Y = y \mid Z = z)$,
     where the summation is over all y in the range of Y.
     Lemma. For any random variables X and Y,
     $E[X] = \sum_{y} \Pr(Y = y)\, E[X \mid Y = y]$,
     where the sum is over all values y in the range of Y.
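
     A tiny numerical illustration of the lemma (a toy example added here, not from the slides), with X the sum of two fair dice and Y the value of the first die:

     ```python
     from fractions import Fraction
     from itertools import product

     # X = sum of two fair dice, Y = value of the first die.
     outcomes = list(product(range(1, 7), repeat=2))
     p = Fraction(1, len(outcomes))

     E_X = sum((a + b) * p for a, b in outcomes)

     # Recover E[X] as sum_y Pr(Y = y) * E[X | Y = y].
     total = Fraction(0)
     for y in range(1, 7):
         cond = [(a, b) for a, b in outcomes if a == y]
         pr_y = Fraction(len(cond), len(outcomes))
         e_x_given_y = Fraction(sum(a + b for a, b in cond), len(cond))
         total += pr_y * e_x_given_y

     print(E_X == total)   # True: both equal 7
     ```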

  8. Derandomization Using Conditional Expectations
     Given a graph G = (V, E) with n vertices and m edges, we showed that there is a partition of V into A and B such that at least m/2 edges connect A to B. How do we find such a partition?

  9. Let C(A, B) be the number of edges connecting A to B. If (A, B) is a random partition, $E[C(A, B)] = \frac{m}{2}$.
     Algorithm:
     1. Let $v_1, v_2, \dots, v_n$ be an arbitrary enumeration of the vertices.
     2. Let $x_i$ be the set where $v_i$ is placed ($x_i \in \{A, B\}$).
     3. For i = 1 to n: place $v_i$ such that
        $E[C(A,B) \mid x_1, x_2, \dots, x_i] \ge E[C(A,B) \mid x_1, x_2, \dots, x_{i-1}] \ge m/2$.

  10. Lemma. For all i = 1, ..., n there is an assignment of $v_i$ such that
      $E[C(A,B) \mid x_1, x_2, \dots, x_i] \ge E[C(A,B) \mid x_1, x_2, \dots, x_{i-1}] \ge m/2$.

  11. Proof. By induction on i. For i = 1,
      $E\big[\,E[C(A,B) \mid x_1]\,\big] = E[C(A,B)] = m/2$.
      For i > 1, if we place $v_i$ uniformly at random in one of the two sets,
      $E[C(A,B) \mid x_1, \dots, x_{i-1}] = \tfrac{1}{2}\, E[C(A,B) \mid x_1, \dots, x_{i-1}, x_i = A] + \tfrac{1}{2}\, E[C(A,B) \mid x_1, \dots, x_{i-1}, x_i = B]$.
      Hence
      $\max\big( E[C(A,B) \mid x_1, \dots, x_i = A],\; E[C(A,B) \mid x_1, \dots, x_i = B] \big) \ge E[C(A,B) \mid x_1, \dots, x_{i-1}] \ge m/2$,
      where the last inequality is the induction hypothesis.

  12. How do we compute
      $\max\big( E[C(A,B) \mid x_1, \dots, x_i = A],\; E[C(A,B) \mid x_1, \dots, x_i = B] \big)$?
      We only need to consider edges between $v_i$ and $v_1, \dots, v_{i-1}$: every other edge contributes the same amount to both conditional expectations.
      Simple Algorithm (a sketch follows below):
      1. Place $v_1$ arbitrarily.
      2. For i = 2 to n: place $v_i$ in the set containing the smaller number of its already-placed neighbors.
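
      A compact Python version of this greedy derandomization (the function name and the K4 usage example are illustrative choices, not from the slides):

      ```python
      def derandomized_cut(n, edges):
          """Derandomization by conditional expectations: place each vertex in the
          side that currently holds fewer of its already-placed neighbors, which
          keeps E[C(A,B) | choices so far] >= m/2 at every step."""
          adj = [[] for _ in range(n)]
          for u, v in edges:
              adj[u].append(v)
              adj[v].append(u)

          side = {}                      # vertex -> 'A' or 'B'
          for v in range(n):             # v_1, ..., v_n in an arbitrary order
              in_a = sum(1 for u in adj[v] if side.get(u) == 'A')
              in_b = sum(1 for u in adj[v] if side.get(u) == 'B')
              side[v] = 'B' if in_a > in_b else 'A'

          cut = sum(1 for u, v in edges if side[u] != side[v])
          return side, cut

      # Hypothetical example: complete graph K4 (m = 6); the guarantee is cut >= 3.
      edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
      print(derandomized_cut(4, edges)[1])   # prints 4, which is >= m/2 = 3
      ```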

  13. Perfect Hashing
      Goal: store a static dictionary of n items in a table of O(n) space such that any search takes O(1) time.

  14. Universal Hash Functions
      Definition. Let U be a universe with |U| ≥ n and let V = {0, 1, ..., n − 1}. A family H of hash functions from U to V is k-universal if, for any distinct elements $x_1, x_2, \dots, x_k$ and a hash function h chosen uniformly at random from H,
      $\Pr\big( h(x_1) = h(x_2) = \cdots = h(x_k) \big) \le \frac{1}{n^{k-1}}$.

  15. Example of 2-Universal Hash Functions
      Universe U = {0, 1, 2, ..., m − 1} and table keys V = {0, 1, 2, ..., n − 1}, with m ≥ n. Choose a prime p ≥ m and define
      $h_{a,b}(x) = ((ax + b) \bmod p) \bmod n$,
      taking the family $H = \{ h_{a,b} \mid 1 \le a \le p - 1,\ 0 \le b \le p - 1 \}$.
      Lemma. H is 2-universal.
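
      Drawing one function from this family takes a few lines of Python (the helper name `make_hash` and the prime p = 97 in the usage example are illustrative assumptions):

      ```python
      import random

      def make_hash(p, n):
          """Draw h_{a,b}(x) = ((a*x + b) mod p) mod n uniformly from the family
          above; p must be a prime at least as large as the universe size m."""
          a = random.randint(1, p - 1)
          b = random.randint(0, p - 1)
          return lambda x: ((a * x + b) % p) % n

      # Example with an assumed universe {0, ..., 96} (so the prime p = 97 works),
      # hashing into n = 10 table slots.
      h = make_hash(97, 10)
      print([h(x) for x in (3, 14, 15, 92, 65)])
      ```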

  16. Lemma. H is 2-universal.
      Proof. First observe that for $x_1, x_2 \in \{0, \dots, p - 1\}$ with $x_1 \ne x_2$, we have $ax_1 + b \not\equiv ax_2 + b \pmod{p}$, since $a \not\equiv 0 \pmod{p}$. Thus, if $h_{a,b}(x_1) = h_{a,b}(x_2)$, there is a pair $(r, s)$ with $r \ne s$ and $s \equiv r \pmod{n}$ such that
      $(ax_1 + b) \bmod p = r$ and $(ax_2 + b) \bmod p = s$.
      For each such pair $(r, s)$ there is exactly one pair $(a, b)$ satisfying these equations. There are p choices for r, and for each r there are at most $\lceil p/n \rceil - 1$ values $s \ne r$ with $s \equiv r \pmod{n}$. Thus the probability of a collision is at most
      $\frac{p \left( \lceil p/n \rceil - 1 \right)}{p(p - 1)} \le \frac{1}{n}$.

  17. Lemma. If h ∈ H is chosen uniformly at random from a 2-universal family of hash functions mapping the universe U to [0, n − 1], then for any set S ⊂ U of size m, with probability ≥ 1/2 the number of collisions is at most m²/n.
      Proof. Let $s_1, s_2, \dots, s_m$ be the m items of S. Let $X_{ij} = 1$ if $h(s_i) = h(s_j)$ and 0 otherwise, and let $X = \sum_{1 \le i < j \le m} X_{ij}$. Then
      $E[X] = \sum_{1 \le i < j \le m} E[X_{ij}] \le \binom{m}{2} \frac{1}{n} < \frac{m^2}{2n}$.
      Markov's inequality yields $\Pr(X \ge m^2/n) \le \Pr(X \ge 2E[X]) \le \frac{1}{2}$.

  18. Definition. A hash function h is perfect for a set S if it maps S with no collisions.
      Lemma. If h ∈ H is chosen uniformly at random from a 2-universal family of hash functions mapping the universe U to [0, n − 1], then for any set S ⊂ U of size m with m² ≤ n, h is perfect for S with probability ≥ 1/2.

  19. Theorem. The two-level approach gives a perfect hashing scheme for m items using O(m) bins.
      Level I: use a hash table with n = m bins. Let X be the number of collisions; then
      $\Pr(X \ge m^2/n) \le \Pr(X \ge 2E[X]) \le \frac{1}{2}$.
      With n = m, there exists a choice of hash function from the 2-universal family that gives at most m collisions.

  20. Level II: let $c_i$ be the number of items in the i-th bin. There are $\binom{c_i}{2}$ collisions between items in the i-th bin, so
      $\sum_{i=1}^{m} \binom{c_i}{2} \le m$.
      For each bin with $c_i > 1$ items, we find a second-level hash function that gives no collisions, using space $c_i^2$. The total number of bins used is bounded above by
      $m + \sum_{i=1}^{m} c_i^2 \le m + \sum_{i=1}^{m} \left( 2\binom{c_i}{2} + c_i \right) \le m + 2m + m = 4m$.
      Hence the total number of bins used is only O(m).
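
      Putting the two levels together, here is a hedged end-to-end sketch in Python. All names (`make_hash`, `perfect_hash_table`, `lookup`) and the example key set are illustrative assumptions; the retry loops rely on the ≥ 1/2 success probabilities proved above, and in this sketch every bin gets a (possibly trivial) second-level table.

      ```python
      import random

      def make_hash(p, n):
          """One draw from the 2-universal family h_{a,b}(x) = ((a*x + b) mod p) mod n."""
          a, b = random.randint(1, p - 1), random.randint(0, p - 1)
          return lambda x: ((a * x + b) % p) % n

      def perfect_hash_table(items, p):
          """Two-level sketch: level I hashes the m items into m bins, retrying until
          the total collision count is at most m; level II gives each bin with c items
          a table of c**2 slots, retrying until that bin is collision-free.
          p is a prime larger than every key."""
          m = len(items)
          while True:                                   # level I
              h1 = make_hash(p, m)
              bins = [[] for _ in range(m)]
              for x in items:
                  bins[h1(x)].append(x)
              if sum(c * (c - 1) // 2 for c in map(len, bins)) <= m:
                  break
          tables = []
          for bucket in bins:                           # level II
              size = max(len(bucket) ** 2, 1)
              while True:
                  h2 = make_hash(p, size)
                  slots = {}
                  if all(slots.setdefault(h2(x), x) == x for x in bucket):
                      break
              tables.append((h2, slots))
          return h1, tables

      def lookup(x, h1, tables):
          h2, slots = tables[h1(x)]
          return slots.get(h2(x)) == x                  # O(1) worst-case probes

      keys = [5, 19, 23, 42, 77, 101, 128]              # hypothetical static dictionary
      h1, tables = perfect_hash_table(keys, p=131)      # 131 is prime and > max key
      print(all(lookup(x, h1, tables) for x in keys))   # True
      ```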

  21. The First and Second Moment Method
      Theorem. For a nonnegative integer-valued random variable X,
      • $\Pr(X > 0) = \Pr(X \ge 1) \le E[X]$
      • $\Pr(X = 0) \le \Pr\big( |X - E[X]| \ge E[X] \big) \le \frac{Var[X]}{(E[X])^2}$

  22. Application: Number of Isolated Nodes
      Let $G_{n,p} = (V, E)$ be a random graph generated as follows:
      • The graph has n nodes.
      • Each of the $\binom{n}{2}$ pairs of vertices is connected by an edge with probability p, independently of every other edge in the graph.
      A node is isolated if it is adjacent to no edges. If p = 0, all vertices are isolated; if p = 1, no vertex is isolated. What can we say for 0 < p < 1?

  23. Application: Number of Isolated Nodes
      Theorem. For any function $w(n) \to \infty$:
      • If $p = \frac{\log n - w(n)}{n}$, then whp the graph has isolated nodes.
      • If $p = \frac{\log n + w(n)}{n}$, then whp the graph has no isolated nodes.

  24. Proof. For i = 1, ..., n, let $X_i = 1$ if node i is isolated and $X_i = 0$ otherwise, and let $X = \sum_{i=1}^{n} X_i$. Then
      $E[X] = n(1 - p)^{n-1}$.
      For $p = \frac{\log n + w(n)}{n}$, using $1 - p \le e^{-p}$,
      $E[X] = n(1 - p)^{n-1} \le e^{\log n - (n-1)p} = e^{-w(n) + p} \to 0$.
      Thus, for $p = \frac{\log n + w(n)}{n}$,
      $\Pr(X > 0) \le E[X] \to 0$.

  25. To use the second moment method we need to bound Var[X].
      $Var[X_i] \le E[X_i^2] = E[X_i] = (1 - p)^{n-1}$
      For $i \ne j$, $Cov(X_i, X_j) = (1 - p)^{2n-3} - (1 - p)^{2n-2}$.
      $Var[X] \le \sum_{i=1}^{n} Var[X_i] + \sum_{i \ne j} Cov(X_i, X_j) = n(1 - p)^{n-1} + n(n-1)(1 - p)^{2n-3} - n(n-1)(1 - p)^{2n-2} = n(1 - p)^{n-1} + n(n-1)\,p\,(1 - p)^{2n-3}$
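
      A small simulation (added here as an illustration; the constant w = 3 stands in for a slowly growing w(n)) shows the threshold behavior at p = log n / n:

      ```python
      import math
      import random

      def isolated_count(n, p):
          """Sample G(n, p) once and count the isolated (degree-0) vertices."""
          degree = [0] * n
          for u in range(n):
              for v in range(u + 1, n):
                  if random.random() < p:
                      degree[u] += 1
                      degree[v] += 1
          return sum(1 for d in degree if d == 0)

      n, w = 500, 3.0   # w is a fixed constant here, standing in for w(n) -> infinity
      for p in ((math.log(n) - w) / n, (math.log(n) + w) / n):
          # Below the threshold isolated nodes are common; above, they are rare.
          print(round(p, 4), isolated_count(n, p))
      ```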
