Advanced Algorithms (XIII), Shanghai Jiao Tong University, Chihao Zhang


  1. Advanced Algorithms (XIII), Shanghai Jiao Tong University, Chihao Zhang, June 1, 2020

  7. Total Variation Distance

      Let $\mu$ and $\nu$ be two distributions on $\Omega$. Their total variation distance is

      $$d_{TV}(\mu, \nu) = \frac{1}{2} \sum_{x \in \Omega} |\mu(x) - \nu(x)| = \max_{A \subseteq \Omega} |\mu(A) - \nu(A)|,$$

      that is, the $\ell_1$-distance scaled by $\frac{1}{2}$.

      [Figure: the distributions $\mu$ and $\nu$, with the set $A$ marked.]
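The two expressions for $d_{TV}$ can be checked against each other numerically. The following is a minimal sketch, assuming distributions are stored as dictionaries of exact fractions; the function names `tv_half_l1` and `tv_max_event` are illustrative and do not come from the slides. The maximum over events is computed by brute force, so it is only practical for a very small $\Omega$. The example values are the two-point distributions used in the slides' proof of the coupling lemma.

```python
from fractions import Fraction
from itertools import combinations

def tv_half_l1(mu, nu):
    """Half the l1-distance between two distributions given as dicts."""
    support = set(mu) | set(nu)
    return sum(abs(mu.get(x, 0) - nu.get(x, 0)) for x in support) / 2

def tv_max_event(mu, nu):
    """Maximum of mu(A) - nu(A) over all events A (brute force, tiny Omega only)."""
    support = sorted(set(mu) | set(nu))
    best = Fraction(0)
    for r in range(len(support) + 1):
        for A in combinations(support, r):
            diff = sum(mu.get(x, 0) - nu.get(x, 0) for x in A)
            best = max(best, diff)
    return best

# Two-point example: Omega = {1, 2}.
mu = {1: Fraction(1, 2), 2: Fraction(1, 2)}
nu = {1: Fraction(1, 3), 2: Fraction(2, 3)}

print(tv_half_l1(mu, nu), tv_max_event(mu, nu))
```

Both functions return 1/6 on this example; the maximizing event is $A = \{1\}$, where $\mu$ exceeds $\nu$.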

  12. Coupling

      Let $\mu$ and $\nu$ be two distributions on $\Omega$. A coupling of $\mu$ and $\nu$ is a joint distribution $\omega$ on $\Omega \times \Omega$ such that:

      $$\forall x \in \Omega, \quad \mu(x) = \sum_{y \in \Omega} \omega(x, y),$$

      $$\forall y \in \Omega, \quad \nu(y) = \sum_{x \in \Omega} \omega(x, y).$$
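The two marginal conditions are easy to test mechanically. Here is a small sketch, assuming the joint distribution is stored as a dictionary keyed by pairs `(x, y)`; the helper name `is_coupling` and the two example tables are illustrative assumptions, not part of the slides.

```python
from fractions import Fraction

def is_coupling(omega, mu, nu):
    """Check that the joint table omega[(x, y)] has marginals mu (over x) and nu (over y)."""
    xs, ys = list(mu), list(nu)
    if any(p < 0 for p in omega.values()):
        return False
    # Summing over y must give mu(x); summing over x must give nu(y).
    rows_ok = all(sum(omega.get((x, y), 0) for y in ys) == mu[x] for x in xs)
    cols_ok = all(sum(omega.get((x, y), 0) for x in xs) == nu[y] for y in ys)
    return rows_ok and cols_ok

mu = {1: Fraction(1, 2), 2: Fraction(1, 2)}
nu = {1: Fraction(1, 3), 2: Fraction(2, 3)}

# Independent (product) coupling: omega(x, y) = mu(x) * nu(y).
product = {(x, y): mu[x] * nu[y] for x in mu for y in nu}
# Not a coupling of mu and nu: the uniform table has the wrong second marginal.
uniform = {(x, y): Fraction(1, 4) for x in mu for y in nu}

print(is_coupling(product, mu, nu), is_coupling(uniform, mu, nu))
```

The product table shows that a coupling always exists; the interesting question is which coupling makes $X$ and $Y$ agree as often as possible.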

  18. Coupling Lemma

      Let $\omega$ be a coupling of $\mu$ and $\nu$, so that $(X, Y) \sim \omega \implies X \sim \mu$ and $Y \sim \nu$. Then

      $$\Pr_{(X,Y) \sim \omega}[X \neq Y] \ge d_{TV}(\mu, \nu).$$

      Moreover, there exists $\omega^*$ such that

      $$\Pr_{(X,Y) \sim \omega^*}[X \neq Y] = d_{TV}(\mu, \nu).$$
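One standard way to see the inequality is sketched below; it is not necessarily the argument given in the lecture. Fix any event $A \subseteq \Omega$ and any coupling $\omega$ with $(X, Y) \sim \omega$. Since $\Pr[Y \in A] \ge \Pr[X \in A \wedge Y \in A]$:

```latex
\begin{align*}
\mu(A) - \nu(A) &= \Pr[X \in A] - \Pr[Y \in A] \\
                &\le \Pr[X \in A \wedge Y \notin A] \\
                &\le \Pr[X \neq Y].
\end{align*}
```

Taking the maximum over $A$ on the left-hand side gives $d_{TV}(\mu, \nu) \le \Pr[X \neq Y]$.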

  27. Proof of Coupling Lemma

      For finite $\Omega$, designing a coupling is equivalent to filling an $\Omega \times \Omega$ matrix so that the marginals are correct.

      Example: $\Omega = \{1, 2\}$, $\mu = (1/2, 1/2)$, $\nu = (1/3, 2/3)$.

      $$\begin{array}{c|cc}
       & \mu(1) = \tfrac{1}{2} & \mu(2) = \tfrac{1}{2} \\ \hline
      \nu(1) = \tfrac{1}{3} & \tfrac{1}{3} & 0 \\
      \nu(2) = \tfrac{2}{3} & \tfrac{1}{6} & \tfrac{1}{2}
      \end{array}$$

      $\omega^*$ is the one maximizing the sum of the diagonal entries.
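The matrix-filling idea can be sketched in code. The construction below is one common way to build a coupling with the largest possible diagonal, offered as an assumption rather than as the lecture's own procedure: put $\min(\mu(x), \nu(x))$ on the diagonal, then spread the leftover mass of $\mu$ over the leftover mass of $\nu$ proportionally. The name `optimal_coupling` is illustrative. Keys are pairs `(x, y)` with `x` distributed as $\mu$ and `y` as $\nu$.

```python
from fractions import Fraction

def optimal_coupling(mu, nu):
    """Build a coupling of mu and nu that maximizes the total diagonal mass."""
    states = sorted(set(mu) | set(nu))
    m = {x: mu.get(x, Fraction(0)) for x in states}
    n = {x: nu.get(x, Fraction(0)) for x in states}
    omega = {(x, y): Fraction(0) for x in states for y in states}
    # Diagonal: as much agreement as both marginals allow.
    for x in states:
        omega[(x, x)] = min(m[x], n[x])
    # Leftover mass that still has to be placed off the diagonal.
    excess_mu = {x: m[x] - omega[(x, x)] for x in states}
    excess_nu = {y: n[y] - omega[(y, y)] for y in states}
    d = sum(excess_mu.values())  # total leftover mass, equal to d_TV(mu, nu)
    if d > 0:
        for x in states:
            for y in states:
                if x != y:
                    omega[(x, y)] = excess_mu[x] * excess_nu[y] / d
    return omega

mu = {1: Fraction(1, 2), 2: Fraction(1, 2)}
nu = {1: Fraction(1, 3), 2: Fraction(2, 3)}
omega = optimal_coupling(mu, nu)
pr_differ = sum(p for (x, y), p in omega.items() if x != y)
print(omega, pr_differ)
```

On the two-point example this reproduces the entries 1/3 and 1/2 on the diagonal, 1/6 and 0 off it, and $\Pr[X \neq Y] = 1/6 = d_{TV}(\mu, \nu)$.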

  31. Coupling of Markov Chains

      Consider two copies of the chain $P$:

      • The initial distributions are $\mu_0$ and $\nu_0$.
      • $\mu_t^T = \mu_0^T P^t$ and $\nu_t^T = \nu_0^T P^t$.

      A coupling of the two chains is a joint distribution $\omega$ of $\{\mu_t\}_{t \ge 0}$ and $\{\nu_t\}_{t \ge 0}$ satisfying the following conditions.

  37. $\{(X_t, Y_t)\}_{t \ge 0} \sim \omega$ is a pair of processes such that

      • $\forall a, b \in \Omega$, $\Pr[X_{t+1} = b \mid X_t = a] = P(a, b)$;
      • $\forall a, b \in \Omega$, $\Pr[Y_{t+1} = b \mid Y_t = a] = P(a, b)$;
      • $\forall t \ge 0$, $X_t = Y_t \implies X_{t'} = Y_{t'}$ for all $t' > t$.

      The first two conditions say that, marginally, $\{X_t\}$ and $\{Y_t\}$ are both the chain $P$. The third says that the two chains coalesce once they meet.
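A simple coupling that satisfies these conditions lets the two copies move independently until they first meet and then move together. The simulation below is a minimal sketch of that idea; the 3-state transition matrix `P`, the starting states, and the helper names `step` and `coupled_run` are all illustrative assumptions rather than anything taken from the slides.

```python
import random

def step(P, state, rng):
    """Sample the next state from row P[state] of the transition matrix."""
    return rng.choices(range(len(P)), weights=P[state])[0]

def coupled_run(P, x0, y0, steps, seed=0):
    """Run two copies of the chain P: independent until they meet, identical afterwards."""
    rng = random.Random(seed)
    x, y = x0, y0
    path = [(x, y)]
    for _ in range(steps):
        if x == y:
            x = y = step(P, x, rng)                    # coalesced: one shared move
        else:
            x, y = step(P, x, rng), step(P, y, rng)    # not yet met: independent moves
        path.append((x, y))
    return path

# Hypothetical example chain: a lazy walk on the path 0 - 1 - 2.
P = [[0.5, 0.5, 0.0],
     [0.25, 0.5, 0.25],
     [0.0, 0.5, 0.5]]
path = coupled_run(P, 0, 2, steps=50)
print(path[:10])
```

Each copy on its own is still a run of the chain $P$, because in both branches its next state is drawn from the row of $P$ indexed by its current state.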

  42. Fundamental Theorem via Coupling

      If a finite chain $P$ is irreducible and aperiodic, then it has a unique stationary distribution $\pi$. Moreover, for any initial distribution $\mu$, it holds that

      $$\lim_{t \to \infty} \mu^T P^t = \pi^T.$$

      Consider two chains $\{X_t\}_{t \ge 0}$ and $\{Y_t\}_{t \ge 0}$:

      • $X_0 \sim \pi$ and $Y_0 \sim \mu_0$, for arbitrary $\mu_0$.
      • A coupling where $X_t$ and $Y_t$ run independently.
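The convergence claimed by the theorem can be observed numerically on a small example. The sketch below uses a hypothetical irreducible, aperiodic 3-state chain (a lazy walk on a path) with stationary distribution $\pi = (1/4, 1/2, 1/4)$; the chain, the starting distribution, and the helper names `evolve` and `tv` are assumptions made for illustration.

```python
from fractions import Fraction as F

def evolve(mu, P):
    """One step of the chain on distributions: returns the row vector mu^T P."""
    n = len(P)
    return [sum(mu[x] * P[x][y] for x in range(n)) for y in range(n)]

def tv(mu, nu):
    """Total variation distance between two distributions given as lists."""
    return sum(abs(a - b) for a, b in zip(mu, nu)) / 2

# Hypothetical example chain and its stationary distribution.
P = [[F(1, 2), F(1, 2), F(0)],
     [F(1, 4), F(1, 2), F(1, 4)],
     [F(0), F(1, 2), F(1, 2)]]
pi = [F(1, 4), F(1, 2), F(1, 4)]

mu = [F(1), F(0), F(0)]          # start deterministically in state 0
distances = []
for t in range(11):
    distances.append(tv(mu, pi))  # d_TV(mu_t, pi)
    mu = evolve(mu, P)

print([str(d) for d in distances])
```

For this chain the distance to $\pi$ never increases and shrinks quickly with $t$, which is the behavior the coupling argument explains.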

  48. irreducible + aperiodic $\implies \exists t, \forall x, y$: $P^t(x, y) > 0$.

      Then for any $z \in \Omega$, there exists some $\theta > 0$ s.t.

      $$\Pr[X_t = Y_t] \ge \Pr[X_t = Y_t = z] = \Pr[X_t = z] \cdot \Pr[Y_t = z] = \pi(z) \cdot P^t(Y_0, z) \ge \theta > 0,$$

      and therefore

      $$\Pr[X_t \neq Y_t] \le 1 - \theta < 1.$$

      Splitting on whether the two chains agree at time $t$:

      $$\begin{aligned}
      \Pr[X_{2t} \neq Y_{2t}] &= \Pr[X_{2t} \neq Y_{2t} \wedge X_t = Y_t] + \Pr[X_{2t} \neq Y_{2t} \wedge X_t \neq Y_t] \\
      &= \Pr[X_{2t} \neq Y_{2t} \mid X_t \neq Y_t] \cdot \Pr[X_t \neq Y_t] \\
      &\le (1 - \theta)^2,
      \end{aligned}$$

      where the first term vanishes because the chains coalesce once they meet.

      …
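The quantities $t$ and $\theta$ in this argument can be computed explicitly for a concrete chain. The sketch below does so for a hypothetical 3-state chain; the chain, its stationary distribution `pi`, the target state `z`, and the helper names are assumptions for illustration. Taking $\theta = \pi(z) \cdot \min_x P^t(x, z)$ is one valid choice, since it lower-bounds $\pi(z) \cdot P^t(Y_0, z)$ for every starting state $Y_0$.

```python
from fractions import Fraction as F

def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def first_positive_power(P, max_t=100):
    """Return (t, P^t) for the smallest t <= max_t with all entries of P^t positive, else None."""
    Q = P
    for t in range(1, max_t + 1):
        if all(q > 0 for row in Q for q in row):
            return t, Q
        Q = matmul(Q, P)
    return None

# Hypothetical irreducible, aperiodic chain with stationary distribution pi.
P = [[F(1, 2), F(1, 2), F(0)],
     [F(1, 4), F(1, 2), F(1, 4)],
     [F(0), F(1, 2), F(1, 2)]]
pi = [F(1, 4), F(1, 2), F(1, 4)]

t, Pt = first_positive_power(P)
z = 1  # any fixed target state works; the middle state is chosen here
theta = pi[z] * min(Pt[x][z] for x in range(len(P)))
print(t, theta)
```

Here $P$ itself has zero entries but $P^2$ does not, so $t = 2$, and the choice $z = 1$ gives $\theta = 1/4$.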
