New Generalizations of the Bethe Approximation via Asymptotic Expansion
Ryuhei Mori, Toshiyuki Tanaka
Kyoto University
35th Symposium of Information Theory and Its Application, Beppu, Oita, Japan
13 December 2012

The Bethe approximation
◮ Successful approximation for low-density parity-check codes, compressed sensing, etc.
◮ Efficient message-passing algorithm: belief propagation (BP).
◮ A fixed point of BP is a stationary point of the Bethe free energy [Yedidia et al. 2005].

Factor graph and partition function
For a factor graph $G$:
◮ $V$: the set of variable nodes
◮ $F$: the set of factor nodes
◮ $\mathcal{X}$: the alphabet set
◮ $N$: the number of variables
◮ $d_o$: the degree of a node $o \in V \cup F$
◮ $f_a$: a non-negative function $f_a \colon \mathcal{X}^{d_a} \to \mathbb{R}_{\ge 0}$
\[ p(x; G) := \frac{1}{Z(G)} \prod_{a \in F} f_a(x_{\partial a}), \qquad Z(G) := \sum_{x \in \mathcal{X}^N} \prod_{a \in F} f_a(x_{\partial a}) \]
[Figure: example factor graph with variable nodes $i_1, \dots, i_7$ and factor nodes $a_1, \dots, a_5$.]
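The definition of $Z(G)$ can be made concrete with a brute-force sum over all configurations. This is only a minimal sketch: the helper name `partition_function`, the `(scope, function)` encoding of factors, and the three-variable binary chain are illustrative assumptions, not part of the slides.

```python
import itertools


def partition_function(num_vars, alphabet, factors):
    """Sum over all x in alphabet^num_vars of the product of factor values.

    factors is a list of (scope, f) pairs: scope is a tuple of variable
    indices (the neighbourhood of the factor) and f maps the tuple of their
    values to a non-negative number.
    """
    total = 0.0
    for x in itertools.product(alphabet, repeat=num_vars):
        weight = 1.0
        for scope, f in factors:
            weight *= f(tuple(x[i] for i in scope))
        total += weight
    return total


def agree(values):
    # Hypothetical pairwise factor: weight 2 if the two values agree, else 1.
    return 2.0 if values[0] == values[1] else 1.0


# Hypothetical chain x0 - a0 - x1 - a1 - x2 over the binary alphabet.
factors = [((0, 1), agree), ((1, 2), agree)]
Z = partition_function(3, (0, 1), factors)
```

The cost grows as $|\mathcal{X}|^N$, which is why approximations such as the Bethe approximation are of interest.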

The Legendre transformation
\[ -\log Z(G) = \inf_{q \in \mathcal{P}(\mathcal{X}^N)} \Bigl\{ -\sum_{x \in \mathcal{X}^N} q(x) \log \prod_{a \in F} f_a(x_{\partial a}) - H(q) \Bigr\} \]
where $H(q)$ is the Shannon entropy. $\log Z(G)$ and $-H(q)$ are dual in the sense of the Legendre transformation:
\[ \log Z(G) \longleftrightarrow -H(q) \]
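The variational formula can be checked numerically on a tiny example. In the sketch below, the list `weights` stands for the values of $\prod_a f_a(x_{\partial a})$ over four configurations; these numbers and the function name `gibbs_free_energy` are assumptions chosen for illustration. The objective equals $-\log Z$ at $q = p$ and is larger at any other $q$, here the uniform distribution.

```python
import math


def gibbs_free_energy(q, weights):
    """-sum_x q(x) log w(x) - H(q), written as sum_x q(x) (log q(x) - log w(x))."""
    return sum(qx * (math.log(qx) - math.log(w))
               for qx, w in zip(q, weights) if qx > 0)


# Hypothetical unnormalised weights w(x) for four configurations.
weights = [2.0, 1.0, 1.0, 4.0]
Z = sum(weights)
p = [w / Z for w in weights]   # the normalised distribution p(x)
uniform = [0.25] * 4           # some other candidate q

value_at_p = gibbs_free_energy(p, weights)
value_at_uniform = gibbs_free_energy(uniform, weights)
```

The gap `value_at_uniform - value_at_p` is the Kullback-Leibler divergence from the uniform distribution to $p$, which is why the infimum is attained at $q = p$.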

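The Bethe free energy
\[ -\log Z(G) = \inf_{q \in \mathcal{P}(\mathcal{X}^N)} \Bigl\{ -\sum_{x \in \mathcal{X}^N} q(x) \log \prod_{a \in F} f_a(x_{\partial a}) - H(q) \Bigr\} \]
\[ -\log Z_{\mathrm{Bethe}}(G) = \inf_{(b_i \in \mathcal{P}(\mathcal{X}))_{i \in V},\ (b_a \in \mathcal{P}(\mathcal{X}^{d_a}))_{a \in F}} \Bigl\{ -\sum_{a \in F} \sum_{x_{\partial a} \in \mathcal{X}^{d_a}} b_a(x_{\partial a}) \log f_a(x_{\partial a}) - H_{\mathrm{Bethe}}((b_i)_{i \in V}, (b_a)_{a \in F}) \Bigr\} \]
where
\[ H_{\mathrm{Bethe}}((b_i)_{i \in V}, (b_a)_{a \in F}) := \sum_{a \in F} H(b_a) - \sum_{i \in V} (d_i - 1) H(b_i). \]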
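As a sanity check of the Bethe objective, the sketch below evaluates it at the exact marginals of a small tree-structured factor graph, where it is known to coincide with $-\log Z(G)$. It evaluates the objective at one point rather than computing the infimum, and the chain, the factor `agree`, and all function names are illustrative assumptions.

```python
import itertools
import math

ALPHABET = (0, 1)
N = 3


def agree(values):
    # Hypothetical pairwise factor: weight 2 if the two values agree, else 1.
    return 2.0 if values[0] == values[1] else 1.0


# Hypothetical tree (a chain): x0 - a0 - x1 - a1 - x2.
FACTORS = [((0, 1), agree), ((1, 2), agree)]


def joint():
    """Return the exact distribution p(x) and the partition function Z."""
    weights = {}
    for x in itertools.product(ALPHABET, repeat=N):
        w = 1.0
        for scope, f in FACTORS:
            w *= f(tuple(x[i] for i in scope))
        weights[x] = w
    z = sum(weights.values())
    return {x: w / z for x, w in weights.items()}, z


def marginal(p, scope):
    m = {}
    for x, px in p.items():
        key = tuple(x[i] for i in scope)
        m[key] = m.get(key, 0.0) + px
    return m


def entropy(dist):
    return -sum(v * math.log(v) for v in dist.values() if v > 0)


def bethe_free_energy(p):
    """Bethe objective with b_a and b_i taken to be the marginals of p."""
    degree = [0] * N
    energy = 0.0
    h_bethe = 0.0
    for scope, f in FACTORS:
        b_a = marginal(p, scope)
        energy -= sum(v * math.log(f(key)) for key, v in b_a.items())
        h_bethe += entropy(b_a)
        for i in scope:
            degree[i] += 1
    for i in range(N):
        h_bethe -= (degree[i] - 1) * entropy(marginal(p, (i,)))
    return energy - h_bethe


p, Z = joint()
F_bethe = bethe_free_energy(p)
```

On graphs with cycles the same function generally does not return $-\log Z(G)$, which is the gap that the loop calculus and the asymptotic expansion in this talk quantify.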

Characterizations of the Bethe free energy
◮ Loop calculus [Chertkov and Chernyak 2006, 2007]:
\[ Z(G) = Z_{\mathrm{Bethe}} \Bigl( 1 + \sum_{C \colon \text{generalized loop}} r(C) \Bigr) \]
→ generalized to non-binary alphabets [This work]
◮ Method of graph cover [Vontobel 2010]:
\[ \frac{1}{M} \log \langle Z_{\Sigma_M} \rangle \to \log Z_{\mathrm{Bethe}} \]
→ generalized to the second-order analysis [This work]

Loop calculus for the binary alphabet
Lemma (Chertkov and Chernyak 2006, Sudderth et al. 2008). Assume that the alphabet is binary, i.e., $\mathcal{X} = \{0, 1\}$. Let $\eta_i := \langle X_i \rangle_{b_i} = b_i(1)$. For any stationary point $((b_i), (b_a))$ of the Bethe free energy,
\[ Z(G) = Z_{\mathrm{Bethe}}((b_i)_{i \in V}, (b_a)_{a \in F}) \sum_{E' \subseteq E} Z(E') \]
where
\[ Z(E') := \prod_{i \in V} \Biggl\langle \Biggl( \frac{X_i - \eta_i}{\sqrt{\langle (X_i - \eta_i)^2 \rangle_{b_i}}} \Biggr)^{d_i(E')} \Biggr\rangle_{b_i} \cdot \prod_{a \in F} \Biggl\langle \prod_{i \in \partial a,\ (i,a) \in E'} \frac{X_i - \eta_i}{\sqrt{\langle (X_i - \eta_i)^2 \rangle_{b_i}}} \Biggr\rangle_{b_a}. \]

Generalized loop
\[ \mathcal{G} := \{ E' \subseteq E \mid d_o(E') \neq 1 \text{ for } o \in V \cup F \} \]
\[ Z(G) = Z_{\mathrm{Bethe}}((b_i)_{i \in V}, (b_a)_{a \in F}) \Bigl( 1 + \sum_{E' \in \mathcal{G} \setminus \{\emptyset\}} Z(E') \Bigr). \]
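The set of generalized loops can be enumerated directly from its definition for small graphs. The sketch below is illustrative only: the function name `generalized_loops`, the node labels, and the two example graphs are assumptions, and the enumeration is exponential in the number of edges.

```python
import itertools


def generalized_loops(edges):
    """Return all non-empty edge subsets in which no node has degree exactly 1."""
    nodes = {node for edge in edges for node in edge}
    loops = []
    for size in range(1, len(edges) + 1):
        for subset in itertools.combinations(edges, size):
            degree = dict.fromkeys(nodes, 0)
            for u, v in subset:
                degree[u] += 1
                degree[v] += 1
            if all(d != 1 for d in degree.values()):
                loops.append(subset)
    return loops


# Hypothetical single-cycle factor graph: i1 - a1 - i2 - a2 - i3 - a3 - i1.
cycle = [("i1", "a1"), ("a1", "i2"), ("i2", "a2"),
         ("a2", "i3"), ("i3", "a3"), ("a3", "i1")]
# Hypothetical tree factor graph: i1 - a1 - i2 - a2 - i3.
tree = [("i1", "a1"), ("a1", "i2"), ("i2", "a2"), ("a2", "i3")]

cycle_loops = generalized_loops(cycle)
tree_loops = generalized_loops(tree)
```

The single cycle has exactly one non-empty generalized loop (the whole cycle), and the tree has none, so the correction term vanishes on the tree.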

Loop calculus for a non-binary alphabet 1/2
Theorem (This work). For any stationary point $((b_i), (b_a))$ of the Bethe free energy,
\[ Z(G) = Z_{\mathrm{Bethe}}((b_i)_{i \in V}, (b_a)_{a \in F}) \sum_{E' \subseteq E} Z(E') \]
where
\[ Z(E') := \sum_{y \in (\mathcal{X} \setminus \{0\})^{|E'|}} \prod_{i \in V} \Biggl\langle \prod_{a \in \partial i,\ (i,a) \in E'} \frac{\partial \log b_i(X_i)}{\partial \eta_{i, y_{i,a}}} \Biggr\rangle_{b_i} \cdot \prod_{a \in F} \Biggl\langle \prod_{i \in \partial a,\ (i,a) \in E'} \frac{\partial \log b_i(X_i)}{\partial \theta_{i, y_{i,a}}} \Biggr\rangle_{b_a}. \]
Coordinate systems: the natural parameters $(\theta_{i,y})_{y \in \mathcal{X} \setminus \{0\}}$ and the expectation parameters $(\eta_{i,y})_{y \in \mathcal{X} \setminus \{0\}}$.

Loop calculus for a non-binary alphabet 2/2
The Jacobian matrix $\partial \theta / \partial \eta$ is the Fisher information matrix.
Theorem (This work). If one chooses a sufficient statistic $t_i(x_i)$ for $i \in V$ such that the Fisher information matrix is diagonal at $b_i$, it holds that
\[ Z(E') = \sum_{y \in (\mathcal{X} \setminus \{0\})^{|E'|}} \prod_{i \in V} \Biggl\langle \prod_{a \in \partial i,\ (i,a) \in E'} \frac{t_{i, y_{i,a}}(X_i) - \eta_{i, y_{i,a}}}{\sqrt{\bigl\langle (t_{i, y_{i,a}}(X_i) - \eta_{i, y_{i,a}})^2 \bigr\rangle_{b_i}}} \Biggr\rangle_{b_i} \cdot \prod_{a \in F} \Biggl\langle \prod_{i \in \partial a,\ (i,a) \in E'} \frac{t_{i, y_{i,a}}(X_i) - \eta_{i, y_{i,a}}}{\sqrt{\bigl\langle (t_{i, y_{i,a}}(X_i) - \eta_{i, y_{i,a}})^2 \bigr\rangle_{b_i}}} \Biggr\rangle_{b_a}. \]
Acknowledgment: P. Vontobel for insightful discussion about normal factor graphs.

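Loop calculus for expectations
Theorem (This work; it can be simplified like the previous theorem). Let $C \subseteq V$, $F_C := \{ a \in F \mid \partial a \subseteq C \}$ and $g \colon \mathcal{X}^{|C|} \to \mathbb{R}$. For any $((b_i), (b_a)) \in A$, it holds that
\[ Z \langle g(X_C) \rangle_p = Z_{\mathrm{Bethe}}((b_i)_{i \in V}, (b_a)_{a \in F}) \sum_{E' \subseteq E \setminus E(F_C)} Z(E') \]
where
\[ Z(E') := \sum_{y \in (\mathcal{X} \setminus \{0\})^{|E'|}} \prod_{i \in V \setminus C} \Biggl\langle \prod_{a \in \partial i,\ (i,a) \in E'} \frac{\partial \log b_i(X_i)}{\partial \eta_{i, y_{i,a}}} \Biggr\rangle_{b_i} \cdot \prod_{a \in F \setminus F_C} \Biggl\langle \prod_{i \in \partial a,\ (i,a) \in E'} \frac{\partial \log b_i(X_i)}{\partial \theta_{i, y_{i,a}}} \Biggr\rangle_{b_a} \cdot \Biggl\langle g(X_C) \prod_{i \in C,\ (i,a) \in E'} \frac{\partial \log b_i(X_i)}{\partial \eta_{i, y_{i,a}}} \Biggr\rangle_{b_C}. \]
Here, $\langle \cdot \rangle_{b_C}$ is a pseudo-expectation with respect to
\[ b_C(x_C) = \prod_{i \in C} b_i(x_i) \prod_{a \in F_C} \frac{b_a(x_{\partial a})}{\prod_{i \in \partial a} b_i(x_i)}. \]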

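Loop calculus for single-cycle graph
[Figure: single-cycle factor graph with variable nodes $i_1, i_2, i_3$ and factor nodes $a_1, a_2, a_3$.]
\[ \mathrm{Cor}_{b_{a_k}}[t_{i_k}(X_{i_k}), t_{i_{k+1}}(X_{i_{k+1}})] := \mathrm{Var}_{b_{i_k}}[t_{i_k}(X_{i_k})]^{-\frac{1}{2}}\, \mathrm{Cov}_{b_{a_k}}[t_{i_k}(X_{i_k}), t_{i_{k+1}}(X_{i_{k+1}})]\, \mathrm{Var}_{b_{i_{k+1}}}[t_{i_{k+1}}(X_{i_{k+1}})]^{-\frac{1}{2}}. \]
Corollary (Partition function of single-cycle factor graph).
\[ Z(G) = Z_{\mathrm{Bethe}}((b_i)_{i \in V}, (b_a)_{a \in F}) \cdot \Bigl( 1 + \mathrm{tr}\bigl( \mathrm{Cor}_{b_{a_1}}[t_{i_1}(X_{i_1}), t_{i_2}(X_{i_2})]\, \mathrm{Cor}_{b_{a_2}}[t_{i_2}(X_{i_2}), t_{i_3}(X_{i_3})] \cdots \mathrm{Cor}_{b_{a_n}}[t_{i_n}(X_{i_n}), t_{i_1}(X_{i_1})] \bigr) \Bigr). \]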

Correlation matrix on a tree factor graph
[Figure: example tree factor graph.]
Corollary (Correlation matrix on a tree factor graph; Watanabe 2010).
\[ \mathrm{Cor}_p[X_1, X_n] = \mathrm{Cor}_p[t_1(X_1), t_2(X_2)]\, \mathrm{Cor}_p[t_2(X_2), t_3(X_3)] \cdots \mathrm{Cor}_p[t_{n-1}(X_{n-1}), t_n(X_n)] \]

Graph cover
[Figure: a factor graph $G$ with variable nodes $i_1, \dots, i_4$ and factor nodes $a_1, a_2, a_3$ has partition function $Z(G)$. Disjoint copies of $G$, with nodes indexed $(0), (1), (2)$, have partition function $Z(G)^M$. The slide asks whether $Z(G)^M \approx Z(G_\sigma)$ for a graph cover $G_\sigma$.]

The method of graph cover
Lemma (Vontobel 2010).
\[ \log \langle Z_{\Sigma_M} \rangle = M \log Z_{\mathrm{Bethe}} + o(M) \]
Sketch of the proof. The method of types and the Laplace method.

The second-order analysis for graph cover
Lemma (This work).
\[ \log \langle Z_{\Sigma_M} \rangle = M \log Z_{\mathrm{Bethe}} + \log \sqrt{\zeta(u)} + o(1) \]
where $\zeta(u)$ is the edge zeta function and $u^a_{i \to j} = \mathrm{Cor}_{b_a}[t_i(X_i), t_j(X_j)]$.
Sketch of the proof. Laplace method with the central approximation.
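The second-order term can be illustrated on a binary single-cycle graph with positive pairwise factors, where the average over $M$-covers can be computed by enumeration. The sketch rests on assumptions that are not stated on the slides: (i) every $M$-cover of a cycle is a disjoint union of longer cycles whose lengths follow the cycle type of a uniformly random permutation in $S_M$; (ii) a cycle that winds $k$ times has partition function $\mathrm{tr}(T^k) = \lambda_1^k + \lambda_2^k$, where $T$ is the product of the factor matrices around the cycle; (iii) $Z_{\mathrm{Bethe}} = \lambda_1$, the largest eigenvalue of $T$, which is a standard property of single-cycle graphs. The eigenvalues used are hypothetical.

```python
import itertools
import math


def cycle_lengths(perm):
    """Cycle lengths of a permutation given as a tuple mapping j to perm[j]."""
    seen = [False] * len(perm)
    lengths = []
    for start in range(len(perm)):
        if not seen[start]:
            length = 0
            j = start
            while not seen[j]:
                seen[j] = True
                j = perm[j]
                length += 1
            lengths.append(length)
    return lengths


def average_cover_partition_function(lam1, lam2, M):
    """Average of Z over all M-covers of a single cycle (assumptions (i), (ii))."""
    total = 0.0
    count = 0
    for perm in itertools.permutations(range(M)):
        z = 1.0
        for k in cycle_lengths(perm):
            z *= lam1 ** k + lam2 ** k  # tr(T^k) for a 2x2 matrix T
        total += z
        count += 1
    return total / count


lam1, lam2 = 3.0, 1.0   # hypothetical eigenvalues of T, e.g. T = [[2, 1], [1, 2]]
r = lam2 / lam1
M = 6
average = average_cover_partition_function(lam1, lam2, M)
second_order = math.log(average) - M * math.log(lam1)  # log<Z> - M log Z_Bethe
limit = -math.log(1.0 - r)                             # log(1 / (1 - r))
```

Under these assumptions the normalized average equals $\sum_{k=0}^{M} r^k$ with $r = \lambda_2 / \lambda_1$, so `second_order` approaches `limit` as $M$ grows, which matches an $O(1)$ correction of the form $\log \sqrt{\zeta(u)}$.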

Interpretation of Legendre transformation by large deviation
\[ \log Z(G) = \frac{1}{M} \log Z(G)^M = \lim_{M \to \infty} \frac{1}{M} \log Z(G)^M = -\inf_{p \in \mathcal{P}(\mathcal{X}^N)} \Bigl\{ -\sum_{x \in \mathcal{X}^N} p(x) \log \prod_{a \in F} f_a(x_{\partial a}) - H(p) \Bigr\} \]
From a more detailed analysis (asymptotic expansion):
\[ \log Z(G)^M = M \log Z(G) + \underbrace{\log \sqrt{\frac{\det(J(\theta))}{\prod_x p(x)}}}_{=0} + \frac{1}{M} \cdot 0 + \frac{1}{M^2} \cdot 0 + \cdots \]

Asymptotic expansion and asymptotic Bethe approximation
\[ \log Z(G)^M = M \log Z(G) + \underbrace{\log \sqrt{\frac{\det(J(\theta))}{\prod_x p(x)}}}_{=0} + \frac{1}{M} \cdot 0 + \frac{1}{M^2} \cdot 0 + \cdots \]
\[ \log \langle Z_{\Sigma_M} \rangle = M \log Z_{\mathrm{Bethe}} + \underbrace{\log \sqrt{\frac{\det(\nabla^2 F_{\mathrm{Bethe}})^{-1}}{\prod_{i} \prod_{x_i} b_i(x_i)^{1 - d_i} \prod_{a \in F} \prod_{x_{\partial a}} b_a(x_{\partial a})}}}_{= \log \sqrt{\zeta(u)} \text{ [Watanabe and Fukumizu 2010]}} + \frac{1}{M} g_1 + \frac{1}{M^2} g_2 + \cdots. \]
By letting $M = 1$:
Definition (Asymptotic Bethe approximation). For $m = 1, 2, \dots$,
\[ \log Z^{(m)}_{\mathrm{AB}} := \log Z_{\mathrm{Bethe}} + \log \sqrt{\zeta(u)} + g_1 + g_2 + \cdots + g_{m-1}. \]

Edge zeta function
Definition (Prime cycle). A closed walk $e_1 \rightharpoonup e_2 \rightharpoonup \cdots \rightharpoonup e_n \rightharpoonup e_1$ is a prime cycle $\iff$ it is backtrackless and cannot be expressed as a power of another walk.
Definition (Edge zeta function).
\[ \zeta(u) = \prod_{(e_1 \rightharpoonup e_2 \rightharpoonup \cdots \rightharpoonup e_n \rightharpoonup e_1) \text{ is a prime cycle}} \frac{1}{\det(I - u_{e_1, e_2} u_{e_2, e_3} \cdots u_{e_n, e_1})}. \]
Lemma (Watanabe-Fukumizu formula; 2010).
\[ \zeta(u)^{-1} = \det\bigl(\nabla^2 F_{\mathrm{Bethe}}((\eta_i), (\eta_{\langle a \rangle}))\bigr) \prod_{i \in V} \det(\mathrm{Var}_{b_i}[t_i(X_i)])^{1 - d_i} \cdot \prod_{a \in F} \det(\mathrm{Var}_{b_a}[t_a(X_{\partial a})]) \]
where $u^a_{i \to j} = \mathrm{Cor}_{b_a}[t_i(X_i), t_j(X_j)]$.
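For scalar weights (for example a binary alphabet, where each correlation is a single number), the infinite product over prime cycles can be evaluated with the standard determinant expression $\zeta(u)^{-1} = \det(I - W)$, where $W$ is the directed-edge matrix with backtracking transitions removed. This identity is not stated on the slide. The sketch below uses an ordinary triangle graph rather than a factor graph, attaches each weight to the edge being entered, and picks arbitrary weights; all of these are simplifying assumptions.

```python
def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-15:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def inverse_edge_zeta(directed_edges, weight):
    """det(I - W) with W[e][f] = weight[f] if f follows e without backtracking."""
    n = len(directed_edges)
    m = [[1.0 if r == c else 0.0 for c in range(n)] for r in range(n)]
    for r, (tail, head) in enumerate(directed_edges):
        for c, (next_tail, next_head) in enumerate(directed_edges):
            if head == next_tail and next_head != tail:
                m[r][c] -= weight[(next_tail, next_head)]
    return determinant(m)


# Hypothetical triangle 0 - 1 - 2 - 0 with both orientations of each edge.
edges = [(0, 1), (1, 2), (2, 0), (1, 0), (2, 1), (0, 2)]
weight = {(0, 1): 0.5, (1, 2): 0.4, (2, 0): 0.3,
          (1, 0): 0.5, (2, 1): 0.4, (0, 2): 0.3}
zeta_inverse = inverse_edge_zeta(edges, weight)
```

A triangle has exactly two prime cycles up to cyclic shift, one per orientation, each with weight product $0.5 \cdot 0.4 \cdot 0.3 = 0.06$, so the product definition gives $\zeta(u)^{-1} = (1 - 0.06)^2$, which the determinant reproduces.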

Single-cycle graph
Let
\[ A := \mathrm{Cor}_{b_{a_1}}[t_{i_1}(X_{i_1}), t_{i_2}(X_{i_2})]\, \mathrm{Cor}_{b_{a_2}}[t_{i_2}(X_{i_2}), t_{i_3}(X_{i_3})] \cdots \mathrm{Cor}_{b_{a_n}}[t_{i_n}(X_{i_n}), t_{i_1}(X_{i_1})]. \]
Then the true partition function $Z$ and the asymptotic Bethe approximation $Z^{(1)}_{\mathrm{AB}}$ are
\[ Z = Z_{\mathrm{Bethe}}((b_i)_{i \in V}, (b_a)_{a \in F}) \, (1 + \mathrm{tr}(A)), \]
\[ Z^{(1)}_{\mathrm{AB}} = Z_{\mathrm{Bethe}}((b_i)_{i \in V}, (b_a)_{a \in F}) \, \frac{1}{\det(I - A)} = Z_{\mathrm{Bethe}}((b_i)_{i \in V}, (b_a)_{a \in F}) \bigl( 1 + \mathrm{tr}(A) + O(\rho(A)^2) \bigr) \]
where $\rho(A)$ is the spectral radius of $A$. The asymptotic Bethe approximation is accurate when $A \approx 0$.
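The claim that the two expressions agree up to second order in $A$ can be checked numerically. The sketch below compares the factors $1 + \mathrm{tr}(A)$ and $1 / \det(I - A)$ for a hypothetical $2 \times 2$ matrix $A$ (the size that would arise for a ternary alphabet) and for the same matrix scaled by $0.1$; the entries of $A$ are arbitrary assumptions, not values from the talk.

```python
def exact_factor(a):
    """1 + tr(A): the exact ratio Z / Z_Bethe on a single cycle."""
    return 1.0 + a[0][0] + a[1][1]


def asymptotic_bethe_factor(a):
    """1 / det(I - A): the ratio Z_AB^(1) / Z_Bethe on a single cycle."""
    det = (1.0 - a[0][0]) * (1.0 - a[1][1]) - a[0][1] * a[1][0]
    return 1.0 / det


def scaled(a, s):
    return [[s * entry for entry in row] for row in a]


# Hypothetical product of correlation matrices around the cycle.
A = [[0.10, 0.02],
     [0.03, 0.05]]

gap_large = abs(asymptotic_bethe_factor(A) - exact_factor(A))
small_A = scaled(A, 0.1)
gap_small = abs(asymptotic_bethe_factor(small_A) - exact_factor(small_A))
```

Shrinking $A$ by a factor of 10 shrinks the gap by roughly a factor of 100, consistent with the $O(\rho(A)^2)$ error term.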
