Mean field approximation methods and information geometry Shiro Ikeda ISM, Tokyo, Japan 31 August 2009 Ikeda (ISM) MF approx and Info Geom 31/Aug/2009 1 / 38
Outline
1. Belief Propagation
2. Information Geometrical View
3. Survey Propagation (SP)
4. Conclusion
Belief Propagation: Graphical model and inference
Graphical Model Example
Stochastic variable: $x = (x_i, x_j, x_k, x_l)^T$
Clique: $r \in \mathcal{L} = \{a, b, c, d\}$
[Figure: graph with variable nodes $x_i, x_j, x_k, x_l$ and cliques $a, b, c, d$]
Graphical model: Joint distribution
\[ q(x) = \frac{1}{Z} \prod_{r \in \mathcal{L}} \psi_r(x_r) = \exp\Bigl( \sum_{r \in \mathcal{L}} c_r(x_r) - \varphi_q \Bigr), \]
where
\[ \psi_r(x_r) > 0, \qquad Z = \sum_{x} \prod_{r \in \mathcal{L}} \psi_r(x_r), \]
\[ c_r(x_r) = \log \psi_r(x_r), \qquad \varphi_q = \log Z, \]
\[ x_r = \{ x_i \mid i \in V(r) \}, \qquad V(r): \text{members of clique } r. \]
Belief Propagation: Message update
For each $(r, i)$, update the messages $\nu_{ri}(x_i)$ and $\mu_{ri}(x_i)$.
1. Initialize $t = 1$, $\nu^{t}_{ri}(x_i) = 1/2$, $\mu^{t}_{ri}(x_i) = 1/2$.
2. Update the messages as follows:
   \[ \nu^{t+1}_{ri}(x_i) \propto \sum_{x_r \setminus x_i} \psi_r(x_r) \prod_{j \in V(r) \setminus i} \mu^{t}_{rj}(x_j), \]
   \[ \mu^{t+1}_{ri}(x_i) \propto \prod_{s \in \mathcal{N}(i) \setminus r} \nu^{t+1}_{si}(x_i). \]
3. The belief (marginal distribution) is
   \[ b^{t+1}_{i}(x_i) \propto \prod_{r \in \mathcal{N}(i)} \nu^{t+1}_{ri}(x_i). \]
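The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not the author's implementation: the toy model (three binary variables, two pairwise cliques named "a" and "b", and the coupling values inside `psi`) is hypothetical and chosen to be a tree, so the beliefs can be compared against brute-force marginals.

```python
import itertools
import math

# Hypothetical toy model (not from the slides): 3 binary variables, 2 pairwise cliques.
cliques = {"a": (0, 1), "b": (1, 2)}            # V(r): variables in clique r
n_vars = 3
neighbors = {i: [r for r, vs in cliques.items() if i in vs] for i in range(n_vars)}  # N(i)

def psi(r, xr):
    # Positive potential psi_r(x_r); the numerical values are arbitrary.
    w = {"a": 0.8, "b": -0.5}[r]
    s = [2 * v - 1 for v in xr]                 # map {0,1} -> {-1,+1}
    return math.exp(w * s[0] * s[1] + 0.3 * s[0])

def normalize(v):
    z = sum(v)
    return [a / z for a in v]

def run_bp(n_iter=20):
    # Step 1: initialize all messages to 1/2.
    nu = {(r, i): [0.5, 0.5] for r, vs in cliques.items() for i in vs}
    mu = {(r, i): [0.5, 0.5] for r, vs in cliques.items() for i in vs}
    for _ in range(n_iter):
        # Step 2a: nu_ri(x_i) sums psi_r times incoming mu_rj over x_r \ x_i.
        new_nu = {}
        for r, vs in cliques.items():
            for i in vs:
                msg = [0.0, 0.0]
                for xr in itertools.product([0, 1], repeat=len(vs)):
                    w = psi(r, xr)
                    for j, xj in zip(vs, xr):
                        if j != i:
                            w *= mu[(r, j)][xj]
                    msg[xr[vs.index(i)]] += w
                new_nu[(r, i)] = normalize(msg)
        nu = new_nu
        # Step 2b: mu_ri(x_i) is the product of nu_si over the other cliques s.
        for r, vs in cliques.items():
            for i in vs:
                msg = [1.0, 1.0]
                for s in neighbors[i]:
                    if s != r:
                        msg = [m * v for m, v in zip(msg, nu[(s, i)])]
                mu[(r, i)] = normalize(msg)
    # Step 3: belief b_i(x_i) is the product of all incoming nu_ri.
    beliefs = []
    for i in range(n_vars):
        b = [1.0, 1.0]
        for r in neighbors[i]:
            b = [m * v for m, v in zip(b, nu[(r, i)])]
        beliefs.append(normalize(b))
    return beliefs

def exact_marginals():
    # Brute-force marginals of q(x) proportional to prod_r psi_r(x_r).
    marg = [[0.0, 0.0] for _ in range(n_vars)]
    for x in itertools.product([0, 1], repeat=n_vars):
        w = 1.0
        for r, vs in cliques.items():
            w *= psi(r, tuple(x[i] for i in vs))
        for i in range(n_vars):
            marg[i][x[i]] += w
    return [normalize(m) for m in marg]
```

On this tree-structured example `run_bp()` and `exact_marginals()` should agree; on a graph with loops the same updates give only an approximation.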
Information Geometrical View: Our results
Information geometry: differential geometry applied to statistical/stochastic models.
  Amari, (1985). Springer-Verlag.
  Murray & Rice, (1993). Chapman & Hall.
  Amari & Nagaoka, (2000). AMS and Oxford University Press.
Information Geometry
$S$: space of probability distributions.
Each point $r(x) \in S$ is a probability distribution.
[Figure: the space $S$ with a point $r(x)$]
Information Geometry
$M = \{ p(x; \theta) \}$: distributions parametrized by $\theta$.
[Figure: the space $S$ with a point $r(x)$ and the model $M = \{ p(x; \theta) \}$]
Information Geometry
$M = \{ p(x; \theta) \}$: model manifold.
[Figure: the space $S$ with a point $r(x)$ and the model manifold $M = \{ p(x; \theta) \}$; labels $\Delta\theta$ and $\Delta\theta^T I(\theta)\, \Delta\theta$]
Information Geometry: m-projection
\[ \hat{\theta} = \operatorname*{argmin}_{\theta \in \Theta} KL\bigl( r(x);\, p(x; \theta) \bigr) \]
Maximum likelihood: m-projection to the model.
If the model is an exponential family, the projection is unique.
[Figure: m-projection of $r(x)$ onto $M = \{ p(x; \theta) \}$ at $\hat{\theta}$]
Information Geometry: m-projection
If $p(x; \theta)$ is an exponential family and its sufficient statistic is $t(x)$, the m-projection gives
\[ E_{r(x)}[t(x)] = E_{p(x; \hat{\theta})}[t(x)]. \]
[Figure: m-projection of $r(x)$ onto $M = \{ p(x; \theta) \}$ at $\hat{\theta}$]
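A small numerical check of the moment-matching property, as a sketch under stated assumptions: the target `r` over two binary variables and the factorized model are hypothetical choices, with sufficient statistics $t(x) = (x_1, x_2)$. Matching $E_r[x_1]$ and $E_r[x_2]$ should then minimize $KL(r; p)$ over the model, which the code confirms with a crude grid search.

```python
import itertools
import math

# Hypothetical target r(x) over two binary variables (values chosen arbitrarily).
r = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.3}

def product_dist(p1, p2):
    # Factorized model p(x) = p_1(x_1) p_2(x_2); p1 and p2 are P(x_i = 1).
    return {(a, b): (p1 if a else 1 - p1) * (p2 if b else 1 - p2)
            for a, b in itertools.product([0, 1], repeat=2)}

def kl(p, q):
    # KL(p; q) = sum_x p(x) log(p(x) / q(x))
    return sum(p[x] * math.log(p[x] / q[x]) for x in p if p[x] > 0)

# m-projection by moment matching: E_r[x_1] and E_r[x_2].
m1 = sum(v for (a, b), v in r.items() if a == 1)
m2 = sum(v for (a, b), v in r.items() if b == 1)
p_hat = product_dist(m1, m2)

# Crude grid search over the model to confirm that p_hat minimizes KL(r; p).
grid = [k / 100 for k in range(1, 100)]
best = min((kl(r, product_dist(a, b)), a, b) for a in grid for b in grid)
```

Here `best[1]` and `best[2]` are the grid minimizers, which can be compared with `m1` and `m2`.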
Information Geometry: e-projection
\[ \hat{\theta}_{mf} = \operatorname*{argmin}_{\theta \in \Theta} KL\bigl( p(x; \theta);\, r(x) \bigr) \]
Naive mean field approximation: e-projection.
For an exponential family, the e-projection can have many local minima.
[Figure: e-projection of $r(x)$ onto $M = \{ p(x; \theta) \}$ at $\hat{\theta}_{mf}$]
Information Geometry: e-projection
In statistical physics, multiple local minima are important: the multiple solutions are taken to correspond to the landscape of the energy function and to “phase transitions.”
[Figure: e-projection of $r(x)$ onto $M = \{ p(x; \theta) \}$ at $\hat{\theta}_{mf}$]
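Multiple local minima of the e-projection can be seen on a very small example. The sketch below assumes a hypothetical two-spin target $r(x) \propto \exp(J x_1 x_2)$ with $x_i \in \{-1, +1\}$ and coupling $J = 2$, and a factorized model $p(x) = \prod_i (1 + m_i x_i)/2$. The naive mean field fixed-point equations $m_1 = \tanh(J m_2)$, $m_2 = \tanh(J m_1)$ are iterated from two different starting points.

```python
import math

J = 2.0  # hypothetical coupling; for J > 1 this toy model has two symmetric solutions

def mean_field(m_init, n_iter=200):
    # Iterate the naive mean field equations m1 = tanh(J m2), m2 = tanh(J m1).
    m1, m2 = m_init
    for _ in range(n_iter):
        m1 = math.tanh(J * m2)
        m2 = math.tanh(J * m1)
    return m1, m2

def kl_to_target(m1, m2):
    # KL(p; r) for p(x) = prod_i (1 + m_i x_i)/2 and r(x) = exp(J x1 x2) / Z.
    def neg_entropy(m):
        return sum(p * math.log(p) for p in ((1 + m) / 2, (1 - m) / 2) if p > 0)
    log_z = math.log(4 * math.cosh(J))   # Z = 2 e^J + 2 e^{-J}
    return neg_entropy(m1) + neg_entropy(m2) - J * m1 * m2 + log_z

sol_plus = mean_field((0.1, 0.1))      # converges to a positive magnetization
sol_minus = mean_field((-0.1, -0.1))   # converges to the mirrored solution
```

Both solutions give the same value of `kl_to_target`, and both are lower than the value at the symmetric point $m_1 = m_2 = 0$, which is also a stationary point of the equations.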
Belief Propagation and Information Geometry: Our results
Discuss the accuracy and convergence of loopy BP (LBP) using information geometry.
  Ikeda, Tanaka, & Amari, (2004). IEEE tr. on IT, 50(6), 1097-1114.
  Ikeda, Tanaka, & Amari, (2004). Neural Comput., 16(9), 1779-1810.
Belief Propagation: Joint distribution
\[ q(x) = \exp\Bigl( \sum_{r} c_r(x) - \varphi_q \Bigr) \]
[Figure: the space $S$ containing $q(x)$; example graph with variable nodes $x_i, x_j, x_k, x_l$ and cliques $a, b, c, d$]
Belief Propagation: Single link models
\[ M_r = \bigl\{ p_r(x; \zeta_r) = \exp\bigl( c_r(x) + \zeta_r \cdot x - \varphi_r(\zeta_r) \bigr) \bigr\}, \qquad r = 1, \dots, L. \]
[Figure: the space $S$ containing $q(x)$ and $M_r$; example graph with a single clique kept]
Belief Propagation: Marginals
\[ M_0 = \bigl\{ p_0(x; \theta) = \exp\bigl( \theta \cdot x - \varphi_0(\theta) \bigr) \bigr\} \]
[Figure: the space $S$ containing $q(x)$, $M_r$, and $M_0$; example graph with all cliques removed]
Belief Propagation: Convergence
$M(\theta)$ = { distributions whose product of marginals is $p_0(x; \theta)$ }
Condition 1: $p_0(x; \theta),\ p_r(x; \zeta_r) \in M(\theta)$.
[Figure: the space $S$ containing $q(x)$, $p_r(x; \zeta_r) \in M_r$, $p_0(x; \theta) \in M_0$, and $M(\theta)$]
Belief Propagation: At convergent points
If $M(\theta)$ includes $q(x)$, then $p_0(x; \theta)$ gives the true marginals.
[Figure: the space $S$ containing $q(x)$, $p_r(x; \zeta_r) \in M_r$, $p_0(x; \theta) \in M_0$, and $M(\theta)$]
Belief Propagation: Convergence
$E$ = { log-linear mixtures of $p_0$ and $p_r$ }:
\[ E = \Bigl\{ \frac{1}{Z_E(t)}\, p_0(x; \theta)^{t_0} \prod_{r} p_r(x; \zeta_r)^{t_r} \;\Bigm|\; \sum_{r=0}^{L} t_r = 1 \Bigr\} \]
Condition 2: $q(x),\ p_0(x; \theta),\ p_r(x; \zeta_r) \in E$.
[Figure: the space $S$ containing $q(x)$, $p_r(x; \zeta_r) \in M_r$, $p_0(x; \theta) \in M_0$, and $E$]
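A numerical sketch of a log-linear mixture may help here. It uses the forms $p_0(x; \theta) \propto \exp(\theta \cdot x)$ and $p_r(x; \zeta_r) \propto \exp(c_r(x) + \zeta_r \cdot x)$ from the earlier slides. The toy model (three spins, two cliques, the numerical couplings) is hypothetical, and so is the assumption $(L - 1)\,\theta = \sum_r \zeta_r$, which is not stated on this slide. Under that assumption the exponents $t_0 = -(L - 1)$, $t_r = 1$ sum to 1 and the mixture reduces to $q(x)$.

```python
import itertools
import math

# Hypothetical toy setup: 3 spins x_i in {-1,+1}, L = 2 cliques with arbitrary couplings.
states = list(itertools.product([-1, 1], repeat=3))
c = [lambda x: 0.7 * x[0] * x[1], lambda x: -0.4 * x[1] * x[2]]   # c_r(x)
L = len(c)

def normalize(w):
    z = sum(w.values())
    return {x: v / z for x, v in w.items()}

def dot(a, x):
    return sum(ai * xi for ai, xi in zip(a, x))

# Target q(x) proportional to exp(sum_r c_r(x)).
q = normalize({x: math.exp(sum(cr(x) for cr in c)) for x in states})

# Arbitrary zeta_r; theta is chosen so that (L - 1) * theta = sum_r zeta_r (an assumption).
zeta = [(0.2, -0.1, 0.3), (-0.5, 0.4, 0.1)]
theta = tuple(sum(z[i] for z in zeta) / (L - 1) for i in range(3))

p0 = normalize({x: math.exp(dot(theta, x)) for x in states})
pr = [normalize({x: math.exp(c[r](x) + dot(zeta[r], x)) for x in states})
      for r in range(L)]

def log_linear_mixture(t0, t):
    # p(x; t) proportional to p0^t0 * prod_r p_r^t_r, with t0 + sum(t) = 1.
    return normalize({x: p0[x] ** t0 * math.prod(pr[r][x] ** t[r] for r in range(L))
                      for x in states})

mix = log_linear_mixture(-(L - 1), [1.0] * L)   # compare with q
```

The linear terms cancel in the exponent, so `mix` and `q` coincide up to rounding, which illustrates how $q(x)$ can lie in $E$.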
Belief Propagation: Convergence
Theorem: $p_0(x; \theta)$ and $p_r(x; \zeta_r)$, $r = 1, \dots, L$, satisfy Condition 1 and Condition 2
$\Leftrightarrow$ they form a convergent point of BP.
[Figure: the space $S$ containing $q(x)$, $p_r(x; \zeta_r) \in M_r$, $p_0(x; \theta) \in M_0$, $M(\theta)$, and $E$]
Improving BP: Approximate accuracy (perturbation analysis)
Difference between $E$ and $M(\theta)$ → accuracy.
$x_q$: expectation of $x$ w.r.t. $q(x)$. $x_{BP}$: convergent point of BP.
\[ x_q \simeq x_{BP} + \frac{1}{2} \sum_{r \neq s} B_{rs}\, x_{BP} + \frac{1}{6} \Bigl( \sum_{r,s,t} B_{rst} - \sum_{r} B_{rrr} \Bigr) x_{BP}. \]
$B_{rs}\, x_{BP}$: order 4 loop, embedded m-curvature of $E$.
$B_{rst}\, x_{BP}$: order 6 loop, torsion of $E$.
[Figure: two copies of the example graph with variable nodes $x_i, x_j, x_k, x_l$ and cliques $a, b, c, d$]
Improving BP: Convergence (e-constraint algorithm)
Condition 2 is always satisfied; update the parameters to satisfy Condition 1.
[Figure: two panels of the space $S$ with $q(x)$, $E$, $p_r(x; \zeta_r) \in M_r$, and $p_0(x; \theta) \in M_0$]
Improving BP: Convergence (e-constraint algorithm)
BP, TRP (Wainwright et al., NIPS*14)
[Figure: two panels of the space $S$ with $q(x)$, $E$, $p_r(x; \zeta_r) \in M_r$, $p_0(x; \theta) \in M_0$, and $M(\theta)$]
Improving BP: Convergence (m-constraint algorithm)
Condition 1 is always satisfied; update the parameters to satisfy Condition 2.
[Figure: two panels of the space $S$ with $q(x)$, $p_r(x; \zeta_r) \in M_r$, $p_0(x; \theta) \in M_0$, and $M(\theta)$]
Improving BP: Convergence (m-constraint algorithm)
CCCP (Yuille & Rangarajan, NIPS*15)
[Figure: two panels of the space $S$ with $q(x)$, $E$, $p_r(x; \zeta_r) \in M_r$, $p_0(x; \theta) \in M_0$, and $M(\theta)$]
Survey Propagation (SP): Background
Mézard, Parisi, & Zecchina, (2002). Science, 297, 812–815.
A method to analyze K-SAT problems.
3-SAT problem:
\[ (x_1 \lor \bar{x}_2 \lor x_3) \land (\bar{x}_1 \lor \bar{x}_4 \lor x_5) \land (x_4 \lor x_2 \lor x_3) \]
3-SAT is NP-complete.
For the above example, $(x_1, x_2, x_3, x_4, x_5) = (+, +, *, *, +)$ is a solution, and so is $(*, *, +, -, *)$, where $*$ means that either value is allowed.
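The two partial assignments can be checked by brute force. The sketch below is only an illustration: the clause encoding and the helper names `satisfies` and `covers_only_solutions` are introduced here and are not from the slides. A pattern is accepted if every completion of its `*` entries satisfies the formula.

```python
import itertools

# Clauses of the example as lists of (variable index, sign);
# sign +1 stands for the literal x_i, sign -1 for its negation.
clauses = [[(1, +1), (2, -1), (3, +1)],
           [(1, -1), (4, -1), (5, +1)],
           [(4, +1), (2, +1), (3, +1)]]

def satisfies(assignment):
    # assignment maps variable index -> +1 (True) or -1 (False).
    return all(any(assignment[i] == s for i, s in clause) for clause in clauses)

def covers_only_solutions(pattern):
    # pattern is a string over '+', '-', '*'; '*' means either value is allowed.
    free = [i for i, ch in enumerate(pattern, start=1) if ch == "*"]
    fixed = {i: (+1 if ch == "+" else -1)
             for i, ch in enumerate(pattern, start=1) if ch != "*"}
    for values in itertools.product([+1, -1], repeat=len(free)):
        full = {**fixed, **dict(zip(free, values))}
        if not satisfies(full):
            return False
    return True
```

For instance, `covers_only_solutions("++**+")` and `covers_only_solutions("**+-*")` test the two patterns from the slide, while the fully free pattern `"*****"` is rejected because some assignments violate a clause.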
Survey Propagation: Notation
With $x_i \in \{-1, +1\}$ ($+1$: True),
\[ \psi_1(x_1, x_2, x_3) = 1 - \frac{1}{8} (1 - x_1)(1 + x_2)(1 - x_3). \]
$\psi_1$ is 1 if $(x_1 \lor \bar{x}_2 \lor x_3)$ is True and 0 otherwise, and
\[ V = (x_1 \lor \bar{x}_2 \lor x_3) \land (\bar{x}_1 \lor \bar{x}_4 \lor x_5) \land (x_4 \lor x_2 \lor x_3) = \prod_{r} \psi_r(x_r), \]
\[ \psi_r(x_r) = 1 - \frac{1}{8} \prod_{i \in V(r)} (1 + J_{ri}\, x_i), \qquad J_{ri} \in \{-1, +1\}. \]
If $V = 1$, the formula is satisfied (SAT).
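The $\pm 1$ encoding can be verified against the Boolean formula directly. In this sketch the table `J` is read off from the three example clauses (a clause is violated only when $x_i = J_{ri}$ for all three of its variables). The function names are illustrative.

```python
import itertools

# J[r][i] = J_ri for the example formula.
J = [{1: -1, 2: +1, 3: -1},    # (x1 or not x2 or x3)
     {1: +1, 4: +1, 5: -1},    # (not x1 or not x4 or x5)
     {4: -1, 2: -1, 3: -1}]    # (x4 or x2 or x3)

def psi(r, x):
    # psi_r(x_r) = 1 - (1/8) prod_i (1 + J_ri x_i); each factor is 0 or 2.
    prod = 1
    for i, j in J[r].items():
        prod *= 1 + j * x[i]
    return 1 - prod // 8

def V(x):
    # V = prod_r psi_r(x_r); equals 1 exactly when every clause is satisfied.
    out = 1
    for r in range(len(J)):
        out *= psi(r, x)
    return out

def clause_form(x):
    # The same formula written with Boolean operators, for comparison.
    t = {i: v == 1 for i, v in x.items()}
    return ((t[1] or not t[2] or t[3])
            and (not t[1] or not t[4] or t[5])
            and (t[4] or t[2] or t[3]))

all_assignments = [dict(zip(range(1, 6), vals))
                   for vals in itertools.product([-1, +1], repeat=5)]
agree = all(V(x) == int(clause_form(x)) for x in all_assignments)
```

`agree` compares the two forms on all 32 assignments of the five variables.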