On the Thermodynamic Equivalence between Hopfield Networks and Hybrid Boltzmann Machines


  1. On the Thermodynamic Equivalence between Hopfield Networks and Hybrid Boltzmann Machines
     Enrica Santucci
     On the equivalence of Hopfield Networks and Boltzmann Machines (A. Barra, A. Bernacchia, E. Santucci, P. Contucci, Neural Networks 34 (2012) 1-9)
     UNIVERSITY OF CAGLIARI, PraLab - Department of Electrical and Electronic Engineering
     4 November 2016

  2. Part I: Description of the models

  3. Spin glass: Sherrington-Kirkpatrick (SK), 1975
     A spin system whose low-temperature state is disordered, rather than showing the uniform or periodic pattern found in conventional Ising magnets (K. H. Fischer, J. A. Hertz, 1991; M. Mezard, G. Parisi, M. Virasoro, 1987).
     Figure 1: Schematic representation of a spin-glass structure versus a ferromagnetic one.

  4. SK Hamiltonian
     • $N$ particles (where $N$ is very large)
     • $\sigma_i \in \{-1, +1\}$: Ising spin of the $i$-th particle ($i = 1, \dots, N$)
     • $J_{ij} \sim \mathcal{N}(0, 1)$: interaction matrix between the lattice particles
     • $T$: temperature of the system ($\beta = 1/T$)
     Hamiltonian:
     $$H_{sg}(\sigma, J) = -\frac{\beta}{\sqrt{N}} \sum_{1 \le i, j \le N} J_{ij}\, \sigma_i \sigma_j \qquad (1)$$
     • Frustration: the terms of the Hamiltonian cannot all be minimized simultaneously, because the interactions $J_{ij}$ are random variables
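
As a minimal sketch of how Eq. (1) is evaluated for one spin configuration, the following pure-Python function sums over all ordered pairs $(i, j)$, as the index range on the slide indicates. The function name `sk_energy`, the seed, and the system size are illustrative assumptions, not part of the presentation.

```python
import math
import random


def sk_energy(sigma, J, beta):
    """Eq. (1): H = -(beta / sqrt(N)) * sum over all i, j of J[i][j] * sigma[i] * sigma[j]."""
    n = len(sigma)
    total = 0.0
    for i in range(n):
        for j in range(n):
            total += J[i][j] * sigma[i] * sigma[j]
    return -beta / math.sqrt(n) * total


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed, chosen only for reproducibility
    n = 8                   # illustrative size; the slide assumes N very large
    J = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(n)]
    sigma = [rng.choice((-1, 1)) for _ in range(n)]
    print(sk_energy(sigma, J, beta=1.0))
```

Because the couplings are drawn at random, different configurations generally cannot satisfy every pair at once, which is the frustration mentioned in the last bullet.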

  5. SK Phase Diagram
     • Partition function: $Z_N(\beta) = \sum_{\sigma} \exp(-H_N(\sigma, J))$
     • Average over the interactions: $\mathbb{E}(F(J)) = \int d\mu(J)\, F(J)$
     Free energy:
     $$f_N(\beta) = -\frac{1}{\beta N}\, \mathbb{E} \ln Z_N(\beta)$$
     • Order parameters: $m = \frac{1}{N} \sum_{i=1}^{N} \sigma_i$, $\quad q_{ab} = \frac{1}{N} \sum_{i=1}^{N} \sigma_i^{(a)} \sigma_i^{(b)}$
     • Free energy for $N \to \infty$
     • Minimization of the free energy with respect to the order parameters
     • Self-consistency equations
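
For very small $N$ the quantities on this slide can be computed by brute-force enumeration. The sketch below is an illustration under stated assumptions: it evaluates $-\frac{1}{\beta N} \ln Z_N$ for a single coupling sample $J$ (the average $\mathbb{E}$ over $J$ is not taken), it uses the SK Hamiltonian of Eq. (1) for $H_N$, and all function names are hypothetical.

```python
import itertools
import math


def hamiltonian(sigma, J, beta):
    """SK Hamiltonian of Eq. (1); note that beta is already inside H."""
    n = len(sigma)
    s = sum(J[i][j] * sigma[i] * sigma[j] for i in range(n) for j in range(n))
    return -beta / math.sqrt(n) * s


def free_energy(J, beta):
    """-(1 / (beta N)) ln Z_N for ONE sample J, by enumerating all 2^N states."""
    n = len(J)
    log_terms = [-hamiltonian(sigma, J, beta)
                 for sigma in itertools.product((-1, 1), repeat=n)]
    top = max(log_terms)  # log-sum-exp shift for numerical stability
    log_z = top + math.log(sum(math.exp(t - top) for t in log_terms))
    return -log_z / (beta * n)


def magnetization(sigma):
    """m = (1/N) sum_i sigma_i."""
    return sum(sigma) / len(sigma)


def overlap(sigma_a, sigma_b):
    """q_ab = (1/N) sum_i sigma_i^(a) sigma_i^(b) between two replicas a and b."""
    return sum(a * b for a, b in zip(sigma_a, sigma_b)) / len(sigma_a)
```

Enumeration costs $2^N$ evaluations, so this is only a sanity check for toy sizes; the slide's analysis concerns the limit $N \to \infty$.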

  6. Gaussian spin glass (A. Barra, G. Genovese, F. Guerra - 2012)
     • $z_i \sim \mathcal{N}(0, 1)$, $i = 1, \dots, N$
     • $J_{ij} \sim \mathcal{N}(0, 1)$, $i, j = 1, \dots, N$
     Hamiltonian:
     $$H_N(z, J) = -\frac{\beta}{\sqrt{N}} \sum_{1 \le i < j \le N} J_{ij}\, z_i z_j \qquad (2)$$
     • Order parameter: $q_{ab} = \frac{1}{N} \sum_{i=1}^{N} z_i^{(a)} z_i^{(b)}$
     • Free energy for $N \to \infty$
     • Minimization of the free energy with respect to the order parameters
     • Self-consistency equation

  7. Hopfield Model (HM) - 1982
     [Figure: network diagram with units $\sigma_1, \dots, \sigma_5$]
     • Stored patterns: $\xi^\mu = (\xi_1^\mu, \dots, \xi_N^\mu)$, $\mu = 1, \dots, P$, with $\xi_i^\mu \in \{-1, +1\}$
     • Digital units (activation levels): $\sigma = (\sigma_1, \dots, \sigma_N)$, $\sigma_i \in \{-1, +1\}$
     • Activation function: sign function
     • Two-way information flow
     • Symmetric synapses ($J_{ij} = J_{ji}$)

  8. HM Hamiltonian
     Hamiltonian:
     $$H_{hop}(\sigma, J) = -\frac{\beta}{N} \sum_{1 \le i, j \le N} J_{ij}\, \sigma_i \sigma_j \qquad (3)$$
     Hebbian learning rule:
     $$J_{ij} = \sum_{\mu=1}^{P} \xi_i^\mu \xi_j^\mu \qquad \forall\, i, j = 1, \dots, N$$
     • Order parameters: $m_\mu = \frac{1}{N} \sum_{i=1}^{N} \xi_i^\mu \sigma_i$, $\quad q_{ab} = \frac{1}{N} \sum_{i=1}^{N} \sigma_i^{(a)} \sigma_i^{(b)}$
     • Free energy for $N \to \infty$
     • Minimization of the free energy with respect to the order parameters
     • Self-consistency equations
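
The Hebbian rule and the order parameter $m_\mu$ translate directly into code. This is a minimal sketch, assuming patterns are stored as lists of $\pm 1$ and using the prefactor $\beta / N$ exactly as written in Eq. (3); the function names are illustrative.

```python
def hebbian_couplings(patterns):
    """J[i][j] = sum over mu of xi^mu_i * xi^mu_j, for all i, j (diagonal included)."""
    n = len(patterns[0])
    return [[sum(p[i] * p[j] for p in patterns) for j in range(n)]
            for i in range(n)]


def hopfield_energy(sigma, J, beta):
    """Eq. (3): H = -(beta / N) * sum over all i, j of J[i][j] * sigma[i] * sigma[j]."""
    n = len(sigma)
    s = sum(J[i][j] * sigma[i] * sigma[j] for i in range(n) for j in range(n))
    return -beta / n * s


def mattis_overlap(pattern, sigma):
    """m_mu = (1/N) sum_i xi^mu_i sigma_i; equals 1 when sigma matches the pattern."""
    return sum(x * s for x, s in zip(pattern, sigma)) / len(sigma)


if __name__ == "__main__":
    patterns = [[1, -1, 1, -1], [1, 1, -1, -1]]  # two toy patterns, N = 4
    J = hebbian_couplings(patterns)
    sigma = list(patterns[0])                    # start exactly on pattern 1
    print(mattis_overlap(patterns[0], sigma))    # overlap with pattern 1
    print(mattis_overlap(patterns[1], sigma))    # overlap with pattern 2
    print(hopfield_energy(sigma, J, beta=1.0))
```

With this convention the energy can be rewritten as $-\beta N \sum_\mu m_\mu^2$, so configurations aligned with a stored pattern have low energy.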

  9. Analogy between Sherrington-Kirkpatrick and Hopfield models
     • $N$ particles ←→ neurons
     • $\sigma_i$ Ising spin ←→ neuronal activation level
     • $J_{ij}$ spin interactions ←→ synapses
     • $T$ temperature ←→ noise level
     For $P \to \infty$: Sherrington-Kirkpatrick ⇐⇒ Hopfield

  10. HM Phase Diagram
      • $\alpha = \lim_{N \to \infty} \frac{P}{N}$: control parameter (high storage regime)
      • $T$: temperature
      • Retrieval phase F ($0 < \alpha \le 0.05$)
      • Mixed phase M ($0.05 < \alpha \le 0.14$)
      • Spin glass phase SG ($\alpha > 0.14$)
      • Paramagnetic phase P

  11. Boltzmann Machine (G. E. Hinton, T. J. Sejnowski - 1983)
      [Figure: network diagram with visible units $\sigma_1, \dots, \sigma_5$ and two layers of hidden units]
      • Digital visible layer: $\sigma_i \in \{+1, -1\}$ ($i = 1, \dots, N$)
      • Two analog hidden layers: $z_\mu, \tau_\nu \sim \mathcal{N}(0, 1)$, $\mu = 1, \dots, P$, $\nu = 1, \dots, K$
      • Activation function: sigmoidal function
      • Two-way information flow
      • Symmetric synaptic weights $\xi_i^\mu$, $\eta_i^\nu$

  12. Restricted and Hybrid version of the Boltzmann Machine (RHBM)
      Assumptions:
      • hybrid: one digital layer of visible units and two analog layers of hidden units
      • restricted: no connections between the hidden layers
      Hamiltonian:
      $$H_{rhbm}(\beta, \sigma, z, \tau; \xi, \eta) = \frac{1}{2} \sum_{\mu=1}^{P} z_\mu^2 + \frac{1}{2} \sum_{\nu=1}^{K} \tau_\nu^2 - \sqrt{\frac{\beta}{N}} \left( \sum_{i,\mu=1}^{N,P} \sigma_i\, \xi_i^\mu z_\mu + \sum_{i,\nu=1}^{N,K} \sigma_i\, \eta_i^\nu \tau_\nu \right) \qquad (4)$$
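
A minimal sketch of Eq. (4) as a function, assuming the weights are stored as nested lists `xi[mu][i]` and `eta[nu][i]` (a layout chosen here for illustration, as is the name `rhbm_energy`).

```python
import math


def rhbm_energy(sigma, z, tau, xi, eta, beta):
    """Eq. (4): quadratic terms of the hidden units minus the visible-hidden coupling."""
    n = len(sigma)
    quad = 0.5 * sum(v * v for v in z) + 0.5 * sum(v * v for v in tau)
    coupling = sum(xi[mu][i] * sigma[i] * z[mu]
                   for mu in range(len(z)) for i in range(n))
    coupling += sum(eta[nu][i] * sigma[i] * tau[nu]
                    for nu in range(len(tau)) for i in range(n))
    return quad - math.sqrt(beta / n) * coupling
```

The "restricted" assumption shows up as the absence of any term that multiplies a `z` entry by a `tau` entry.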

  13. Part II: Results

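  14. Dynamics of the hidden layers
      Ornstein-Uhlenbeck diffusion process:
      $$D\, \frac{dz_\mu}{dt} = -z_\mu(t) + \sum_{i=1}^{N} \xi_i^\mu \sigma_i + \sqrt{\frac{2D}{\beta}}\, \zeta_\mu(t)$$
      $$D^*\, \frac{d\tau_\nu}{dt} = -\tau_\nu(t) + \sum_{i=1}^{N} \eta_i^\nu \sigma_i + \sqrt{\frac{2D^*}{\beta}}\, \rho_\nu(t)$$
      • $\zeta$, $\rho$: white Gaussian noises
      • $D$, $D^*$: quantifiers of the timescale of the dynamics
      • $\beta$: measure of the strength of the fluctuations
      Probability distribution of the hidden variables:
      $$\Pr(z_\mu \mid \sigma) = \sqrt{\frac{\beta}{2\pi}}\, \exp\left[ -\frac{\beta}{2} \left( z_\mu - \sum_{i=1}^{N} \xi_i^\mu \sigma_i \right)^2 \right]$$
      $$\Pr(\tau_\nu \mid \sigma) = \sqrt{\frac{\beta}{2\pi}}\, \exp\left[ -\frac{\beta}{2} \left( \tau_\nu - \sum_{i=1}^{N} \eta_i^\nu \sigma_i \right)^2 \right]$$
      for $\mu = 1, \dots, P$ and $\nu = 1, \dots, K$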
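
The link between the diffusion process and the Gaussian conditional can be illustrated with an Euler-Maruyama discretization of the equation for $z_\mu$ at fixed $\sigma$. This is a sketch under assumptions: the step size `dt`, the number of steps, the starting value `z_init`, and the function name are all illustrative choices, and the discretization scheme itself is not taken from the slides.

```python
import math
import random


def simulate_hidden_unit(sigma, xi_mu, beta, D=1.0, dt=0.01, steps=2000,
                         z_init=0.0, rng=None):
    """Integrate D dz/dt = -z + sum_i xi_i sigma_i + sqrt(2D/beta) * white noise."""
    rng = rng or random.Random()
    drive = sum(x * s for x, s in zip(xi_mu, sigma))  # sum_i xi^mu_i sigma_i
    # Dividing the equation by D gives drift (drive - z)/D and noise amplitude
    # sqrt(2/(D beta)); over a step dt the noise increment has std sqrt(2 dt/(D beta)).
    noise_scale = math.sqrt(2.0 * dt / (D * beta))
    z = z_init
    for _ in range(steps):
        z += (dt / D) * (drive - z) + noise_scale * rng.gauss(0.0, 1.0)
    return z


if __name__ == "__main__":
    sigma = [1, -1, 1]
    xi_mu = [1, -1, 1]
    print(simulate_hidden_unit(sigma, xi_mu, beta=2.0, rng=random.Random(0)))
```

For small `dt` the stationary distribution of this process has mean $\sum_i \xi_i^\mu \sigma_i$ and variance $1/\beta$, which matches the Gaussian $\Pr(z_\mu \mid \sigma)$ given on the slide; $D$ only sets how fast that distribution is reached.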

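  15. Dynamics of the visible layer
      $$\sigma_i(t+1) = \mathrm{sign}\left( \sum_{i=1}^{N} \left( \sum_{\mu=1}^{P} \xi_i^\mu \sigma_i(t) + \sum_{\nu=1}^{K} \eta_i^\nu \sigma_i(t) \right) - T_i \right)$$
      • $t$: discrete time unit
      • $T_i$: threshold potential
      Probability distribution of the visible units (Glauber dynamics):
      $$\Pr(\sigma_i \mid z) = \frac{\exp\left[ \beta \sigma_i \sum_{\mu=1}^{P} \xi_i^\mu z_\mu \right]}{\exp\left[ \beta \sum_{\mu=1}^{P} \xi_i^\mu z_\mu \right] + \exp\left[ -\beta \sum_{\mu=1}^{P} \xi_i^\mu z_\mu \right]}$$
      $$\Pr(\sigma_i \mid \tau) = \frac{\exp\left[ \beta \sigma_i \sum_{\nu=1}^{K} \eta_i^\nu \tau_\nu \right]}{\exp\left[ \beta \sum_{\nu=1}^{K} \eta_i^\nu \tau_\nu \right] + \exp\left[ -\beta \sum_{\nu=1}^{K} \eta_i^\nu \tau_\nu \right]}$$
      Factorization of the conditional distributions:
      $$\Pr(z \mid \sigma) = \prod_{\mu=1}^{P} \Pr(z_\mu \mid \sigma), \qquad \Pr(\tau \mid \sigma) = \prod_{\nu=1}^{K} \Pr(\tau_\nu \mid \sigma)$$
      $$\Pr(\sigma \mid z) = \prod_{i=1}^{N} \Pr(\sigma_i \mid z), \qquad \Pr(\sigma \mid \tau) = \prod_{i=1}^{N} \Pr(\sigma_i \mid \tau)$$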
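
Because both conditionals factorize, one natural way to sample the network is to alternate between the two layers. The sketch below assumes a single hidden layer $z$ for brevity, uses the Gaussian $\Pr(z_\mu \mid \sigma)$ (mean $\sum_i \xi_i^\mu \sigma_i$, variance $1/\beta$) and the Glauber $\Pr(\sigma_i \mid z)$ as written on the slides, and stores the weights as `xi[mu][i]`. The function names and the alternating schedule are illustrative assumptions, not the authors' simulation code.

```python
import math
import random


def prob_sigma_given_z(s, i, z, xi, beta):
    """Pr(sigma_i = s | z) for s in {-1, +1}, in the Glauber form of the slide."""
    field = sum(xi[mu][i] * z[mu] for mu in range(len(z)))
    # exp(x) / (exp(x) + exp(-x)) = (1 + tanh(x)) / 2, which avoids overflow.
    return 0.5 * (1.0 + math.tanh(beta * s * field))


def gibbs_sweep(sigma, xi, beta, rng):
    """One alternating update: hidden units given sigma, then visible units given z."""
    n = len(sigma)
    std = 1.0 / math.sqrt(beta)  # Pr(z_mu | sigma) has variance 1 / beta
    z = [rng.gauss(sum(xi_mu[i] * sigma[i] for i in range(n)), std)
         for xi_mu in xi]
    new_sigma = [1 if rng.random() < prob_sigma_given_z(1, i, z, xi, beta) else -1
                 for i in range(n)]
    return new_sigma, z


if __name__ == "__main__":
    rng = random.Random(0)
    xi = [[1, -1, 1, -1, 1]]   # one hidden unit, i.e. one stored pattern
    sigma = [1, 1, 1, 1, 1]
    for _ in range(5):
        sigma, z = gibbs_sweep(sigma, xi, beta=2.0, rng=rng)
    print(sigma)
```

Each sweep touches $N + P$ units and only the $N \times P$ weight matrix, with no $N \times N$ coupling matrix ever formed.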

  16. Statistical equivalence between Hopfield network and Boltzmann machine
      $$\Pr(\sigma, z, \tau) \propto \exp\left[ -H_{rhbm}(\sigma, z, \tau) \right]$$
      ⇓
      $$\Pr(\sigma) \propto \exp\left[ \frac{\beta}{2N} \sum_{i,j=1}^{N} \left( \sum_{\mu=1}^{P} \xi_i^\mu \xi_j^\mu + \sum_{\nu=1}^{K} \eta_i^\nu \eta_j^\nu \right) \sigma_i \sigma_j \right] = \exp\left[ -H_{hop}(\sigma) \right]$$
      • The thermodynamics of the visible units in an RHBM is equivalent to that of a Hopfield network
      • The dynamics of a Hopfield network, which requires the update of $N$ neurons and the storage of $N^2$ synapses, can be simulated by an RHBM, which requires the update of $N + P$ neurons but the storage of only $NP$ synapses
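
The step marked ⇓ is a Gaussian integration over each hidden unit. As a short worked sketch, assuming the RHBM Hamiltonian of Eq. (4) and introducing the shorthand $a_\mu$ (not used in the slides) for the field that the visible layer exerts on $z_\mu$:

```latex
a_\mu = \sqrt{\frac{\beta}{N}} \sum_{i=1}^{N} \xi_i^\mu \sigma_i ,
\qquad
\int_{-\infty}^{+\infty} dz_\mu \,
  \exp\left( -\frac{z_\mu^2}{2} + a_\mu z_\mu \right)
  = \sqrt{2\pi}\, \exp\left( \frac{a_\mu^2}{2} \right)
  = \sqrt{2\pi}\, \exp\left( \frac{\beta}{2N}
      \sum_{i,j=1}^{N} \xi_i^\mu \xi_j^\mu \sigma_i \sigma_j \right)
```

Taking the product over $\mu = 1, \dots, P$, and repeating the same integral for each $\tau_\nu$ with weights $\eta_i^\nu$, gives the exponent shown above; the factors $\sqrt{2\pi}$ do not depend on $\sigma$ and are absorbed by the proportionality sign.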

  17. Counterpart of the HM Phase Diagram in an RHBM
      • $N$ number of neurons ←→ number of visible units
      • $P$, $K$ number of stored patterns ←→ number of hidden units
      • $\xi$, $\eta$ stored patterns ←→ synaptic weights
      Hopfield model ⇐⇒ Boltzmann Machine
      • Retrieval phase ←→ few hidden units
      • Spin glass phase ←→ too many hidden units

  18. Numerical simulations of the RHBM with a single hidden layer, for different values of the parameters $\beta$ ($= 1/T$) and $P$
      • $\beta = 0.5$ (high $T$): no retrieval is possible, regardless of the number of hidden units $P$
      • $\beta = 2$ (intermediate $T$): retrieval is possible provided that the number of hidden units is not too large
      • $\beta = 10$ (low $T$): retrieval is maintained up to large values of $P$

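  19. Noise Source (I): Connection between the hidden layers
      $$\tilde{H}_{rhbm}(\sigma, z, \tau; \xi, \eta) = \frac{1}{2} \sum_{\mu=1}^{P} z_\mu^2 + \frac{1}{2} \sum_{\nu=1}^{K} \tau_\nu^2 - \sqrt{\frac{\beta}{N}} \left( \sum_{i,\mu}^{N,P} \xi_i^\mu \sigma_i z_\mu + \sum_{i,\nu}^{N,K} \eta_i^\nu \sigma_i \tau_\nu + \epsilon \sum_{\mu,\nu}^{P,K} \zeta_\mu^\nu z_\mu \tau_\nu \right)$$
      ⇓ Integration over $z_\mu$ and $\tau_\nu$ ⇓
      $$\tilde{H}_{hop}(\sigma; \xi, \eta) = -\frac{\beta}{2N} \sum_{i,j=1}^{N} \left[ \left( 1 - \epsilon\, \frac{\beta\gamma}{4} \right) \sum_{\mu}^{\alpha N} \xi_i^\mu \xi_j^\mu + \left( 1 - \epsilon\, \frac{\beta\alpha}{4} \right) \sum_{\nu}^{\gamma N} \eta_i^\nu \eta_j^\nu \right] \sigma_i \sigma_j$$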

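  20. Noise Source (II): System subjected to an external field
      $$z_\mu, \tau_\nu \sim \mathcal{N}(0, 1) \;\longrightarrow\; z_\mu \sim \mathcal{N}(z_0, 1), \quad \tau_\nu \sim \mathcal{N}(\tau_0, 1)$$
      $$\tilde{H}_{rhbm}(\sigma, z, \tau; \xi, \eta) = \frac{1}{2} \sum_{\mu=1}^{P} (z_\mu - z_0)^2 + \frac{1}{2} \sum_{\nu=1}^{K} (\tau_\nu - \tau_0)^2 - \sqrt{\frac{\beta}{N}} \left( \sum_{i,\mu}^{N,P} \xi_i^\mu \sigma_i z_\mu + \sum_{i,\nu}^{N,K} \eta_i^\nu \sigma_i \tau_\nu \right)$$
      ⇓
      $$H_{hop}(\sigma) \;\longrightarrow\; \tilde{H}_{hop}(\sigma) = H_{hop}(\sigma) + \sqrt{\beta}\, z_0 \sum_{i=1}^{N} \chi_i \sigma_i + \sqrt{\beta}\, \tau_0 \sum_{i=1}^{N} \psi_i \sigma_i$$
      • $\chi_i = \frac{1}{\sqrt{P}} \sum_{\mu=1}^{P} \xi_i^\mu$, $\quad \psi_i = \frac{1}{\sqrt{K}} \sum_{\nu=1}^{K} \eta_i^\nu$: external random fields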
