1. Symmetric Networks

Steffen Hölldobler
International Center for Computational Logic
Technische Universität Dresden, Germany

(The title slide frames its content with the motto "logic is everywhere", repeated in many languages: logika je svuda, Mantık her yerde, logika je všude, la logica è dappertutto, la lógica está por todas partes, Logik ist überall, Logika ada di mana-mana, Logica este peste tot, a lógica está em toda parte, la logique est partout, Hikmat har Jaga Hai.)

◮ Associative Memories
◮ Symmetric Networks
◮ Energy Functions
◮ Stochastic Networks
◮ Combinatorial Optimization Problems

2. Associative Memories

◮ Literature: Hertz, Krogh, Palmer: Introduction to the Theory of Neural Computation. Addison-Wesley Publishing Company (1991).
◮ How can we model an associative memory?
⊲ Let $M = \{v_1, \ldots, v_m\}$ be a set of patterns.
⊲ Here, patterns are bit vectors of length $l$.
⊲ Let $x$ be a bit vector of length $l$.
⊲ Find the $v_j \in M$ which is most similar to $x$.
◮ Possible solution:
⊲ For all $j = 1, \ldots, m$ compute the Hamming distance between $v_j$ and $x$:
  $\sum_{i=1}^{l} \big( v_{ji} (1 - x_i) + (1 - v_{ji})\, x_i \big)$
⊲ Select the $v_j$ whose Hamming distance to $x$ is smallest.
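
A minimal Python sketch of this brute-force recall; the function names and the example data are mine, not from the slides:

```python
import numpy as np

def hamming_distance(v: np.ndarray, x: np.ndarray) -> int:
    # sum_i ( v_i (1 - x_i) + (1 - v_i) x_i ), for 0/1 bit vectors
    return int(np.sum(v * (1 - x) + (1 - v) * x))

def recall(M: np.ndarray, x: np.ndarray) -> np.ndarray:
    # M has shape (m, l): one stored pattern per row; return the closest one
    distances = [hamming_distance(v, x) for v in M]
    return M[int(np.argmin(distances))]

M = np.array([[1, 0, 1, 0],
              [0, 0, 1, 1]])
x = np.array([1, 0, 0, 0])    # noisy probe
print(recall(M, x))           # -> [1 0 1 0]
```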

3. Symmetric Networks

◮ A symmetric network consists of a finite set $U$ of binary threshold units and a finite set $W \subseteq U \times U$ of weighted connections such that
⊲ whenever $(u_i, u_j) \in W$ then $(u_j, u_i) \in W$,
⊲ $w_{ij} = w_{ji}$ for all $(u_i, u_j) \in W$, and
⊲ $w_{jj} = 0$ for all $(u_j, u_j) \in W$.
◮ Asynchronous update procedure: while the current state is unstable, update an arbitrary unit. (A sketch of this procedure in code follows below.)
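
A sketch of the asynchronous update procedure, assuming 0/1 outputs with the activation rule $v_i = 1$ iff $\sum_j w_{ij} v_j \ge \theta_i$ (the slides fix the exact conventions later):

```python
import numpy as np

def async_update(W, theta, v, rng=np.random.default_rng(0), max_steps=10_000):
    # W: symmetric weight matrix with zero diagonal; theta: thresholds;
    # v: state vector with 0/1 entries.
    v = v.copy()
    for _ in range(max_steps):
        if np.array_equal((W @ v >= theta).astype(v.dtype), v):
            return v                              # stable state reached
        i = rng.integers(len(v))                  # update an arbitrary unit
        v[i] = 1 if W[i] @ v >= theta[i] else 0   # binary threshold update
    raise RuntimeError("no stable state within max_steps")
```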

4. Symmetric Networks – Examples

[Figure: several small example networks; recognizable values include thresholds 0 and 1 and connection weights 2, but the diagrams do not survive the text extraction.]

5. Symmetric Networks – Another Example

◮ Consider the following network with initial external activation:
[Figure: a network whose units, weights, and initial external activations (values such as −2, −1, 0, 1, 2, 3 appear in the diagram) do not survive the text extraction.]
◮ Exercise: Find the stable states of the network shown on this slide.
◮ Notation: In the sequel, I will omit the threshold iff it is 0.

6. Attractors

◮ Consider the space of states of a given network.
⊲ Stable states are also often called attractors.
⊲ The computation starts in the state corresponding to $x$.
⊲ Repeatedly updating this state leads to trajectories of states.
⊲ The trajectories are finite and yield attractors as final states.
⊲ The set of states whose trajectories lead to the same attractor is called the basin of this attractor.
◮ Exercise: Consider the network shown on the previous slide. Specify all basins of attraction and all trajectories. (For small networks the attractors can be found mechanically, as sketched below.)
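
A sketch of an exhaustive attractor search over all $2^l$ states; this only finds the stable states, and is feasible only for small $l$:

```python
import itertools
import numpy as np

def stable_states(W: np.ndarray, theta: np.ndarray):
    # Yield every binary state that is a fixed point of the threshold update.
    l = len(theta)
    for bits in itertools.product([0, 1], repeat=l):
        v = np.array(bits)
        if np.array_equal((W @ v >= theta).astype(int), v):
            yield v
```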

7. Symmetric Networks and Associative Memories

◮ Can we use symmetric networks as associative memories?
◮ Let $M$ be a set of patterns and $x$ a bit vector of length $l$.
◮ Idea:
⊲ Externally activate a network of $l$ units by $x$ at time $t = 0$; all inputs at time $t > 0$ are 0.
⊲ Search for weights such that after some time the network reaches a stable state which represents the pattern with minimal Hamming distance to $x$.

8. Notational Convention

◮ To simplify the mathematical model we make the following assumptions:
⊲ Threshold $\theta_k = 0$ for all units $u_k$ in a symmetric network.
⊲ Output $v_k \in \{-1, 1\}$ for all units $u_k$.
That is, we use binary bipolar threshold units with threshold 0.
◮ Exercise: Are these assumptions a restriction?
◮ Let $l$ be the number of units in the network. Then
  $v_i = \mathrm{sgn}\Big( \sum_{j=1}^{l} w_{ij} v_j \Big), \quad \text{where} \quad \mathrm{sgn}(x) = \begin{cases} 1 & \text{if } x \ge 0, \\ -1 & \text{otherwise.} \end{cases}$
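
The bipolar update rule in code, a direct transcription of the two formulas above:

```python
import numpy as np

def sgn(x):
    # sgn as defined on this slide: sgn(0) = 1
    return np.where(x >= 0, 1, -1)

def update(W: np.ndarray, v: np.ndarray, i: int) -> np.ndarray:
    # Bipolar threshold update of unit i: v_i = sgn(sum_j w_ij v_j)
    v = v.copy()
    v[i] = sgn(W[i] @ v)
    return v
```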

9. Storing a Single Bit Vector

◮ What should the weights look like?
◮ Let $v$ be a bit vector of length $l$.
◮ $v$ is a stable state if for all $i$ we find
  $v_i = \mathrm{sgn}\Big( \sum_{j=1}^{l} w_{ij} v_j \Big).$
◮ This holds if the weights are proportional to $v_i v_j$, e.g., $w_{ij} = \frac{1}{l} v_i v_j$:
  $\mathrm{sgn}\Big( \sum_{j=1}^{l} \tfrac{1}{l} v_i v_j v_j \Big) = \mathrm{sgn}\Big( \tfrac{1}{l} \sum_{j=1}^{l} v_i \Big) = \mathrm{sgn}(v_i) = v_i.$
◮ Errors in $x$ are corrected if $\#\mathit{errors}(x) < \frac{l}{2}$.
◮ $v$ is an attractor.
◮ But $-v$ is also an attractor, which is reached if $\#\mathit{errors}(x) > \frac{l}{2}$.
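
The one-pattern construction written out as a sketch; the diagonal is zeroed per the $w_{jj} = 0$ convention of slide 3, and the example data is mine:

```python
import numpy as np

def store_single(v: np.ndarray) -> np.ndarray:
    # w_ij = (1/l) v_i v_j for a bipolar pattern v
    W = np.outer(v, v) / len(v)
    np.fill_diagonal(W, 0.0)
    return W

v = np.array([1, -1, 1, 1, -1])
W = store_single(v)
x = np.array([1, 1, 1, 1, -1])        # one flipped bit: #errors(x) < l/2
print(np.where(W @ x >= 0, 1, -1))    # -> [ 1 -1  1  1 -1], i.e. v is recovered
```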

10. Storing Several Bit Vectors

◮ Let $m$ be the number of bit vectors and $l$ the number of units in the network:
  $w_{ij} = \frac{1}{l} \sum_{k=1}^{m} v_{ki} v_{kj}.$
◮ Remark: This is often called the (generalized) Hebb rule (see Hebb 1949).
◮ Are all vectors $v_r \in M$ stable states? That is, does $\mathrm{sgn}\big( \sum_{j=1}^{l} w_{ij} v_{rj} \big) = v_{ri}$ hold? We have
  $\mathrm{sgn}\Big( \sum_{j=1}^{l} w_{ij} v_{rj} \Big) = \mathrm{sgn}\Big( \frac{1}{l} \sum_{j=1}^{l} \sum_{k=1}^{m} v_{ki} v_{kj} v_{rj} \Big) = \mathrm{sgn}\Big( v_{ri} + \frac{1}{l} \sum_{j=1}^{l} \sum_{k=1,\,k \ne r}^{m} v_{ki} v_{kj} v_{rj} \Big).$
◮ Let $C_{ri} = \frac{1}{l} \sum_{j=1}^{l} \sum_{k=1,\,k \ne r}^{m} v_{ki} v_{kj} v_{rj}$.
◮ If $C_{ri} = 0$ for all $i$, then each vector is a stable state.
◮ If $|C_{ri}| < 1$ for all $i$, then $C_{ri}$ cannot change the sign of $v_{ri}$.
◮ Storage capacity: If the vectors are stochastically independent and should be perfectly recalled, then the maximum storage capacity is proportional to $\frac{l}{\log l}$.
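
A sketch of the generalized Hebb rule and of the crosstalk term $C_{ri}$; the diagonal of W is zeroed per the $w_{jj} = 0$ convention, while the crosstalk function follows the slide's formula literally:

```python
import numpy as np

def hebb(patterns: np.ndarray) -> np.ndarray:
    # patterns: shape (m, l), bipolar entries; w_ij = (1/l) sum_k v_ki v_kj
    m, l = patterns.shape
    W = patterns.T @ patterns / l
    np.fill_diagonal(W, 0.0)
    return W

def crosstalk(patterns: np.ndarray, r: int, i: int) -> float:
    # C_ri = (1/l) sum_j sum_{k != r} v_ki v_kj v_rj
    m, l = patterns.shape
    others = np.delete(patterns, r, axis=0)
    return float(others[:, i] @ (others @ patterns[r]) / l)
```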

11. Hopfield and Symmetric Networks

◮ A network realizing an associative memory as shown on the previous slide is often called a Hopfield network.
◮ J.J. Hopfield: Neural Networks and Physical Systems with Emergent Collective Computational Abilities. In: Proceedings of the National Academy of Sciences USA, 2554-2558 (1982).
◮ Exercise: Suppose we want to store the vectors $(-1, -1, 1, -1, 1, -1)$ and $(1, 1, -1, -1, 1, 1)$ in a symmetric network with 6 units. Construct the network which solves this problem.
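
One way to generate the weights for this exercise mechanically, using the Hebb rule from the previous slide (deriving and checking them by hand is what the exercise intends):

```python
import numpy as np

patterns = np.array([[-1, -1,  1, -1, 1, -1],
                     [ 1,  1, -1, -1, 1,  1]])
l = patterns.shape[1]
W = patterns.T @ patterns / l        # generalized Hebb rule
np.fill_diagonal(W, 0.0)
print(W)
# Both stored vectors should be stable under v_i = sgn(sum_j w_ij v_j):
for p in patterns:
    assert np.array_equal(np.where(W @ p >= 0, 1, -1), p)
```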

12. Energy Functions

◮ What happens precisely when a symmetric network is updated?
◮ Consider the energy function
  $E(t) = -\frac{1}{2} \sum_{i,j=1}^{l} w_{ij}\, v_i(t)\, v_j(t)$
  describing the state of a symmetric network with $l$ units at time $t$.
◮ Example: [Figure: units $u_1, \ldots, u_4$ with weight $-1$ between $u_1$ and $u_2$ and weight 2 on each of the connections from $u_1$, $u_2$, $u_3$ to $u_4$.] For this network,
  $E(t) = v_1(t) v_2(t) - 2 v_1(t) v_4(t) - 2 v_2(t) v_4(t) - 2 v_3(t) v_4(t).$
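
A sketch of the energy computation, instantiated with the example network above (the weight matrix is reconstructed from the stated $E(t)$):

```python
import numpy as np

def energy(W: np.ndarray, v: np.ndarray) -> float:
    # E = -(1/2) * sum_{i,j} w_ij v_i v_j
    return -0.5 * float(v @ W @ v)

# The example network from this slide (units u1..u4):
W = np.zeros((4, 4))
W[0, 1] = W[1, 0] = -1
W[0, 3] = W[3, 0] = 2
W[1, 3] = W[3, 1] = 2
W[2, 3] = W[3, 2] = 2
v = np.array([1, -1, 1, 1])
print(energy(W, v))   # v1*v2 - 2*v1*v4 - 2*v2*v4 - 2*v3*v4 = -1 - 2 + 2 - 2 = -3
```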

13. Properties of Energy Functions

◮ Theorem: $E$ is monotonically decreasing, i.e., $E(t + 1) \le E(t)$.
◮ Exercise: How does an update change the energy of a symmetric network if we do not assume that $w_{ii} = 0$?
◮ Exercise: Is the energy function still monotonically decreasing if we do not assume that $w_{ij} = w_{ji}$? Prove your answer.
◮ How plausible is the assumption that $w_{ij} = w_{ji}$?
◮ Exercise: Consider symmetric networks where the thresholds of the units need not be 0. Define a monotonically decreasing energy function for these networks. Prove your claim. (The theorem itself can also be checked empirically, as sketched below.)
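
An empirical check of the theorem: random asynchronous sgn-updates on a random symmetric zero-diagonal weight matrix never increase the energy (this illustrates the claim; it is no substitute for the proof asked for above):

```python
import numpy as np

rng = np.random.default_rng(1)
l = 8
W = rng.normal(size=(l, l))
W = (W + W.T) / 2                 # enforce w_ij = w_ji
np.fill_diagonal(W, 0.0)          # enforce w_ii = 0
v = np.where(rng.random(l) < 0.5, 1, -1)
for _ in range(200):
    e_before = -0.5 * v @ W @ v
    i = rng.integers(l)
    v[i] = 1 if W[i] @ v >= 0 else -1
    e_after = -0.5 * v @ W @ v
    assert e_after <= e_before + 1e-12
```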

14. Relation to Ising Models

◮ Spins are magnetic atoms with directions 1 and $-1$.
◮ Suppose there are $l$ atoms.
◮ For each atom $v_i$ a magnetic field $h_i$ is defined by
  $h_i = \sum_{j=1}^{l} w_{ij} v_j + h^e,$
  where $h^e$ is the external field.
◮ At low temperatures spins follow the magnetic field. This is described by the energy function
  $H = -\frac{1}{2} \sum_{i,j=1}^{l} w_{ij} v_i v_j - h^e \sum_{i=1}^{l} v_i.$

15. More on Ising Models

◮ At high temperatures spins do not follow the magnetic field.
◮ Depending on the temperature $T$, thermal fluctuations occur.
◮ Mathematical model: Glauber dynamics
  $v_i = \begin{cases} 1 & \text{with probability } g(h_i), \\ -1 & \text{with probability } 1 - g(h_i), \end{cases}$
  where
  $g(h) = \frac{1}{1 + \exp(-2\beta h)}, \qquad \beta = \frac{1}{k_B T}, \qquad k_B \text{ Boltzmann's constant}.$
◮ Note that $1 - g(h) = g(-h)$.
◮ Behaviour of spins:
  $\mathrm{prob}(v_i = \pm 1) = \frac{1}{1 + \exp(\mp 2\beta h_i)}.$
◮ In equilibrium, states with low energy are more likely than states with higher energy.
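
A sketch of one Glauber update step under these definitions; the function name and parameter layout are mine:

```python
import numpy as np

def glauber_step(W, v, h_ext, beta, rng):
    # Pick a spin and set it to 1 with probability g(h_i),
    # where g(h) = 1 / (1 + exp(-2*beta*h)) and h_i = sum_j w_ij v_j + h_ext.
    i = rng.integers(len(v))
    h_i = W[i] @ v + h_ext
    g = 1.0 / (1.0 + np.exp(-2.0 * beta * h_i))
    v[i] = 1 if rng.random() < g else -1
    return v
```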

16. Stochastic Networks

◮ Hinton, Sejnowski: Optimal Perceptual Inference. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 448-453 (1983).
⊲ They applied the previously mentioned results to symmetric networks:
  $\mathrm{prob}(v_i = 1) = \frac{1}{1 + \exp\big( -\beta \sum_{j=1}^{l} w_{ij} v_j \big)}, \quad \text{where } \beta = \frac{1}{T}.$
⊲ Such networks are called Boltzmann machines or stochastic networks.
⊲ As $T \to 0$, their behaviour approaches that of (deterministic) symmetric networks.
◮ Kirkpatrick, Gelatt, Vecchi: Optimization by Simulated Annealing. Science 220, 671-680 (1983).
⊲ Simulated annealing.
◮ Geman, Geman: Stochastic Relaxation, Gibbs Distributions, and the Bayesian Restoration of Images. IEEE Transactions on Pattern Analysis and Machine Intelligence, 6, 721-741 (1984).
⊲ Simulated annealing is guaranteed to find a global minimum of the energy function if the temperature is lowered in infinitesimally small steps.
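
A sketch combining the stochastic update rule with a simple geometric cooling schedule; the schedule and all names are illustrative, not taken from the cited papers:

```python
import numpy as np

def anneal(W, v, T_schedule, rng=np.random.default_rng(0)):
    # Boltzmann-machine updates with a decreasing temperature schedule:
    # prob(v_i = 1) = 1 / (1 + exp(-(1/T) * sum_j w_ij v_j))
    v = v.copy()
    for T in T_schedule:
        beta = 1.0 / T
        for _ in range(len(v)):
            i = rng.integers(len(v))
            p = 1.0 / (1.0 + np.exp(-beta * (W[i] @ v)))
            v[i] = 1 if rng.random() < p else -1
    return v

# Example: geometric cooling from T = 10 down towards T ~ 0.01
schedule = 10.0 * 0.9 ** np.arange(60)
```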
