

SLIDE 1

Vectors, Matrices, and Associative Memory

Computational Models of Neural Systems

Lecture 3.1

David S. Touretzky September, 2013

SLIDE 2

09/23/13 Computational Models of Neural Systems 2

A Simple Memory

Key = 1,  Memory weight = 4.7,  Result = Key × Memory = 4.7

SLIDE 3

Storing Multiple Memories

Memory weights: 4.7, 2.5, 5.3, with keys K_A = 1, K_B = 1, K_C = 1 on separate input lines.

Each input line activates a particular memory.

SLIDE 4

Mixtures (Linear Combinations) of Memories

Memory weights: 4.7, 2.5, 5.3. The key mixture 0.5 K_A + 0.5 K_B, i.e. (K_A + K_B)/2, retrieves 0.5 × 4.7 + 0.5 × 2.5 = 3.6.

SLIDE 5

Memories As Vectors

M = 〈4.7, 2.5, 5.3〉, with components M_x, M_y, M_z along the key axes.

K_A = 〈1,0,0〉 = x axis
K_B = 〈0,1,0〉 = y axis
K_C = 〈0,0,1〉 = z axis

Basis unit vectors: this memory can store three things.

SLIDE 6

Length of a Vector

Let ∥v∥ = length of v. Then ∥cv∥ = c∥v∥ (for c ≥ 0), and v/∥v∥ is a unit vector in the direction of v.

SLIDE 7

Dot Product: Axioms

Let v be a vector, u a unit vector, and d the length of v's projection onto u. Two axioms for the dot product:

v ⋅ u = d
(c v1) ⋅ v2 = c (v1 ⋅ v2) = v1 ⋅ (c v2)

SLIDE 8

Dot Product: Geometric Definition

Let u be a unit vector, r = ∥v∥, and d the projection of v onto u. Then

v ⋅ u = d = r cos θ = ∥v∥ cos θ

SLIDE 9

Dot Product of Two Arbitrary Vectors

v1 ⋅ v2 = ∥v1∥ ∥v2∥ cos θ

Proof: write v2 as a unit vector scaled by its length, v2 = (v2/∥v2∥) ∥v2∥. Then

v1 ⋅ v2 = (v1 ⋅ v2/∥v2∥) ∥v2∥ = (∥v1∥ cos θ) ∥v2∥ = ∥v1∥ ∥v2∥ cos θ

SLIDE 10

Dot Product: Algebraic Definition

Let v = 〈v1, v2〉 and w = 〈w1, w2〉. Algebraically, v ⋅ w = v1w1 + v2w2. But also v ⋅ w = ∥v∥ ∥w∥ cos θ. Can we reconcile these two definitions? See the proof in the Jordan (optional) reading.

SLIDE 11

Length and Dot Product

v ⋅ v = ∥v∥²

Proof: v ⋅ v = ∥v∥ ∥v∥ cos θ. The angle θ = 0, so cos θ = 1, giving v ⋅ v = ∥v∥ ∥v∥ = ∥v∥².

And also: v ⋅ v = v_x v_x + v_y v_y = ∥v∥²

so we have: ∥v∥ = √(v_x² + v_y²)

SLIDE 12

Associative Retrieval as Dot Product

Memory M = 〈4.7, 2.5, 5.3〉 with keys K_A = K_B = K_C = 1 on separate input lines.

Retrieving memory A is equivalent to computing K_A ⋅ M. This works for mixtures of memories as well:

K_AB = 0.5 K_A + 0.5 K_B
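As a quick check in Python (plain lists, no libraries; `dot` is a local helper, and the values are the ones on the slide):

```python
# Memory vector holding three stored values, one per input line.
M = [4.7, 2.5, 5.3]

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

K_A = [1, 0, 0]
K_B = [0, 1, 0]

# Retrieving memory A is the dot product of its key with the memory.
print(dot(K_A, M))   # 4.7

# A mixture key retrieves the corresponding mixture of stored values.
K_AB = [0.5 * a + 0.5 * b for a, b in zip(K_A, K_B)]
print(dot(K_AB, M))  # 0.5*4.7 + 0.5*2.5, i.e. about 3.6
```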

SLIDE 13

Orthogonal Keys

The key vectors are mutually orthogonal:

K_A = 〈1,0,0〉   K_B = 〈0,1,0〉   K_C = 〈0,0,1〉

K_A ⋅ K_B = 1⋅0 + 0⋅1 + 0⋅0 = 0,  so θ_AB = arccos 0 = 90°

  • We don't have to use vectors of the form 〈…,0,1,0,…〉. Any set of mutually orthogonal unit vectors will do.

SLIDE 14

Keys Not Aligned With the Axes

K_A = 〈1,0,0〉   K_B = 〈0,1,0〉   K_C = 〈0,0,1〉

Rotate the keys by 45 degrees about the x axis, then 30 degrees about the z axis. This gives a new set of keys, still mutually orthogonal:

J_A = 〈0.87, 0.49, 0〉
J_B = 〈−0.35, 0.61, 0.71〉
J_C = 〈0.35, −0.61, 0.71〉

J_A ⋅ J_A = 0.87² + 0.49² + 0² = 1
J_A ⋅ J_B = 0.87⋅(−0.35) + 0.49⋅0.61 + 0⋅0.71 = 0
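The orthonormality claims can be verified numerically (keys copied from the slide, rounded to two decimals, so the checks are only approximate):

```python
# Rotated keys from the slide, rounded to two decimal places.
J_A = [0.87, 0.49, 0.0]
J_B = [-0.35, 0.61, 0.71]
J_C = [0.35, -0.61, 0.71]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

print(dot(J_A, J_A))  # ~1: unit length
print(dot(J_A, J_B))  # ~0: orthogonal
print(dot(J_B, J_C))  # ~0: orthogonal
```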

SLIDE 15

Setting the Weights

How do we set the memory weights when the keys are mutually orthogonal unit vectors but aren't aligned with the axes?

M = m A J A  mB J B  mC  J C Prove that this is correct:  J A⋅ M = m A because:  J A⋅ M = J A⋅ J AmA   J B mB   J C mC =  J A⋅ J A⋅m A   J A⋅ J B⋅mB   J A⋅ JC⋅mC 1

SLIDE 16

Setting the Weights

m_A = 4.7   J_A = 〈0.87, 0.49, 0〉
m_B = 2.5   J_B = 〈−0.35, 0.61, 0.71〉
m_C = 5.3   J_C = 〈0.35, −0.61, 0.71〉

M = Σ_k m_k J_k = 〈5.1, 0.6, 5.5〉
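A sketch of the weight computation and a readback, using the same numbers (`dot` is a local helper; results are approximate because the keys are rounded):

```python
# Store three values along rotated orthonormal keys, then read one back.
m = {'A': 4.7, 'B': 2.5, 'C': 5.3}
J = {'A': [0.87, 0.49, 0.0],
     'B': [-0.35, 0.61, 0.71],
     'C': [0.35, -0.61, 0.71]}

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# M = sum_k m_k * J_k, computed component by component.
M = [sum(m[k] * J[k][i] for k in m) for i in range(3)]
print([round(x, 2) for x in M])      # roughly <5.1, 0.6, 5.5>

# Retrieval: J_A . M recovers m_A (up to rounding in the keys).
print(round(dot(J['A'], M), 1))
```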

SLIDE 17

Storing Vectors: Each Stored Component Is A Separate Memory

Four memory vectors share the same keys K_A = K_B = K_C = 1, one per input line:

M1 = 〈4.7, 2.5, 5.3〉
M2 = 〈10, 20, 30〉
M3 = 〈0.6, 0.5, 0.4〉
M4 = 〈−8, −9, −7〉

K_B retrieves 〈2.5, 20, 0.5, −9〉

SLIDE 18

Linear Independence

  • A set of vectors is linearly independent if no element can be constructed as a linear combination of the others.
  • In a system with n dimensions, there can be at most n linearly independent vectors.
  • Any set of n linearly independent vectors constitutes a basis set for the space, from which any other vector can be constructed.

[Figure: three examples — two linearly independent sets, and one set that is not linearly independent (all 3 vectors lie in the x-y plane).]

SLIDE 19

Linear Independence Is Enough

  • Key vectors do not have to be orthogonal for an associative memory to work correctly.
  • All that is required is linear independence.
  • However, we can no longer set the weights as simply as we did previously, since K_A ⋅ K_B ≠ 0.
  • Matrix inversion is one solution: let K = 〈K_A, K_B, K_C〉 and m = 〈m_A, m_B, m_C〉; then M = K⁻¹ ⋅ m.
  • Another approach is an iterative algorithm: Widrow-Hoff.

SLIDE 20

The Widrow-Hoff Algorithm

  • Guaranteed to converge to a solution if the key vectors are linearly independent.
  • This is the way simple, one-layer neural nets are trained.
  • Also called the LMS (Least Mean Squares) algorithm.
  • Identical to the CMAC training algorithm (Albus).

  1. Let initial weights M0 = 0.
  2. Randomly choose a pair (mi, Ki) from the training set.
  3. Compute the actual output value a = Mt ⋅ Ki.
  4. Measure the error: e = mi − a.
  5. Adjust the weights: Mt+1 = Mt + η⋅e⋅Ki, where η is a small learning rate.
  6. Return to step 2.
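The six steps above can be sketched directly in Python. The learning rate η = 0.5 and the training pairs below are illustrative choices, not from the slides; note the second key is deliberately not orthogonal to the first:

```python
import random

def widrow_hoff(pairs, eta=0.5, steps=2000, seed=0):
    """LMS / Widrow-Hoff: learn weights M so that M . K_i ~ m_i.

    pairs: list of (target_value, key_vector); the keys need only be
    linearly independent, not orthogonal.
    """
    rng = random.Random(seed)
    n = len(pairs[0][1])
    M = [0.0] * n                                  # step 1: weights start at zero
    for _ in range(steps):
        m_i, K_i = rng.choice(pairs)               # step 2: random training pair
        a = sum(w * k for w, k in zip(M, K_i))     # step 3: actual output
        e = m_i - a                                # step 4: error
        M = [w + eta * e * k for w, k in zip(M, K_i)]  # step 5: weight update
    return M                                       # step 6 is the loop itself

# Linearly independent but non-orthogonal keys still converge.
pairs = [(4.7, [1, 0, 0]), (2.5, [1, 1, 0]), (5.3, [0, 0, 1])]
M = widrow_hoff(pairs)
for m_i, K_i in pairs:
    print(round(sum(w * k for w, k in zip(M, K_i)), 2))
```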
SLIDE 21

High Dimensional Systems

  • In typical uses of associative memories, the key vectors have many components (a large number of dimensions).
  • Computing matrix inverses is time consuming, so don't bother. Just assume orthogonality.
  • If the vectors are sparse, they will be nearly orthogonal.
  • How can we check? θ = arccos( (v ⋅ w) / (∥v∥ ⋅ ∥w∥) ). The angle between 〈1,1,1, 1, 0,0,0, 0,0,0, 0,0,0〉 and 〈0,0,0, 1, 1,1,1, 0,0,0, 0,0,0〉 is 76°.
  • Because the keys aren't orthogonal, there will be interference, resulting in "noise" in the memory.
  • Memory retrievals can produce a mixture of memories.

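The check can be done in a few lines of Python (the `angle_deg` helper is ours, not from the lecture):

```python
import math

def angle_deg(v, w):
    """Angle between two vectors, in degrees."""
    dot = sum(a * b for a, b in zip(v, w))
    nv = math.sqrt(sum(a * a for a in v))
    nw = math.sqrt(sum(b * b for b in w))
    return math.degrees(math.acos(dot / (nv * nw)))

# Two sparse binary keys that share a single active bit.
v = [1,1,1, 1, 0,0,0, 0,0,0, 0,0,0]
w = [0,0,0, 1, 1,1,1, 0,0,0, 0,0,0]
print(round(angle_deg(v, w)))   # 76: nearly orthogonal despite the overlap
```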
SLIDE 22

Eliminating Noise

  • Noise occurs when:
    – Keys are linearly independent but not strictly orthogonal.
    – We're not using LMS to find optimal weights, but instead relying on the keys being nearly orthogonal.
  • If we apply some constraints on the stored memory values, the noise can be reduced.
  • Example: assume the stored values are binary: 0 or 1.
  • With noise, a stored 1 value might be retrieved as 0.9 or 1.3. A stored 0 might come back as 0.1 or −0.2.
  • Solution: use a binary output unit with a threshold of 0.5.

SLIDE 23

Thresholding for Noise Reduction

[Figure: memory outputs passed through a threshold device.]

SLIDE 24

Partial Keys

  • Suppose we use sparse, nearly orthogonal binary keys to store binary vectors:

    K_A = 〈1,1,1,1,0,0,0,0〉
    K_B = 〈0,0,0,0,1,1,1,1〉

  • It should be possible to retrieve a pattern based on a partial key: 〈1,0,1,1,0,0,0,0〉.
  • The threshold must be adjusted accordingly.
  • Solution: normalize the input to the threshold unit by dividing by the length of the key provided.
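A minimal sketch of partial-key retrieval, assuming the normalization meant here is division by the key's number of active bits (its squared length, for a binary key) before a 0.5 threshold; the stored patterns are made-up examples:

```python
# Binary keys from the slide and two made-up 3-bit patterns to store.
K_A = [1,1,1,1, 0,0,0,0]
K_B = [0,0,0,0, 1,1,1,1]
stored = {'A': [1, 0, 1], 'B': [0, 1, 1]}

# One memory vector per stored component: M_j = sum_k stored[k][j] * K_k
n = len(K_A)
M = [[stored['A'][j] * K_A[i] + stored['B'][j] * K_B[i] for i in range(n)]
     for j in range(3)]

def retrieve(key):
    # Assumption: divide by the number of active bits in the key provided,
    # then apply a 0.5 threshold to each component.
    active = sum(key)
    out = []
    for Mj in M:
        raw = sum(k * m for k, m in zip(key, Mj)) / active
        out.append(1 if raw > 0.5 else 0)
    return out

partial = [1,0,1,1, 0,0,0,0]   # three of K_A's four bits
print(retrieve(partial))        # recovers stored pattern A: [1, 0, 1]
```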

SLIDE 25

Scaling for Partial Keys

[Figure: key lines K_A1…K_A4 and K_B1…K_B4 feed the memory; the output is divided (÷) by the key length before the threshold device.]

SLIDE 26

Warning About Binary Complements

  • The binary complement of 〈1,0,0,0〉 is 〈0,1,1,1〉. The binary complement of 〈0,1,0,0〉 is 〈1,0,1,1〉.
  • In some respects, a bit string and its complement are equivalent, but this is not true for vector properties.
  • If two binary vectors are orthogonal, their binary complements will not be:
    – The angle between 〈1,0,0,0〉 and 〈0,1,0,0〉 is 90°.
    – The angle between 〈0,1,1,1〉 and 〈1,0,1,1〉 is 48.2°.
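Both angles are easy to confirm numerically:

```python
import math

def angle_deg(v, w):
    """Angle between two vectors, in degrees."""
    dot = sum(a * b for a, b in zip(v, w))
    nv = math.sqrt(sum(a * a for a in v))
    nw = math.sqrt(sum(b * b for b in w))
    return math.degrees(math.acos(dot / (nv * nw)))

a, b = [1, 0, 0, 0], [0, 1, 0, 0]
ac = [1 - x for x in a]    # binary complement <0,1,1,1>
bc = [1 - x for x in b]    # binary complement <1,0,1,1>

print(round(angle_deg(a, b), 1))    # 90.0: orthogonal
print(round(angle_deg(ac, bc), 1))  # 48.2: complements are not
```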

SLIDE 27

Matrix Memory Demo

SLIDE 28

Matrix Memory Demo

SLIDE 29

Matrix Memory Demo

SLIDE 30

Matrix Memory Demo

SLIDE 31

Matrix Memory Demo: Interference

SLIDE 32

Matrix Memory Demo

SLIDE 33

Matrix Memory Demo: Sparse Encoding

SLIDE 34

Dot Products and Neurons

  • A neuron that linearly sums its inputs is computing a dot product of the input vector x = 〈x1, x2, x3〉 with the weight vector w = 〈w1, w2, w3〉:

    y = x ⋅ w = ∥x∥ ∥w∥ cos θ

  • The output y for a fixed-magnitude input x will be largest when x is pointing in the same direction as the weight vector w.
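A toy linear neuron illustrating the point (the weight vector here is an arbitrary unit-length example):

```python
# A linear neuron: its output is the dot product of input and weights.
w = [0.6, 0.8, 0.0]   # weight vector; chosen to have unit length

def neuron(x):
    return sum(xi * wi for xi, wi in zip(x, w))

# Two inputs of the same magnitude: the one aligned with w gives the
# larger output, because cos(theta) = 1 only when x points along w.
aligned = [0.6, 0.8, 0.0]
other   = [0.0, 0.8, 0.6]
print(neuron(aligned), neuron(other))
```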

SLIDE 35

Pattern Classification by Dot Product

From Kohonen et al. (1981)

SLIDE 36

Hetero-Associators

  • Matrix memories are a simple example of associative memories.
  • If the keys and stored memories are distinct, the architecture is called a hetero-associator.

[Figure: a Hebbian-learning hetero-associator. From Kohonen et al. (1981).]

SLIDE 37

Auto-Associators

  • If the keys and memories are identical, the architecture is called an auto-associator.
  • It can retrieve a memory based on a noisy or incomplete fragment. The fragment serves as the "key".

From Kohonen et al. (1981)

SLIDE 38

Feedback in Auto-Associators

  • Supply an initial noisy or partial key K0.
  • Result is a memory K1 which can be used as a better key.
  • Use K1 to retrieve K2, etc. A handful of cycles suffices.
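A sketch of this feedback cleanup, assuming an outer-product weight matrix and the divide-and-threshold scheme from the earlier slides:

```python
# Two orthogonal sparse binary patterns stored auto-associatively.
p1 = [1,1,1,1, 0,0,0,0]
p2 = [0,0,0,0, 1,1,1,1]
n = len(p1)

# M = p1 p1^T + p2 p2^T (outer-product weight matrix).
M = [[p1[i]*p1[j] + p2[i]*p2[j] for j in range(n)] for i in range(n)]

def cleanup_step(x):
    # One feedback cycle: multiply by M, normalize by the number of
    # active bits in the current key, then threshold at 0.5.
    active = sum(x) or 1
    raw = [sum(M[i][j] * x[j] for j in range(n)) / active for i in range(n)]
    return [1 if r > 0.5 else 0 for r in raw]

K0 = [1,0,1,1, 0,0,0,0]     # partial / noisy version of p1
K1 = cleanup_step(K0)        # a better key
K2 = cleanup_step(K1)        # stable: already equal to p1
print(K1, K2)
```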
SLIDE 39

Matrix and Vector Transpose

The transpose swaps rows and columns:

[a b c; d e f; g h i]ᵀ = [a d g; b e h; c f i]

u = [u1; u2; u3] is a column vector; its transpose uᵀ = [u1 u2 u3] is a row vector.

SLIDE 40

A Matrix is a Collection of Vectors

One way to view the matrix

[u1 v1 w1; u2 v2 w2; u3 v3 w3]

is as a collection of three column vectors [u1; u2; u3], [v1; v2; v3], [w1; w2; w3]; in other words, a row matrix of column vectors [u v w]. For many operations on vectors, there are equivalent operations on matrices that treat the matrix as a set of vectors.

SLIDE 41

Inner vs. Outer Product

A column vector u is N×1.

Inner product: (1×N)(N×1) → 1×1, a scalar:

uᵀu = u1⋅u1 + … + uN⋅uN = ∥u∥²

Outer product: (N×1)(1×N) → N×N, a matrix:

u uᵀ = [u1u1 u1u2 u1u3; u2u1 u2u2 u2u3; u3u1 u3u2 u3u3] = [u1u  u2u  u3u]

(each column j is a scaled copy u_j ⋅ u of the vector itself).
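In Python (plain lists; a nested list comprehension plays the role of the outer product):

```python
u = [1.0, 2.0, 3.0]

# Inner product u^T u: a scalar, equal to the squared length of u.
inner = sum(a * a for a in u)

# Outer product u u^T: an NxN matrix with entries u_i * u_j.
outer = [[a * b for b in u] for a in u]

print(inner)      # ||u||^2 = 1 + 4 + 9 = 14.0
print(outer[0])   # first row: [u1*u1, u1*u2, u1*u3] = [1.0, 2.0, 3.0]
```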

SLIDE 42

Weights for an Auto-Associator

  • How can we derive the auto-associator's weight matrix?
    – Assume the patterns are orthogonal.
    – For each pattern, compute the outer product of the pattern with itself, giving a matrix.
    – Add up all these outer products to find the weight matrix:

      M = Σ_p p pᵀ

  • Note: at most n patterns can be stored in such a memory, where n is the number of rows or columns in the weight matrix.
  • Note: the input patterns are not unit vectors (see next slide), but we can compensate for that by using the division trick.

SLIDE 43

Weight Matrix by Outer Product

Let u, v, w be an orthonormal set, and let M = u uᵀ + v vᵀ + w wᵀ. Writing M by columns:

M = [u1u + v1v + w1w   u2u + v2v + w2w   u3u + v3v + w3w]

Therefore:

M u = [u1(u⋅u) + v1(v⋅u) + w1(w⋅u);  u2(u⋅u) + v2(v⋅u) + w2(w⋅u);  u3(u⋅u) + v3(v⋅u) + w3(w⋅u)]
    = [u1; u2; u3] = u

since u⋅u = 1 and v⋅u = w⋅u = 0. For orthogonal unit vectors, the outer product of the vector with itself is exactly the vector's contribution to the weight matrix.
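A numerical check that M u = u (the orthonormal set below is an arbitrary example, not from the slides):

```python
import math

# An orthonormal set in 3D.
s = 1 / math.sqrt(2)
u = [1.0, 0.0, 0.0]
v = [0.0, s,  s]
w = [0.0, s, -s]

def outer(a, b):
    return [[x * y for y in b] for x in a]

def mat_add(A, B):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def mat_vec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

# M = u u^T + v v^T + w w^T
M = mat_add(mat_add(outer(u, u), outer(v, v)), outer(w, w))

# Each stored pattern is retrieved exactly: M u = u, M v = v, M w = w.
print([round(x, 6) for x in mat_vec(M, v)])
```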

SLIDE 44

Eigenvectors

Let M be any square matrix. Then there exist unit vectors u such that M u = λ u. Each u is called an eigenvector of the matrix; the corresponding λ is called an eigenvalue.

  • We can think of any matrix as an auto-associative memory. The "keys" are the eigenvectors.
  • Retrieval is by matrix multiplication.
  • The eigenvectors are the directions along which, for a unit vector input, the memory will produce the locally largest output.
  • The eigenvalues indicate how much a key is "stretched" by multiplication by the matrix.
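Power iteration makes the "memory produces its strongest key" view concrete: repeatedly applying M to a unit vector converges to the eigenvector with the largest eigenvalue. The 2×2 matrix here is an arbitrary example:

```python
import math

M = [[2.0, 1.0],
     [1.0, 2.0]]   # symmetric example; eigenvalues are 3 and 1

def mat_vec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def normalize(x):
    n = math.sqrt(sum(xi * xi for xi in x))
    return [xi / n for xi in x]

# Power iteration: each multiply stretches the dominant-eigenvector
# component the most, so normalizing and repeating converges to it.
u = normalize([1.0, 0.0])
for _ in range(50):
    u = normalize(mat_vec(M, u))

# Rayleigh quotient recovers the eigenvalue ("stretch factor").
lam = sum(a * b for a, b in zip(mat_vec(M, u), u))
print([round(x, 4) for x in u], round(lam, 4))   # ~[0.7071, 0.7071], ~3.0
```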

SLIDE 45

Other Ways to Get Pattern Cleanup

  • Recurrent connections are not required. Another approach is to cascade several associative memories.

SLIDE 46

Retrieving Sequences

  • Associative memories can be taught to produce sequences by feeding part of the output back to the input.

SLIDE 47

Summary

  • Orthogonal keys yield perfect memories via a simple outer product rule.
  • Linearly independent keys yield perfect memories if matrix inversion or the Widrow-Hoff (LMS) algorithm is used to derive the weights.
  • Sparse patterns in a high dimensional space are nearly orthogonal, and should produce little interference even using the simple outer product rule.
  • Sparse patterns also seem more biologically plausible.