

1. COMP304 Introduction to Neural Networks, based on slides by Christian Borgelt (http://www.borgelt.net/)

2. Motivation: Why (Artificial) Neural Networks?
• (Neuro-)Biology / (Neuro-)Physiology / Psychology:
◦ Exploit similarity to real (biological) neural networks.
◦ Build models to understand nerve and brain operation by simulation.
• Computer Science / Engineering / Economics:
◦ Mimic certain cognitive capabilities of human beings.
◦ Solve learning/adaptation, prediction, and optimization problems.
• Physics / Chemistry:
◦ Use neural network models to describe physical phenomena.
◦ Special case: spin glasses (alloys of magnetic and non-magnetic metals).

3. Motivation: Why Neural Networks in AI?
Physical-Symbol System Hypothesis [Newell and Simon 1976]: A physical-symbol system has the necessary and sufficient means for general intelligent action.
Neural networks process simple signals, not symbols. So why study neural networks in Artificial Intelligence?
• Symbol-based representations work well for inference tasks, but are fairly bad for perception tasks.
• Symbol-based expert systems tend to get slower with growing knowledge, while human experts tend to get faster.
• Neural networks allow for highly parallel information processing.
• There are several successful applications in industry and finance.

4. Biological Background
Structure of a prototypical biological neuron.
[Figure: labeled parts are the terminal buttons, synapse, dendrites, cell body (soma), cell core, axon, and myelin sheath.]

5. Biological Background
(Very) simplified description of neural information processing:
• The axon terminal releases chemicals, called neurotransmitters.
• These act on the membrane of the receptor dendrite to change its polarization. (The inside is usually 70 mV more negative than the outside.)
• Decrease in potential difference: excitatory synapse. Increase in potential difference: inhibitory synapse.
• If there is enough net excitatory input, the axon is depolarized.
• The resulting action potential travels along the axon. (The speed depends on the degree to which the axon is covered with myelin.)
• When the action potential reaches the terminal buttons, it triggers the release of neurotransmitters.

6. Threshold Logic Units

7. Threshold Logic Units
A Threshold Logic Unit (TLU) is a processing unit for numbers with n inputs x1, ..., xn and one output y. The unit has a threshold θ and each input xi is associated with a weight wi. A threshold logic unit computes the function

y = 1  if  w1·x1 + ... + wn·xn ≥ θ,   and   y = 0  otherwise,

that is, it outputs 1 exactly when the scalar product of the weight vector and the input vector reaches the threshold.
[Diagram: the inputs x1, ..., xn enter the unit with weights w1, ..., wn; the unit carries the threshold θ and emits the output y.]
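This computation is easy to express in code. Below is a minimal Python sketch (the helper name tlu and its signature are my own, not from the slides): it returns 1 if the weighted sum of the inputs reaches the threshold and 0 otherwise.

```python
def tlu(inputs, weights, theta):
    """Threshold logic unit: 1 if the weighted input sum reaches the threshold, else 0."""
    s = sum(w * x for w, x in zip(weights, inputs))
    return 1 if s >= theta else 0
```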

8. Threshold Logic Units: Examples
Threshold logic unit for the conjunction x1 ∧ x2, with weights w1 = 3, w2 = 2 and threshold θ = 4:

x1  x2 | 3·x1 + 2·x2 | y
 0   0 |      0      | 0
 1   0 |      3      | 0
 0   1 |      2      | 0
 1   1 |      5      | 1

Threshold logic unit for the implication x2 → x1, with weights w1 = 2, w2 = −2 and threshold θ = −1:

x1  x2 | 2·x1 − 2·x2 | y
 0   0 |      0      | 1
 1   0 |      2      | 1
 0   1 |     −2      | 0
 1   1 |      0      | 1
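The two example units can be checked against their truth tables with the hypothetical tlu() helper sketched after slide 7:

```python
# Conjunction x1 ∧ x2: weights (3, 2), threshold 4.
# Implication x2 → x1: weights (2, -2), threshold -1.
for x1 in (0, 1):
    for x2 in (0, 1):
        y_and = tlu((x1, x2), (3, 2), 4)
        y_imp = tlu((x1, x2), (2, -2), -1)
        print((x1, x2), y_and, y_imp)
# y_and is 1 only for (1, 1); y_imp is 0 only for (0, 1).
```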

9. Threshold Logic Units: Examples
Threshold logic unit for (x1 ∧ ¬x2) ∨ (x1 ∧ x3) ∨ (¬x2 ∧ x3), with weights w1 = 2, w2 = −2, w3 = 2 and threshold θ = 1:

x1  x2  x3 | Σ wi·xi | y
 0   0   0 |    0    | 0
 1   0   0 |    2    | 1
 0   1   0 |   −2    | 0
 1   1   0 |    0    | 0
 0   0   1 |    2    | 1
 1   0   1 |    4    | 1
 0   1   1 |    0    | 0
 1   1   1 |    2    | 1

10. Threshold Logic Units: Geometric Interpretation
Review of line representations. Straight lines are usually represented in one of the following forms:
Explicit form: g ≡ x2 = b·x1 + c
Implicit form: g ≡ a1·x1 + a2·x2 + d = 0
Point-direction form: g ≡ x = p + k·r
Normal form: g ≡ (x − p)·n = 0
with the parameters:
b: slope (gradient) of the line
c: intercept on the x2 axis
p: position vector of a point on the line (base vector)
r: direction vector of the line
n: normal vector of the line

11. Threshold Logic Units: Geometric Interpretation
A straight line and its defining parameters.
[Figure: the line g in the (x1, x2) plane with slope b = r2/r1, intercept c on the x2 axis, normal vector n = (a1, a2), d = −p·n, and the vector q = −(d/|n|)·(n/|n|) pointing from the origin O to the point of g closest to O.]

12. Threshold Logic Units: Geometric Interpretation
How to determine the side of the line g on which a point x lies.
[Figure: the projection of x onto the normal direction, z = (x·n)/|n|, is compared with the offset q = −d/|n| of the line; x lies on the side to which the normal vector points if and only if x·n ≥ −d, i.e. iff x·n + d ≥ 0.]
This is exactly the test a threshold logic unit performs: with the weights as normal vector and the threshold θ = −d, it outputs 1 for the points on one side of the line (or hyperplane) and 0 for those on the other.
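As a small worked illustration of the side test (a sketch only; it reuses the conjunction unit from slide 8, whose separating line can be written as 3·x1 + 2·x2 − 4 = 0, i.e. n = (3, 2) and d = −4), the sign of x·n + d tells on which side of the line a point lies:

```python
def signed_offset(point, n, d):
    """x·n + d: non-negative means x lies on the side the normal vector points to."""
    return sum(p * q for p, q in zip(point, n)) + d

# Separating line of the conjunction unit: 3*x1 + 2*x2 - 4 = 0.
for p in [(0, 0), (1, 0), (0, 1), (1, 1)]:
    print(p, signed_offset(p, (3, 2), -4) >= 0)   # True exactly for (1, 1)
```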

13. Threshold Logic Units: Geometric Interpretation
Threshold logic unit for x1 ∧ x2.
[Figure: the unit with weights 3, 2 and threshold 4, next to the unit square; the separating line 3·x1 + 2·x2 = 4 cuts the corner (1, 1), where the output is 1, off from the other three corners.]
A threshold logic unit for x2 → x1.
[Figure: the unit with weights 2, −2 and threshold −1, next to the unit square; the separating line 2·x1 − 2·x2 = −1 cuts the corner (0, 1), where the output is 0, off from the other three corners.]

14. Threshold Logic Units: Geometric Interpretation
Visualization of 3-dimensional Boolean functions.
[Figure: the unit cube with corners (0, 0, 0) and (1, 1, 1) and axes x1, x2, x3.]
Threshold logic unit for (x1 ∧ ¬x2) ∨ (x1 ∧ x3) ∨ (¬x2 ∧ x3).
[Figure: the unit with weights 2, −2, 2 and threshold 1, next to the unit cube; the separating plane 2·x1 − 2·x2 + 2·x3 = 1 cuts the corners where the function is 1 off from the rest of the cube.]

15. Threshold Logic Units: Limitations
The biimplication problem x1 ↔ x2: there is no separating line.

x1  x2 | y
 0   0 | 1
 1   0 | 0
 0   1 | 0
 1   1 | 1

[Figure: the unit square; the corners (0, 0) and (1, 1), where y = 1, cannot be separated from (1, 0) and (0, 1) by a single straight line.]

Formal proof by reductio ad absurdum:
since (0, 0) ↦ 1:  0 ≥ θ,            (1)
since (1, 0) ↦ 0:  w1 < θ,           (2)
since (0, 1) ↦ 0:  w2 < θ,           (3)
since (1, 1) ↦ 1:  w1 + w2 ≥ θ.      (4)
From (2) and (3): w1 + w2 < 2θ. With (4): 2θ > θ, i.e. θ > 0. Contradiction to (1).

16. Threshold Logic Units: Limitations
Total number and number of linearly separable Boolean functions ([Widner 1960] as cited in [Zell 1994]):

inputs | Boolean functions | linearly separable functions
   1   |          4        |        4
   2   |         16        |       14
   3   |        256        |      104
   4   |      65536        |     1774
   5   |    4.3 · 10^9     |    94572
   6   |    1.8 · 10^19    |  5.0 · 10^6

• For many inputs a threshold logic unit can compute almost no functions.
• Networks of threshold logic units are needed to overcome the limitations.
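For 2 inputs the count of 14 can be reproduced by brute force. The sketch below is my own illustration, not part of the slides: it enumerates all 16 Boolean functions of two inputs and searches a small integer grid of weights and thresholds, which happens to be large enough to realize every linearly separable 2-input function, so the two functions it does not find are exactly XOR and XNOR.

```python
from itertools import product

points = list(product((0, 1), repeat=2))       # the four corners of the unit square
functions = list(product((0, 1), repeat=4))    # all 16 Boolean functions of 2 inputs

def realizable(f):
    # Search small integer weights/thresholds; this grid suffices for every
    # linearly separable 2-input function (but of course not for XOR/XNOR).
    for w1, w2, theta in product(range(-2, 3), repeat=3):
        if all((w1 * x1 + w2 * x2 >= theta) == bool(y)
               for (x1, x2), y in zip(points, f)):
            return True
    return False

print(sum(realizable(f) for f in functions))   # prints 14
```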

17. Networks of Threshold Logic Units
Solving the biimplication problem with a network.
Idea: logical decomposition  x1 ↔ x2 ≡ (x1 → x2) ∧ (x2 → x1)
[Network diagram:]
• A first hidden unit computes y1 = x1 → x2: input x1 with weight −2, input x2 with weight 2, threshold −1.
• A second hidden unit computes y2 = x2 → x1: input x1 with weight 2, input x2 with weight −2, threshold −1.
• The output unit computes y = y1 ∧ y2 = (x1 ↔ x2): inputs y1 and y2, each with weight 2, threshold 3.
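A minimal sketch of this network in code, assuming the hypothetical tlu() helper from slide 7 and using the weights and thresholds listed above:

```python
def biimplication_net(x1, x2):
    # Hidden layer: the two implications.
    y1 = tlu((x1, x2), (-2, 2), -1)   # y1 = x1 -> x2
    y2 = tlu((x1, x2), (2, -2), -1)   # y2 = x2 -> x1
    # Output layer: conjunction of the hidden outputs.
    return tlu((y1, y2), (2, 2), 3)   # y = y1 ∧ y2 = (x1 <-> x2)

for x1 in (0, 1):
    for x2 in (0, 1):
        print((x1, x2), biimplication_net(x1, x2))   # 1 exactly when x1 == x2
```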

18. Networks of Threshold Logic Units
Solving the biimplication problem: geometric interpretation.
[Figure: on the left, the input space (x1, x2) with the corner points a, b, c, d and the two separating lines g1 and g2 of the hidden units; on the right, the transformed space (y1, y2), in which a single line g3 separates the images of the points.]
• The first layer computes new Boolean coordinates for the points.
• After the coordinate transformation the problem is linearly separable.

19. Representing Arbitrary Boolean Functions
Let y = f(x1, ..., xn) be a Boolean function of n variables.
(i) Represent f(x1, ..., xn) in disjunctive normal form. That is, determine D_f = K1 ∨ ... ∨ Km, where all Kj are conjunctions of n literals, i.e., Kj = l_j1 ∧ ... ∧ l_jn with l_ji = xi (positive literal) or l_ji = ¬xi (negative literal).
(ii) Create a neuron for each conjunction Kj of the disjunctive normal form (having n inputs, one input for each variable), where
w_ji = +2 if l_ji = xi,   w_ji = −2 if l_ji = ¬xi,   and   θ_j = n − 1 + (1/2) · Σ_{i=1..n} w_ji.
(iii) Create an output neuron (having m inputs, one input for each neuron that was created in step (ii)), where
w_(n+1)k = 2 for k = 1, ..., m,   and   θ_(n+1) = 1.
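The construction can be sketched directly in code. The helper below is my own illustration (it assumes the hypothetical tlu() function from slide 7); each conjunction is given as a tuple of signs, +1 for a positive and −1 for a negative literal, one entry per variable:

```python
def dnf_network(conjunctions):
    """Build (hidden layer, output unit) for D_f = K_1 ∨ ... ∨ K_m."""
    n = len(conjunctions[0])
    hidden = []
    for signs in conjunctions:
        weights = [2 * s for s in signs]          # +2 for x_i, -2 for ¬x_i
        theta = n - 1 + 0.5 * sum(weights)        # θ_j = n - 1 + (1/2)·Σ_i w_ji
        hidden.append((weights, theta))
    return hidden, ([2] * len(conjunctions), 1)   # output unit: all weights 2, θ = 1

def evaluate(network, x):
    hidden, (out_w, out_theta) = network
    y = [tlu(x, w, th) for w, th in hidden]       # one neuron per conjunction
    return tlu(y, out_w, out_theta)               # disjunction of the conjunctions

# Example: (x1 ∧ ¬x2) ∨ (¬x1 ∧ x2), i.e. the exclusive or of two variables.
net = dnf_network([(1, -1), (-1, 1)])
for x in [(0, 0), (1, 0), (0, 1), (1, 1)]:
    print(x, evaluate(net, x))                    # 0, 1, 1, 0
```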

20. Training Threshold Logic Units

21. Training Threshold Logic Units
• The geometric interpretation provides a way to construct threshold logic units with 2 and 3 inputs, but:
◦ Not an automatic method (human visualization needed).
◦ Not feasible for more than 3 inputs.
• General idea of automatic training:
◦ Start with random values for weights and threshold.
◦ Determine the error of the output for a set of training patterns.
◦ The error is a function of the weights and the threshold: e = e(w1, ..., wn, θ).
◦ Adapt weights and threshold so that the error gets smaller.
◦ Iterate the adaptation until the error vanishes.

22. Training Threshold Logic Units
Single-input threshold logic unit for the negation ¬x: one input x with weight w, threshold θ, output y.

x | y
0 | 1
1 | 0

Output error as a function of weight and threshold.
[Figure: three surface plots of the error e over the (w, θ) plane, with w and θ ranging from −2 to 2: the error for the pattern x = 0, the error for the pattern x = 1, and the sum of both errors; the error takes the values 0, 1, and 2.]
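The plotted error can be reproduced numerically. A minimal sketch, assuming the hypothetical tlu() helper from slide 7 and taking the error of a pattern to be the absolute difference between desired and computed output (one natural choice, which yields the 0/1/2 values shown in the plots):

```python
def negation_error(w, theta):
    """Sum of output errors of a single-input TLU on the training patterns for ¬x."""
    patterns = [(0, 1), (1, 0)]                       # (input x, desired output y)
    return sum(abs(y - tlu((x,), (w,), theta)) for x, y in patterns)

print(negation_error(-2, -1))   # 0: any w < θ <= 0 solves the negation
print(negation_error(2, 1))     # 2: this unit computes x, not ¬x
```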
