Cone Algorithm: An Extension of the Perceptron Algorithm

S. J. Wan

IEEE Transactions on Systems, Man, and Cybernetics, vol. 24, no. 10, October 1994.

Manuscript received February 28, 1992; revised March 14, 1993 and November 15, 1993. The author was with the Department of Computer Science, University of Regina, Regina, Saskatchewan, Canada S4S 0A2. He is now with Imaging Research and Advanced Development, Eastman Kodak Company, Rochester, NY 14650-1907 USA. IEEE Log Number 9403056.

Abstract: The perceptron convergence theorem played an important role in the early development of machine learning. Mathematically, the perceptron learning algorithm is an iterative procedure for finding a separating hyperplane for a finite set of linearly separable vectors or, equivalently, for finding a separating hyperplane for a finite set of linearly contained vectors. In this paper, we show that the perceptron algorithm can be extended to a more general algorithm, called the cone algorithm, for finding a covering cone for a finite set of linearly contained vectors. A proof of the convergence of the cone algorithm is given. The relationship between the cone algorithm and other related algorithms is discussed. The equivalence of the problem of finding a covering cone for a set of linearly contained vectors and the problem of finding a solution cone for a system of homogeneous linear inequalities is established.

Index Terms: Machine learning, perceptron, linearly separable sets, linearly contained sets, covering cones, solution cones, linear inequalities.

I. INTRODUCTION

A set of vectors is linearly contained if all the vectors in the set are distributed on one side of a homogeneous hyperplane. A covering cone of a linearly contained set is a circular hypercone which encloses all the vectors in the set. The problem of finding a covering cone of a linearly contained set arises in applications such as machine learning [3], [9], [13], computational geometry [10], and stability analysis [2], [4], [7].

The perceptron learning algorithm was developed in the early 1960s to model the learning process of a neuron in the human brain [11]. Mathematically, it is an iterative procedure for finding a separating hyperplane for a finite set of linearly separable vectors [3] or, equivalently, for finding a separating hyperplane for a finite set of linearly contained vectors [5], [9].

Let $X = \{x_1, x_2, \ldots, x_m\}$ be a set of vectors in an $n$-dimensional Euclidean space $R^n$, and suppose each vector in $X$ belongs to one of two classes $X_1$ or $X_2$. The set $X$ is said to be linearly separable [3] if there exists a homogeneous hyperplane

$$w^T x = \sum_{j=1}^{n} w_j x_j = 0 \tag{1}$$

such that for any $x_i \in X$,

$$w^T x_i = \sum_{j=1}^{n} w_j x_{ij} \begin{cases} > 0 & \text{if } x_i \in X_1 \\ < 0 & \text{if } x_i \in X_2 \end{cases} \tag{2}$$

where $T$ denotes the transpose of a vector and $w$ is the normal vector of the hyperplane.

The perceptron algorithm finds a separating hyperplane for a set of linearly separable vectors by iteration. It starts with an arbitrary normal vector $w_0$, which is then modified according to the correction rule

$$w_{k+1} = \begin{cases} w_k + x_i & \text{if } x_i \in X_1 \text{ and } w_k^T x_i \le 0 \\ w_k - x_i & \text{if } x_i \in X_2 \text{ and } w_k^T x_i \ge 0. \end{cases} \tag{3}$$

The well-known perceptron convergence theorem is stated below [3], [5].

Theorem 1: If $X$ is linearly separable, the above procedure will converge to a vector $w$ satisfying (2) in a finite number of iterations.

A set of vectors $Y = \{y_1, y_2, \ldots, y_m\}$ is said to be linearly contained [9] if all vectors in $Y$ are distributed on one side of a homogeneous hyperplane. In other words, $Y$ is linearly contained if there exists a separating hyperplane defined by (1) satisfying

$$w^T y_i > 0 \quad \text{for all } y_i \in Y. \tag{4}$$

A linearly separable set $X$ can be transferred to a linearly contained set $Y$ by changing the sign of the vectors in one class, i.e.,

$$Y = \{x \mid x \in X_1\} \cup \{-x \mid x \in X_2\}. \tag{5}$$

Fig. 1 depicts a linearly contained set $Y = \{x_1, x_2, x_3, -x_4, -x_5, -x_6\}$ transferred from a linearly separable set $X = \{x_1, x_2, x_3, x_4, x_5, x_6\}$.
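The transfer in (5) is just a sign flip on one class. As a small illustration (not from the paper; the toy data and array names are our own), the following Python/NumPy sketch builds a linearly contained set Y from a linearly separable set split into X1 and X2:

```python
import numpy as np

# Toy linearly separable set in R^2: class X1 lies on the positive side of the
# hyperplane w^T x = 0 with w = (1, 0), class X2 lies on the negative side.
X1 = np.array([[2.0, 1.0], [1.5, -0.5], [3.0, 0.2]])
X2 = np.array([[-1.0, 0.8], [-2.0, -1.0], [-0.5, 0.3]])

# Transfer (5): keep the vectors of X1 and negate the vectors of X2.
Y = np.vstack([X1, -X2])

# Y is linearly contained: every y in Y satisfies w^T y > 0 for w = (1, 0).
w = np.array([1.0, 0.0])
print(np.all(Y @ w > 0))   # True for this toy data
```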
With a minor modification, the perceptron algorithm can be used to find a separating hyperplane for a set $Y$ of linearly contained vectors [5]. Starting with an arbitrary normal vector $w_0$, the normal vector is modified according to the correction rule

$$w_{k+1} = w_k + y_i \quad \text{if } w_k^T y_i \le 0$$

or, equivalently,

$$w_{k+1} = w_k + y_i \quad \text{if } (w_k, y_i) \ge 90^\circ \tag{6}$$

where $(w_k, y_i)$ represents the angle between $w_k$ and $y_i$. The perceptron convergence theorem in this case is stated as follows [5]:

Theorem 2: If $Y$ is linearly contained, the above procedure will converge to a vector $w$ satisfying (4) in a finite number of iterations.

In this paper, we show that the perceptron algorithm can be extended to a more general algorithm, called the cone algorithm, for finding a covering cone for a finite set of linearly contained vectors. A proof of the convergence of the cone algorithm is given. The relationship between the cone algorithm and other related algorithms is discussed. The equivalence of the problem of finding a covering cone for a set of linearly contained vectors and the problem of finding a solution cone for a system of homogeneous linear inequalities is established.
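As a concrete reading of the correction rule (6), here is a minimal Python/NumPy sketch of the contained-set perceptron. The function name, the sweep order over Y, and the iteration cap are our own choices, not prescribed by the paper.

```python
import numpy as np

def perceptron_contained(Y, w0, max_iter=10_000):
    """Correction rule (6): add y_i to w_k whenever w_k^T y_i <= 0,
    i.e. whenever the angle between w_k and y_i is at least 90 degrees."""
    w = np.asarray(w0, dtype=float)
    for _ in range(max_iter):
        corrected = False
        for y in Y:
            if w @ y <= 0:      # y_i is not strictly on the positive side of w
                w = w + y       # rotate w towards y_i
                corrected = True
        if not corrected:       # w^T y_i > 0 for all i, so condition (4) holds
            return w
    raise RuntimeError("no separating hyperplane found within max_iter sweeps")

# Usage on the toy set Y built above:
# w = perceptron_contained(Y, w0=[0.0, 0.0])
```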

II. THE CONE ALGORITHM

In (6), the perceptron algorithm is expressed as a procedure for adjusting the angles between the normal vector $w_k$ of a hyperplane and the vectors in $Y$: the normal vector $w_k$ is rotated towards $y_i$ if the angle between $w_k$ and $y_i$ is larger than or equal to $90^\circ$. The perceptron convergence theorem guarantees that this procedure stops in a finite number of iterations. The problem that we are interested in is what would happen if we modify $\theta = 90^\circ$ in (6) to $0^\circ < \theta \le 90^\circ$. Does the perceptron algorithm still converge in this case? To answer this question, we first introduce the notion of covering cones.

A hypercone with axis $w$ and angle $\theta$ in $R^n$ is defined by

$$C(w, \theta) = \{x \mid (w, x) \le \theta,\ x \in R^n\}$$

where $w \ne 0$ and $0^\circ \le \theta \le 90^\circ$. A hypercone $C(w, \theta)$ is said to be a covering cone of a set $Y$ of linearly contained vectors if $(w, y_i) \le \theta$ for all $y_i \in Y$. A covering cone of $Y$ with the smallest angle is called the smallest covering cone, denoted by $C(w_s, \theta_s)$. A covering cone of $Y$ with the largest angle ($\theta = 90^\circ$) is a halfspace bounded by the separating hyperplane $w^T x = 0$. Fig. 2 depicts three covering cones of $Y$.

Fig. 2. Three covering cones of $Y$.

By modifying $\theta = 90^\circ$ in (6) to $0^\circ < \theta \le 90^\circ$, the perceptron algorithm becomes a more general algorithm, called the cone algorithm, stated as follows.

The cone algorithm: Starting with an arbitrary axis $w_0$, if a vector $y_i$ in $Y$ is not enclosed by the hypercone $C(w_k, \theta)$, the axis $w_k$ is modified by

$$w_{k+1} = w_k + y_i \quad \text{if } (w_k, y_i) \ge \theta \quad (0^\circ < \theta \le 90^\circ). \tag{7}$$

The convergence of the cone algorithm is stated below.

Theorem 3 (the cone algorithm convergence theorem): Let $Y$ be a set of linearly contained vectors and $C(w_s, \theta_s)$ be the smallest covering cone of $Y$. If $\theta_s < \theta \le 90^\circ$, then each correction given by (7) will bring $w_k$ closer to $w_s$ when $k$ is large enough, namely,

$$(w_{k+1}, w_s) < (w_k, w_s) \quad \text{if } k > K_0. \tag{8}$$

Proof: See the Appendix.

The convergence of the cone algorithm may be illustrated by Fig. 3. The length of the normal vector $w_k$ can become arbitrarily large as $k$ increases, but the length of each vector $y_i$ in $Y$ is fixed. When $k$ is large enough, if $(w_k, y_i) \ge \theta$, then $w_k$ is rotated towards $y_i$ by a small amount, which brings $w_{k+1}$ closer to $w_s$.

Fig. 3. An illustration of the convergence of the cone algorithm.

The convergence speed of the cone algorithm may be improved by introducing a proper coefficient $\rho_k$ in the correction rule, namely,

$$w_{k+1} = w_k + \rho_k y_i \quad \text{if } (w_k, y_i) \ge \theta \tag{9}$$

where $\rho_k$ controls the rotation angle of $w_k$ towards $y_i$.
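Since (7) differs from (6) only in the angle threshold, a sketch of the cone algorithm needs just one extra parameter. The following Python/NumPy sketch is ours (the function name, sweep order, and iteration cap are assumptions, and the step coefficient of (9) is simply held constant); by Theorem 3, it is meaningful when $\theta$ exceeds the smallest covering-cone angle $\theta_s$.

```python
import numpy as np

def cone_algorithm(Y, w0, theta_deg, rho=1.0, max_iter=10_000):
    """Correction rules (7)/(9): whenever the angle between w_k and some y_i
    is >= theta, rotate w_k towards y_i by adding rho * y_i.
    With theta_deg = 90 and rho = 1 this reduces to the perceptron rule (6)."""
    w = np.asarray(w0, dtype=float)           # the axis w must be nonzero
    cos_theta = np.cos(np.radians(theta_deg))
    for _ in range(max_iter):
        corrected = False
        for y in Y:
            # angle(w, y) >= theta  <=>  cos(angle(w, y)) <= cos(theta)
            cos_angle = (w @ y) / (np.linalg.norm(w) * np.linalg.norm(y))
            if cos_angle <= cos_theta:
                w = w + rho * y
                corrected = True
        if not corrected:     # every y_i lies inside C(w, theta)
            return w          # w is the axis of a covering cone of angle theta
    raise RuntimeError("no covering cone of angle theta found within max_iter sweeps")

# Usage: axis of a covering cone with angle 60 degrees for the toy set Y above
# w = cone_algorithm(Y, w0=[1.0, 0.0], theta_deg=60.0)
```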
In what follows, we discuss the relationship between the cone algorithm and other related algorithms, and other related issues.

A. The Cone Algorithm Versus the Perceptron Algorithm

The only difference between the cone algorithm and the perceptron algorithm (the version for linearly contained sets) is the condition for modifying the vector $w_k$. In the perceptron algorithm (refer to (6)), the correction is applied when a vector $y_i \in Y$ is not located in the halfspace defined by the hyperplane $w_k^T x = 0$; in the cone algorithm (refer to (7)), it is applied when $y_i$ is not enclosed by the hypercone $C(w_k, \theta)$.
