logistic regression

Logistic Regression CS60010: Deep Learning Abir Das IIT Kharagpur - PowerPoint PPT Presentation

  1. Logistic Regression CS60010: Deep Learning Abir Das IIT Kharagpur Jan 22, 23 and 24, 2020

  2. Some Logistics Related Information
     § This Friday (Jan 24), no paper will be presented. It will be a regular lecture.
     § The first surprise quiz is today!

  3. Surprise Quiz 1
     § The duration of the test is 10 minutes.
     § Question 1: Find the eigenvalues of the following matrix $A$. Clearly mention if you are making any assumption. [2 Marks]
     $$A = \begin{bmatrix} 2 & 0 & 0 \\ 1 & 3 & 0 \\ -1 & 0 & 1 \end{bmatrix}$$
     § Question 2: Consider the half-space given by the set of points $S = \{ x \in \mathbb{R}^d \mid a^T x \le b \}$. Prove that the half-space is convex. [3 Marks]

  4. Surprise Quiz 1: Answer Keys
     § Question 1: Find the eigenvalues of the following matrix $A$. Clearly mention if you are making any assumption.
     $$A = \begin{bmatrix} 2 & 0 & 0 \\ 1 & 3 & 0 \\ -1 & 0 & 1 \end{bmatrix}$$
     Answer: Use the property of eigenvalues of a triangular matrix. The eigenvalues of a triangular matrix are its diagonal entries, so here they are 2, 3 and 1.
     § Question 2: Consider the half-space given by the set of points $S = \{ x \in \mathbb{R}^d \mid a^T x \le b \}$. Prove that the half-space is convex.
     Answer: If $x$, $y$ belong to $S$, then $a^T x \le b$ and $a^T y \le b$. Now, for $0 \le \theta \le 1$,
     $$a^T \{ \theta x + (1 - \theta) y \} = \theta a^T x + (1 - \theta) a^T y \le \theta b + (1 - \theta) b = b$$
     Hence $\theta x + (1 - \theta) y$ also belongs to $S$, so $S$ is convex.
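
The two quiz answers can be sanity-checked numerically. The sketch below is only an illustration: the matrix is taken from the quiz, while the half-space parameters `a`, `b` and the points `x`, `y` are hypothetical values chosen for the check. A numeric check on a few points does not replace the proof.

```python
# Quiz 1 matrix (lower triangular), as given on the slide.
A = [[2, 0, 0],
     [1, 3, 0],
     [-1, 0, 1]]

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def char_poly(m, lam):
    """Evaluate det(m - lam * I) for a 3x3 matrix m."""
    shifted = [[m[i][j] - (lam if i == j else 0) for j in range(3)]
               for i in range(3)]
    return det3(shifted)

# For a triangular matrix, every diagonal entry should be a root of the
# characteristic polynomial.
diagonal = [A[i][i] for i in range(3)]
roots_ok = all(char_poly(A, lam) == 0 for lam in diagonal)

def in_halfspace(a, b, x):
    """True if a^T x <= b."""
    return sum(ai * xi for ai, xi in zip(a, x)) <= b

# Hypothetical half-space and two points inside it (not from the slides).
a, b = [1.0, 2.0], 4.0
x, y = [0.0, 1.5], [4.0, -1.0]

# Every convex combination theta*x + (1-theta)*y should stay in the half-space.
thetas = [k / 10 for k in range(11)]
convex_ok = all(
    in_halfspace(a, b, [t * xi + (1 - t) * yi for xi, yi in zip(x, y)])
    for t in thetas
)
```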

  5. Agenda
     § Understand regression and classification with linear models.
     § Brush up on the concept of maximum likelihood and its use in understanding linear regression.
     § Use the logistic function for binary classification and estimate logistic regression parameters.

  6. Resources
     § The Elements of Statistical Learning by T. Hastie, R. Tibshirani and J. Friedman. [Chapters 3 and 4]
     § Artificial Intelligence: A Modern Approach by S. Russell and P. Norvig. [Chapter 18]

  7. Linear Regression
     § In a regression problem we want to find the relation between some input variables $x$ and output variables $y$, where $x \in \mathbb{R}^d$ and $y \in \mathbb{R}$.
     § Inputs are also often referred to as covariates, predictors and features, while outputs are known as variates, targets and labels.
     § Examples of such input-output pairs can be
       ◮ { Outside temperature, People inside classroom, Target room temperature | Energy requirement }
       ◮ { Size, Number of Bedrooms, Number of Floors, Age of the Home | Price }

  8. Linear Regression (continued)
     § We have a set of $N$ observations of $y$ as $\{ y_1, y_2, \cdots, y_N \}$ and the corresponding input variables $\{ x_1, x_2, \cdots, x_N \}$.

  9. Linear Regression
     § The input and output variables are assumed to be related via a relation known as the hypothesis: $\hat{y} = h_\theta(x)$, where $\theta$ is the parameter vector.
     § The goal is to predict the output variable $\hat{y}^* = f(x^*)$ for an arbitrary value of the input variable $x^*$.
     § Let us start with scalar inputs ($x$) and scalar outputs ($y$).

  10. Univariate Linear Regression
      § Hypothesis: $h_\theta(x) = \theta_0 + \theta_1 x$.
      § Cost function: sum of squared errors.
      $$J(\theta_0, \theta_1) = \frac{1}{2N} \sum_{i=1}^{N} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$$
      § Optimization objective: find model parameters $(\theta_0, \theta_1)$ that will minimize the sum of squared errors.

  11. Univariate Linear Regression (continued)
      § Gradient of the cost function w.r.t. $\theta_0$:
      $$\frac{\partial}{\partial \theta_0} J(\theta_0, \theta_1) = \frac{1}{N} \sum_{i=1}^{N} \left( h_\theta(x^{(i)}) - y^{(i)} \right)$$
      § Gradient of the cost function w.r.t. $\theta_1$:
      $$\frac{\partial}{\partial \theta_1} J(\theta_0, \theta_1) = \frac{1}{N} \sum_{i=1}^{N} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x^{(i)}$$
      § Apply your favorite gradient-based optimization algorithm.
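
As one possible choice of gradient-based optimizer, plain batch gradient descent can be sketched as follows. The function names, the step size `step`, the iteration count and the toy data are assumptions made for illustration; the slides do not prescribe a particular algorithm.

```python
def cost(theta0, theta1, xs, ys):
    """J(theta0, theta1) = 1/(2N) * sum of squared errors."""
    n = len(xs)
    return sum((theta0 + theta1 * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * n)

def gradients(theta0, theta1, xs, ys):
    """Partial derivatives of J with respect to theta0 and theta1."""
    n = len(xs)
    residuals = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
    grad0 = sum(residuals) / n
    grad1 = sum(r * x for r, x in zip(residuals, xs)) / n
    return grad0, grad1

def gradient_descent(xs, ys, step=0.05, iterations=5000):
    """Plain batch gradient descent starting from theta = (0, 0)."""
    theta0, theta1 = 0.0, 0.0
    for _ in range(iterations):
        grad0, grad1 = gradients(theta0, theta1, xs, ys)
        theta0 -= step * grad0
        theta1 -= step * grad1
    return theta0, theta1

# Toy data generated from y = 1 + 2x without noise (hypothetical).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
theta0, theta1 = gradient_descent(xs, ys)
final_cost = cost(theta0, theta1, xs, ys)
```

On this noise-free data the iterates should approach the generating parameters; a step size that is too large would make the iteration diverge instead.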

  12. Univariate Linear Regression
      § Setting the two gradients to zero gives equations that are linear in $\theta$, so they also have a closed-form solution (unique as long as the $x^{(i)}$ are not all equal).
      $$\theta_1 = \frac{N \sum_{i=1}^{N} y^{(i)} x^{(i)} - \left( \sum_{i=1}^{N} x^{(i)} \right) \left( \sum_{i=1}^{N} y^{(i)} \right)}{N \sum_{i=1}^{N} \left( x^{(i)} \right)^2 - \left( \sum_{i=1}^{N} x^{(i)} \right)^2}$$
      $$\theta_0 = \frac{1}{N} \left( \sum_{i=1}^{N} y^{(i)} - \theta_1 \sum_{i=1}^{N} x^{(i)} \right)$$
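
The closed-form expressions translate directly into code. This is a minimal sketch; the function name `fit_closed_form` and the sample data are hypothetical.

```python
def fit_closed_form(xs, ys):
    """Return (theta0, theta1) from the closed-form univariate solution."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)

    denominator = n * sum_xx - sum_x ** 2
    if denominator == 0:
        # All x values are equal, so the slope is not determined.
        raise ValueError("inputs must contain at least two distinct x values")

    theta1 = (n * sum_xy - sum_x * sum_y) / denominator
    theta0 = (sum_y - theta1 * sum_x) / n
    return theta0, theta1

# Hypothetical sample data.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]
theta0, theta1 = fit_closed_form(xs, ys)
```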

  13. Multivariate Linear Regression
      § We can easily extend to multivariate linear regression problems, where $x \in \mathbb{R}^d$.
      § Hypothesis: $h_\theta(x) = \theta_0 + \theta_1 x_1 + \theta_2 x_2 + \cdots + \theta_d x_d$. For convenience of notation, define $x_0 = 1$.
      § Thus $h$ is simply the dot product of the parameters and the input vector: $h_\theta(x) = \theta^T x$.
      § Cost function: sum of squared errors.
      $$J(\theta) = J(\theta_0, \theta_1, \cdots, \theta_d) = \frac{1}{2N} \sum_{i=1}^{N} \left( \theta^T x^{(i)} - y^{(i)} \right)^2 \tag{1}$$
      § We will use the following to write the cost function in compact matrix-vector notation: $h_\theta(x) = \theta^T x = x^T \theta$.

  14. Multivariate Linear Regression
      $$\begin{bmatrix} \hat{y}^{(1)} \\ \hat{y}^{(2)} \\ \vdots \\ \hat{y}^{(N)} \end{bmatrix} = \begin{bmatrix} h_\theta(x^{(1)}) \\ h_\theta(x^{(2)}) \\ \vdots \\ h_\theta(x^{(N)}) \end{bmatrix} = \begin{bmatrix} x_0^{(1)} & x_1^{(1)} & x_2^{(1)} & \cdots & x_d^{(1)} \\ x_0^{(2)} & x_1^{(2)} & x_2^{(2)} & \cdots & x_d^{(2)} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ x_0^{(N)} & x_1^{(N)} & x_2^{(N)} & \cdots & x_d^{(N)} \end{bmatrix} \begin{bmatrix} \theta_0 \\ \theta_1 \\ \theta_2 \\ \vdots \\ \theta_d \end{bmatrix} \tag{2}$$
      $$\hat{y} = X \theta$$
      Here, $X$ is an $N \times (d + 1)$ matrix with each row an input vector. $\hat{y}$ is an $N$-length vector of the model's outputs on the training set.
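
A short NumPy sketch of the matrix form, assuming the raw features arrive without the constant $x_0 = 1$ column. The helper name `add_bias_column`, the feature values and the parameter vector are hypothetical.

```python
import numpy as np

def add_bias_column(features):
    """Prepend the constant x_0 = 1 column, giving an N x (d + 1) matrix."""
    features = np.asarray(features, dtype=float)
    ones = np.ones((features.shape[0], 1))
    return np.hstack([ones, features])

# Hypothetical data: N = 3 examples with d = 2 features each.
raw_features = [[2.0, 3.0],
                [1.0, 0.0],
                [4.0, 5.0]]
theta = np.array([1.0, 2.0, -1.0])  # (theta_0, theta_1, theta_2)

X = add_bias_column(raw_features)

# All predictions at once: y_hat = X theta.
y_hat = X @ theta

# The same predictions computed one row at a time as theta^T x^(i).
y_hat_per_row = np.array([theta @ row for row in X])
```

Stacking the inputs as rows lets a single matrix-vector product replace the loop over training examples.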

  15. Multivariate Linear Regression
      § Eqn. (1) gives
      $$J(\theta) = \frac{1}{2N} \sum_{i=1}^{N} \left( \theta^T x^{(i)} - y^{(i)} \right)^2 = \frac{1}{2N} \sum_{i=1}^{N} \left( \hat{y}^{(i)} - y^{(i)} \right)^2 \tag{3}$$
      $$= \frac{1}{2N} \lVert \hat{y} - y \rVert_2^2 = \frac{1}{2N} \left( \hat{y} - y \right)^T \left( \hat{y} - y \right)$$
      $$= \frac{1}{2N} \left( X\theta - y \right)^T \left( X\theta - y \right) = \frac{1}{2N} \left\{ \theta^T X^T X \theta - \theta^T X^T y - y^T X \theta + y^T y \right\}$$
      $$= \frac{1}{2N} \left\{ \theta^T \left( X^T X \right) \theta - \left( X^T y \right)^T \theta - \left( X^T y \right)^T \theta + y^T y \right\}$$
      $$= \frac{1}{2N} \left\{ \theta^T \left( X^T X \right) \theta - 2 \left( X^T y \right)^T \theta + y^T y \right\}$$
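
The chain of equalities can be checked numerically on random data. The sketch below uses NumPy with an arbitrary seed and hypothetical problem sizes; it compares the per-example sum, the squared-norm form and the expanded quadratic form of the cost.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, chosen only for repeatability
N, d = 6, 2                     # hypothetical problem size

# Random design matrix with the x_0 = 1 column, random targets and parameters.
X = np.hstack([np.ones((N, 1)), rng.normal(size=(N, d))])
y = rng.normal(size=N)
theta = rng.normal(size=d + 1)

# Per-example sum: 1/(2N) * sum_i (theta^T x^(i) - y^(i))^2
cost_sum = sum((theta @ X[i] - y[i]) ** 2 for i in range(N)) / (2 * N)

# Squared-norm form: 1/(2N) * ||X theta - y||^2
residual = X @ theta - y
cost_norm = residual @ residual / (2 * N)

# Expanded quadratic form:
# 1/(2N) * (theta^T (X^T X) theta - 2 (X^T y)^T theta + y^T y)
cost_quadratic = (theta @ (X.T @ X) @ theta
                  - 2 * (X.T @ y) @ theta
                  + y @ y) / (2 * N)
```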

  16. Multivariate Linear Regression
      § Equating the gradient of the cost function to 0,
      $$\nabla_\theta J(\theta) = \frac{1}{2N} \left\{ 2 X^T X \theta - 2 X^T y + 0 \right\} = 0$$
      $$X^T X \theta - X^T y = 0$$
      $$\theta = \left( X^T X \right)^{-1} X^T y \tag{4}$$

  17. Multivariate Linear Regression (continued)
      § Eqn. (4) gives a closed-form solution, but another option is to use an iterative solution (just like the univariate case).
      $$\frac{\partial J(\theta)}{\partial \theta_j} = \frac{1}{N} \sum_{i=1}^{N} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}$$
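
A minimal NumPy sketch of the closed-form solution, assuming $X^T X$ is invertible. It solves the linear system $X^T X \theta = X^T y$ with `np.linalg.solve` instead of forming the inverse explicitly, which is a common numerical choice and not something the slides specify. The function names and data are hypothetical.

```python
import numpy as np

def normal_equation(X, y):
    """Solve (X^T X) theta = X^T y; assumes X has full column rank."""
    return np.linalg.solve(X.T @ X, X.T @ y)

def gradient(theta, X, y):
    """Vector of partial derivatives: (1/N) X^T (X theta - y)."""
    return X.T @ (X @ theta - y) / len(y)

# Hypothetical design matrix (first column is x_0 = 1) and noise-free targets.
X = np.array([[1.0, 0.0, 1.0],
              [1.0, 1.0, 0.0],
              [1.0, 2.0, 1.0],
              [1.0, 3.0, 3.0]])
true_theta = np.array([0.5, 2.0, -1.0])
y = X @ true_theta

theta = normal_equation(X, y)
grad_at_solution = gradient(theta, X, y)  # should be close to the zero vector
```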

  18. Multivariate Linear Regression
      § Iterative gradient descent needs many iterations, and its step size parameter must be chosen judiciously. But it works equally well even if the number of features ($d$) is large.
      § For the least squares solution, there is no need to choose a step size parameter and no need to iterate. But evaluating $\left( X^T X \right)^{-1}$ can be slow if $d$ is large.
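
The two approaches can be compared side by side on synthetic data. In this sketch the step size, iteration count, seed and data-generating parameters are all hypothetical, and `np.linalg.lstsq` stands in for the least squares solution.

```python
import numpy as np

def batch_gradient_descent(X, y, step=0.1, iterations=2000):
    """Vectorized gradient descent on J(theta) = 1/(2N) ||X theta - y||^2."""
    theta = np.zeros(X.shape[1])
    for _ in range(iterations):
        gradient = X.T @ (X @ theta - y) / len(y)
        theta = theta - step * gradient
    return theta

rng = np.random.default_rng(0)  # arbitrary seed
N, d = 200, 3                   # hypothetical problem size

# Synthetic data: bias column plus standard normal features, small noise.
X = np.hstack([np.ones((N, 1)), rng.normal(size=(N, d))])
y = X @ np.array([1.0, -2.0, 0.5, 3.0]) + 0.01 * rng.normal(size=N)

theta_gd = batch_gradient_descent(X, y)            # needs step and iterations
theta_ls, *_ = np.linalg.lstsq(X, y, rcond=None)   # no step size, no loop
```

Gradient descent costs one pass over the data per iteration, while the least squares route needs no tuning but works with the $(d + 1) \times (d + 1)$ system, which is where a large $d$ becomes expensive.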
