1. Applications (I)
   Lijun Zhang
   zlj@nju.edu.cn
   http://cs.nju.edu.cn/zlj

2. Outline
   • Norm Approximation
     • Basic Norm Approximation
     • Penalty Function Approximation
     • Approximation with Constraints
   • Least-norm Problems
   • Regularized Approximation
   • Classification
     • Linear Discrimination
     • Support Vector Classifier
     • Logistic Regression

3. Basic Norm Approximation
   • Norm approximation problem
         $\min_y \ \|By - c\|$
     • $B \in \mathbb{R}^{m \times n}$ and $c \in \mathbb{R}^m$ are problem data
     • $y \in \mathbb{R}^n$ is the variable
     • $\|\cdot\|$ is a norm on $\mathbb{R}^m$
   • An approximate solution of $By \approx c$, in the norm $\|\cdot\|$
   • Residual: $s = By - c$
   • A convex problem
     • If $c \in \operatorname{range}(B)$, the optimal value is 0
     • $c \notin \operatorname{range}(B)$ is more interesting
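As a concrete illustration, here is a minimal sketch of the basic norm approximation problem, assuming the cvxpy modeling library (not part of the slides) and random data standing in for a real problem instance:

```python
import cvxpy as cp
import numpy as np

# Illustrative random problem data (placeholders, not from the slides).
rng = np.random.default_rng(0)
m, n = 100, 30
B = rng.standard_normal((m, n))
c = rng.standard_normal(m)

y = cp.Variable(n)                 # the variable y in R^n
s = B @ y - c                      # residual s = By - c
# Any norm can be used here, e.g. cp.norm(s, 1) or cp.norm(s, 'inf').
prob = cp.Problem(cp.Minimize(cp.norm(s, 2)))
prob.solve()
print(prob.value)                  # optimal value of ||By - c||_2
```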

4. Basic Norm Approximation
   • Approximation interpretation
         $By = y_1 b_1 + \cdots + y_n b_n$
     • $b_1, \ldots, b_n \in \mathbb{R}^m$ are the columns of $B$
     • Approximate the vector $c$ by a linear combination of the columns of $B$
   • Regression problem
     • $b_1, \ldots, b_n$ are regressors
     • $y_1 b_1 + \cdots + y_n b_n$ is the regression of $c$

5. Basic Norm Approximation
   • Estimation interpretation
     • Consider a linear measurement model $z = By + w$
     • $z \in \mathbb{R}^m$ is a vector of measurements
     • $y \in \mathbb{R}^n$ is a vector of parameters to be estimated
     • $w$ is some measurement error that is unknown, but presumed to be small
   • Assuming smaller values of $w$ are more plausible, the most plausible estimate is
         $\hat{y} = \operatorname*{argmin}_y \ \|By - z\|$
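A tiny simulation of the estimation interpretation (a sketch with plain numpy; the noise level 0.01 and the dimensions are assumptions): when the measurement error $w$ is small, minimizing $\|By - z\|_2$ recovers the true parameter vector closely.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 200, 10
B = rng.standard_normal((m, n))
y_true = rng.standard_normal(n)
w = 0.01 * rng.standard_normal(m)            # small, unknown measurement error
z = B @ y_true + w                           # linear measurement model z = By + w

# Least-squares estimate: argmin_y ||By - z||_2
y_hat, *_ = np.linalg.lstsq(B, z, rcond=None)
print(np.linalg.norm(y_hat - y_true))        # estimation error is small
```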

6. Basic Norm Approximation
   • Geometric interpretation
     • Consider the subspace $\mathcal{B} = \operatorname{range}(B) \subseteq \mathbb{R}^m$, and a point $c \in \mathbb{R}^m$
     • A projection of the point $c$ onto the subspace $\mathcal{B}$, in the norm $\|\cdot\|$:
           $\min \ \|v - c\| \quad \text{s.t.} \quad v \in \mathcal{B}$
   • Parametrizing an arbitrary element of $\mathcal{B}$ as $v = By$, we see that norm approximation is equivalent to projection

7. Basic Norm Approximation
   • Weighted norm approximation problems
         $\min_y \ \|X(By - c)\|$
     • $X \in \mathbb{R}^{m \times m}$ is called the weighting matrix
   • Two equivalent views:
     • A norm approximation problem with norm $\|\cdot\|$ and data $\tilde{B} = XB$, $\tilde{c} = Xc$
     • A norm approximation problem with data $B$ and $c$, and the $X$-weighted norm $\|u\|_X = \|Xu\|$
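A quick numeric check of this equivalence, using the $\ell_2$ norm for concreteness (the choice of norm and the diagonal weighting matrix are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)
m, n = 50, 20
B = rng.standard_normal((m, n))
c = rng.standard_normal(m)
X = np.diag(rng.uniform(0.5, 2.0, size=m))   # a diagonal weighting matrix
y = rng.standard_normal(n)

lhs = np.linalg.norm(X @ (B @ y - c))        # ||X(By - c)||_2
rhs = np.linalg.norm((X @ B) @ y - X @ c)    # same value, with data XB, Xc
assert np.isclose(lhs, rhs)
```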

8. Basic Norm Approximation
   • Least-squares approximation
         $\min_y \ \|By - c\|_2^2 = s_1^2 + s_2^2 + \cdots + s_m^2$
   • The minimization of a convex quadratic function
         $g(y) = y^\top B^\top B y - 2 c^\top B y + c^\top c$
   • A point $y$ minimizes $g$ if and only if
         $\nabla g(y) = 2 B^\top B y - 2 B^\top c = 0$
   • Normal equations: $B^\top B y = B^\top c$
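The normal equations in code, checked against numpy's built-in least-squares solver (a sketch; in practice lstsq or a QR factorization is numerically preferable to forming $B^\top B$ explicitly):

```python
import numpy as np

rng = np.random.default_rng(3)
B = rng.standard_normal((100, 30))
c = rng.standard_normal(100)

y_normal = np.linalg.solve(B.T @ B, B.T @ c)     # solve B^T B y = B^T c
y_lstsq, *_ = np.linalg.lstsq(B, c, rcond=None)  # library least squares
assert np.allclose(y_normal, y_lstsq)
```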

9. Basic Norm Approximation
   • Chebyshev or minimax approximation
         $\min_y \ \|By - c\|_\infty = \max\{|s_1|, \ldots, |s_m|\}$
     • Can be cast as an LP
           $\min \ u \quad \text{s.t.} \quad -u\mathbf{1} \preceq By - c \preceq u\mathbf{1}$
       with variables $y \in \mathbb{R}^n$ and $u \in \mathbb{R}$
   • Sum of absolute residuals approximation
         $\min_y \ \|By - c\|_1 = |s_1| + \cdots + |s_m|$
     • Can be cast as an LP
           $\min \ \mathbf{1}^\top u \quad \text{s.t.} \quad -u \preceq By - c \preceq u$
       with variables $y \in \mathbb{R}^n$ and $u \in \mathbb{R}^m$
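A sketch of the Chebyshev LP with scipy's linprog (the library choice and data are assumptions; the $\ell_1$ version is analogous, with $u \in \mathbb{R}^m$ and objective $\mathbf{1}^\top u$). The stacked variable is $[y, u]$, and the two-sided constraint becomes two one-sided blocks:

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(4)
m, n = 60, 15
B = rng.standard_normal((m, n))
c = rng.standard_normal(m)

ones = np.ones((m, 1))
# -u1 <= By - c <= u1  becomes  By - u1 <= c  and  -By - u1 <= -c
A_ub = np.block([[B, -ones], [-B, -ones]])
b_ub = np.concatenate([c, -c])
obj = np.r_[np.zeros(n), 1.0]        # minimize u

# bounds=(None, None) makes all variables free (linprog defaults to >= 0)
res = linprog(obj, A_ub=A_ub, b_ub=b_ub, bounds=(None, None))
y, u = res.x[:n], res.x[-1]
print(u, np.max(np.abs(B @ y - c)))  # the two values should agree
```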

10. Outline
   • Norm Approximation
     • Basic Norm Approximation
     • Penalty Function Approximation
     • Approximation with Constraints
   • Least-norm Problems
   • Regularized Approximation
   • Classification
     • Linear Discrimination
     • Support Vector Classifier
     • Logistic Regression

11. $\ell_p$-norm Approximation
   • $\ell_p$-norm approximation, for $1 \le p < \infty$:
         $\|s\|_p = \left(|s_1|^p + \cdots + |s_m|^p\right)^{1/p}$
   • The equivalent problem with objective
         $|s_1|^p + \cdots + |s_m|^p$
     • A separable and symmetric function of the residuals
     • The objective depends only on the amplitude distribution of the residuals
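For any $p \ge 1$ this stays convex; a one-line sketch with cvxpy's pnorm atom (the value $p = 1.5$ and the data are arbitrary illustrations):

```python
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(5)
B = rng.standard_normal((40, 10))
c = rng.standard_normal(40)

y = cp.Variable(10)
# cp.pnorm(., p) is convex for p >= 1, so this is a convex problem.
prob = cp.Problem(cp.Minimize(cp.pnorm(B @ y - c, p=1.5)))
prob.solve()
```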

12. Penalty Function Approximation
   • The problem
         $\min \ \varrho(s_1) + \cdots + \varrho(s_m) \quad \text{s.t.} \quad s = By - c$
     • $\varrho : \mathbb{R} \to \mathbb{R}$ is called the penalty function
     • $\varrho$ is convex
     • Often $\varrho$ is symmetric, nonnegative, and satisfies $\varrho(0) = 0$
   • A penalty function assesses a cost or penalty for each component of the residual

13. Example
   • $\ell_p$-norm approximation: $\varrho(v) = |v|^p$
     • Quadratic penalty: $\varrho(v) = v^2$
     • Absolute value penalty: $\varrho(v) = |v|$
   • Deadzone-linear penalty function (with deadzone width $b$)
         $\varrho(v) = \begin{cases} 0 & |v| \le b \\ |v| - b & |v| > b \end{cases}$
   • Log barrier penalty function (with limit $b$)
         $\varrho(v) = \begin{cases} -b^2 \log\left(1 - (v/b)^2\right) & |v| < b \\ \infty & |v| \ge b \end{cases}$
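The four example penalties written out as plain numpy functions (a sketch; `b` is the deadzone width or barrier limit from the slide, and inputs are assumed to be floats or float arrays):

```python
import numpy as np

def quadratic(v):
    return v ** 2

def absolute(v):
    return np.abs(v)

def deadzone_linear(v, b):
    # zero inside [-b, b], grows linearly outside
    return np.maximum(np.abs(v) - b, 0.0)

def log_barrier(v, b):
    # -b^2 log(1 - (v/b)^2) inside (-b, b), infinite outside
    v = np.asarray(v, dtype=float)
    out = np.full_like(v, np.inf)
    inside = np.abs(v) < b
    out[inside] = -b**2 * np.log1p(-(v[inside] / b) ** 2)
    return out
```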

14. Example
   • The log barrier penalty function assesses an infinite penalty for residuals larger than $b$
   • The log barrier function is very close to the quadratic penalty for $|v/b| \le 0.25$

15. Discussions
   • Roughly speaking, $\varrho(v)$ is a measure of our dislike of a residual of value $v$
   • If $\varrho$ is very small for small $v$, it means we care very little if residuals have these values
   • If $\varrho(v)$ grows rapidly as $v$ becomes large, it means we have a strong dislike for large residuals
   • If $\varrho$ becomes infinite outside some interval, it means that residuals outside the interval are unacceptable

16. Discussions
   • Compare $\varrho_1(v) = |v|$ and $\varrho_2(v) = v^2$
   • For small $v$ we have $\varrho_1(v) \gg \varrho_2(v)$, so $\ell_1$-norm approximation puts relatively larger emphasis on small residuals
     • The optimal residual for the $\ell_1$-norm approximation problem will tend to have more zero and very small residuals
   • For large $v$ we have $\varrho_2(v) \gg \varrho_1(v)$, so $\ell_1$-norm approximation puts less weight on large residuals
     • The $\ell_2$-norm solution will tend to have relatively fewer large residuals
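A small experiment (assuming cvxpy and random data) that makes the claim concrete: the $\ell_1$ solution has many exactly-zero residuals but a larger maximum residual, while the $\ell_2$ solution has essentially no zero residuals but a smaller maximum:

```python
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(6)
B = rng.standard_normal((100, 30))
c = rng.standard_normal(100)
y = cp.Variable(30)

residuals = {}
for p in (1, 2):
    cp.Problem(cp.Minimize(cp.norm(B @ y - c, p))).solve()
    residuals[p] = B @ y.value - c

for p in (1, 2):
    s = residuals[p]
    print(f"l{p}: near-zero residuals = {np.sum(np.abs(s) < 1e-6)}, "
          f"largest residual = {np.max(np.abs(s)):.3f}")
```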

17. Example
   • Penalty function approximation with four penalties: quadratic, absolute value, deadzone-linear, and log barrier
   [Figure: amplitude distributions of the optimal residuals for each penalty function]

18. Observations of Penalty Functions
   • The $\ell_1$-norm penalty puts the most weight on small residuals and the least weight on large residuals.
   • The $\ell_2$-norm penalty puts very small weight on small residuals, but strong weight on large residuals.
   • The deadzone-linear penalty function puts no weight on residuals smaller than 0.5, and relatively little weight on large residuals.
   • The log barrier penalty puts weight very much like the $\ell_2$-norm penalty for small residuals, but puts very strong weight on residuals larger than around 0.8, and infinite weight on residuals larger than 1.

19. Observations of Amplitude Distributions
   • For the $\ell_1$-optimal solution, many residuals are either zero or very small. The $\ell_1$-optimal solution also has relatively more large residuals.
   • The $\ell_2$-norm approximation has many modest residuals, and relatively few larger ones.
   • For the deadzone-linear penalty, many residuals have the value $\pm 0.5$, right at the edge of the 'free' zone, for which no penalty is assessed.
   • For the log barrier penalty, no residuals have a magnitude larger than 1, but otherwise the residual distribution is similar to that of $\ell_2$-norm approximation.

20. Outline
   • Norm Approximation
     • Basic Norm Approximation
     • Penalty Function Approximation
     • Approximation with Constraints
   • Least-norm Problems
   • Regularized Approximation
   • Classification
     • Linear Discrimination
     • Support Vector Classifier
     • Logistic Regression

21. Approximation with Constraints
   • Add constraints to $\min_y \ \|By - c\|$ to:
     • Rule out certain unacceptable approximations of the vector $c$
     • Ensure that the approximator $By$ satisfies certain properties
     • Incorporate prior knowledge of the vector $y$ to be estimated
     • Incorporate prior knowledge of the estimation error
     • Determine the projection of a point on a set more complicated than a subspace

22. Approximation with Constraints
   • Nonnegativity constraints on variables
         $\min \ \|By - c\| \quad \text{s.t.} \quad y \succeq 0$
     • Estimate a vector of parameters known to be nonnegative
     • Determine the projection of a vector $c$ onto the cone generated by the columns of $B$
     • Approximate $c$ using a nonnegative linear combination of the columns of $B$
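For the $\ell_2$ norm, this constrained problem is nonnegative least squares, which scipy exposes directly (the library choice is an assumption; other norms need a general solver):

```python
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(7)
B = rng.standard_normal((50, 10))
c = rng.standard_normal(50)

# argmin ||By - c||_2  subject to  y >= 0
y, rnorm = nnls(B, c)
assert np.all(y >= 0)
```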

23. Approximation with Constraints
   • Variable bounds
         $\min \ \|By - c\| \quad \text{s.t.} \quad l \preceq y \preceq u$
     • Prior knowledge of intervals in which each variable lies
     • Determine the projection of a vector $c$ onto the image of a box under the linear mapping induced by $B$
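Again for the $\ell_2$ case, scipy provides a bounded-variable least-squares solver; the interval endpoints below are illustrative assumptions:

```python
import numpy as np
from scipy.optimize import lsq_linear

rng = np.random.default_rng(8)
B = rng.standard_normal((50, 10))
c = rng.standard_normal(50)
lo, hi = -0.5 * np.ones(10), 0.5 * np.ones(10)   # assumed bounds l, u

# min ||By - c||_2  subject to  lo <= y <= hi
res = lsq_linear(B, c, bounds=(lo, hi))
assert np.all((res.x >= lo - 1e-9) & (res.x <= hi + 1e-9))
```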

24. Approximation with Constraints
   • Probability distribution
         $\min \ \|By - c\| \quad \text{s.t.} \quad y \succeq 0, \ \mathbf{1}^\top y = 1$
     • Estimation of proportions or relative frequencies
     • Approximate $c$ by a convex combination of the columns of $B$
   • Norm ball constraint
         $\min \ \|By - c\| \quad \text{s.t.} \quad \|y - y_0\| \le e$
     • $y_0$ is a prior guess of what the parameter $y$ is, and $e$ is the maximum plausible deviation
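Both constrained variants as cvxpy sketches (the prior guess $y_0$, the radius $e$, and the choice of $\ell_2$ norm are placeholder assumptions):

```python
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(9)
B = rng.standard_normal((40, 8))
c = rng.standard_normal(40)
y = cp.Variable(8)

# Probability distribution: y >= 0, 1^T y = 1
cp.Problem(cp.Minimize(cp.norm(B @ y - c, 2)),
           [y >= 0, cp.sum(y) == 1]).solve()

# Norm ball around a prior guess y0 with radius e
y0, e = np.zeros(8), 0.1
cp.Problem(cp.Minimize(cp.norm(B @ y - c, 2)),
           [cp.norm(y - y0, 2) <= e]).solve()
```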

25. Outline
   • Norm Approximation
     • Basic Norm Approximation
     • Penalty Function Approximation
     • Approximation with Constraints
   • Least-norm Problems
   • Regularized Approximation
   • Classification
     • Linear Discrimination
     • Support Vector Classifier
     • Logistic Regression

26. Least-norm Problems
   • Basic least-norm problem
         $\min \ \|y\| \quad \text{s.t.} \quad By = c$
     • $B \in \mathbb{R}^{m \times n}$ and $c \in \mathbb{R}^m$ are problem data, $y \in \mathbb{R}^n$ is the variable
     • $\|\cdot\|$ is a norm on $\mathbb{R}^n$
   • The solution is called a least-norm solution of $By = c$
   • A convex optimization problem
   • Interesting when $m < n$, i.e., when $By = c$ is underdetermined
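For the $\ell_2$ norm the least-norm solution is given in closed form by the pseudoinverse; for other norms a solver is needed. A sketch of both (cvxpy and the random wide data are assumptions):

```python
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(10)
m, n = 10, 30                          # m < n: By = c is underdetermined
B = rng.standard_normal((m, n))
c = rng.standard_normal(m)

y_l2 = np.linalg.pinv(B) @ c           # minimum l2-norm solution of By = c
assert np.allclose(B @ y_l2, c)

y = cp.Variable(n)                     # least l1-norm solution, via a solver
cp.Problem(cp.Minimize(cp.norm(y, 1)), [B @ y == c]).solve()
```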

27. Least-norm Problems
   • Reformulation as norm approximation problem
     • Let $y_0$ be any solution of $By = c$
     • Let $Z \in \mathbb{R}^{n \times k}$ be a matrix whose columns are a basis for the nullspace of $B$
     • Then $\{y \mid By = c\} = \{y_0 + Zv \mid v \in \mathbb{R}^k\}$
   • The least-norm problem can be expressed as
         $\min_v \ \|y_0 + Zv\|$
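The reformulation in code, using scipy's null_space to build the basis $Z$ (a sketch; the particular solution $y_0$ comes from lstsq, which returns one feasible solution of the underdetermined system):

```python
import numpy as np
from scipy.linalg import null_space

rng = np.random.default_rng(11)
m, n = 10, 30
B = rng.standard_normal((m, n))
c = rng.standard_normal(m)

y0, *_ = np.linalg.lstsq(B, c, rcond=None)   # a particular solution of By = c
Z = null_space(B)                            # columns span null(B); shape (n, n - m)

v = rng.standard_normal(Z.shape[1])          # any v gives a feasible point
y = y0 + Z @ v
assert np.allclose(B @ y, c)                 # y0 + Zv still satisfies By = c
```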
