Applications (I)
Lijun Zhang
zlj@nju.edu.cn
http://cs.nju.edu.cn/zlj
Outline
Norm Approximation
  Basic Norm Approximation
  Penalty Function Approximation
  Approximation with Constraints
Least-norm Problems
Regularized Approximation
Classification
  Linear Discrimination
  Support Vector Classifier
  Logistic Regression
Basic Norm Approximation
Norm approximation problem:
  minimize ‖By − c‖
B ∈ R^{m×n} and c ∈ R^m are problem data, y ∈ R^n is the variable, and ‖·‖ is a norm on R^m
An approximate solution of By ≈ c, in the norm ‖·‖
Residual: s = By − c
A convex problem
  If c ∈ range(B), the optimal value is 0
  The case c ∉ range(B) is the more interesting one
Basic Norm Approximation
Approximation interpretation:
  By = y₁b₁ + ⋯ + yₙbₙ
where b₁, …, bₙ ∈ R^m are the columns of B
Approximate the vector c by a linear combination of the columns of B
Regression problem: b₁, …, bₙ are the regressors, and y₁b₁ + ⋯ + yₙbₙ is the regression of c
Basic Norm Approximation
Estimation interpretation: consider the linear measurement model
  z = By + w
z ∈ R^m is a vector of measurements, y ∈ R^n is a vector of parameters to be estimated, and w is a measurement error that is unknown, but presumed to be small
Assuming smaller values of w are more plausible, the most plausible estimate of y is
  ŷ = argmin_y ‖By − z‖
Basic Norm Approximation
Geometric interpretation: consider the subspace A = range(B) ⊆ R^m and a point c ∈ R^m
A projection of the point c onto the subspace A, in the norm ‖·‖, is a solution of
  minimize ‖v − c‖  s.t. v ∈ A
Parametrizing an arbitrary element of A as v = By, we see that norm approximation is equivalent to projection
Basic Norm Approximation
Weighted norm approximation problems:
  minimize ‖X(By − c)‖
X ∈ R^{m×m} is called the weighting matrix
Can be viewed as a norm approximation problem with norm ‖·‖ and data XB, Xc
Can also be viewed as a norm approximation problem with data B and c, and the X-weighted norm ‖z‖_X = ‖Xz‖
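In the ℓ₂ case, the weighted problem reduces to ordinary least squares on the transformed data XB and Xc. A minimal NumPy sketch, with a made-up diagonal weighting matrix X:

```python
import numpy as np

# Weighted least squares: min ||X(By - c)||_2 is ordinary least squares
# with data XB and Xc.  B, c, X below are made-up illustrative data.
rng = np.random.default_rng(0)
B = rng.standard_normal((5, 2))
c = rng.standard_normal(5)
X = np.diag([1.0, 1.0, 1.0, 1.0, 10.0])  # weight the last residual heavily

y_w = np.linalg.lstsq(X @ B, X @ c, rcond=None)[0]
```

A large diagonal entry of X forces the solver to fit the corresponding measurement more tightly, at the expense of the others.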
Basic Norm Approximation
Least-squares approximation:
  minimize ‖By − c‖₂² = s₁² + s₂² + ⋯ + s_m²
The minimization of a convex quadratic function
  g(y) = yᵀBᵀBy − 2cᵀBy + cᵀc
A point y minimizes g if and only if
  ∇g(y) = 2BᵀBy − 2Bᵀc = 0
i.e., y satisfies the normal equations BᵀBy = Bᵀc
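A minimal NumPy sketch of the normal equations, assuming B has full column rank so that BᵀB is invertible:

```python
import numpy as np

def least_squares(B, c):
    """Least-squares approximation via the normal equations
    B^T B y = B^T c (assumes B has full column rank)."""
    return np.linalg.solve(B.T @ B, B.T @ c)
```

In practice one would call np.linalg.lstsq (QR/SVD-based) rather than forming BᵀB explicitly, since BᵀB squares the condition number of the problem.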
Basic Norm Approximation
Chebyshev or minimax approximation:
  minimize ‖By − c‖_∞ = max{|s₁|, …, |s_m|}
Can be cast as an LP
  minimize u  s.t. −u1 ≼ By − c ≼ u1
with variables y ∈ R^n and u ∈ R
Sum of absolute residuals approximation:
  minimize ‖By − c‖₁ = |s₁| + ⋯ + |s_m|
Can be cast as an LP
  minimize 1ᵀu  s.t. −u ≼ By − c ≼ u
with variables y ∈ R^n and u ∈ R^m
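Both LPs can be handed to an off-the-shelf LP solver. A sketch with scipy.optimize.linprog, stacking the variables as (y, u):

```python
import numpy as np
from scipy.optimize import linprog

def l1_approx(B, c):
    """min ||By - c||_1 as an LP: minimize 1^T u  s.t.  -u <= By - c <= u."""
    m, n = B.shape
    obj = np.concatenate([np.zeros(n), np.ones(m)])
    I = np.eye(m)
    A_ub = np.block([[B, -I], [-B, -I]])  # By - c <= u and -(By - c) <= u
    b_ub = np.concatenate([c, -c])
    res = linprog(obj, A_ub=A_ub, b_ub=b_ub, bounds=(None, None))
    return res.x[:n]

def chebyshev_approx(B, c):
    """min ||By - c||_inf as an LP: minimize u  s.t.  -u1 <= By - c <= u1."""
    m, n = B.shape
    obj = np.concatenate([np.zeros(n), [1.0]])
    ones = np.ones((m, 1))
    A_ub = np.block([[B, -ones], [-B, -ones]])
    b_ub = np.concatenate([c, -c])
    res = linprog(obj, A_ub=A_ub, b_ub=b_ub, bounds=(None, None))
    return res.x[:n]
```

A sanity check on the scalar case: with B a single column of ones, the ℓ₁ solution is the median of c and the ℓ∞ solution is the midrange.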
ℓₚ-norm Approximation
ℓₚ-norm approximation, for 1 ≤ p < ∞:
  minimize ‖By − c‖ₚ = (|s₁|^p + ⋯ + |s_m|^p)^{1/p}
The equivalent problem with objective |s₁|^p + ⋯ + |s_m|^p
A separable and symmetric function of the residuals
The objective depends only on the amplitude distribution of the residuals
Penalty Function Approximation
The problem:
  minimize φ(s₁) + ⋯ + φ(s_m)  s.t. s = By − c
φ: R → R is called the penalty function; we assume φ is convex
Often φ is symmetric, nonnegative, and satisfies φ(0) = 0
A penalty function assesses a cost or penalty for each component of the residual
Example
ℓₚ-norm approximation: φ(v) = |v|^p
  Quadratic penalty: φ(v) = v²
  Absolute value penalty: φ(v) = |v|
Deadzone-linear penalty function (with deadzone width b):
  φ(v) = 0 for |v| ≤ b,  φ(v) = |v| − b for |v| > b
Log barrier penalty function (with limit b):
  φ(v) = −b² log(1 − (v/b)²) for |v| < b,  φ(v) = ∞ for |v| ≥ b
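The two less-standard penalties can be sketched directly in NumPy (vector inputs assumed):

```python
import numpy as np

def deadzone_linear(v, b):
    """Deadzone-linear penalty: zero on [-b, b], linear growth outside."""
    return np.maximum(np.abs(v) - b, 0.0)

def log_barrier(v, b):
    """Log barrier penalty: -b^2 log(1 - (v/b)^2) for |v| < b, +inf otherwise."""
    v = np.atleast_1d(np.asarray(v, dtype=float))
    out = np.full(v.shape, np.inf)
    inside = np.abs(v) < b
    out[inside] = -b**2 * np.log(1.0 - (v[inside] / b) ** 2)
    return out
```

For small v, the expansion −b² log(1 − (v/b)²) ≈ v² shows why the log barrier tracks the quadratic penalty near zero.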
Example
The log barrier penalty function assesses an infinite penalty for residuals with |v| ≥ b
The log barrier function is very close to the quadratic penalty for |v/b| ≤ 0.25
Discussions
Roughly speaking, φ(v) is a measure of our dislike of a residual of value v
If φ is very small for small v, it means we care very little if residuals have these values
If φ grows rapidly as v becomes large, it means we have a strong dislike for large residuals
If φ becomes infinite outside some interval, it means that residuals outside the interval are unacceptable
ℓ₁- vs. ℓ₂-norm Approximation
For small v we have φ₁(v) = |v| ≫ φ₂(v) = v², so ℓ₁-norm approximation puts relatively larger emphasis on small residuals
  The optimal residual for the ℓ₁-norm approximation problem will tend to have more zero and very small entries
For large v we have φ₂(v) ≫ φ₁(v), so ℓ₂-norm approximation puts relatively more weight on large residuals
  The ℓ₂-norm solution will tend to have relatively fewer large residuals
Example (figure): histograms of residual amplitudes for approximating c by the columns of B, under four penalties: ℓ₁, ℓ₂, deadzone-linear, and log barrier
Observations of Penalty Functions
The ℓ₁-norm penalty puts the most weight on small residuals and the least weight on large residuals.
The ℓ₂-norm penalty puts very small weight on small residuals, but strong weight on large residuals.
The deadzone-linear penalty function puts no weight on residuals smaller than 0.5, and relatively little weight on large residuals.
The log barrier penalty puts weight very much like the ℓ₂-norm penalty for small residuals, but puts very strong weight on residuals larger than around 0.8, and infinite weight on residuals larger than 1.
Observations of Amplitude Distributions
For the ℓ₁-optimal solution, many residuals are either zero or very small. The ℓ₁-optimal solution also has relatively more large residuals.
The ℓ₂-norm approximation has many modest residuals, and relatively few larger ones.
For the deadzone-linear penalty, we see that many residuals have the value ±0.5, right at the edge of the 'free' zone, for which no penalty is assessed.
For the log barrier penalty, we see that no residuals have a magnitude larger than 1, but otherwise the residual distribution is similar to the residual distribution for ℓ₂-norm approximation.
Approximation with Constraints
Add constraints to the basic norm approximation problem
  minimize ‖By − c‖
Reasons to add constraints:
  Rule out certain unacceptable approximations of the vector c
  Ensure that the approximator By satisfies certain properties
  Express prior knowledge of the vector y to be estimated, or of the estimation error
  Determine the projection of a point onto a set more complicated than a subspace
Approximation with Constraints
Nonnegativity constraints on variables:
  minimize ‖By − c‖  s.t. y ≽ 0
Estimate a vector of parameters known to be nonnegative
Determine the projection of the vector c onto the cone generated by the columns of B
Approximate c using a nonnegative linear combination of the columns of B
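For the ℓ₂ norm, this is the classic nonnegative least squares (NNLS) problem, which SciPy solves directly. A small sketch on a made-up instance:

```python
import numpy as np
from scipy.optimize import nnls

# minimize ||By - c||_2  s.t.  y >= 0, on a small made-up instance
B = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
c = np.array([1.0, -2.0, 0.0])

y, rnorm = nnls(B, c)  # y is componentwise nonnegative
```

Here the unconstrained least-squares solution has a negative second component; NNLS sets that component to zero and re-optimizes the remaining coordinate, giving y = (0.5, 0).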
Approximation with Constraints
Variable bounds:
  minimize ‖By − c‖  s.t. l ≼ y ≼ u
Prior knowledge of intervals in which each variable lies
Determine the projection of the vector c onto the image of a box under the linear mapping induced by B
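For the ℓ₂ norm, box-constrained approximation is bounded-variable least squares, available in SciPy as lsq_linear. A sketch with made-up data and bounds [0, 1]:

```python
import numpy as np
from scipy.optimize import lsq_linear

# minimize ||By - c||_2  s.t.  0 <= y <= 1 (elementwise), made-up data
B = np.array([[2.0, 1.0],
              [1.0, 3.0],
              [1.0, 1.0]])
c = np.array([4.0, 5.0, 6.0])

res = lsq_linear(B, c, bounds=([0.0, 0.0], [1.0, 1.0]))
y_box = res.x
```

For this data the unconstrained solution violates both upper bounds, so the constrained optimum sits at the corner y = (1, 1) of the box.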
Approximation with Constraints
Probability distribution:
  minimize ‖By − c‖  s.t. y ≽ 0, 1ᵀy = 1
Estimation of proportions or relative frequencies
Approximate c by a convex combination of the columns of B
Norm ball constraint:
  minimize ‖By − c‖  s.t. ‖y − y₀‖ ≤ e
y₀ is a prior guess of what the parameter y is, and e is the maximum plausible deviation
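With the ℓ₁ norm, the probability-distribution version remains an LP: reuse the (y, u) variable-stacking trick and add y ≽ 0, 1ᵀy = 1. A sketch with scipy.optimize.linprog:

```python
import numpy as np
from scipy.optimize import linprog

def simplex_l1_approx(B, c):
    """Approximate c by a convex combination of B's columns in the l1 norm:
    minimize ||By - c||_1  s.t.  y >= 0, 1^T y = 1, cast as an LP in (y, u)."""
    m, n = B.shape
    obj = np.concatenate([np.zeros(n), np.ones(m)])
    I = np.eye(m)
    A_ub = np.block([[B, -I], [-B, -I]])   # -u <= By - c <= u
    b_ub = np.concatenate([c, -c])
    A_eq = np.concatenate([np.ones(n), np.zeros(m)]).reshape(1, -1)  # 1^T y = 1
    res = linprog(obj, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
                  bounds=[(0, None)] * n + [(None, None)] * m)
    return res.x[:n]
```

When c already lies in the convex hull of the columns of B, the optimal value is 0 and y recovers the mixture weights.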
Least-norm Problems
Basic least-norm problem:
  minimize ‖y‖  s.t. By = c
B ∈ R^{m×n} with m ≤ n, c ∈ R^m, and ‖·‖ is a norm on R^n
The solution is called a least-norm solution of By = c
A convex optimization problem
Interesting when m < n, i.e., when By = c is underdetermined
Least-norm Problems
Reformulation as a norm approximation problem: let y₀ be any solution of By = c, and let Z ∈ R^{n×k} be a matrix whose columns are a basis for the nullspace of B. Then
  {y | By = c} = {y₀ + Zv | v ∈ R^k}
and the least-norm problem can be expressed as
  minimize ‖y₀ + Zv‖
with variable v ∈ R^k
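For the ℓ₂ norm both routes are easy to check numerically: the pseudoinverse gives the least-norm solution directly, while the nullspace reformulation turns the problem into unconstrained least squares in v. A sketch with made-up data:

```python
import numpy as np
from scipy.linalg import null_space

# Underdetermined system: B is 1x3, so By = c has a 2-dimensional solution set.
B = np.array([[1.0, 1.0, 1.0]])
c = np.array([3.0])

# Least l2-norm solution directly, via the pseudoinverse.
y_ln = np.linalg.pinv(B) @ c

# Same solution via the reformulation y = y0 + Zv: minimizing
# ||y0 + Zv||_2 over v is the least-squares problem Zv ~ -y0.
y0 = np.linalg.lstsq(B, c, rcond=None)[0]   # any particular solution
Z = null_space(B)                           # basis for the nullspace of B
v = np.linalg.lstsq(Z, -y0, rcond=None)[0]
y_ref = y0 + Z @ v
```

For this B and c the least ℓ₂-norm solution spreads the value 3 evenly, y = (1, 1, 1).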