Interval-valued regression and classification models in the framework of machine learning
Lev Utkin and Frank Coolen
Innsbruck, July 2011

Outline:
A statement of the standard machine learning problem
A new approach for regression and classification modeling
Regression models
Classification models

A general problem statement

Given: a training set $(x_i, y_i)$, $i = 1, \dots, n$, where $x \in \mathbb{R}^m$ is a multivariate input of features and $y$ is a scalar output:
- regression: $y \in \mathbb{R}$;
- classification: binary $y \in \{-1, 1\}$ or multi-class $y \in \{1, 2, \dots, l\}$.

The learning problem: to select a function $f(x, w_{opt})$ from a set of functions $f(x, w)$ parameterized by a set of parameters $w \in \Lambda$, which
- regression: best approximates the system response $y$;
- classification: separates examples of different classes $y$.

A general problem solution

Minimize the risk functional $R(w)$ over $w \in \Lambda$:

regression:
$$R(w) = \int_{\mathbb{R}^{m+1}} L(y, f)\, \mathrm{d}F_0(x, y) = \int_{\mathbb{R}} L(z, w)\, \mathrm{d}F(z), \quad z = y - f(x, w).$$

classification:
$$R(w) = \int_{\mathbb{R}^m \times \{-1, 1\}} L(y, f)\, \mathrm{d}F_0(x, y) = \sum_{y \in \{-1, 1\}} p(y) \int_{\mathbb{R}^m} L(y, f)\, \mathrm{d}F_0(x \mid y).$$

Loss functions

- in regression models: quadratic, linear, the "pinball" function (for quantile regression), the ε-insensitive loss function;
- in classification models: indicator, logistic, hinge loss.

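The losses listed above can be written down in a few lines. The following is a minimal sketch, not taken from the slides: the function names and default parameters are illustrative, "linear" is read here as the absolute value $|z|$ (an assumption), regression losses take the residual $z = y - f(x, w)$, and classification losses take a label $y \in \{-1, 1\}$ and a real-valued score.

```python
import math

# Regression losses, as functions of the residual z = y - f(x, w).
def quadratic_loss(z):
    return z * z

def linear_loss(z):
    # Assumption: "linear" is interpreted as the absolute-value loss |z|.
    return abs(z)

def pinball_loss(z, tau=0.5):
    # Quantile-regression loss for quantile level tau in (0, 1).
    return tau * z if z >= 0 else (tau - 1.0) * z

def eps_insensitive_loss(z, eps=0.1):
    # Residuals inside the tube [-eps, eps] cost nothing.
    return max(0.0, abs(z) - eps)

# Classification losses, for a label y in {-1, 1} and a real-valued score.
def indicator_loss(y, score):
    # 1 when the sign of the score disagrees with the label (ties count as errors).
    return 1.0 if y * score <= 0 else 0.0

def logistic_loss(y, score):
    return math.log1p(math.exp(-y * score))

def hinge_loss(y, score):
    return max(0.0, 1.0 - y * score)
```

For example, `pinball_loss(-2.0, tau=0.25)` penalizes an over-prediction by `0.75 * 2.0`, while `pinball_loss(2.0, tau=0.25)` penalizes an under-prediction of the same size by only `0.25 * 2.0`.
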
The main idea

1. It is assumed that the CDF $F(z) \in \mathcal{F}$ is bounded by the lower $\underline{F}$ and upper $\overline{F}$ CDFs (P-boxes): $\mathcal{F} = \{ F(z) \mid \forall z,\ \underline{F}(z \mid y) \le F(z \mid y) \le \overline{F}(z \mid y) \}$.
2. P-boxes are constructed from training data and they are parametric, i.e., $\mathcal{F} \to \mathcal{F}(w)$.
3. Two CDFs maximizing and minimizing $R(w)$ are taken from $\mathcal{F}$; they determine the largest $\overline{R}$ and smallest $\underline{R}$ risk measures as functions of $w$.
4. The parameters $w$ are computed by minimizing the lower and upper risk measures.

Two main tasks to be solved

1. How to construct parametric P-boxes, i.e., $\underline{F}$ and $\overline{F}$, from the training set?
2. How to find "optimal" distributions from the P-box, i.e., the distributions maximizing and minimizing the risk functional (corresponding to minimax and minimin strategies, respectively)?

Interval regression (simplest case)

[Figure: interval regression in the simplest case; horizontal axis $x_1$, vertical axis $y$, regression function $f$.]

Intervals and P-boxes (regression)

Given: a training set $(x_i, Y_i)$, $i = 1, \dots, n$, $x \in \mathbb{R}^m$, $Y_i = [\underline{y}_i, \overline{y}_i]$.

P-boxes:
$$\underline{F}(z \mid w) = \mathrm{Bel}((-\infty, z]) = n^{-1} \sum_{i:\, \overline{Z}_i(w) \le z} 1,$$
$$\overline{F}(z \mid w) = \mathrm{Pl}((-\infty, z]) = n^{-1} \sum_{i:\, \underline{Z}_i(w) \le z} 1.$$

Here $\underline{Z}_i(w) = \underline{y}_i - f(x_i, w)$ and $\overline{Z}_i(w) = \overline{y}_i - f(x_i, w)$.

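The two bounding CDFs are simple counts over the residual intervals. The sketch below is illustrative only: the function names, the toy data, and the model `f(x) = x` are assumptions, not part of the slides. The lower CDF counts intervals lying entirely in $(-\infty, z]$ (belief), and the upper CDF counts intervals that intersect $(-\infty, z]$ (plausibility).

```python
def make_residual_intervals(xs, y_intervals, f):
    # Residual interval [y_lo - f(x), y_hi - f(x)] for every observation.
    return [(y_lo - f(x), y_hi - f(x)) for x, (y_lo, y_hi) in zip(xs, y_intervals)]

def pbox_cdfs(z, residual_intervals):
    """Return (lower_F, upper_F) evaluated at z."""
    n = len(residual_intervals)
    # Bel((-inf, z]): the whole interval lies to the left of z.
    lower = sum(1 for _, z_hi in residual_intervals if z_hi <= z) / n
    # Pl((-inf, z]): the interval reaches into (-inf, z].
    upper = sum(1 for z_lo, _ in residual_intervals if z_lo <= z) / n
    return lower, upper

# Hypothetical toy data: four interval-valued observations and the model f(x) = x.
xs = [0.0, 1.0, 2.0, 3.0]
y_intervals = [(0.0, 1.0), (1.0, 3.0), (2.0, 2.5), (2.0, 5.0)]
residuals = make_residual_intervals(xs, y_intervals, lambda x: x)
```

With these data, `pbox_cdfs(0.5, residuals)` gives a lower value of 0.25 and an upper value of 1.0: only one residual interval ends at or before 0.5, but all four start at or before it.
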
"Optimal distribution functions" (regression) (Utkin & Destercke 2009)

[Figure: four interval-valued estimates $[\underline{Z}_i(w), \overline{Z}_i(w)]$, numbered 1 to 4, drawn along the $z$ axis.]

Lower and upper CDFs

[Figure: lower and upper probability distributions produced by four intervals. Curves labeled "upper F" and "lower F"; horizontal axis $z$ from -8 to 8, vertical axis $F$ up to 1.0.]

The optimal CDF by the minimax strategy

[Figure: the optimal probability distribution (thick line) by the minimax strategy, drawn between the curves labeled "upper F" and "lower F"; horizontal axis $z$ from -8 to 8, vertical axis $F$ up to 1.0.]

The optimal CDF by the minimin strategy

[Figure: the optimal probability distribution (thick line) by the minimin strategy, drawn between the curves labeled "upper F" and "lower F"; horizontal axis $z$ from -8 to 8, vertical axis $F$ up to 1.0.]

Intervals and expectations in the framework of belief structures (regression)

The upper expectation of the risk functional (Nguyen & Walker 1994, Strat 1990):
$$\overline{R}(w) = n^{-1} \sum_{i=1}^{n} \max_{z \in [\underline{Z}_i(w), \overline{Z}_i(w)]} L(z) \to \min_{w}.$$

The optimization problem for computing $w$ (for a convex loss $L$, the maximum over each interval is attained at one of its endpoints):
$$\min_{w, G_i} \sum_{i=1}^{n} G_i \quad \text{subject to} \quad G_i \ge L(\underline{Z}_i(w)), \quad G_i \ge L(\overline{Z}_i(w)), \quad i = 1, \dots, n.$$

If $L(z)$ and $f(x, w)$ are linear, then we have an LP problem.

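The upper risk above can be evaluated directly from the interval endpoints. The sketch below is a rough illustration under stated assumptions: the model is a one-parameter line `f(x, w) = w * x`, the loss is convex (so the maximum over each residual interval sits at an endpoint), and a brute-force grid search stands in for the LP solver. The function names, the grid, and the toy data are hypothetical.

```python
def upper_risk(w, xs, y_intervals, loss=abs):
    """Upper risk for f(x, w) = w * x, assuming a convex loss."""
    total = 0.0
    for x, (y_lo, y_hi) in zip(xs, y_intervals):
        fx = w * x
        # Worst case over the residual interval: check both endpoints.
        total += max(loss(y_lo - fx), loss(y_hi - fx))
    return total / len(xs)

def minimax_slope(xs, y_intervals, grid, loss=abs):
    # Grid search used here instead of an LP solver (illustrative substitute).
    return min(grid, key=lambda w: upper_risk(w, xs, y_intervals, loss))

# Hypothetical toy data: intervals of half-width 0.5 centred on the line y = x.
xs = [1.0, 2.0, 3.0]
y_intervals = [(0.5, 1.5), (1.5, 2.5), (2.5, 3.5)]
grid = [k / 10 for k in range(5, 16)]  # candidate slopes 0.5, 0.6, ..., 1.5
best_w = minimax_slope(xs, y_intervals, grid)
```

On these data the search picks the slope 1.0, where the upper risk equals the common half-width 0.5. For more parameters, the constrained formulation with auxiliary variables $G_i$ would normally be passed to an LP solver rather than searched on a grid.
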
Support vector machine (SVM) and interval observations

If we take the ε-insensitive loss function $L$, then
$$\min_{\alpha} \; \frac{1}{2} \langle \alpha, \alpha \rangle + C \sum_{i=1}^{n} (\xi_i + \xi_i^*)$$
subject to
$$\xi_i \ge 0, \quad \xi_i^* \ge 0,$$
$$\xi_i + \varepsilon \ge (\langle \alpha, x_i \rangle + \alpha_0) - \underline{y}_i, \quad \xi_i + \varepsilon \ge (\langle \alpha, x_i \rangle + \alpha_0) - \overline{y}_i,$$
$$\xi_i^* + \varepsilon \ge \underline{y}_i - (\langle \alpha, x_i \rangle + \alpha_0), \quad \xi_i^* + \varepsilon \ge \overline{y}_i - (\langle \alpha, x_i \rangle + \alpha_0).$$

Here $\frac{1}{2} \langle \alpha, \alpha \rangle$ is the Tikhonov regularization term (the most popular penalty or smoothness term).

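For fixed $\alpha$ and $\alpha_0$, the smallest slack variables satisfying the constraints above have a closed form, so the objective can be evaluated without a solver. The following sketch only evaluates the objective; it does not solve the quadratic program. The function name, default values of `C` and `eps`, and the toy data are assumptions for illustration.

```python
def interval_svr_objective(alpha, alpha0, xs, y_intervals, C=1.0, eps=0.1):
    """Objective value for a linear model with the smallest feasible slacks."""
    regularizer = 0.5 * sum(a * a for a in alpha)
    slack_sum = 0.0
    for x, (y_lo, y_hi) in zip(xs, y_intervals):
        fx = sum(a * xj for a, xj in zip(alpha, x)) + alpha0
        # Smallest xi with xi >= 0, xi + eps >= fx - y_lo, xi + eps >= fx - y_hi.
        xi = max(0.0, fx - y_lo - eps, fx - y_hi - eps)
        # Smallest xi* with xi* >= 0, xi* + eps >= y_lo - fx, xi* + eps >= y_hi - fx.
        xi_star = max(0.0, y_lo - fx - eps, y_hi - fx - eps)
        slack_sum += xi + xi_star
    return regularizer + C * slack_sum

# Hypothetical toy data: one feature, one interval observation and one precise one.
xs = [[1.0], [2.0]]
y_intervals = [(0.5, 1.5), (2.0, 2.0)]
value = interval_svr_objective([1.0], 0.0, xs, y_intervals, C=2.0, eps=0.25)
```

Here the first observation contributes a slack of 0.25 on each side, because both interval endpoints lie 0.5 away from the prediction, while the precise second observation falls inside the ε-tube and contributes nothing.
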
Advantages of SVMs

1. SVMs are flexible in the choice of the form of the discriminant and regression functions (non-linear functions $f$ due to kernel methodology);
2. SVMs provide a unique solution (due to a convex objective function); there are no false local minima;
3. SVMs are simple to use;
4. SVMs have a clear geometric explanation.

Interval data in classification

[Figure: interval data in classification; horizontal axis $x_1$, vertical axis $x_2$, separating function $f$.]

What do the minimax and minimin strategies mean?

1. Regression:
   - minimax: outlying points are taken into account;
   - minimin: neighboring points are taken into account.
2. Classification:
   - minimax: points from two classes approach each other (get mixed);
   - minimin: points are separated.

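For the regression case, the distinction can be made concrete by asking which point of a residual interval each strategy effectively uses. This is one possible reading, sketched under the assumption that the loss is symmetric and grows with $|z|$; the function names are hypothetical.

```python
def minimax_point(z_lo, z_hi):
    """Worst case: the endpoint of [z_lo, z_hi] farthest from zero."""
    return z_lo if abs(z_lo) > abs(z_hi) else z_hi

def minimin_point(z_lo, z_hi):
    """Best case: the point of [z_lo, z_hi] nearest to zero."""
    # Clamp 0 into the interval; this returns 0.0 when the interval contains zero.
    return min(max(0.0, z_lo), z_hi)
```

For the residual interval $[-1, 2]$, the minimax strategy works with the outlying endpoint 2, while the minimin strategy works with 0, since the regression function already passes through the interval.
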
Questions?