
Regularization with Lipschitz Loss (Pierre Alquier)

  1. Regularization with Lipschitz Loss. Pierre Alquier. Sequential, structured, and/or statistical learning, IHES, May 17, 2017. Outline: Motivation; Oracle inequalities; Applications. Running title: Regularized Procedures with Lipschitz Loss Functions.

  2. Motivation (subsections: Matrix completion: the L2 point of view; Matrix completion: Lipschitz losses?). Motivation: user ratings.
     Stan: 7, 3, 8
     Pierre: 8, 10, 9, 10, 9, 10, 10, 10, 8
     Zoe: 8, 3, 7
     Bob: 6, 4, 2
     Oscar: 6, 10, 7
     Léa: 8, 4, 9
     Tony: 9, 3, 4, 8

  3. Motivation: user ratings. Same ratings as on slide 2, with one of Bob's entries unknown: Bob: 6, 4, 2, ???

  4. Motivation: user ratings. Same ratings as on slide 2, with Bob's unknown entry filled in: Bob: 6, 4, 2, 7

  7. A possible model.
     Notation: $\langle A, B \rangle_F = \mathrm{Tr}(A^T B)$. Let $E_{j,k}$ be the matrix with zeros everywhere except the $(j,k)$-th entry, which equals 1.
     Observations: $Y_i = \langle M^*, X_i \rangle_F + \varepsilon_i$, with $\mathbb{E}(\varepsilon_i) = 0$, where $X_i$ takes values in the set of matrices $\{E_{j,k}\}$.
     Idea: $M^*$ is (approximately) low-rank.
     E. Candès & T. Tao (2009). The power of convex relaxation: Near-optimal matrix completion. IEEE Trans. Info. Theory.
     E. Candès & Y. Plan (2010). Matrix completion with noise. Proceedings of the IEEE.
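
The observation model above can be simulated in a few lines. The following is only a sketch: the matrix sizes, the rank, the uniform sampling of positions, and the Gaussian noise level are all assumptions made for illustration, and the helper names `basis_matrix` and `frobenius_inner` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: m1 users, m2 items, true rank r, N noisy observations.
m1, m2, r, N = 7, 10, 2, 40

# A low-rank M* built as the product of two thin factors.
M_star = rng.normal(size=(m1, r)) @ rng.normal(size=(r, m2))


def basis_matrix(j, k, shape):
    """E_{j,k}: zeros everywhere except a 1 at entry (j, k)."""
    E = np.zeros(shape)
    E[j, k] = 1.0
    return E


def frobenius_inner(A, B):
    """<A, B>_F = Tr(A^T B)."""
    return np.trace(A.T @ B)


# Each X_i is some E_{j,k}; positions are drawn uniformly here (an assumption).
rows = rng.integers(0, m1, size=N)
cols = rng.integers(0, m2, size=N)
noise = rng.normal(scale=0.1, size=N)  # centered noise, E(eps_i) = 0

# Y_i = <M*, X_i>_F + eps_i
Y = np.array([
    frobenius_inner(M_star, basis_matrix(j, k, M_star.shape))
    for j, k in zip(rows, cols)
]) + noise
```

Since $\langle M, E_{j,k} \rangle_F$ is just the entry $M_{j,k}$, each observation is a noisy reading of one entry of $M^*$.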

  9. Penalized ERM.
     First idea:
     $$\hat{M} \in \arg\min_{M} \left\{ \frac{1}{N} \sum_{i=1}^{N} \left( Y_i - \langle M, X_i \rangle_F \right)^2 + \lambda \, \mathrm{rank}(M) \right\}$$
     but the rank is not convex...
     $$\hat{M} \in \arg\min_{M} \left\{ \frac{1}{N} \sum_{i=1}^{N} \left( Y_i - \langle M, X_i \rangle_F \right)^2 + \lambda \| M \|_* \right\}$$
     Minimax rates of convergence derived in:
     V. Koltchinskii, K. Lounici, & A. Tsybakov (2011). Nuclear-norm penalization and optimal rates for noisy low-rank matrix completion. Annals of Statistics.
     O. Klopp (2014). Noisy low-rank matrix completion with general sampling distribution. Bernoulli.
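
One common way to solve the nuclear-norm penalized problem numerically is proximal gradient descent with singular-value soft-thresholding. The slides do not specify an algorithm, so the sketch below is an assumption, not the author's method; the function names, the synthetic data, and the value of `lam` are hypothetical.

```python
import numpy as np


def svd_soft_threshold(A, t):
    """Proximal operator of t * nuclear norm: shrink singular values by t."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return (U * np.maximum(s - t, 0.0)) @ Vt


def objective(M, rows, cols, Y, lam):
    """(1/N) sum_i (Y_i - M[j_i, k_i])^2 + lam * ||M||_*."""
    resid = Y - M[rows, cols]
    return np.mean(resid ** 2) + lam * np.linalg.norm(M, ord="nuc")


def nuclear_norm_completion(rows, cols, Y, shape, lam, n_iter=200):
    """Proximal gradient on the quadratic loss with nuclear-norm penalty."""
    N = len(Y)
    # The gradient of the quadratic term is Lipschitz with constant
    # 2 * (largest number of observations of a single entry) / N.
    counts = np.zeros(shape)
    np.add.at(counts, (rows, cols), 1.0)
    step = N / (2.0 * counts.max())
    M = np.zeros(shape)
    for _ in range(n_iter):
        grad = np.zeros(shape)
        np.add.at(grad, (rows, cols), -2.0 / N * (Y - M[rows, cols]))
        M = svd_soft_threshold(M - step * grad, step * lam)
    return M


# Synthetic example (all values are illustrative assumptions).
rng = np.random.default_rng(0)
M_star = rng.normal(size=(7, 2)) @ rng.normal(size=(2, 10))
rows = rng.integers(0, 7, size=50)
cols = rng.integers(0, 10, size=50)
Y = M_star[rows, cols] + rng.normal(scale=0.1, size=50)

M_hat = nuclear_norm_completion(rows, cols, Y, M_star.shape, lam=0.05)
print(objective(M_hat, rows, cols, Y, 0.05))
```

With step size equal to the inverse Lipschitz constant, each iteration does not increase the penalized objective, which is why the final value can be compared against the value at the zero matrix.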

  12. Is the quadratic loss always a good idea? Same user ratings as on slide 2, with Bob's unknown entry given as an interval: Bob: 6, 4, 2, [6,8]

  14. The quantile loss.
     ... suggests replacing the quadratic loss with the quantile loss
     $$\ell_\tau(f(x), y) = (y - f(x)) \left[ \tau - \mathbf{1}(y - f(x) \le 0) \right].$$
     $$\hat{M} \in \arg\min_{M} \left\{ \frac{1}{N} \sum_{i=1}^{N} \ell_\tau\left( \langle M, X_i \rangle_F, Y_i \right) + \lambda \| M \|_* \right\}$$
     Source: http://www.lokad.com/
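
To make the quantile loss concrete, here is a small sketch that implements $\ell_\tau$ exactly as written above and checks, on a toy sample, that the constant prediction minimizing the empirical loss is a $\tau$-quantile of the sample. The sample `y` and the value `tau = 0.25` are illustrative assumptions.

```python
import numpy as np


def quantile_loss(pred, y, tau):
    """l_tau(pred, y) = (y - pred) * (tau - 1(y - pred <= 0))."""
    u = np.asarray(y) - np.asarray(pred)
    return u * (tau - (u <= 0))


# Toy sample: the integers 1, ..., 101 (an assumption for illustration).
y = np.arange(1, 102)
tau = 0.25

# Empirical risk of each constant prediction c taken from the sample itself.
risks = np.array([quantile_loss(c, y, tau).sum() for c in y])
best = y[np.argmin(risks)]
print(best)  # a 0.25-quantile of the sample
```

Under-predictions are charged at rate `tau` and over-predictions at rate `1 - tau`, so varying `tau` moves the minimizer across the distribution; two values of `tau` give the two ends of a prediction interval.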

  15. 1-bit matrix completion. [Table of 1-bit ratings for users Stan, Pierre, Zoe, Bob, Oscar, Léa, Tony.]

  16. 1-bit matrix completion. [Same table of 1-bit ratings, with one of Bob's entries unknown: ???]

  21. 1-bit matrix completion.
     $$\hat{M} \in \arg\min_{M} \left\{ \frac{1}{N} \sum_{i=1}^{N} \mathbf{1}\left( \mathrm{sign}(\langle M, X_i \rangle_F) \neq Y_i \right) + \lambda \| M \|_* \right\}$$
     Problem: the indicator function is not convex.
     $$\hat{M} \in \arg\min_{M} \left\{ \frac{1}{N} \sum_{i=1}^{N} \ell\left( \langle M, X_i \rangle_F, Y_i \right) + \lambda \| M \|_* \right\}$$
     logistic loss: $\ell(y', y) = \log(1 + \exp(-y' y))$
     J. Lafond, O. Klopp, E. Moulines & J. Salmon (2014). Probabilistic low-rank matrix completion on finite alphabets. NIPS.
     hinge loss: $\ell(y', y) = (1 - y' y)_+$
     etc.
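
The three losses on this slide can be compared numerically. This is a minimal sketch assuming labels $y \in \{-1, +1\}$ and a real-valued score $y' = \langle M, X_i \rangle_F$; the function names and the grid of scores are hypothetical.

```python
import numpy as np


def zero_one_loss(score, y):
    """1(sign(score) != y): the non-convex indicator loss."""
    return (np.sign(score) != y).astype(float)


def logistic_loss(score, y):
    """log(1 + exp(-score * y)), computed stably via logaddexp."""
    return np.logaddexp(0.0, -score * y)


def hinge_loss(score, y):
    """(1 - score * y)_+ ."""
    return np.maximum(1.0 - score * y, 0.0)


scores = np.linspace(-3.0, 3.0, 13)
y = 1.0

for s in scores:
    print(s, zero_one_loss(s, y), logistic_loss(s, y), hinge_loss(s, y))
```

On this grid the hinge loss is never below the 0-1 loss, while both surrogates are convex in the score, which is what makes the penalized problem tractable.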

  22. Lipschitz losses. All the aforementioned losses (hinge, logistic, quantile) are Lipschitz, and so are other popular losses: Huber, ...
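
The Lipschitz property can be checked numerically by bounding finite-difference slopes of each loss in its first argument. The sketch below is illustrative only: the label `y = 1`, the level `tau = 0.3`, the Huber threshold `delta = 1`, and the grid are assumptions. The quadratic loss is included as a contrast, since its slope grows with the range of the grid.

```python
import numpy as np


def max_slope(loss, grid):
    """Largest absolute finite-difference slope of `loss` over `grid`."""
    values = loss(grid)
    return np.max(np.abs(np.diff(values)) / np.diff(grid))


grid = np.linspace(-50.0, 50.0, 20001)
y, tau, delta = 1.0, 0.3, 1.0  # illustrative values


def hinge(s):
    return np.maximum(1.0 - s * y, 0.0)


def logistic(s):
    return np.logaddexp(0.0, -s * y)


def quantile(s):
    u = y - s
    return u * (tau - (u <= 0))


def huber(s):
    u = np.abs(y - s)
    return np.where(u <= delta, 0.5 * u ** 2, delta * (u - 0.5 * delta))


def quadratic(s):
    return (y - s) ** 2


losses = {"hinge": hinge, "logistic": logistic, "quantile": quantile,
          "huber": huber, "quadratic": quadratic}
slopes = {name: max_slope(f, grid) for name, f in losses.items()}
for name, value in slopes.items():
    print(name, value)
```

For these choices the hinge, logistic, and Huber slopes stay bounded by 1 and the quantile slope by max(tau, 1 - tau), whatever the grid range, whereas the quadratic slope keeps increasing as the grid is widened.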
