Why LASSO, Ridge Regression, and EN: Explanation Based on Soft Computing


  1. Why LASSO, Ridge Regression, and EN: Explanation Based on Soft Computing
     Woraphon Yamaka (1), Hamza Alkhatib (2), Ingo Neumann (2), and Vladik Kreinovich (3)
     (1) Faculty of Economics, Chiang Mai University, Chiang Mai, Thailand, woraphon.econ@gmail.com
     (2) Geodesic Institute, Leibniz University of Hannover, Hannover, Germany, alkhatib@gih.uni-hannover.de, neumann@gih.uni-hannover.de
     (3) Department of Computer Science, University of Texas at El Paso, El Paso, Texas 79968, USA, vladik@utep.edu

  2. 1. Need for Regularization
     • In practice, in addition to measurement results, we often use imprecise expert knowledge.
     • For example, physicists usually believe that:
       – when the value of a physical quantity $x$ is small,
       – we can expand the dependence $y = f(x)$ of some other quantity $y$ on $x$ in a Taylor series, and
       – ignore quadratic and higher-order terms in this expansion.
     • The usual argument is that:
       – when $x$ is small,
       – its square $x^2$ is so much smaller than $x$ that it can safely be ignored.

  3. 2. Need for Regularization (cont-d)
     • This is indeed true:
       – if $x = 10\% = 0.1$, then $x^2 = 0.01 \ll 0.1$;
       – if $x = 1\% = 0.01$, then we can say that $x^2 = 0.0001 \ll x = 0.01$ with even higher confidence.
     • However, from the purely mathematical viewpoint, this argument is not fully convincing.
     • Indeed, the quadratic term in the Taylor expansion is not $x^2$, but $a_2 \cdot x^2$ for some coefficient $a_2$.
     • From the purely mathematical viewpoint, this coefficient $a_2$ can be huge.
     • In this case, the product $a_2 \cdot x^2$ will also be big, and we will not be able to ignore it.
     • From the physicist's viewpoint, however, the original argument is valid.
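
The point about the coefficient $a_2$ can be checked numerically. The following is a minimal sketch: the helper `quadratic_to_linear_ratio` and the coefficient values are illustrative assumptions, not taken from the slides.

```python
# Compare the quadratic term a2 * x^2 with the linear term a1 * x for small x.
# All coefficient values below are illustrative assumptions.

def quadratic_to_linear_ratio(x, a1, a2):
    """Return |a2 * x^2| / |a1 * x|: the size of the quadratic term
    relative to the linear term."""
    return abs(a2 * x**2) / abs(a1 * x)

x = 0.01  # a "small" value, 1%

# Reasonably small coefficient: the quadratic term is negligible.
ratio_moderate = quadratic_to_linear_ratio(x, a1=1.0, a2=2.0)

# Huge coefficient: the quadratic term dominates even though x is small.
ratio_huge = quadratic_to_linear_ratio(x, a1=1.0, a2=1e6)

print(ratio_moderate)
print(ratio_huge)
```

With a moderate $a_2$, the quadratic term is a small fraction of the linear one. With a huge $a_2$, it is many times larger, so smallness of $x$ alone does not justify dropping it.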

  4. 3. Need for Regularization (cont-d)
     • Indeed, physicists usually assume that the coefficients cannot be too large: they must be reasonably small.
     • This imprecise additional assumption underlies many successes of physics.
     • It can also be used as a supplement to measurements when we estimate the values of physical quantities.
     • This is common sense.
     • Sometimes, after applying some mathematical techniques, we get values of some parameters that are too large.
     • This usually means that something is not right:
       – either with our method
       – or with some measurement results, which may be outliers.

  5. 4. Need for Regularization (cont-d)
     • In simple cases, this is clear: if we have a record of temperature in some area,
       – and we see 17, 18, 19, 18, 17, and then suddenly 42 degrees,
       – we should get very suspicious, especially if the next day we again have a high of 19.
     • Physicists' intuition is great, but we cannot always rely on this intuition.
     • There are many problems that need solving.
     • It is not realistic to expect to have a skilled physicist for each such problem.
     • How do we deal with situations in which a professional physicist is not available?

  6. 5. Need for Regularization (cont-d)
     • We need to have a precise description of:
       – what we mean
       – when we say that the coefficients $a_0, \ldots, a_n$ describing a model must be reasonably small.
     • Such descriptions are known as regularization.

  7. 6. Which Regularizations Are Currently Used
     • Out of many possible regularizations, the following three techniques have been most empirically successful:
       – the LASSO technique, in which we limit the sum of the absolute values $\sum_{i=1}^{n} |a_i|$;
       – the ridge regression method, in which we limit the sum of the squares $\sum_{i=0}^{n} a_i^2$; and
       – the Elastic Net (EN) method, in which we limit a linear combination of the above two sums.
     • Why?
     • In this paper, we show that:
       – a natural formalization of commonsense intuition
       – indeed leads to these three regularization techniques.
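
The three quantities being limited can be written out directly. This is a minimal sketch under stated assumptions: the function names, the sample coefficients, and the weights `c1` and `c2` of the linear combination are hypothetical, and each function simply sums over whatever coefficients it is given.

```python
# Penalty terms limited by the three regularization techniques.
# Function names, sample coefficients, and weights are illustrative assumptions.

def lasso_penalty(coeffs):
    """Sum of the absolute values of the coefficients."""
    return sum(abs(a) for a in coeffs)

def ridge_penalty(coeffs):
    """Sum of the squares of the coefficients."""
    return sum(a * a for a in coeffs)

def elastic_net_penalty(coeffs, c1, c2):
    """Linear combination of the LASSO and ridge sums,
    with hypothetical weights c1 and c2."""
    return c1 * lasso_penalty(coeffs) + c2 * ridge_penalty(coeffs)

coeffs = [0.5, -2.0, 0.0, 1.5]

lasso_value = lasso_penalty(coeffs)
ridge_value = ridge_penalty(coeffs)
en_value = elastic_net_penalty(coeffs, c1=1.0, c2=0.5)

print(lasso_value, ridge_value, en_value)
```

Setting `c2 = 0` reduces the Elastic Net expression to the LASSO sum, and setting `c1 = 0` reduces it to the ridge sum, which is one way to see EN as covering both.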

  8. 7. Need for Degrees of Confidence
     • Precise statements like "$x$ is larger than 5" are either true or false.
     • In contrast, imprecise statements like "$x$ is reasonably small" are not well-defined.
     • For some values $x$, for example, for $x = 0.0001$, the expert is absolutely sure that $x$ is small.
     • For other values like $x = 10^7$, the expert is usually absolutely sure that this value is not reasonably small.
     • However, for intermediate values $x$:
       – the expert is usually not 100% sure whether this value is indeed reasonably small;
       – he or she is only sure to some degree.

  9. 8. Need for Degrees of Confidence (cont-d)
     • It is therefore reasonable to ask the expert to assign:
       – to each value $x$,
       – a degree $\mu(x)$ to which this expert believes that $x$ is reasonably small.
     • We can use different scales for such degrees.
     • In the computer, "absolutely true" is usually described as 1, and "absolutely false" as 0.
     • So, it is convenient to use a scale from 0 to 1 for such degrees.
     • This assignment is one of the main ideas behind fuzzy logic.
     • This technique was specifically developed to deal with such imprecision.

  10. 9. Need for Degrees of Confidence (cont-d)
     • This way, we can assign:
       – to each imprecise statement,
       – a function $\mu(x)$ that describes to what degree this statement is satisfied for each value $x$.
     • This function is known as a membership function or a fuzzy set.
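
To make the idea of a membership function concrete, here is a minimal sketch. The specific formula $\mu(x) = 1 / (1 + (|x| / s)^2)$ and the scale $s = 1$ are assumptions chosen only for illustration; the slides do not prescribe any particular shape for $\mu$.

```python
# A hypothetical membership function for "x is reasonably small".
# The formula and the default scale are illustrative assumptions.

def mu_reasonably_small(x, scale=1.0):
    """Degree, from 0 to 1, to which x is considered reasonably small.
    Equals 1 at x = 0 and decreases toward 0 as |x| grows."""
    return 1.0 / (1.0 + (abs(x) / scale) ** 2)

degree_tiny = mu_reasonably_small(0.0001)  # expert is practically sure: small
degree_mid = mu_reasonably_small(1.0)      # intermediate value: sure to a degree
degree_huge = mu_reasonably_small(1e7)     # expert is practically sure: not small

print(degree_tiny, degree_mid, degree_huge)
```

The values used here mirror the examples from the slides: the degree is close to 1 for $x = 0.0001$, close to 0 for $x = 10^7$, and strictly between 0 and 1 for intermediate values.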

  11. 10. Need for "And"- and "Or"-Operations
     • Often, experts make complex statements.
     • For example, they may say that $x$ is reasonably small, but not very small.
     • This statement is obtained:
       – from the basic statements "$x$ is reasonably small" and "$x$ is very small"
       – by applying connectives "not" and "but" (which here means the same as "and").
     • In general:
       – we can use connectives "and", "or", and "not"
       – to combine elementary statements into a composite one.
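
The composite statement "$x$ is reasonably small, but not very small" can be sketched in code. Everything specific here is an assumption: the slides at this point do not say which operations implement "and", "or", and "not", so this sketch uses min, max, and $1 - a$, one common choice in fuzzy logic, together with hypothetical membership functions.

```python
# Combining degrees of confidence with "and", "or", and "not".
# min / max / 1 - a is one common choice, assumed here for illustration;
# the membership functions and their scales are also hypothetical.

def f_and(a, b):
    return min(a, b)

def f_or(a, b):
    return max(a, b)

def f_not(a):
    return 1.0 - a

def mu_reasonably_small(x, scale=1.0):
    return 1.0 / (1.0 + (abs(x) / scale) ** 2)

def mu_very_small(x, scale=0.01):
    return 1.0 / (1.0 + (abs(x) / scale) ** 2)

def mu_small_but_not_very_small(x):
    """Degree of 'x is reasonably small AND NOT (x is very small)'."""
    return f_and(mu_reasonably_small(x), f_not(mu_very_small(x)))

degree_at_tenth = mu_small_but_not_very_small(0.1)    # small, not very small
degree_at_tiny = mu_small_but_not_very_small(0.0001)  # very small
degree_at_large = mu_small_but_not_very_small(100.0)  # not small at all

print(degree_at_tenth, degree_at_tiny, degree_at_large)
```

Under these assumptions, the composite degree is high only for values such as 0.1 that satisfy both parts of the statement, and low both for very small values and for large ones.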
