Optimizing pred(25) Is NP-Hard

Martine Ceberio, Olga Kosheleva, and Vladik Kreinovich
University of Texas at El Paso
El Paso, TX 79968, USA
mceberio@utep.edu, olgak@utep.edu, vladik@utep.edu
1. Need to Estimate Parameters of Models

• In many practical situations:
  – we know that a quantity y depends on the quantities x_1, ..., x_n, and
  – we know the general type of this dependence.
• In precise terms, this means that:
  – we know a family of functions f(c_1, ..., c_p, x_1, ..., x_n) characterized by parameters c_i, and
  – we know that the actual dependence corresponds to one of these functions.
• Example: we may know that the dependence is linear:

  f(c_1, \ldots, c_n, c_{n+1}, x_1, \ldots, x_n) = c_{n+1} + \sum_{i=1}^{n} c_i \cdot x_i.
2. Need to Estimate Parameters (cont-d)

• In general, we know the type of the dependence, but we do not know the actual values of the parameters.
• These values can only be determined from measurements and observations, when we observe:
  – the values x_j and
  – the corresponding value y.
• Measurements and observations are always approximate.
• So we end up with tuples (x_{1k}, ..., x_{nk}, y_k), 1 ≤ k ≤ K, for which y_k ≈ f(c_1, ..., c_p, x_{1k}, ..., x_{nk}) for all k.
• We need to estimate the parameters c_1, ..., c_p based on these measurement results.
3. Least Squares

• In most practical situations, the Least Squares method is used to estimate the desired parameters:

  \sum_{k} (y_k - f(c_1, \ldots, c_p, x_{1k}, \ldots, x_{nk}))^2 \to \min_{c_1, \ldots, c_p}.

• When f(c_1, ..., c_p, x_1, ..., x_n) depends linearly on the c_i, we get an easy-to-solve system of linear equations.
• This approach is optimal when approximation errors are independent and normally distributed.
• In practice, however, we often have outliers – e.g., due to a malfunction of a measuring instrument.
• In the presence of even a single outlier, the Least Squares method can give very wrong results.
• Example: y = c for some unknown constant c.
• In this case, we get c = (y_1 + ... + y_K) / K.
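To make the linear case concrete, here is a minimal Least Squares sketch with NumPy; the synthetic data, the "true" parameter values, and the noise level are illustrative assumptions, not part of the slides.

```python
# A minimal sketch of Least Squares for a linear model with a free term.
import numpy as np

rng = np.random.default_rng(0)
K, n = 50, 2
X = rng.normal(size=(K, n))             # rows are the tuples (x_1k, ..., x_nk)
c_true = np.array([2.0, -1.0])          # assumed "true" coefficients
free_term = 0.5                         # assumed "true" free term c_{n+1}
y = free_term + X @ c_true + rng.normal(scale=0.01, size=K)

# Appending a column of ones lets lstsq estimate the free term as well.
A = np.hstack([X, np.ones((K, 1))])
c_hat, *_ = np.linalg.lstsq(A, y, rcond=None)
print(c_hat)                            # approximately [2.0, -1.0, 0.5]
```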
4. Least Squares (cont-d)

• The formula c = (y_1 + ... + y_K) / K works well if all the K values y_i are approximately equal to c.
• For example, if the actual value of c is 0 and |y_i| ≤ 0.1, we get an estimate |c| ≤ 0.1.
• However, if out of 100 measurements y_i, one is an outlier equal to 1000, the estimate becomes close to 10.
• This estimate is far from the actual value 0.
• So, we need estimates which do not change as much in the presence of possible outliers.
• Such methods are called robust.
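The slide's numbers are easy to reproduce; the sketch below does so (the uniform noise and the median comparison are illustrative additions, the median being a standard robust contrast rather than the method of these slides).

```python
# Reproducing the slide's example: 99 values with |y_i| <= 0.1 plus one
# outlier of 1000 push the mean close to 10; the median barely moves.
import numpy as np

rng = np.random.default_rng(1)
y = rng.uniform(-0.1, 0.1, size=99)     # honest measurements around c = 0
y = np.append(y, 1000.0)                # a single outlier

print(np.mean(y))                       # close to 10 -- far from 0
print(np.median(y))                     # near 0
```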
5. pred(25) as an Example of a Robust Estimate

• One of the possible robust estimates consists of:
  – selecting a percentage α and
  – selecting the parameters for which the number of points within α% of the observed value is the largest.
• In other words:
  – each prediction is formulated as a constraint, and
  – we look for parameters that maximize the number of satisfied constraints.
• This technique is known as pred(α).
• This method is especially widely used in software engineering, usually for α = 25.
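A minimal sketch of the pred(α) objective for a linear model follows; here the tolerance is taken relative to the observed value y_k, as this slide phrases it (the Definition in Section 7 normalizes by the prediction instead), and the function name pred_score is ours.

```python
import numpy as np

def pred_score(c, X, y, alpha=0.25):
    """Fraction of cases k with |y_k - prediction_k| <= alpha * |y_k|."""
    predictions = X @ c                 # rows of X are (x_1k, ..., x_nk)
    within = np.abs(y - predictions) <= alpha * np.abs(y)
    return float(np.mean(within))

# pred(alpha) estimation then means: pick c maximizing pred_score(c, X, y).
```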
6. Problem

• For the Least Squares approach, the usual calculus ideas lead to an efficient optimization algorithm.
• However, no such easy solution is known for pred(25) estimates.
• All known algorithms for this estimation are rather time-consuming.
• A natural question arises:
  – is this because we have not yet found a feasible algorithm for computing these estimates, or
  – is this estimation problem really hard?
• We prove that even for a linear model with no free term c_{n+1}, pred(25) estimation is NP-hard.
• In plain terms, this means that this problem is indeed inherently hard.
7. Main Result

Definition. Let α ∈ (0, 1) be a rational number. By a linear pred(α)-estimation problem, we mean the following:

• Given: an integer n, K rational-valued tuples (x_{1k}, ..., x_{nk}, y_k), 1 ≤ k ≤ K, and an integer M < K;
• Check: whether there exist parameters c_1, ..., c_n for which in at least M cases k, we have

  \left| y_k - \sum_{i=1}^{n} c_i \cdot x_{ik} \right| \le \alpha \cdot \left| \sum_{i=1}^{n} c_i \cdot x_{ik} \right|.

Proposition. For every α, the linear pred(α)-estimation problem is NP-hard.
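Checking the condition for a given candidate c is straightforward; the hard part is finding such a c. A sketch of the check, with names of our choosing, assuming X holds the x_{ik} and y the y_k:

```python
import numpy as np

def at_least_M_satisfied(c, X, y, alpha, M):
    """True iff at least M cases k satisfy |y_k - p_k| <= alpha * |p_k|,
    where p_k = sum_i c_i * x_ik is the prediction for case k."""
    p = X @ c
    return int(np.sum(np.abs(y - p) <= alpha * np.abs(p))) >= M
```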
8. Acknowledgments

This work was supported in part by the National Science Foundation grants:
• HRD-0734825 and HRD-1242122 (Cyber-ShARE Center of Excellence), and
• DUE-0926721.
9. Proof

• To prove this result, we will reduce, to this problem, a known NP-hard problem of checking whether:
  – a set of integer weights s_1, ..., s_m
  – can be divided into two parts of equal overall weight.
• I.e., whether there exist integers y_j ∈ {−1, 1} for which \sum_{j=1}^{m} y_j \cdot s_j = 0.
• In the reduced problem, we will have n = m + 1 unknown coefficients c_1, ..., c_m, c_{m+1}.
• The parameters c_i correspond to y_i, and c_{m+1} = 1.
• We build tuples corresponding to y_i = 1 and y_i = −1 for i ≤ m, to c_{m+1} = 1, and to c_{m+1} + \sum_{i=1}^{m} y_i \cdot s_i = 1.
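For reference, here is the source problem stated as code: an exhaustive check, exponential in m, which is exactly why a polynomial-time reduction to pred(α) estimation yields NP-hardness (the function name is ours).

```python
# Do signs y_j in {-1, +1} exist with sum_j y_j * s_j = 0?
from itertools import product

def balanced_partition_exists(s):
    return any(sum(y * w for y, w in zip(signs, s)) == 0
               for signs in product((-1, 1), repeat=len(s)))

print(balanced_partition_exists([3, 1, 1, 2, 2, 1]))   # True: 3+2 = 1+1+2+1
```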
10. Proof (cont-d)

• For each y_i = 1, and similarly for c_{m+1} = 1, we build two tuples.
• In the first tuple, x_{ik} = 1 + ε, x_{jk} = 0 for all j ≠ i, and y_k = 1.
• The resulting linear term has the form c_i · (1 + ε).
• Thus, the corresponding inequality takes the form 1 − ε ≤ (1 + ε) · c_i ≤ 1 + ε, i.e., equivalently, (1 − ε)/(1 + ε) ≤ c_i ≤ 1.
• In the second tuple, x_{ik} = 1 − ε, x_{jk} = 0 for all j ≠ i, and y_k = 1.
• The resulting linear term has the form c_i · (1 − ε).
• Thus, the corresponding inequality takes the form 1 − ε ≤ (1 − ε) · c_i ≤ 1 + ε, i.e., equivalently, 1 ≤ c_i ≤ (1 + ε)/(1 − ε).
• It should be mentioned that the only value c_i that satisfies both inequalities is the value c_i = 1.
11. Proof (cont-d)

• Similarly, for each y_i = −1, we build two tuples.
• In the first tuple, x_{ik} = 1 + ε, x_{jk} = 0 for all j ≠ i, and y_k = −1.
• The resulting linear term has the form c_i · (1 + ε).
• Thus, the corresponding inequality takes the form −1 − ε ≤ (1 + ε) · c_i ≤ −1 + ε, i.e., equivalently, −1 ≤ c_i ≤ −(1 − ε)/(1 + ε).
• In the second tuple, x_{ik} = 1 − ε, x_{jk} = 0 for all j ≠ i, and y_k = −1.
• The resulting linear term has the form c_i · (1 − ε).
• Thus, the corresponding inequality takes the form −1 − ε ≤ (1 − ε) · c_i ≤ −1 + ε, i.e., equivalently, −(1 + ε)/(1 − ε) ≤ c_i ≤ −1.
• Here also, the only value c_i that satisfies both inequalities is the value c_i = −1.
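Putting the construction together, a sketch of the instance builder follows. It mirrors the pattern of the three proof slides above; ε and the threshold M are fixed in the remainder of the full proof, so the interface and values here are illustrative assumptions, not the paper's exact bookkeeping.

```python
def build_reduction_tuples(s, eps):
    """Tuples (x_1k, ..., x_nk, y_k) encoding the weights s_1..s_m; n = m + 1."""
    m = len(s)
    n = m + 1
    tuples = []
    for i in range(m):                      # pairs forcing c_i = +1 or c_i = -1
        for y_val in (1, -1):
            for x_val in (1 + eps, 1 - eps):
                x = [0.0] * n
                x[i] = x_val
                tuples.append((tuple(x), y_val))
    for x_val in (1 + eps, 1 - eps):        # pair forcing c_{m+1} = 1
        x = [0.0] * n
        x[m] = x_val
        tuples.append((tuple(x), 1))
    # Tuple for c_{m+1} + sum_i c_i * s_i = 1, i.e., sum_i c_i * s_i = 0:
    tuples.append((tuple(float(w) for w in s) + (1.0,), 1))
    return tuples
```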