CS 147: Computer Systems Performance Analysis
Linear Regression Models
2015-06-15
Overview

◮ What is a (good) model?
◮ Estimating Model Parameters
◮ Allocating Variation
◮ Confidence Intervals for Regressions
  ◮ Parameter Intervals
  ◮ Prediction Intervals
◮ Verifying Regression
What Is a (Good) Model?

◮ For correlated data, a model predicts the response given an input
◮ The model should be an equation that fits the data
◮ The standard definition of "fits" is least-squares:
  ◮ Minimize the squared error
  ◮ Keep the mean error zero
  ◮ Minimizes the variance of the errors
Least-Squared Error

◮ If \hat{y} = b_0 + b_1 x, then the error in the estimate for x_i is e_i = y_i - \hat{y}_i
◮ Minimize the Sum of Squared Errors (SSE):

  \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (y_i - b_0 - b_1 x_i)^2

◮ Subject to the constraint:

  \sum_{i=1}^{n} e_i = \sum_{i=1}^{n} (y_i - b_0 - b_1 x_i) = 0
Estimating Model Parameters

◮ The best regression parameters are

  b_1 = \frac{\sum x_i y_i - n \bar{x} \bar{y}}{\sum x_i^2 - n \bar{x}^2}, \qquad b_0 = \bar{y} - b_1 \bar{x}

  where

  \bar{x} = \frac{1}{n} \sum x_i, \qquad \bar{y} = \frac{1}{n} \sum y_i

◮ Note that the book may have errors in these equations!
Parameter Estimation Example

◮ Execution time of a script for various loop counts:

  Loops | 3   | 5   | 7   | 9   | 10
  Time  | 1.2 | 1.7 | 2.5 | 2.9 | 3.3

◮ \bar{x} = 6.8, \bar{y} = 2.32, \sum xy = 88.54, \sum x^2 = 264
◮ b_1 = \frac{88.54 - 5(6.8)(2.32)}{264 - 5(6.8)^2} = 0.29
◮ b_0 = 2.32 - (0.29)(6.8) = 0.35
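The slope and intercept above can be checked with a short script (a sketch in plain Python, working from the summary sums quoted on the slide; note that carrying full precision through b_1 gives an intercept near 0.32, while the slide's 0.35 comes from plugging in the already-rounded slope 0.29):

```python
# Least-squares estimates from the summary statistics on the slide.
n = 5
mean_x, mean_y = 6.8, 2.32      # sample means of loop count and time
sum_xy, sum_x2 = 88.54, 264.0   # sums quoted on the slide

b1 = (sum_xy - n * mean_x * mean_y) / (sum_x2 - n * mean_x ** 2)
b0 = mean_y - b1 * mean_x

print(b1)  # ≈ 0.2945, which rounds to the slide's 0.29
print(b0)  # ≈ 0.3173 with the unrounded slope
```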
Graph of Parameter Estimation Example

[Figure: scatter plot of the five (loops, time) data points with the fitted regression line; x-axis 0 to 12, y-axis 0 to 3]
Allocating Variation

Analysis of Variation (ANOVA):
◮ If no regression, the best guess of y is \bar{y}
◮ Observed values of y differ from \bar{y}, giving rise to errors (variance)
◮ Regression gives a better guess, but there are still errors
◮ We can evaluate the quality of the regression by allocating sources of errors
The Total Sum of Squares

Without regression, the squared error is

  SST = \sum_{i=1}^{n} (y_i - \bar{y})^2
      = \sum_{i=1}^{n} (y_i^2 - 2 y_i \bar{y} + \bar{y}^2)
      = \left( \sum_{i=1}^{n} y_i^2 \right) - 2 \bar{y} \sum_{i=1}^{n} y_i + n \bar{y}^2
      = \left( \sum_{i=1}^{n} y_i^2 \right) - 2 \bar{y} (n \bar{y}) + n \bar{y}^2
      = \left( \sum_{i=1}^{n} y_i^2 \right) - n \bar{y}^2
      = SSY - SS0
The Sum of Squares from Regression

◮ Recall that the regression error is SSE = \sum e_i^2 = \sum (y_i - \hat{y}_i)^2
◮ Error without regression is SST (previous slide)
◮ So regression explains SSR = SST - SSE
◮ Regression quality is measured by the coefficient of determination

  R^2 = \frac{SSR}{SST} = \frac{SST - SSE}{SST}
Evaluating the Coefficient of Determination

◮ Compute SST = \left( \sum y^2 \right) - n \bar{y}^2
◮ Compute SSE = \sum y^2 - b_0 \sum y - b_1 \sum xy
◮ Compute R^2 = \frac{SST - SSE}{SST}
Example of Coefficient of Determination

For the previous regression example:

  Loops | 3   | 5   | 7   | 9   | 10
  Time  | 1.2 | 1.7 | 2.5 | 2.9 | 3.3

◮ \sum y = 11.60, \sum y^2 = 29.79, \sum xy = 88.54, n \bar{y}^2 = 5(2.32)^2 = 26.9
◮ SSE = 29.79 - (0.35)(11.60) - (0.29)(88.54) = 0.05
◮ SST = 29.79 - 26.9 = 2.89
◮ SSR = 2.89 - 0.05 = 2.84
◮ R^2 = (2.89 - 0.05)/2.89 = 0.98
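The allocation above can be reproduced in a few lines (a sketch using the slide's rounded parameters b_0 = 0.35 and b_1 = 0.29):

```python
# Allocate variation using the sums and fitted parameters from the slides.
n, mean_y = 5, 2.32
sum_y, sum_y2, sum_xy = 11.60, 29.79, 88.54
b0, b1 = 0.35, 0.29

sst = sum_y2 - n * mean_y ** 2            # total variation, ≈ 2.88
sse = sum_y2 - b0 * sum_y - b1 * sum_xy   # left unexplained by regression, ≈ 0.05
ssr = sst - sse                           # explained by regression
r2 = ssr / sst                            # coefficient of determination

print(round(r2, 2))  # 0.98
```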
Standard Deviation of Errors

◮ The variance of the errors is SSE divided by its degrees of freedom
◮ DOF is n - 2 because we've calculated 2 regression parameters from the data
◮ So the variance (mean squared error, MSE) is SSE / (n - 2)
◮ The standard deviation of errors is the square root:

  s_e = \sqrt{\frac{SSE}{n - 2}}

  (minor error in book)
Checking Degrees of Freedom

Degrees of freedom always equate:
◮ SS0 has 1 (computed from \bar{y})
◮ SST has n - 1 (computed from data and \bar{y}, which uses up 1)
◮ SSE has n - 2 (needs 2 regression parameters)
◮ So

  SST = SSY - SS0 = SSR + SSE
  n - 1 = n - 1 = 1 + (n - 2)
Example of Standard Deviation of Errors

◮ For the regression example, SSE was 0.05, so MSE is 0.05/3 = 0.017 and s_e = 0.13
◮ Note the high quality of our regression:
  ◮ R^2 = 0.98
  ◮ s_e = 0.13
◮ Why such a nice straight-line fit?
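Continuing the example, the standard deviation of errors follows directly (a sketch using the rounded SSE = 0.05 from the slide):

```python
import math

# Standard deviation of errors for the running example.
sse, n = 0.05, 5
mse = sse / (n - 2)    # variance of errors, with n - 2 degrees of freedom
s_e = math.sqrt(mse)   # standard deviation of errors

print(round(s_e, 2))   # 0.13
```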