2005 SAMSI Undergraduate Workshop
Statistical View of Linear Least Squares
Minjung Kyung, mkyung@stat.ncsu.edu
May 22, 2005
Introduction to Linear Regression

• A functional relation between two variables is expressed by a mathematical formula.
  – X denotes the independent variable
  – Y denotes the dependent variable
  – A functional relation has the form Y = f(X).
  – Given a particular value of X, the function f indicates the corresponding value of Y.
Example of Functional Relation

[Figure: plot of Y against X illustrating an exact functional (straight-line) relation]
Introduction to Linear Regression

• A statistical relation between two variables:
  – is not a perfect one
  – in general, the observations for a statistical relation do not fall directly on the curve of relationship
Scatter Plot and Line of Statistical Relationship

[Figure: two panels plotting Work Hrs against Lot Size; the right panel adds the fitted line y = 62.37 + 3.57x]
Curvilinear Statistical Relation Example

[Figure: plot of Prognosis against Days showing a curvilinear statistical relation]
Introduction to Linear Regression

• A regression model is a formal means of expressing the two essential ingredients of a statistical relation:
  1. A tendency of the response variable Y to vary with the predictor variable X in a systematic fashion.
  2. A scattering of points around the curve of statistical relationship.
Simple Linear Regression Model

Y_i = \beta_0 + \beta_1 X_i + \epsilon_i

• Y_i is the value of the response variable in the i-th trial
• \beta_0 and \beta_1 are parameters (the regression coefficients)
• X_i is the value of the predictor variable in the i-th trial
• \epsilon_i is a random error term
Simple Linear Regression Model

Model Assumptions
1. The error terms are normally distributed with mean 0 and variance \sigma^2 for all values of i:
   \epsilon_i \sim N(0, \sigma^2)
2. The error terms \epsilon_i and \epsilon_j are independent if i \neq j.
3. Although the model explicitly allows for measurement error in Y, measurements made on X are known precisely (there is no measurement error).
Simple Linear Regression Model

Important Features of the Simple Linear Regression Model
1. The response Y_i is the sum of two components: the deterministic term \beta_0 + \beta_1 X_i and the random error term \epsilon_i. Therefore, Y_i is a random variable.
2. The response Y_i comes from a probability distribution whose mean is E[Y_i] = \beta_0 + \beta_1 X_i.
3. The response Y_i exceeds or falls short of the value of the regression function by the error term amount \epsilon_i.
4. The responses Y_i have the same constant variance as the error term \epsilon_i:
   var[Y_i] = var[\beta_0 + \beta_1 X_i + \epsilon_i] = \sigma^2.
5. The responses Y_i and Y_j are uncorrelated, since the error terms \epsilon_i and \epsilon_j are uncorrelated.

In summary, the responses Y_i come from normal distributions with mean E[Y_i] = \beta_0 + \beta_1 X_i and variance \sigma^2, the same for all levels of X. Further, any two responses Y_i and Y_j are uncorrelated:

Y_i \sim N(\beta_0 + \beta_1 X_i, \sigma^2), independently for each i
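The model summarized above can be simulated directly. The sketch below, assuming NumPy is available, draws responses from N(\beta_0 + \beta_1 X_i, \sigma^2) at fixed predictor values; the parameter values are hypothetical illustrations, not taken from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical true parameters (illustrative values only)
beta0, beta1, sigma = 62.0, 3.5, 10.0

X = np.linspace(20, 120, 25)                 # fixed predictor values (no measurement error)
eps = rng.normal(0.0, sigma, size=X.size)    # i.i.d. N(0, sigma^2) error terms
Y = beta0 + beta1 * X + eps                  # responses: mean beta0 + beta1*X, variance sigma^2

print(Y[:3])
```

Each Y_i scatters around the regression line beta0 + beta1*X by exactly its error term eps_i.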
Steps for Selecting an Appropriate Regression Model
1. Exploratory data analysis
2. Develop one or more tentative regression models
3. Examine the tentative regression models for their appropriateness for the data at hand, and revise them (or develop new models) as needed
4. Make inferences on the basis of the selected regression model
Estimation of the Regression Function

Method of least squares
• For the observations (X_i, Y_i), consider the deviation of Y_i from its expected value:
  Y_i - (\beta_0 + \beta_1 X_i).
• Consider the sum of the squared deviations:

  Q = \sum_{i=1}^{n} (Y_i - \beta_0 - \beta_1 X_i)^2.     (1)

• The estimators of \beta_0 and \beta_1 are those values \hat{\beta}_0 and \hat{\beta}_1 that minimize Q for the given sample observations (X_1, Y_1), (X_2, Y_2), \ldots, (X_n, Y_n).
Estimation of the Regression Function

Least squares estimators
• The estimators \hat{\beta}_0 and \hat{\beta}_1 that satisfy the least squares criterion can be found in two ways:
  1. Numerical search procedures
  2. Analytical procedures
• We will use the analytical approach.
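The numerical-search route can be sketched as follows: minimize Q from equation (1) with a general-purpose optimizer and compare with the closed-form answer. This is a sketch assuming NumPy and SciPy are available; the (X, Y) values are hypothetical toy data.

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical toy data
X = np.array([30.0, 50.0, 70.0, 90.0, 110.0])
Y = np.array([170.0, 250.0, 300.0, 390.0, 450.0])

def Q(b):
    """Sum of squared deviations from equation (1)."""
    b0, b1 = b
    return np.sum((Y - b0 - b1 * X) ** 2)

# Numerical search: minimize Q over (beta0, beta1)
res = minimize(Q, x0=[1.0, 1.0], method="Nelder-Mead",
               options={"xatol": 1e-6, "fatol": 1e-6, "maxiter": 5000})
b0_hat, b1_hat = res.x

# Analytical least squares solution for comparison
b1_exact = np.sum((X - X.mean()) * (Y - Y.mean())) / np.sum((X - X.mean()) ** 2)
b0_exact = Y.mean() - b1_exact * X.mean()
print(b0_hat, b1_hat)
```

Both routes land on the same minimizer; the analytical formulas are preferred here because they are exact and cheap.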
Estimation of the Regression Function

Least squares estimators
• The values of \beta_0 and \beta_1 that minimize Q can be derived by differentiating (1) with respect to \beta_0 and \beta_1 and setting the results equal to 0:

  \frac{\partial Q}{\partial \beta_0} = -2 \sum_{i=1}^{n} (Y_i - \beta_0 - \beta_1 X_i) = 0

  \frac{\partial Q}{\partial \beta_1} = -2 \sum_{i=1}^{n} X_i (Y_i - \beta_0 - \beta_1 X_i) = 0

• Simplifying, we get the normal equations:

  \sum_{i=1}^{n} Y_i - n\beta_0 - \beta_1 \sum_{i=1}^{n} X_i = 0
  \sum_{i=1}^{n} X_i Y_i - \beta_0 \sum_{i=1}^{n} X_i - \beta_1 \sum_{i=1}^{n} X_i^2 = 0

• The normal equations can be solved simultaneously to get estimates of the parameters \beta_0 and \beta_1:

  \hat{\beta}_1 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n} (X_i - \bar{X})^2}

  \hat{\beta}_0 = \frac{1}{n} \left( \sum_{i=1}^{n} Y_i - \hat{\beta}_1 \sum_{i=1}^{n} X_i \right) = \bar{Y} - \hat{\beta}_1 \bar{X}

  where \bar{X} and \bar{Y} are the means of the X and Y observations, respectively.
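The closed-form estimators above translate directly into code. A minimal sketch, assuming NumPy, on hypothetical toy data:

```python
import numpy as np

# Hypothetical toy data
X = np.array([30.0, 50.0, 70.0, 90.0, 110.0])
Y = np.array([170.0, 250.0, 300.0, 390.0, 450.0])

Xbar, Ybar = X.mean(), Y.mean()

# Least squares estimates from solving the normal equations
b1 = np.sum((X - Xbar) * (Y - Ybar)) / np.sum((X - Xbar) ** 2)
b0 = Ybar - b1 * Xbar
print(b0, b1)   # → 67.0 3.5
```

Note that only sums of products and squares of the data are needed, which is why these formulas were practical long before computers.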
Estimation of the Regression Function

Residuals
• The fitted value for the i-th case:
  \hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_i
• The i-th residual is the difference between the observed value Y_i and the fitted value \hat{Y}_i:
  e_i = Y_i - \hat{Y}_i = Y_i - (\hat{\beta}_0 + \hat{\beta}_1 X_i).
• Model error term: \epsilon_i = Y_i - (\beta_0 + \beta_1 X_i)
  – Represents the vertical deviation of Y_i from the unknown true regression line.
• Residual: e_i = Y_i - \hat{Y}_i
  – Represents the vertical deviation of Y_i from the fitted value \hat{Y}_i on the estimated regression line.
  – Residuals are useful for studying whether a given regression model is appropriate for the given data.
Properties of the Fitted Regression Line

• \sum_{i=1}^{n} e_i = 0.
• \sum_{i=1}^{n} e_i^2 is a minimum.
• \sum_{i=1}^{n} Y_i = \sum_{i=1}^{n} \hat{Y}_i.
• \sum_{i=1}^{n} X_i e_i = 0.
• \sum_{i=1}^{n} \hat{Y}_i e_i = 0.
• The regression line always goes through the point (\bar{X}, \bar{Y}).
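These properties can be verified numerically. A sketch assuming NumPy, reusing hypothetical toy data: fit the line, form the residuals, and check that the sums above vanish (up to floating-point rounding).

```python
import numpy as np

# Hypothetical toy data
X = np.array([30.0, 50.0, 70.0, 90.0, 110.0])
Y = np.array([170.0, 250.0, 300.0, 390.0, 450.0])

b1 = np.sum((X - X.mean()) * (Y - Y.mean())) / np.sum((X - X.mean()) ** 2)
b0 = Y.mean() - b1 * X.mean()

Yhat = b0 + b1 * X          # fitted values
e = Y - Yhat                # residuals

print(e.sum())                          # sum of residuals: 0
print((X * e).sum())                    # sum of X_i * e_i: 0
print((Yhat * e).sum())                 # sum of Yhat_i * e_i: 0
print(b0 + b1 * X.mean() - Y.mean())    # line passes through (Xbar, Ybar): 0
```

These identities follow directly from the normal equations, so they hold for any data set, not just this one.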
Estimation of \sigma^2

• A variety of inferences concerning the regression function require an estimate of \sigma^2.
  – To get an estimate of \sigma^2, first compute the error sum of squares (residual sum of squares):

    SSE = \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2 = \sum_{i=1}^{n} e_i^2.

  – The mean square error (MSE) is computed as

    MSE = \frac{SSE}{n - 2}.

  – It can be shown that MSE is an unbiased estimator of \sigma^2.
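The SSE and MSE computations can be sketched as follows, assuming NumPy and the same hypothetical toy data; the divisor n - 2 reflects the two estimated parameters.

```python
import numpy as np

# Hypothetical toy data
X = np.array([30.0, 50.0, 70.0, 90.0, 110.0])
Y = np.array([170.0, 250.0, 300.0, 390.0, 450.0])
n = X.size

b1 = np.sum((X - X.mean()) * (Y - Y.mean())) / np.sum((X - X.mean()) ** 2)
b0 = Y.mean() - b1 * X.mean()
e = Y - (b0 + b1 * X)       # residuals

SSE = np.sum(e ** 2)        # error (residual) sum of squares
MSE = SSE / (n - 2)         # unbiased estimator of sigma^2
print(SSE, MSE)
```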
Matrix Approach to Least Squares

• The regression model Y_i = \beta_0 + \beta_1 X_i + \epsilon_i can be written in matrix notation as

  \mathbf{Y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\epsilon}

  where

  \mathbf{Y} = \begin{pmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{pmatrix}, \quad
  \mathbf{X} = \begin{pmatrix} 1 & X_1 \\ 1 & X_2 \\ \vdots & \vdots \\ 1 & X_n \end{pmatrix}, \quad
  \boldsymbol{\beta} = \begin{pmatrix} \beta_0 \\ \beta_1 \end{pmatrix}, \quad
  \boldsymbol{\epsilon} = \begin{pmatrix} \epsilon_1 \\ \epsilon_2 \\ \vdots \\ \epsilon_n \end{pmatrix}
Matrix Approach to Least Squares

• The normal equations in matrix form are

  \mathbf{X}'\mathbf{X}\boldsymbol{\beta} = \mathbf{X}'\mathbf{Y}

• The model parameters can be estimated as follows:

  \hat{\boldsymbol{\beta}} = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{Y}
Matrix Approach to Least Squares

• The residuals are computed using

  \mathbf{e} = \mathbf{Y} - \hat{\mathbf{Y}} = \mathbf{Y} - \mathbf{X}\hat{\boldsymbol{\beta}}, \quad
  \hat{\mathbf{Y}} = \begin{pmatrix} \hat{\beta}_0 + \hat{\beta}_1 X_1 \\ \vdots \\ \hat{\beta}_0 + \hat{\beta}_1 X_n \end{pmatrix}

• The estimate of \sigma^2 is computed as

  \hat{\sigma}^2 = \frac{\mathbf{e}'\mathbf{e}}{n - 2}
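The matrix formulation above maps onto a few lines of linear algebra code. A sketch assuming NumPy and the same hypothetical toy data; solving the normal equations directly is numerically preferable to forming the inverse (X'X)^{-1} explicitly.

```python
import numpy as np

# Hypothetical toy data
x = np.array([30.0, 50.0, 70.0, 90.0, 110.0])
Y = np.array([170.0, 250.0, 300.0, 390.0, 450.0])
n = x.size

X = np.column_stack([np.ones(n), x])      # design matrix with rows (1, X_i)

# Solve the normal equations X'X beta = X'Y
beta = np.linalg.solve(X.T @ X, X.T @ Y)  # equals (X'X)^{-1} X'Y

e = Y - X @ beta                          # residual vector e = Y - X beta_hat
sigma2 = (e @ e) / (n - 2)                # estimate of sigma^2: e'e / (n - 2)
print(beta, sigma2)
```

The same code handles multiple predictors unchanged: just add columns to the design matrix.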
Inferences in Regression Analysis

Inferences concerning \beta_1
• Point estimator of \beta_1:

  \hat{\beta}_1 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n} (X_i - \bar{X})^2}

• Estimate of the standard error of \hat{\beta}_1:

  SE[\hat{\beta}_1] = \sqrt{\frac{MSE}{\sum_{i=1}^{n} (X_i - \bar{X})^2}}

• Confidence interval for \beta_1:

  \hat{\beta}_1 \pm t_{1-\alpha/2;\, n-2} \, SE[\hat{\beta}_1]
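A 95% confidence interval for beta_1 can be sketched as follows, assuming NumPy and SciPy and the same hypothetical toy data; `stats.t.ppf` supplies the t_{1-alpha/2; n-2} quantile.

```python
import numpy as np
from scipy import stats

# Hypothetical toy data
X = np.array([30.0, 50.0, 70.0, 90.0, 110.0])
Y = np.array([170.0, 250.0, 300.0, 390.0, 450.0])
n = X.size

Sxx = np.sum((X - X.mean()) ** 2)
b1 = np.sum((X - X.mean()) * (Y - Y.mean())) / Sxx
b0 = Y.mean() - b1 * X.mean()
e = Y - (b0 + b1 * X)
MSE = np.sum(e ** 2) / (n - 2)

se_b1 = np.sqrt(MSE / Sxx)                # standard error of b1
tcrit = stats.t.ppf(0.975, df=n - 2)      # t_{1-alpha/2; n-2} for alpha = 0.05
ci = (b1 - tcrit * se_b1, b1 + tcrit * se_b1)
print(ci)
```

The interval is wide here because n - 2 = 3 degrees of freedom make the t critical value large.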
Inferences in Regression Analysis

Inferences concerning \beta_1
• To test H_0: \beta_1 = 0 vs. H_a: \beta_1 \neq 0:
  – Test statistic:

    t = \frac{\hat{\beta}_1 - 0}{SE[\hat{\beta}_1]}

  – p-value: the probability that a t_{n-2} random variable exceeds |t| in absolute value
  – If the p-value < \alpha, we reject H_0.
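The test can be sketched as follows, assuming NumPy and SciPy and the same hypothetical toy data; `stats.t.sf` gives the upper-tail probability of the t_{n-2} distribution, doubled for a two-sided test.

```python
import numpy as np
from scipy import stats

# Hypothetical toy data
X = np.array([30.0, 50.0, 70.0, 90.0, 110.0])
Y = np.array([170.0, 250.0, 300.0, 390.0, 450.0])
n = X.size

Sxx = np.sum((X - X.mean()) ** 2)
b1 = np.sum((X - X.mean()) * (Y - Y.mean())) / Sxx
b0 = Y.mean() - b1 * X.mean()
e = Y - (b0 + b1 * X)
MSE = np.sum(e ** 2) / (n - 2)
se_b1 = np.sqrt(MSE / Sxx)

t = (b1 - 0.0) / se_b1                    # test statistic for H0: beta1 = 0
p = 2 * stats.t.sf(abs(t), df=n - 2)      # two-sided p-value
print(t, p)                               # reject H0 when p < alpha
```

Here the slope is large relative to its standard error, so H_0 is rejected at any conventional alpha.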