

1. Lecture 13. Nonparametric GLMs
Nan Ye, School of Mathematics and Physics, University of Queensland

2. Nonparametric Models
Parametric models
• Fixed structure and number of parameters.
• Represent a fixed class of functions.
Nonparametric models
• Flexible structure, where the number of parameters usually grows as more data becomes available.
• The class of functions represented depends on the data.
• Not models without parameters, but nonparametric in the sense that they do not have the fixed structure and number of parameters of parametric models.

3. This Lecture
• k-NN
• LOESS
• Splines

4. k-NN Regression Algorithm
• Training set is (x_1, y_1), ..., (x_n, y_n).
• To compute E(Y | x) for any x:
  • N_k(x) ← nearest k training examples.
  • Predict the average response over the examples in N_k(x).
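As an illustration, the following is a minimal k-NN regression sketch in R (the function knn_predict and its arguments are illustrative names, not from the lecture):

# Minimal k-NN regression for one-dimensional inputs.
knn_predict <- function(x, train_x, train_y, k = 5) {
  d <- abs(train_x - x)        # distances from x to all training inputs
  nbrs <- order(d)[1:k]        # indices of the k nearest training examples
  mean(train_y[nbrs])          # average response over N_k(x)
}

# Example on the cars data used later in this lecture:
# knn_predict(10, cars$speed, cars$dist, k = 5)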

5. Effect of k
• Training error is zero when k = 1 and, roughly speaking, increases as k increases.
• However, the fitted 1-NN model is often not smooth and does not work well on test data.
• Cross-validation can be used to choose a suitable k; see the sketch below.
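A sketch of choosing k by leave-one-out cross-validation on the cars data (the helper loocv_mse is an illustrative name, not from the lecture):

# LOOCV mean squared error for k-NN regression on one-dimensional inputs.
loocv_mse <- function(k, x, y) {
  n <- length(y)
  preds <- sapply(1:n, function(i) {
    d <- abs(x[-i] - x[i])     # distances from the held-out point to the rest
    nbrs <- order(d)[1:k]      # its k nearest neighbours among the rest
    mean(y[-i][nbrs])          # k-NN prediction for the held-out point
  })
  mean((y - preds)^2)
}

errs <- sapply(1:10, loocv_mse, x = cars$speed, y = cars$dist)
which.min(errs)                # the k with the smallest LOOCV error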

6. Remarks
• k-NN is data inefficient: for high-dimensional problems, the amount of data required for good performance is often huge.
• k-NN is computationally inefficient: naively, predicting on m test examples requires O(nmk) time. This can be improved, but k-NN remains slow.

7. LOESS (LOcal regrESSion)
Idea
• Training set is (x_1, y_1), ..., (x_n, y_n).
• To compute E(Y | x) for any x:
  • N_α(x) ← nearest nα training examples.
  • Perform a weighted linear regression using N_α(x).
  • Evaluate the fitted linear model at x.
• The locality parameter α controls the neighborhood size.

8. Details
• The local weighted linear regression solves

    θ = argmin_β ∑_{(x′, y′) ∈ N_α(x)} w(‖x − x′‖)(y′ − β⊤x′)².

• The weight function w is defined by

    w(d) = (1 − d³/M³)³,

  where M = max(1, α)^{1/p} max_{(x′, y′) ∈ N_α(x)} ‖x − x′‖ is the scaled maximum distance.
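A sketch of one such local fit in R at a single query point x0 (x0, alpha, and the degree-1 fit are illustrative choices; this mimics what loess() does at each point rather than calling its internals):

# One local weighted linear fit at the query point x0.
x0 <- 10
alpha <- 0.5
n_nbrs <- ceiling(alpha * nrow(cars))   # neighborhood holds a fraction alpha of the data
d <- abs(cars$speed - x0)               # distances from x0 to all training inputs
nbrs <- order(d)[1:n_nbrs]              # indices of the nearest points
M <- max(d[nbrs])                       # for alpha <= 1, the scaling max(1, alpha)^(1/p) is 1
w <- (1 - (d[nbrs] / M)^3)^3            # tricube weights
fit <- lm(dist ~ speed, data = cars[nbrs, ], weights = w)
predict(fit, data.frame(speed = x0))    # LOESS-style prediction at x0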

9. Effect of α
• If α is very small, the neighborhood may have too few points for the weighted least squares problem to have a unique solution.
• In general, a smaller α makes the fitted surface more wiggly.
• As α → ∞, we have w(d) → 1 and θ becomes the OLS parameter, so LOESS converges to OLS as α → ∞.

10. LOESS with higher degree terms
• We can add higher degree terms, such as quadratic terms x_i x_j, before performing the regression.
• This can be helpful if the linear predictor does not work well.

11. Data

> head(cars)
  speed dist
1     4    2
2     4   10
3     7    4
4     7   22
5     8   16
6     9   10
> dim(cars)
[1] 50  2

12. Scatterplot
[Figure: scatterplot of dist against speed for the cars data.]

13. LOESS in R

a = 2
deg = 2
fit.loess <- loess(dist ~ speed, cars, span=a, degree=deg)
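To evaluate the fitted model at new inputs, R's predict method for loess objects can be used; the grid below is an illustrative choice:

# Predict on a grid of speeds and overlay the LOESS curve on the scatterplot.
grid <- data.frame(speed = seq(min(cars$speed), max(cars$speed), length.out = 100))
plot(dist ~ speed, data = cars)
lines(grid$speed, predict(fit.loess, grid))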

14. Comparison of OLS and LOESS
[Figure: cars scatterplot with the lm fit and the loess fit (a=2, d=2) overlaid.]
• The linearity assumption of OLS is rigid and does not adapt to the data's complexity.
• LOESS can adapt to the data's complexity through local regression, and fits the data better than OLS.

15. Effect of α
[Figure: cars scatterplot with loess fits for (a=.5, d=2) and (a=2, d=2) overlaid.]
Smaller α leads to a more wiggly fit.

16. Effect of degree
[Figure: cars scatterplot with loess fits for (a=.5, d=1) and (a=.5, d=2) overlaid.]
Higher degree leads to a more wiggly fit.

17. Splines
• A flat spline is a flexible drafting device used for drawing smooth curves.
• Mathematically, a spline is a smooth piecewise polynomial function.

18. Spline, order, and knots
• A function f : ℝ → ℝ is a spline of order k with knots at t_1 < ... < t_m if
  • f(x) is a polynomial of degree k on each of the intervals (−∞, t_1], [t_1, t_2], ..., [t_m, ∞), and
  • its i-th derivative f^(i)(x) is continuous at each knot, for each i = 0, ..., k − 1.
• Cubic splines (k = 3) are the most commonly used.
• Natural splines are linear beyond t_1 and t_m.

19. Truncated power basis
• An order-k spline with knots t_1, ..., t_m is a linear combination of the following k + m + 1 basis functions:

    h_1(x) = 1, h_2(x) = x, ..., h_{k+1}(x) = x^k,
    h_{k+1+j}(x) = (x − t_j)_+^k,  j = 1, ..., m,

  where (x)_+ = max(0, x) is the positive part function.
• These basis functions are called the truncated power basis.
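A sketch of constructing this basis in R for cubic splines (the function power_basis and the knot values are illustrative):

# Truncated power basis matrix for an order-k spline.
power_basis <- function(x, knots, k = 3) {
  poly_part <- outer(x, 0:k, `^`)                            # columns 1, x, ..., x^k
  trunc_part <- sapply(knots, function(t) pmax(x - t, 0)^k)  # columns (x - t_j)_+^k
  cbind(poly_part, trunc_part)                               # n x (k + m + 1) matrix
}

Z <- power_basis(cars$speed, knots = c(10, 15, 20))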

20. Spline regression as linear regression
• Training data: (x_1, y_1), ..., (x_n, y_n) ∈ ℝ × ℝ.
• Given knots t_1, ..., t_m, an order-k spline is fitted by least squares:

    β̂ = argmin_β ∑_{i=1}^{n} (β⊤z_i − y_i)²,

  where z_i = (h_1(x_i), ..., h_{k+1+m}(x_i)).
• The fitted spline is f̂(x) = ∑_i β̂_i h_i(x).
• The knots can be chosen in a data-dependent way (e.g. equally spaced between the minimum and maximum of x).
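Since the fit is just linear regression on a basis expansion, lm can do it directly. A sketch using R's splines package, whose bs() generates a B-spline basis spanning the same spline space as the truncated power basis but with better numerical behavior (the knot values are illustrative):

# Cubic spline regression fitted as a linear regression.
library(splines)
fit <- lm(dist ~ bs(speed, knots = c(10, 15, 20), degree = 3), data = cars)
grid <- data.frame(speed = seq(min(cars$speed), max(cars$speed), length.out = 100))
plot(dist ~ speed, data = cars)
lines(grid$speed, predict(fit, grid))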

21. What You Need to Know
• Nonparametric models can adapt to the data's complexity.
• k-NN: averaging over a neighborhood.
• LOESS: weighted linear regression over a neighborhood.
• Splines: fit smooth piecewise polynomials.
