
Nonparametric Methods
Michael R. Roberts
Department of Finance, The Wharton School, University of Pennsylvania
July 28, 2009

Overview
- Nonparametric methods are great for data analysis and robustness tests.
- They are also used extensively in program evaluation:
  1. Estimation of propensity scores
  2. Estimation of conditional regression functions
- The goal here is to introduce and operationalize nonparametric
  1. density estimation, and
  2. regression.

Probability Density Functions (PDF)
- A basic characteristic of a random variable $X$ is its PDF, $f$, or its CDF, $F$.
- Given a sample of observations $X_i$, $i = 1, \dots, N$, the goal is to estimate the PDF.
- Options:
  1. Parametric: assume a functional form for $f$ and estimate the parameters of the function, e.g., $N(\mu, \sigma^2)$.
  2. Nonparametric: estimate the full function, $f$, without assuming a particular functional form for $f$.
- Nonparametric methods “let the data speak.”
- We’re going to follow Silverman (1986) closely.

Histogram
- Origin: $x_0$
- Bin width: $h$ (a.k.a. window width)
- Bins: $[x_0 + mh,\; x_0 + (m+1)h)$ for $m \in \mathbb{Z}$
- Histogram:
$$\hat{f}(x) = \frac{1}{nh}\,(\text{number of } X_i \text{ in the same bin as } x)$$
  where $n = N$ is the sample size.
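The histogram definition above can be sketched in a few lines of Python. This is only an illustration: the function name `histogram_density` and the ten-point sample are hypothetical and do not come from the slides.

```python
import math

def histogram_density(x, sample, origin, h):
    """Histogram estimate at x: (1 / (n h)) * (# of X_i in the same bin as x)."""
    n = len(sample)
    # Index m of the bin [origin + m*h, origin + (m+1)*h) that contains x.
    m = math.floor((x - origin) / h)
    # Count the observations that fall in that same bin.
    count = sum(1 for xi in sample if math.floor((xi - origin) / h) == m)
    return count / (n * h)

# Hypothetical sample, chosen only for illustration.
sample = [1.0, 1.5, 2.0, 2.2, 3.9, 4.1, 4.3, 6.0, 7.5, 9.0]

# Same bin width, two different origins.
est_origin_0 = histogram_density(4.0, sample, origin=0.0, h=2.0)
est_origin_1 = histogram_density(4.0, sample, origin=1.0, h=2.0)
print(est_origin_0, est_origin_1)
```

With the origin at 0 the point 4.0 shares the bin [4, 6) with two observations, while with the origin at 1 it shares the bin [3, 5) with three, so the two estimates at the same point differ even though the bin width is unchanged.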

Sample Histograms
[Figure: sample histograms with $N = 100$, origin $= \min(X_i)$, bin width $= 0.79 \times \mathrm{IQR} \times N^{-1/5}$]

Sensitivity of Histograms
- The histogram estimate is sensitive to the choice of origin and bin width.

Naive Estimator
- The density, $f$, of a random variable $X$ can be written
$$f(x) = \lim_{h \to 0} \frac{1}{2h} \Pr(x - h < X < x + h)$$
- Given $h$, we can estimate $\Pr(x - h < X < x + h)$ by the proportion of observations falling in the interval (bin):
$$\hat{f}(x) = \frac{1}{2nh}\,[\text{number of } X_i \text{ falling in } (x - h, x + h)]$$
- Mathematically, this is just
$$\hat{f}(x) = \frac{1}{n} \sum_{i=1}^{n} \frac{1}{h}\, W\!\left(\frac{x - X_i}{h}\right)$$
  where $n = N$ is the sample size and
$$W(x) = \begin{cases} 1/2 & \text{if } |x| < 1 \\ 0 & \text{otherwise} \end{cases}$$

Naive Estimator - An Example
- Consider the sample $\{X_i\}_{i=1}^{10} = \{1, 2, 3, 4, 5, 6, 7, 8, 9, 10\}$.
- Let the bin width be $h = 2$. Then
$$\hat{f}(4) = \frac{1}{10}\left[\frac{1}{2} W\!\left(\frac{4-1}{2}\right) + \frac{1}{2} W\!\left(\frac{4-2}{2}\right) + \dots + \frac{1}{2} W\!\left(\frac{4-10}{2}\right)\right]$$
$$= \frac{1}{10}\left[0 + 0 + \frac{1}{2}\cdot\frac{1}{2} + \frac{1}{2}\cdot\frac{1}{2} + \frac{1}{2}\cdot\frac{1}{2} + 0 + \dots + 0\right] = \frac{3}{40}$$
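As a quick check of this arithmetic, the sketch below computes the naive estimate at x = 4 in two ways: by counting the observations in (x − h, x + h), and by summing the weight function W. The function names are hypothetical; the sample and h = 2 are the ones used in the example.

```python
def W(u):
    """Weight function of the naive estimator: 1/2 on (-1, 1), 0 otherwise."""
    return 0.5 if abs(u) < 1 else 0.0

def naive_density_weights(x, sample, h):
    """(1/n) * sum_i (1/h) * W((x - X_i) / h)."""
    n = len(sample)
    return sum(W((x - xi) / h) / h for xi in sample) / n

def naive_density_counts(x, sample, h):
    """(1 / (2 n h)) * (# of X_i falling in the open interval (x - h, x + h))."""
    n = len(sample)
    count = sum(1 for xi in sample if x - h < xi < x + h)
    return count / (2 * n * h)

sample = list(range(1, 11))  # 1, 2, ..., 10
f_hat_weights = naive_density_weights(4, sample, h=2)
f_hat_counts = naive_density_counts(4, sample, h=2)
print(f_hat_weights, f_hat_counts)  # both equal 3/40 = 0.075
```

Only the observations 3, 4 and 5 lie strictly inside (2, 6), so both forms give 3/40.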

Naive Estimator - An Example from Silverman
[Figure: naive estimator, an example from Silverman]

Naive Estimator - Discussion
- From the definition of $W(x)$, the estimate of $f$ is constructed by placing a box of width $2h$ and height $(2nh)^{-1}$ on each observation and summing.
- This is an attempt to construct a histogram where every point, $x$, is the center of a sampling interval $(x - h, x + h)$.
- We don’t need a choice of origin, $x_0$, anymore.
- The choice of bin width, $h$, remains and is crucial for controlling the degree of smoothing:
  - Large $h$ produces smoother estimates
  - Small $h$ produces more jagged estimates
- Drawbacks: $\hat{f}$ is discontinuous, with jumps at the points $X_i \pm h$ and zero derivative everywhere else.

Kernel Estimator - Definition & Intuition
- Replace the weight function $W$ in the naive estimator by a kernel function $K$ satisfying
$$\int_{-\infty}^{\infty} K(x)\,dx = 1$$
- The kernel estimator is
$$\hat{f}(x) = \frac{1}{nh} \sum_{i=1}^{n} K\!\left(\frac{x - X_i}{h}\right)$$
  where $n = N$ is the sample size and $h$ is the window width (also called the smoothing parameter or bandwidth).
- Intuition:
  - The naive estimator is a sum of boxes centered at the observations
  - The kernel estimator is a sum of bumps centered at the observations
  - The kernel choice determines the shape of the bumps
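A minimal sketch of the kernel estimator follows, written so that the kernel is passed in as an argument. Plugging in the naive weight function W reproduces the naive estimate, and swapping in a Gaussian kernel replaces the boxes with smooth bumps. The function names are hypothetical, and the sample 1, ..., 10 with h = 2 is reused from the naive-estimator example purely for comparison.

```python
import math

def rectangular_weight(u):
    """Weight function W of the naive estimator: 1/2 on (-1, 1), 0 otherwise."""
    return 0.5 if abs(u) < 1 else 0.0

def gaussian_kernel(u):
    """Standard normal density, a kernel that integrates to one."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def kernel_density(x, sample, h, kernel):
    """Kernel estimate at x: (1 / (n h)) * sum_i K((x - X_i) / h)."""
    n = len(sample)
    return sum(kernel((x - xi) / h) for xi in sample) / (n * h)

sample = list(range(1, 11))

# With K = W the kernel estimator is the naive estimator (3/40 at x = 4).
naive_at_4 = kernel_density(4, sample, h=2, kernel=rectangular_weight)

# With a Gaussian kernel every observation contributes a smooth bump.
smooth_at_4 = kernel_density(4, sample, h=2, kernel=gaussian_kernel)
print(naive_at_4, smooth_at_4)
```

Passing the kernel as a parameter keeps the estimator itself unchanged when the bump shape changes, which mirrors the slide's point that only the kernel choice determines the shape of the bumps.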

Kernel Estimator - Example
[Figure: kernel estimator example]

Varying the Window Width
[Figure: kernel estimates for different window widths]

Kernel Estimator - Example Discussion
- The X’s correspond to the data points (the sample: $N = 7$).
- Centered over each data point is a little curve (a bump), $\frac{1}{nh} K\!\left(\frac{x - X_i}{h}\right)$.
- The estimated density, $\hat{f}$, constructed by adding up the bumps at each data point, is also shown.
- As $h \to 0$, we get a sum of Dirac delta function spikes at the observations.
- If $K$ is a PDF, then so is $\hat{f}$.
- $\hat{f}$ inherits the continuity and differentiability properties of $K$.
- For data with long tails, spurious noise appears in the tails because the window width is fixed across the entire sample.
- If the window width is widened to smooth away detail in the tails, detail in the main part of the distribution is lost; adaptive methods address this problem.
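Two of these points can be checked numerically: the estimate integrates to one whenever the kernel is a PDF, and a small window width concentrates the mass in spikes at the observations. The sketch below assumes a Gaussian kernel and a hypothetical seven-point sample (the data behind the slide's figure are not available), and uses a simple midpoint rule for the integral.

```python
import math

def gaussian_kernel(u):
    """Standard normal density."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def kernel_density(x, sample, h):
    """Gaussian kernel estimate at x with window width h."""
    n = len(sample)
    return sum(gaussian_kernel((x - xi) / h) for xi in sample) / (n * h)

def integrate(func, lo, hi, steps):
    """Midpoint-rule approximation of the integral of func over [lo, hi]."""
    width = (hi - lo) / steps
    return sum(func(lo + (i + 0.5) * width) for i in range(steps)) * width

# Hypothetical sample of size 7, for illustration only.
sample = [0.0, 1.0, 2.5, 3.0, 6.0, 6.5, 9.0]

# The estimate integrates to (approximately) one for both window widths.
area_small_h = integrate(lambda x: kernel_density(x, sample, 0.1), -10.0, 20.0, 6000)
area_large_h = integrate(lambda x: kernel_density(x, sample, 2.0), -10.0, 20.0, 6000)

# In the gap between observations (x = 4.5) a small h leaves almost no mass,
# while a large h spreads mass into the gap.
gap_small_h = kernel_density(4.5, sample, 0.1)
gap_large_h = kernel_density(4.5, sample, 2.0)
print(area_small_h, area_large_h, gap_small_h, gap_large_h)
```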

Long Tail Data
[Figure: long-tailed data]

Sample Kernels: Definitions
- Rectangular (Uniform):
$$K(t) = \begin{cases} \frac{1}{2} & |t| < 1 \\ 0 & \text{otherwise} \end{cases}$$
- Triangular:
$$K(t) = \begin{cases} 1 - |t| & |t| < 1 \\ 0 & \text{otherwise} \end{cases}$$
- Epanechnikov:
$$K(t) = \begin{cases} \frac{3}{4}\left(1 - \frac{1}{5}t^2\right) / \sqrt{5} & |t| < \sqrt{5} \\ 0 & \text{otherwise} \end{cases}$$
- Biweight (Quartic):
$$K(t) = \begin{cases} \frac{15}{16}\left(1 - t^2\right)^2 & |t| < 1 \\ 0 & \text{otherwise} \end{cases}$$
- Triweight:
$$K(t) = \begin{cases} \frac{35}{32}\left(1 - t^2\right)^3 & |t| < 1 \\ 0 & \text{otherwise} \end{cases}$$
- Gaussian:
$$K(t) = \frac{1}{\sqrt{2\pi}}\, e^{-(1/2)t^2}$$
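The kernels listed above translate directly into code. The sketch below defines each one and checks numerically, with a midpoint rule, that it integrates to one as a kernel must. The function names and the integration grid are illustrative choices, not part of the slides.

```python
import math

SQRT5 = math.sqrt(5.0)

def rectangular(t):
    return 0.5 if abs(t) < 1 else 0.0

def triangular(t):
    return 1.0 - abs(t) if abs(t) < 1 else 0.0

def epanechnikov(t):
    # Support is (-sqrt(5), sqrt(5)) in this parameterization.
    return 0.75 * (1.0 - t * t / 5.0) / SQRT5 if abs(t) < SQRT5 else 0.0

def biweight(t):
    return 15.0 / 16.0 * (1.0 - t * t) ** 2 if abs(t) < 1 else 0.0

def triweight(t):
    return 35.0 / 32.0 * (1.0 - t * t) ** 3 if abs(t) < 1 else 0.0

def gaussian(t):
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)

def integrate(kernel, lo=-10.0, hi=10.0, steps=40_000):
    """Midpoint-rule approximation of the integral of kernel over [lo, hi]."""
    width = (hi - lo) / steps
    return sum(kernel(lo + (i + 0.5) * width) for i in range(steps)) * width

kernels = {
    "rectangular": rectangular,
    "triangular": triangular,
    "epanechnikov": epanechnikov,
    "biweight": biweight,
    "triweight": triweight,
    "gaussian": gaussian,
}
areas = {name: integrate(k) for name, k in kernels.items()}
for name, area in areas.items():
    print(f"{name:13s} integrates to approximately {area:.4f}")
```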

Sample Kernels - Figures
[Figure: plots of the sample kernels]

Measures of Discrepancy
- Mean Square Error (pointwise accuracy):
$$\mathrm{MSE}_x(\hat{f}) = E[\hat{f}(x) - f(x)]^2 = \underbrace{[E\hat{f}(x) - f(x)]^2}_{\text{squared bias}} + \underbrace{\operatorname{Var}\hat{f}(x)}_{\text{variance}}$$
- Bias-variance tradeoff: bias can be reduced at the expense of increased variance by adjusting the amount of smoothing.
- Mean Integrated Square Error (global accuracy):
$$\mathrm{MISE}(\hat{f}) = E\int [\hat{f}(x) - f(x)]^2\,dx = \underbrace{\int [E\hat{f}(x) - f(x)]^2\,dx}_{\text{integrated squared bias}} + \underbrace{\int \operatorname{Var}\hat{f}(x)\,dx}_{\text{integrated variance}}$$
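The decomposition of the MSE is stated without proof; the step behind it is the usual one of adding and subtracting $E\hat{f}(x)$. A short derivation, using only the symbols already defined:

```latex
\begin{align*}
E\big[\hat{f}(x) - f(x)\big]^2
  &= E\big[\big(\hat{f}(x) - E\hat{f}(x)\big) + \big(E\hat{f}(x) - f(x)\big)\big]^2 \\
  &= E\big[\hat{f}(x) - E\hat{f}(x)\big]^2
     + 2\,\big(E\hat{f}(x) - f(x)\big)\,E\big[\hat{f}(x) - E\hat{f}(x)\big]
     + \big[E\hat{f}(x) - f(x)\big]^2 \\
  &= \operatorname{Var}\hat{f}(x) + \big[E\hat{f}(x) - f(x)\big]^2 .
\end{align*}
```

The cross term vanishes because $E[\hat{f}(x) - E\hat{f}(x)] = 0$ and $E\hat{f}(x) - f(x)$ is not random. Integrating both sides over $x$ and exchanging the expectation and the integral (permissible because the integrand is nonnegative) gives the MISE decomposition.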

Useful Facts
- The bias is not directly a function of sample size $\Rightarrow$ increasing the sample size alone will not reduce bias; $\therefore$ we need to adjust the weight function (i.e., kernel).
- Bias is a function of the window width (and kernel) $\Rightarrow$ decreasing the window width reduces bias.
- If the window width is a function of sample size, then bias depends on sample size through the window width.

Choosing the Smoothing Parameter
- The optimal window width, derived as the minimizer of the (approximate) MISE, is a function of the unknown density $f$.
- The appropriate choice of smoothing parameter depends on the goal of the density estimation:
  1. If the goal is data exploration to guide models and hypotheses, subjective criteria are probably OK (see below).
  2. When drawing conclusions from the estimated density, undersmoothing is probably a good idea (it is easier to smooth than unsmooth a picture).

Reference to a Standard Distribution
- Use a standard family of distributions to assign a value to the unknown density in the optimal window width computation.
- E.g., assume $f$ is normal with variance $\sigma^2$ and use a Gaussian kernel $\Rightarrow$
$$h^* = 1.06\,\sigma\,n^{-1/5}$$
- Can estimate $\sigma$ from the data using the sample standard deviation (SD).
- If the population distribution is multimodal or heavily skewed, $h^*$ will oversmooth.

Robust Measures of Spread
- Can use a robust measure of spread ($R = \mathrm{IQR}$) to get a different optimal smoothing parameter,
$$h^* = 0.79\,R\,n^{-1/5}$$
  but this exacerbates the problems from multimodality/skew because it oversmooths.
- Can try
$$h^* = 1.06\,A\,n^{-1/5} \quad \text{or} \quad h^* = 0.9\,A\,n^{-1/5}$$
  where $A = \min(\mathrm{SD}, \mathrm{IQR}/1.34)$.
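The rule-of-thumb window widths can be computed directly from a sample. The sketch below is a minimal illustration under two assumptions: the sample is hypothetical, and the IQR uses the default quartile convention of Python's `statistics.quantiles` (other quartile conventions give slightly different values). The function name and dictionary keys are also made up for this example.

```python
import statistics

def rule_of_thumb_bandwidths(sample):
    """Window widths from the reference-distribution rules of thumb."""
    n = len(sample)
    sd = statistics.stdev(sample)                      # sample standard deviation
    q1, _, q3 = statistics.quantiles(sample, n=4)      # quartiles (default method)
    iqr = q3 - q1
    a = min(sd, iqr / 1.34)                            # A = min(SD, IQR / 1.34)
    scale = n ** (-1 / 5)
    return {
        "normal_reference": 1.06 * sd * scale,         # h* = 1.06 sigma n^(-1/5)
        "iqr_based": 0.79 * iqr * scale,               # h* = 0.79 R n^(-1/5)
        "adaptive_1.06": 1.06 * a * scale,             # h* = 1.06 A n^(-1/5)
        "adaptive_0.9": 0.9 * a * scale,               # h* = 0.9 A n^(-1/5)
    }

# Hypothetical sample with one large observation in the right tail.
sample = [1.2, 1.9, 2.4, 2.8, 3.1, 3.3, 3.9, 4.4, 5.0, 12.0]
bandwidths = rule_of_thumb_bandwidths(sample)
for rule, h in bandwidths.items():
    print(f"{rule:17s} h* = {h:.3f}")
```

In this sample the outlying observation inflates the SD, so IQR/1.34 is the smaller spread measure and the rules based on A give narrower windows than the normal-reference rule.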

Regression - Setup
- The basic problem is to estimate a function $m$:
$$y_i = m(x_i) + \varepsilon_i$$
  where $x_i$ is a scalar random variable (for ease) and $E(\varepsilon_i \mid x) = 0$.
- This is just a generalization of the linear model, $m(x_i) = x_i'\beta$.
- The goal is to estimate $m$.
