
When distributions fail: nonparametrics, permutations, and the bootstrap
Joshua Loftus
July 30, 2015

Remember these?

Sometimes we can’t use them

Why?

- Sample size too small for asymptotic distributions (e.g. can’t use the CLT)
- Shape is qualitatively wrong (e.g. skewness)
- Some other assumptions are unrealistic (e.g. equal variances, when the data are heteroscedastic)

Today we’ll discuss three related topics that are often used to address these issues: non-parametric methods, permutation methods, and the bootstrap.

(Non)parametric distributions

Hypothesis tests usually ask which distribution $F$ from a family of distributions $\mathcal{F}$ generated a given dataset. Often the family is parametrized, $\mathcal{F} = \{F_\theta\}_{\theta \in \Theta}$, i.e. each $F$ can be described by an associated value of the parameter $\theta$.

Non-parametric statistics refers to methods that either do not assume there is a nice family $\mathcal{F}$ to begin with, or allow the dimension of $\theta$ to be infinite or to grow with the sample size.

Non-parametric methods are sometimes combined with parametric distributions as well: non-parametric regression of $Y = f(X) + \epsilon$ does not assume $f$ is linear, but may assume $\epsilon \sim N(0, \sigma^2)$.
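To make the regression remark concrete, here is a minimal sketch that assumes a true curve f = sin and Gaussian noise (both chosen only for illustration) and compares a linear fit with `loess`, one of several non-parametric smoothers available in base R.

```r
set.seed(1)
# Hypothetical data: f is nonlinear, errors are N(0, 0.3^2)
x <- sort(runif(200, 0, 2 * pi))
y <- sin(x) + rnorm(200, sd = 0.3)

fit_lm <- lm(y ~ x)      # parametric: assumes f is linear
fit_np <- loess(y ~ x)   # non-parametric: local polynomial smoother

# Mean squared error of each fit against the true curve
mse_lm <- mean((predict(fit_lm) - sin(x))^2)
mse_np <- mean((predict(fit_np) - sin(x))^2)
c(linear = mse_lm, loess = mse_np)
```

The smoother makes no linearity assumption about f, while the simulated noise is still parametric (normal), which is the combination described above.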

Non-parametric testing

In consulting we usually encounter non-parametric methods in the context of two-sample tests. If the data don’t look normal and/or the sample size is too small to rely on asymptotic approximations, we may be skeptical of a very small p-value coming from a t test. What do we do?

- Mann-Whitney U test (aka Wilcoxon rank-sum test) if the samples are not paired.
- Wilcoxon signed-rank test (or the sign test, which is more general) for paired samples.
- McNemar’s test if interested in paired proportions rather than a location shift (median).
- Kruskal-Wallis if there are more than two groups (the rank-based analogue of one-way ANOVA).
- Kolmogorov-Smirnov to test whether one sample comes from a given distribution or whether two samples have equal distributions.
- And many more... (anything based on ranks or the ecdf)
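As a rough orientation, the tests listed above all have base R implementations in the `stats` package. The sketch below runs each on simulated data; the samples, shifts, and the 2x2 counts are made up for illustration, and the sign test is written as a binomial test on the number of positive differences.

```r
set.seed(42)
# Hypothetical skewed samples (exponential), with shifts chosen for illustration
a <- rexp(12)
b <- rexp(12) + 0.5
g <- rexp(12) + 1

p_values <- c(
  rank_sum    = wilcox.test(a, b)$p.value,                 # Mann-Whitney U, unpaired
  signed_rank = wilcox.test(a, b, paired = TRUE)$p.value,  # treats (a[i], b[i]) as pairs
  sign_test   = binom.test(sum(a > b), sum(a != b))$p.value,
  kruskal     = kruskal.test(list(a, b, g))$p.value,       # three groups
  ks_two      = ks.test(a, b)$p.value,                     # equal distributions?
  ks_one      = ks.test(a, "pexp")$p.value                 # does a follow Exp(1)?
)

# McNemar's test takes a 2x2 table of paired binary outcomes (made-up counts)
paired_counts <- matrix(c(10, 5, 2, 8), nrow = 2)
p_values["mcnemar"] <- mcnemar.test(paired_counts)$p.value

round(p_values, 4)
```

Which of these is appropriate depends on the design (paired or not, number of groups, type of outcome), so the calls are meant as a lookup aid rather than a recipe.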

What you need to do as a consultant

- Assess concern about possibly violated assumptions
- Check Wikipedia (seriously) to determine the appropriate non-parametric test and verify the assumptions of that test
- Think critically / sanity check: do you trust the conclusion now? Will others? Is n = 8 enough?
- Explain potential loss/gain of power

    x <- c(1, 2, 3, 4)
    y <- c(5, 6, 7, 8)
    # p-values from the t test and the Wilcoxon rank-sum test
    round(c(t.test(x, y)$p.value, wilcox.test(x, y)$p.value), 5)
    ## [1] 0.00466 0.02857
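One way to think about the "Is n = 8 enough?" question: with 4 observations per group, the exact rank-sum test only has a few attainable p-values. The sketch below computes the smallest possible two-sided p-value by counting group assignments and checks that it matches the 0.02857 reported above, where the two samples are completely separated.

```r
n_x <- 4
n_y <- 4

# Number of ways to choose which 4 of the 8 ranks belong to x
n_splits <- choose(n_x + n_y, n_x)   # 70

# Most extreme outcome: all x ranks below all y ranks (or the reverse)
min_p <- 2 / n_splits                # 0.02857143

x <- c(1, 2, 3, 4)
y <- c(5, 6, 7, 8)
all.equal(wilcox.test(x, y)$p.value, min_p)
```

So no data set of this size can give a two-sided rank-sum p-value below about 0.029, however large the separation, which is one concrete form of the power loss mentioned in the list.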

Permutation tests

- Another type of non-parametric testing method
- Can be used for any statistic
- Assumption: observations are “exchangeable” under the null
- Rationale: if the null is true, the distribution won’t change when we permute the labels of observations. Applying many random permutations and re-computing the statistic gives an approximation of its distribution under the null
- Importantly, exchangeability is more general than independence (i.i.d. observations are exchangeable, but exchangeable observations need not be independent)

Example: “re-randomization”

Suppose we have a randomized clinical trial: 20 people are randomly assigned, 10 to treatment and 10 to control. The outcome is measured and a statistic $t_{\text{obs}}$ is computed. What if a different random assignment had occurred?

Shuffle the T/C “labels” with a random permutation $\pi_1$ and recompute the statistic to get $t_{\pi_1}$. Repeat for $i = 2, \ldots, B$. When large values of the statistic are evidence against the null, the approximate p-value is then

$$\frac{\#\{i : t_{\pi_i} \geq t_{\text{obs}}\} + 1}{B + 1}$$

(reverse the inequality for a lower-tail test). This can be an exact p-value if $n$ is small enough to compute all $n!$ permutations instead of $B < n!$ random ones.
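The re-randomization recipe can be sketched in a few lines of R. Everything about the data here is hypothetical: outcomes are simulated as standard normal with an assumed treatment effect of 1, the statistic is the difference in group means, and the count uses permuted statistics at least as large as the observed one (an upper-tail test).

```r
set.seed(2015)
n <- 20
label <- sample(rep(c("T", "C"), each = 10))       # random assignment, 10 per arm
outcome <- rnorm(n) + ifelse(label == "T", 1, 0)   # assumed treatment effect of 1

# Statistic: difference in mean outcome, treatment minus control
diff_means <- function(y, lab) mean(y[lab == "T"]) - mean(y[lab == "C"])
t_obs <- diff_means(outcome, label)

# Shuffle the labels B times and recompute the statistic each time
B <- 9999
t_perm <- replicate(B, diff_means(outcome, sample(label)))

# The +1 terms count the observed assignment as one of the permutations
p_value <- (sum(t_perm >= t_obs) + 1) / (B + 1)
p_value
```

Because the +1 is included, the p-value can never be smaller than 1 / (B + 1), so B should be chosen with the desired resolution in mind.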

Example: unknown distribution

Consider assessing significance of the most correlated regressor:

    set.seed(1)
    y <- rnorm(50)
    x <- scale(matrix(rnorm(50 * 20), nrow = 50))
    # Observed statistic: largest absolute inner product between a column of x and y
    t_obs <- max(abs(t(x) %*% y))
    # Permutation null: shuffle y and recompute the statistic
    t_perm <- numeric(10000)
    for (i in 1:10000) {
      t_perm[i] <- max(abs(t(x) %*% sample(y)))
    }
    mean(t_perm >= t_obs)
    ## [1] 0.1607

Results

    hist(t_perm)
    abline(v = t_obs, col = "red")

[Figure: “Histogram of t_perm”; x-axis from about 5 to 25, y-axis Frequency, with a red vertical line at t_obs]

The bootstrap!

- Brad (Efron)
- Another way of generating randomness in a controlled fashion
- Instead of permuting: resample with replacement
- For $n$ distinct observations there are $n!$ permutations but $n^n$ potential bootstrap samples
- Conceptually, plug in the ecdf $\hat{F}$ as an estimate of $F$
- i.e. treat the sample as though it is a population

If resampling from the actual population were free, we could generate the distribution of any statistic by just resampling and recomputing it many times. Resampling from our sample is free! (Almost: it costs computation.)
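A minimal sketch of the idea, assuming a hypothetical skewed sample and the sample median as the statistic of interest: resample the data with replacement, recompute the median each time, and use the spread of the recomputed values as an estimate of sampling variability.

```r
set.seed(1)
x <- rexp(30)   # hypothetical sample of size 30

# Resample with replacement and recompute the statistic B times
B <- 2000
med_boot <- replicate(B, median(sample(x, replace = TRUE)))

se_boot <- sd(med_boot)                          # bootstrap standard error
ci_boot <- quantile(med_boot, c(0.025, 0.975))   # simple percentile interval

# Permutations versus ordered bootstrap samples for n = 5 distinct observations
counts <- c(permutations = factorial(5), bootstrap_samples = 5^5)

list(se = se_boot, ci = ci_boot, counts = counts)
```

The percentile interval is only one of several bootstrap interval constructions and is used here because it is the shortest to write down.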

Bootstraps, bootstraps, bootstraps, bootstraps, bootstraps everywhere

Here are a few examples of kinds of bootstraps.

- Case bootstrap (resample rows of data, e.g. for eigenvalue/eigenvector statistics)
- Dependent data: block bootstrap (resample clusters of observations)
- Time series: moving block bootstrap (resample contiguous pieces of the time series)
- Heteroscedastic regression: wild bootstrap (re-randomize the residuals)
- Parametric bootstrap (draw bootstrap samples from a fitted parametric model, e.g. with rnorm)
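To illustrate one entry from the list, here is a rough sketch of a moving block bootstrap for the mean of a time series. The AR(1) series, the block length of 10, and the helper name `mbb_sample` are all assumptions made for this example, and choosing the block length well is a topic of its own.

```r
set.seed(1)
n <- 200
x <- as.numeric(arima.sim(list(ar = 0.6), n = n))   # hypothetical AR(1) series

block_len <- 10                        # assumed block length
n_blocks <- ceiling(n / block_len)

# Hypothetical helper: glue together randomly chosen contiguous blocks
mbb_sample <- function(x, block_len, n_blocks) {
  starts <- sample(seq_len(length(x) - block_len + 1), n_blocks, replace = TRUE)
  idx <- unlist(lapply(starts, function(s) s:(s + block_len - 1)))
  x[idx[seq_along(x)]]                 # trim to the original length
}

# Standard error of the mean: moving blocks versus naive i.i.d. resampling
mean_mbb <- replicate(1000, mean(mbb_sample(x, block_len, n_blocks)))
mean_iid <- replicate(1000, mean(sample(x, replace = TRUE)))
c(moving_block = sd(mean_mbb), iid = sd(mean_iid))
```

Resampling whole blocks keeps the short-range dependence inside each block, which the naive resampling of individual observations destroys. For a positively autocorrelated series one would typically expect the block version to report the larger standard error.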

Flexibility

- Bootstrap and permutation methods can be used for almost anything
- Both have limitations
  - Permutations: require exchangeability (e.g. equal variance)
  - Bootstrap: bad for statistics that are not smooth functions of $\hat{F}$ (and it’s not exact)

Example for intuition: $U[0, \theta]$

    data <- runif(100)
    theta_hat <- max(data)  # MLE of theta
    B <- 1000
    df <- data.frame(boot = numeric(B), pboot = numeric(B))
    for (i in 1:B) {
      # Nonparametric bootstrap: resample the data with replacement
      df$boot[i] <- max(sample(data, replace = TRUE))
      # Parametric bootstrap: simulate from U[0, theta_hat]
      df$pboot[i] <- max(runif(100, max = theta_hat))
    }
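One way to see why the resampling bootstrap struggles with the maximum: a bootstrap sample of size n contains the largest observation with probability 1 - (1 - 1/n)^n, so the bootstrap maximum equals the sample maximum most of the time instead of varying smoothly. The sketch below computes this probability for n = 100 and checks it by simulation (the seed and the 5000 replications are arbitrary choices).

```r
n <- 100

# P(bootstrap sample includes the largest observation)
p_tie <- 1 - (1 - 1 / n)^n

set.seed(1)
data <- runif(n)
hits <- replicate(5000, max(sample(data, replace = TRUE)) == max(data))

c(theory = p_tie, simulated = mean(hits), limit = 1 - exp(-1))
```

The theoretical value is about 0.634 and approaches 1 - 1/e as n grows, so roughly two thirds of the nonparametric bootstrap maxima pile up on a single point, while the parametric bootstrap draws from a continuous distribution and has no such point mass.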

ggplotting results

ggplot2 is great. Learn it.

    library(ggplot2)
    library(reshape2)
    # Reshape to long format: one row per (variable, value) pair
    df <- melt(df)

ggplotting results, part 2

    ggplot(df, aes(value)) +
      geom_histogram() +
      facet_wrap(~ variable)

[Figure: histograms of value (x-axis from about 0.925 to 1.000, y-axis count) in two panels, boot and pboot]
