Calibrated Bayes: an attractive framework for official statistics in the 21st century

SLIDE 1

Calibrated Bayes: an attractive framework for official statistics in the 21st century

Roderick J. Little

SLIDE 2

Overview

  • Design-based versus model-based survey inference
  • Current orthodoxy: design-model compromise
      – Strengths and drawbacks
  • An alternative: Calibrated Bayes
  • Two US Census Bureau applications
      – Disclaimer: views are mine, not US Census Bureau

NTTS 2015: Calibrated Bayes

SLIDE 4

Survey estimation

  • Design-based inference: population values are fixed, and inference is based on the probability distribution of sample selection. This assumes that we have a probability sample (or “quasi-randomization”, where we pretend that we have one).
  • Model-based inference: survey variables are assumed to come from a statistical model. Probability sampling is not the basis for inference, but is useful for making the sample selection ignorable (see e.g. Gelman et al., 2003; Little 2004).

SLIDE 5

Design vs model-based survey inference

  • Two main variants of model-based inference:
      – Superpopulation models: frequentist inference based on repeated samples from a “superpopulation” model
      – Bayes: add prior distribution for parameters; inference about finite population quantities or parameters based on posterior distribution
  • A fascinating part of the more general debate about frequentist versus Bayesian inference in statistics at large:
      – Design-based inference is inherently frequentist
      – Purest form of model-based inference is Bayes

SLIDE 6

Design-based inference

$Y = (Y_1, \ldots, Y_N)$ = population values (fixed); $Z$ = design variables

$Q = Q(Y, Z)$ = finite population quantity

$I = (I_1, \ldots, I_N)$ = sample inclusion indicators (random), where $I_i = 1$ if unit $i$ is included in the sample and $I_i = 0$ otherwise

$Y_{inc}$ = part of $Y$ included in the survey

$\hat q = \hat q(Y_{inc}, I, Z)$ = sample estimate of $Q$

$\hat V = \hat V(Y_{inc}, I, Z)$ = sample estimate of $V$, the variance of $\hat q$

$\left(\hat q - 1.96\sqrt{\hat V},\ \hat q + 1.96\sqrt{\hat V}\right)$ = 95% confidence interval for $Q$

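The design-based recipe on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the design is simple random sampling without replacement, Q is the population mean, and the population values, sizes and the function name `srs_mean_interval` are made up for the example rather than taken from the talk.

```python
import math
import random


def srs_mean_interval(y_inc, N, z=1.96):
    """Design-based estimate of a population mean from a simple random sample.

    y_inc: sampled values; N: population size (assumed known).
    Returns (q_hat, v_hat, (lower, upper)).
    """
    n = len(y_inc)
    q_hat = sum(y_inc) / n
    # Sample variance with divisor n - 1
    s2 = sum((y - q_hat) ** 2 for y in y_inc) / (n - 1)
    # Variance estimate of the sample mean under SRS without replacement,
    # including the finite population correction (1 - n/N)
    v_hat = (1 - n / N) * s2 / n
    half_width = z * math.sqrt(v_hat)
    return q_hat, v_hat, (q_hat - half_width, q_hat + half_width)


# Hypothetical fixed population Y_1..Y_N (illustrative values only)
rng = random.Random(2015)
Y = [rng.gauss(50, 10) for _ in range(1000)]
Q = sum(Y) / len(Y)  # finite population quantity (here, the mean)

# The only randomness in design-based inference: which units are included
y_inc = rng.sample(Y, 100)
q_hat, v_hat, (lower, upper) = srs_mean_interval(y_inc, N=len(Y))
print(f"Q = {Q:.2f}, q_hat = {q_hat:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

Note that the population values are held fixed and only the sample selection is random, which is the defining feature of the design-based view.
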
SLIDE 7

Choice of $\hat q$

Seek good design-based properties:

  • Design unbiasedness: $E(\hat q \mid Y) = Q$ (too strong)
  • Or weaker, design consistency: $\hat q \to Q$ as sample size gets large

There are many choices of design-consistent estimates ... It is natural to seek an estimate that is design efficient.

  • However, this kind of optimality is not possible without a model (Horvitz and Thompson 1952, Godambe 1955)

Many survey estimates are motivated by implicit models:

  • Regression model → regression estimator
  • Ratio model → ratio estimator, etc.

SLIDE 8

Limitations of design-based approach

  • Inference is based on probability sampling, but true probability samples are harder and harder to come by:
      – Noncontact and nonresponse are increasing
      – Face-to-face interviews are increasingly expensive
      – A high proportion of available information is now not based on probability samples (e.g. internet, administrative data)
  • Theory is basically asymptotic: limited tools for small samples, e.g. small area estimation

SLIDE 9

Design-based methods live in the land of asymptotia

[Cartoon: the “Asymptotia Highlands” above the “Murky sub-asymptotial forests”. Caption: “How many more to reach the promised land of asymptotia?”]

SLIDE 10

Model-based approaches

  • In model-based, or model-dependent, approaches, models are the basis for the entire inference: estimator, standard error, interval estimation
  • Two variants:
      – Superpopulation modeling
      – Bayesian (full probability) modeling
  • Common theme is to “infer” or “predict” about the non-sampled portion of the population, conditional on the sample and model
  • Superpopulation is super, but Bayes is better … for small samples

SLIDE 11

Bayes inference for surveys

Model: $p(Y \mid Z)$ = prior distribution for $Y$

Data: $Y_{inc}$ = sampled values of $Y$; $Z$ = design variables

Inference about $Q = Q(Y, Z)$ is based on the posterior predictive distribution $p(Q(Y, Z) \mid Y_{inc}, Z)$

In particular:

  • One estimate is the posterior mean: $\hat q = E(Q \mid Y_{inc}, Z)$
  • Standard error is the posterior sd: $\sqrt{Var(Q \mid Y_{inc}, Z)}$
  • The 95% posterior probability interval plays the role of a confidence interval (with a simpler interpretation)

SLIDE 12

Parametric models

Usually the prior distribution is specified via parametric models:

$p(Y \mid Z) = \int p(Y \mid Z, \theta)\, p(\theta \mid Z)\, d\theta$

$p(Y \mid Z, \theta)$ = parametric model, as in superpopulation approach

$p(\theta \mid Z)$ = prior distribution for $\theta$

Inference about $\theta$ is then obtained from its posterior distribution, computed via Bayes’ Theorem:

$p(\theta \mid Y_{inc}, Z) \propto p(\theta \mid Z) \times L(\theta \mid Y_{inc}, Z)$

$L(\theta \mid Y_{inc}, Z)$ = likelihood function

That is: Posterior = Prior x Likelihood

SLIDE 13

Example. Spline model on weights

[Figure: Y plotted against Z for the sample and for the population]

HT estimator: $\bar y_{HT} = \frac{1}{N}\sum_{i=1}^{n} y_i/\pi_i$; $\pi_i$ = selection prob.

A modeling alternative to the HT estimator is to create predictions from a more robust model relating $Y$ to $Z$:

$\bar y_{mod} = \frac{1}{N}\left(\sum_{i=1}^{n} y_i + \sum_{i=n+1}^{N} \hat y_i\right)$, $\hat y_i$ = predictions from model, e.g.:

  • $y_i \sim \mathrm{Nor}(\beta\pi_i, \sigma^2\pi_i^2)$; leads to $\bar y_{HT}$
  • $y_i \sim \mathrm{Nor}(S(\pi_i), \sigma^2\pi_i^k)$; $S(\pi_i)$ = penalized spline of $Y$ on $Z$

Simulations in Zheng and Little (2005) suggest better RMSE and confidence coverage for the spline model compared with design-based approaches.

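To make the contrast between weighting and prediction concrete, here is a minimal Python sketch of the HT estimator of the mean and of the prediction estimator under the simple working model in which $y_i$ is proportional to $\pi_i$. It does not implement the penalized spline; the function names and the toy population are assumptions made for illustration.

```python
def ht_mean(y_s, pi_s, N):
    """Horvitz-Thompson estimate of the population mean."""
    return sum(y / p for y, p in zip(y_s, pi_s)) / N


def prediction_mean(y_s, pi_s, pi_rest, N):
    """Prediction estimate of the mean under y_i ~ Nor(beta*pi_i, sigma^2*pi_i^2).

    pi_rest holds the selection probabilities of the non-sampled units.
    """
    n = len(y_s)
    # Weighted least squares slope with weights 1/pi_i^2 reduces to
    # the average of y_i / pi_i over the sample
    beta_hat = sum(y / p for y, p in zip(y_s, pi_s)) / n
    # Predict the non-sampled values, then add the observed sampled values
    y_hat_rest = [beta_hat * p for p in pi_rest]
    return (sum(y_s) + sum(y_hat_rest)) / N


# Toy population (hypothetical): selection probabilities sum to n = 2,
# and y is exactly proportional to pi, so both estimators are exact
pi_pop = [0.1, 0.2, 0.3, 0.4, 0.5, 0.5]
y_pop = [100 * p for p in pi_pop]
sampled = [1, 4]
y_s = [y_pop[i] for i in sampled]
pi_s = [pi_pop[i] for i in sampled]
pi_rest = [p for i, p in enumerate(pi_pop) if i not in sampled]

print(ht_mean(y_s, pi_s, N=6), prediction_mean(y_s, pi_s, pi_rest, N=6))
```

When the working model is wrong the two estimators differ, which is where a more flexible mean function such as a spline can help.
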
SLIDE 14

The model-based perspective: pros

  • Flexible, unified approach for all survey problems
      – Models for nonresponse, response and matching errors, small area models, combining data sources
  • Bayesian approach is not asymptotic, provides better small-sample inferences
  • Probability sampling is justified as making sampling mechanism ignorable, improving robustness

SLIDE 15

Models bring survey inference closer to the statistical mainstream

[Cartoon: the “B/F Gorilla” says “Follow my (frequentist) statistical standards”. Reply: “Why? I am an economist, I build models!”]

SLIDE 16

The model-based perspective: cons

  • Explicit dependence on the choice of model, which has subjective elements (but assumptions are explicit, not buried in a formula)
  • Bad models provide bad answers: justifiable concerns about the effect of model misspecification
  • Models are needed for all survey variables: need to understand the data, and potential for more complex computations

SLIDE 18

The current “status quo”: design-model compromise

  • Design-based for large samples, descriptive statistics
      – But may be model assisted, e.g. regression calibration:
        $\hat T_{GREG} = \sum_{i=1}^{N} \hat y_i + \sum_{i=1}^{N} I_i (y_i - \hat y_i)/\pi_i$, $\hat y_i$ = model prediction
      – Model estimates are adjusted to protect against misspecification (e.g. Särndal, Swensson and Wretman 1992)
  • Model-based for small area estimation, nonresponse, time series, …
  • Attempts to capitalize on best features of both paradigms … but at the expense of “inferential schizophrenia” (Little 2012)?

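The GREG formula can be sketched as follows: a model total over all N units plus a design-weighted sum of sample residuals. The working model here (a straight line in one auxiliary variable x known for every unit, fitted by survey-weighted least squares) and the name `greg_total` are assumptions for illustration only.

```python
def greg_total(x_pop, sample_idx, y_s, pi_s):
    """GREG estimate of a population total with a linear working model.

    x_pop: auxiliary variable for all N units; sample_idx: indices of the
    sampled units; y_s, pi_s: outcomes and inclusion probabilities of the sample.
    """
    w = [1.0 / p for p in pi_s]  # design weights
    x_s = [x_pop[i] for i in sample_idx]
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x_s)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y_s)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x_s))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x_s, y_s))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    # Model predictions for every unit in the population
    y_hat_pop = [b0 + b1 * x for x in x_pop]
    model_total = sum(y_hat_pop)
    # Design-weighted residuals protect against model misspecification
    correction = sum(
        (yi - y_hat_pop[i]) / p for i, yi, p in zip(sample_idx, y_s, pi_s)
    )
    return model_total + correction


# Hypothetical population in which y = 3 + 2x holds exactly
x_pop = list(range(1, 11))
sample_idx = [0, 4, 9]
y_s = [3 + 2 * x_pop[i] for i in sample_idx]
print(greg_total(x_pop, sample_idx, y_s, pi_s=[0.3, 0.3, 0.3]))
```

If the working model fits the sample perfectly the correction term vanishes and the estimate is the model total; otherwise the weighted residuals pull the estimate back toward the design-based answer.
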
SLIDE 19

Example: when is an area “small”?

[Figure: an “n-ometer” scale of sample size n, with model-based inference below n0 and design-based inference above n0]

n0 = “Point of inferential schizophrenia”

How do I choose n0? If n0 = 35, should my entire statistical philosophy and inference be different when n=34 and n=36?

  • n=36, CI: [ ] (wider since based on direct estimate)
  • n=34, CI: [ ] (narrower since based on model)

SLIDE 20

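Multilevel (hierarchical Bayes) models

Bayesian multilevel model estimates borrow strength increasingly from the model as n decreases:

$\hat\mu_a = w_a\,\bar y_a + (1 - w_a)\,\tilde\mu_a$, where $\bar y_a$ is the direct estimate and $\tilde\mu_a$ the model estimate for area $a$

[Figure: “n-ometer” showing the weight $w_a$ rising towards 1 as the sample size n increases, moving the estimate from the model estimate to the direct estimate]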
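A short Python sketch shows why the multilevel estimate has no discontinuity at any n0. It assumes a normal-normal model (direct estimate with sampling variance sigma2/n, area means with between-area variance tau2), which gives $w_a = \tau^2/(\tau^2 + \sigma^2/n_a)$; the variance values and function names are hypothetical.

```python
def shrinkage_weight(n, sigma2, tau2):
    """Weight on the direct estimate under a normal-normal model (assumed)."""
    return tau2 / (tau2 + sigma2 / n)


def composite_estimate(ybar, n, model_est, sigma2, tau2):
    """w * direct estimate + (1 - w) * model estimate."""
    w = shrinkage_weight(n, sigma2, tau2)
    return w * ybar + (1 - w) * model_est


# Hypothetical variance components; the weight changes smoothly with n,
# so n = 34 and n = 36 are treated almost identically
for n in (5, 34, 36, 500):
    w = shrinkage_weight(n, sigma2=100.0, tau2=4.0)
    est = composite_estimate(ybar=12.0, n=n, model_est=10.0, sigma2=100.0, tau2=4.0)
    print(n, round(w, 3), round(est, 3))
```
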

SLIDE 22

An alternative paradigm: Calibrated Bayes

  • Frequentists should be Bayesian
      – Bayes is optimal under a correctly specified model
  • Bayesians should be frequentist
      – We never know the model (and all models are wrong)
      – Inferences should be robust to misspecification, have good repeated sampling characteristics
  • Calibrated Bayes (Box 1980, Rubin 1984, Little 2006, 2012, 2013)
      – Inference based on a Bayesian model
      – Model chosen to yield inferences that are well-calibrated in a frequentist sense
      – Aim for posterior probability intervals that have (approximately) nominal frequentist coverage

SLIDE 23

Bayes/frequentist compromises

“I believe that … sampling theory is needed for exploration and ultimate criticism of the entertained model in the light of the current data, while Bayes’ theory is needed for estimation of parameters conditional on adequacy of the model.”

George Box (1980)

SLIDE 24

Calibrated Bayes

“The applied statistician should be Bayesian in principle and calibrated to the real world in practice – appropriate frequency calculations help to define such a tie.”

“… frequency calculations are useful for making Bayesian statements scientific, … in the sense of capable of being shown wrong by empirical test; here the technique is the calibration of Bayesian probabilities to the frequencies of actual events.”

Rubin (1984)

SLIDE 25

Calibrated Bayes models for surveys should incorporate sample design features

  • The “Calibrated” part of Calibrated Bayes requires robust models with good repeated sampling properties:
  • Generally weak priors that are dominated by the likelihood (“objective Bayes”)
  • Models that incorporate sampling design features:
      – Capture design weights and stratifying variables as covariates in the prediction model (e.g. Gelman 2007)
      – Clustering via hierarchical random effects models

SLIDE 27

Applications

  • Voting Rights Act special tabulation
  • The American Community Survey (ACS) and the “standard error error”

SLIDE 28

Voting Rights Act Special Tabulation

  • Section 203 Language Provisions of the Voting Rights Act
  • Determines counties and townships required to provide language assistance at the polls
  • Determinations are based in part on the following “more than 5%” provision:
      … More than 5 percent of voting age citizens of political district are members of a single language minority and are Limited English Proficient (LEP).

SLIDE 29

Voting Rights Act Tabulations

  • Previously used direct estimates from Long Form Decennial Census Data
  • Used ACS 2005-2009 and 2010 Census data to produce estimates by fall 2011
  • Direct estimates for some districts are based on small ACS samples and hence have unacceptably high variance
  • E.g. let P be the proportion of voting age citizens in a political district who are members of a single language minority and are Limited English Proficient
  • If the ACS were a simple random sample, a direct estimate of P would be the sample proportion m/n
      – District A with n=105, m=5, m/n < 0.05
      – District B with n=105, m=6, m/n > 0.05
      – Direct ACS estimation is more complex, but the same idea applies

SLIDE 30

Voting Rights Tabulations

  • Overview of approach to the “more than 5%” provision:
      – Build a district level regression model to predict P based on variables in the ACS
      – Classify districts into classes with similar predicted P based on the model [predictive mean stratification]
      – Within classes, apply a Beta-Binomial model that pulls the direct ACS estimate of P towards the average P for districts in that class
      – Compare Beta-Binomial model estimate with 5% for this aspect of the determination
  • Rationale: increased precision of Beta-Binomial estimates in small samples increases the probability of getting the determination right, particularly in small districts
  • See Joyce et al. (2014)

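The within-class Beta-Binomial step can be illustrated with the two hypothetical districts from the previous slide (n=105 with m=5 or m=6). The sketch below is not the methodology of Joyce et al. (2014): the class prior Beta(2, 48), with mean 0.04, is invented for the example, and the function names are assumptions. The posterior probability that P exceeds 5% is approximated by Monte Carlo draws from the standard library.

```python
import random


def beta_binomial_posterior(m, n, alpha, beta):
    """Posterior Beta parameters for a binomial count m out of n."""
    return alpha + m, beta + n - m


def posterior_mean(a, b):
    return a / (a + b)


def prob_exceeds(a, b, threshold, draws=20000, seed=0):
    """Monte Carlo estimate of P(p > threshold) for p ~ Beta(a, b)."""
    rng = random.Random(seed)
    hits = sum(rng.betavariate(a, b) > threshold for _ in range(draws))
    return hits / draws


# Hypothetical class-level prior: mean 0.04, worth about 50 observations
ALPHA, BETA = 2.0, 48.0

for name, m in (("A", 5), ("B", 6)):
    a, b = beta_binomial_posterior(m, 105, ALPHA, BETA)
    print(name, round(m / 105, 4), round(posterior_mean(a, b), 4),
          round(prob_exceeds(a, b, 0.05), 3))
```

The posterior mean lies between the class average and the direct estimate m/n, which is the "pulling" described in the bullet above; how strongly it pulls depends on the prior parameters, which in practice have to be estimated.
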
SLIDE 31

Bayes forces a loss function

  • Small p and n, posterior distribution is skewed to right

[Figure: right-skewed posterior density with the mode, median and mean marked]

  • What’s the right point estimate: median, mode, mean? Bayes forces a choice …
  • Design-based, superpopulation model approaches fail to address the issue
      – Maximum likelihood is equivalent to mode with flat prior, which does not correspond to a sensible loss function

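As an illustration of how the choice matters, the sketch below computes the mode, median and mean of a Beta posterior. The example assumes a uniform prior and a hypothetical binomial sample with m=5 and n=105, giving a Beta(6, 101) posterior; the median is approximated by Monte Carlo, and the function name is an assumption. The mean is the optimal point estimate under squared error loss and the median under absolute error loss.

```python
import random
import statistics


def beta_point_estimates(a, b, draws=50000, seed=0):
    """Mode, median and mean of a Beta(a, b) posterior (requires a > 1, b > 1)."""
    mode = (a - 1) / (a + b - 2)
    mean = a / (a + b)
    # Median approximated from Monte Carlo draws
    rng = random.Random(seed)
    median = statistics.median(rng.betavariate(a, b) for _ in range(draws))
    return mode, median, mean


# Uniform prior Beta(1, 1) with m = 5 successes out of n = 105 (hypothetical)
m, n = 5, 105
mode, median, mean = beta_point_estimates(1 + m, 1 + n - m)
print(f"mode = {mode:.4f}, median = {median:.4f}, mean = {mean:.4f}")
```

In this made-up case the mode equals m/n and falls below 0.05, while the mean (6/107) falls above it, so a 5% threshold decision would depend on the loss function.
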
SLIDE 32

American Community Survey

  • US Census Bureau is making available thousands of ACS tables, with millions of cells
  • A high fraction of these estimates are based on very little data, and hence are very noisy
      – Many people want information, not data, so ACS should produce information products, as well as data products
      – When noise swamps the signal, the information content is buried
      – Data products are highly constrained by confidentiality requirements, leading to incompleteness

SLIDE 33

The Statistical Problem

  • The ACS philosophy is essentially to produce “direct” (“design-based”) estimates, together with margins of error
  • This works fine with large samples, but most of the ACS estimates are based on small samples
      – The estimates are often too noisy to be useful
      – The confidence intervals derived from the estimates and margins of error are known to be of poor quality, violating statistical standards:
          intervals include proportions outside the range (0,1), and intervals do not have nominal coverage

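Both failures are easy to reproduce in a simplified setting. The Python sketch below assumes a plain binomial sample (the ACS design and its variance estimation are more complex) and uses the usual Wald-type 90% interval; the sample size, true proportion and function names are hypothetical. Coverage is computed exactly by summing binomial probabilities.

```python
import math

Z90 = 1.645  # standard normal quantile for a two-sided 90% interval


def wald_interval(x, n, z=Z90):
    """Estimate plus or minus margin of error for a binomial proportion."""
    p_hat = x / n
    moe = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - moe, p_hat + moe


def wald_coverage(p, n, z=Z90):
    """Exact probability that the Wald interval contains the true p."""
    cover = 0.0
    for x in range(n + 1):
        lo, hi = wald_interval(x, n, z)
        if lo <= p <= hi:
            cover += math.comb(n, x) * p**x * (1 - p) ** (n - x)
    return cover


# One success in a sample of 30: the lower limit is negative
print(wald_interval(1, 30))
# Actual coverage of the nominal 90% interval when p = 0.05 and n = 30
print(wald_coverage(0.05, 30))
```

With one success out of 30 the lower limit is below zero, and for p = 0.05 and n = 30 the actual coverage is noticeably below the nominal 0.90.
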
SLIDE 34

The “standard error” error

  • ACS reports estimates and margins of error that yield asymptotic 90% confidence intervals
  • But in small samples, the implied confidence intervals do not have the stated coverage
  • So seek to replace estimates and margins of error by posterior means and 5% to 95% credibility intervals that have approximately the nominal coverage
  • A non-Bayesian can interpret the posterior means as estimates, and the 90% credibility intervals as 90% confidence intervals

SLIDE 35

Binary outcome: Schmertmann example

Margins of error exceed the estimates

SLIDE 36

Data for example

$Y$ = outcome (e.g. poverty)

$x$ = covariates (e.g. categorized age = a, gender = g, stratum = h)

In county $c$:

$n_{aghc}$ = sample count with age = a, gender = g, stratum = h

$x_{aghc}$ = sample count in poverty with age = a, gender = g, stratum = h

$\hat p_{aghc} = x_{aghc}/n_{aghc}$ = sample proportion

SLIDE 37

Fully Bayesian model

$x_{aghc} \mid p_{aghc} \sim \mathrm{Bin}(n_{aghc}, p_{aghc})$

$p_{aghc} \sim \mathrm{Beta}(\alpha_{agh}, \beta_{agh})$, which can also be written in a reparameterized form $\mathrm{Beta}^*(\cdot, \cdot)$ [Assumption: …]

$p_{aghc} \mid x_{aghc} \sim \mathrm{Beta}(\alpha_{agh} + x_{aghc},\ \beta_{agh} + n_{aghc} - x_{aghc})$

Key is how to determine the prior parameters $\alpha_{agh}, \beta_{agh}$ (or their reparameterized equivalents):

(a) Empirical Bayes: estimate prior parameters, then treat as if known. Simple beta intervals, but understates uncertainty.

(b) Full Bayes: incorporate uncertainty of prior parameter estimates. More work, but better reflects uncertainty. Consider approximations, since full Bayes seems computationally complex.

SLIDE 38

Pragmatic “pseudo-Bayes” approach

Tom Louis suggested this simple “Bayes-like” approach:

  • A. Compute design-based estimate of proportion and standard error using existing methods
  • B. Pretend data are binomial with number of successes x* and sample size n* that lead to the estimates in A
  • C. Compute Beta posterior distribution with noninformative prior (e.g. uniform or Jeffreys)
  • D. Compute 90% posterior credibility interval based on this Beta posterior (reflects asymmetry, always between 0 and 1)

Simple to implement and easily beats standard Wald-type confidence intervals in simulations (Franco, Little, Louis and Slud 2015, in preparation)

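Steps A to D can be sketched in Python as follows. The slide does not give formulas for step B, so the matching used here is an assumption: n* and x* are chosen so that x*/n* equals the design-based estimate and the binomial variance p(1-p)/n* equals the squared standard error. Beta quantiles are approximated by Monte Carlo from the standard library, and all function names are hypothetical.

```python
import random


def effective_binomial(p_hat, se):
    """Step B (assumed matching): n*, x* reproducing p_hat and se.

    Requires 0 < p_hat < 1 and se > 0.
    """
    n_star = p_hat * (1 - p_hat) / se**2
    x_star = n_star * p_hat
    return n_star, x_star


def beta_quantiles(a, b, probs, draws=50000, seed=0):
    """Monte Carlo approximation to quantiles of Beta(a, b)."""
    rng = random.Random(seed)
    sample = sorted(rng.betavariate(a, b) for _ in range(draws))
    return [sample[min(draws - 1, int(p * draws))] for p in probs]


def pseudo_bayes_interval(p_hat, se, level=0.90, prior=0.5):
    """Steps C and D; prior=0.5 is Jeffreys, prior=1.0 is uniform."""
    n_star, x_star = effective_binomial(p_hat, se)
    a = x_star + prior
    b = n_star - x_star + prior
    tail = (1 - level) / 2
    lo, hi = beta_quantiles(a, b, [tail, 1 - tail])
    return lo, hi


# Step A is taken as given: a hypothetical design-based estimate and standard error
p_hat, se = 0.10, 0.03
print(effective_binomial(p_hat, se))
print(pseudo_bayes_interval(p_hat, se))
```

Because the interval is built from a Beta distribution, its limits always lie inside (0, 1) and it is asymmetric around the estimate when the estimate is near 0 or 1.
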
SLIDE 39

Barriers to Calibrated Bayes

  • It’s a major paradigm shift
  • It’s too much work/computation
      – but this concern is alleviated by gains in computing power and advances in Bayesian computational methods
  • More explicit dependence on the choice of model: concerns with model misspecification
      – “Design-based is model-free and hence robust … model-based requires models, which are inherently subjective”
  • But models are essential for today’s data, and a judicious Calibrated Bayes model is robust and incorporates key design features, and would bring official statistics back in the statistical mainstream

SLIDE 40

References 1

Box, G.E.P. (1980). Sampling and Bayes inference in scientific modeling and robustness (with discussion). JRSSA, 143, 383-430.

Gelman, A. (2007). Struggles with survey weighting and regression modeling (with discussion and rejoinder). Statist. Sci., 22, 2, 153-164.

Gelman, A., Carlin, J.B., Stern, H.S. and Rubin, D.B. (2003). Bayesian Data Analysis, 2nd edition. New York: CRC Press.

Godambe, V.P. (1955). A unified theory of sampling from finite populations. JRSSB, 17, 269-278.

Horvitz, D.G. and Thompson, D.J. (1952). A generalization of sampling without replacement from a finite universe. JASA, 47, 663-685.

Joyce, P.M., Malec, D., Little, R.J., Gilary, A., Navarro, A. and Asiala, M.E. (2014). Statistical Modeling Methodology for the Voting Rights Act Section 203 Language Assistance Determinations. JASA, 109, 36-47.

Little, R.J.A. (2004). To Model or Not to Model? Competing Modes of Inference for Finite Population Sampling. JASA, 99, 546-556.

SLIDE 41

References 2

Little, R.J.A. (2006). Calibrated Bayes: A Bayes/frequentist roadmap. Am. Statist., 60, 3, 213-223.

Little, R.J.A. (2012). Calibrated Bayes: an alternative inferential paradigm for official statistics (with discussion and rejoinder). JOS, 28, 3, 309-372.

Little, R.J.A. (2013). Survey Sampling: Past Controversies, Current Orthodoxies, and Future Paradigms. In Past, Present and Future of Statistical Science, COPSS 50th Anniversary Volume, X. Lin, D.L. Banks, C. Genest, G. Molenberghs, D.W. Scott and J.-L. Wang, eds. CRC Press.

Rubin, D.B. (1984). Bayesianly justifiable and relevant frequency calculations for the applied statistician. Annals Statist., 12, 1151-1172.

Särndal, C.-E., Swensson, B. and Wretman, J.H. (1992). Model Assisted Survey Sampling. New York: Springer Verlag.

Zheng, H. and Little, R.J. (2005). Inference for the population total from probability-proportional-to-size samples based on predictions from a penalized spline nonparametric model. JOS, 21, 1-20.