Statistics & Bayesian Inference
Lecture 1
Joe Zuntz
Lecture 1: Essentials of probability
• Motivations
• Definitions
• Probability distributions
• Basic probability operations
• Some analytic distributions
• Bayes' theorem
• Models & parameter spaces
• How scientists can use probability
Motivations
• Learn as much as possible from our (expensive) data, e.g. $H_0 = (72 \pm 8)\,\mathrm{km\,s^{-1}\,Mpc^{-1}}$
• Constrain parameters in models
• Test & compare models
• Characterize collections of numbers
Probability Distributions: Definitions
• Assign a real number P ≥ 0 to each member of a sample space (discrete or continuous, finite or infinite)
• P = probability density function (PDF) or probability mass function (PMF)
• This set represents the possible outcomes of an experiment/game/event/situation
• e.g. the possible results of tossing two coins (HH, HT, TH, TT, each with probability 0.25), or the height of the next person to walk through the door
Probability Distributions: Definitions
• A random variable X is any value subject to randomness, e.g.:
  • was the first toss heads? was the sequence Heads-Tails? were both tosses the same?
• Discrete X: P is a list of values
• Continuous X: P is a function, the PDF, which we have to integrate to answer questions
Probability Distributions: Basic properties
• Since X must have exactly one value:
  • Discrete: $\sum_{x \in X} P(x) = 1$
  • Continuous: $\int_{x \in X} P(x)\,\mathrm{d}x = 1$
• P(X = x) = f(x); usually we just write P(X) = f(x)
• 0 ≤ P(x) ≤ 1 for probabilities (note that a continuous density itself can exceed 1)
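A minimal numerical check of these two normalization conditions, using numpy and scipy; the specific distributions and parameters here are illustrative choices, not from the lecture:

```python
import numpy as np
from scipy import stats
from scipy.integrate import quad

# Discrete: Poisson PMF values over its support sum to 1
lam = 3.0
n = np.arange(0, 100)  # truncate the infinite support; the tail is negligible
print(stats.poisson.pmf(n, lam).sum())  # ~1.0

# Continuous: a Gaussian PDF integrates to 1
total, err = quad(lambda x: stats.norm.pdf(x, loc=0.0, scale=2.0), -np.inf, np.inf)
print(total)  # ~1.0
```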
Probability Distributions: Combining Probabilities
• Joint probability: P(XY) = P(X=x and Y=y) = P(X ∩ Y)
• Union: P(X=x or Y=y) = P(X ∪ Y)
Probability Distributions: Combining Probabilities
• Conditional: P(X=x given Y=y) = P(X|Y)
• Independence: P(X|Y) = P(X), i.e. X is independent of Y
Probability Distributions: Identities
• P(not X) = 1 − P(X)
• P(XY) = P(X|Y) P(Y)
• P(X ∪ Y) = P(X) + P(Y) − P(X ∩ Y)
Probability Distributions: Expectations
• The expectation (or mean) of a random variable X is given by:
  • Discrete: $E(X) = \sum_X P(X)\,X$
  • Continuous: $E(X) = \int P(X)\,X\,\mathrm{d}X$
• Or of a function of it by:
  • Discrete: $E(f(X)) = \sum_X P(X)\,f(X)$
  • Continuous: $E(f(X)) = \int P(X)\,f(X)\,\mathrm{d}X$
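A quick sketch of computing these integrals numerically, for an exponential distribution with rate λ = 2 (an illustrative choice; the true values are 1/λ = 0.5 and 2/λ² = 0.5):

```python
import numpy as np
from scipy.integrate import quad

lam = 2.0
pdf = lambda x: lam * np.exp(-lam * x)  # exponential PDF, x > 0

# E(X): integrate P(x) * x over the support
mean, _ = quad(lambda x: pdf(x) * x, 0, np.inf)
print(mean)  # ~0.5

# E(f(X)) with f(X) = X**2: integrate P(x) * x**2
second_moment, _ = quad(lambda x: pdf(x) * x**2, 0, np.inf)
print(second_moment)  # ~0.5
```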
Probability Distributions: Expectations
• Expectations are one measure of centrality, and not always a good one
• The mode and median also exist
• All are just ways of reducing or characterizing a distribution
[Figure: a skewed distribution with the mode and the mean marked]
Probability Distributions: Marginalizing
• Discrete: $P(x) = \sum_i P(x|y_i)\,P(y_i)$
• Continuous: $P(x) = \int P(x|y)\,P(y)\,\mathrm{d}y$
• If you don't care about something, marginalize over it
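The discrete case is just a weighted sum; a made-up two-value example (all the probability tables below are invented for illustration):

```python
import numpy as np

# P(x) = sum_i P(x|y_i) P(y_i), with two y values and two x values
P_y = np.array([0.3, 0.7])            # P(y_1), P(y_2)
P_x_given_y = np.array([[0.2, 0.8],   # P(x|y_1), a distribution over x
                        [0.6, 0.4]])  # P(x|y_2)
P_x = P_x_given_y.T @ P_y             # marginalize over y
print(P_x, P_x.sum())                 # [0.48 0.52], sums to 1
```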
Probability Distributions: Changing variables
• Under a change of variables $u = f(x)$, probability mass must be conserved, not density: $P(u)\,\mathrm{d}u = P(x)\,\mathrm{d}x$
• Relate the densities with a Jacobian: $P(u) = P(x)\,\frac{\mathrm{d}x}{\mathrm{d}u} = P(x)/f'(x)$
• Be especially careful in more dimensions
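A sketch verifying the Jacobian rule by simulation, using the illustrative transform u = exp(x) with x standard normal (so P(u) is the lognormal density, P(u) = φ(ln u)/u):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
x = rng.normal(size=100_000)
u = np.exp(x)  # u = f(x) = exp(x), so x = ln(u) and dx/du = 1/u

# Compare a histogram of the transformed samples with P(u) = P_x(ln u) / u
hist, edges = np.histogram(u, bins=50, range=(0.1, 5.0), density=True)
centers = 0.5 * (edges[:-1] + edges[1:])
p_u = stats.norm.pdf(np.log(centers)) / centers
print(np.max(np.abs(hist - p_u)))  # close to zero, up to sampling noise
```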
Probability Distributions: Drawing samples
• Generate values of X with probability specified by P(X)
• Draw enough samples and their histogram looks like the PDF
• See lecture 3
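One standard way to do this, ahead of lecture 3, is the inverse-CDF method; a minimal sketch for the exponential distribution, whose CDF F(x) = 1 − exp(−λx) inverts analytically:

```python
import numpy as np

# Inverse-CDF sampling: if q ~ Uniform(0,1), then F^{-1}(q) has PDF P(x).
# For the exponential distribution, F^{-1}(q) = -ln(1 - q) / lambda.
rng = np.random.default_rng(1)
lam = 2.0
q = rng.uniform(size=100_000)
samples = -np.log(1.0 - q) / lam
print(samples.mean())  # ~1/lam = 0.5
```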
Probability Distributions: Analytic examples
• Wikipedia is brilliant for this
• Uniform: $P(x) = \frac{1}{b - a},\quad x \in [a, b]$
• Delta function: $P(x) = \delta(x - x_0)$
• Gaussian (normal): $P(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right)$
• Exponential: $P(x) = \lambda e^{-\lambda x},\quad x > 0$
• Poisson: $P(n) = \frac{\lambda^n e^{-\lambda}}{n!}$
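All of these (bar the delta function, which is a zero-width limiting case) are available in scipy.stats; a sketch with illustrative parameter choices:

```python
from scipy import stats

uniform = stats.uniform(loc=0.0, scale=2.0)  # a=0, b=2; scipy's scale = b - a
gaussian = stats.norm(loc=0.0, scale=1.0)    # mu=0, sigma=1
expon = stats.expon(scale=1.0 / 3.0)         # rate lambda=3; scipy uses scale=1/lambda
poisson = stats.poisson(mu=4.0)              # lambda=4

print(uniform.pdf(1.0))   # 1/(b-a) = 0.5
print(gaussian.pdf(0.0))  # 1/sqrt(2*pi) ~ 0.399
print(expon.pdf(0.0))     # lambda = 3.0
print(poisson.pmf(4))     # P(n=4)
```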
Bayes Theorem and Inference
$$P(AB) = P(A|B)\,P(B) = P(B|A)\,P(A)$$
$$\therefore\quad P(A|B) = \frac{P(B|A)\,P(A)}{P(B)}$$
Bayes Theorem and Inference
$$P(p|dM) = \frac{P(d|pM)\,P(p|M)}{P(d|M)} \propto P(d|pM)\,P(p|M)$$
• $d$ = observed data, $p$ = parameters, $M$ = model
• $P(d|pM)$ is the likelihood; $P(p|M)$ is the prior
Bayes Theorem and Inference What you know after looking at the data = what you knew before + what the data told you
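A minimal grid-based illustration of this for a single parameter; the setup (a Gaussian likelihood for the H0 measurement from the motivations slide, with a flat prior) is a hypothetical example, not the lecture's:

```python
import numpy as np

# Posterior = likelihood * prior, normalized by the evidence P(d|M)
H0 = np.linspace(50.0, 100.0, 1001)
dH = H0[1] - H0[0]
prior = np.ones_like(H0)                              # flat prior on [50, 100]
likelihood = np.exp(-0.5 * ((H0 - 72.0) / 8.0) ** 2)  # P(d|pM), up to a constant
posterior = likelihood * prior
posterior /= posterior.sum() * dH                     # normalize: divide by P(d|M)
print((posterior * H0).sum() * dH)                    # posterior mean, ~72
```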
Models & Parameters
• A model is the mathematical theory that describes how your data arose
• It is not a theory of how the thing you wanted to measure arose
• Non-trivial models include some deterministic and some stochastic parts
• Noise is one stochastic part; many (most?) astrophysical models have others too
Models & Parameters
• Parameters are any unknown numerical values in your model
• A parameter can have a probability distribution
• You need (and have) some prior (background) information about all your parameters
• This may be subjective!
Parameter Spaces
• Can use continuous parameters as dimensions in an abstract space
• Probabilities become functions of many variables: P(uvwxyz)
• As the dimension of this space increases, your intuition becomes worse
[Figure: a 2D parameter space with axes m and c]
Descriptive Statistics
• Reduce samples or a distribution to a set of characteristic numbers
• In analytic cases this is all you need to describe a distribution
• Statistics of samples = estimators/approximations of the underlying distribution's statistics
Descriptive Statistics: Mean
• Distribution mean: $E[X] = \int X\,P(X)\,\mathrm{d}X$
• Sample mean: $\bar{X} = \frac{\sum_i X_i}{N}$
Descriptive Statistics: Mean
• Means can be misleading!
• Most distributions are asymmetric
Descriptive Statistics: Variance
• Distribution variance: $\mathrm{Var}(X) = E[(X - \bar{X})^2] = \int (X - \bar{X})^2\,P(X)\,\mathrm{d}X$
• Sample variance: $\sigma_X^2 = \frac{\sum_i (X_i - \bar{X})^2}{N}$
• Population variance: $s_X^2 = \frac{\sum_i (X_i - \bar{X})^2}{N - 1}$
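Numpy implements both normalizations via the ddof argument; a quick sketch on simulated data (the distribution and its parameters are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.normal(loc=10.0, scale=2.0, size=1000)

print(x.mean())           # sample mean, ~10
print(np.var(x, ddof=0))  # divide by N:   the sample variance above
print(np.var(x, ddof=1))  # divide by N-1: the population variance s^2
```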
Descriptive Statistics: Covariance
• $\mathrm{Cov}(X, Y) = E[(X - \bar{X})(Y - \bar{Y})] = \int (X - \bar{X})(Y - \bar{Y})\,P(XY)\,\mathrm{d}X\,\mathrm{d}Y$
• Sample covariance: $\sigma_{XY} = \frac{\sum_i (X_i - \bar{X})(Y_i - \bar{Y})}{N}$
Descriptive Statistics: Covariance σ XY > 0 σ XY < 0 Y Y X X
Gaussians: The Basics
• One-dimensional continuous PDF: $P(x; \mu, \sigma) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right)$
• Two parameters: mean $\mu$ and standard deviation $\sigma$
• Symmetric
• Common! But often an over-simplification.
Gaussians: Sigma numbers
• Distance from the mean defined in number of standard deviations (sigma)
• Probability mass:
  • 68% within 1σ
  • 95% within 2σ
  • 99.7% within 3σ
Gaussians: Properties
• The error function is the cumulative integral of a Gaussian
• The sigma numbers can be read off from it
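The sigma numbers follow directly from the error function or, equivalently, the Gaussian CDF; a quick check:

```python
import numpy as np
from scipy import stats
from scipy.special import erf

# P(|x - mu| < k*sigma) = erf(k / sqrt(2)) = CDF(k) - CDF(-k)
for k in (1, 2, 3):
    print(k, erf(k / np.sqrt(2)), stats.norm.cdf(k) - stats.norm.cdf(-k))
# 1 -> 0.683, 2 -> 0.954, 3 -> 0.997
```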
Gaussians: Properties
• The sum of Gaussians has a simple form:
$$X \sim N(\mu_x, \sigma_x^2),\quad Y \sim N(\mu_y, \sigma_y^2) \;\Rightarrow\; X + Y \sim N(\mu_x + \mu_y,\ \sigma_x^2 + \sigma_y^2)$$
• Especially useful for sums of identical Gaussians; it leads to the result that the error on the mean scales as $\sigma/\sqrt{n}$
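A simulation sketch checking the σ/√n scaling (sigma, n, and the number of trials are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(5)
sigma, n, trials = 2.0, 100, 20_000

# Scatter of the mean of n identical Gaussians, measured over many trials
means = rng.normal(loc=0.0, scale=sigma, size=(trials, n)).mean(axis=1)
print(means.std())         # measured scatter of the mean
print(sigma / np.sqrt(n))  # predicted sigma/sqrt(n) = 0.2
```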
Gaussians: Properties
• Central limit theorem: given a collection of random variables $X_i$ with means $\mu_i$ and variances $\sigma_i^2$,
$$\frac{1}{s_n} \sum_{i=1}^{n} (X_i - \mu_i) \to N(0, 1), \qquad s_n^2 = \sum_{i=1}^{n} \sigma_i^2$$
• Provided that (Lindeberg's condition, for every $\epsilon > 0$):
$$\frac{1}{s_n^2} \sum_{i=1}^{n} E\left[(X_i - \mu_i)^2\,\mathbf{1}\{|X_i - \mu_i| > \epsilon s_n\}\right] \to 0$$
Gaussians: Properties
• Central limit theorem in action:
[Figure: a single non-Gaussian distribution, and the distributions of the means of 2, 3, and 4 draws from it, approaching a Gaussian]
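A sketch reproducing this demonstration numerically, using a uniform distribution as the non-Gaussian starting point:

```python
import numpy as np

# Means of n draws from a uniform distribution approach a Gaussian as n grows.
# Here we just print the mean and spread; a histogram of `means` at each n
# looks increasingly Gaussian.
rng = np.random.default_rng(11)
for n in (1, 2, 3, 4):
    means = rng.uniform(size=(100_000, n)).mean(axis=1)
    print(n, means.mean(), means.std())  # mean stays 0.5; std shrinks as 1/sqrt(12n)
```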
Gaussians: Multivariate
$$P(\mathbf{x}; \boldsymbol{\mu}, C) = \frac{1}{\sqrt{(2\pi)^n |C|}} \exp\left(-\frac{1}{2}(\mathbf{x} - \boldsymbol{\mu})^T C^{-1} (\mathbf{x} - \boldsymbol{\mu})\right)$$
• C is the covariance matrix: it describes the correlations between quantities
• For example: data points often have correlated errors
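scipy provides this distribution directly; a sketch with an illustrative 2D covariance matrix:

```python
import numpy as np
from scipy import stats

# A 2D Gaussian with correlated components (made-up covariance matrix)
mu = np.array([0.0, 0.0])
C = np.array([[1.0, 0.8],
              [0.8, 2.0]])
dist = stats.multivariate_normal(mean=mu, cov=C)

print(dist.pdf([0.5, -0.5]))                     # evaluate the density at a point
samples = dist.rvs(size=10_000, random_state=2)  # draw correlated samples
print(np.cov(samples.T))                         # recovers ~C
```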
Interpretations of Probability

                               Frequentists                      Bayesians
Use probabilities to …         describe frequencies              quantify information
Think model parameters are …   fixed unknowns                    random variables with probabilities
Think data is …                a repeatable random variable      observed and therefore fixed
Call their work …              "Statistics"                      "Inference"
Make statements about …        intervals covering the truth      constraints on model parameters
                               x% of the time
Have …                         many approaches with              one approach with
                               lots of implicit choices          explicit choices
Why Bayesian probability for science?
• Answers the right question
  • We want facts about the world, not about hypothetical ensembles of experiments
• The ideal process is always clear
  • Practical implementations are more difficult
• Problems and questions are more explicit