Semantic Foundations for Probabilistic Programming

Chris Heunen
Ohad Kammar, Sam Staton, Frank Wood, Hongseok Yang

1 / 21

Semantic foundations

⟦−⟧ : programs → mathematical objects

    s1      ↦  •
    s2      ↦  •
    s1;s2   ↦  •

◮ Operational: remember implementation details (efficiency)
◮ Denotational: see what program does conceptually (correctness)

Motivation:
◮ Ground programmer’s unspoken intuitions
◮ Justify/refute/suggest program transformations
◮ Understand programming through mathematics
◮ Understand probability through program equations

2 / 21

Probabilistic programming

P(A | B) = P(B | A) × P(A) / P(B)
P(A | B) ∝ P(B | A) × P(A)
posterior ∝ likelihood × prior

idealized Anglican = functional programming + normalize, observe, sample
http://www.robots.ox.ac.uk/~fwood/anglican

3 / 21
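The step from "=" to "∝" drops the denominator P(B), which does not depend on A. As a small worked form, assuming the hypotheses form a countable partition A_1, A_2, ... (the index i is introduced here for illustration and is not on the slide), the normalizing constant can be recovered by summation:

```latex
P(A_i \mid B) = \frac{P(B \mid A_i)\, P(A_i)}{P(B)},
\qquad
P(B) = \sum_{j} P(B \mid A_j)\, P(A_j),
\qquad\text{so}\qquad
P(A_i \mid B) \propto P(B \mid A_i)\, P(A_i).
```

Normalizing a program's weighted outcomes plays the role of dividing by P(B).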

Overview

◮ Interpret types as measurable spaces, e.g. ⟦real⟧ = R
◮ Interpret (open) terms as kernels
◮ Interpret closed terms as measures
◮ Inference normalizes measures: posterior ∝ likelihood × prior

But:
◮ Commutativity? Fubini not true for all kernels
◮ Higher order functions? R → R not a measurable space
◮ Extensionality?
◮ Recursion?

[Kozen, “Semantics of probabilistic programs”, J Comp Syst Sci, 1981]
[Aumann, “Borel structures for function spaces”, Ill J Math, 1961]

4 / 21

Example

1. Toss a fair coin to get outcome x
2. Set up exponential decay with rate r depending on x
3. Observe immediate decay
4. What is the outcome x?

Two traces, each with prior probability 0.5:

    program                              trace 1         trace 2
    let x = sample(bern(0.5)) in         x = true        x = false
    let r = if x then 2.0 else 1.0 in    r = 2.0         r = 1.0
    observe(0.0 from exp(r));            score 2         score 1
    return x                             return true     return false

    likelihood × prior                   2 × 0.5         1 × 0.5

posterior ∝ likelihood × prior

Unnormalized weights: P(true) = 1, P(false) = 0.5
Model evidence (score): 1.5
Normalized posterior: P(true) = 2/3 ≈ 67%, P(false) = 1/3 ≈ 33%

Programs may also sample continuous distributions, so we have to deal with an uncountable number of traces:

    let y = sample(gauss(7,2))

5 / 21
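The trace calculation above can be reproduced by brute-force enumeration. The following is a minimal Python sketch, not Anglican itself; the helper names `exp_density` and `enumerate_traces` are made up for illustration, and it assumes that `observe(0.0 from exp(r))` scores the trace with the exponential density r·e^(−r·t) evaluated at t = 0.

```python
import math

def exp_density(rate, t):
    """Density of the exponential distribution with the given rate at t >= 0."""
    return rate * math.exp(-rate * t)

def enumerate_traces():
    """Return {outcome: likelihood * prior} for the two traces of the example."""
    weights = {}
    for x, prior in ((True, 0.5), (False, 0.5)):   # sample(bern(0.5))
        r = 2.0 if x else 1.0                      # rate depends on the coin
        likelihood = exp_density(r, 0.0)           # observe(0.0 from exp(r))
        weights[x] = likelihood * prior
    return weights

weights = enumerate_traces()
evidence = sum(weights.values())                   # model evidence
posterior = {x: w / evidence for x, w in weights.items()}
print(weights, evidence, posterior)
```

Enumeration only works here because the program has finitely many traces; the continuous case is what motivates the measure-theoretic treatment.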

Measure theory

Impossible to sample exactly 0.5 from a standard normal distribution (that event has probability 0)
But a sample lies in the interval (0, 1) with probability around 0.34

A measurable space is a set X with a family Σ_X of subsets that is closed under countable unions and complements

A (probability) measure on X is a function p : Σ_X → [0, ∞] that satisfies p(⋃_n U_n) = ∑_n p(U_n) for pairwise disjoint U_n (and has p(X) = 1)

6 / 21
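The two numerical claims about the standard normal distribution can be checked with the standard library alone. This is a small sketch; the function names `std_normal_cdf` and `prob_interval` are hypothetical helpers, and intervals stand in for general measurable sets.

```python
import math

def std_normal_cdf(x):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def prob_interval(a, b):
    """Probability that a standard normal sample lies between a and b."""
    return std_normal_cdf(b) - std_normal_cdf(a)

p_unit = prob_interval(0.0, 1.0)    # measure of the interval (0, 1)
p_point = prob_interval(0.5, 0.5)   # measure of the single point 0.5
# Additivity on two disjoint pieces whose union is (0, 1) up to a null set.
p_split = prob_interval(0.0, 0.5) + prob_interval(0.5, 1.0)
print(round(p_unit, 4), p_point, round(p_split, 4))
```

A single point gets measure zero while the interval around it does not, which is why the semantics assigns weights to measurable sets rather than to individual outcomes.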

First order language

◮ Types: A, B ::= R | P(A) | 1 | A × B | ∑_{i∈I} A_i

    R               real numbers
    P(A)            distributions over A
    1, A × B        finite products
    ∑_{i∈I} A_i     countable sums

    bool := 1 + 1
    nat := ∑_{i∈N} 1

◮ Deterministic terms may not sample:
  ◮ variables: x, y, z
  ◮ constructors for sums and products: case, in_i, if, false, true
  ◮ measurable functions: bern, exp, gauss, dirac
  ◮ inference: norm

    ⊢d 42.0 : R
    ⊢d gauss(2.0, 7.0) : P(R)
    x : R, y : R ⊢d x + y : R
    x : R, y : R ⊢d x < y : bool

◮ Probabilistic terms may sample:
  ◮ sequencing: return, let
  ◮ constraints: score
  ◮ priors: sample

    Γ ⊢d t : A                 Γ ⊢p t : A    Γ, x : A ⊢p u : B
    -------------------        --------------------------------
    Γ ⊢p return(t) : A         Γ ⊢p let x = t in u : B

    Γ ⊢d t : R                 Γ ⊢d t : P(A)
    -------------------        -------------------
    Γ ⊢p score(t) : 1          Γ ⊢p sample(t) : A

7 / 21
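To make the two judgments concrete, here is a minimal Python sketch of a checker for a fragment of the language. All class and function names (`Real`, `Dist`, `check_d`, `check_p`, and so on) are hypothetical, products, sums and `norm` are left out, and the parameter names of `Gauss` are an assumption since the slides do not fix the argument order.

```python
from dataclasses import dataclass

# --- Types (fragment: R, 1, P(A)) ---
@dataclass(frozen=True)
class Real:
    pass

@dataclass(frozen=True)
class Unit:
    pass

@dataclass(frozen=True)
class Dist:
    of: object

# --- Deterministic terms ---
@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Const:
    value: float

@dataclass(frozen=True)
class Gauss:
    mu: object      # assumed parameter naming, not from the slides
    sigma: object

# --- Probabilistic terms ---
@dataclass(frozen=True)
class Return:
    term: object

@dataclass(frozen=True)
class Let:
    var: str
    bound: object
    body: object

@dataclass(frozen=True)
class Score:
    term: object

@dataclass(frozen=True)
class Sample:
    term: object

def check_d(ctx, t):
    """Type of a deterministic term: the judgment ctx |-d t : A."""
    if isinstance(t, Var):
        return ctx[t.name]
    if isinstance(t, Const):
        return Real()
    if isinstance(t, Gauss):
        if check_d(ctx, t.mu) != Real() or check_d(ctx, t.sigma) != Real():
            raise TypeError("gauss expects two real arguments")
        return Dist(Real())
    raise TypeError(f"not a deterministic term: {t!r}")

def check_p(ctx, t):
    """Type of a probabilistic term: the judgment ctx |-p t : A."""
    if isinstance(t, Return):
        return check_d(ctx, t.term)
    if isinstance(t, Let):
        bound_type = check_p(ctx, t.bound)
        return check_p({**ctx, t.var: bound_type}, t.body)
    if isinstance(t, Score):
        if check_d(ctx, t.term) != Real():
            raise TypeError("score expects a real")
        return Unit()
    if isinstance(t, Sample):
        dist_type = check_d(ctx, t.term)
        if not isinstance(dist_type, Dist):
            raise TypeError("sample expects a distribution")
        return dist_type.of
    raise TypeError(f"not a probabilistic term: {t!r}")

# let y = sample(gauss(7, 2)) in return y
program = Let("y", Sample(Gauss(Const(7.0), Const(2.0))), Return(Var("y")))
print(check_p({}, program))
```

Keeping `check_d` and `check_p` as separate functions mirrors the split between terms that may not sample and terms that may.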

First order semantics

Interpret
◮ type A as measurable space ⟦A⟧
◮ deterministic term Γ ⊢d t : A as measurable function ⟦Γ⟧ → ⟦A⟧
◮ probabilistic term Γ ⊢p t : A as kernel ⟦t⟧ : ⟦Γ⟧ × Σ_⟦A⟧ → [0, ∞]
  (fixing first argument: measure; fixing second argument: measurable)

    Γ ⊢d t : R
    -------------------        ⟦score(t)⟧(γ, ∗) = ⟦t⟧(γ)
    Γ ⊢p score(t) : 1

    Γ ⊢d t : P(A)
    -------------------        ⟦sample(t)⟧(γ, U) = (⟦t⟧(γ))(U)
    Γ ⊢p sample(t) : A

8 / 21
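For measures with finite support, the kernel semantics can be sketched directly in Python. This is an illustrative approximation under stated assumptions, not the measure-theoretic definition: a measure is a dict from values to weights, a probabilistic term is a function from an environment to such a dict, and the clause for `let` (summing the body against the bound term's measure) is the usual integration clause, which these slides do not show. The helper names `ret`, `let`, `normalize` and `bern` are hypothetical.

```python
def ret(f):
    """return(t): point mass at the value of the deterministic term f."""
    return lambda env: {f(env): 1.0}

def sample(f):
    """sample(t): the measure that the deterministic term f denotes."""
    return lambda env: dict(f(env))

def score(f):
    """score(t): measure on the one-point space with total weight f(env)."""
    return lambda env: {(): f(env)}

def let(var, t, u):
    """let var = t in u: sum the body's measure against the measure of t."""
    def kernel(env):
        out = {}
        for a, w in t(env).items():
            for b, v in u({**env, var: a}).items():
                out[b] = out.get(b, 0.0) + w * v
        return out
    return kernel

def normalize(measure):
    """Return (evidence, posterior); fails when the evidence is zero."""
    evidence = sum(measure.values())
    if evidence <= 0.0:
        raise ValueError("cannot normalize a measure with zero evidence")
    return evidence, {a: w / evidence for a, w in measure.items()}

def bern(p):
    """Bernoulli distribution as a finite measure."""
    return {True: p, False: 1.0 - p}

# The coin/decay example, with observe(0.0 from exp(r)) written as score(r).
program = let("x", sample(lambda env: bern(0.5)),
          let("_", score(lambda env: 2.0 if env["x"] else 1.0),
          ret(lambda env: env["x"])))

evidence, posterior = normalize(program({}))
print(evidence, posterior)
```

Calling `program({})` interprets the closed term as a measure, and `normalize` then performs the inference step from the overview.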

