Probabilistic Programming Practical
Frank Wood, Brooks Paige
{fwood,brooks}@robots.ox.ac.uk
MLSS 2015
Setup
Java (v. 1.5 or later)
Java Installation
Mac and Windows: download and run the installer from https://www.java.com/en/download/manual.jsp
Linux:
# Debian/Ubuntu
sudo apt-get install default-jre
# Fedora
sudo yum install java-1.7.0-openjdk
Leiningen (v. 2.0 or later)
Leiningen Installation
# Download lein to ~/bin
mkdir ~/bin
cd ~/bin
wget http://git.io/XyijMQ
# Make executable
chmod a+x ~/bin/lein
# Add ~/bin to path
echo 'export PATH="$HOME/bin:$PATH"' >> ~/.bashrc
# Run lein
lein
Further details: http://leiningen.org/
Practical Materials
• Download https://bitbucket.org/probprog/mlss2015/get/master.zip
• cd mlss2015
• lein gorilla
• open the URL
Schedule
• 15:35 - 16:05 Intro / Hello World!
• 16:05 - 16:30 Gaussian (you code)
• 16:30 - 16:40 Discuss / intro to physics problem
• 16:40 - 16:55 Physics (you code)
• 16:55 - 17:00 Share / discuss solutions
• 17:00 - 17:20 Inference explanation
• 17:20 - 17:45 Poisson (you code)
• 17:45 - 17:50 Inference Q/A
• 17:50 - 18:05 Coordination (you code)
What is probabilistic programming?
An Emerging Field
[Diagram: probabilistic programming at the intersection of three fields]
• ML: algorithms & applications
• Statistics: inference & theory
• PL: compilers, semantics, analysis
Conceptualization
[Diagram: two views of the same object]
• CS view: parameters are inputs to a program, which produces output.
• Statistics view: a model p(y | x) p(x) relates parameters x to observations y; inference computes p(x | y).
Probabilistic programming connects the two: the program is the model, and inference inverts it.
Operative Definition “Probabilistic programs are usual functional or imperative programs with two added constructs: (1) the ability to draw values at random from distributions, and (2) the ability to condition values of variables in a program via observations.” Gordon et al, 2014
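These two constructs can be illustrated with a minimal sketch (in Python rather than a real probabilistic language; the toy model, a Gaussian with unknown mean, and all function names are assumptions for illustration, not part of the quoted definition). Drawing the latent from its prior and scoring the observation yields a weighted execution, and importance weighting recovers the posterior:

```python
import math
import random

def normal_logpdf(x, mu, sigma):
    # log-density of N(x; mu, sigma^2), used to score observed values
    return -0.5 * math.log(2 * math.pi * sigma * sigma) \
           - (x - mu) ** 2 / (2 * sigma * sigma)

def program(y):
    # (1) draw a value at random from a distribution
    mu = random.gauss(0.0, 1.0)
    # (2) condition via an observation: accumulate a log-weight
    log_w = normal_logpdf(y, mu, 0.5)
    return mu, log_w

random.seed(0)
traces = [program(0.8) for _ in range(20000)]
# importance-weighted posterior mean estimate of mu given y = 0.8
post_mean = (sum(mu * math.exp(lw) for mu, lw in traces)
             / sum(math.exp(lw) for _, lw in traces))
```

For this conjugate model the exact posterior mean is 0.8 · 1/(1 + 0.5²) = 0.64, so the weighted estimate should land close to that value.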
What are the goals of probabilistic programming?
Simplify Machine Learning…
[Flowchart: the traditional workflow — identify and formalize the problem and gather data; search for a usable existing implementation or a high-level modeling tool with the required features; otherwise design the model, read papers and do the math, derive updates and an inference algorithm, code, test, and deploy — iterating on whether the model performs well statistically, scales computationally, and is feasible. Legend: edge colors indicate the skills required to traverse each edge — non-specialist, PhD-level machine learning or statistics, or PhD-level computer science.]
To This
[Flowchart: the same workflow with the model-design and inference-derivation steps collapsed into "write probabilistic program; debug, test, profile" — inference is automated, so non-specialist skills suffice to traverse most edges.]
Automate Inference
[Diagram: a stack with models / stochastic simulators (graphical models such as HMMs, topic models, and nonparametric mixtures) at the top, a programming language representation / abstraction layer in the middle, and interchangeable inference engine(s) at the bottom.]
Hello World!
First Exercise: Gaussian Unknown Mean
• Learning objectives
1. Clojure
2. Gorilla REPL
3. Anglican
4. Automatic inference over generative models expressed as programs, via query
• Resources
• https://clojuredocs.org/
• https://bitbucket.org/probprog/anglican/
• http://www.robots.ox.ac.uk/~fwood/anglican/index.html
Simulation
Second Exercise
Learning objectives
1. Develop experience thinking about expressing problems as inference over program executions
2. Understand how to perform inference over a complex deterministic generative process, here a 2D physics simulator
Second Exercise
Use inference to solve a mechanism-design optimization task: get all balls safely into the bin.
Inference
Trace Probability
The joint probability of a program execution trace factors as

p(y_{1:N}, x_{1:N}) = ∏_{n=1}^{N} g(y_n | x_{1:n}) f(x_n | x_{1:n−1})

• y_n are the observed data points; x_n are the internal random choices
• simulate from f(x_n | x_{1:n−1}) by running the program forward
• weight traces by the observation densities g(y_n | x_{1:n})
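As a sketch of this factorization (Python, with an assumed toy model: a Gaussian random walk for f and a Gaussian observation density for g — both choices are illustrative, not from the slides), running the program forward samples each x_n from f while each observation contributes a factor of g to the trace's log-weight:

```python
import math
import random

def normal_logpdf(x, mu, sigma):
    # log-density of N(x; mu, sigma^2)
    return -0.5 * math.log(2 * math.pi * sigma * sigma) \
           - (x - mu) ** 2 / (2 * sigma * sigma)

def simulate_trace(ys):
    # Run the program forward: x_n ~ f(x_n | x_{1:n-1}) is a Gaussian
    # random walk; each observe y_n adds log g(y_n | x_{1:n}) to the
    # trace's log-weight.  Returns (x_{1:N}, log-weight).
    x, xs, log_w = 0.0, [], 0.0
    for y in ys:
        x = random.gauss(x, 1.0)           # simulate from f
        xs.append(x)
        log_w += normal_logpdf(y, x, 1.0)  # weight by the observe
    return xs, log_w

random.seed(1)
xs, log_w = simulate_trace([0.5, 1.0, 1.5])
```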
Trace
[Diagram: a tree of possible execution traces, branching on the sampled values of x_{1,2} and x_{2,1}.]

(let [x-1-1 3
      x-1-2 (sample (discrete (range x-1-1)))]
  (if (not= x-1-2 1)
    (let [x-2-1 (+ x-1-2 7)]
      (sample (poisson x-2-1)))))
Observe
[Diagram: the same trace tree, now with an observation constraining which executions are probable.]

(let [x-1-1 3
      x-1-2 (sample (discrete (range x-1-1)))]
  (if (not= x-1-2 1)
    (let [x-2-1 (+ x-1-2 7)]
      (observe (gaussian x-2-1 0.0001) 7)
      (sample (poisson x-2-1)))))
“Single Site” MCMC = LMH
The posterior distribution over execution traces is proportional to the trace score with observed values plugged in:

p(x | y) ∝ p̃(y = observes, x)

Metropolis-Hastings acceptance rule:

min(1, [p(y | x′) p(x′) q(x | x′)] / [p(y | x) p(x) q(x′ | x)])

• Need: a proposal q
• Have: likelihoods (via observe statement restrictions) and the prior (the sequence of ERP returns, scored in the interpreter)

Lightweight Implementations of Probabilistic Programming Languages Via Transformational Compilation [Wingate, Stuhlmüller et al, 2011]
LMH Proposal

q(x′ | x) = (1 / |x|) κ(x′_{m,j} | x_{m,j}) p(x′ \ x | x′ ∩ x)

• |x| — number of stochastic procedure (SP) applications in the original trace
• κ(x′_{m,j} | x_{m,j}) — probability of the new SP output (sample) at the single updated site
• p(x′ \ x | x′ ∩ x) — probability of the new part of the proposed execution trace given the shared trace prefix

[Wingate, Stuhlmüller et al, 2011]
LMH Implementation
“Single site update” = sample from the prior = run the program forward, so that κ(x′_{m,j} | x_{m,j}) = p(x′_{m,j} | x′ ∩ x). The acceptance ratio then simplifies to

[p(y | x′) p(x′) |x| p(x \ x′ | x ∩ x′)] / [p(y | x) p(x) |x′| p(x′ \ x | x′ ∩ x)]

• |x|, |x′| — number of SP applications in the original and proposed traces
• p(x \ x′ | x ∩ x′) — probability of regenerating the current trace continuation given the proposal trace beginning
• p(x′ \ x | x′ ∩ x) — probability of generating the proposal trace continuation given the current trace beginning
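For a model with a single random choice, the continuation and trace-size terms cancel and the simplified ratio reduces to a likelihood ratio. A sketch of that special case (Python; the Gaussian-unknown-mean model and all names here are illustrative assumptions, not the paper's implementation):

```python
import math
import random

def normal_logpdf(x, mu, sigma):
    # log-density of N(x; mu, sigma^2)
    return -0.5 * math.log(2 * math.pi * sigma * sigma) \
           - (x - mu) ** 2 / (2 * sigma * sigma)

def lmh_gaussian_mean(y, n_iters):
    # Single-site update: resample the sole random choice mu from its
    # prior ("run the program forward").  Prior and proposal terms
    # cancel, so acceptance uses only the likelihood ratio
    # p(y | mu') / p(y | mu).
    mu = random.gauss(0.0, 1.0)
    samples = []
    for _ in range(n_iters):
        mu_prop = random.gauss(0.0, 1.0)  # propose from the prior
        log_alpha = (normal_logpdf(y, mu_prop, 0.5)
                     - normal_logpdf(y, mu, 0.5))
        if math.log(random.random()) < log_alpha:
            mu = mu_prop
        samples.append(mu)
    return samples

random.seed(2)
samples = lmh_gaussian_mean(0.8, 20000)
post_mean = sum(samples) / len(samples)
```

For this conjugate model the exact posterior mean is 0.64, which the chain's average should approach.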
Introduction: Sequential Monte Carlo
Sequential Monte Carlo targets

p(x_{1:N} | y_{1:N}) ∝ p̃(y_{1:N}, x_{1:N}) ≡ ∏_{n=1}^{N} g(y_n | x_{1:n}) f(x_n | x_{1:n−1})

with a weighted set of particles

p(x_{1:N} | y_{1:N}) ≈ Σ_{ℓ=1}^{L} w_N^ℓ δ_{x_{1:N}^ℓ}(x_{1:N}).

Noting the identity

p(x_{1:n} | y_{1:n}) ∝ g(y_n | x_{1:n}) f(x_n | x_{1:n−1}) p(x_{1:n−1} | y_{1:n−1}),

we can use importance sampling to generate samples from p(x_{1:n} | y_{1:n}) given a sample-based approximation to p(x_{1:n−1} | y_{1:n−1}).
SMC
[Diagram: particles evolving from n = 1 to n = 2 across an observe.]
Iteratively:
• simulate
• weight each particle at the observe
• resample
SMC for Probabilistic Programming
Parallel executions approximate the previous target:

p(x_{1:n−1} | y_{1:n−1}) ≈ Σ_{ℓ=1}^{L} w_{n−1}^ℓ δ_{x_{1:n−1}^ℓ}(x_{1:n−1})

The sequence of environments is

p(x_{1:n} | y_{1:n}) ∝ g(y_n | x_{1:n}) f(x_n | x_{1:n−1}) p(x_{1:n−1} | y_{1:n−1})

with proposal

q(x_{1:n} | y_{1:n}) = f(x_n | x_{1:n−1}) p(x_{1:n−1} | y_{1:n−1}).

Running each program forward until the next observe draws x_n^ℓ ∼ f, extending x_{1:n}^ℓ = (x_{1:n−1}^{a_{n−1}^ℓ}, x_n^ℓ); the weight of each particle is the observation likelihood, giving

p(x_{1:n} | y_{1:n}) ≈ Σ_{ℓ=1}^{L} g(y_n | x_{1:n}^ℓ) δ_{x_{1:n}^ℓ}(x_{1:n}).

W., van de Meent, and Mansinghka, “A New Approach to Probabilistic Programming Inference,” AISTATS 2014
Fischer, Kiselyov, and Shan, “Purely Functional Lazy Non-deterministic Programming,” ACM SIGPLAN 2009
Paige and W., “A Compilation Target for Probabilistic Programming Languages,” ICML 2014
SMC Methods Only Require
• Initialization: sampling from p(x_1)
• Forward simulation: sampling from f(x_n | x_{1:n−1})
• Observation likelihood computation: pointwise evaluation of g(y_n | x_{1:n}), up to normalization
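A bootstrap SMC sketch that uses only these three operations (Python; the Gaussian random-walk model, its parameters, and multinomial resampling at every step are illustrative assumptions):

```python
import math
import random

def normal_logpdf(x, mu, sigma):
    # log-density of N(x; mu, sigma^2)
    return -0.5 * math.log(2 * math.pi * sigma * sigma) \
           - (x - mu) ** 2 / (2 * sigma * sigma)

def smc(ys, n_particles):
    # Initialization: x_1 ~ p(x_1)
    particles = [[random.gauss(0.0, 1.0)] for _ in range(n_particles)]
    for n, y in enumerate(ys):
        # Forward simulation: x_n ~ f(x_n | x_{1:n-1})
        if n > 0:
            for trace in particles:
                trace.append(random.gauss(trace[-1], 1.0))
        # Observation likelihood: pointwise evaluation of g(y_n | x_{1:n})
        log_ws = [normal_logpdf(y, trace[-1], 1.0) for trace in particles]
        m = max(log_ws)
        ws = [math.exp(lw - m) for lw in log_ws]  # stabilized weights
        # Multinomial resampling
        particles = [list(t) for t in
                     random.choices(particles, weights=ws, k=n_particles)]
    return particles

random.seed(3)
particles = smc([0.5, 1.0, 1.5], 2000)
filter_mean = sum(t[-1] for t in particles) / len(particles)
```

For this linear-Gaussian model the exact filtering mean of the last state (via the Kalman recursions) is about 1.19, so the particle estimate should be nearby.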
SMC for Probabilistic Programming
Algorithm 1: Parallel SMC program execution
Assume: N observations, L particles
launch L copies of the program (parallel)
for n = 1 … N do
  wait until all L reach observe y_n (barrier)
  update unnormalized weights w̃_n^{1:L} (serial)
  if ESS < τ then
    sample numbers of offspring O_n^{1:L} (serial)
    set weights w̃_n^{1:L} = 1 (serial)
    for ℓ = 1 … L do
      fork or exit (parallel)
    end for
  else
    set all numbers of offspring O_n^ℓ = 1 (serial)
  end if
  continue program execution (parallel)
end for
wait until L program traces terminate (barrier)
predict from L samples from p̂(x_{1:N}^{1:L} | y_{1:N}) (serial)

Paige and W., “A Compilation Target for Probabilistic Programming Languages,” ICML 2014
SMC for Probabilistic Programming
Intuitively: run threads forward, wait at each observe (which acts as a delimiter and barrier), then fork continuations of the surviving particles.