From lazy evaluation to Gibbs sampling
Chung-chieh Shan, Indiana University
March 19, 2014
This work is supported by DARPA grant FA8750-14-2-0007.

Come to Indiana University to create essential abstractions and practical languages for clear, robust, and efficient programs.
  Dan Friedman: relational & logic languages, meta-circularity & reflection
  Ryan Newton: streaming, distributed & GPU DSLs, Haskell deterministic parallelism
  Amr Sabry: quantum computing, type theory, information effects
  Chung-chieh Shan: probabilistic programming, semantics
  Jeremy Siek: gradual typing, mechanized metatheory, high-performance languages
  Sam Tobin-Hochstadt: types for untyped languages, contracts, languages for the Web
Check out our work: Boost Libraries · Build to Order BLAS · C++ Concepts · Chapel Generics · HANSEI · JavaScript Modules · Racket & Typed Racket · miniKanren · LVars · monad-par · meta-par · WaveScript
http://lambda.soic.indiana.edu/

Probabilistic programming
Alice beat Bob at a game. Is she better than him at it?
Generative story

Probabilistic programming
Alice beat Bob at a game. Is she better than him at it?
Generative story
  a <- normal 10 3
[Figure: histogram of sampled values of a]

Probabilistic programming
Alice beat Bob at a game. Is she better than him at it?
Generative story
  a <- normal 10 3
  b <- normal 10 3
[Figure: scatter plot of b against a]

Probabilistic programming
Alice beat Bob at a game. Is she better than him at it?
Generative story
  a <- normal 10 3
  b <- normal 10 3
  l <- normal 0 2      -- noise
[Figure: 3D scatter plot of noise l against a and b]

Probabilistic programming
Alice beat Bob at a game. Is she better than him at it?
Generative story
  a <- normal 10 3
  b <- normal 10 3
  l <- normal 0 2      -- noise
Observed effect
  condition (a-b > l)
[Figure: 3D scatter plot of noise l against a and b]

Probabilistic programming
Alice beat Bob at a game. Is she better than him at it?
Generative story
  a <- normal 10 3
  b <- normal 10 3
  l <- normal 0 2      -- noise
Observed effect
  condition (a-b > l)
Hidden cause
  return (a > b)
[Figure: 3D scatter plot of noise l against a and b]

Probabilistic programming
Alice beat Bob at a game. Is she better than him at it?
Generative story
  a <- normal 10 3
  b <- normal 10 3
  l <- normal 0 2      -- noise
Observed effect
  condition (a-b > l)
Hidden cause
  return (a > b)
Denoted measure:
  $\lambda c.\ \iiint [a - b > l]\; c(a > b)\ \mathcal{N}(10,3)(\mathrm{d}a)\,\mathcal{N}(10,3)(\mathrm{d}b)\,\mathcal{N}(0,2)(\mathrm{d}l)$
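
To make the forward reading of this story concrete, here is a minimal sketch (my addition, not the talk's code) that runs it by rejection sampling; `normal`, `runOnce`, and `estimate` are names introduced here for illustration, and the normal draws use the Box-Muller transform:

import System.Random (randomRIO)

-- Draw from a normal distribution (mean, standard deviation)
-- via the Box-Muller transform.
normal :: Double -> Double -> IO Double
normal mu sigma = do
  u1 <- randomRIO (1e-12, 1)          -- keep away from 0 to avoid log 0
  u2 <- randomRIO (0, 1)
  return (mu + sigma * sqrt (-2 * log u1) * cos (2 * pi * u2))

-- One forward run of the story; Nothing means the observation was rejected.
runOnce :: IO (Maybe Bool)
runOnce = do
  a <- normal 10 3
  b <- normal 10 3
  l <- normal 0 2
  return (if a - b > l then Just (a > b) else Nothing)

-- Estimate P(Alice is better | Alice won) from n accepted runs.
estimate :: Int -> IO Double
estimate n = go n (0 :: Int) (0 :: Int)
  where
    go 0 yes total = return (fromIntegral yes / fromIntegral total)
    go k yes total = do
      r <- runOnce
      case r of
        Just True  -> go (k - 1) (yes + 1) (total + 1)
        Just False -> go (k - 1) yes (total + 1)
        Nothing    -> go k yes total      -- rejected run: try again

For example, `estimate 10000` approximates the posterior probability that Alice is the better player given that she won.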

Sampling is hard. Let's do math!
Filtering = tracking current state with uncertainty

Sampling is hard. Let's do math!
Filtering = tracking current state with uncertainty
Generative story
  x <- normal 10 3
[Figure: histogram of sampled values of x]

Sampling is hard. Let's do math!
Filtering = tracking current state with uncertainty
Generative story
  x <- normal 10 3
  m <- normal x 1
[Figure: scatter plot of m against x]

Sampling is hard. Let's do math!
Filtering = tracking current state with uncertainty
Generative story
  x <- normal 10 3
  m <- normal x 1
  x' <- normal (x+5) 2
[Figure: 3D scatter plot of x' against x and m]

Sampling is hard. Let's do math!
Filtering = tracking current state with uncertainty
Generative story
  x <- normal 10 3
  m <- normal x 1
  x' <- normal (x+5) 2
Observed effect
  condition (m = 9)
Hidden cause
  return x'
[Figure: 3D scatter plot of x' against x and m]

Sampling is hard. Let's do math!
Filtering = tracking current state with uncertainty
Conditioning = clamp first/outermost choice/integral
Generative story
  x <- normal 10 3
  m <- normal x 1
  x' <- normal (x+5) 2
Observed effect
  condition (m = 9)
Hidden cause
  return x'
Rewritten so the observed choice comes first (outermost):
  m <- normal 10 (sqrt 10)
  x <- normal ((9*m + 10) / 10) (sqrt (9/10))
  x' <- normal (x+5) 2

Sampling is hard. Let's do math!
Filtering = tracking current state with uncertainty
Conditioning = clamp first/outermost choice/integral
Generative story
  x <- normal 10 3
  m <- normal x 1
  x' <- normal (x+5) 2
Observed effect
  condition (m = 9)
Hidden cause
  return x'
Clamping the outermost (observed) choice:
  let m = 9
  x <- normal ((9*m + 10) / 10) (sqrt (9/10))
  x' <- normal (x+5) 2

Sampling is hard. Let's do math!
Filtering = tracking current state with uncertainty
Conditioning = clamp first/outermost choice/integral
Conjugacy = absorb one choice/integral into another
Generative story
  x <- normal (91/10) (sqrt (9/10))
  x' <- normal (x+5) 2
Hidden cause
  return x'

Sampling is hard. Let's do math!
Filtering = tracking current state with uncertainty
Conditioning = clamp first/outermost choice/integral
Conjugacy = absorb one choice/integral into another
Generative story
  x' <- normal (141/10) (sqrt (49/10))
Hidden cause
  return x'
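
For reference (my addition), the numbers above follow from the standard Gaussian conjugate update, writing normal μ σ with σ the standard deviation (so variances are σ²):

\begin{align*}
x &\sim \mathcal{N}(10,\,3^2), \qquad m \mid x \sim \mathcal{N}(x,\,1^2), \qquad x' \mid x \sim \mathcal{N}(x+5,\,2^2) \\
m &\sim \mathcal{N}(10,\; 3^2 + 1^2) = \mathcal{N}\!\left(10,\,(\sqrt{10})^2\right) \\
x \mid m &\sim \mathcal{N}\!\left(\tfrac{1^2 \cdot 10 + 3^2 \cdot m}{3^2 + 1^2},\; \tfrac{3^2 \cdot 1^2}{3^2 + 1^2}\right)
          = \mathcal{N}\!\left(\tfrac{9m + 10}{10},\; \tfrac{9}{10}\right) \\
x \mid m = 9 &\sim \mathcal{N}\!\left(\tfrac{91}{10},\; \tfrac{9}{10}\right) \\
x' \mid m = 9 &\sim \mathcal{N}\!\left(\tfrac{91}{10} + 5,\; \tfrac{9}{10} + 2^2\right)
              = \mathcal{N}\!\left(\tfrac{141}{10},\; \tfrac{49}{10}\right)
\end{align*}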

Math is hard. Let's go sampling!
Each sample has an importance weight

Math is hard. Let's go sampling!
Each sample has an importance weight
Generative story
  x <- normal 10 3
  m <- normal x 1
  x' <- normal (x+5) 2
Observed effect
  condition (m = 9)
Hidden cause
  return x'
[Figure: scatter plot of x' against x]

Math is hard. Let's go sampling!
Each sample has an importance weight: how much did we rig our random choices to avoid rejection?
Generative story
  x <- normal 10 3
  m <- normal x 1
  x' <- normal (x+5) 2
Observed effect
  condition (m = 9)
Hidden cause
  return x'
[Figure: scatter plot of x' against x]

The story so far
1. Declarative program specifies generative story and observed effect.
2. We try mathematical optimizations, but still need to sample.
3. A sampler should generate a stream of samples (run-weight pairs) whose histogram matches the specified conditional distribution.
4. Importance sampling generates each sample independently (sketched below).
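
Here is a minimal likelihood-weighting sketch for the filtering story (my addition; it reuses the `normal` sampler sketched earlier, and `normalPdf` and `weightedRun` are names introduced here): instead of sampling m and rejecting unless m = 9, clamp m to 9 and record how much that rigged the run.

-- Gaussian density at a point (mean, standard deviation, value).
normalPdf :: Double -> Double -> Double -> Double
normalPdf mu sigma v = exp (negate ((v - mu)^2) / (2 * sigma^2))
                       / (sigma * sqrt (2 * pi))

-- One importance-weighted run: sample x and x' forward, clamp m = 9,
-- and weight the run by the density of Normal(x,1) at 9.
weightedRun :: IO (Double, Double)    -- (sampled x', importance weight)
weightedRun = do
  x  <- normal 10 3
  let w = normalPdf x 1 9
  x' <- normal (x + 5) 2
  return (x', w)

A stream of such pairs, histogrammed with each run counted in proportion to its weight, approximates the distribution of x' given m = 9.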

Markov chain Monte Carlo
For harder search problems, keep the previous sampling run in memory, and take a random walk that lingers around high-probability runs.

Markov chain Monte Carlo
For harder search problems, keep the previous sampling run in memory, and take a random walk that lingers around high-probability runs.
[Figure: Bayes net with nodes WingType, RotorLength (relevant when WingType = Helicopter), and BladeFlash]
Want:
1. match dimensions
2. reject less
3. infinite domain
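
As an illustration of the accept/reject shape of such a random walk (my sketch, not the talk's sampler): when each proposal is drawn fresh from the prior and weighted as in weightedRun above, independence Metropolis-Hastings accepts a proposal with probability min 1 (new weight / old weight).

import System.Random (randomRIO)

-- One step of an independence Metropolis-Hastings walk over weighted runs:
-- propose a fresh run, accept it with probability min 1 (wNew / wOld),
-- otherwise keep (and re-emit) the previous run.
mhStep :: IO (a, Double) -> (a, Double) -> IO (a, Double)
mhStep propose old@(_, wOld) = do
  new@(_, wNew) <- propose
  u <- randomRIO (0, 1)
  return (if u * wOld < wNew then new else old)

Iterating `mhStep weightedRun` from some initial run yields a chain that lingers around high-weight runs.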

A lazy probabilistic language

data Code = Evaluate [Loc] ([Value] -> Code)
          | Allocate Code (Loc -> Code)
          | Generate [(Value, Prob)]

type Prob   = Double
type Subloc = Int
type Loc    = [Subloc]
data Value  = Bool Bool | ...

A lazy probabilistic language

data Code = Evaluate [Loc] ([Value] -> Code)
          | Allocate Code (Loc -> Code)
          | Generate [(Value, Prob)]

bernoulli :: Prob -> Code
bernoulli p = Generate [(Bool True , p  ),
                        (Bool False, 1-p)]

example :: Code
example = Allocate (bernoulli 0.5) $ \w ->
          Allocate (bernoulli 0.5) $ \r ->
          Evaluate [w] $ \[Bool w] ->
          if w then Evaluate [r] $ \[Bool r] ->
                    if r then bernoulli 0.4
                         else bernoulli 0.8
               else bernoulli 0.2

[Figure: Bayes net with nodes WingType, RotorLength (relevant when WingType = Helicopter), and BladeFlash]
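
To make the operational reading concrete, here is a minimal forward sampler for Code (my sketch, given the Code, Loc, and Value declarations above; it is not the talk's inference machinery, and Heap, draw, run, and force are names introduced here). A location's value is generated only when Evaluate demands it, and is memoized in a heap.

import qualified Data.Map as Map
import Data.IORef (IORef, newIORef, readIORef, modifyIORef)
import System.Random (randomRIO)

-- Heap state: definitions of allocated locations, memoized values,
-- and the next fresh subloc.
data Heap = Heap { defs :: Map.Map Loc Code
                 , vals :: Map.Map Loc Value
                 , next :: Subloc }

-- Sample a value from a discrete weighted distribution.
draw :: [(Value, Prob)] -> IO Value
draw dist = do
  u <- randomRIO (0, sum (map snd dist))
  return (pick u dist)
  where
    pick u ((v, p) : rest)
      | u <= p || null rest = v
      | otherwise           = pick (u - p) rest

-- Run a Code to a Value, generating choices only when demanded.
run :: IORef Heap -> Code -> IO Value
run _ (Generate dist)   = draw dist
run h (Allocate def k)  = do
  heap <- readIORef h
  let loc = [next heap]
  modifyIORef h (\s -> s { defs = Map.insert loc def (defs s)
                         , next = next s + 1 })
  run h (k loc)
run h (Evaluate locs k) = do
  vs <- mapM (force h) locs
  run h (k vs)

-- Demand a location's value: reuse a memoized value, or run its
-- definition once and remember the result.
force :: IORef Heap -> Loc -> IO Value
force h loc = do
  heap <- readIORef h
  case Map.lookup loc (vals heap) of
    Just v  -> return v
    Nothing -> do
      v <- run h (defs heap Map.! loc)
      modifyIORef h (\s -> s { vals = Map.insert loc v (vals s) })
      return v

For instance, running `do h <- newIORef (Heap Map.empty Map.empty 0); run h example` draws a final value while forcing w, and forces r only when w came out True, illustrating the laziness the next slide exploits.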

Through the lens of lazy evaluation
To match dimensions, Wingate et al.'s MH sampler reuses random choices in the heap from the previous run. (memoization)
To reject less, Arora et al.'s Gibbs sampler evaluates code in the context of its desired output. (destination passing)
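
A cartoon of the heap-reuse idea (my addition, deliberately simplified and not Wingate et al.'s actual algorithm, which also needs a Metropolis-Hastings correction): start the next run from the previous heap with one memoized choice erased, so force reuses every other choice and only the erased one is resampled.

-- Erase one memoized choice and re-run; all other choices are reused.
resampleAt :: IORef Heap -> Loc -> Code -> IO Value
resampleAt h loc code = do
  modifyIORef h (\s -> s { vals = Map.delete loc (vals s) })
  run h code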

Summary
Probabilistic programming
  ◮ Denote measure by generative story
  ◮ Run backwards to infer cause from effect
Mathematical reasoning
  ◮ Define conditioning
  ◮ Reduce sampling
  ◮ Avoid rejection
Lazy evaluation
  ◮ Match dimensions (reversible jump)
  ◮ Reject less (Gibbs sampling)
  ◮ Infinite domain?