Markov Chains and MCMC
CompSci 590.02 (Spring 2013), Lecture 4
Instructor: Ashwin Machanavajjhala
Recap: Monte Carlo Method
• If U is a universe of items and G ⊆ U is the subset satisfying some property, we want to estimate |G|
  – Counting exactly is either intractable or inefficient
For i = 1 to N:
  • Choose u ∈ U uniformly at random
  • Check whether u ∈ G
  • Let X_i = 1 if u ∈ G, and X_i = 0 otherwise
Return the estimate (|U|/N) Σ_i X_i
Variance: with p = |G|/|U|, the estimate has variance |U|^2 p(1-p)/N
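A minimal sketch of this estimator in Python (added here for illustration; the explicit `universe` list, the `in_G` membership test, and the divisibility example are assumptions, not from the slides):

```python
import random

def monte_carlo_count(universe, in_G, N=10000):
    """Estimate |G| as |U| * (fraction of uniform samples that land in G)."""
    hits = 0
    for _ in range(N):
        u = random.choice(universe)   # choose u uniformly at random from U
        if in_G(u):                   # check whether u is in G
            hits += 1
    return len(universe) * hits / N

# Example: estimate how many of 1..10000 are divisible by 7
universe = list(range(1, 10001))
estimate = monte_carlo_count(universe, lambda u: u % 7 == 0)
print(estimate)   # should be close to 1428
```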
Recap: Monte Carlo Method
When is this method an FPRAS?
• |U| is known, and it is easy to sample uniformly from U
• It is easy to check whether a sample is in G
• |U|/|G| is small, i.e., polynomial in the size of the input, so polynomially many samples suffice
Recap: Importance Sampling
• In certain cases |G| << |U|, so the number of samples needed is no longer small (not polynomial)
• Suppose q(x) is the density of interest; instead, sample from a different, approximate density p(x) and reweight each sample by q(x)/p(x), using E_q[f(X)] = E_p[f(X) q(X)/p(X)]
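A sketch of the reweighted estimator in Python, following the slides' notation (q is the target density, p is the proposal we actually sample from); the Gaussian densities and the function being integrated are illustrative assumptions:

```python
import math, random

def importance_sample(f, q, p_sample, p_pdf, N=100000):
    """Estimate E_q[f(X)] using samples from p, reweighted by q(x)/p(x)."""
    total = 0.0
    for _ in range(N):
        x = p_sample()                      # draw from the proposal p
        total += f(x) * q(x) / p_pdf(x)     # importance weight q(x)/p(x)
    return total / N

# Toy example: q is a standard normal, p is a wider normal (sigma = 2)
q = lambda x: math.exp(-x * x / 2) / math.sqrt(2 * math.pi)
p_pdf = lambda x: math.exp(-x * x / 8) / (2 * math.sqrt(2 * math.pi))
p_sample = lambda: random.gauss(0, 2)

print(importance_sample(lambda x: x * x, q, p_sample, p_pdf))  # ~ 1 (variance of q)
```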
Today’s Class
• Markov Chains
• Markov Chain Monte Carlo (MCMC) sampling
  – a.k.a. the Metropolis-Hastings method
  – A standard technique for probabilistic inference in machine learning when the probability distribution is hard to compute exactly
Markov Chains
• Consider a time-varying random process that takes the value X_t at time t
  – Values of X_t are drawn from a finite (more generally, countable) set of states Ω
• {X_0, …, X_t, …, X_n} is a Markov chain if the value of X_t depends only on X_{t-1}:
  Pr[X_t = s | X_0, …, X_{t-1}] = Pr[X_t = s | X_{t-1}]
Transition Probabilities
• Pr[X_{t+1} = s_j | X_t = s_i], denoted P(i,j), is called the transition probability
  – Can be represented as an |Ω| x |Ω| matrix P
  – P(i,j) is the probability that the chain moves from state i to state j
• Let π_i(t) = Pr[X_t = s_i] denote the probability of being in state i at time t; then
  π_j(t+1) = Σ_i π_i(t) P(i,j)
• If π(t) denotes the 1 x |Ω| row vector of state probabilities at time t, then
  π(t+1) = π(t) P
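A small numerical illustration of the update π(t+1) = π(t) P; the 2-state matrix below is made up purely for illustration:

```python
import numpy as np

# A 2-state example: P[i, j] = Pr[X_{t+1} = s_j | X_t = s_i]; rows sum to 1.
P = np.array([[0.9, 0.1],
              [0.4, 0.6]])

pi = np.array([1.0, 0.0])          # pi(0): start deterministically in state 0
for t in range(3):
    pi = pi @ P                    # pi(t+1) = pi(t) P (row vector times matrix)
    print(t + 1, pi)
```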
Example
• Suppose Ω = {Rainy, Sunny, Cloudy}
• Tomorrow’s weather depends only on today’s weather
  – a Markov process
• Pr[X_{t+1} = Sunny | X_t = Rainy] = 0.25
• Pr[X_{t+1} = Sunny | X_t = Sunny] = 0 (no two consecutive days of sun; Seattle?)
Example
• Suppose Ω = {Rainy, Sunny, Cloudy}
• Tomorrow’s weather depends only on today’s weather
  – a Markov process
• Suppose today is Sunny. What is the weather 2 days from now?
Example
• Suppose Ω = {Rainy, Sunny, Cloudy}
• Tomorrow’s weather depends only on today’s weather
  – a Markov process
• Suppose today is Sunny. What is the weather 7 days from now?
Example
• Suppose Ω = {Rainy, Sunny, Cloudy}
• Tomorrow’s weather depends only on today’s weather
  – a Markov process
• Suppose today is Rainy. What is the weather 2 days from now? What about 7 days from now?
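A sketch of these forecasts as matrix powers. Note that only Pr[Sunny | Rainy] = 0.25 and Pr[Sunny | Sunny] = 0 come from the slides; the remaining entries of P are assumed here just to make the example runnable:

```python
import numpy as np

# States ordered (Rainy, Sunny, Cloudy). Only P[Rainy, Sunny] = 0.25 and
# P[Sunny, Sunny] = 0 are given on the slides; the other entries are assumed.
P = np.array([[0.50, 0.25, 0.25],   # from Rainy
              [0.50, 0.00, 0.50],   # from Sunny (never sunny twice in a row)
              [0.25, 0.25, 0.50]])  # from Cloudy

sunny_today = np.array([0.0, 1.0, 0.0])
rainy_today = np.array([1.0, 0.0, 0.0])

print(sunny_today @ np.linalg.matrix_power(P, 2))   # 2-day forecast from Sunny
print(sunny_today @ np.linalg.matrix_power(P, 7))   # 7-day forecast from Sunny
print(rainy_today @ np.linalg.matrix_power(P, 7))   # 7-day forecast from Rainy
# The two 7-day forecasts are nearly identical: the starting state is forgotten.
```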
Example
• After a sufficient amount of time, the expected weather distribution is independent of the starting value
• Moreover, the limiting distribution π satisfies π = π P
• This is called the stationary distribution
Stationary Distribution
• π is called a stationary distribution of the Markov chain if π = π P
• That is, once the stationary distribution is reached, every subsequent X_i is a sample from the distribution π

How to use Markov chains for sampling:
• Suppose you want to sample from a set Ω according to a distribution π
• Construct a Markov chain (P) such that π is its stationary distribution
• Once the stationary distribution is reached, we get samples from the correct distribution
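Two ways this recipe can look in code (a sketch under the assumption that P is ergodic; the helper names are placeholders, not from the slides): computing π by repeatedly applying P, and drawing approximate samples by running the chain past a burn-in period.

```python
import numpy as np

def stationary(P, tol=1e-12):
    """Find pi with pi = pi P by repeatedly applying P (power iteration)."""
    pi = np.ones(P.shape[0]) / P.shape[0]   # any starting distribution works
    while True:
        nxt = pi @ P
        if np.abs(nxt - pi).sum() < tol:
            return nxt
        pi = nxt

def sample_chain(P, n_samples, burn_in=1000, seed=0):
    """Run the chain and return the states visited after burn-in (approximately ~ pi)."""
    rng = np.random.default_rng(seed)
    state = 0
    for _ in range(burn_in):
        state = rng.choice(P.shape[0], p=P[state])
    samples = []
    for _ in range(n_samples):
        state = rng.choice(P.shape[0], p=P[state])
        samples.append(state)
    return samples
```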
Conditions for a Stationary Distribution
A Markov chain is ergodic if it is:
• Irreducible: every state j can be reached from any state i in some finite number of steps
• Aperiodic: the chain is not forced into cycles of fixed length between certain states

Theorem: For every ergodic Markov chain there is a unique vector π such that, for all initial probability vectors π(0),
  lim_{t→∞} π(0) P^t = π
Sufficient Condition: Detailed Balance
• In a stationary walk, for any pair of states j, k, the Markov chain is as likely to move from j to k as from k to j:
  π_j P(j,k) = π_k P(k,j) for all j, k
• If this holds, π is a stationary distribution of P
• Also called the reversibility condition
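A small helper (added for illustration) that checks the detailed balance condition numerically for a candidate π and transition matrix P; the 3-state symmetric example is made up:

```python
import numpy as np

def satisfies_detailed_balance(pi, P, tol=1e-9):
    """Check pi_j * P(j, k) == pi_k * P(k, j) for every pair of states."""
    n = len(pi)
    return all(abs(pi[j] * P[j, k] - pi[k] * P[k, j]) < tol
               for j in range(n) for k in range(n))

# A symmetric transition matrix is in detailed balance with the uniform distribution
P = np.array([[0.5, 0.5, 0.0],
              [0.5, 0.0, 0.5],
              [0.0, 0.5, 0.5]])
print(satisfies_detailed_balance(np.ones(3) / 3, P))   # True
```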
Example: Random Walks
• Consider a graph G = (V,E) with a weight w(e) on each edge e

Random Walk:
• Start at some node u in the graph G(V,E)
• Move from node u to node v with probability proportional to w(u,v)

The random walk is a Markov chain:
• State space = V
• P(u,v) = w(u,v) / Σ_{v'} w(u,v')  if (u,v) ∈ E
         = 0                        if (u,v) ∉ E
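A sketch of one step of this walk in Python; the adjacency-dictionary representation and the tiny weighted graph are assumptions made for illustration:

```python
import random

def random_walk_step(graph, u):
    """One step of the weighted walk: move u -> v with probability w(u,v) / sum of w(u,v')."""
    neighbors, weights = zip(*graph[u].items())
    return random.choices(neighbors, weights=weights, k=1)[0]

# graph as an adjacency dict: graph[u][v] = w(u, v); a small made-up example
graph = {
    'a': {'b': 2.0, 'c': 1.0},
    'b': {'a': 2.0, 'c': 1.0},
    'c': {'a': 1.0, 'b': 1.0},
}
state = 'a'
for _ in range(5):
    state = random_walk_step(graph, state)
    print(state)
```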
Example: Random Walk
The random walk is ergodic if:
• Irreducible: every state j can be reached from any state i in some finite number of steps
  – holds if G is connected
• Aperiodic: the chain is not forced into cycles of fixed length between certain states
  – holds if G is not bipartite
Example: Random Walk
Uniform random walk:
• Suppose all edge weights are 1
• Then P(u,v) = 1/deg(u) if (u,v) ∈ E, and 0 otherwise

Theorem: If G is connected and not bipartite, then the stationary distribution of the random walk is
  π(v) = deg(v) / 2|E|
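One quick way to see this, using the detailed balance condition from earlier (a short check added here, not on the slide): for any edge (u,v),
  π(u) P(u,v) = (deg(u) / 2|E|) · (1/deg(u)) = 1 / 2|E|,
which is symmetric in u and v, so π(u) P(u,v) = π(v) P(v,u); hence π is stationary.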
Example: Random Walk
Symmetric random walk:
• Suppose P(u,v) = P(v,u) for all u, v

Theorem: If G is connected and not bipartite, then the stationary distribution of the random walk is the uniform distribution,
  π(v) = 1/|V|
Recap: Stationary Distribution
• π is called a stationary distribution of the Markov chain if π = π P
• That is, once the stationary distribution is reached, every subsequent X_i is a sample from the distribution π

How to use Markov chains for sampling:
• Suppose you want to sample from a set Ω according to a distribution π
• Construct a Markov chain (P) such that π is its stationary distribution
• Once the stationary distribution is reached, we get samples from the correct distribution
Metropolis-Hastings Algorithm (MCMC)
• Suppose we want to sample from a complex distribution f(x) = p(x)/K, where the normalizing constant K is unknown or hard to compute
• Example: the posterior distribution in Bayesian inference
Metropolis-Hastings Algorithm
• Start with any initial value x_0 such that p(x_0) > 0
• Using the current value x_{t-1}, sample a candidate point x_t according to some proposal distribution q(x_t | x_{t-1})
• Compute the acceptance probability
  α = min( 1, p(x_t) q(x_{t-1} | x_t) / ( p(x_{t-1}) q(x_t | x_{t-1}) ) )
• With probability α accept the move to x_t; otherwise reject x_t and stay at x_{t-1}
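A generic sketch of the sampler in Python (the Gaussian target and proposal are illustrative assumptions; only the unnormalized p(x) is ever evaluated, so K is never needed):

```python
import math, random

def metropolis_hastings(p, q_sample, q_pdf, x0, n_iter=10000):
    """Generic MH sampler: p is the unnormalized target, q the proposal."""
    x = x0
    samples = []
    for _ in range(n_iter):
        y = q_sample(x)                                    # propose y ~ q(. | x)
        alpha = min(1.0, (p(y) * q_pdf(x, y)) / (p(x) * q_pdf(y, x)))
        if random.random() < alpha:                        # accept with probability alpha
            x = y                                          # otherwise stay at x
        samples.append(x)
    return samples

# Toy example: unnormalized standard normal target, symmetric Gaussian proposal
p = lambda x: math.exp(-x * x / 2)                         # K = sqrt(2*pi) is never needed
q_sample = lambda x: random.gauss(x, 1.0)
q_pdf = lambda a, b: math.exp(-(a - b) ** 2 / 2)           # symmetric, so it cancels in alpha

samples = metropolis_hastings(p, q_sample, q_pdf, x0=0.0, n_iter=50000)
print(sum(samples) / len(samples))                         # should be close to 0
```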
Why does Metropolis-Hastings work?
• Metropolis-Hastings describes a Markov chain with transition probabilities
  P(x,y) = q(y | x) α(x,y)  for y ≠ x
• We want to show that f(x) = p(x)/K is the stationary distribution
• Recall the sufficient condition for a stationary distribution (detailed balance):
  f(x) P(x,y) = f(y) P(y,x)  for all x, y
Why does Metropolis-Hastings work?
• Metropolis-Hastings describes a Markov chain with transition probabilities
  P(x,y) = q(y | x) α(x,y)  for y ≠ x
• Since K cancels in the detailed balance condition, it is sufficient to show
  p(x) P(x,y) = p(y) P(y,x)  for all x, y
Proof: Case 1
• Suppose p(x) q(y | x) = p(y) q(x | y), so that α(x,y) = α(y,x) = 1
• Then P(x,y) = q(y | x) and P(y,x) = q(x | y)
• Therefore P(x,y) p(x) = q(y | x) p(x) = p(y) q(x | y) = P(y,x) p(y)
Proof: Case 2
• Suppose p(x) q(y | x) > p(y) q(x | y), so that α(x,y) = p(y) q(x | y) / (p(x) q(y | x)) < 1 and α(y,x) = 1
• Then P(x,y) = q(y | x) α(x,y) = p(y) q(x | y) / p(x), and P(y,x) = q(x | y)
• Therefore P(x,y) p(x) = p(y) q(x | y) = P(y,x) p(y)
• Proof of Case 3 (p(x) q(y | x) < p(y) q(x | y)) is identical, with the roles of x and y swapped
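A numerical companion to this proof (added for illustration): build the Metropolis-Hastings transition matrix for a small discrete target and check detailed balance directly; the 3-state target p and proposal q below are made up.

```python
import numpy as np

# Unnormalized target p over 3 states, and an arbitrary proposal matrix q
# (both invented for illustration; any strictly positive q with rows summing to 1 works).
p = np.array([3.0, 1.0, 2.0])
q = np.array([[0.2, 0.5, 0.3],
              [0.4, 0.2, 0.4],
              [0.3, 0.3, 0.4]])

n = len(p)
P = np.zeros((n, n))
for x in range(n):
    for y in range(n):
        if x != y:
            alpha = min(1.0, (p[y] * q[y, x]) / (p[x] * q[x, y]))
            P[x, y] = q[x, y] * alpha
    P[x, x] = 1.0 - P[x].sum()          # rejected proposals stay at x

f = p / p.sum()                          # the normalized target f(x) = p(x)/K
for x in range(n):
    for y in range(n):
        assert abs(f[x] * P[x, y] - f[y] * P[y, x]) < 1e-12
print("detailed balance holds for every pair of states")
```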
When is the stationary distribution reached?
• Next class …