Approximate Inference
Henrik I. Christensen
Robotics & Intelligent Machines @ GT
Georgia Institute of Technology, Atlanta, GA 30332-0280
hic@cc.gatech.edu
Outline
1 Introduction
2 Variational Inference
3 Variational Mixture of Gaussians
4 Exponential Family
5 Expectation Propagation
6 Summary
Introduction

We are often required to estimate a (conditional) posterior of the form p(Z | X).
The exact solution might be intractable:
1 There might not be a closed-form solution
2 The integration over X or a parameter space \theta might be computationally challenging
3 The set of possible outcomes might be very large or even exponential

Two strategies:
1 Deterministic approximation methods
2 Stochastic sampling (Monte Carlo techniques)

Today we will talk about deterministic techniques.
Variational Inference

In general we have a Bayesian model as seen earlier, i.e.

    \ln p(X) = \ln p(X, Z) - \ln p(Z | X)

We can rewrite this as

    \ln p(X) = \mathcal{L}(q) + \mathrm{KL}(q \| p)

where

    \mathcal{L}(q) = \int q(Z) \ln\left\{ \frac{p(X, Z)}{q(Z)} \right\} dZ
    \mathrm{KL}(q \| p) = -\int q(Z) \ln\left\{ \frac{p(Z | X)}{q(Z)} \right\} dZ

So \mathcal{L}(q) is a lower bound on \ln p(X), and KL is the Kullback-Leibler divergence from q(Z) to p(Z | X).
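A minimal numerical sketch of this decomposition, assuming a toy model with a single discrete latent variable z and one fixed observation x; the names joint and q are illustrative, not part of the lecture material.

    import numpy as np

    rng = np.random.default_rng(0)
    K = 4
    joint = rng.random(K)             # p(x, z) for the observed x, one entry per value of z
    q = rng.random(K); q /= q.sum()   # an arbitrary variational distribution q(z)

    log_px = np.log(joint.sum())                       # ln p(x) = ln sum_z p(x, z)
    posterior = joint / joint.sum()                    # p(z | x)
    L = np.sum(q * (np.log(joint) - np.log(q)))        # L(q) = sum_z q ln{ p(x,z) / q }
    KL = -np.sum(q * (np.log(posterior) - np.log(q)))  # KL(q || p) >= 0

    print(log_px, L + KL)   # the two numbers agree; L(q) alone is a lower bound on ln p(x)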
Factorized Distributions

Assume for now that we can factorize Z into disjoint groups so that

    q(Z) = \prod_{i=1}^{M} q_i(Z_i)

In physics a similar model has been adopted, termed mean field theory.

We can then optimize \mathcal{L}(q) through a component-wise optimization:

    \mathcal{L}(q) = \int \prod_i q_i \left\{ \ln p(X, Z) - \sum_i \ln q_i \right\} dZ
                   = \int q_j \ln \tilde{p}(X, Z_j)\, dZ_j - \int q_j \ln q_j\, dZ_j + \text{const}

where

    \ln \tilde{p}(X, Z_j) = \mathbb{E}_{i \neq j}[\ln p(X, Z)] + \text{const} = \int \ln p(X, Z) \prod_{i \neq j} q_i\, dZ_i + \text{const}
Factorized Distributions

The optimal solution is now

    \ln q_j^*(Z_j) = \mathbb{E}_{i \neq j}[\ln p(X, Z)] + \text{const}

I.e., each factor is optimized in turn with the remaining factors held fixed, maximizing \mathcal{L}(q).
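A small sketch of this coordinate-wise update, assuming the target p(z_1, z_2) is a correlated 2D Gaussian (the classic illustrative case for mean field). The optimal factors are Gaussians whose means are the only coupled quantities; the numbers below are illustrative.

    import numpy as np

    mu = np.array([1.0, -1.0])
    Lambda = np.array([[2.0, 1.2],     # precision matrix of the target Gaussian
                       [1.2, 2.0]])

    m1, m2 = 0.0, 0.0                  # initial means of the factors q(z1), q(z2)
    for _ in range(20):
        # Optimal factors: q*(z1) = N(m1, 1/Lambda[0,0]), q*(z2) = N(m2, 1/Lambda[1,1])
        m1 = mu[0] - Lambda[0, 1] / Lambda[0, 0] * (m2 - mu[1])
        m2 = mu[1] - Lambda[1, 0] / Lambda[1, 1] * (m1 - mu[0])

    print(m1, m2)   # converges to the true means (1, -1); the factorized q underestimates variance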
Variational Mixture of Gaussians

We encounter mixtures of Gaussians all the time.
Examples are multi-wall modelling, ambiguous localization, ...
We have: a set of observed data X, and a set of latent variables Z that describe the mixture.
Mixture of Gaussians - Modelling

We can model the mixture assignments as

    p(Z | \pi) = \prod_{n=1}^{N} \prod_{k=1}^{K} \pi_k^{z_{nk}}

We can also derive the observed conditional

    p(X | Z, \mu, \Lambda) = \prod_{n=1}^{N} \prod_{k=1}^{K} \mathcal{N}(x_n | \mu_k, \Lambda_k^{-1})^{z_{nk}}

We will for now assume that the mixing coefficients are modelled by a Dirichlet prior

    p(\pi) = \mathrm{Dir}(\pi | \alpha_0) = C(\alpha_0) \prod_{k=1}^{K} \pi_k^{\alpha_0 - 1}
Mixture of Gaussians - Modelling

The component parameters can be modelled with a Gaussian-Wishart prior

    p(\mu, \Lambda) = p(\mu | \Lambda)\, p(\Lambda) = \prod_{k=1}^{K} \mathcal{N}(\mu_k | m_0, (\beta_0 \Lambda_k)^{-1})\, \mathcal{W}(\Lambda_k | W_0, \nu_0)

I.e., a total model of

[Graphical model: plate over n = 1, ..., N with latent z_n and observed x_n, governed by \pi, \mu and \Lambda]
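A hedged sketch of ancestral sampling from this generative model; all hyperparameter values (alpha0, beta0, nu0, m0, W0) are illustrative choices, not values from the lecture.

    import numpy as np
    from scipy.stats import wishart, dirichlet

    rng = np.random.default_rng(1)
    K, D, N = 3, 2, 500
    alpha0, beta0, nu0 = 1.0, 1.0, D + 2.0
    m0, W0 = np.zeros(D), np.eye(D)

    pi = dirichlet.rvs([alpha0] * K, random_state=0)[0]            # mixture weights ~ Dir(alpha0)
    Lam = [wishart.rvs(df=nu0, scale=W0, random_state=k) for k in range(K)]
    mu = [rng.multivariate_normal(m0, np.linalg.inv(beta0 * L)) for L in Lam]

    z = rng.choice(K, size=N, p=pi)                                # assignments z_n drawn from pi
    X = np.array([rng.multivariate_normal(mu[k], np.linalg.inv(Lam[k])) for k in z])
    print(X.shape)   # (500, 2): data drawn from the full Bayesian mixture model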
Mixtures of Gaussians - Variational

The full joint model can be seen as

    p(X, Z, \pi, \mu, \Lambda) = p(X | Z, \mu, \Lambda)\, p(Z | \pi)\, p(\pi)\, p(\mu | \Lambda)\, p(\Lambda)

Only X is observed.

We can now consider the selection of a distribution

    q(Z, \pi, \mu, \Lambda) = q(Z)\, q(\pi, \mu, \Lambda)

This is clearly an assumption of independence.

We can use the general result of component-wise optimization

    \ln q^*(Z) = \mathbb{E}_{\pi, \mu, \Lambda}[\ln p(X, Z, \pi, \mu, \Lambda)] + \text{const}

Decomposition gives us

    \ln q^*(Z) = \mathbb{E}_{\pi}[\ln p(Z | \pi)] + \mathbb{E}_{\mu, \Lambda}[\ln p(X | Z, \mu, \Lambda)] + \text{const}
               = \sum_{n=1}^{N} \sum_{k=1}^{K} z_{nk} \ln \rho_{nk} + \text{const}
Mixtures of Gaussians - Variational

We can further obtain

    \ln \rho_{nk} = \mathbb{E}[\ln \pi_k] + \frac{1}{2}\mathbb{E}[\ln |\Lambda_k|] - \frac{D}{2}\ln 2\pi - \frac{1}{2}\mathbb{E}_{\mu_k, \Lambda_k}[(x_n - \mu_k)^T \Lambda_k (x_n - \mu_k)] + \text{const}

Taking the exponential we have

    q^*(Z) \propto \prod_{n=1}^{N} \prod_{k=1}^{K} \rho_{nk}^{z_{nk}}

Using normalization we arrive at

    q^*(Z) = \prod_{n=1}^{N} \prod_{k=1}^{K} r_{nk}^{z_{nk}}

where

    r_{nk} = \frac{\rho_{nk}}{\sum_j \rho_{nj}}
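A minimal sketch of the normalization r_nk = rho_nk / sum_j rho_nj, done in log space for numerical stability; log_rho is assumed to hold ln(rho_nk) for N points and K components.

    import numpy as np
    from scipy.special import logsumexp

    def responsibilities(log_rho):
        """Normalize unnormalized log responsibilities row-wise."""
        return np.exp(log_rho - logsumexp(log_rho, axis=1, keepdims=True))

    log_rho = np.log(np.array([[0.2, 0.5, 0.3],
                               [0.1, 0.1, 0.8]]))
    print(responsibilities(log_rho))   # each row sums to one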
Mixtures of Gaussians - Variational

Just as we saw for EM we can define

    N_k = \sum_{n=1}^{N} r_{nk}
    \bar{x}_k = \frac{1}{N_k} \sum_{n=1}^{N} r_{nk} x_n
    S_k = \frac{1}{N_k} \sum_{n=1}^{N} r_{nk} (x_n - \bar{x}_k)(x_n - \bar{x}_k)^T
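A short sketch of these weighted statistics; r is an N x K responsibility matrix and X an N x D data matrix, both assumed given (the demo data below are random placeholders).

    import numpy as np

    def mixture_statistics(X, r):
        Nk = r.sum(axis=0)                                # effective counts N_k
        xbar = (r.T @ X) / Nk[:, None]                    # weighted means xbar_k
        S = []
        for k in range(r.shape[1]):
            d = X - xbar[k]
            S.append((r[:, k, None] * d).T @ d / Nk[k])   # weighted covariances S_k
        return Nk, xbar, np.array(S)

    rng = np.random.default_rng(0)
    X = rng.normal(size=(6, 2)); r = rng.dirichlet(np.ones(3), size=6)
    Nk, xbar, S = mixture_statistics(X, r)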
Mixtures of Gaussians - Parameters/Mixture

Let's now consider q(\pi, \mu, \Lambda) to arrive at

    \ln q^*(\pi, \mu, \Lambda) = \ln p(\pi) + \sum_{k=1}^{K} \ln p(\mu_k, \Lambda_k) + \mathbb{E}_Z[\ln p(Z | \pi)] + \sum_{k=1}^{K} \sum_{n=1}^{N} \mathbb{E}[z_{nk}] \ln \mathcal{N}(x_n | \mu_k, \Lambda_k^{-1}) + \text{const}

We can partition the problem into

    q(\pi, \mu, \Lambda) = q(\pi) \prod_{k=1}^{K} q(\mu_k, \Lambda_k)

We can derive

    \ln q^*(\pi) = \sum_{k=1}^{K} (\alpha_0 - 1) \ln \pi_k + \sum_{k=1}^{K} \sum_{n=1}^{N} r_{nk} \ln \pi_k + \text{const}

We can now derive

    q^*(\pi) = \mathrm{Dir}(\pi | \alpha) \quad \text{where} \quad \alpha_k = \alpha_0 + N_k
Mixtures of Gaussians - Parameters/Mixture

We can then derive

    q^*(\mu_k, \Lambda_k) = \mathcal{N}(\mu_k | m_k, (\beta_k \Lambda_k)^{-1})\, \mathcal{W}(\Lambda_k | W_k, \nu_k)

where

    \beta_k = \beta_0 + N_k
    m_k = \frac{1}{\beta_k}(\beta_0 m_0 + N_k \bar{x}_k)
    W_k^{-1} = W_0^{-1} + N_k S_k + \frac{\beta_0 N_k}{\beta_0 + N_k} (\bar{x}_k - m_0)(\bar{x}_k - m_0)^T
    \nu_k = \nu_0 + N_k + 1
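A hedged sketch of these parameter updates, together with the Dirichlet update from the previous slide, assuming Nk, xbar and S come from the statistics above and the prior hyperparameters (alpha0, beta0, m0, W0, nu0) are given; the function name is illustrative.

    import numpy as np

    def update_parameters(Nk, xbar, S, alpha0, beta0, m0, W0, nu0):
        alpha = alpha0 + Nk                                 # Dirichlet: alpha_k = alpha_0 + N_k
        beta = beta0 + Nk
        m = (beta0 * m0 + Nk[:, None] * xbar) / beta[:, None]
        W = []
        for k in range(len(Nk)):
            d = (xbar[k] - m0)[:, None]
            Winv = (np.linalg.inv(W0) + Nk[k] * S[k]
                    + (beta0 * Nk[k] / (beta0 + Nk[k])) * (d @ d.T))
            W.append(np.linalg.inv(Winv))
        nu = nu0 + Nk + 1.0                                 # as on the slide
        return alpha, beta, m, np.array(W), nu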
Mixtures of Gaussians - Parameters

We can now arrive at the expectations

    \mathbb{E}_{\mu_k, \Lambda_k}[(x_n - \mu_k)^T \Lambda_k (x_n - \mu_k)] = D \beta_k^{-1} + \nu_k (x_n - m_k)^T W_k (x_n - m_k)

    \ln \tilde{\Lambda}_k = \mathbb{E}[\ln |\Lambda_k|] = \sum_{i=1}^{D} \psi\left(\frac{\nu_k + 1 - i}{2}\right) + D \ln 2 + \ln |W_k|

    \ln \tilde{\pi}_k = \mathbb{E}[\ln \pi_k] = \psi(\alpha_k) - \psi(\hat{\alpha})

where \hat{\alpha} = \sum_k \alpha_k and \psi(\cdot) = \frac{d}{da} \ln \Gamma(a) is the digamma function.
The last two results follow from standard properties of the Wishart and Dirichlet distributions.
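A small sketch of these expectations using SciPy's digamma (psi) function; alpha, nu and W are the variational parameters from the updates above, and the function name is illustrative.

    import numpy as np
    from scipy.special import digamma

    def expectations(alpha, nu, W, D):
        ln_pi_tilde = digamma(alpha) - digamma(alpha.sum())        # E[ln pi_k]
        ln_Lambda_tilde = np.array([
            digamma(0.5 * (nu[k] + 1 - np.arange(1, D + 1))).sum()
            + D * np.log(2.0) + np.log(np.linalg.det(W[k]))
            for k in range(len(nu))
        ])                                                          # E[ln |Lambda_k|]
        return ln_pi_tilde, ln_Lambda_tilde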
Mixtures of Gaussians - Parameters

We can finally find the responsibilities

    r_{nk} \propto \tilde{\pi}_k \tilde{\Lambda}_k^{1/2} \exp\left\{ -\frac{1}{2} \mathbb{E}_{\mu_k, \Lambda_k}[(x_n - \mu_k)^T \Lambda_k (x_n - \mu_k)] \right\}

The optimization is stepwise:
1 Estimate \mu, \Lambda and then r_{nk}
2 Estimate \pi and Z
3 Check for convergence - return to 1 if not converged
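A sketch of the responsibility update, combining the expectations from the previous slide with a log-space normalization; all inputs (X, ln_pi_tilde, ln_Lambda_tilde, m, W, beta, nu) are assumed to come from the updates above, and the function name is illustrative.

    import numpy as np
    from scipy.special import logsumexp

    def e_step(X, ln_pi_tilde, ln_Lambda_tilde, m, W, beta, nu):
        N, D = X.shape
        K = len(ln_pi_tilde)
        log_rho = np.empty((N, K))
        for k in range(K):
            d = X - m[k]
            # E[(x_n - mu_k)^T Lambda_k (x_n - mu_k)] = D/beta_k + nu_k (x_n - m_k)^T W_k (x_n - m_k)
            quad = D / beta[k] + nu[k] * np.einsum('nd,de,ne->n', d, W[k], d)
            log_rho[:, k] = (ln_pi_tilde[k] + 0.5 * ln_Lambda_tilde[k]
                             - 0.5 * D * np.log(2 * np.pi) - 0.5 * quad)
        return np.exp(log_rho - logsumexp(log_rho, axis=1, keepdims=True))   # rows sum to one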
Mixture of Gaussians - Example

[Figure: variational mixture of Gaussians fit shown after 0, 15, 60 and 120 iterations]