CSci 8980: Advanced Topics in Graphical Models, Variational Inference

Graphical Models | Exponential Families | Variational Methods | Mean Field Approximation

Properties of the Cumulant $\psi$ (Contd.)

- The set of mean parameters:
  $\mathcal{M} = \{ \mu \in \mathbb{R}^d \mid \exists\, p(\cdot) \text{ s.t. } \int_x t(x)\, p(x)\, \nu(dx) = \mu \}$
- Consider the mapping $\Lambda : \Theta \to \mathcal{M}$ given by
  $\Lambda(\theta) = E_\theta[t(x)] = \int_x t(x)\, p(x;\theta)\, \nu(dx)$
- If $t$ is minimal, $\Lambda$ is one-to-one
- Further, $\Lambda$ is onto the (relative) interior of $\mathcal{M}$
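As a small illustration of the mapping $\Lambda$, the sketch below uses a Bernoulli variable with $t(x) = x$, an example chosen here and not taken from the slides. In that case $\Lambda(\theta)$ is the logistic function, which is strictly increasing (one-to-one) and takes values in the open interval $(0, 1)$, the interior of $\mathcal{M} = [0, 1]$. The name `mean_map` is hypothetical.

```python
import math

def mean_map(theta: float) -> float:
    """Lambda(theta) = E_theta[t(x)] for a Bernoulli with t(x) = x, x in {0, 1}.

    Assumes p(x; theta) = exp(theta * x - psi(theta)) with psi(theta) = log(1 + e^theta).
    """
    return 1.0 / (1.0 + math.exp(-theta))

# Evaluate the mapping on a few natural parameters (illustrative values).
thetas = [-4.0, -1.0, 0.0, 1.0, 4.0]
mus = [mean_map(th) for th in thetas]

for th, mu in zip(thetas, mus):
    print(f"theta = {th:+.1f}  ->  mu = {mu:.4f}")
```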

Fenchel-Legendre Conjugacy

- The conjugate dual function:
  $\psi^*(\mu) = \sup_{\theta \in \Theta} \{ \langle \mu, \theta \rangle - \psi(\theta) \}$
- The (Boltzmann-Shannon) entropy of $p(x;\theta)$ w.r.t. $\nu$ is
  $H(p(x;\theta)) = -\int_x p(x;\theta) \log p(x;\theta)\, \nu(dx) = -E_\theta[\log p(x;\theta)]$
- If $\mu \in \operatorname{ri} \mathcal{M}$, then $\psi^*(\mu) = -H(p(x;\theta(\mu)))$
- In terms of the dual, $\psi$ has a variational representation:
  $\psi(\theta) = \sup_{\mu \in \mathcal{M}} \{ \langle \theta, \mu \rangle - \psi^*(\mu) \}$
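The two suprema above can be checked numerically in a case where everything is known in closed form. The sketch below assumes a Bernoulli family with $\psi(\theta) = \log(1 + e^\theta)$ (an example introduced here, not from the slides) and uses a crude grid search for each supremum; the function names and grid sizes are illustrative choices.

```python
import math

def psi(theta: float) -> float:
    """Cumulant function of the Bernoulli family: log(1 + e^theta)."""
    return math.log1p(math.exp(theta))

def neg_entropy(mu: float) -> float:
    """Negative entropy of a Bernoulli with mean mu, for 0 < mu < 1."""
    return mu * math.log(mu) + (1.0 - mu) * math.log(1.0 - mu)

def psi_star_numeric(mu: float, lo: float = -30.0, hi: float = 30.0, steps: int = 60001) -> float:
    """Approximate sup_theta { mu * theta - psi(theta) } by grid search over theta."""
    best = -math.inf
    for i in range(steps):
        theta = lo + (hi - lo) * i / (steps - 1)
        best = max(best, mu * theta - psi(theta))
    return best

def psi_numeric(theta: float, steps: int = 10000) -> float:
    """Approximate sup_mu { theta * mu - psi*(mu) } by grid search over mu in (0, 1)."""
    best = -math.inf
    for i in range(1, steps):
        mu = i / steps
        best = max(best, theta * mu - neg_entropy(mu))
    return best

print(psi_star_numeric(0.3), neg_entropy(0.3))  # the two values should nearly agree
print(psi_numeric(1.2), psi(1.2))               # likewise
```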

Main Issues

Key problems:
- Computation of the cumulant function $\psi(\theta)$
- Computation of the mean parameter $\mu = E_\theta[t(x)]$

The key equation for both problems:
  $\psi(\theta) = \sup_{\mu \in \mathcal{M}} \{ \langle \theta, \mu \rangle - \psi^*(\mu) \}$
- For all $\theta \in \Theta$, the supremum is attained by $\mu \in \operatorname{ri} \mathcal{M}$:
  $\mu = E_\theta[t(x)] = \int_x t(x)\, p(x;\theta)\, \nu(dx)$

Two primary challenges:
- Set $\mathcal{M}$ is difficult to characterize
- Function $\psi^*$ lacks an explicit definition
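To make the two key problems concrete, here is a brute-force sketch that computes both $\psi(\theta)$ and the mean parameters for a small binary pairwise (Ising-type) model by enumerating all $2^n$ configurations. The model form, the function name `exact_inference`, and the parameter values are assumptions made for illustration; the point is only that the exact computation scales exponentially in $n$.

```python
import itertools
import math

def exact_inference(theta_s, theta_st):
    """Exact psi(theta) and mean parameters for x in {0, 1}^n by enumeration.

    theta_s:  list of node parameters, one per node.
    theta_st: dict mapping an edge (s, t) to its coupling parameter.
    Cost is O(2^n), which is why this is only feasible for tiny models.
    """
    n = len(theta_s)
    configs = list(itertools.product([0, 1], repeat=n))
    weights = []
    for x in configs:
        score = sum(theta_s[s] * x[s] for s in range(n))
        score += sum(w * x[s] * x[t] for (s, t), w in theta_st.items())
        weights.append(math.exp(score))
    z = sum(weights)
    psi = math.log(z)
    mu_s = [sum(w * x[s] for w, x in zip(weights, configs)) / z for s in range(n)]
    mu_st = {
        (s, t): sum(w * x[s] * x[t] for w, x in zip(weights, configs)) / z
        for (s, t) in theta_st
    }
    return psi, mu_s, mu_st

# With all parameters zero the distribution is uniform over 2^3 configurations.
psi0, mu_s0, mu_st0 = exact_inference([0.0, 0.0, 0.0], {(0, 1): 0.0, (1, 2): 0.0})
print(psi0, mu_s0, mu_st0)
```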

Mean Parameters

$\mathcal{M}$ has the following properties:
- $\mathcal{M}$ is full-dimensional if $t$ is minimal
- $\mathcal{M}$ is bounded iff $\Theta = \mathbb{R}^d$ and $\psi$ is Lipschitz

Example: Multinomial random vector $x \in \mathcal{X}^n$
- The set $\mathcal{M}$ is a polytope:
  $\mathcal{M} = \{ \mu \in \mathbb{R}^d \mid \langle a_j, \mu \rangle \leq b_j, \ \forall j \in \mathcal{J} \}$
- Index set $\mathcal{J}$ is finite, but can be large
- The number of facets of the polytope can grow very fast with $n$
- A complete graph with $n = 7$ has more than $2 \times 10^8$ facets
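For the smallest nontrivial case the polytope can be written out by hand. The sketch below assumes a single pair of binary variables with sufficient statistics $(x_s, x_t, x_s x_t)$, an example added here for illustration. In that case $\mathcal{M}$ is described by four inequalities $\langle a_j, \mu \rangle \leq b_j$: $\mu_{st} \geq 0$, $\mu_{st} \leq \mu_s$, $\mu_{st} \leq \mu_t$, and $\mu_s + \mu_t - \mu_{st} \leq 1$. The name `in_polytope` is hypothetical.

```python
# Half-space description <a_j, mu> <= b_j for mu = (mu_s, mu_t, mu_st),
# assuming two binary variables x_s, x_t in {0, 1}.
A = [
    (0.0, 0.0, -1.0),   # -mu_st <= 0
    (-1.0, 0.0, 1.0),   # mu_st - mu_s <= 0
    (0.0, -1.0, 1.0),   # mu_st - mu_t <= 0
    (1.0, 1.0, -1.0),   # mu_s + mu_t - mu_st <= 1
]
b = [0.0, 0.0, 0.0, 1.0]

def in_polytope(mu, tol=1e-12):
    """Return True if mu satisfies every inequality <a_j, mu> <= b_j."""
    return all(
        sum(a_i * m_i for a_i, m_i in zip(a_j, mu)) <= b_j + tol
        for a_j, b_j in zip(A, b)
    )

# The four vertices are the statistics of the four configurations of (x_s, x_t).
vertices = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (1, 1, 1)]
print([in_polytope(v) for v in vertices])
print(in_polytope((0.5, 0.5, 0.25)), in_polytope((0.5, 0.5, 0.6)))
```

Even here the description needs four facets for three parameters; on larger graphs no comparably short list is available, which is the difficulty the slide points to.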

Mean Parameters (Contd.)

Dual Function

- $\psi^*$ is the negative entropy
- Typically, it does not have an explicit closed form
- In general, it can be specified as a composition of two functions:
  - Compute an inverse image $\theta(\mu)$ using $\Lambda^{-1}(\mu)$
  - Compute the negative entropy of $p(x;\theta(\mu))$
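The two-step composition can be sketched for a Bernoulli family, where $\Lambda$ is the logistic function (an assumed example, not from the slides). Step one inverts $\Lambda$ by bisection, pretending no closed-form inverse is available; step two evaluates the negative entropy at the recovered $\theta(\mu)$. All function names are hypothetical.

```python
import math

def mean_map(theta: float) -> float:
    """Lambda(theta) for a Bernoulli with t(x) = x."""
    return 1.0 / (1.0 + math.exp(-theta))

def inverse_mean_map(mu: float, lo: float = -50.0, hi: float = 50.0, iters: int = 200) -> float:
    """Step 1: find theta(mu) with Lambda(theta) = mu by bisection (Lambda is increasing)."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mean_map(mid) < mu:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def neg_entropy_at(theta: float) -> float:
    """Step 2: negative entropy of p(x; theta), i.e. E_theta[log p(x; theta)]."""
    p1 = mean_map(theta)
    return p1 * math.log(p1) + (1.0 - p1) * math.log(1.0 - p1)

def psi_star(mu: float) -> float:
    """Dual function as the composition of the two steps."""
    return neg_entropy_at(inverse_mean_map(mu))

print(inverse_mean_map(0.3), psi_star(0.3))
```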

Tractable Families

- Based on the key equation
  $\psi(\theta) = \sup_{\mu \in \mathcal{M}} \{ \langle \mu, \theta \rangle - \psi^*(\mu) \}$
- Mean field focuses on tractable distributions
- Let $H \subseteq G$ be a subgraph on which exact calculations are feasible
- Let $\mathcal{I}(H)$ be the indices of cliques in $H$
- Natural parameters for distributions corresponding to $H$:
  $\mathcal{E}(H) = \{ \theta \in \Theta \mid \theta_\alpha = 0, \ \forall \alpha \in \mathcal{I} \setminus \mathcal{I}(H) \}$

Tractable Families (Contd.)

- Simple tractable subgraph is $H = (V, \emptyset)$
- Natural parameters belong to the subspace
  $\mathcal{E}(H) = \{ \theta \in \Theta \mid \theta_{st} = 0, \ \forall (s,t) \in E \}$
- Corresponding distribution: $p(x;\theta) = \prod_{s \in V} p(x_s;\theta_s)$
- Structured approximation using spanning tree $T = (V, E(T))$
- Natural parameters belong to the subspace
  $\mathcal{E}(T) = \{ \theta \in \Theta \mid \theta_{st} = 0, \ \forall (s,t) \notin E(T) \}$
- For a subgraph $H$, the set of realizable mean parameters is
  $\mathcal{M}_{tract}(G;H) = \{ \mu \in \mathbb{R}^d \mid \mu = E_\theta[t(x)], \ \theta \in \mathcal{E}(H) \}$
- The inclusion $\mathcal{M}_{tract}(G;H) \subseteq \mathcal{M}(G)$ always holds
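A quick way to see what $\mathcal{M}_{tract}$ looks like for $H = (V, \emptyset)$ is to pick a $\theta$ with all couplings set to zero and compute the mean parameters of the full graph $G$ by enumeration. The sketch below assumes binary variables and a three-node complete graph with made-up node parameters; it checks that every pairwise mean parameter factorizes as $\mu_{st} = \mu_s \mu_t$. The name `mean_parameters` is hypothetical.

```python
import itertools
import math

def mean_parameters(theta_s, theta_st, edges):
    """Mean parameters (mu_s, mu_st) of a binary pairwise model by enumeration.

    edges lists the edges of the full graph G; couplings missing from
    theta_st are treated as zero, which is how theta in E(H) is encoded here.
    """
    n = len(theta_s)
    configs = list(itertools.product([0, 1], repeat=n))
    weights = []
    for x in configs:
        score = sum(theta_s[s] * x[s] for s in range(n))
        score += sum(theta_st.get((s, t), 0.0) * x[s] * x[t] for (s, t) in edges)
        weights.append(math.exp(score))
    z = sum(weights)
    mu_s = [sum(w * x[s] for w, x in zip(weights, configs)) / z for s in range(n)]
    mu_st = {
        (s, t): sum(w * x[s] * x[t] for w, x in zip(weights, configs)) / z
        for (s, t) in edges
    }
    return mu_s, mu_st

edges_G = [(0, 1), (1, 2), (0, 2)]
# theta in E(H) for H = (V, empty set): node parameters only, no couplings.
mu_s, mu_st = mean_parameters([0.7, -1.2, 0.1], {}, edges_G)
for (s, t) in edges_G:
    print((s, t), mu_st[(s, t)], mu_s[s] * mu_s[t])
```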

Lower Bounds

- For any $\mu \in \operatorname{ri} \mathcal{M}$, $\psi(\theta) \geq \langle \theta, \mu \rangle - \psi^*(\mu)$
- Alternative proof using Jensen's inequality, where $p(x;\theta(\mu))$ is the member of the family with mean parameter $\mu$:
  $\psi(\theta) = \log \int_x p(x;\theta(\mu)) \frac{\exp(\langle \theta, t(x) \rangle)}{p(x;\theta(\mu))}\, \nu(dx)$
  $\geq \int_x p(x;\theta(\mu)) \left[ \langle \theta, t(x) \rangle - \log p(x;\theta(\mu)) \right] \nu(dx)$
  $= \langle \theta, \mu \rangle - \psi^*(\mu)$
- In general, $\psi^*$ does not have closed form
- Since $\psi^*_H$ has an explicit form, solve the approximation
  $\sup_{\mu \in \mathcal{M}_{tract}} \{ \langle \mu, \theta \rangle - \psi^*_H(\mu) \}$
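The lower bound can be checked numerically on a model small enough for exact enumeration. The sketch below assumes a three-node binary pairwise model with illustrative parameter values and evaluates $\langle \theta, \mu \rangle - \psi^*_{H}(\mu)$ at a few fully factorized $\mu$ (so $\mu_{st} = \mu_s \mu_t$); each value should not exceed the exact $\psi(\theta)$. Function names are hypothetical.

```python
import itertools
import math

def log_partition(theta_s, theta_st):
    """Exact psi(theta) for x in {0, 1}^n by summing over all configurations."""
    n = len(theta_s)
    total = 0.0
    for x in itertools.product([0, 1], repeat=n):
        score = sum(theta_s[s] * x[s] for s in range(n))
        score += sum(w * x[s] * x[t] for (s, t), w in theta_st.items())
        total += math.exp(score)
    return math.log(total)

def mean_field_bound(mu, theta_s, theta_st):
    """<theta, mu> - psi*_H(mu) for a product distribution with node means mu."""
    linear = sum(th * m for th, m in zip(theta_s, mu))
    pairwise = sum(w * mu[s] * mu[t] for (s, t), w in theta_st.items())
    neg_entropy = sum(m * math.log(m) + (1.0 - m) * math.log(1.0 - m) for m in mu)
    return linear + pairwise - neg_entropy

theta_s = [0.5, -0.3, 0.2]
theta_st = {(0, 1): 1.0, (1, 2): -0.8, (0, 2): 0.6}
psi_exact = log_partition(theta_s, theta_st)

candidates = [[0.5, 0.5, 0.5], [0.9, 0.2, 0.7], [0.1, 0.8, 0.3]]
bounds = [mean_field_bound(mu, theta_s, theta_st) for mu in candidates]
print(psi_exact, bounds)
```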

Naive Mean Field

- Chooses a fully factorized distribution to approximate the original distribution
- We will study the Ising model as an example
- Approximate $G$ by the fully disconnected graph $H_0$ with no edges
- Then, the mean parameter set is
  $\mathcal{M}_{tract} = \{ (\mu_s, \mu_{st}) \mid 0 \leq \mu_s \leq 1, \ \mu_{st} = \mu_s \mu_t \}$
- The negative entropy of the product distribution is
  $\psi^*_{H_0}(\mu) = \sum_{s \in V} \left[ \mu_s \log \mu_s + (1 - \mu_s) \log(1 - \mu_s) \right]$

Naive Mean Field (Contd.)

- The naive mean field problem takes the form
  $\max_{\mu \in \mathcal{M}_{tract}} \{ \langle \mu, \theta \rangle - \psi^*_{H_0}(\mu) \}$
- Using $\mu_{st} = \mu_s \mu_t$, we get the reduced problem
  $\max_{\{\mu_s\} \in [0,1]^n} \left\{ \sum_{s \in V} \theta_s \mu_s + \sum_{(s,t) \in E} \theta_{st} \mu_s \mu_t - \sum_{s \in V} \left[ \mu_s \log \mu_s + (1 - \mu_s) \log(1 - \mu_s) \right] \right\}$
- It is concave in $\mu_s$ with the other coordinates held fixed
- Taking the gradient and setting it to zero yields
  $\mu_s \leftarrow \frac{1}{1 + \exp\left( -\left( \theta_s + \sum_{t \in N(s)} \theta_{st} \mu_t \right) \right)}$
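The coordinate update above translates directly into a short loop. The following is a minimal sketch, assuming binary variables in $\{0, 1\}$, a fixed number of sweeps, an initialization at $\mu_s = 0.5$, and made-up parameter values; the name `naive_mean_field` and the edge-dictionary input format are choices made here, not something the slides specify.

```python
import math

def naive_mean_field(theta_s, theta_st, sweeps=200):
    """Coordinate ascent for the naive mean field problem on a binary pairwise model.

    theta_s:  list of node parameters.
    theta_st: dict mapping an undirected edge (s, t) to its coupling parameter.
    Returns the list of approximate node means mu_s.
    """
    n = len(theta_s)
    # Build neighbor lists N(s) with the coupling attached to each neighbor.
    neighbors = {s: [] for s in range(n)}
    for (s, t), w in theta_st.items():
        neighbors[s].append((t, w))
        neighbors[t].append((s, w))

    mu = [0.5] * n  # assumed initialization
    for _ in range(sweeps):
        for s in range(n):
            # mu_s <- sigmoid(theta_s + sum_{t in N(s)} theta_st * mu_t)
            field = theta_s[s] + sum(w * mu[t] for t, w in neighbors[s])
            mu[s] = 1.0 / (1.0 + math.exp(-field))
    return mu

theta_s = [0.2, -0.1, 0.3]
theta_st = {(0, 1): 0.5, (1, 2): -0.4}
mu = naive_mean_field(theta_s, theta_st)
print(mu)
```

Because each update maximizes the objective in one coordinate, the objective never decreases across sweeps, but the point reached can depend on the initialization.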

Structured Mean Field

- Considers tractable distributions with additional structure
- For subgraph $H$, let $\mathcal{I}(H)$ be the index set associated with $H$
- With $\mu(H) = \{ \mu_\alpha \mid \alpha \in \mathcal{I}(H) \}$, we have:
  - The subvector $\mu(H)$ can be an arbitrary member of $\mathcal{M}(H)$
  - Dual $\psi^*_H$ depends only on $\mu(H)$, not on $\mu_\beta$, $\beta \in \mathcal{I}(G) \setminus \mathcal{I}(H)$
  - But such $\mu_\beta$ do appear in the $\langle \mu, \theta \rangle$ term
  - Each $\mu_\beta = g_\beta(\mu(H))$, i.e., depends on $\mu(H)$ non-linearly
- The approximate optimization problem can be written as
  $\sup_{\mu(H) \in \mathcal{M}(H)} \left\{ \sum_{\alpha \in \mathcal{I}(H)} \theta_\alpha \mu_\alpha + \sum_{\alpha \in \mathcal{I}^c(H)} \theta_\alpha g_\alpha(\mu(H)) - \psi^*_H(\mu(H)) \right\}$
- For the Ising model, with $H_0 = (V, \emptyset)$, $g_{st}(\mu(H_0)) = \mu_s \mu_t$

Structured Mean Field (Contd.)

- Let $F(\mu(H))$ denote the cost function
- Taking the derivative w.r.t. $\mu_\beta$, $\beta \in \mathcal{I}(H)$, yields
  $\frac{\partial F}{\partial \mu_\beta}(\mu(H)) = \theta_\beta + \sum_{\alpha \in \mathcal{I}^c(H)} \theta_\alpha \frac{\partial g_\alpha}{\partial \mu_\beta}(\mu(H)) - \frac{\partial \psi^*_H}{\partial \mu_\beta}(\mu(H))$
- $\gamma_\beta(H) = \frac{\partial \psi^*_H}{\partial \mu_\beta}(\mu(H))$ is the inverse moment mapping
- Setting the gradient to zero yields the update
  $\gamma_\beta(H) \leftarrow \theta_\beta + \sum_{\alpha \in \mathcal{I}^c(H)} \theta_\alpha \frac{\partial g_\alpha}{\partial \mu_\beta}(\mu(H))$
- For the Ising model, $\frac{\partial g_{st}}{\partial \mu_s} = \mu_t$ and so on
- We recover exactly the naive mean field updates
- In general, $H$ can be more involved
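The claim that the general update reduces to naive mean field for the Ising model can be worked out in a few lines. The derivation below is a sketch that assumes binary variables and $H_0 = (V, \emptyset)$, so that $\psi^*_{H_0}$ is the sum of Bernoulli negative entropies given earlier; $N(s)$ denotes the neighbors of $s$ in $G$.

```latex
\begin{align*}
\psi^*_{H_0}(\mu) &= \sum_{s \in V} \left[ \mu_s \log \mu_s + (1 - \mu_s) \log(1 - \mu_s) \right]
  && \text{(negative entropy of the product distribution)} \\
\gamma_s(H_0) &= \frac{\partial \psi^*_{H_0}}{\partial \mu_s}
  = \log \frac{\mu_s}{1 - \mu_s}
  && \text{(inverse moment mapping)} \\
\frac{\partial g_{st}}{\partial \mu_s} &= \mu_t
  && \text{(since } g_{st} = \mu_s \mu_t \text{)} \\
\log \frac{\mu_s}{1 - \mu_s} &\leftarrow \theta_s + \sum_{t \in N(s)} \theta_{st} \mu_t
  && \text{(general update with } \beta = s \text{)} \\
\mu_s &\leftarrow \frac{1}{1 + \exp\left( -\left( \theta_s + \sum_{t \in N(s)} \theta_{st} \mu_t \right) \right)}
  && \text{(solve for } \mu_s \text{)}
\end{align*}
```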

Non-convexity of Mean Field

- The original problem is concave
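In contrast to the original problem, the reduced naive mean field objective need not be concave once $\mu_{st} = \mu_s \mu_t$ is substituted. The sketch below uses a two-node binary model with hypothetical parameters $\theta_1 = \theta_2 = -w/2$ and $\theta_{12} = w$ with $w = 8$, chosen here only to make the effect visible: the objective at the midpoint of two points is lower than the average of the objective at those points, which a concave function would not allow.

```python
import math

def entropy(m: float) -> float:
    """Entropy of a Bernoulli with mean m, for 0 < m < 1."""
    return -(m * math.log(m) + (1.0 - m) * math.log(1.0 - m))

def objective(mu1: float, mu2: float, w: float = 8.0) -> float:
    """Naive mean field objective for two binary nodes.

    Assumed (illustrative) parameters: theta_1 = theta_2 = -w / 2, theta_12 = w.
    """
    return -0.5 * w * (mu1 + mu2) + w * mu1 * mu2 + entropy(mu1) + entropy(mu2)

f_low = objective(0.02, 0.02)    # both nodes mostly off
f_high = objective(0.98, 0.98)   # both nodes mostly on
f_mid = objective(0.5, 0.5)      # midpoint of the two points above

print(f_low, f_high, f_mid)
print("concavity violated:", f_mid < 0.5 * (f_low + f_high))
```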

CSci 8980: Advanced Topics in Graphical Models, Variational Inference. Instructor: Arindam Banerjee. October 17, 2007.
