OOPS 2020: Mean field methods in high-dimensional statistics and nonconvex optimization
Lecturer: Andrea Montanari. Problem session leader: Michael Celentano.
July 6, 2020

Problem Session 2

Problem 1: from belief propagation to Bayes AMP state evolution

[Figure: the computation tree rooted at variable node $v$, showing neighboring factor nodes $f$, $f'$ with observations $y_f$, $y_{f'}$, edge weights $X_{fv}$, $X_{f'v}$, and further variable nodes $v'$ below them.]

We observe the edge weights $X_{fv}$ and, for each factor node $f$, the outcome
\[
y_f = \sum_{v' \in \partial f} X_{fv'}\,\theta_{v'} + w_f.
\]
Recall
\[
X_{fv} \overset{\mathrm{iid}}{\sim} \mathsf{N}(0, 1/n), \qquad \theta_v \overset{\mathrm{iid}}{\sim} \mu_\Theta, \qquad w_f \overset{\mathrm{iid}}{\sim} \mathsf{N}(0, \sigma^2).
\]
The belief propagation algorithm on the computation tree exactly computes the posterior $p_v(\vartheta \mid \mathcal{T}_{v,2t})$, where $\mathcal{T}_{v,2t}$ is the $\sigma$-algebra generated by the observations corresponding to nodes and edges within a radius-$2t$ ball of $v$. The iteration is
\[
m^0_{v\to f}(\vartheta) = 1,
\]
\[
\tilde{m}^s_{f\to v}(\vartheta) \propto \int \exp\Big(-\frac{1}{2\sigma^2}\Big(y_f - X_{fv}\vartheta - \sum_{v'\in\partial f\setminus v} X_{fv'}\vartheta_{v'}\Big)^{2}\Big) \prod_{v'\in\partial f\setminus v} m^s_{v'\to f}(\vartheta_{v'})\,\mu_\Theta(\mathrm{d}\vartheta_{v'}),
\]
\[
m^{s+1}_{v\to f}(\vartheta) \propto \prod_{f'\in\partial v\setminus f} \tilde{m}^s_{f'\to v}(\vartheta),
\]
with normalization $\int \tilde{m}^t_{f\to v}(\vartheta)\,\mu_\Theta(\mathrm{d}\vartheta) = \int m^t_{v\to f}(\vartheta)\,\mu_\Theta(\mathrm{d}\vartheta) = 1$.

One can show that for any variable node $v$, the posterior density with respect to the measure $\mu_\Theta$ is
\[
p_v(\vartheta \mid \mathcal{T}_{v,2t}) \propto \prod_{f\in\partial v} \tilde{m}^{t-1}_{f\to v}(\vartheta).
\]
This equation is exact. Our goal is to show that when $n, d \to \infty$, $n/d \to \delta$,
\[
p_v(\vartheta \mid \mathcal{T}_{v,2t}) \propto \exp\Big(-\frac{1}{2\tau_t^2}(\chi^t_v - \vartheta)^2 + o_p(1)\Big),
\]
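As a concrete reference for the setup above, here is a minimal Python sketch of the generative model. It is illustrative only: the two-point prior $\mu_\Theta = \mathrm{Unif}\{\pm 1\}$, the dimensions, and the noise level are assumptions, not part of the problem statement.

    import numpy as np

    rng = np.random.default_rng(0)
    n, d, sigma = 2000, 1000, 0.5                        # illustrative; delta = n/d = 2
    delta = n / d

    # Edge weights X_{fv}, signal theta_v ~ mu_Theta, noise w_f, observations y_f.
    X = rng.normal(0.0, 1.0 / np.sqrt(n), size=(n, d))   # X_{fv} ~ N(0, 1/n)
    theta = rng.choice([-1.0, 1.0], size=d)              # assumed prior: Unif{+-1}
    w = rng.normal(0.0, sigma, size=n)                   # w_f ~ N(0, sigma^2)
    y = X @ theta + w                                    # y_f = sum_{v'} X_{fv'} theta_{v'} + w_f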
where $(\chi^t_v, \theta_v) \overset{d}{\to} (\Theta + \tau_t Z, \Theta)$, with $\Theta \sim \mu_\Theta$, $Z \sim \mathsf{N}(0,1)$ independent of $\Theta$, and $\tau_t$ given by the Bayes AMP state evolution equations
\[
\tau^2_{t+1} = \sigma^2 + \frac{1}{\delta}\,\mathrm{mmse}_\Theta(\tau^2_t),
\]
initialized by $\tau^2_0 = \infty$. In fact, this follows without too much work once we show that
\[
m^s_{v\to f}(\vartheta) \propto \exp\Big(-\frac{1}{2\tau_s^2}(\chi^s_{v\to f} - \vartheta)^2 + o_p(1)\Big), \tag{1}
\]
where $(\chi^s_{v\to f}, \theta_v) \overset{d}{\to} (\Theta + \tau_s Z, \Theta)$. This problem focuses on establishing (1). We do so inductively. The base case more-or-less follows the standard inductive step, except that we need to pay some attention to the infinite variance $\tau^2_0 = \infty$. We do not consider the base case here. Throughout, we assume $\mu_\Theta$ has compact support. We do not carefully verify the validity of all approximations. See Celentano, Montanari, and Wu, "The estimation error of general first order methods," COLT 2020, for complete details.
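Before working through the derivation, it may help to see the state evolution recursion numerically. The sketch below (an illustration, not part of the problem set) iterates $\tau^2_{t+1} = \sigma^2 + \delta^{-1}\mathrm{mmse}_\Theta(\tau^2_t)$ for the assumed prior $\mu_\Theta = \mathrm{Unif}\{\pm 1\}$: in the scalar channel $Y = \Theta + \tau Z$ the posterior mean is $\tanh(Y/\tau^2)$, so the mmse can be estimated by Monte Carlo, and the initialization $\tau^2_0 = \infty$ corresponds to $\mathrm{mmse}_\Theta(\infty) = \mathrm{Var}(\Theta) = 1$.

    import numpy as np

    def mmse_rademacher(tau2, n_mc=200_000, rng=np.random.default_rng(1)):
        # Monte Carlo estimate of mmse_Theta(tau2) for Theta ~ Unif{+-1}.
        # In the channel Y = Theta + tau*Z the posterior mean is tanh(Y / tau2),
        # so mmse = E[Theta^2] - E[tanh(Y / tau2)^2] = 1 - E[tanh(Y / tau2)^2].
        if np.isinf(tau2):
            return 1.0                   # no information: mmse = Var(Theta) = 1
        theta = rng.choice([-1.0, 1.0], size=n_mc)
        y = theta + np.sqrt(tau2) * rng.standard_normal(n_mc)
        return 1.0 - np.mean(np.tanh(y / tau2) ** 2)

    sigma2, delta = 0.25, 2.0            # illustrative values
    tau2 = np.inf                        # tau_0^2 = infinity
    for t in range(10):
        tau2 = sigma2 + mmse_rademacher(tau2) / delta
        print(f"t = {t + 1}: tau_t^2 = {tau2:.4f}")

The iterates decrease monotonically to a fixed point of the recursion, which is the effective noise level achieved by Bayes AMP in this example.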
(a) Define
\[
\mu^s_{v\to f} = \int \vartheta\, m^s_{v\to f}(\vartheta)\,\mu_\Theta(\mathrm{d}\vartheta), \qquad (\tau^s_{v\to f})^2 = \int \vartheta^2\, m^s_{v\to f}(\vartheta)\,\mu_\Theta(\mathrm{d}\vartheta) - (\mu^s_{v\to f})^2,
\]
and
\[
\tilde{\mu}^s_{f\to v} = \sum_{v'\in\partial f\setminus v} X_{fv'}\,\mu^s_{v'\to f}, \qquad (\tilde{\tau}^s_{f\to v})^2 = \sum_{v'\in\partial f\setminus v} X^2_{fv'}\,(\tau^s_{v'\to f})^2.
\]
Argue (non-rigorously) that we may approximate (up to normalization)
\[
\tilde{m}^s_{f\to v}(\vartheta) \approx \mathbb{E}_G\, p\big(X_{fv}\vartheta + \tilde{\mu}^s_{f\to v} + \tilde{\tau}^s_{f\to v}\, G - y_f\big),
\]
where $G \sim \mathsf{N}(0,1)$ and $p(x) = \frac{1}{\sqrt{2\pi}\,\sigma} e^{-\frac{1}{2\sigma^2}x^2}$ is the normal density with variance $\sigma^2$.

Remark: The quantities $\mu^s_{v\to f}$ and $(\tau^s_{v\to f})^2$ have a simple statistical interpretation: they are the posterior mean and variance of $\theta_v$ given the observations in the computation tree within distance $2s$ of node $v$ and excluding the branch in the direction of $f$.

(b) Using the inductive hypothesis, show that as $n, d \to \infty$, $n/d \to \delta$,
\[
(\tilde{\tau}^s_{f\to v})^2 \overset{p}{\to} \frac{1}{\delta}\,\mathrm{mmse}_\Theta(\tau^2_s) =: \tilde{\tau}^2_s.
\]
Further, note $y_f - \tilde{\mu}^s_{f\to v} = X_{fv}\theta_v + \tilde{Z}^s_{f\to v}$, where
\[
\tilde{Z}^s_{f\to v} = w_f + \sum_{v'\in\partial f\setminus v} X_{fv'}(\theta_{v'} - \mu^s_{v'\to f}).
\]
Argue
\[
\tilde{Z}^s_{f\to v} \overset{d}{\to} \mathsf{N}\Big(0,\ \sigma^2 + \frac{1}{\delta}\,\mathrm{mmse}_\Theta(\tau^2_s)\Big)
\]
and is independent of $X_{fv}$ and $\theta_v$.

Hint: The (random) functions $m^s_{v'\to f}$, as $v'$ varies in $\partial f$, are iid and independent of the edge weights $X_{fv'}$. Why?

(c) For any smooth probability density $f : \mathbb{R} \to \mathbb{R}_{>0}$, $\tilde{\mu} \in \mathbb{R}$, and $\tilde{\tau} > 0$, show that
\[
\frac{\mathrm{d}}{\mathrm{d}\tilde{\mu}} \log \mathbb{E}_G[f(\tilde{\mu} + \tilde{\tau} G)] = -\frac{1}{\tilde{\tau}}\,\mathbb{E}[G \mid S + \tilde{\tau} G = \tilde{\mu}],
\]
\[
\frac{\mathrm{d}^2}{\mathrm{d}\tilde{\mu}^2} \log \mathbb{E}_G[f(\tilde{\mu} + \tilde{\tau} G)] = -\frac{1}{\tilde{\tau}^2}\big(1 - \mathrm{Var}[G \mid S + \tilde{\tau} G = \tilde{\mu}]\big),
\]
where $S \sim f(s)\,\mathrm{d}s$ independent of $G \sim \mathsf{N}(0,1)$.

(d) We Taylor expand
\[
\log \tilde{m}^s_{f\to v}(\vartheta) \approx \mathrm{const} + X_{fv}\,\tilde{a}^s_{f\to v}\,\vartheta - \frac{1}{2}\, X^2_{fv}\,\tilde{b}^s_{f\to v}\,\vartheta^2 + O_p(n^{-3/2}).
\]
(We take this to be the definition of $\tilde{a}^s_{f\to v}$ and $\tilde{b}^s_{f\to v}$.) Taking the approximation in part (a) to hold with equality, argue
\[
\tilde{a}^s_{f\to v} = \frac{1}{\tau^2_{s+1}}\big(y_f - \tilde{\mu}^s_{f\to v}\big) + o_p(1), \qquad \tilde{b}^s_{f\to v} = \frac{1}{\tau^2_{s+1}} + o_p(1).
\]
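The identities in part (c) are Tweedie-type formulas, and they can be sanity-checked numerically. The sketch below (illustrative, with an assumed density $f$) computes $h(\tilde{\mu}) = \mathbb{E}_G[f(\tilde{\mu} + \tilde{\tau}G)]$ on a grid, differentiates $\log h$ by finite differences, and compares against the conditional moments of $G$ given $S + \tilde{\tau}G = \tilde{\mu}$, whose conditional density is proportional to $\phi(g)\,f(\tilde{\mu} - \tilde{\tau}g)$.

    import numpy as np

    # Assumed example density for S (any smooth positive density works here).
    f = lambda s: np.exp(-s**2 / 2) / np.sqrt(2 * np.pi)
    phi = lambda g: np.exp(-g**2 / 2) / np.sqrt(2 * np.pi)

    tau, mu, eps = 0.7, 0.3, 1e-4
    g = np.linspace(-10.0, 10.0, 20001)      # quadrature grid for G
    dg = g[1] - g[0]

    def log_h(m):
        # log E_G[f(m + tau*G)], by Riemann sum over the grid
        return np.log(np.sum(f(m + tau * g) * phi(g)) * dg)

    # Left-hand sides: finite-difference derivatives of log h at mu.
    d1 = (log_h(mu + eps) - log_h(mu - eps)) / (2 * eps)
    d2 = (log_h(mu + eps) - 2 * log_h(mu) + log_h(mu - eps)) / eps**2

    # Right-hand sides: conditional moments of G given S + tau*G = mu;
    # the conditional density of G is proportional to phi(g) * f(mu - tau*g).
    w = phi(g) * f(mu - tau * g)
    w /= np.sum(w) * dg
    e_g = np.sum(g * w) * dg                 # E[G | S + tau*G = mu]
    v_g = np.sum(g**2 * w) * dg - e_g**2     # Var[G | S + tau*G = mu]

    print(d1, -e_g / tau)                    # first identity: should agree
    print(d2, -(1 - v_g) / tau**2)           # second identity: should agree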
(e) Taking the approximations in part (d) to hold with equality and using part (b) to substitute for $y_f - \tilde{\mu}^s_{f\to v}$, Taylor expand $\log m^{s+1}_{v\to f}(\vartheta)$ to conclude
\[
\log m^{s+1}_{v\to f}(\vartheta) = \mathrm{const} + \frac{1}{\tau^2_{s+1}}\,\chi^{s+1}_{v\to f}\,\vartheta - \frac{1}{2\tau^2_{s+1}}\,\vartheta^2 + o_p(1),
\]
where $(\chi^{s+1}_{v\to f}, \theta_v) \overset{d}{\to} (\Theta + \tau_{s+1} Z, \Theta)$. Why do we expect this Taylor expansion to be valid for all $\vartheta = O(1)$? Conclude Eq. (1).
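Eq. (1) is exactly the prediction that makes Bayes AMP work: in the algorithmic (non-tree) setting, the AMP effective observation plays the role of $\chi^t_v$ and should look like $\theta_v$ plus $\mathsf{N}(0, \tau^2_t)$ noise. As a numerical companion, here is a sketch of this check. It uses the standard AMP iteration with Onsager correction for the assumed prior $\mu_\Theta = \mathrm{Unif}\{\pm 1\}$ (denoiser $\tanh(\cdot/\tau^2)$, residual-based estimate of $\tau^2_t$); none of these algorithmic details are spelled out in the problem set itself.

    import numpy as np

    rng = np.random.default_rng(0)
    n, d, sigma = 4000, 2000, 0.5
    delta = n / d
    X = rng.normal(0.0, 1.0 / np.sqrt(n), size=(n, d))
    theta = rng.choice([-1.0, 1.0], size=d)
    y = X @ theta + rng.normal(0.0, sigma, size=n)

    # Bayes AMP for mu_Theta = Unif{+-1}: posterior-mean denoiser eta(r) = tanh(r / tau2).
    theta_hat = np.zeros(d)
    z = y.copy()
    for t in range(10):
        tau2 = z @ z / n                 # residual-based estimate of tau_t^2
        chi = theta_hat + X.T @ z        # effective observation: approx theta + N(0, tau2)
        eta_prime = (1 - np.tanh(chi / tau2) ** 2) / tau2
        theta_hat = np.tanh(chi / tau2)  # posterior mean at noise level tau2
        z = y - X @ theta_hat + (z / delta) * eta_prime.mean()   # Onsager correction
        # State-evolution check: chi - theta should be approximately N(0, tau2).
        print(f"t = {t}: tau2 = {tau2:.4f}, Var(chi - theta) = {np.var(chi - theta):.4f}")

At every iteration the empirical variance of $\chi - \theta$ should track $\tau^2_t$, and both should track the state evolution recursion from the previous sketch.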