Boosted Density Estimation Remastered

Zac Cranko (1, 2) and Richard Nock (2, 1, 3)

1 The Australian National University   2 CSIRO Data61   3 The University of Sydney
Quick Summary

• Learn a density function incrementally
• Use classifiers for the incremental updates (similar to GAN discriminators)
• Unlike other state-of-the-art attempts, achieve strong (geometric) convergence results using only a weak learning assumption on the classifiers (details in the paper!)
The GAN-style discriminator objective:

\[
\sup_{D : \mathcal{X} \to (0,1)} \mathbb{E}_{Q_0}[\log D] + \mathbb{E}_{P}[\log(1 - D)]
\]
Take \(f(t) \overset{\text{def}}{=} t \log t - (t+1)\log(t+1)\) and \(\varphi(D) \overset{\text{def}}{=} \frac{D}{1-D}\). Then

\[
\begin{aligned}
\sup_{D : \mathcal{X} \to (0,1)} \mathbb{E}_{Q_0}[\log D] + \mathbb{E}_{P}[\log(1 - D)]
&= \sup_{D : \mathcal{X} \to (0,1)} \mathbb{E}_{Q_0}[f' \circ \varphi \circ D] - \mathbb{E}_{P}[f^{*} \circ f' \circ \varphi \circ D] \\
&= \sup_{d : \mathcal{X} \to (0,\infty)} \mathbb{E}_{Q_0}[f' \circ d] - \mathbb{E}_{P}[f^{*} \circ f' \circ d] \\
&= \mathbb{E}_{Q_0}\!\left[f' \circ \tfrac{\mathrm{d}P}{\mathrm{d}Q_0}\right] - \mathbb{E}_{P}\!\left[f^{*} \circ f' \circ \tfrac{\mathrm{d}P}{\mathrm{d}Q_0}\right]
\end{aligned}
\]

Recall (change of measure): for any integrable \(g\),

\[
\int g(x)\, P(\mathrm{d}x) = \int g(x)\, \frac{\mathrm{d}P}{\mathrm{d}Q_0}(x)\, Q_0(\mathrm{d}x).
\]
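As a quick numerical sanity check of the reduction above (a minimal sketch of ours, not the paper's code; NumPy is assumed, and \(f^{*}(u) = -\log(1 - e^{u})\), valid for \(u < 0\), is the convex conjugate of this \(f\)): with this \(f\) and \(\varphi\) one has \(f' \circ \varphi \circ D = \log D\) and \(f^{*} \circ f' \circ \varphi \circ D = -\log(1 - D)\).

```python
import numpy as np

# Definitions from the slide: f(t) = t log t - (t + 1) log(t + 1), phi(D) = D / (1 - D).
def f_prime(t):
    # f'(t) = log(t / (t + 1))
    return np.log(t) - np.log(t + 1.0)

def f_conj(u):
    # Convex conjugate of f: f*(u) = -log(1 - e^u), defined for u < 0.
    return -np.log(1.0 - np.exp(u))

def phi(D):
    return D / (1.0 - D)

D = np.linspace(0.01, 0.99, 99)                                  # discriminator outputs in (0, 1)
assert np.allclose(f_prime(phi(D)), np.log(D))                   # f' o phi o D = log D
assert np.allclose(f_conj(f_prime(phi(D))), -np.log(1.0 - D))    # f* o f' o phi o D = -log(1 - D)
print("composition identities hold")
```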
Main Idea

\[
d_1 \in \operatorname*{argmax}_{d' : \mathcal{X} \to (0,\infty)} \; \mathbb{E}_{Q_0}[f' \circ d'] - \mathbb{E}_{P}[f^{*} \circ f' \circ d']
\]

1. Find d_1 as above
2. Multiply: d_1(x) Q_0(dx) recovers P(dx)
3. Finished. Get a job at a hedge fund next door

Unfortunately it is not so simple: in practice we can only approximately solve the maximisation. Sadface. (A toy sketch of this one-shot recipe follows below.)
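For intuition only, here is a minimal sketch of the one-shot recipe on a 1-D toy problem (our illustration, not the authors' code; scikit-learn's LogisticRegression, the Gaussian choices for P and Q_0, and the helper names q0_density and d1 are all assumptions): a probabilistic classifier separating samples of P from samples of Q_0 yields the ratio estimate d_1(x) ≈ D(x)/(1 − D(x)), which is then multiplied onto the known density of Q_0.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Toy setup: P is the unknown target, Q_0 is an initial guess we can sample from
# and whose density q0 we know in closed form (standard normal here).
x_p  = rng.normal(loc=2.0, scale=1.0, size=(5000, 1))   # samples from P
x_q0 = rng.normal(loc=0.0, scale=1.0, size=(5000, 1))   # samples from Q_0

def q0_density(x):
    return np.exp(-0.5 * x**2) / np.sqrt(2.0 * np.pi)

# Train a classifier to separate P (label 1) from Q_0 (label 0).
X = np.vstack([x_p, x_q0])
y = np.concatenate([np.ones(len(x_p)), np.zeros(len(x_q0))])
clf = LogisticRegression().fit(X, y)

# If D(x) = Pr(label = 1 | x), then D / (1 - D) estimates dP/dQ_0 (equal class priors).
def d1(x):
    D = clf.predict_proba(x.reshape(-1, 1))[:, 1]
    return D / (1.0 - D)

# "Multiply": the estimated density of P is d_1(x) * q0(x).
grid = np.linspace(-4.0, 6.0, 200)
p_hat = d1(grid) * q0_density(grid)
print(p_hat.sum() * (grid[1] - grid[0]))   # should be close to 1 if the estimate is good
```

With equal sample sizes from each distribution the classifier's odds estimate the density ratio directly; with unequal sizes one would rescale by the ratio of class priors.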
Solution

\[
d_t \in \operatorname*{argmax}_{d' : \mathcal{X} \to (0,\infty)} \; \mathbb{E}_{Q_{t-1}}[f' \circ d'] - \mathbb{E}_{P}[f^{*} \circ f' \circ d']
\]

\[
Q_t(\mathrm{d}x) \overset{\text{def}}{=} \frac{1}{Z_t}\, \tilde{Q}_t(\mathrm{d}x), \qquad \tilde{Q}_t(\mathrm{d}x) = d_t^{\alpha_t}(x)\, Q_{t-1}(\mathrm{d}x), \qquad Z_t = \int \mathrm{d}\tilde{Q}_t
\]

1. Some step-size parameters α_t ∈ (0, 1)
2. Treat the updates as classifiers: d_t = exp ∘ c_t

• The classifiers distinguish samples originating from P from samples of Q_{t−1}, as in a GAN
• However, unlike a GAN, there is not necessarily a simple, fast sampler for Q_{t−1}; there is, however, a closed-form density function

Convergence of Q_t → P in KL divergence under a weak learning assumption on the updates-as-classifiers. With additional minimal assumptions: geometric convergence. (A toy sketch of the boosting loop follows below.)
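A minimal numerical sketch of this loop on the same 1-D toy problem (again our illustration, not the paper's algorithm verbatim; the fixed step size α_t = 0.5, the grid-based computation of Z_t, and the use of LogisticRegression log-odds as c_t are all assumptions):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Toy 1-D target P and initial model Q_0 (standard normal), tracked on a grid.
grid = np.linspace(-6.0, 8.0, 2001)
dx = grid[1] - grid[0]
p = np.exp(-0.5 * (grid - 2.0) ** 2) / np.sqrt(2 * np.pi)      # density of P (N(2, 1))
q = np.exp(-0.5 * grid ** 2) / np.sqrt(2 * np.pi)              # density of Q_0 (N(0, 1))

x_p = rng.normal(2.0, 1.0, size=5000)                          # samples from P
alpha = 0.5                                                    # fixed step size (assumption)

for t in range(1, 6):
    # Draw (approximate) samples from Q_{t-1} using its grid density.
    x_q = rng.choice(grid, size=5000, p=q * dx / np.sum(q * dx))

    # Classifier c_t distinguishing P (label 1) from Q_{t-1} (label 0); d_t = exp o c_t.
    X = np.concatenate([x_p, x_q]).reshape(-1, 1)
    y = np.concatenate([np.ones(5000), np.zeros(5000)])
    clf = LogisticRegression().fit(X, y)
    c_t = clf.decision_function(grid.reshape(-1, 1))           # classifier log-odds on the grid
    d_t = np.exp(c_t)                                          # d_t = exp(c_t)

    # Multiplicative update: Q_t proportional to d_t^alpha * Q_{t-1}, renormalised by Z_t.
    q_tilde = d_t ** alpha * q
    Z_t = np.sum(q_tilde) * dx
    q = q_tilde / Z_t

    kl = np.sum(p * np.log(p / q)) * dx                        # KL(P, Q_t) on the grid
    print(f"t = {t}: KL(P, Q_t) = {kl:.4f}")
```

Because the model density q is tracked in closed form on the grid, sampling from Q_{t−1} here is only a convenience for training the classifier; the update itself needs just the density.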
Experiments

[Figure: discriminator accuracy (0.5 if Q_t → P) and KL(P, Q_t) (lower is better), plotted against boosting iteration t = 0, ..., 5, with density snapshots at t = 0, 1, 2, 3.]
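The accuracy diagnostic in the figure can be mimicked in the toy setting above (our sketch; the held-out LogisticRegression and the function name discriminator_accuracy are assumptions): train a fresh classifier on samples of P versus samples of the current Q_t and report its held-out accuracy, which should approach 0.5 as Q_t → P.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

def discriminator_accuracy(x_p, x_q, seed=0):
    """Held-out accuracy of a fresh classifier separating samples of P from samples of Q_t.

    Values near 0.5 indicate the two sample sets are indistinguishable (Q_t close to P)."""
    X = np.concatenate([x_p, x_q]).reshape(-1, 1)
    y = np.concatenate([np.ones(len(x_p)), np.zeros(len(x_q))])
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=seed)
    clf = LogisticRegression().fit(X_tr, y_tr)
    return clf.score(X_te, y_te)

# Example: a recovered model scores ~0.5, a poor one well above it.
rng = np.random.default_rng(0)
print(discriminator_accuracy(rng.normal(2, 1, 5000), rng.normal(2, 1, 5000)))  # ~0.5
print(discriminator_accuracy(rng.normal(2, 1, 5000), rng.normal(0, 1, 5000)))  # well above 0.5
```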
Thanks for listening, come chat to us at poster #161. (Bring beer!) 7