Learning from random moments
  1. Learning from random moments
     Rémi Gribonval - Inria Rennes - Bretagne Atlantique, remi.gribonval@inria.fr
     Joint work with: G. Blanchard (U. Potsdam), N. Keriven, Y. Traonmilin (Inria Rennes)
     Talk given at Inverse Problems and Machine Learning, Caltech, February 2018

  2. Main Contributors & Collaborators
     Anthony Bourrier, Nicolas Keriven, Yann Traonmilin, Gilles Puy, Nicolas Tremblay, Gilles Blanchard, Mike Davies, Patrick Pérez

  3. Foreword
     Signal processing & machine learning:
     - inverse problems & the generalized method of moments
     - embeddings with random projections & random features / kernels
     - image super-resolution, source localization & k-means
     Continuous vs discrete?
     - wavelets (1990s): from continuous to discrete
     - compressive sensing (2000s): in the discrete world
     - current trends: back to continuous! off-the-grid compressive sensing, FRI, high-resolution methods, compressive statistical learning from random moments

  4. Outline:
     - Learning from random moments: the concept
     - Compressive Statistical Learning (guarantees)
     - Recent developments & perspectives

  5-7. Large-scale learning
     Data collection X = [x_1, x_2, ..., x_n]: high feature dimension d, large collection size n ("volume").
     Challenge: compress X before learning?

  8. Compressive learning: three routes (dimension reduction / subsampling / sketching)
     Route 1, dimension reduction: Y = M X via random projections (Johnson-Lindenstrauss lemma); see e.g. [Calderbank & al 2009, Reboredo & al 2013]. A code sketch of this route follows below.
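To make route 1 concrete, here is a minimal NumPy illustration of a random Gaussian projection (not code from the talk; the data and dimensions are placeholders I chose):

```python
# Route 1, dimension reduction: Y = M X with a random Gaussian matrix M.
# Johnson-Lindenstrauss: pairwise distances are approximately preserved
# when the target dimension m grows like log(n) / eps^2.
import numpy as np

rng = np.random.default_rng(0)
d, n, m = 784, 10000, 64                      # feature dim, #samples, target dim
X = rng.standard_normal((d, n))               # placeholder data, one column per sample
M = rng.standard_normal((m, d)) / np.sqrt(m)  # 1/sqrt(m) keeps squared norms unbiased
Y = M @ X                                     # compressed features, still n columns
```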

  9. Compressive learning: three routes
     Route 2, subsampling: keep a reduced subset of the samples themselves; Nyström method & coresets, see e.g. [Williams & Seeger 2000, Agarwal & al 2003, Feldman 2010].
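In its simplest form (uniform subsampling, a crude coreset; Nyström-style methods select columns more carefully), route 2 might look like this; again an illustrative sketch, not the speaker's code:

```python
# Route 2, subsampling: keep a random subset of the collection itself.
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((784, 10000))                  # placeholder collection
idx = rng.choice(X.shape[1], size=500, replace=False)  # uniform sample of columns
X_sub = X[:, idx]                                      # 500 representative samples
```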

  10. Compressive learning: three routes
      Route 3, sketching: compress the whole collection into a vector of random (generalized) moments z = (E Φ_1(X), ..., E Φ_m(X)) ∈ R^m.
      Inspiration: compressive sensing [Foucart & Rauhut 2013]; sketching / hashing [Thaper & al 2002, Cormode & al 2005].
      Connections with: the generalized method of moments [Hall 2005]; kernel mean embeddings [Smola & al 2007, Sriperumbudur & al 2010].
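A minimal sketch of route 3, replacing the expectations by empirical averages z_j = (1/n) Σ_i Φ_j(x_i). The particular choice of Φ here (random linear measurements through a cosine) is my placeholder; the k-means example that follows uses random Fourier features:

```python
# Route 3, sketching: the whole collection is summarized by m empirical
# generalized moments, one number per moment rather than per sample.
import numpy as np

rng = np.random.default_rng(0)
d, n, m = 784, 10000, 100
X = rng.standard_normal((d, n))     # placeholder collection
W = rng.standard_normal((m, d))     # random parameters of the m moment functions
z = np.cos(W @ X).mean(axis=1)      # z in R^m: empirical average of Phi(x_i)
```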

  11-17. Example: Compressive K-means
      Training set X: n = 70000 images, raw dimension d = 784, k = 10 classes; clustering is done on d = 10 spectral features.
      [Figure: 2D scatter plots of the spectral features (Dim. 5 vs Dim. 6), showing the data and the recovered centroids]
      Sketch vector: z = Sketch(X) := (1/n) Σ_{i=1}^n Φ(x_i) ∈ R^m, with m of the order of kd = 100
      - memory size independent of n
      - streaming / distributed computation
      - privacy-aware
      Learn centroids from the sketch = moment fitting.
      Using random Fourier features: Φ(x) := { e^{ı ω_j^⊤ x} }_{j=1}^m, a vector-valued function.
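As a concrete illustration of the slide's formula, here is a minimal NumPy implementation of the random-Fourier-feature sketch. The function name sketch_rff, the Gaussian draw of the frequencies, and the placeholder data are my assumptions, not details from the talk:

```python
# Compressive K-means sketching: z = (1/n) sum_i exp(i Omega^T x_i), the
# empirical characteristic function sampled at m random frequencies.
import numpy as np

def sketch_rff(X, Omega):
    """Random-Fourier-feature sketch of X (d, n) at frequencies Omega (d, m)."""
    return np.exp(1j * (Omega.T @ X)).mean(axis=1)   # (m,) complex moments

rng = np.random.default_rng(0)
d, k, n = 10, 10, 70000              # dimensions from the slide (spectral features)
m = k * d                            # sketch size of the order of k*d = 100
X = rng.standard_normal((d, n))      # placeholder for the spectral features
Omega = rng.standard_normal((d, m))  # frequencies drawn at random (assumed Gaussian)
z = sketch_rff(X, Omega)             # m numbers; size independent of n
```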

  18-24. Sketching & Neural networks
      Sketching for k-means uses the empirical characteristic function:
      z_ℓ = (1/n) Σ_{i=1}^n e^{j w_ℓ^⊤ x_i},  ℓ = 1, ..., m
      In network form: stack the m frequencies w_ℓ as the rows of W, compute WX, apply the pointwise nonlinearity h(·) = e^{j(·)}, and average h(WX) over the n samples to obtain z.
      The sketch is thus ~ a one-layer random neural net; is a DNN ~ hierarchical sketching? See also [Bruna & al 2013, Giryes & al 2015].
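To make the one-layer-network reading explicit, here is a self-contained sketch: a linear layer with random untrained weights, the pointwise activation from the slide, then average pooling over the samples (placeholder data and sizes):

```python
# The sketch read as a one-layer random neural net.
import numpy as np

rng = np.random.default_rng(0)
d, m, n = 10, 100, 1000
X = rng.standard_normal((d, n))
W = rng.standard_normal((m, d))   # random, untrained "layer" weights
h = lambda t: np.exp(1j * t)      # the activation h(.) = e^{j(.)} from the slide
z = h(W @ X).mean(axis=1)         # average pooling over samples -> the sketch z
```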

  25. Sketching & Privacy
      Privacy-preserving ("sketch and forget"): once the sketch z is computed, the raw samples x_i can be discarded; only the aggregate is kept.

  26. Sketching & Online Learning
      Streaming algorithms: one pass over the data, online update of the sketch (illustrated below).
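Since the sketch is an average, it admits the standard streaming-mean recursion z_t = z_{t-1} + (Φ(x_t) - z_{t-1}) / t, which needs O(m) memory regardless of n. A minimal sketch of this update (not code from the talk):

```python
# One-pass streaming update of the sketch with constant memory.
import numpy as np

rng = np.random.default_rng(0)
d, m = 10, 100
W = rng.standard_normal((m, d))
z, t = np.zeros(m, dtype=complex), 0
for x in rng.standard_normal((1000, d)):   # stand-in for a data stream
    t += 1
    z += (np.exp(1j * (W @ x)) - z) / t    # running mean of Phi(x) = exp(j W x)
```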

  27. Sketching & Distributed Computing
      Distributed computing: decentralized (Hadoop) / parallel (GPU); partial sketches computed on data shards are merged by averaging (illustrated below).
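Because averages merge exactly, each worker can sketch its own shard and the global sketch is the size-weighted average of the partial sketches. A single-process mock-up of this (my construction, not the speaker's code):

```python
# Distributed sketching: local sketches merge into the global one.
import numpy as np

rng = np.random.default_rng(0)
d, m, n = 10, 100, 4000
X = rng.standard_normal((d, n))
W = rng.standard_normal((m, d))
sketch = lambda S: np.exp(1j * (W @ S)).mean(axis=1)

shards = np.array_split(X, 4, axis=1)            # 4 "workers"
partial = [sketch(S) for S in shards]            # computable in parallel
sizes = [S.shape[1] for S in shards]
z = np.average(partial, axis=0, weights=sizes)   # merge = weighted mean
assert np.allclose(z, sketch(X))                 # identical to sketching all of X
```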

  28. Outline (recap):
      - Learning from random moments: the concept
      - Compressive Statistical Learning (guarantees)
      - Recent developments & perspectives

  29. Statistical learning 101
      Statistical risk: R(p, θ) := E_{x∼p} ℓ(x, θ)
      Target: θ* ∈ arg min_θ R(p*, θ)
      Empirical version: θ̂_n ∈ arg min_θ R(p̂_n, θ), where p̂_n := (1/n) Σ_{i=1}^n δ_{x_i} and x_i ∼ p*, i.i.d.
      PAC / excess risk control / generalization error: R(p*, θ̂_n) ≤ R(p*, θ*) + η_n
      This can be achieved under uniform convergence, i.e. if w.h.p. sup_θ |R(p̂_n, θ) − R(p*, θ)| ≤ η_n/2: then R(p*, θ̂_n) ≤ R(p̂_n, θ̂_n) + η_n/2 ≤ R(p̂_n, θ*) + η_n/2 ≤ R(p*, θ*) + η_n.
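To ground the abstract risk in the talk's running example, here is the empirical risk R(p̂_n, θ) for the k-means loss ℓ(x, θ) = min_c ||x − θ_c||²; an illustrative sketch with placeholder data, not code from the talk:

```python
# Empirical risk for the k-means loss: R(p_n, theta) = (1/n) sum_i min_c ||x_i - theta_c||^2.
import numpy as np

def empirical_risk(X, centroids):
    """X is (d, n); centroids is (d, k), holding theta = (theta_1, ..., theta_k)."""
    d2 = ((X[:, None, :] - centroids[:, :, None]) ** 2).sum(axis=0)  # (k, n) squared distances
    return d2.min(axis=0).mean()      # nearest-centroid distance, averaged over samples

rng = np.random.default_rng(0)
X = rng.standard_normal((10, 5000))
theta = rng.standard_normal((10, 10))  # k = 10 candidate centroids
print(empirical_risk(X, theta))
```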
