EM Algorithm and Mixture Models
Guojun Zhang, University of Waterloo
Unsupervised learning and clustering • Learn the intrinsic representation of unlabeled data • Other examples: density estimation, novelty detection
Mixture model • Continuous: mixture of Gaussians • Discrete: mixture of Bernoullis
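As a concrete sketch (the symbols $\pi_k$, $\mu_k$, $\Sigma_k$ and $p_{kd}$ are assumed notation, not taken from the slides), a $K$-component mixture density for $D$-dimensional data has the form
\[
\text{Gaussian mixture:}\quad
p(x) \;=\; \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x \mid \mu_k, \Sigma_k),
\qquad \pi_k \ge 0,\ \ \sum_{k=1}^{K} \pi_k = 1,
\]
\[
\text{Bernoulli mixture:}\quad
p(x) \;=\; \sum_{k=1}^{K} \pi_k \prod_{d=1}^{D} p_{kd}^{\,x_d}\,(1-p_{kd})^{1-x_d},
\qquad x \in \{0,1\}^{D}.
\]
Each data point is modeled as being generated by first picking a component $k$ with probability $\pi_k$ and then sampling from that component.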
Gaussian: a bell-shaped density over continuous values • Bernoulli: flipping a coin (a binary outcome)
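For reference, the two component distributions (standard definitions, written here in one dimension and not reproduced from the slides) are
\[
\mathcal{N}(x \mid \mu, \sigma^2) \;=\; \frac{1}{\sqrt{2\pi\sigma^2}}
\exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right),
\qquad
\mathrm{Bern}(x \mid p) \;=\; p^{x}(1-p)^{1-x}, \quad x \in \{0,1\},
\]
where the Bernoulli parameter $p$ is the probability that the coin lands heads ($x = 1$).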
Optimization algorithms • Loss function: negative log likelihood • Expectation-Maximization (DLR 1977); see the update sketch below
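A minimal sketch of the EM updates for a $K$-component Gaussian mixture, in the standard form given by Bishop (2006); the responsibility notation $\gamma_{nk}$ is an assumption, not taken from the slides.
E-step (compute responsibilities with the current parameters):
\[
\gamma_{nk} \;=\;
\frac{\pi_k \, \mathcal{N}(x_n \mid \mu_k, \Sigma_k)}
     {\sum_{j=1}^{K} \pi_j \, \mathcal{N}(x_n \mid \mu_j, \Sigma_j)}.
\]
M-step (re-estimate the parameters with the responsibilities held fixed):
\[
N_k = \sum_{n=1}^{N} \gamma_{nk}, \qquad
\mu_k^{\text{new}} = \frac{1}{N_k} \sum_{n=1}^{N} \gamma_{nk}\, x_n, \qquad
\Sigma_k^{\text{new}} = \frac{1}{N_k} \sum_{n=1}^{N} \gamma_{nk}
  (x_n - \mu_k^{\text{new}})(x_n - \mu_k^{\text{new}})^{\top}, \qquad
\pi_k^{\text{new}} = \frac{N_k}{N}.
\]
Each EM iteration is guaranteed not to decrease the likelihood, which is the main appeal of EM over generic first-order methods.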
Optimization algorithms • Loss function: negative log likelihood • Gradient descent; see the update sketch below
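For comparison, gradient descent takes steps on the same loss; here $\theta$ collects all mixture parameters and $\eta$ is a step size (both symbols are assumed notation):
\[
\ell(\theta) \;=\; -\sum_{n=1}^{N} \log \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x_n \mid \mu_k, \Sigma_k),
\qquad
\theta^{(t+1)} \;=\; \theta^{(t)} - \eta \, \nabla_\theta\, \ell\big(\theta^{(t)}\big).
\]
In practice the constraints (mixing weights on the simplex, covariances positive definite) are handled by reparameterization or projection; the slides do not specify which variant is intended.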
k-cluster region • What if only some of the clusters are actually used? Has the algorithm learned the ground truth? How bad are these regions? (A small numerical sketch follows.)
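To make the idea of a 1-cluster region concrete, here is a minimal numerical sketch under assumed simplifications that are not taken from the slides: two Gaussian components with known unit variances and equal weights, so only the means are updated. If both means are initialized identically, every point receives responsibility 1/2 for each component and EM leaves the two components merged, even though the data contain two well-separated clusters; a small perturbation lets EM separate them.

import numpy as np

rng = np.random.default_rng(0)

# Data from two well-separated unit-variance Gaussian clusters at -3 and +3.
x = np.concatenate([rng.normal(-3.0, 1.0, 500), rng.normal(3.0, 1.0, 500)])

def em_step_means(x, mu):
    """One EM update for a 2-component 1-D Gaussian mixture with known
    unit variances and equal weights (only the means are re-estimated)."""
    # E-step: responsibilities of each component for each point.
    dens = np.exp(-0.5 * (x[None, :] - mu[:, None]) ** 2)   # shape (2, N)
    resp = dens / dens.sum(axis=0, keepdims=True)
    # M-step: responsibility-weighted means.
    return (resp * x).sum(axis=1) / resp.sum(axis=1)

# Symmetric initialization: both means identical -> a "1-cluster" fixed point.
mu = np.array([0.0, 0.0])
for _ in range(30):
    mu = em_step_means(x, mu)
print("merged init   :", mu)   # both means stay equal (~ the overall data mean)

# Slightly perturbed initialization: EM separates the components.
mu = np.array([-0.1, 0.1])
for _ in range(30):
    mu = em_step_means(x, mu)
print("perturbed init:", mu)   # means approach roughly -3 and +3

How quickly EM or GD escapes such merged configurations, and whether they escape at all, is exactly what the project described next asks about.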
Potential project • Study how EM and GD (or any other algorithm) behave when learning mixture models • Can they avoid some bad local minima, such as the k-cluster regions? • Preliminary results/conjectures: 1) EM avoids them but GD does not (on Bernoulli mixture models); 2) EM escapes exponentially faster than GD (on Gaussian mixture models) • Ultimate goal: understand the convergence properties and limits of each algorithm, and propose better algorithms • Requires a strong mathematical background: linear algebra, advanced calculus, probability theory and statistics, continuous optimization, and (maybe) dynamical systems
References
• Christopher Bishop, "Pattern Recognition and Machine Learning" (2006).
• Guojun Zhang, Pascal Poupart and George Trimponias, "Comparing EM with GD in Mixtures of Two Components," to appear in UAI 2019.
• Arthur P. Dempster, Nan M. Laird and Donald B. Rubin, "Maximum likelihood from incomplete data via the EM algorithm," Journal of the Royal Statistical Society: Series B (1977).