The EM Algorithm


Preview
• The EM algorithm
• Mixture models
• Why EM works
• EM variants

Learning with Missing Data
• Goal: Learn parameters of Bayes net with known structure
• For now: Maximum likelihood
• Suppose the values of some variables in some samples are missing
• If we knew all values, computing parameters would be easy
• If we knew the parameters, we could infer the missing values
• "Chicken and egg" problem

The EM Algorithm
Initialize parameters ignoring missing information
Repeat until convergence:
  E step: Compute expected values of unobserved variables, assuming current parameter values
  M step: Compute new parameter values to maximize probability of data (observed & estimated)
(Also: Initialize expected values ignoring missing info)

Example
Bayes net over A, B, C (chain A → B → C, given the parameters below), with samples (A, B, C):
  0 1 1
  1 0 0
  1 1 1
  1 ? 0
Initialization: estimate P(A), P(B|A), P(B|¬A), P(C|B), P(C|¬B), ignoring the missing value
E-step: P(? = 1) = P(B | A, ¬C) = P(A, B, ¬C) / P(A, ¬C) = ... = 0
M-step: re-estimate P(A), P(B|A), P(B|¬A), P(C|B), P(C|¬B) from the observed and estimated values
E-step: P(? = 1) = 0 (converged)
(A code sketch of this example follows below, after the Hidden Variables slide.)

Hidden Variables
• What if some variables were always missing?
• In general, difficult problem
• Consider Naive Bayes structure, with class missing:
  P(x) = Σ_{i=1}^{n_c} P(c_i) ∏_{j=1}^{n_d} P(x_j | c_i)
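The following is a minimal Python sketch of the Example slide, under the assumption (read off from the parameter list) that the network is the chain A → B → C. All names and helper functions here are my own illustration, not from the slides: parameters are initialized from the complete samples, then the E and M steps alternate over the single missing value of B.

```python
# EM on the slide's example: chain A -> B -> C, four samples, B missing in the last.
data = [
    (0, 1, 1),
    (1, 0, 0),
    (1, 1, 1),
    (1, None, 0),   # None marks the missing value of B
]

def mle(rows):
    """ML parameter estimates from weighted complete rows (weight, a, b, c)."""
    def ratio(num, den):
        return num / den if den > 0 else 0.0
    p_a = ratio(sum(w for w, a, b, c in rows if a == 1),
                sum(w for w, a, b, c in rows))
    p_b_given_a = {av: ratio(sum(w for w, a, b, c in rows if a == av and b == 1),
                             sum(w for w, a, b, c in rows if a == av))
                   for av in (0, 1)}
    p_c_given_b = {bv: ratio(sum(w for w, a, b, c in rows if b == bv and c == 1),
                             sum(w for w, a, b, c in rows if b == bv))
                   for bv in (0, 1)}
    return p_a, p_b_given_a, p_c_given_b

# Initialization: estimate parameters ignoring the sample with the missing value
params = mle([(1.0, a, b, c) for a, b, c in data if b is not None])

for step in range(5):
    p_a, p_b_given_a, p_c_given_b = params
    rows = []
    for a, b, c in data:
        if b is not None:
            rows.append((1.0, a, b, c))
            continue
        # E step: q = P(B = 1 | a, c) under the current parameters
        w1 = p_b_given_a[a] * (p_c_given_b[1] if c == 1 else 1 - p_c_given_b[1])
        w0 = (1 - p_b_given_a[a]) * (p_c_given_b[0] if c == 1 else 1 - p_c_given_b[0])
        q = w1 / (w1 + w0) if (w1 + w0) > 0 else 0.5
        print(f"iteration {step}: P(? = 1) = {q:.3f}")
        rows.append((q, a, 1, c))        # expected count for B = 1
        rows.append((1.0 - q, a, 0, c))  # expected count for B = 0
    # M step: re-estimate parameters from observed + expected counts
    params = mle(rows)

print("final parameters:", params)
```

With this initialization the first E step already gives P(? = 1) = 0 and the estimates stop changing, matching the slide's "converged" line.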

Naive Bayes Model
[Figure: two naive Bayes networks – (a) a generic model with class C and features; (b) the candy example with hidden Bag and observed Flavor, Wrapper, Holes, with parameters such as P(Bag=1) and P(F=cherry | Bag).]

Clustering
• Goal: Group similar objects
• Example: Group Web pages with similar topics
• Clustering can be hard or soft
• What's the objective function?

Mixture Models
P(x) = Σ_{i=1}^{n_c} P(c_i) P(x | c_i)
Objective function: Log likelihood of data
Naive Bayes: P(x | c_i) = ∏_{j=1}^{n_d} P(x_j | c_i)
AutoClass: Naive Bayes with various x_j models
Mixture of Gaussians: P(x | c_i) = Multivariate Gaussian
In general: P(x | c_i) can be any distribution

Mixtures of Gaussians
[Figure: a one-dimensional mixture density p(x) plotted against x.]
P(x | µ_i) = 1/√(2πσ²) · exp( −(1/2) ((x − µ_i)/σ)² )

Mixtures of Gaussians (cont.)
• K-means clustering ≺ EM for mixtures of Gaussians
• Mixtures of Gaussians ≺ Bayes nets
• Also good for estimating joint distribution of continuous variables

EM for Mixtures of Gaussians
Simplest case: Assume known priors and covariances
Initialization: Choose means at random
E step: For all samples x_k:
  P(µ_i | x_k) = P(µ_i) P(x_k | µ_i) / P(x_k) = P(µ_i) P(x_k | µ_i) / Σ_{i'} P(µ_{i'}) P(x_k | µ_{i'})
M step: For all means µ_i:
  µ_i = Σ_{x_k} x_k P(µ_i | x_k) / Σ_{x_k} P(µ_i | x_k)
(A code sketch of these updates follows below.)
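Below is a minimal NumPy sketch of these E and M updates under the slide's "simplest case": two components, known equal priors P(µ_i), a shared known σ, and only the means re-estimated. The synthetic data, the two-component choice, and all names are my own illustration, not from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic 1-D data from two Gaussians (means -2 and 3), only to exercise the updates.
x = np.concatenate([rng.normal(-2.0, 1.0, 200), rng.normal(3.0, 1.0, 200)])

sigma = 1.0                                   # known, shared standard deviation
priors = np.array([0.5, 0.5])                 # known component priors P(mu_i)
means = rng.choice(x, size=2, replace=False)  # initialization: choose means at random

for _ in range(100):
    # E step: responsibilities P(mu_i | x_k) via Bayes' rule; the Gaussian
    # normalizing constant cancels because sigma is shared by both components.
    lik = np.exp(-0.5 * ((x[:, None] - means[None, :]) / sigma) ** 2)
    resp = priors * lik
    resp /= resp.sum(axis=1, keepdims=True)

    # M step: mu_i = sum_k x_k P(mu_i | x_k) / sum_k P(mu_i | x_k)
    new_means = (resp * x[:, None]).sum(axis=0) / resp.sum(axis=0)
    if np.allclose(new_means, means):
        break
    means = new_means

print("estimated means:", np.sort(means))  # should end up near -2 and 3
```

Because the priors and σ are held fixed, only the exponential term of the Gaussian matters in the E step; re-estimating priors and covariances in the M step as well would turn this into the usual full GMM version of EM.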

Why EM Works
[Figure: log likelihood LL(θ_new) as a function of θ, compared with the lower bound LL_old + Q(θ_new); the M step moves from θ_old to θ_new.]
θ_new = argmax_θ E_{θ_old}[ log P(X) ]
(A short derivation sketch follows after the Summary slide.)

EM Variants
• MAP: Compute MAP estimates instead of ML in M step
• GEM: Just increase likelihood in M step
• MCMC: Approximate E step
• Simulated annealing: Avoid local maxima
• Early stopping: Faster, may reduce overfitting
• Structural EM: Missing data and unknown structure

Summary
• The EM algorithm
• Mixture models
• Why EM works
• EM variants
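To spell out the inequality behind the "Why EM Works" figure, here is a sketch of the standard monotonicity argument in my own notation (observed data D, hidden variables Z, so X = (D, Z) in the slide's formula); it is the usual textbook derivation, not text taken from the slides.

```latex
\begin{align*}
\log P(D \mid \theta)
  &= \underbrace{\mathbb{E}_{Z \sim P(\cdot \mid D, \theta_{\mathrm{old}})}
       \bigl[\log P(D, Z \mid \theta)\bigr]}_{Q(\theta;\, \theta_{\mathrm{old}})}
     \;-\; \mathbb{E}_{Z \sim P(\cdot \mid D, \theta_{\mathrm{old}})}
       \bigl[\log P(Z \mid D, \theta)\bigr] \\[4pt]
\log P(D \mid \theta_{\mathrm{new}}) - \log P(D \mid \theta_{\mathrm{old}})
  &= \underbrace{Q(\theta_{\mathrm{new}}; \theta_{\mathrm{old}})
       - Q(\theta_{\mathrm{old}}; \theta_{\mathrm{old}})}_{\ge 0,\ \text{M step maximizes } Q}
     \;+\; \underbrace{\mathrm{KL}\!\bigl(P(Z \mid D, \theta_{\mathrm{old}})
       \,\Vert\, P(Z \mid D, \theta_{\mathrm{new}})\bigr)}_{\ge 0}
\end{align*}
```

Both terms on the right are nonnegative, so each EM iteration can only increase (or leave unchanged) the observed-data log likelihood; this is exactly the LL_old + Q(θ_new) lower bound depicted in the figure.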
