  1. Clustering: Models and Algorithms. Shikui Tu, 2019-02-28

  2. Outline • Clustering – K-means clustering, hierarchical clustering • Adaptive learning (online learning) – CL, FSCL, RPCL • Gaussian Mixture Models (GMM) • Expectation-Maximization (EM) for maximum likelihood

  3. What is clustering? [Figure: six malignant tumors (melanoma). Science, Vol. 352, Issue 6282, 8 April 2016]

  4. How to represent a cluster? • Represent each cluster by a single center (e.g., its mean $\mu$), as developed in the following slides

  5. How to define error? Squared distance from a center $\mu$ to a point $x_t$: $\|\mu - x_t\|^2$. Total error over the points: $\|\mu - x_1\|^2 + \|\mu - x_2\|^2 + \|\mu - x_3\|^2$. Which $\mu$ minimizes this error?

  6. Matrix derivatives (The Matrix Cookbook): http://www2.imm.dtu.dk/pubdb/views/edoc_download.php/3274/pdf/imm3274.pdf

  7. Clustering the data • We have the following data • We want to cluster the data into two clusters (red and blue) • How?

  8. Minimize the sum of squared distances $J = \sum_{n=1}^{N} \sum_{k=1}^{K} r_{nk} \|x_n - \mu_k\|^2$, where $r_{nk} = 1$ if and only if data point $x_n$ is assigned to cluster $k$, otherwise $r_{nk} = 0$; here $k = 1, 2$ ($K = 2$ clusters) and $n = 1, \ldots, N$, with $N$ the total number of points. We need to calculate the $\{r_{nk}\}$ and $\{\mu_k\}$ that minimize $J$.
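A minimal code sketch of this objective, assuming NumPy and illustrative names not taken from the slides: X is the (N, d) data matrix, R the (N, K) one-hot assignment matrix, and mu the (K, d) matrix of centers.

```python
import numpy as np

def kmeans_objective(X, R, mu):
    """J = sum_n sum_k r_nk * ||x_n - mu_k||^2 (sum of squared distances)."""
    diffs = X[:, None, :] - mu[None, :, :]   # (N, K, d) differences to every center
    sq_dists = np.sum(diffs ** 2, axis=2)    # (N, K) squared distances
    return np.sum(R * sq_dists)              # only assigned terms contribute
```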

  9. If we know $r_{n1}, r_{n2}$ for all $n = 1, \ldots, N$: since each point has been assigned to cluster 1 or cluster 2, we calculate $\mu_1$ = mean of the points in cluster 1 and $\mu_2$ = mean of the points in cluster 2. Formally, $\mu_k = \frac{\sum_n r_{nk} x_n}{\sum_n r_{nk}}$. We call it the M step.
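A sketch of this M step under the same assumed array conventions (NumPy; X is (N, d) data, R is (N, K) one-hot assignments):

```python
import numpy as np

def m_step(X, R):
    """mu_k = sum_n r_nk * x_n / sum_n r_nk: each center becomes the mean of its points."""
    counts = R.sum(axis=0)                        # number of points per cluster, shape (K,)
    sums = R.T @ X                                # per-cluster sums, shape (K, d)
    return sums / np.maximum(counts, 1)[:, None]  # guard against an empty cluster
```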

  10. If we know $\mu_1, \mu_2$: we should assign point $x_n$ to cluster 1 when $\|x_n - \mu_1\|^2 < \|x_n - \mu_2\|^2$, i.e., $r_{n1} = 1$ and $r_{n2} = 0$. Formally, $r_{nk} = 1$ if $k = \arg\min_j \|x_n - \mu_j\|^2$, otherwise $r_{nk} = 0$. We call it the E step.
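The corresponding E step, again as a NumPy sketch with the same illustrative names:

```python
import numpy as np

def e_step(X, mu):
    """r_nk = 1 iff k = argmin_j ||x_n - mu_j||^2: assign each point to its nearest center."""
    sq_dists = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2)  # (N, K)
    R = np.zeros_like(sq_dists)
    R[np.arange(X.shape[0]), sq_dists.argmin(axis=1)] = 1           # one-hot winners
    return R
```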

  11. Initialization: choose initial centers $\mu_1$ and $\mu_2$.

  12. E step: given $\mu_1, \mu_2$, calculate $r_{n1}, r_{n2}$ for all $n = 1, \ldots, N$. Assign each point to the nearest cluster; the equal-distance line between the two centers separates the two assignments.

  13. M step: given $r_{n1}, r_{n2}$, calculate $\mu_1, \mu_2$ as the means of the points in each cluster.

  14. E step (repeated): given the updated $\mu_1, \mu_2$, recalculate $r_{n1}, r_{n2}$ for all $n = 1, \ldots, N$ by assigning each point to the nearest cluster.

  15. M step (repeated): given the updated $r_{n1}, r_{n2}$, recalculate $\mu_1, \mu_2$ as the means of the points in each cluster.

  16. Convergence: Initialization, then E step, M step, E step, M step, and so on. The algorithm converges when $J$ does not change, or when $\{\mu_1, \mu_2\}$ do not change.
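Putting the pieces together, a self-contained K-means loop that alternates the two steps and stops when J stops changing. Initializing by sampling K data points, and the names kmeans/X/mu/labels, are illustrative assumptions, not something the slides prescribe.

```python
import numpy as np

def kmeans(X, K, n_iter=100, seed=0):
    """Alternate E and M steps; stop when the objective J no longer changes."""
    rng = np.random.default_rng(seed)
    mu = X[rng.choice(len(X), size=K, replace=False)].astype(float)  # initialization
    J_prev = np.inf
    for _ in range(n_iter):
        sq = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2)     # (N, K) squared distances
        labels = sq.argmin(axis=1)                                    # E step: nearest center
        J = sq[np.arange(len(X)), labels].sum()                       # current objective
        if J == J_prev:                                               # convergence check
            break
        J_prev = J
        for k in range(K):                                            # M step: recompute means
            if np.any(labels == k):
                mu[k] = X[labels == k].mean(axis=0)
    return labels, mu, J
```

For example, `labels, mu, J = kmeans(X, K=2)` reproduces the two-cluster walk-through above on a NumPy array X; only a local optimum of J is guaranteed, which is the point raised on slide 19.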

  17. The K-means algorithm • Initialize the centers $\mu_1, \ldots, \mu_k$ • Repeat: assign each data point to its nearest center $\mu_i$; update each $\mu_i$ to the mean of the points assigned to it • Stop when the assignments (or the centers) no longer change

  18. Basic ingredients • Model or structure • Objective function • Algorithm • Convergence

  19. Questions for the K-means algorithm • Does it find the global optimum of J? No, only the nearest local optimum, depending on initialization • If Euclidean distance is not suitable for some data, do we have other choices? • Can we assign each data point to the clusters probabilistically? • If K (the total number of clusters) is unknown, can we estimate it from the data?

  20. Outline • Clustering – K-means clustering, hierarchical clustering • Adaptive learning (online learning) – CL, FSCL, RPCL • Gaussian Mixture Models (GMM) • Expectation-Maximization (EM) for maximum likelihood

  21. Hierarchical Clustering • k-means clustering requires: k, the positions of the initial centers, and a distance measure between points (e.g., Euclidean distance) • Hierarchical clustering requires a measure of distance between groups of data points. Adapted from Blei, D., Hierarchical Clustering [PowerPoint slides]. www.cs.princeton.edu/courses/archive/spr08/cos424/slides/clustering-2.pdf

  22. Hierarchical Clustering • Agglomerative clustering is a very simple procedure: assign each data point to its own group; repeatedly look for the two closest groups and merge them into one group; stop when all the data points are merged into a single cluster (a code sketch of this procedure follows the distance measures on the next slide). Adapted from Blei, D., Hierarchical Clustering [PowerPoint slides]. www.cs.princeton.edu/courses/archive/spr08/cos424/slides/clustering-2.pdf

  23. Distance Measure • Distance between data points a and b: $d(a, b)$ • Distance between groups A and B: – Single-linkage: $d(A, B) = \min_{a \in A, b \in B} d(a, b)$ – Complete-linkage: $d(A, B) = \max_{a \in A, b \in B} d(a, b)$ – Average-linkage: $d(A, B) = \frac{1}{|A| \cdot |B|} \sum_{a \in A, b \in B} d(a, b)$
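A short sketch of agglomerative clustering with these three linkages, assuming X is a NumPy array of points, Euclidean point distance, and a plain greedy merge loop; the function names are illustrative.

```python
import numpy as np

def linkage_dist(A, B, mode="single"):
    """Group distance d(A, B) for single, complete, or average linkage."""
    d = np.array([[np.linalg.norm(a - b) for b in B] for a in A])  # pairwise d(a, b)
    if mode == "single":
        return d.min()          # closest pair
    if mode == "complete":
        return d.max()          # farthest pair
    return d.mean()             # average over all |A|*|B| pairs

def agglomerative(X, mode="single"):
    """Merge the two closest groups until one cluster remains; return the merge history."""
    groups = [[x] for x in X]   # each data point starts in its own group
    merges = []
    while len(groups) > 1:
        pairs = [(i, j) for i in range(len(groups)) for j in range(i + 1, len(groups))]
        i, j = min(pairs, key=lambda p: linkage_dist(groups[p[0]], groups[p[1]], mode))
        merges.append((i, j))   # record which groups were merged (for a dendrogram)
        groups[i] += groups[j]
        del groups[j]
    return merges
```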

  24. Dendrogram (tree of merges; the vertical axis shows the merge distance). Jain, A. K., Murty, M. N., Flynn, P. J. (1999). "Data Clustering: A Review". ACM Computing Surveys (CSUR), 31(3), pp. 264-323.

  25. Outline • Clustering – K-means clustering, hierarchical clustering • Adaptive learning (online learning) – CL, FSCL, RPCL • Gaussian Mixture Models (GMM) • Expectation-Maximization (EM) for maximum likelihood

  26. From batch to adaptive • Given a batch of data points • Data points come one by one: $x_1, x_2, \ldots, x_N$

  27. Competitive learning • Data points come one by one: $x_1, x_2, \ldots, x_N$
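The slide itself only shows the data stream; below is a sketch of the standard winner-take-all rule usually meant by competitive learning. The learning rate eta and the exact update form are assumptions, not taken from the slides.

```python
import numpy as np

def cl_step(x, mu, eta=0.05):
    """Winner-take-all: only the nearest center moves toward the incoming point x."""
    c = np.argmin(((mu - x) ** 2).sum(axis=1))  # winner = nearest center
    mu[c] += eta * (x - mu[c])                   # move the winner toward x
    return c
```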

  28. When starting with “bad initializations”

  29. A four-cluster case

  30. Frequency-sensitive competitive learning (FSCL) [Ahalt et al., 1990] • The idea is to penalize the frequent winners when selecting the winner for each incoming point.
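A hedged sketch of one common FSCL form, in which the winner is selected by a frequency-weighted distance so that frequent winners are handicapped; the exact weighting used in the slide's (missing) formula is an assumption here.

```python
import numpy as np

def fscl_step(x, mu, wins, eta=0.05):
    """Frequency-sensitive winner selection: minimize (relative win count) * squared distance."""
    sq = ((mu - x) ** 2).sum(axis=1)
    c = np.argmin((wins / wins.sum()) * sq)      # frequent winners are penalized
    mu[c] += eta * (x - mu[c])                   # only the winner moves toward x
    wins[c] += 1                                 # update its win count
    return c
```

The win counts would be initialized to ones, e.g. `wins = np.ones(K)`, so that every center can win at the start.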

  31. FSCL is not good when there are extra centers • When k is pre-assigned to 5, the frequency-sensitive mechanism also pulls the extra center into the data, disturbing the correct locations of the others.

  32. Rival penalized competitive learning (RPCL) (Xu, Krzyzak, & Oja, 1992, 1993) • RPCL differs from FSCL by implementing $p_{j,t}$ as follows: $p_{j,t} = 1$ if $j$ is the winner, $p_{j,t} = -\gamma$ if $j$ is the rival (the second winner), and $p_{j,t} = 0$ otherwise, where $\gamma$ takes a value roughly between 0.05 and 0.1 to control the penalizing strength.
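A sketch of the corresponding online update, following the $p_{j,t}$ values above: the winner learns toward x while the rival is de-learned with strength gamma. Sharing a single base learning rate eta is an assumed detail.

```python
import numpy as np

def rpcl_step(x, mu, wins, eta=0.05, gamma=0.05):
    """RPCL update: mu_j += eta * p_j * (x - mu_j), p = 1 (winner), -gamma (rival), 0 otherwise."""
    sq = ((mu - x) ** 2).sum(axis=1)
    order = np.argsort((wins / wins.sum()) * sq)  # frequency-sensitive ranking, as in FSCL
    c, r = order[0], order[1]                     # winner and rival (second winner)
    mu[c] += eta * (x - mu[c])                    # learning: winner moves toward x
    mu[r] -= gamma * eta * (x - mu[r])            # de-learning: rival pushed away
    wins[c] += 1
    return c, r
```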

  33. The rival-penalized mechanism drives the extra agents far away.

  34. Thank you!
