Unsupervised learning: Clustering and Dimensionality Reduction
Marta Arias (marias@cs.upc.edu), Dept. CS, UPC, Fall 2018
Clustering: partition input examples into similar subsets
Clustering: main challenges
◮ How to measure similarity?
◮ How many clusters?
◮ How do we evaluate the clusters?
Algorithms we will cover
◮ K-means
◮ Hierarchical clustering
K-means clustering: intuition
◮ The input data are:
  ◮ m examples x_1, .., x_m, and
  ◮ K, the number of desired clusters
◮ Clusters are represented by cluster centers µ_1, .., µ_K
◮ Given centers µ_1, .., µ_K, each center defines a cluster: the subset of inputs x_i that are closer to it than to any other center
K-means clustering: intuition
The aim is to find
◮ cluster centers µ_1, .., µ_K and
◮ a cluster assignment z = (z_1, .., z_m), where z_i ∈ {1, .., K}
  ◮ z_i is the cluster assigned to example x_i
such that µ_1, .., µ_K and z minimize the cost function

  J(µ_1, .., µ_K, z) = ∑_i ‖x_i − µ_{z_i}‖²
K-means clustering
Cost function:

  J(µ_1, .., µ_K, z) = ∑_i ‖x_i − µ_{z_i}‖²

Pseudocode
◮ Pick initial centers µ_1, .., µ_K at random
◮ Repeat until convergence:
  ◮ Optimize z in J(µ_1, .., µ_K, z) keeping µ_1, .., µ_K fixed:
    set each z_i to the closest center, z_i = argmin_k ‖x_i − µ_k‖²
  ◮ Optimize µ_1, .., µ_K in J(µ_1, .., µ_K, z) keeping z fixed:
    for each k = 1, .., K, set µ_k = (1 / |{i : z_i = k}|) ∑_{i : z_i = k} x_i
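This pseudocode maps directly to a few lines of NumPy. The sketch below is illustrative, not code from the slides; the function name kmeans, the random initialization from the data points, and the convergence test are my own choices:

```python
import numpy as np

def kmeans(X, K, n_iter=100, seed=0):
    """Alternate assignment and center-update steps until the centers stop moving."""
    rng = np.random.default_rng(seed)
    # Pick K initial centers at random among the input points
    mu = X[rng.choice(len(X), size=K, replace=False)]
    for _ in range(n_iter):
        # Assignment step: z_i = argmin_k ||x_i - mu_k||^2
        dists = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2)
        z = dists.argmin(axis=1)
        # Update step: mu_k = mean of the points assigned to cluster k
        new_mu = np.array([X[z == k].mean(axis=0) if np.any(z == k) else mu[k]
                           for k in range(K)])
        if np.allclose(new_mu, mu):  # converged: centers stopped moving
            break
        mu = new_mu
    # Final assignment with respect to the last centers
    z = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2).argmin(axis=1)
    return mu, z
```

Usage: mu, z = kmeans(X, K=3) returns the centers and the assignment vector z.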
K-Means illustrated
Limitations of K-means
K-means works well if...
◮ clusters are spherical
◮ clusters are well separated
◮ clusters have similar volumes
◮ clusters have similar numbers of points
... so improve on it with a more general model:
◮ a mixture of Gaussians, learned using Expectation Maximization (EM); see the sketch below
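As a sketch of that improvement, scikit-learn's GaussianMixture fits a mixture of Gaussians with EM. The two elongated toy blobs below are invented for illustration; they violate the spherical-cluster assumption that plain K-means relies on:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Two elongated, overlapping blobs that plain K-means handles poorly
X = np.vstack([rng.normal([0, 0], [3.0, 0.5], size=(200, 2)),
               rng.normal([1, 3], [0.5, 3.0], size=(200, 2))])

# Full covariance matrices let each component stretch along its own axes
gm = GaussianMixture(n_components=2, covariance_type='full').fit(X)
labels = gm.predict(X)        # hard cluster assignments
probs = gm.predict_proba(X)   # soft (probabilistic) assignments
```

Unlike K-means, the mixture model also yields soft assignments: each point gets a probability of belonging to each component.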
Hierarchical clustering: the output is a dendrogram
Agglomerative hierarchical clustering (bottom-up)
Pseudocode
1. Start with one cluster per example
2. Repeat until all examples are in one cluster:
   ◮ merge the two closest clusters
(The next example is from D. Blei's course at Princeton.)
Example: a 2-D dataset (axes V1, V2), followed by 24 iterations of agglomerative clustering that merge the two closest clusters at each step. [Scatter plots omitted; figures from D. Blei, Clustering 02.]
Agglomerative hierarchical clustering (bottom-up)
Defining the distance between clusters (i.e., sets of points); see the scipy sketch below:
◮ Single linkage: d(X, Y) = min_{x ∈ X, y ∈ Y} d(x, y)
◮ Complete linkage: d(X, Y) = max_{x ∈ X, y ∈ Y} d(x, y)
◮ Group average: d(X, Y) = (1 / (|X| · |Y|)) ∑_{x ∈ X, y ∈ Y} d(x, y)
◮ Centroid distance: d(X, Y) = d((1/|X|) ∑_{x ∈ X} x, (1/|Y|) ∑_{y ∈ Y} y)
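These four criteria correspond to the method argument of scipy.cluster.hierarchy.linkage ('single', 'complete', 'average', 'centroid'). A minimal sketch, with a made-up toy dataset:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, dendrogram, fcluster

rng = np.random.default_rng(0)
X = rng.standard_normal((25, 2))  # small toy dataset

# Build the full merge tree; 'average' is the group-average criterion above
Z = linkage(X, method='average')

# Cut the dendrogram to obtain, e.g., 3 flat clusters
labels = fcluster(Z, t=3, criterion='maxclust')

# dendrogram(Z) draws the tree (requires matplotlib)
```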
Many, many, many other algorithms are available...
Clustering with scikit-learn: K-means, an example with the Iris dataset
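The slide's code does not survive the extraction; what follows is a plausible reconstruction using the standard scikit-learn API (the choice K = 3 matches the three Iris species):

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris

iris = load_iris()
X = iris.data  # 150 samples, 4 features

# Iris has 3 species, so ask for K = 3 clusters
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)

print(km.cluster_centers_)  # the learned centers mu_1, .., mu_3
print(km.labels_[:10])      # cluster assignment z_i per example
print(km.inertia_)          # the cost J at convergence
```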
Clustering with scikit-learn: hierarchical clustering, an example with the Iris dataset
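Likewise, this slide's code is lost; a minimal sketch with AgglomerativeClustering (the complete-linkage choice is an illustrative assumption):

```python
from sklearn.cluster import AgglomerativeClustering
from sklearn.datasets import load_iris

X = load_iris().data

# Bottom-up merging with complete linkage, cut at 3 clusters
agg = AgglomerativeClustering(n_clusters=3, linkage='complete').fit(X)
print(agg.labels_[:10])
```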
Dimensionality reduction I
The curse of dimensionality
◮ When dimensionality increases, data becomes increasingly sparse in the space that it occupies
◮ Definitions of density and distance between points (critical for many tasks!) become less meaningful
◮ Visualization and qualitative analysis become impossible
(See the numerical illustration below.)
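A small numerical illustration of the sparsity claim (the experimental setup is my own, not from the slides): as the dimension d grows, the nearest and farthest neighbors of a point become almost equally far away, so distance-based notions lose their discriminative power.

```python
import numpy as np

rng = np.random.default_rng(0)
for d in [2, 10, 100, 1000]:
    X = rng.random((500, d))               # 500 uniform points in [0,1]^d
    # Euclidean distances from the first point to all the others
    dists = np.linalg.norm(X[1:] - X[0], axis=1)
    # Relative contrast (max - min) / min shrinks as d grows
    contrast = (dists.max() - dists.min()) / dists.min()
    print(f"d={d:5d}  relative contrast={contrast:.3f}")
```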