Clustering — Lecture 8
David Sontag, New York University


  1. Clustering — Lecture 8. David Sontag, New York University. Slides adapted from Luke Zettlemoyer, Vibhav Gogate, Carlos Guestrin, Andrew Moore, Dan Klein

  2. Clustering. Clustering: – Unsupervised learning – Requires data, but no labels – Detects patterns, e.g. in: • Groups of emails or search results • Customer shopping patterns • Regions of images – Useful when you don’t know what you’re looking for – But: can produce gibberish

  3. Clustering • Basic idea: group together similar instances • Example: 2D point patterns


  5. Clustering • Basic idea: group together similar instances • Example: 2D point patterns • What could “similar” mean? – One option: small squared Euclidean distance, dist(x, y) = ||x − y||₂² – Clustering results are crucially dependent on the measure of similarity (or distance) between “points” to be clustered
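The squared Euclidean distance above is simple to compute directly; a minimal sketch (the tuple-based point format and function name are illustrative assumptions, not from the slides):

```python
def sq_euclidean(x, y):
    # dist(x, y) = ||x - y||_2^2, summed coordinate by coordinate
    return sum((xi - yi) ** 2 for xi, yi in zip(x, y))

d = sq_euclidean((0.0, 0.0), (3.0, 4.0))  # 9 + 16 = 25.0
```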

  6. Clustering algorithms • Partition algorithms (flat): – K-means – Mixture of Gaussians – Spectral Clustering • Hierarchical algorithms: – Bottom up – agglomerative – Top down – divisive

  7. Clustering examples: Image segmentation. Goal: break up the image into meaningful or perceptually similar regions [Slide from James Hayes]

  8. Clustering examples: Clustering gene expression data [Eisen et al., PNAS 1998]

  9. Clustering examples: Cluster news articles

  10. Clustering examples: Cluster people by space and time [Image from Pilho Kim]

  11. Clustering examples: Clustering languages [Image from scienceinschool.org]

  12. Clustering examples: Clustering languages [Image from dhushara.com]

  13. Clustering examples: Clustering species (“phylogeny”) [Lindblad-Toh et al., Nature 2005]

  14. Clustering examples: Clustering search queries

  15. K-Means • An iterative clustering algorithm – Initialize: Pick K random points as cluster centers – Alternate: 1. Assign data points to closest cluster center 2. Change the cluster center to the average of its assigned points – Stop when no points’ assignments change
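The initialize/alternate/stop loop above can be sketched directly. A stdlib-only toy implementation (the tuple point format, seeding, and empty-cluster handling are assumptions for illustration, not from the lecture):

```python
import random

def sq_dist(p, q):
    # squared Euclidean distance between two equal-length tuples
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, seed=0):
    """Toy k-means: random initial centers, then alternate the two
    steps until no point changes its assignment."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)          # Initialize: K random data points
    assign = None
    while True:
        # Step 1: assign each point to its closest cluster center
        new_assign = [min(range(k), key=lambda c: sq_dist(p, centers[c]))
                      for p in points]
        if new_assign == assign:             # Stop: assignments unchanged
            return centers, assign
        assign = new_assign
        # Step 2: move each center to the average of its assigned points
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:                      # keep old center if cluster empties
                centers[c] = tuple(sum(xs) / len(members) for xs in zip(*members))

centers, assign = kmeans([(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)], k=2)
```

On this toy data the two tight groups end up with distinct labels regardless of which initial centers are drawn.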


  17. K-means clustering: Example • Pick K random points as cluster centers (means). Shown here for K = 2

  18. K-means clustering: Example. Iterative Step 1 • Assign data points to closest cluster center

  19. K-means clustering: Example. Iterative Step 2 • Change the cluster center to the average of the assigned points

  20. K-means clustering: Example • Repeat until convergence

  21. Properties of K-means algorithm • Guaranteed to converge in a finite number of iterations • Running time per iteration: 1. Assign data points to closest cluster center: O(KN) time 2. Change the cluster center to the average of its assigned points: O(N) time

  22. K-means Convergence. Objective: minimize, over means μ and assignments a, Σᵢ Σ_{j ∈ cluster i} ||x_j − μᵢ||² 1. Fix μ, optimize a (Step 1 of k-means) 2. Fix a, optimize μ – take the partial derivative with respect to μᵢ and set it to zero; we have that μᵢ is the mean of the points assigned to cluster i (Step 2 of k-means). K-means takes an alternating optimization approach; each step is guaranteed to decrease the objective – thus guaranteed to converge [Slide from Alan Fern]
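The alternating-minimization argument can be checked numerically: recording the objective after every assignment step should give a non-increasing sequence. A small sketch (the dataset, seed, and iteration budget are made-up illustrative choices):

```python
import random

def sq_dist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans_objective_trace(points, k, iters=10, seed=1):
    """Record sum_i sum_{j in cluster i} ||x_j - mu_i||^2 after each
    assignment step; alternating minimization says it never increases."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    trace = []
    for _ in range(iters):
        # assignment step (Step 1): best cluster per point
        assign = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points]
        trace.append(sum(sq_dist(p, centers[a]) for p, a in zip(points, assign)))
        # mean-update step (Step 2): closed-form minimizer per cluster
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centers[c] = tuple(sum(xs) / len(members) for xs in zip(*members))
    return trace

trace = kmeans_objective_trace([(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)], k=2)
```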

  23. Example: K-Means for Segmentation. Goal of segmentation is to partition an image into regions, each of which has reasonably homogeneous visual appearance. Shown: original image, K = 2, K = 3, K = 10
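In the figure, segmentation amounts to running k-means on pixel values and recoloring each pixel with its cluster mean. A toy 1-D stand-in (the pixel intensities and the fixed initial centers are made up for illustration):

```python
def kmeans_1d(values, centers, iters=20):
    """1-D k-means on raw intensities; returns each pixel replaced by
    the mean of its cluster, as in the segmentation figure."""
    for _ in range(iters):
        assign = [min(range(len(centers)), key=lambda c: (v - centers[c]) ** 2)
                  for v in values]
        for c in range(len(centers)):
            members = [v for v, a in zip(values, assign) if a == c]
            if members:
                centers[c] = sum(members) / len(members)
    return [centers[a] for a in assign]

# made-up "image": dark pixels around 10-15, bright pixels around 200-210
segmented = kmeans_1d([12, 15, 10, 200, 210, 14, 205], centers=[0.0, 255.0])
```

Every dark pixel collapses to the dark-cluster mean (12.75) and every bright one to the bright-cluster mean (205.0), which is exactly the "reasonably homogeneous regions" effect.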

  24. Example: K-Means for Segmentation. Shown: original image, K = 2, K = 3, K = 10


  26. Example: Vector quantization. FIGURE 14.9. Sir Ronald A. Fisher (1890–1962) was one of the founders of modern day statistics, to whom we owe maximum-likelihood, sufficiency, and many other fundamental concepts. The image on the left is a 1024 × 1024 grayscale image at 8 bits per pixel. The center image is the result of 2 × 2 block VQ, using 200 code vectors, with a compression rate of 1.9 bits/pixel. The right image uses only four code vectors, with a compression rate of 0.50 bits/pixel [Figure from Hastie et al. book]
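The quoted compression rates follow from simple arithmetic: each 2 × 2 block (4 pixels) is encoded as the index of its code vector, so the rate is log2(number of code vectors) / 4 bits per pixel. A quick sanity check (the helper name is an illustrative choice):

```python
import math

def vq_rate_bits_per_pixel(n_code_vectors, block_pixels=4):
    # each block of `block_pixels` pixels stores one codebook index,
    # so the rate is log2(codebook size) / block size bits per pixel
    return math.log2(n_code_vectors) / block_pixels

rate_200 = vq_rate_bits_per_pixel(200)  # 2x2 blocks, 200 code vectors -> ~1.91
rate_4 = vq_rate_bits_per_pixel(4)      # 2x2 blocks, 4 code vectors   -> 0.5
```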

  27. Initialization • K-means algorithm is a heuristic – Requires initial means – It does matter what you pick! – What can go wrong? – Various schemes for preventing this kind of thing: variance-based split / merge, initialization heuristics
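A common guard against bad initialization (in the spirit of the slide's "initialization heuristics", though the slide does not spell it out) is random restarts: run k-means from several seeds and keep the run with the lowest objective. A stdlib-only sketch; the restart count, dataset, and function names are illustrative assumptions:

```python
import random

def sq_dist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans_once(points, k, rng, iters=20):
    # one k-means run from a random initialization
    centers = rng.sample(points, k)
    for _ in range(iters):
        assign = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points]
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centers[c] = tuple(sum(xs) / len(members) for xs in zip(*members))
    obj = sum(sq_dist(p, centers[a]) for p, a in zip(points, assign))
    return obj, centers

def kmeans_restarts(points, k, restarts=10, seed=0):
    # keep the run with the lowest objective value
    rng = random.Random(seed)
    return min(kmeans_once(points, k, rng) for _ in range(restarts))

best_obj, best_centers = kmeans_restarts([(0, 0), (0, 1), (10, 10), (10, 11)], k=2)
```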

  28. K-Means Getting Stuck A local optimum: Would be better to have one cluster here … and two clusters here

  29. K-means not able to properly cluster (figure: data plotted in x–y coordinates)

  30. Changing the features (distance function) can help (figure: the same data in polar coordinates R, θ)
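For two concentric rings, mapping (x, y) to polar coordinates makes the radius alone separate the clusters, which is the kind of feature change the slide suggests. A small sketch (the sampled angles and radii are arbitrary illustrative values):

```python
import math

def to_polar(x, y):
    # (x, y) -> (radius, angle); radius alone separates concentric rings
    return (math.hypot(x, y), math.atan2(y, x))

# hypothetical sample points on circles of radius 1 and 5
inner = [(math.cos(t), math.sin(t)) for t in (0.0, 1.0, 2.0, 4.0)]
outer = [(5 * math.cos(t), 5 * math.sin(t)) for t in (0.5, 1.5, 3.0, 5.0)]
radii = [to_polar(x, y)[0] for (x, y) in inner + outer]
```

In (R, θ) space the two rings become two well-separated bands of R, which plain k-means handles easily.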

  31. Hierarchical Clustering

  32. Agglomerative Clustering • Agglomerative clustering: – First merge very similar instances – Incrementally build larger clusters out of smaller clusters • Algorithm: – Maintain a set of clusters – Initially, each instance in its own cluster – Repeat: • Pick the two closest clusters • Merge them into a new cluster • Stop when there’s only one cluster left • Produces not one clustering, but a family of clusterings represented by a dendrogram
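The maintain/pick/merge loop above can be sketched in a few lines. A stdlib-only toy version using single-link distance (the merge-log return format, standing in for a dendrogram, is an illustrative choice):

```python
def sq_dist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def agglomerate(points):
    """Toy agglomerative clustering: start from singletons and keep
    merging the two closest clusters (single-link) until one is left.
    Returns the merge sequence, i.e. a flattened dendrogram."""
    clusters = [[p] for p in points]      # initially, each instance alone
    merges = []
    while len(clusters) > 1:
        # pick the two closest clusters under single-link distance
        i, j = min(((a, b) for a in range(len(clusters))
                    for b in range(a + 1, len(clusters))),
                   key=lambda ab: min(sq_dist(p, q)
                                      for p in clusters[ab[0]]
                                      for q in clusters[ab[1]]))
        merges.append((clusters[i][:], clusters[j][:]))
        clusters[i] = clusters[i] + clusters.pop(j)   # merge into one cluster
    return merges

merges = agglomerate([(0, 0), (0, 1), (5, 5), (5, 6)])
```

The merge sequence is the family of clusterings the slide mentions: cutting it after any prefix of merges gives one clustering.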

  33. Agglomerative Clustering • How should we define “closest” for clusters with multiple elements?

  34. Agglomerative Clustering • How should we define “closest” for clusters with multiple elements? • Many options: – Closest pair (single-link clustering) – Farthest pair (complete-link clustering) – Average of all pairs • Different choices create different clustering behaviors
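The three linkage options can be written as three small functions over clusters A and B (using squared distances, as elsewhere in the lecture; the function names are illustrative):

```python
def sq_dist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def single_link(A, B):     # closest pair across the two clusters
    return min(sq_dist(p, q) for p in A for q in B)

def complete_link(A, B):   # farthest pair across the two clusters
    return max(sq_dist(p, q) for p in A for q in B)

def average_link(A, B):    # average over all cross-cluster pairs
    return sum(sq_dist(p, q) for p in A for q in B) / (len(A) * len(B))

A, B = [(0, 0), (0, 2)], [(3, 0)]  # single: 9, complete: 13, average: 11
```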

  35. Agglomerative Clustering • How should we define “closest” for clusters with multiple elements? Closest pair (single-link clustering) vs. farthest pair (complete-link clustering) (figure: two groupings of example points 1–8) [Pictures from Thorsten Joachims]

  36. Clustering Behavior. Average, farthest, and nearest linkage on mouse tumor data from [Hastie et al.]

  37. Agglomerative Clustering: When can this be expected to work? Strong separation property: all points are more similar to points in their own cluster than to any points in any other cluster. Then, the true clustering corresponds to some pruning of the tree obtained by single-link clustering! Slightly weaker (stability) conditions are solved by average-link clustering (Balcan et al., 2008)

  38. Spectral Clustering. Slides adapted from James Hays, Alan Fern, and Tommi Jaakkola

  39. Spectral clustering. Figure: the “two circles” dataset (2 clusters), clustered by K-means vs. spectral clustering [Shi & Malik ‘00; Ng, Jordan, Weiss NIPS ‘01]

  40. Spectral clustering. Figures: spectral clustering on synthetic datasets – nips (8 clusters), lineandballs (3 clusters), fourclouds (2 clusters), squiggles (4 clusters), threecircles-joined (3 clusters), twocircles (2 clusters), threecircles-joined (2 clusters) [Figures from Ng, Jordan, Weiss NIPS ‘01]

  41. Spectral clustering: group points based on links in a graph [Slide from James Hays]
