

  1. Hierarchical Clustering 36-350: Data Mining 25 September 2006

  2. Last time... • Unsupervised learning problems; finding clusters • k-means • divide into k clusters to minimize within-cluster variance * cluster size • local search, local minima
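A minimal sketch of that objective in Python (numpy only; the arrays X, holding the points, and labels, holding the cluster assignments, are illustrative placeholders, not from the slides):

    import numpy as np

    def within_cluster_ss(X, labels):
        """Sum over clusters of variance * cluster size, i.e. the
        total squared distance from each point to its cluster's centroid."""
        total = 0.0
        for k in np.unique(labels):
            points = X[labels == k]
            centroid = points.mean(axis=0)
            total += ((points - centroid) ** 2).sum()
        return total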

  3. Limits of k-Means • Local search can get stuck • Random starts help • Sum-of-squares likes ball-shaped clusters • How to pick k? • No relations between clusters
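One way to see the "random starts help" point, sketched with scikit-learn (the blob data and parameter values are made up for illustration; n_init is the number of random restarts, and the fit keeps whichever run ends with the smallest sum of squares):

    from sklearn.cluster import KMeans
    from sklearn.datasets import make_blobs

    X, _ = make_blobs(n_samples=300, centers=5, random_state=0)

    # A single random start can get stuck in a bad local minimum;
    # several restarts keep the run with the lowest sum of squares.
    one_start = KMeans(n_clusters=5, n_init=1, random_state=1).fit(X)
    many_starts = KMeans(n_clusters=5, n_init=20, random_state=1).fit(X)
    print(one_start.inertia_, many_starts.inertia_)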

  4. Hierarchical Clustering • Basic idea: cluster the clusters • High-level clusters contain multiple low-level clusters • Clusters are now related • Don’t need to choose k • Assumes a hierarchy makes sense...

  5. Ward’s Method 1. Start with every point in its own cluster 2. For each pair of clusters, calculate “merging cost” = increase in sum of squares 3. Merge least-costly pair 4. Stop when merging cost takes a big jump
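A hedged sketch of this procedure using SciPy (the data X is a placeholder; note SciPy reports merge heights, which are a monotone transform of the sum-of-squares merging cost, so a big jump in heights signals the same stopping point as step 4):

    import numpy as np
    from scipy.cluster.hierarchy import fcluster, linkage

    rng = np.random.default_rng(0)
    X = rng.normal(size=(50, 2))          # placeholder for real data

    Z = linkage(X, method="ward")         # steps 1-3: greedy least-cost merges
    heights = Z[:, 2]                     # height of each successive merge
    jumps = np.diff(heights)              # step 4: watch for a big jump
    n_clusters = len(X) - (np.argmax(jumps) + 1)   # cut the tree just before the jump
    labels = fcluster(Z, t=n_clusters, criterion="maxclust")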

  6. Ward’s method applied to the images from lecture 3: ocean, tigers, flowers. [Figure: dendrogram of the images, with a plot of merging cost against number of clusters.] The jump in merging cost suggests 3 clusters, almost exactly the right ones, too (but it thinks flower5 is a tiger).

  7. • Don’t have to choose k • Sum of squares is generally worse than k-means (for equal k) • more constrained search • prefers to merge small clusters, all else equal

  8. Minimizing the mean distance from the center tends to make spheres, which can be silly. [Figure: the same data clustered by k-Means and by Ward’s; note how Ward’s is less balanced.]

  9. Single-link clustering 1. Start with every point in its own cluster 2. Calculate the gap between every pair of clusters = distance between the two closest points, one from each cluster 3. Merge the clusters with the smallest gap
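The same SciPy sketch works here by swapping the linkage method, since SciPy's "single" method merges the pair of clusters with the smallest gap in exactly this sense (the data and the cut into 3 clusters are again placeholders):

    import numpy as np
    from scipy.cluster.hierarchy import fcluster, linkage

    def gap(cluster_a, cluster_b):
        """Gap between two clusters: distance between their closest points."""
        return min(np.linalg.norm(a - b) for a in cluster_a for b in cluster_b)

    rng = np.random.default_rng(0)
    X = rng.normal(size=(50, 2))          # placeholder data

    Z = linkage(X, method="single")       # repeatedly merge the smallest-gap pair
    labels = fcluster(Z, t=3, criterion="maxclust")   # e.g. cut into 3 clusters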

  10. [Figure: the same data clustered by k-Means, Ward’s, and single-link.]

  11. Examples where single-link doesn’t work so well. [Figure: k-Means, Ward’s, and single-link results side by side.]
