
Hierarchical clustering: David M. Blei, COS424, Princeton University



  1. Hierarchical clustering David M. Blei COS424 Princeton University February 28, 2008 D. Blei Clustering 02 1 / 21

  4. Hierarchical clustering
     • Hierarchical clustering is a widely used data analysis tool.
     • The idea is to build a binary tree of the data that successively merges similar groups of points.
     • Visualizing this tree provides a useful summary of the data.

  9. Hierarchical clustering vs. k-means
     • Recall that k-means or k-medoids requires:
       • A number of clusters k
       • An initial assignment of data to clusters
       • A distance measure between data points, d(x_n, x_m)
     • Hierarchical clustering only requires a measure of similarity between groups of data points.
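
To make "a measure of similarity between groups" concrete, here is a minimal sketch that builds a group-level dissimilarity from a pointwise distance d(x_n, x_m). The function names and the three options (`single`, `complete`, `average`) are assumptions chosen for illustration; the slide itself does not commit to any of them.

```python
import math
from itertools import product

def euclidean(x, y):
    """Pointwise distance d(x_n, x_m); Euclidean is an assumed choice."""
    return math.dist(x, y)

def group_distance(a, b, d=euclidean, how="single"):
    """Dissimilarity between two groups of points, built from pairwise d."""
    pairwise = [d(x, y) for x, y in product(a, b)]
    if how == "single":      # distance of the closest pair
        return min(pairwise)
    if how == "complete":    # distance of the farthest pair
        return max(pairwise)
    if how == "average":     # mean distance over all pairs
        return sum(pairwise) / len(pairwise)
    raise ValueError(f"unknown option: {how}")

# Two small hypothetical groups on a line.
left = [(0.0, 0.0), (1.0, 0.0)]
right = [(4.0, 0.0), (6.0, 0.0)]
print(group_distance(left, right, how="single"))
print(group_distance(left, right, how="complete"))
print(group_distance(left, right, how="average"))
```

A smaller group distance here plays the role of a larger similarity. Unlike k-means, nothing in this sketch needs a value of k or an initial assignment.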

  14. Agglomerative clustering
      • We will talk about agglomerative clustering.
      • Algorithm:
        1. Place each data point into its own singleton group.
        2. Repeat: iteratively merge the two closest groups.
        3. Until: all the data are merged into a single cluster.
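
The three steps above can be sketched directly in a few lines of Python. This is a naive illustration rather than the lecture's own code: the closest-pair group distance (`single_link`), the Euclidean point distance, and the four sample points are all assumptions.

```python
import math
from itertools import combinations

def single_link(a, b):
    """Distance between groups: the closest pair of points (an assumed choice)."""
    return min(math.dist(x, y) for x in a for y in b)

def agglomerate(points, group_dist=single_link):
    """Naive agglomerative clustering; returns the merges in the order made."""
    # Step 1: place each data point into its own singleton group.
    groups = [[p] for p in points]
    merges = []
    # Step 3: stop once all the data are merged into a single cluster.
    while len(groups) > 1:
        # Step 2: find the two closest groups and merge them.
        i, j = min(
            combinations(range(len(groups)), 2),
            key=lambda ij: group_dist(groups[ij[0]], groups[ij[1]]),
        )
        height = group_dist(groups[i], groups[j])
        merges.append((groups[i], groups[j], height))
        merged = groups[i] + groups[j]
        groups = [g for k, g in enumerate(groups) if k not in (i, j)] + [merged]
    return merges

# Hypothetical data: two tight pairs that are far from each other.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 3.0)]
merges = agglomerate(points)
for a, b, height in merges:
    print(a, "+", b, "at distance", height)
```

With n points the loop performs n - 1 merges. Recomputing every group distance at each step keeps the sketch short but makes it slow for large n.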

  15-39. Example
      • Figures: a scatter plot of the example data (horizontal axis V1, vertical axis V2), followed by one plot per iteration, 001 through 024. [Plots not recoverable from the transcript.]

  42. Agglomerative clustering
      • Each level of the resulting tree is a segmentation of the data.
      • The algorithm results in a sequence of groupings.
      • It is up to the user to choose a "natural" clustering from this sequence.
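
One way to picture the sequence of groupings is to replay a merge history and record the grouping after every merge. The sketch below assumes a hypothetical representation in which each merge is a pair of point indices `(a, b)`, read as "merge the group containing a with the group containing b"; the helper names `groupings` and `cut` are also made up for this illustration.

```python
def groupings(n_points, merges):
    """Replay a merge sequence and yield the grouping at every level of the tree."""
    groups = [frozenset([i]) for i in range(n_points)]
    yield list(groups)  # level 0: every point is its own group
    for a, b in merges:
        ga = next(g for g in groups if a in g)
        gb = next(g for g in groups if b in g)
        groups = [g for g in groups if g not in (ga, gb)] + [ga | gb]
        yield list(groups)

def cut(n_points, merges, k):
    """Return the level that has exactly k groups (choosing k is the user's call)."""
    for level in groupings(n_points, merges):
        if len(level) == k:
            return level
    raise ValueError("k must be between 1 and n_points")

# Hypothetical merge history for four points.
merges = [(0, 1), (2, 3), (0, 2)]
levels = list(groupings(4, merges))
two = cut(4, merges, k=2)
print([len(level) for level in levels])
print(two)
```

Every level is a full segmentation of the data, so picking a "natural" clustering amounts to picking where to stop in this sequence.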

  44. Dendrogram
      • Agglomerative clustering is monotonic.
      • The similarity between merged clusters is monotone decreasing with the level of the merge.
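
The monotonicity property can be checked on a recorded merge history. In this sketch the merge "heights" are dissimilarities, so decreasing similarity shows up as heights that never decrease; the mapping s = -d, the tolerance, and the sample heights are assumptions made for illustration.

```python
def is_monotone(heights, tol=1e-12):
    """True if merge dissimilarities never decrease from one merge to the next."""
    return all(later >= earlier - tol for earlier, later in zip(heights, heights[1:]))

def as_similarities(heights):
    """Turn dissimilarities into similarities with the assumed mapping s = -d."""
    return [-h for h in heights]

heights = [1.0, 3.0, 5.0]        # hypothetical merge heights, in merge order
sims = as_similarities(heights)
print(is_monotone(heights))      # heights go up ...
print(sims)                      # ... so similarities go down
```

This ordering is what lets a dendrogram be drawn with each merge placed at its own height, above the merges beneath it.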
