Comparing More than Two Observations



  1. DataCamp, Cluster Analysis in R: Comparing More than Two Observations. Dmitriy Gorenshteyn, Sr. Data Scientist, Memorial Sloan Kettering Cancer Center

  2. The Closest Observation to a Pair

     Is 2 closest to group 1,4?
     Is 3 closest to group 1,4?

            1     2     3
       2  11.7
       3  16.8  18.0
       4  10.0  20.6  15.8

  3. Linkage Criteria: Complete

            1     2     3
       2  11.7
       3  16.8  18.0
       4  10.0  20.6  15.8

     Is 2 closest to group 1,4? max(D(2,1), D(2,4)) = 20.6
     Is 3 closest to group 1,4? max(D(3,1), D(3,4)) = 16.8

     Since 16.8 < 20.6, observation 3 is the closest to group 1,4 under complete linkage.

  4. Hierarchical Clustering. Complete Linkage: maximum distance between two sets

  5. Grouping With Linkage & Distance


  15. Linkage Criteria
      Complete Linkage: maximum distance between two sets
      Single Linkage: minimum distance between two sets
      Average Linkage: average distance between two sets
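
The three criteria can be compared on the four-observation distance matrix from slides 2 and 3. What follows is a minimal sketch, not code from the deck: `linkage_distance()` is a hypothetical helper written for this illustration, and the matrix values are copied from the slide.

```r
# Pairwise distances copied from the slide's lower-triangular matrix.
d <- matrix(0, nrow = 4, ncol = 4)
d[lower.tri(d)] <- c(11.7, 16.8, 10.0, 18.0, 20.6, 15.8)  # filled column by column
d <- d + t(d)                                              # make it symmetric

# Hypothetical helper: distance from one observation to a group
# under a chosen linkage criterion.
linkage_distance <- function(d, obs, group,
                             criterion = c("complete", "single", "average")) {
  criterion <- match.arg(criterion)
  pairwise <- d[obs, group]  # distances from obs to each group member
  switch(criterion,
         complete = max(pairwise),
         single   = min(pairwise),
         average  = mean(pairwise))
}

group <- c(1, 4)
for (criterion in c("complete", "single", "average")) {
  to_2 <- linkage_distance(d, 2, group, criterion)
  to_3 <- linkage_distance(d, 3, group, criterion)
  cat(criterion, ": obs 2 =", to_2, ", obs 3 =", to_3, "\n")
}
```

On this matrix the choice of criterion changes the answer: complete linkage puts observation 3 closer to the group (16.8 vs 20.6), while single linkage (11.7 vs 15.8) and average linkage (16.15 vs 16.3) both favor observation 2.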

  16. Let's practice!

  17. Capturing K Clusters


  29. Hierarchical Clustering in R

      print(players)

            x     y
        <dbl> <dbl>
      1    -1     1
      2    -2    -3
      3     8     6
      4     7    -8
      5   -12     8
      6   -15     0

      dist_players <- dist(players, method = 'euclidean')
      hc_players <- hclust(dist_players, method = 'complete')
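
To make the `dist()` step concrete, the sketch below checks one entry of the distance object against the Euclidean formula computed by hand. The `players` data frame is re-created here from the coordinates printed on the slide; the variable names `manual_1_2` and `dist_1_2` are illustrative.

```r
# Coordinates copied from the slide's print(players) output.
players <- data.frame(x = c(-1, -2, 8, 7, -12, -15),
                      y = c(1, -3, 6, -8, 8, 0))

dist_players <- dist(players, method = "euclidean")

# Euclidean distance between players 1 and 2, computed by hand:
# sqrt((x1 - x2)^2 + (y1 - y2)^2)
manual_1_2 <- sqrt(sum((unlist(players[1, ]) - unlist(players[2, ]))^2))

# as.matrix() expands the dist object into a full square matrix for indexing.
dist_1_2 <- as.matrix(dist_players)[1, 2]

print(c(manual = manual_1_2, dist = dist_1_2))
```

Both values should agree at sqrt(17), roughly 4.12, which is the smallest pairwise distance among these six players.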

  30. Extracting K Clusters

      cluster_assignments <- cutree(hc_players, k = 2)
      print(cluster_assignments)

      [1] 1 1 1 1 2 2

      library(dplyr)
      players_clustered <- mutate(players, cluster = cluster_assignments)
      print(players_clustered)

            x     y cluster
        <dbl> <dbl>   <int>
      1    -1     1       1
      2    -2    -3       1
      3     8     6       1
      4     7    -8       1
      5   -12     8       2
      6   -15     0       2
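
As a complement to the single `k = 2` cut, `cutree()` also accepts a vector of `k` values and returns one column of assignments per value. The sketch below uses base R only and re-creates `players` from the slide; `assignments_by_k` is an illustrative name.

```r
# Coordinates copied from the slide's print(players) output.
players <- data.frame(x = c(-1, -2, 8, 7, -12, -15),
                      y = c(1, -3, 6, -8, 8, 0))
hc_players <- hclust(dist(players, method = "euclidean"), method = "complete")

# A vector of k values yields a matrix with one column per k,
# with the k values used as column names.
assignments_by_k <- cutree(hc_players, k = 1:4)
print(assignments_by_k)

# Cluster sizes for the k = 2 solution shown on the slide.
print(table(cutree(hc_players, k = 2)))
```

Reading across a row shows how one player's cluster label changes as the tree is cut into more groups.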

  31. Visualizing K-Clusters

      library(ggplot2)
      ggplot(players_clustered, aes(x = x, y = y, color = factor(cluster))) +
        geom_point()

  32. Let's practice!

  33. Visualizing the Dendrogram

  34. Building the Dendrogram


  43. Plotting the Dendrogram

      plot(hc_players)
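
The branch heights drawn by `plot(hc_players)` are stored in the `hclust` object itself. The sketch below, which re-creates `players` from slide 29, prints the merge order and the height of each merge so the plot can be read against the numbers.

```r
# Coordinates copied from the slide's print(players) output.
players <- data.frame(x = c(-1, -2, 8, 7, -12, -15),
                      y = c(1, -3, 6, -8, 8, 0))
hc_players <- hclust(dist(players, method = "euclidean"), method = "complete")

# Each row of $merge is one merge step: negative entries are single
# observations, positive entries refer to clusters formed in earlier rows.
print(hc_players$merge)

# $height holds the complete-linkage distance at which each merge happened,
# which is the y-axis position of the corresponding branch in the dendrogram.
print(round(hc_players$height, 2))
```

With complete linkage on this data the five merges happen at heights of about 4.12, 8.54, 12.04, 14.04 and 24.84.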

  44. Let's practice!

  45. Cutting the Tree


  49. Coloring the Dendrogram - Height

      library(dendextend)
      dend_players <- as.dendrogram(hc_players)
      dend_colored <- color_branches(dend_players, h = 15)
      plot(dend_colored)


  51. Coloring the Dendrogram - Height

      library(dendextend)
      dend_players <- as.dendrogram(hc_players)
      dend_colored <- color_branches(dend_players, h = 10)
      plot(dend_colored)

  52. Coloring the Dendrogram - K

      library(dendextend)
      dend_players <- as.dendrogram(hc_players)
      dend_colored <- color_branches(dend_players, k = 2)
      plot(dend_colored)

  53. cutree() using height

      cluster_assignments <- cutree(hc_players, h = 15)
      print(cluster_assignments)

      [1] 1 1 1 1 2 2

      library(dplyr)
      players_clustered <- mutate(players, cluster = cluster_assignments)
      print(players_clustered)

            x     y cluster
        <dbl> <dbl>   <int>
      1    -1     1       1
      2    -2    -3       1
      3     8     6       1
      4     7    -8       1
      5   -12     8       2
      6   -15     0       2
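
The effect of the cut height can be checked directly by comparing the two heights used in the coloring slides. This is a sketch in base R with `players` re-created from slide 29; the names `by_height_15`, `by_height_10` and `by_k_2` are illustrative.

```r
# Coordinates copied from the slide's print(players) output.
players <- data.frame(x = c(-1, -2, 8, 7, -12, -15),
                      y = c(1, -3, 6, -8, 8, 0))
hc_players <- hclust(dist(players, method = "euclidean"), method = "complete")

by_height_15 <- cutree(hc_players, h = 15)
by_height_10 <- cutree(hc_players, h = 10)
by_k_2       <- cutree(hc_players, k = 2)

# Cutting at h = 15 leaves only the final merge undone, so it matches k = 2.
print(all(by_height_15 == by_k_2))

# Cutting lower keeps fewer merges and therefore produces more clusters.
print(length(unique(by_height_15)))
print(length(unique(by_height_10)))
```

A cut at height h guarantees that, under complete linkage, no two members of the same cluster are farther apart than h, which is why lowering h from 15 to 10 splits the data into more groups.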

  54. Let's practice!

  55. Making Sense of the Clusters

  56. Wholesale Dataset
      45 observations
      3 features:
        Milk Spending
        Grocery Spending
        Frozen Food Spending

  57. Wholesale Dataset

      print(customers_spend)

           Milk Grocery Frozen
      1   11103   12469    902
      2    2013    6550    909
      3    1897    5234    417
      4    1304    3643   3045
      5    3199    6986   1455
      ...   ...     ...    ...

  58. Exploring More Than 2 Dimensions
      Plot 2 dimensions at a time
      Visualize using PCA
      Summary statistics by feature
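
The three approaches listed above might look like the sketch below. It is an illustration under loud assumptions: it uses only the five rows printed on slide 57 rather than the full 45-observation dataset, the choice of `k = 2` is arbitrary, and the names `hc_customers`, `clusters`, `segment_means` and `pc_scores` are not from the deck.

```r
# ASSUMPTION: only the five rows printed on the slide, not the full dataset.
customers_spend <- data.frame(
  Milk    = c(11103, 2013, 1897, 1304, 3199),
  Grocery = c(12469, 6550, 5234, 3643, 6986),
  Frozen  = c(902, 909, 417, 3045, 1455)
)

hc_customers <- hclust(dist(customers_spend), method = "complete")
clusters <- cutree(hc_customers, k = 2)  # k = 2 chosen only for illustration

# Plot 2 dimensions at a time, colored by cluster.
plot(customers_spend[, c("Milk", "Grocery")], col = clusters, pch = 19)

# Visualize using PCA: project the three features onto the first two components.
pca <- prcomp(customers_spend, scale. = TRUE)
pc_scores <- pca$x[, 1:2]
plot(pc_scores, col = clusters, pch = 19)

# Summary statistics by feature: mean spending per cluster.
segment_means <- aggregate(customers_spend,
                           by = list(cluster = clusters), FUN = mean)
print(segment_means)
```

On these five rows the first customer ends up alone in its own cluster because its Milk and Grocery spending is far above the rest; the full dataset would be expected to give a different, more informative segmentation.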

  59. Segment the Customers
