Sep 15, 2023

K-MEANS++ OPTIMAL INITIALIZATION ALGORITHM
An Improved K-means Clustering Method
OVERVIEW
• K-means Clustering Algorithm
• K-means++ Initialization Algorithm
• Experiment
• Datasets
• Conclusion
K-MEANS CLUSTERING ALGORITHM
• A well-known naïve clustering method.
• Designed to find natural clusters in unclassified datasets.
• Requires only a single input parameter: K, the number of clusters.
• Uses random initialization for the centroids.
• Uses Euclidean distance to assign each instance to a cluster.
• Recalculates each centroid as the mean of its cluster, then repeats the assignment until the clusters stop changing.
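The steps listed on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the presenters' implementation: the names `euclidean` and `kmeans`, the `max_iter` and `seed` parameters, and the toy points are all assumptions made for the example.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(points, k, max_iter=100, seed=0):
    """Standard K-means with random initial centroids (illustrative helper)."""
    rng = random.Random(seed)
    # Random initialization: pick k distinct data points as starting centroids.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each instance joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: euclidean(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the clusters are stable
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return centroids, labels


# Toy data (hypothetical): two visibly separate groups of 2-D points.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (8.0, 8.0), (8.2, 7.9), (7.8, 8.1)]
centroids, labels = kmeans(points, k=2)
```

Because the starting centroids are drawn at random, different seeds can lead to different final clusters, which is the weakness that K-means++ is meant to address.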
CLUSTERING EXAMPLE
MEAN CALCULATION AND RE-CLUSTERING
K-MEANS++ INITIALIZATION ALGORITHM
• Arbitrarily selects the first centroid.
• Selects every other centroid based on its distance from the centroids already chosen.
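The slide does not spell out the selection rule. In the K-means++ seeding described by Arthur and Vassilvitskii (2007), each remaining centroid is sampled with probability proportional to the squared distance from a point to its nearest already-chosen centroid. The sketch below follows that rule; the function names, the `seed` parameter, and the toy points are assumptions, and the code is not the simulation used in this experiment.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_pp_init(points, k, seed=0):
    """Choose k initial centroids with D^2-weighted sampling (illustrative)."""
    rng = random.Random(seed)
    # First centroid: chosen arbitrarily (uniformly at random here).
    centroids = [rng.choice(points)]
    while len(centroids) < k:
        # Squared distance from each point to its nearest chosen centroid.
        d2 = [min(sq_dist(p, c) for c in centroids) for p in points]
        if sum(d2) == 0:
            # Every point coincides with a chosen centroid; fall back to uniform.
            centroids.append(rng.choice(points))
            continue
        # Far-away points get proportionally higher selection probability;
        # points already chosen have weight 0 and cannot be picked again.
        centroids.append(rng.choices(points, weights=d2, k=1)[0])
    return centroids


# Toy data (hypothetical): three loose groups of 2-D points.
points = [(1.0, 1.0), (1.2, 0.8), (5.0, 5.1), (5.2, 4.9), (9.0, 1.0), (9.1, 1.2)]
initial_centroids = kmeans_pp_init(points, k=3)
```

The returned centroids would replace the random starting centroids of standard K-means; the assignment and mean-update loop stays the same.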
EXPERIMENT
• Compared the standard K-means and K-means++ methods.
• Goal: to discover whether either one produces better results than the other.
• Setup:
  • Both methods were run against 3 datasets with known classes: Cluster, Iris, and Wine.
  • Each set has 3 classes, which are used to verify the quality of the resulting clusters.
  • Cluster quality is also determined by majority class.
  • A fixed “arbitrary” setup was used to create an optimal and a worst random centroid selection.
  • Both methods were run against both centroid setups 3 times, each time with a different K value.
  • Total of 36 trials (2 methods × 3 datasets × 2 centroid setups × 3 K values).
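One common way to score clusters by majority class is purity: the fraction of instances whose true class matches the majority class of their cluster. The slides do not give the exact formula, so the sketch below is an assumption about what "majority class" quality could look like. The K values in the trial grid are placeholders, since the deck does not list the ones actually used.

```python
from collections import Counter
from itertools import product


def majority_class_purity(cluster_labels, true_classes):
    """Fraction of instances that belong to the majority class of their cluster."""
    by_cluster = {}
    for cluster, cls in zip(cluster_labels, true_classes):
        by_cluster.setdefault(cluster, Counter())[cls] += 1
    # In each cluster, count only the instances of its most frequent class.
    matched = sum(max(counts.values()) for counts in by_cluster.values())
    return matched / len(true_classes)


# Hypothetical result: 9 instances, 3 clusters, 3 true classes.
clusters = [0, 0, 0, 1, 1, 1, 2, 2, 2]
classes = ["a", "a", "b", "b", "b", "b", "c", "c", "a"]
purity = majority_class_purity(clusters, classes)  # (2 + 3 + 2) / 9

# Trial grid from the setup slide; the K values are placeholders.
methods = ["K-means", "K-means++"]
datasets = ["Cluster", "Iris", "Wine"]
centroid_setups = ["optimal", "worst"]
k_values = [2, 3, 4]  # assumed for illustration only
trials = list(product(methods, datasets, centroid_setups, k_values))
n_trials = len(trials)  # 2 * 3 * 2 * 3
```

A purity of 1.0 means every cluster contains a single class. Purity tends to rise as K grows, so comparisons are most meaningful between runs that use the same K.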
MULTIDIMENSIONAL DATA - CLUSTER
MULTIDIMENSIONAL DATA - IRIS
MULTIDIMENSIONAL DATA - WINE
RESULTS
• K-means++ produced better results than standard K-means in these trials.
• The trials gave no reason to prefer standard K-means.
• K-means++ is still not perfect.
IMPORTANT NOTES
• The simulation of K-means++ was imperfect.
• The results could be better.
• With a more faithful implementation, the results should favor K-means++ more clearly.
REVIEW
• K-means Clustering Algorithm
• K-means++ Initialization Algorithm
• Comparison Experiment
• Multidimensional Datasets
• Results
WORKS CITED
• Aleshunas, J. (2013). Cluster Set.
• Alsabti, K., Ranka, S., & Singh, V. (1997). An efficient k-means clustering algorithm.
• Arthur, D., & Vassilvitskii, S. (2007). K-means++: the advantages of careful seeding. Philadelphia: Society for Industrial and Applied Mathematics.
• Fisher, R. A. (1936). Iris Flower Data Set.
• Forina, M. (1988). Wine Recognition Data. PARVUS: An extendable package of programs for data exploration, classification and correlation. Genoa, Italy: Institute of Pharmaceutical and Food Analysis and Technologies.
• Inaba, M., Katoh, N., & Imai, H. (1994). Applications of weighted Voronoi diagrams and randomization to variance-based k-clustering. SCG '94 Proceedings of the tenth annual symposium on Computational geometry (pp. 332-339). New York: ACM.
• MacKay, D. (2003). An Example Inference Task: Clustering. In D. MacKay, Information Theory, Inference and Learning Algorithms (pp. 284-292). Cambridge University Press.
• Shaefer, I. (2013). Cluster Set Modified.
