Fuzzy Self-Organizing Map based on Regularized Fuzzy c-means Clustering
Sándor Migály, János Abonyi and Ferenc Szeifert
University of Veszprém, Department of Process Engineering
www.fmt.vein.hu/softcomp
2/10 Overview
Steps and tasks of data mining
Concept of Self-Organizing Maps
Smoothed fuzzy c-means clustering
Illustrative examples
Summary
3/10 Steps of Data Mining
Database → Data warehouse → Data mining → Model
How can soft computing help?
4/10 Tasks of Data Mining
Classification
Change and deviation detection
Dependency modelling
Clustering (prototypes, codebook, signatures, probability density estimation)
Summarization (incl. visualisation, feature extraction)
Regression and time-series analysis
5/10 Clustering
Detect groups of data.
Hierarchical (dendrograms) or not.
Prototypes (signatures) are based on a similarity measure (distance).
(Semi-)supervised or unsupervised.
Cost function: $E = \sum_{i=1}^{C} \sum_{x \in Q_i} \| x - v_i \|^2$
Can be fuzzy!
[Figure: scatter plot of grouped data, axes x_1 and x_2]
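The cost function E on this slide can be evaluated with a few lines of Python. This is a minimal sketch, assuming that Q_i is the set of points whose nearest prototype is v_i (the usual hard c-means assignment, which the slide does not state explicitly). The function name `clustering_cost` and the toy data are hypothetical.

```python
def clustering_cost(points, centers):
    """Sum over all points of the squared distance to the nearest center.

    Assumption: each point belongs to the cluster Q_i of its closest
    prototype v_i, so the double sum collapses to a min over centers.
    """
    total = 0.0
    for x in points:
        total += min(
            sum((a - b) ** 2 for a, b in zip(x, v)) for v in centers
        )
    return total


# Toy data: two well-separated groups of two points each.
points = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0), (4.0, 1.0)]
centers = [(0.0, 0.5), (4.0, 0.5)]
print(clustering_cost(points, centers))  # 1.0
```

Each point lies 0.5 away from its prototype, so every term contributes 0.25 and the four terms sum to 1.0.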
6/10 Feature Extraction
(Nonlinear) mapping of the input space into a lower-dimensional one.
Reduction of the number of inputs.
Useful for visualisation.
Non-parametric (Sammon projection) or model-based (principal curves, NN, Gaussian mixtures, SOM).
7/10 Concept of the SOM I.
[Figure: input layer (x_1, x_2, x_3) in the input space connected to the map layer (s_1, s_2) in the reduced feature space]
Cluster centers (code vectors): $v_i = [v_{i1}, \ldots, v_{im}]$
Place of these code vectors in the reduced space: $r_i = [r_{i1}, \ldots, r_{in}]$, with $n \ll m$
Clustering and ordering of the cluster centers in a two-dimensional grid.
8/10 Concept of the SOM II.
We can use it for visualization.
We can use it for clustering.
We can use it for regression: $y = f(u)$, with known inputs $u = [u_1, u_2, u_3]$ and unknown inputs $y = [y_1, y_2]$.
Best Matching Unit $m_c$: $\| x' - v'_c \| = \min_i \| x' - v'_i \|$
[Figure: components m_{c1}, ..., m_{c5} of the best matching unit aligned with x_1, ..., x_5; x_1 to x_3 are the known inputs, x_4 and x_5 the unknown ones]
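The regression use of the map can be illustrated as follows. This is a sketch under the assumption that the primed quantities in the Best Matching Unit formula denote the restriction of x and v_i to the known components, and that the prediction is simply read off the remaining components of the winning code vector. The function names and the three toy code vectors are hypothetical.

```python
def best_matching_unit(x_known, code_vectors, known_idx):
    """Return the index c minimizing ||x' - v'_i|| over known components."""
    def dist2(v):
        return sum((x_known[j] - v[k]) ** 2 for j, k in enumerate(known_idx))
    return min(range(len(code_vectors)), key=lambda i: dist2(code_vectors[i]))


def predict(x_known, code_vectors, known_idx, unknown_idx):
    """Read the unknown components off the best matching unit."""
    c = best_matching_unit(x_known, code_vectors, known_idx)
    return [code_vectors[c][k] for k in unknown_idx]


# Each code vector has five components: three inputs, then two outputs.
code_vectors = [
    [0.0, 0.0, 0.0, 1.0, 2.0],
    [1.0, 1.0, 1.0, 3.0, 4.0],
    [2.0, 2.0, 2.0, 5.0, 6.0],
]
u = [0.9, 1.2, 1.1]  # known inputs
y = predict(u, code_vectors, known_idx=[0, 1, 2], unknown_idx=[3, 4])
print(y)  # [3.0, 4.0]
```

The second code vector is closest to u on the first three components, so its last two components are returned as the estimate of y.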
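9/10 Smoothed Fuzzy c-means
The smoothness can be measured as $S = \int \left\| \frac{\partial^2 v}{\partial x^2} \right\| dx$
The new cost function: $J(Z, U, V) = \sum_{i=1}^{c} \sum_{k=1}^{N} (\mu_{ik})^m D_{i,k}^2 + \vartheta \sum_{i=1}^{c} \left\| \frac{\partial^2 v_i}{\partial x^2} \right\|$
[Figure: cluster centers v_1, ..., v_9 arranged on a 3x3 grid over the axes x_1 and x_2]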
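The regularized cost can be sketched in plain Python for cluster centers ordered along a line. Several choices here are assumptions and are not confirmed by the slide: the second derivative is approximated by the finite difference v_{i-1} - 2 v_i + v_{i+1}, its squared Euclidean norm is used as the penalty, D_{i,k} is the Euclidean distance, and the memberships follow the standard FCM update. All function names are hypothetical.

```python
def sqdist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def memberships(data, centers, m=2.0):
    """Standard FCM update; U[i][k] is the membership of point k in cluster i."""
    eps = 1e-12  # guards against division by zero when a point sits on a center
    U = [[0.0] * len(data) for _ in centers]
    for k, z in enumerate(data):
        d2 = [sqdist(z, v) + eps for v in centers]
        for i in range(len(centers)):
            U[i][k] = 1.0 / sum(
                (d2[i] / d2[j]) ** (1.0 / (m - 1.0)) for j in range(len(centers))
            )
    return U


def roughness(centers):
    """Sum of squared second differences along the chain of centers (assumed penalty)."""
    total = 0.0
    for i in range(1, len(centers) - 1):
        second_diff = [
            a - 2.0 * b + c
            for a, b, c in zip(centers[i - 1], centers[i], centers[i + 1])
        ]
        total += sum(s * s for s in second_diff)
    return total


def regularized_cost(data, U, centers, m=2.0, theta=2.0):
    """Fuzzy c-means fit term plus theta times the roughness penalty."""
    fit = sum(
        U[i][k] ** m * sqdist(z, centers[i])
        for i in range(len(centers))
        for k, z in enumerate(data)
    )
    return fit + theta * roughness(centers)


data = [(0.0, 0.1), (1.0, 0.9), (2.0, 2.1)]
straight = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
zigzag = [(0.0, 0.0), (1.0, 2.0), (2.0, 0.0)]
print(roughness(straight))  # 0.0
print(roughness(zigzag))    # 16.0
U = memberships(data, zigzag)
print(regularized_cost(data, U, zigzag, theta=2.0))
```

Centers lying on a straight line incur no penalty, while the zigzag arrangement is penalized in proportion to the weight theta, which is how the extra term pushes the prototypes toward a smooth, ordered arrangement.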
10/10 Fuzzy line-trace application
The task is to trace a part of a spiral in 3D. For this purpose, 300 points are available, corrupted by noise with zero mean and variance 0.2. The aim of the clustering is to detect seven ordered clusters that can be lined up to capture the 3D curvature.
[Figure, two 3D panels: detected clusters and the obtained ordering when the standard FCM algorithm is used; detected clusters and the obtained ordering when the proposed method is used. Parameter shown in the figure: $\vartheta = 2$]
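A data set of this kind can be generated as follows. This is only a sketch: the slide does not give the spiral's parameterization or the noise distribution, so the half-turn helix and the Gaussian noise below are assumptions, and `noisy_spiral` is a hypothetical name. Only the number of points (300) and the noise variance (0.2) come from the slide.

```python
import math
import random


def noisy_spiral(n=300, variance=0.2, seed=0):
    """Sample n points along half a turn of a helix and add zero-mean noise.

    Assumptions: the curve (cos t, sin t, t / pi) for t in [0, pi] and
    Gaussian noise with the given variance on every coordinate.
    """
    rng = random.Random(seed)
    sigma = math.sqrt(variance)  # standard deviation from the variance
    points = []
    for k in range(n):
        t = math.pi * k / (n - 1)
        clean = (math.cos(t), math.sin(t), t / math.pi)
        points.append(tuple(c + rng.gauss(0.0, sigma) for c in clean))
    return points


data = noisy_spiral()
print(len(data), len(data[0]))  # 300 3
```

Note that a variance of 0.2 corresponds to a standard deviation of about 0.45, which is large relative to a unit-radius curve, so the ordering of the clusters is hard to recover from the raw points alone.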
11/10 Fuzzy surface-trace application
We folded a 6x6 grid on a half sphere. 900 points were taken, and noise with zero mean and variance 0.1 was added.
[Figure, two 3D panels: detected clusters and the obtained ordering when the standard FCM algorithm is used; detected clusters and the obtained ordering when the proposed regularized FCM algorithm is used]
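For centers arranged on a two-dimensional grid, a smoothness penalty can be accumulated along both grid directions. The sketch below assumes squared second-order finite differences along rows and columns; the slides do not spell out the discretization used, so this form and the name `grid_roughness` are hypothetical.

```python
def grid_roughness(grid):
    """Sum of squared second differences along rows and columns.

    grid[r][c] is the cluster center (a tuple) at grid position (r, c).
    """
    rows, cols = len(grid), len(grid[0])

    def sq_second_diff(lo, mid, hi):
        return sum((p - 2.0 * q + s) ** 2 for p, q, s in zip(lo, mid, hi))

    total = 0.0
    for r in range(rows):  # differences along each row
        for c in range(1, cols - 1):
            total += sq_second_diff(grid[r][c - 1], grid[r][c], grid[r][c + 1])
    for c in range(cols):  # differences along each column
        for r in range(1, rows - 1):
            total += sq_second_diff(grid[r - 1][c], grid[r][c], grid[r + 1][c])
    return total


# A flat 6x6 grid of centers in 3D, then the same grid with one center lifted.
flat = [[(float(r), float(c), 0.0) for c in range(6)] for r in range(6)]
bumped = [row[:] for row in flat]
bumped[2][2] = (2.0, 2.0, 1.0)
print(grid_roughness(flat))    # 0.0
print(grid_roughness(bumped))  # 12.0
```

Lifting a single center by 1 changes three second differences in its row and three in its column (1 + 4 + 1 in each direction), which gives the total of 12.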
12/10 Conclusions
A new regularized fuzzy c-means clustering algorithm was presented for the visualization of high-dimensional data.
The cluster centers are arranged on a grid defined on a low-dimensional space that can be easily visualized.
The algorithm was compared with existing modifications of the fuzzy c-means algorithm, and the application examples showed good performance in two geometrical case studies.