K-Means Clustering Analysis
This multivariate method groups data with similar characteristics in multidimensional space into clusters.
The K-Means Clustering Method
- The method partitions a data set of n objects into k clusters via an iterative process that continues until the sum of squared distances from the points to their assigned cluster centers can no longer be reduced, i.e., until every cluster center is at the mean of its Voronoi set.
- You can define the number of clusters to be generated in the analysis output, with the following limitation: k < n.
- Aabel uses the method of Hartigan and Wong (1979), performs several random starts, and attempts to converge to a global minimum of the squared error distortion.
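The iterate-until-stable process and the use of several random starts can be sketched in a few lines of Python. This is only an illustration and not Aabel's implementation: it uses the simpler Lloyd iteration rather than the Hartigan and Wong (1979) procedure, and the function names, the `n_starts` and `seed` parameters, and the default values are all assumptions made for the example.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def lloyd_kmeans(data, k, rng, max_iter=100):
    """One k-means run (Lloyd iteration) from randomly chosen starting centers."""
    centers = rng.sample(data, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest center (its Voronoi set).
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centers[j]))
            for p in data
        ]
        if new_labels == labels:
            break  # assignments are stable, so every center is already a mean
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(data, labels) if lab == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = mean_point(members)
    sse = sum(squared_distance(p, centers[lab]) for p, lab in zip(data, labels))
    return labels, centers, sse


def kmeans_multistart(data, k, n_starts=10, seed=0):
    """Repeat the run from several random starts and keep the lowest-SSE result."""
    if not 1 <= k < len(data):
        raise ValueError("k must satisfy 1 <= k < n")
    rng = random.Random(seed)
    best = None
    for _ in range(n_starts):
        result = lloyd_kmeans(data, k, rng)
        if best is None or result[2] < best[2]:
            best = result
    return best
```

Calling `kmeans_multistart(rows, 3)` on a list of equal-length numeric tuples returns the cluster label of each row, the cluster centers, and the final sum of squares. Keeping the best of several starts reduces the chance of settling in a poor local minimum, but it does not guarantee that the global minimum is found.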
K-Means Analysis Output
Once the analysis is performed, Aabel assigns properties to object local markers (worksheet rows) in the source worksheet, according to:
- Number of clusters you have chosen for the analysis
- Objects with similar characteristics in multidimensional space
Graphical Representation of Clusters Defined by K-Means Analysis
- For graphical representation of k-means analysis results, you can use any 2-D or 3-D plot that displays scatter data points. In the example illustrated here, the left-hand matrix displays three clusters; the data points that are members of each cluster have similar characteristics in multidimensional space. The right-hand matrix displays the same information, but with the member points of each cluster connected to the corresponding group centroid.
Using a Scatter Matrix Graph
Pre-Processing the Data
Aabel allows optional pre-processing of the data prior to the main k-means clustering analysis. The options include:
- Standardizing
- Normalizing
- Log transforming
- Log centering
- Mean centering
- Taking square root
- Ranking variables individually
- Ranking variables jointly
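Several of the options listed above correspond to simple column-wise transformations, sketched below in Python. The definitions are common textbook conventions and are assumptions here: Aabel's exact formulas (for example, whether "normalizing" means min-max scaling, or which logarithm base is used) are not stated on this page, and all function names are hypothetical. Log centering and joint ranking are omitted because their definitions are not given.

```python
import math
import statistics


def mean_center(col):
    """Subtract the column mean from every value."""
    m = statistics.fmean(col)
    return [x - m for x in col]


def standardize(col):
    """Z-scores: mean-center, then divide by the sample standard deviation."""
    m = statistics.fmean(col)
    s = statistics.stdev(col)  # assumption: n - 1 denominator
    return [(x - m) / s for x in col]


def min_max_normalize(col):
    """Rescale to the range [0, 1] (one common meaning of 'normalizing')."""
    lo, hi = min(col), max(col)
    return [(x - lo) / (hi - lo) for x in col]


def log_transform(col):
    """Base-10 logarithm; values must be strictly positive."""
    return [math.log10(x) for x in col]


def square_root(col):
    """Square root; values must be non-negative."""
    return [math.sqrt(x) for x in col]


def rank(col):
    """1-based ranks within one column; tied values share their average rank."""
    order = sorted(range(len(col)), key=lambda i: col[i])
    ranks = [0.0] * len(col)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and col[order[j + 1]] == col[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for t in range(i, j + 1):
            ranks[order[t]] = average
        i = j + 1
    return ranks


def preprocess(rows, transform):
    """Apply one transform to each variable (column) of a list of rows."""
    columns = [list(c) for c in zip(*rows)]
    transformed = [transform(c) for c in columns]
    return [tuple(r) for r in zip(*transformed)]
```

For example, `preprocess(rows, standardize)` puts every variable on a comparable scale before clustering, which matters because k-means distances are otherwise dominated by the variables with the largest numeric ranges.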












