TITLE:
Initial Value Filtering Optimizes Fast Global K-Means
AUTHORS:
Jintao Han, Haiming Li
KEYWORDS:
K-Means, Cluster, Neighbourhood, Mahalanobis Distance
JOURNAL NAME:
Journal of Computer and Communications, Vol.7 No.10, October 14, 2019
ABSTRACT: The K-means clustering algorithm is a core algorithm in unsupervised learning and plays an important role in big data processing, computer vision and other research fields. However, because it is sensitive to the initial partition, outliers, noise and other factors, its clustering results in data analysis, image segmentation and other fields are unstable and lack robustness. Building on the fast global K-means clustering algorithm, this paper proposes an improved K-means clustering algorithm. A neighborhood filtering mechanism excludes the points in the neighborhood of each selected initial cluster center from the selection of the next initial cluster center, which effectively reduces the randomness of the initial partition and makes the initial partition more efficient. The Mahalanobis distance is used in the clustering process to better account for the global structure of the data. In tests on real data sets, the proposed algorithm gives significantly better results than the traditional clustering algorithm and other optimization algorithms.
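
The abstract names three ingredients: incremental center selection in the style of fast global K-means, a neighborhood filter on candidate points, and the Mahalanobis distance. The sketch below shows one way they could fit together. It is an illustration under stated assumptions, not the authors' implementation. The function names, the `radius` parameter, the restriction to 2-D points, the use of a single covariance matrix estimated from the whole data set, and the choice to filter around each selected candidate point are all hypothetical. The candidate score is the error-reduction bound commonly described for fast global K-means.

```python
def inverse_covariance_2d(points):
    """Inverse of the 2x2 sample covariance matrix of 2-D points (assumed global)."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    det = sxx * syy - sxy * sxy
    if det == 0:
        raise ValueError("covariance matrix is singular")
    return ((syy / det, -sxy / det), (-sxy / det, sxx / det))


def mahalanobis_sq(p, q, inv_cov):
    """Squared Mahalanobis distance between two 2-D points."""
    dx, dy = p[0] - q[0], p[1] - q[1]
    return (dx * (inv_cov[0][0] * dx + inv_cov[0][1] * dy)
            + dy * (inv_cov[1][0] * dx + inv_cov[1][1] * dy))


def lloyd(points, centers, inv_cov, max_iter=100):
    """Plain K-means refinement, assigning points by Mahalanobis distance."""
    centers = list(centers)
    for _ in range(max_iter):
        clusters = [[] for _ in centers]
        for p in points:
            idx = min(range(len(centers)),
                      key=lambda c: mahalanobis_sq(p, centers[c], inv_cov))
            clusters[idx].append(p)
        updated = [
            (sum(p[0] for p in members) / len(members),
             sum(p[1] for p in members) / len(members))
            if members else centers[i]  # keep the old center if a cluster empties
            for i, members in enumerate(clusters)
        ]
        if updated == centers:
            break
        centers = updated
    return centers


def filtered_fast_global_kmeans(points, k, radius):
    """Add centers one at a time; skip candidates near earlier selections."""
    inv_cov = inverse_covariance_2d(points)
    n = len(points)
    # The optimal single center is the data mean.
    centers = [(sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)]
    excluded = set()  # indices filtered out of future candidate selection
    while len(centers) < k:
        nearest = [min(mahalanobis_sq(p, c, inv_cov) for c in centers)
                   for p in points]
        candidates = [i for i in range(n) if i not in excluded]
        if not candidates:
            raise ValueError("radius too large: no candidate points remain")

        def gain(i):
            # Guaranteed error reduction if points[i] became a new center.
            return sum(max(nearest[j] - mahalanobis_sq(points[i], points[j], inv_cov), 0.0)
                       for j in range(n))

        best = max(candidates, key=gain)
        # Neighborhood filter (assumed form): drop every point within `radius`
        # of the selected candidate from all later selections.
        excluded.update(j for j in range(n)
                        if mahalanobis_sq(points[best], points[j], inv_cov) <= radius ** 2)
        centers = lloyd(points, centers + [points[best]], inv_cov)
    return centers
```

Calling `filtered_fast_global_kmeans(points, 3, 1.0)` on three well-separated groups of 2-D points should settle on one center per group. In this sketch the filter mainly shrinks the candidate set that each round has to score, which matches the efficiency argument in the abstract. How the paper defines the neighborhood and its size is not stated here, so `radius` is left as a free parameter.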