Optimizing K-Means by Fixing Initial Cluster Centers

Authors

  • Neeti Arora, Department of Computer Science and Engineering, Rajiv Gandhi Technical University, Madhya Pradesh, India
  • Mahesh Motwani, Department of Computer Science and Engineering, Rajiv Gandhi Technical University, Madhya Pradesh, India

Keywords:

Initial centroids; Recall; Precision; Partitional clustering; Agglomerative hierarchical clustering; Hierarchical partitioning clustering

Abstract

Data mining techniques support business decision making and the prediction of behaviors and future trends. Clustering is a data mining technique that groups objects with similar characteristics. Because it analyzes data objects without consulting a known class label or category, it is an unsupervised technique. K-means is a widely used partitional clustering algorithm, but its performance depends strongly on the initial guess of the cluster centers (centroids), and the final centroids may not be optimal. It is therefore important for K-means to start from a good choice of initial centroids. By augmenting K-means with a technique that selects the initial centroids using the sum of distances from each data object to all other data objects, we obtain the Farthest Distributed Centroids Clustering (FDCC) algorithm, which results in better clustering than not only the K-means partitional clustering algorithm but also the agglomerative hierarchical clustering algorithm and the hierarchical partitioning clustering algorithm. Unlike K-means, the FDCC algorithm does not generate the initial centers randomly and does not produce different results for the same input data.
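The abstract does not spell out the exact selection rule, so the following is only a minimal sketch of the general idea: deterministic seeding based on sums of distances, followed by ordinary K-means iterations. The function names (`distance_sums`, `deterministic_init`, `kmeans`) and the seeding rule itself (first center is the point with the largest distance sum, each later center is the point farthest from its nearest already-chosen center) are assumptions made for illustration and should not be read as the FDCC procedure from the paper.

```python
import math


def distance_sums(points):
    """Sum of Euclidean distances from each point to all other points."""
    return [sum(math.dist(p, q) for q in points) for p in points]


def deterministic_init(points, k):
    """Hypothetical seeding rule (an assumption, not the paper's exact method).

    The first center is the point with the largest distance sum. Each later
    center is the unchosen point whose nearest chosen center is farthest away.
    No random numbers are used, so the same input gives the same centers.
    """
    sums = distance_sums(points)
    chosen = [max(range(len(points)), key=lambda i: sums[i])]
    while len(chosen) < k:
        candidates = (i for i in range(len(points)) if i not in chosen)
        chosen.append(
            max(
                candidates,
                key=lambda i: min(math.dist(points[i], points[c]) for c in chosen),
            )
        )
    return [points[i] for i in chosen]


def kmeans(points, centers, max_iter=100):
    """Standard K-means (Lloyd) iterations starting from the given centers."""
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        labels = [
            min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
            for p in points
        ]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for j in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                new_centers.append(centers[j])  # keep a center with no members
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels


# Toy data with two visibly separated groups (illustrative values only).
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (12.0, 10.0)]
centers, labels = kmeans(points, deterministic_init(points, 2))
```

Because the seeding step involves no randomness, calling `kmeans(points, deterministic_init(points, 2))` repeatedly on the same data returns the same centers and labels, which is the reproducibility property the abstract claims for FDCC.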

Published

2014-06-30

Section

Articles

How to Cite

Optimizing K-Means by Fixing Initial Cluster Centers. (2014). International Journal of Current Engineering and Technology, 4(3), 2101-2107. https://ijcet.evegenis.org/index.php/ijcet/article/view/1000