Optimizing K-Means by Fixing Initial Cluster Centers

Neeti Arora; Mahesh Motwani

Authors

Neeti Arora, Department of Computer Science and Engineering, Rajiv Gandhi Technical University, Madhya Pradesh, India
Mahesh Motwani, Department of Computer Science and Engineering, Rajiv Gandhi Technical University, Madhya Pradesh, India

Keywords:

Initial centroids; Recall; Precision; Partitional clustering; Agglomerative hierarchical clustering; Hierarchical partitioning clustering.

Abstract

Data mining techniques support business decision making and the prediction of behaviors and future trends. Clustering is a data mining technique that groups objects with similar characteristics. It analyzes data objects without consulting a known class label or category, i.e., it is an unsupervised technique. K-means is a widely used partitional clustering algorithm, but its performance depends strongly on the initial guess of the cluster centers (centroids), and the final centroids may not be the optimal ones. It is therefore important for K-means to start from a good choice of initial centroids. By augmenting K-means with a technique that selects the initial centroids using the sum of distances from each data object to all other data objects, we obtain an algorithm, Farthest Distributed Centroids Clustering (FDCC), that results in better clustering than not only the K-means partitional clustering algorithm but also the agglomerative hierarchical clustering algorithm and the hierarchical partitioning clustering algorithm. Unlike K-means, FDCC does not generate the initial centers randomly and therefore does not produce different results for the same input data.
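The abstract names the selection criterion (sum of distances of each data object to all other data objects) but does not spell out the full selection rule. The following is a minimal sketch of one plausible reading, not the authors' FDCC procedure: the first centroid is the object with the largest total distance to all others, and each further centroid is the object with the largest summed distance to the centroids already chosen. The function names `sum_distance_init` and `kmeans`, and the greedy rule itself, are assumptions made for illustration. The point of the sketch is that a deterministic initialization makes K-means return the same result on every run.

```python
import numpy as np

def sum_distance_init(X, k):
    """Pick k initial centroids deterministically (illustrative rule, NOT the paper's exact method)."""
    # Pairwise Euclidean distances between all data objects.
    diffs = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diffs ** 2).sum(axis=2))
    # Sum of distances from each object to all other objects.
    totals = dist.sum(axis=1)
    # Assumed rule: start with the object that has the largest total distance.
    chosen = [int(np.argmax(totals))]
    # Assumed rule: add the object with the largest summed distance to those already chosen.
    while len(chosen) < k:
        to_chosen = dist[:, chosen].sum(axis=1)
        to_chosen[chosen] = -np.inf  # never pick the same object twice
        chosen.append(int(np.argmax(to_chosen)))
    return X[chosen].astype(float)

def assign(X, centroids):
    """Label each object with the index of its nearest centroid."""
    d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return d.argmin(axis=1)

def kmeans(X, centroids, max_iter=100):
    """Standard Lloyd iterations starting from the given centroids."""
    for _ in range(max_iter):
        labels = assign(X, centroids)
        # Recompute each centroid as the mean of its members; keep it if the cluster is empty.
        updated = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(len(centroids))
        ])
        if np.allclose(updated, centroids):
            break
        centroids = updated
    return assign(X, centroids), centroids

# Toy data: two well-separated groups of three points each.
X = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]], dtype=float)
labels, centers = kmeans(X, sum_distance_init(X, k=2))
```

Because no random number generator is involved, calling `kmeans(X, sum_distance_init(X, k=2))` again on the same `X` yields identical labels and centroids. The pairwise distance matrix costs O(n²) memory, so this sketch is only suitable for small data sets.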
