TY - JOUR
T1 - Efficiently finding the optimum number of clusters in a dataset with a new hybrid differential evolution algorithm
T2 - DELA
AU - Arellano-Verdejo, Javier
AU - Alba, Enrique
AU - Godoy-Calderon, Salvador
N1 - Publisher Copyright:
© 2014, Springer-Verlag Berlin Heidelberg.
PY - 2016/3/1
Y1 - 2016/3/1
N2 - Clustering algorithms, a fundamental base for data mining procedures and learning techniques, suffer from the lack of efficient methods for determining the optimal number of clusters to be found in an arbitrary dataset. The few methods existing in the literature always use some sort of evolutionary algorithm having a cluster validation index as its objective function. In this article, a new evolutionary algorithm, based on a hybrid model of global and local heuristic search, is proposed for the same task, and some experimentation is done with different datasets and indexes. Due to its design, independent of any clustering procedure, it is applicable to virtually any clustering method, such as the widely used k-means algorithm. Moreover, non-parametric statistical tests on the experimental results clearly show the proposed algorithm to be more efficient than other evolutionary algorithms currently used for the same task.
AB - Clustering algorithms, a fundamental base for data mining procedures and learning techniques, suffer from the lack of efficient methods for determining the optimal number of clusters to be found in an arbitrary dataset. The few methods existing in the literature always use some sort of evolutionary algorithm having a cluster validation index as its objective function. In this article, a new evolutionary algorithm, based on a hybrid model of global and local heuristic search, is proposed for the same task, and some experimentation is done with different datasets and indexes. Due to its design, independent of any clustering procedure, it is applicable to virtually any clustering method, such as the widely used k-means algorithm. Moreover, non-parametric statistical tests on the experimental results clearly show the proposed algorithm to be more efficient than other evolutionary algorithms currently used for the same task.
KW - Differential evolution
KW - Evolutionary algorithms
KW - Local search
KW - Optimum number of clusters
KW - Partitional clustering
UR - http://www.scopus.com/inward/record.url?scp=84958112516&partnerID=8YFLogxK
U2 - 10.1007/s00500-014-1548-6
DO - 10.1007/s00500-014-1548-6
M3 - Article
AN - SCOPUS:84958112516
SN - 1432-7643
VL - 20
SP - 895
EP - 905
JO - Soft Computing
JF - Soft Computing
IS - 3
ER -