An improved algorithm for partial clustering

G. Melendez-Melendez, D. Cruz-Paz, J. A. Carrasco-Ochoa, José Fco Martínez-Trinidad

Research output: Contribution to journal › Article › Research › peer-review

1 Citation (Scopus)

Abstract

© 2018 Elsevier Ltd. Expert and intelligent systems use a variety of machine learning techniques to obtain and understand the information inherent in data. Clustering is one of these techniques; it has become important and popular because it allows an unlabeled dataset to be partitioned into clusters of similar objects. Many clustering algorithms have been proposed in the literature. Among them, the Cross-Clustering algorithm is one of the most recent algorithms for partial clustering (clustering in which not all objects are necessarily grouped into clusters); it has provided good results, estimating a suitable set of clusters while eliminating outliers. However, this algorithm tends to eliminate too many objects as outliers, which leads to discarding many non-outlier objects. Additionally, the Cross-Clustering algorithm spends a lot of time evaluating several combinations of clusterings while trying to determine a suitable number of clusters. To overcome these problems, this paper proposes an improved version of the Cross-Clustering algorithm (ICC). ICC changes the clustering algorithm used for detecting outliers and modifies the way outliers are detected. Moreover, a stop criterion that allows a fast decision on the estimate of a suitable number of clusters is introduced. The performance of ICC is compared with that of the original algorithm on artificial and real datasets. Our results show that ICC improves on the original algorithm and on other state-of-the-art clustering algorithms in both runtime and clustering quality.
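To make the notion of partial clustering concrete, the following is a minimal illustrative sketch in Python. It is not the Cross-Clustering or ICC algorithm, whose details the abstract does not give. It assumes a simple distance-threshold (single-linkage style) grouping and treats any group smaller than a chosen size as outliers. The function name `partial_clustering` and the parameters `radius` and `min_size` are hypothetical and chosen only for this example.

```python
from math import dist


def partial_clustering(points, radius, min_size):
    """Return one label per point; -1 marks an object left unclustered.

    Illustrative only: points closer than `radius` are linked, and linked
    groups with fewer than `min_size` members are treated as outliers.
    """
    n = len(points)
    parent = list(range(n))  # union-find forest, one node per point

    def find(i):
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Link every pair of points whose distance is at most `radius`.
    for i in range(n):
        for j in range(i + 1, n):
            if dist(points[i], points[j]) <= radius:
                parent[find(i)] = find(j)

    # Collect the members of each linked group.
    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)

    # Keep only sufficiently large groups; everything else stays at -1.
    labels = [-1] * n
    next_label = 0
    for members in groups.values():
        if len(members) >= min_size:
            for i in members:
                labels[i] = next_label
            next_label += 1
    return labels


points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
          (5.0, 5.0), (5.1, 5.2), (4.9, 5.1),
          (10.0, 0.0)]
labels = partial_clustering(points, radius=1.0, min_size=2)
print(labels)  # [0, 0, 0, 1, 1, 1, -1]
```

In this toy dataset the isolated point at (10.0, 0.0) is left unassigned, which is the behavior the abstract calls partial clustering. Raising `min_size` discards more objects, which loosely mirrors the over-elimination problem the abstract attributes to the original algorithm.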
Original language: American English
Pages (from-to): 282-291
Number of pages: 10
Journal: Expert Systems with Applications
DOI: 10.1016/j.eswa.2018.12.027
Status: Published - 1 May 2019
Published externally

Fingerprint

Clustering algorithms
Intelligent systems
Expert systems
Learning systems

Cite this

Melendez-Melendez, G.; Cruz-Paz, D.; Carrasco-Ochoa, J. A.; Martínez-Trinidad, José Fco. / An improved algorithm for partial clustering. In: Expert Systems with Applications. 2019; pp. 282-291.
@article{5574a0785b60463abe7c8de907c6cc9c,
title = "An improved algorithm for partial clustering",
author = "G. Melendez-Melendez and D. Cruz-Paz and Carrasco-Ochoa, {J. A.} and Mart{\'i}nez-Trinidad, {Jos{\'e} Fco}",
year = "2019",
month = "5",
day = "1",
doi = "10.1016/j.eswa.2018.12.027",
language = "American English",
pages = "282--291",
journal = "Expert Systems with Applications",
issn = "0957-4174",
publisher = "Elsevier Ltd",

}

An improved algorithm for partial clustering. / Melendez-Melendez, G.; Cruz-Paz, D.; Carrasco-Ochoa, J. A.; Martínez-Trinidad, José Fco.

In: Expert Systems with Applications, 01.05.2019, pp. 282-291.

TY - JOUR

T1 - An improved algorithm for partial clustering

AU - Melendez-Melendez, G.

AU - Cruz-Paz, D.

AU - Carrasco-Ochoa, J. A.

AU - Martínez-Trinidad, José Fco

PY - 2019/5/1

Y1 - 2019/5/1

UR - https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85058815603&origin=inward

UR - https://www.scopus.com/inward/citedby.uri?partnerID=HzOxMe3b&scp=85058815603&origin=inward

U2 - 10.1016/j.eswa.2018.12.027

DO - 10.1016/j.eswa.2018.12.027

M3 - Article

SP - 282

EP - 291

JO - Expert Systems with Applications

JF - Expert Systems with Applications

SN - 0957-4174

ER -