Study of the impact of resampling methods for contrast pattern based classifiers in imbalanced databases

Octavio Loyola-González, José Fco Martínez-Trinidad, Jesús Ariel Carrasco-Ochoa, Milton García-Borroto

Research output: Contribution to journal › Article

46 Scopus citations

Abstract

© 2015 Elsevier B.V. The class imbalance problem is a challenge in supervised classification because many classifiers are sensitive to class distribution and bias their predictions towards the majority class. In imbalanced databases, contrast pattern miners usually extract a very large collection of patterns from the majority class but only a few patterns (or none) from the minority class. As a result, minority class objects have low support and may be identified as noise and discarded by the contrast pattern based classifier, which biases the results towards the majority class. In the literature, the class imbalance problem is commonly addressed by applying resampling methods. In this paper, we therefore present a study of the impact of resampling methods on the performance of contrast pattern based classifiers in class imbalance problems. Experimental results on standard imbalanced databases show statistically significant differences between using the classifier before and after applying resampling methods. From this study, we also provide a guide, based on the class imbalance ratio, for selecting a resampling method that, together with a contrast pattern based classifier, yields good results in a class imbalance problem.
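
To make the two quantities named in the abstract concrete, here is a minimal illustrative sketch in Python. It assumes the common definition of the class imbalance ratio (majority class size divided by minority class size) and uses plain random oversampling as a stand-in for "a resampling method". The function names `imbalance_ratio` and `random_oversample` are hypothetical, and nothing here is taken from the paper's own experimental setup.

```python
import random
from collections import Counter


def imbalance_ratio(labels):
    """Majority class size divided by minority class size (assumed definition)."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen objects of each smaller class until every
    class has as many objects as the largest one (illustrative only)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        # Objects of this class that can be duplicated.
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Toy data: 9 majority objects and 1 minority object.
rows = list(range(10))
labels = ["majority"] * 9 + ["minority"]
balanced_rows, balanced_labels = random_oversample(rows, labels)
```

On the toy data the ratio before resampling is 9.0, and the oversampled set is balanced. A selection guide like the one the abstract describes would choose the resampling method according to the first value.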
Original language: American English
Pages (from-to): 935-947
Number of pages: 13
Journal: Neurocomputing
State: Published - 1 Jan 2016
Externally published: Yes
