TY - JOUR
T1 - Impact of imbalanced datasets preprocessing in the performance of associative classifiers
AU - Rangel-Díaz-de-la-Vega, Adolfo
AU - Villuendas-Rey, Yenny
AU - Yáñez-Márquez, Cornelio
AU - Camacho-Nieto, Oscar
AU - López-Yáñez, Itzamá
N1 - Publisher Copyright: © 2020 by the authors.
PY - 2020/4/1
Y1 - 2020/4/1
N2 - In this paper, an experimental study was carried out to determine the influence of imbalanced dataset preprocessing on the performance of associative classifiers, in order to find better computational solutions to the problem of credit scoring. To do this, six undersampling algorithms, six oversampling algorithms and four hybrid algorithms were evaluated on 13 imbalanced credit scoring datasets. Then, the performance of four associative classifiers was analyzed. The experiments allowed us to determine which sampling algorithms had the best results, as well as their impact on the associative classifiers evaluated. Accordingly, we determined that the Hybrid Associative Classifier with Translation, the Extended Gamma Associative Classifier and the Naive Associative Classifier do not improve their performance when sampling algorithms are used for credit data balancing. On the other hand, the Smallest Normalized Difference Associative Memory classifier benefited from oversampling and hybrid algorithms.
AB - In this paper, an experimental study was carried out to determine the influence of imbalanced dataset preprocessing on the performance of associative classifiers, in order to find better computational solutions to the problem of credit scoring. To do this, six undersampling algorithms, six oversampling algorithms and four hybrid algorithms were evaluated on 13 imbalanced credit scoring datasets. Then, the performance of four associative classifiers was analyzed. The experiments allowed us to determine which sampling algorithms had the best results, as well as their impact on the associative classifiers evaluated. Accordingly, we determined that the Hybrid Associative Classifier with Translation, the Extended Gamma Associative Classifier and the Naive Associative Classifier do not improve their performance when sampling algorithms are used for credit data balancing. On the other hand, the Smallest Normalized Difference Associative Memory classifier benefited from oversampling and hybrid algorithms.
KW - Associative classifiers
KW - Credit scoring
KW - Imbalanced datasets
UR - http://www.scopus.com/inward/record.url?scp=85084641659&partnerID=8YFLogxK
U2 - 10.3390/APP10082779
DO - 10.3390/APP10082779
M3 - Article
AN - SCOPUS:85084641659
SN - 2076-3417
VL - 10
JO - Applied Sciences (Switzerland)
JF - Applied Sciences (Switzerland)
IS - 8
M1 - 2779
ER -