Correlation Assessment of the Performance of Associative Classifiers on Credit Datasets Based on Data Complexity Measures

Francisco J. Camacho-Urriolagoitia; Yenny Villuendas-Rey; Itzamá López-Yáñez; Oscar Camacho-Nieto; Cornelio Yáñez-Márquez

doi:10.3390/math10091460

Research output: Contribution to journal › Article › peer-review

3 Scopus citations

Abstract

One of the four basic machine learning tasks is pattern classification. The selection of the proper learning algorithm for a given problem is a challenging task, formally known as the algorithm selection problem (ASP). In particular, we are interested in the behavior of the associative classifiers derived from Alpha-Beta models applied to the financial field. In this paper, the behavior of four associative classifiers was studied: the One-Hot version of the Hybrid Associative Classifier with Translation (CHAT-OHM), the Extended Gamma (EG), the Naïve Associative Classifier (NAC), and the Assisted Classification for Imbalanced Datasets (ACID). To establish the performance, we used the area under the curve (AUC), F-score, and geometric mean measures. The four classifiers were applied over 11 datasets from the financial area. Then, the performance of each one was analyzed, considering their correlation with the measures of data complexity, corresponding to six categories based on specific aspects of the datasets: feature, linearity, neighborhood, network, dimensionality, and class imbalance. The correlations that arise between the measures of complexity of the datasets and the measures of performance of the associative classifiers are established; these results are expressed with Spearman’s Rho coefficient. The experimental results correctly indicated correlations between data complexity measures and the performance of the associative classifiers.
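The correlation step described in the abstract can be sketched as follows. This is only a minimal illustration of Spearman's Rho, not the authors' code: the function names `average_ranks` and `spearman_rho` are hypothetical, and the five complexity and AUC values are made-up placeholders rather than results from the paper.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's Rho: the Pearson correlation of the rank-transformed data."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("Rho is undefined when one variable is constant")
    return cov / (var_x * var_y) ** 0.5


# Hypothetical placeholder values: one complexity measure and one AUC score
# per dataset (NOT taken from the paper).
complexity = [0.10, 0.25, 0.40, 0.55, 0.70]
auc = [0.95, 0.90, 0.80, 0.85, 0.60]

print(round(spearman_rho(complexity, auc), 3))
```

A value near -1 would suggest that the classifier's performance tends to drop as the complexity measure grows, and a value near +1 the opposite; with only a handful of datasets, such coefficients should be read cautiously.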

Original language	English
Article number	1460
Journal	Mathematics
Volume	10
Issue number	9
DOIs	https://doi.org/10.3390/math10091460
State	Published - 1 May 2022

Keywords

associative classification
finances
meta-learning
supervised classification

Cite this

@article{22a87b0f680e49beba5408d6b8043e04,

title = "Correlation Assessment of the Performance of Associative Classifiers on Credit Datasets Based on Data Complexity Measures",

abstract = "One of the four basic machine learning tasks is pattern classification. The selection of the proper learning algorithm for a given problem is a challenging task, formally known as the algorithm selection problem (ASP). In particular, we are interested in the behavior of the associative classifiers derived from Alpha-Beta models applied to the financial field. In this paper, the behavior of four associative classifiers was studied: the One-Hot version of the Hybrid Associative Classifier with Translation (CHAT-OHM), the Extended Gamma (EG), the Na{\"i}ve Associative Classifier (NAC), and the Assisted Classification for Imbalanced Datasets (ACID). To establish the performance, we used the area under the curve (AUC), F-score, and geometric mean measures. The four classifiers were applied over 11 datasets from the financial area. Then, the performance of each one was analyzed, considering their correlation with the measures of data complexity, corresponding to six categories based on specific aspects of the datasets: feature, linearity, neighborhood, network, dimensionality, and class imbalance. The correlations that arise between the measures of complexity of the datasets and the measures of performance of the associative classifiers are established; these results are expressed with Spearman{\textquoteright}s Rho coefficient. The experimental results correctly indicated correlations between data complexity measures and the performance of the associative classifiers.",

keywords = "associative classification, finances, meta-learning, supervised classification",

author = "Camacho-Urriolagoitia, {Francisco J.} and Yenny Villuendas-Rey and Itzam{\'a} L{\'o}pez-Y{\'a}{\~n}ez and Oscar Camacho-Nieto and Cornelio Y{\'a}{\~n}ez-M{\'a}rquez",

note = "Publisher Copyright: {\textcopyright} 2022 by the authors. Licensee MDPI, Basel, Switzerland.",

year = "2022",

month = may,

day = "1",

doi = "10.3390/math10091460",

language = "English",

volume = "10",

journal = "Mathematics",

issn = "2227-7390",

number = "9",

}

TY - JOUR

T1 - Correlation Assessment of the Performance of Associative Classifiers on Credit Datasets Based on Data Complexity Measures

AU - Camacho-Urriolagoitia, Francisco J.

AU - Villuendas-Rey, Yenny

AU - López-Yáñez, Itzamá

AU - Camacho-Nieto, Oscar

AU - Yáñez-Márquez, Cornelio

PY - 2022/5/1

Y1 - 2022/5/1

N2 - One of the four basic machine learning tasks is pattern classification. The selection of the proper learning algorithm for a given problem is a challenging task, formally known as the algorithm selection problem (ASP). In particular, we are interested in the behavior of the associative classifiers derived from Alpha-Beta models applied to the financial field. In this paper, the behavior of four associative classifiers was studied: the One-Hot version of the Hybrid Associative Classifier with Translation (CHAT-OHM), the Extended Gamma (EG), the Naïve Associative Classifier (NAC), and the Assisted Classification for Imbalanced Datasets (ACID). To establish the performance, we used the area under the curve (AUC), F-score, and geometric mean measures. The four classifiers were applied over 11 datasets from the financial area. Then, the performance of each one was analyzed, considering their correlation with the measures of data complexity, corresponding to six categories based on specific aspects of the datasets: feature, linearity, neighborhood, network, dimensionality, and class imbalance. The correlations that arise between the measures of complexity of the datasets and the measures of performance of the associative classifiers are established; these results are expressed with Spearman’s Rho coefficient. The experimental results correctly indicated correlations between data complexity measures and the performance of the associative classifiers.

KW - associative classification

KW - finances

KW - meta-learning

KW - supervised classification

UR - http://www.scopus.com/inward/record.url?scp=85129770812&partnerID=8YFLogxK

U2 - 10.3390/math10091460

DO - 10.3390/math10091460

M3 - Article

AN - SCOPUS:85129770812

SN - 2227-7390

VL - 10

JO - Mathematics

JF - Mathematics

IS - 9

M1 - 1460

ER -
