TY - JOUR
T1 - Hybrid data selection with preservation rough sets
AU - Villuendas-Rey, Yenny
N1 - Publisher Copyright: © 2022, The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature.
PY - 2022/11
Y1 - 2022/11
N2 - The nearest neighbor classifier is one of the simplest yet accurate decision-making algorithms. However, it suffers in the presence of noisy or redundant data. This article addresses the instance selection problem to improve lazy learners in hybrid and incomplete datasets. It introduces the Preservation Rough Set (PRS) model, which can deal with hybrid (numeric and categorical) and incomplete decision systems. The properties of PRS are demonstrated by theorems, and its capabilities are shown by means of an original instance selection algorithm that determines which instances are relevant and which are not, in order to improve decision-making. The numerical experiments indicate that the proposed algorithm is competitive and leads to highly accurate decision-making for the nearest neighbor classifier, the voting algorithm, and the Naïve Associative Classifier. In addition, the experiments show the proposal's ability to deal with noisy datasets.
AB - The nearest neighbor classifier is one of the simplest yet accurate decision-making algorithms. However, it suffers in the presence of noisy or redundant data. This article addresses the instance selection problem to improve lazy learners in hybrid and incomplete datasets. It introduces the Preservation Rough Set (PRS) model, which can deal with hybrid (numeric and categorical) and incomplete decision systems. The properties of PRS are demonstrated by theorems, and its capabilities are shown by means of an original instance selection algorithm that determines which instances are relevant and which are not, in order to improve decision-making. The numerical experiments indicate that the proposed algorithm is competitive and leads to highly accurate decision-making for the nearest neighbor classifier, the voting algorithm, and the Naïve Associative Classifier. In addition, the experiments show the proposal's ability to deal with noisy datasets.
KW - Decision-making
KW - Hybrid and incomplete data
KW - Instance selection
KW - Rough sets
UR - http://www.scopus.com/inward/record.url?scp=85136980389&partnerID=8YFLogxK
U2 - 10.1007/s00500-022-07439-4
DO - 10.1007/s00500-022-07439-4
M3 - Article
AN - SCOPUS:85136980389
SN - 1432-7643
VL - 26
SP - 11197
EP - 11223
JO - Soft Computing
JF - Soft Computing
IS - 21
ER -