TY - GEN
T1 - Mixed data balancing through compact sets based instance selection
AU - Villuendas-Rey, Yenny
AU - García-Lorenzo, María Matilde
PY - 2013
Y1 - 2013
N2 - Learning from datasets with an imbalanced class distribution is an important problem in Pattern Recognition. This paper introduces a novel algorithm for data balancing, based on compact set clustering of the majority class. The proposed algorithm is able to deal with mixed as well as incomplete data, and with arbitrary dissimilarity functions. Numerical experiments over repository databases show the high-quality performance of the proposed method according to area under the ROC curve and imbalance ratio.
AB - Learning from datasets with an imbalanced class distribution is an important problem in Pattern Recognition. This paper introduces a novel algorithm for data balancing, based on compact set clustering of the majority class. The proposed algorithm is able to deal with mixed as well as incomplete data, and with arbitrary dissimilarity functions. Numerical experiments over repository databases show the high-quality performance of the proposed method according to area under the ROC curve and imbalance ratio.
KW - Imbalanced data
KW - Mixed data
KW - Supervised classification
UR - http://www.scopus.com/inward/record.url?scp=84893187003&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-41822-8_32
DO - 10.1007/978-3-642-41822-8_32
M3 - Conference contribution
SN - 9783642418211
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 254
EP - 261
BT - Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications - 18th Iberoamerican Congress, CIARP 2013, Proceedings
T2 - 18th Iberoamerican Congress on Pattern Recognition, CIARP 2013
Y2 - 20 November 2013 through 23 November 2013
ER -