TY - JOUR
T1 - Transformed k-nearest neighborhood output distance minimization for predicting the defect density of software projects
AU - López-Martín, Cuauhtémoc
AU - Villuendas-Rey, Yenny
AU - Azzeh, Mohammad
AU - Bou Nassif, Ali
AU - Banitaan, Shadi
N1 - Publisher Copyright: © 2020 Elsevier Inc.
PY - 2020/9
Y1 - 2020/9
N2 - Background: Software defect prediction is one of the most important research topics in software engineering. An important product measure for determining the effectiveness of software processes is the defect density (DD). Case-based reasoning (CBR) has been the prediction technique most widely applied in the software prediction field. CBR uses a k-nearest neighborhood to find the number (k) of similar software projects involved in the prediction process. Objective: To propose the application of a transformed k-nearest neighborhood output distance minimization (TkDM) algorithm to predict the DD of software projects, and to compare its prediction accuracy with that obtained from statistical regression, support vector regression, and neural networks. Method: Data sets were obtained from the ISBSG release 2018. A leave-one-out cross-validation method was performed. The absolute residual was used as the prediction accuracy criterion for the models. Results: Statistical significance tests among models showed that the TkDM had better prediction accuracy than statistical regression, support vector regression, and neural networks. Conclusions: A TkDM can be used for predicting the DD of new and enhanced software projects developed and coded in specific platforms and programming language types.
AB - Background: Software defect prediction is one of the most important research topics in software engineering. An important product measure for determining the effectiveness of software processes is the defect density (DD). Case-based reasoning (CBR) has been the prediction technique most widely applied in the software prediction field. CBR uses a k-nearest neighborhood to find the number (k) of similar software projects involved in the prediction process. Objective: To propose the application of a transformed k-nearest neighborhood output distance minimization (TkDM) algorithm to predict the DD of software projects, and to compare its prediction accuracy with that obtained from statistical regression, support vector regression, and neural networks. Method: Data sets were obtained from the ISBSG release 2018. A leave-one-out cross-validation method was performed. The absolute residual was used as the prediction accuracy criterion for the models. Results: Statistical significance tests among models showed that the TkDM had better prediction accuracy than statistical regression, support vector regression, and neural networks. Conclusions: A TkDM can be used for predicting the DD of new and enhanced software projects developed and coded in specific platforms and programming language types.
KW - Case-based reasoning
KW - ISBSG
KW - Neural networks
KW - Software defect density prediction
KW - Support vector regression
KW - Transformed k-nearest neighborhood output distance minimization
UR - http://www.scopus.com/inward/record.url?scp=85084173623&partnerID=8YFLogxK
U2 - 10.1016/j.jss.2020.110592
DO - 10.1016/j.jss.2020.110592
M3 - Article
SN - 0164-1212
VL - 167
JO - Journal of Systems and Software
JF - Journal of Systems and Software
M1 - 110592
ER -