Transformed k-nearest neighborhood output distance minimization for predicting the defect density of software projects

Cuauhtémoc López-Martín, Yenny Villuendas-Rey, Mohammad Azzeh, Ali Bou Nassif, Shadi Banitaan

Research output: Contribution to journal › Article › peer-review

12 Scopus citations

Abstract

Background: Software defect prediction is one of the most important research topics in software engineering. An important product measure for determining the effectiveness of software processes is defect density (DD). Case-based reasoning (CBR) has been the most widely applied prediction technique in the software prediction field. CBR uses a k-nearest neighborhood to find the number (k) of similar software projects selected for the prediction process. Objective: To propose applying a transformed k-nearest neighborhood output distance minimization (TkDM) algorithm to predict the DD of software projects, and to compare its prediction accuracy with that of statistical regression, support vector regression, and neural networks. Method: Data sets were obtained from the ISBSG release 2018. Leave-one-out cross-validation was performed. The absolute residual was used as the prediction accuracy criterion for the models. Results: Statistical significance tests among models showed that TkDM had better prediction accuracy than statistical regression, support vector regression, and neural networks. Conclusions: TkDM can be used to predict the DD of new and enhanced software projects developed and coded on specific platforms and in specific programming language types.
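The evaluation protocol named in the Method (leave-one-out cross-validation scored by absolute residuals) can be illustrated with a minimal sketch. This is not the TkDM algorithm: it assumes a plain k-nearest-neighbor predictor that averages the targets of the k closest projects under Euclidean distance. The function names and the toy data are hypothetical and are not taken from the article or from ISBSG.

```python
import math
from statistics import mean, median


def knn_predict(train, query, k):
    """Mean target of the k training cases closest to `query` (Euclidean).

    Assumption: plain k-NN averaging, used only as a stand-in predictor.
    """
    nearest = sorted(train, key=lambda case: math.dist(case[0], query))[:k]
    return mean(target for _, target in nearest)


def loocv_absolute_residuals(cases, k):
    """Hold out each case once, predict it from the rest, and
    return the list of |actual - predicted| values."""
    residuals = []
    for i, (features, actual) in enumerate(cases):
        train = cases[:i] + cases[i + 1:]  # every case except the i-th
        predicted = knn_predict(train, features, k)
        residuals.append(abs(actual - predicted))
    return residuals


# Made-up toy data: (feature vector, defect density). Not ISBSG values.
projects = [
    ((1.0,), 2.0),
    ((2.5,), 5.0),
    ((3.0,), 6.0),
    ((10.0,), 20.0),
]

residuals = loocv_absolute_residuals(projects, k=1)
print(residuals)          # [3.0, 1.0, 1.0, 14.0]
print(median(residuals))  # 2.0
```

Each project is predicted only from the other projects, so every residual reflects performance on a case the predictor has not seen. Summarizing the residuals by their median, as done here, is one common choice and is not necessarily the summary used in the article.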

Original language: English
Article number: 110592
Journal: Journal of Systems and Software
Volume: 167
State: Published - Sep 2020

Keywords

  • Case-based reasoning
  • ISBSG
  • Neural networks
  • Software defect density prediction
  • Support vector regression
  • Transformed k-nearest neighborhood output distance minimization
