Machine Learning Applications in Software Engineering

Taghi M. Khoshgoftaar, Member, IEEE, Edward B. Allen, Member, IEEE, and Jianyu Deng
Manuscript received December 29, 1999; revised October 1, 2001 and November 15, 2001. This work was supported in part by a grant from Nortel Networks through the Software Reliability Engineering Department. The findings and opinions in this paper belong solely to the authors, and are not necessarily those of the sponsor. Moreover, our results do not in any way reflect the quality of the sponsor's software products. Responsible Editor: M. A. Vouk.
T. M. Khoshgoftaar is with the Empirical Software Engineering Lab., Department of Computer Science and Engineering, Florida Atlantic University, Boca Raton, FL 33431 USA (e-mail: Taghi@cse.fau.edu).
E. B. Allen is with the Department of Computer Science, Mississippi State University, Mississippi State, MS 39762 USA (e-mail: Edward.Allen@computer.org).
J. Deng is with Motorola Metrowerks Corp., Austin, TX 78758 USA (e-mail: JDeng@metrowerks.com).
Digital Object Identifier 10.1109/TR.2002.804488
Abstract: Software faults are defects in software modules that might cause failures. Software developers tend to focus on faults because faults are closely related to the amount of rework necessary to prevent future operational software failures. The goal of this paper is to predict which modules are fault-prone, and to do so early enough in the life cycle to be useful to developers. A regression tree is an algorithm represented by an abstract tree, where the response variable is a real quantity. Software modules are classified as fault-prone or not, by comparing...