Machine Learning Applications In Software Engineering

Krishnamoorthy Srinivasan and Douglas Fisher, Member, IEEE
Manuscript received October 1992; revised October 1993 and October 1994. Recommended by D. Wile. D. Fisher's work was supported by NASA Ames Grant NAG 2-834.
K. Srinivasan is with Personal Computer Consultants, Inc., Washington, D.C.
D. Fisher is with the Department of Computer Science, Vanderbilt University, Nashville, Tennessee (e-mail: dfisher@vuse.vanderbilt.edu).
IEEE Log Number 9408517.
Abstract: Accurate estimation of software development effort is critical in software engineering. Underestimates lead to time pressures that may compromise full functional development and thorough testing of software. In contrast, overestimates can result in noncompetitive contract bids and/or overallocation of development resources and personnel. As a result, many models for estimating software development effort have been proposed. This article describes two methods of machine learning, which we use to build estimators of software development effort from historical data. Our experiments indicate that these techniques are competitive with traditional estimators on one dataset, but also illustrate that these methods are sensitive to the data on which they are trained. This cautionary note applies to any model-construction strategy that relies on historical data. All such models for software effort estimation should be evaluated by exploring model sensitivity on a variety of historical data.
Index Terms: Software development effort, machine learning, decision trees, regression trees, and neural networks.
ACCURATE estimation of software development effort has major implications for the management of software development. If management's estimate is too low, then the...