Investigating Associative Classification for Software Fault Prediction: An Experimental Perspective

International Journal of Software Engineering and Knowledge Engineering, Vol. 24, No. 1, pp. 61-90, 2014

30 Pages Posted: 19 Oct 2014

Baojun Ma

School of Business and Management, Shanghai International Studies University

Huaping Zhang

Beijing Institute of Technology - School of Computer Science & Technology

Guoqing Chen

Tsinghua University - School of Economics & Management

Yanping Zhao

Beijing Institute of Technology - School of Management and Economics

Bart Baesens

KU Leuven - Faculty of Business and Economics (FEB)

Date Written: February 1, 2014

Abstract

It is a recurrent finding that software development is often troubled by considerable delays and budget overruns. Several solutions have been proposed in response, software fault prediction being a prime example. Drawing upon machine learning techniques, software fault prediction tries to identify up front the software modules that are most likely to contain faults, thereby streamlining testing efforts and improving overall software quality. When deploying fault prediction models in a production environment, both prediction performance and model comprehensibility are typically taken into consideration, although the latter is commonly overlooked in the academic literature. Many classification methods have been suggested for fault prediction, yet associative classification methods remain uninvestigated in this context. This paper proposes an associative classification (AC)-based fault prediction method, building upon the CBA2 algorithm. In an empirical comparison on twelve real-world datasets, the AC-based classifier is shown to achieve predictive performance competitive with that of models induced by five other tree/rule-based classification techniques. Our findings also show that the AC-based models are comprehensible while achieving similar prediction performance. Furthermore, the possibilities of cross-project prediction are investigated, strengthening earlier findings on the feasibility of such an approach when insufficient data on the target project is available.

Keywords: software fault prediction, associative classification, prediction performance, comprehensibility, cross project validation

Suggested Citation

Ma, Baojun and Zhang, Huaping and Chen, Guoqing and Zhao, Yanping and Baesens, Bart, Investigating Associative Classification for Software Fault Prediction: An Experimental Perspective (February 1, 2014). International Journal of Software Engineering and Knowledge Engineering, Vol. 24, No. 1, pp. 61-90, 2014, Available at SSRN: https://ssrn.com/abstract=2511609

Baojun Ma

School of Business and Management, Shanghai International Studies University

1550 Wen Xiang Rd.
Songjiang District
Shanghai, Shanghai 201620
China

HOME PAGE: http://baojunma.com/index_en.html

Huaping Zhang

Beijing Institute of Technology - School of Computer Science & Technology

5 South Zhongguancun street
Center for Energy and Environmental Policy Researc
Beijing, Haidian District 100081
China

Guoqing Chen (Contact Author)

Tsinghua University - School of Economics & Management

Beijing, 100084
China
+86-10-62789925 (Phone)
+86-10-62789925 (Fax)

HOME PAGE: http://www.sem.tsinghua.edu.cn/en/chengq

Yanping Zhao

Beijing Institute of Technology - School of Management and Economics

5 South Zhongguancun street
Center for Energy and Environmental Policy Researc
Beijing, Haidian District 100081
China

Bart Baesens

KU Leuven - Faculty of Business and Economics (FEB)

Naamsestraat 69
Leuven, B-3000
Belgium
