A Unified Ensemble–Deep Learning Framework for Software Defect Prediction: Boosting/Voting Meets CNN–RNN Modeling

Yash Khanna; Meera Joshi; Nikhil Sood

doi:10.63282/3050-922X.IJERET-V7I1P117

Authors

Yash Khanna, Computer Science Department, Ashoka University, Sonipat, Haryana, India.
Meera Joshi, Artificial Intelligence Department, Ashoka University, Sonipat, Haryana, India.
Nikhil Sood, Computer Science Department, Ashoka University, Sonipat, Haryana, India.

Keywords:

Software Defect Prediction, Software Quality Assurance, Ensemble Learning, Boosting, Voting, Convolutional Neural Networks, Recurrent Neural Networks, LSTM, GRU, Probability Calibration, MLOps, Explainable AI

Abstract

Software defect prediction remains a cornerstone of modern software reliability engineering, enabling teams to proactively allocate testing effort, prioritize code reviews, and mitigate operational risk. Despite decades of progress, two persistent limitations motivate renewed attention: (i) conventional metric-based predictors often struggle under dataset shift, noisy labels, and evolving development practices, and (ii) deep learning approaches that ingest code tokens or learned representations may be data-hungry, difficult to calibrate, and challenging to operationalize in enterprise pipelines that require transparency and governance. This manuscript presents a unified framework that explicitly merges (a) ensemble machine learning, via boosting and voting across heterogeneous metric learners, with (b) a CNN–RNN representation learner designed to capture local and sequential defect cues. The proposed framework introduces a reliability-aware fusion layer that calibrates probabilities, quantifies uncertainty, and performs cost-sensitive thresholding to align predictions with quality-of-service objectives. Beyond modeling, the manuscript provides an end-to-end blueprint for integrating defect prediction into cloud-native delivery workflows, including feature lineage, monitoring hooks, and explainability artifacts suitable for high-stakes environments. A worked example illustrates how boosted metric learners and CNN–RNN predictors can be combined through weighted soft voting and stacking to produce stable risk scores. The framework is designed to support both within-project and cross-project evaluation regimes and to remain robust when codebases undergo modernization (e.g., monolith-to-microservices migrations) or platform transitions (e.g., OpenShift adoption).
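The fusion idea named in the abstract (weighted soft voting followed by cost-sensitive thresholding) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, model weights, per-module probabilities, and misclassification costs below are all hypothetical, and the inputs are assumed to be already-calibrated defect probabilities.

```python
from typing import List, Sequence


def weighted_soft_vote(prob_sets: Sequence[Sequence[float]],
                       weights: Sequence[float]) -> List[float]:
    """Fuse per-model defect probabilities with a normalized weighted average."""
    if len(prob_sets) != len(weights):
        raise ValueError("one weight per model is required")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    n = len(prob_sets[0])
    if any(len(p) != n for p in prob_sets):
        raise ValueError("all models must score the same modules")
    return [
        sum(w * p[i] for w, p in zip(weights, prob_sets)) / total
        for i in range(n)
    ]


def cost_sensitive_threshold(cost_fp: float, cost_fn: float) -> float:
    """Decision threshold that minimizes expected cost for calibrated
    probabilities: flag a module when p >= cost_fp / (cost_fp + cost_fn)."""
    return cost_fp / (cost_fp + cost_fn)


# Hypothetical probabilities for three modules (illustrative values only).
boosted_probs = [0.9, 0.2, 0.4]   # assumed output of a boosted metric learner
cnn_rnn_probs = [0.7, 0.1, 0.2]   # assumed output of a CNN-RNN predictor

# Hypothetical weights favoring the boosted learner.
fused = weighted_soft_vote([boosted_probs, cnn_rnn_probs], weights=[0.6, 0.4])

# Assume a missed defect costs four times as much as a false alarm.
threshold = cost_sensitive_threshold(cost_fp=1.0, cost_fn=4.0)
flags = [p >= threshold for p in fused]
```

With these assumed numbers, a higher false-negative cost lowers the threshold to 0.2, so the third module is flagged even though its fused probability is well below 0.5. Stacking would replace the fixed weights with a meta-learner trained on held-out predictions.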
