Data-Driven Loan Default Prediction: Enhancing Business Process Workflows with Machine Learning

Authors

  • Pooja Chandrashekar, Independent Researcher

DOI:

https://doi.org/10.63282/3050-922X.IJERET-V6I4P103

Keywords:

Loan Default Prediction, Financial, Business, Machine Learning, Banking

Abstract

Loan default prediction is a core activity for financial institutions because it directly affects risk management, loan approval, and profitability. This study presents a method for forecasting loan defaults on the Lending Club dataset, which comprises 2,260,701 individual loan records from 2007 to 2018. The methodology covers data preprocessing (cleaning, handling missing values, scaling numerical values, and one-hot encoding categorical variables) followed by correlation- and importance-based feature selection. The Synthetic Minority Oversampling Technique (SMOTE) is applied so that default and non-default instances are represented in balanced proportions, mitigating the class imbalance. The data are then split into training and test sets, and ensemble machine learning classifiers, including XGBoost and Light Gradient Boosting Machine (LGBM), are trained to make predictions. Performance is assessed with accuracy (ACC), precision (PRE), recall (REC), F1-score (F1), and ROC-AUC. In the experiments, LGBM reaches 99.99% on all measures, and XGBoost follows closely with 99.96% to 99.98%. Comparison with more standard models, including Neural Networks, Support Vector Machines, and Convolutional Neural Networks, indicates the improved efficacy and generalization capability of the ensemble-based approach. The findings indicate that both LGBM and XGBoost offer a highly reliable and interpretable solution for assessing financial risk, one that identifies likely loan defaults with high accuracy and can support lending decisions.
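
Two of the steps named in the abstract, SMOTE oversampling and the threshold-based evaluation metrics, can be illustrated with a minimal standard-library sketch. This is not the author's pipeline: the function names `smote` and `classification_metrics`, the toy two-feature data, and the choice of `k=3` neighbours are all assumptions made for illustration, and ROC-AUC is left out because it needs predicted scores rather than labels.

```python
import math
import random


def smote(minority, n_new, k=3, seed=0):
    """Sketch of SMOTE's core step: build n_new synthetic points, each lying
    on the segment between a minority sample and one of its k nearest
    minority neighbours. (Hypothetical helper, not the paper's code.)"""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        x = minority[i]
        # Rank the other minority samples by Euclidean distance to x.
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(p, x))[:k]
        nb = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from x to nb
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(x, nb)))
    return synthetic


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for binary labels (1 = default)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Toy data (assumed): 4 defaults against 12 non-defaults, two features each.
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
majority_count = 12

# Oversample the minority class until both classes have the same size.
synthetic = smote(minority, n_new=majority_count - len(minority))
balanced_minority = minority + synthetic

# Hypothetical predictions, used only to exercise the metric function.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
scores = classification_metrics(y_true, y_pred)
```

In practice, oversampling is commonly restricted to the training portion of the data so that synthetic points do not reach the test set; the abstract does not state which order the study used.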

Published

2025-10-09

Section

Articles

How to Cite

Chandrashekar P. Data-Driven Loan Default Prediction: Enhancing Business Process Workflows with Machine Learning. IJERET [Internet]. 2025 Oct. 9 [cited 2025 Dec. 5];6(4):18-26. Available from: https://ijeret.org/index.php/ijeret/article/view/329