Fair and Transparent Underwriting: Advanced AI Models for Thin-File Vehicle Insurance
DOI:
https://doi.org/10.63282/3050-922X.IJERET-V7I1P104
Keywords:
Insurance, Vehicle Insurance, Car Insurance Claim Prediction Dataset, Artificial Intelligence, Machine Learning, Hybrid Model CNN-GRU, Random Forest, Naïve Bayes, MLP
Abstract
The insurance industry struggles to predict insurance claims accurately, yet this prediction is crucial for running operations smoothly, preventing fraud, and setting appropriate premiums. Traditional manual claim evaluation is slow, subjective, and unable to handle the large volume of policyholder data effectively. This work presents a machine learning (ML) framework for vehicle insurance claim prediction that addresses the core problem of estimating claim probability from policyholder demographics and vehicle characteristics. The pipeline comprises data preprocessing (cleaning, one-hot encoding, and outlier removal), feature engineering, and SMOTE-based mitigation of severe class imbalance: no-claim instances account for 93.6% of the data, while claim cases make up only 6.4%. A hybrid CNN-GRU model, which combines convolutional neural networks for spatial feature extraction with gated recurrent units for learning sequential patterns, was trained and evaluated on several performance criteria. The proposed model was compared against Random Forest (86.77%), Naive Bayes (95.28%), and MLP (76%) baselines. CNN-GRU outperformed these methods, attaining an accuracy of 98.34%, precision of 98.45%, recall of 99.21%, F1-score of 98.56%, and AUC of 1.00. SHAP analysis showed that car age, policyholder demographics, and vehicle specifications are the main factors contributing to the prediction. These results indicate that hybrid deep learning (DL) architectures are robust, scalable solutions for real-time insurance claim prediction systems.
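The class-imbalance step mentioned in the abstract can be illustrated with a minimal sketch of SMOTE-style oversampling. It is not the authors' implementation: the function name `smote_like`, the two-feature toy data, and the choice of `k=3` neighbours are assumptions made for illustration only. The sketch creates each synthetic claim row by interpolating between a real minority row and one of its nearest minority neighbours.

```python
import random
from math import dist


def smote_like(minority, n_new, k=3, seed=0):
    """Return n_new synthetic rows interpolated between minority neighbours.

    Hypothetical helper for illustration; needs at least two minority rows.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the chosen row, excluding the row itself
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # random position on the segment base -> other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


# Toy data (assumed): 936 no-claim and 64 claim rows, mirroring the
# 93.6% / 6.4% split reported in the abstract.
data_rng = random.Random(42)
no_claim = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(936)]
claim = [(data_rng.gauss(2, 1), data_rng.gauss(2, 1)) for _ in range(64)]

# Oversample the claim class until both classes have the same size.
synthetic = smote_like(claim, n_new=len(no_claim) - len(claim))
balanced_claim = claim + synthetic
```

In practice, oversampling of this kind is usually applied to the training split only, so that synthetic rows do not leak into the data used to compute accuracy, precision, recall, and F1-score.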