GenAI-Powered Test Case Generation for Microservices in CI/CD Pipelines via Trusted Federated Explainability
DOI: https://doi.org/10.63282/3050-922X.IJERET-V1I1P111

Keywords:
Generative AI, Microservices, CI/CD, Federated Learning, Explainable AI, Automated Testing, Test Generation, Trust Metrics, Software Governance

Abstract
Microservices increase delivery velocity by enabling independent deployments, yet they also expand the surface area for regressions across APIs, message flows, and distributed data contracts. At enterprise scale, test assets and operational evidence are often siloed across teams, platforms, and regulatory boundaries, limiting the ability to build a unified learning loop for test generation. Meanwhile, generative AI (GenAI) methods can synthesize test cases from service specifications, code changes, and production telemetry, but their use in CI/CD must satisfy integrity and accountability requirements: generated tests must be explainable, reproducible, and robust against low-quality or malicious contributions. Additionally, teams require practical controls to manage the explainability–performance trade-off: highly interpretable generation strategies can be slower or less effective, while high-performing black-box generation may be difficult to justify in audits. This paper proposes T-FedTest, a GenAI-powered framework for automated test case generation for microservices within CI/CD pipelines, using a trust metric-based federated learning (FL) approach and federated explainability. T-FedTest introduces: (i) a trust metric that quantifies participant integrity and accountability using provenance attestations, update consistency, evaluation reliability, and policy compliance; (ii) a trust-aware federated aggregation protocol that limits poisoning and emphasizes high-accountability contributors; and (iii) an explainability–performance trade-off controller that allocates explanation budgets to generated tests and learning updates, enabling organizations to optimize auditability and runtime effectiveness without complex mathematics. We evaluate T-FedTest using a controlled prototype simulation of multi-team microservice ecosystems with heterogeneous APIs, non-IID fault distributions, and adversarial/faulty participants.
Results show that trust-aware federated learning improves fault-detection effectiveness and reduces harmful regressions compared to standard federated averaging baselines, while moderate explanation budgets preserve stable, actionable rationales with limited performance degradation. We conclude with deployment guidance for integrating trusted GenAI test generation into enterprise CI/CD.
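To make the abstract's two central mechanisms concrete, the following is a minimal sketch of how a participant trust score and trust-aware aggregation could be combined. All names, the component weights, and the trust floor are illustrative assumptions, not values taken from the paper; the four signal components mirror the ones named above (provenance attestations, update consistency, evaluation reliability, policy compliance).

```python
from dataclasses import dataclass

@dataclass
class ParticipantSignals:
    provenance: float         # attestation validity, in [0, 1]
    consistency: float        # agreement of update with peer updates, in [0, 1]
    eval_reliability: float   # reported vs. independently verified metrics, in [0, 1]
    policy_compliance: float  # fraction of governance policy checks passed, in [0, 1]

def trust_score(s: ParticipantSignals,
                weights=(0.3, 0.3, 0.2, 0.2)) -> float:
    """Weighted combination of the four integrity/accountability signals.
    The weights are illustrative, not specified by the paper."""
    parts = (s.provenance, s.consistency, s.eval_reliability, s.policy_compliance)
    return sum(w * p for w, p in zip(weights, parts))

def trust_weighted_average(updates, scores, floor=0.5):
    """FedAvg variant: discard participants below a trust floor, then
    weight the remaining model updates by their trust scores.
    `updates` is a list of equally sized parameter vectors."""
    kept = [(u, t) for u, t in zip(updates, scores) if t >= floor]
    if not kept:
        raise ValueError("no participant met the trust floor")
    total = sum(t for _, t in kept)
    aggregate = [0.0] * len(kept[0][0])
    for update, trust in kept:
        for i, value in enumerate(update):
            aggregate[i] += (trust / total) * value
    return aggregate

# A poisoned update from a low-trust participant is filtered out
# before averaging, limiting its influence on the global model.
honest_a = [1.0, 0.0]
honest_b = [0.0, 1.0]
poisoned = [100.0, 100.0]
result = trust_weighted_average([honest_a, honest_b, poisoned],
                                scores=[0.8, 0.8, 0.2])
```

With equal trust on the two honest participants and the poisoned contributor below the floor, the aggregate is the plain mean of the honest updates; this is the mechanism by which trust-aware aggregation "limits poisoning and emphasizes high-accountability contributors" in the description above.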