End-to-End MLOps Pipeline on Kubernetes for Scalable Healthcare Applications

Authors

  • Srichandra Boosa, Senior Associate at Vertify and Proinkfluence IT Solutions PVT LTD, India
  • Karthik Allam, Big Data Infrastructure Engineer at JP Morgan and Chase, USA

DOI:

https://doi.org/10.63282/3050-922X.IJERET-V3I1P108

Keywords:

MLOps, Kubernetes, Healthcare Applications, Scalability, Machine Learning, Continuous Integration, Continuous Deployment (CI/CD), Model Deployment, Data Pipelines, Container Orchestration, Fault Tolerance, Predictive Analytics, Kubeflow, MLflow, Airflow, Data Privacy, Regulatory Compliance, Model Monitoring, Automation, Edge Computing

Abstract

The growing use of machine learning (ML) in medicine has increased the need for dependable, automated, and scalable operational workflows, and MLOps has become a crucial practice for bridging the gap between model development and model deployment. MLOps covers every stage of the ML model lifecycle, from data preprocessing through training, deployment, and monitoring, while also meeting healthcare-specific requirements such as regulatory compliance, auditability so that changes can be traced, and continuous improvement through monitoring and feedback loops. These requirements matter in healthcare because model accuracy and reliability directly affect patient outcomes. With its strong container orchestration capabilities, Kubernetes has become the foundation for scalable, fault-tolerant MLOps pipelines. It offers automated scaling, rolling updates, and efficient resource management, which help handle the changing workloads of healthcare applications such as diagnostics, personalised treatment, and predictive analytics. Kubeflow, MLflow, and Airflow are three open-source tools that integrate with Kubernetes. Combined with Kubernetes, they support end-to-end ML pipelines that recover readily from faults, scale with the size of the available training datasets, and connect well with existing ML systems. This paper describes the architecture of a Kubernetes-based MLOps pipeline for healthcare, addresses problems related to data privacy, regulatory compliance, and model interpretability, and presents an example of the advantages of automation, continuous integration (CI), and monitoring. It then outlines the advantages of Kubernetes and points to future directions, including large language model (LLM) adoption, federated learning, and edge computing, all intended to help healthcare meet emerging demand.

Published

2022-03-30

Section

Articles

How to Cite

Boosa S, Allam K. End-to-End MLOps Pipeline on Kubernetes for Scalable Healthcare Applications. IJERET [Internet]. 2022 Mar. 30 [cited 2025 Oct. 28];3(1):74-85. Available from: https://ijeret.org/index.php/ijeret/article/view/236