Dynamic Loss Function Tuning via Meta-Gradient Search

Authors

  • Sai Prasad Veluru, Software Engineer at Apple, USA

DOI:

https://doi.org/10.63282/3050-922X.IJERET-V5I2P103

Keywords:

Meta-learning, loss function optimization, dynamic tuning, gradient-based search, deep learning, meta-gradients, neural networks, hyperparameter optimization, adaptive learning, machine learning robustness

Abstract

Loss functions are fundamental to the learning process of machine learning models: they guide optimization by measuring how far predictions deviate from actual outcomes. Historically, these functions are predefined and held fixed throughout training, which limits a model's ability to adapt to changing data dynamics or learning phases. This work addresses that limitation with a dynamic approach that adjusts the loss function during training using a meta-gradient search technique. Our method uses meta-gradients to update the parameters of the loss function in real time in response to the model's performance. The central idea is to improve not only the model but also the objective it learns from, yielding a more flexible and tailored learning process. We define the meta-gradient approach, in which changes to the loss function are evaluated by an outer optimization loop according to their effect on subsequent model updates. Experiments across several benchmarks, including image classification and sequence prediction tasks, show that dynamic loss tuning produces faster convergence, improved generalization, and greater robustness to noisy data. In many settings, models trained with this adaptive approach outperform those trained with fixed, hand-crafted loss functions. This work highlights the importance of reconsidering a fundamental component of machine learning and offers a feasible path toward automated, context-sensitive objective design. Giving models the ability to learn their own objectives is a step toward truly self-adjusting AI systems capable of independently addressing new challenges.
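
To make the outer-loop mechanism concrete, the following is a minimal sketch of meta-gradient loss tuning, assuming PyTorch 2.x. The specific parameterized loss (cross-entropy blended with a learnable label-smoothing weight), the two-layer model, the learning rates, and the synthetic data are all illustrative assumptions, not the setup used in the paper.

import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.func import functional_call

torch.manual_seed(0)

# Task model whose weights are updated by the inner loop.
model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 5))

# Meta-parameter of the learned loss: sigmoid(alpha) acts as a smoothing weight.
# (This parameterization is a hypothetical stand-in for the paper's learned loss.)
alpha = torch.tensor(0.0, requires_grad=True)
meta_opt = torch.optim.Adam([alpha], lr=1e-2)
inner_lr = 0.1

def learned_loss(logits, targets, alpha):
    # Cross-entropy blended with a uniform (smoothing) term, weighted by sigmoid(alpha).
    smooth = torch.sigmoid(alpha)
    log_p = F.log_softmax(logits, dim=-1)
    nll = F.nll_loss(log_p, targets)
    uniform = -log_p.mean()
    return (1 - smooth) * nll + smooth * uniform

for step in range(100):
    # Synthetic stand-ins for a training batch and a held-out batch.
    x_tr, y_tr = torch.randn(32, 20), torch.randint(0, 5, (32,))
    x_val, y_val = torch.randn(32, 20), torch.randint(0, 5, (32,))

    # Inner step: one SGD update of the model under the current learned loss,
    # keeping the graph (create_graph=True) so gradients can flow back to alpha.
    params = dict(model.named_parameters())
    train_loss = learned_loss(functional_call(model, params, (x_tr,)), y_tr, alpha)
    grads = torch.autograd.grad(train_loss, tuple(params.values()), create_graph=True)
    updated = {name: p - inner_lr * g
               for (name, p), g in zip(params.items(), grads)}

    # Outer step: score the *updated* model on held-out data with a plain task
    # loss, then move alpha along the resulting meta-gradient.
    val_loss = F.cross_entropy(functional_call(model, updated, (x_val,)), y_val)
    meta_opt.zero_grad()
    val_loss.backward()
    meta_opt.step()

    # Commit the inner update to the actual model weights.
    with torch.no_grad():
        for name, p in model.named_parameters():
            p.copy_(updated[name].detach())

    if step % 20 == 0:
        print(f"step {step}: val_loss={val_loss.item():.3f}, "
              f"smoothing={torch.sigmoid(alpha).item():.3f}")

Unrolling a single differentiable inner step is the simplest instantiation of the outer loop described above; unrolling several model updates before differentiating would lengthen the meta-horizon at the cost of memory.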

Published

2024-04-25

Issue

Vol. 5 No. 2 (2024)

Section

Articles

How to Cite

1. Veluru SP. Dynamic Loss Function Tuning via Meta-Gradient Search. IJERET [Internet]. 2024 Apr. 25 [cited 2025 Sep. 12];5(2):18-27. Available from: https://ijeret.org/index.php/ijeret/article/view/128