Predictive Performance Tuning

Authors

  • Nagireddy Karri, Senior IT Administrator Database, Sherwin-Williams, USA
  • Partha Sarathi Reddy Pedda Muntala, Software Developer, Cisco Systems, Inc., USA
  • Sandeep Kumar Jangam, Lead Consultant, Infosys Limited, USA

DOI:

https://doi.org/10.63282/3050-922X.IJERET-V2I1P108

Keywords:

Predictive Performance Tuning, Tail Latency, Bayesian Optimization, Reinforcement Learning, Autoscaling, Observability, Concept Drift

Abstract

Predictive performance tuning shifts systems engineering from reactive firefighting to anticipatory control: it forecasts load, contention, and tail latency, and acts before SLOs are violated. By 2021, the maturation of observability (high-cardinality metrics, distributed tracing, eBPF profiling) and cloud orchestration (containers, service meshes, autoscaling) had made closed-loop pipelines feasible, that is, pipelines that learn from telemetry, choose configurations, and roll out changes behind safety guardrails. This paper synthesizes work across three layers: application (query planning, caching, concurrency), runtime (thread pools, GC/JIT), and infrastructure (autoscaling targets, placement, I/O limits). It describes a reference architecture that ingests and authenticates telemetry, engineers features, forecasts workload and performance (e.g., p95/p99 latency), and feeds those forecasts into decision policies based on Bayesian optimization, bandits, and reinforcement learning. Comparative trials show reductions in SLO-miss rates and tail latency relative to static or threshold heuristics, together with gains in resource efficiency and cost per request. The paper examines failure modes (concept drift, actuator lag, and interference) as well as governance requirements such as canaries, rollbacks, and auditability. Finally, it proposes a way forward that combines model-based surrogates (queueing and control models) with learned policies for safer exploration and better portability, and that incorporates carbon and energy signals into multi-objective control. Taken together, predictive tuning emerges as a practical playbook for running cloud-native systems under strict SLOs while trading off cost, risk, and sustainability.
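
As a rough illustration of the forecast-then-decide loop summarized in the abstract, the sketch below forecasts request rate, predicts p99 latency with a simple queueing surrogate, and picks a replica count under a step-size guardrail. Everything specific here is an assumption made for illustration and is not taken from the paper: the use of Holt's linear smoothing as the forecaster, the M/M/1 model per replica (whose sojourn time is exponential with rate mu minus lambda), and the names `forecast_rate`, `predicted_p99`, and `choose_replicas` with their parameters.

```python
import math


def forecast_rate(history, alpha=0.5, beta=0.3, horizon=1):
    """Holt's linear smoothing forecast of request rate (assumed forecaster)."""
    level, trend = history[0], history[1] - history[0]
    for x in history[1:]:
        prev_level = level
        level = alpha * x + (1 - alpha) * (level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return level + horizon * trend


def predicted_p99(rate, replicas, mu):
    """p99 sojourn time for an M/M/1 queue per replica, load split evenly.

    mu is the assumed service rate of one replica (requests per second).
    """
    per_replica = rate / replicas
    if per_replica >= mu:
        return math.inf  # unstable queue: latency grows without bound
    return math.log(1 / (1 - 0.99)) / (mu - per_replica)


def choose_replicas(rate, mu, slo_s, current, max_step=2, max_replicas=50):
    """Smallest replica count whose predicted p99 meets the SLO.

    Guardrail: the count moves by at most max_step per decision.
    """
    target = max_replicas
    for n in range(1, max_replicas + 1):
        if predicted_p99(rate, n, mu) <= slo_s:
            target = n
            break
    bounded = max(current - max_step, min(current + max_step, target))
    return max(1, bounded)


if __name__ == "__main__":
    history = [100, 120, 140, 160]  # hypothetical requests/second samples
    rate = forecast_rate(history)
    print(rate, choose_replicas(rate, mu=50, slo_s=0.5, current=4))
```

The step bound stands in for the safety guardrails the abstract mentions: even when the surrogate asks for a large jump, the actuator changes capacity gradually, which limits the damage from a bad forecast.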

Published

2025-10-23

Section

Articles

How to Cite

1. Karri N, Pedda Muntala PSR, Jangam SK. Predictive Performance Tuning. IJERET [Internet]. 2025 Oct. 23 [cited 2025 Oct. 28];2(1):67-76. Available from: https://ijeret.org/index.php/ijeret/article/view/311