A Real-Time Enterprise Application Architecture for High-Volume Data Processing with Integrated Master Data Management

Authors

  • Divya Sai Jaladi, Application Developer, South Carolina Department of Motor Vehicles, USA
  • Ashok Mallempati, Software Engineer, Kemper Corporation, Chicago, IL, USA

DOI:

https://doi.org/10.63282/3050-922X.IJERET-V6I1P116

Keywords:

Real-Time Processing, Enterprise Architecture, Master Data Management, Distributed Systems, Stream Processing, Data Governance, Microservices, Data Integration, High-Volume Data, Event-Driven Architecture

Abstract

Digital transformation, the Internet of Things (IoT), cloud computing, and real-time customer interactions have driven exponential growth in enterprise data, making it harder than ever to handle, process, and keep data consistent across large systems. Companies must manage data of high volume, velocity, and variety while maintaining accuracy, governance, and consistency through well-designed Master Data Management (MDM) systems. Traditional batch-based architectures and siloed data management systems cannot support real-time decision-making, operational flexibility, or regulatory compliance. This paper proposes a Real-Time Enterprise Application Architecture (RTEAA) that processes high data volumes while integrating MDM principles. The architecture builds on distributed computing, event-driven processing, microservices design, and scalable data pipelines to support real-time analytics and data synchronization. With MDM integrated, key business entities such as customers, products, and suppliers have a single source of truth across enterprise systems. The proposed architecture uses stream processing engines, distributed messaging systems, in-memory databases, and cloud-native infrastructure. Data governance, metadata management, and data quality enforcement controls are built into the processing pipeline. The architecture also supports hybrid deployment models, so businesses can run on premises and in the cloud without degrading performance or data quality. A key contribution of this work is a synchronized data orchestration layer that connects real-time data ingestion with MDM validation and enrichment processes.
This layer ensures that data flowing through the system is continuously validated, de-duplicated, and aligned with master data standards. The architecture also uses machine learning-driven anomaly detection to improve data quality and flag anomalies in real time. The methodology details the system design, the implementation plan, and testing under simulated enterprise workloads. The effectiveness of the proposed approach is demonstrated through four performance metrics: throughput, latency, scalability, and data consistency. The results show substantial improvements in processing efficiency, latency, and data reliability over traditional architectures. The study contributes to the field of enterprise systems by providing a detailed model for integrating real-time data processing with master data governance. By enabling operational excellence, better decision-making, and compliance with data regulations, the proposed architecture is applicable to the finance, healthcare, retail, and manufacturing industries.
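As an illustration only, the validate, de-duplicate, and match-to-master flow described for the orchestration layer might be sketched as follows. This is not the authors' implementation: the class name `OrchestrationLayer`, the event fields (`event_id`, `customer_id`, `amount`), and the in-memory master store are all assumptions made for the example, and a real deployment would sit behind a messaging system and a persistent MDM hub.

```python
from dataclasses import dataclass, field


@dataclass
class OrchestrationLayer:
    """Hypothetical sketch of a layer between ingestion and MDM checks."""
    master: dict                                   # master records keyed by customer id (assumed shape)
    seen: set = field(default_factory=set)         # ids of events already accepted
    rejected: list = field(default_factory=list)   # events that failed a check

    def process(self, event: dict):
        """Return the enriched event, or None if it is rejected or a duplicate."""
        # Validation: the required fields are an assumption of this sketch.
        if not all(k in event for k in ("event_id", "customer_id", "amount")):
            self.rejected.append(event)
            return None
        # De-duplication: drop events whose id was already accepted.
        if event["event_id"] in self.seen:
            return None
        # Master data match: the event must reference a known master record.
        golden = self.master.get(event["customer_id"])
        if golden is None:
            self.rejected.append(event)
            return None
        self.seen.add(event["event_id"])
        # Enrichment: attach the master-record name to the outgoing event.
        return {**event, "customer_name": golden["name"]}


master = {"C1": {"name": "Acme Corp"}}
layer = OrchestrationLayer(master)
events = [
    {"event_id": "e1", "customer_id": "C1", "amount": 10.0},
    {"event_id": "e1", "customer_id": "C1", "amount": 10.0},  # duplicate of e1
    {"event_id": "e2", "customer_id": "C9", "amount": 5.0},   # unknown master key
]
accepted = [out for e in events if (out := layer.process(e)) is not None]
```

In this sketch only the first event is accepted and enriched; the duplicate is dropped silently, and the event with an unknown customer id is collected in `rejected` for later review. Keeping rejected events rather than discarding them is one possible design choice for supporting data stewardship workflows.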

Published

2025-03-25

Section

Articles

How to Cite

Jaladi DS, Mallempati A. A Real-Time Enterprise Application Architecture for High-Volume Data Processing with Integrated Master Data Management. IJERET [Internet]. 2025 Mar. 25 [cited 2026 Apr. 15];6(1):126-34. Available from: https://ijeret.org/index.php/ijeret/article/view/538