Integrating ETL with other data management approaches
DOI:
https://doi.org/10.63282/3050-922X.IJERET-V6I2P114
Keywords:
ETL (Extract, Transform, Load), Data Warehousing, Data Lakes, Master Data Management (MDM), Data Governance
Abstract
ETL is a key element of the modern data ecosystem, ensuring reliable integration and transformation of data from multiple sources. Although ETL has traditionally been tied to data warehousing, it has taken on additional roles with the rise of big data, real-time analytics, and regulatory requirements. This paper discusses the integration of ETL with five mainstream data management approaches: data warehousing, data lakes, master data management (MDM), data governance, and data virtualization. By examining synergies, challenges, and emerging trends, the study shows how the role of ETL is changing as it helps organizations achieve data consistency, quality, and strategic value.