ETL: From Design to Deployment

Authors

  • Sandeep Kumar, Oracle Corporation, Singapore

DOI:

https://doi.org/10.63282/3050-922X.IJERET-V2I2P102

Keywords:

ETL, Data Transformation, Data Integration, Data Loading, Data Pipelines, Deployment, Data Quality, Scalability, Performance Optimization

Abstract

The ETL (Extract, Transform, Load) process is critical for organizations that want to use data from multiple sources for analytics and decision-making. This article addresses common ETL challenges, such as data quality, scalability, and performance, and proposes solutions based on robust ETL architectures, advanced tools, and best practices. Key contributions include a comprehensive overview of ETL processes, a detailed examination of tools and technologies, and insights into deployment strategies that enhance efficiency and reliability.

Published

2021-05-20

Section

Articles

How to Cite

Kumar S. ETL: From Design to Deployment. IJERET [Internet]. 2021 May 20 [cited 2025 Sep. 12];2(2):11-9. Available from: https://ijeret.org/index.php/ijeret/article/view/30