Data Loading, ETL, and ELT
Traditionally, data loading referred to the act of copying data from an application, source file, folder, or other location into a database or similar application. Data was copied from a source and then loaded into a data storage solution (such as a data lake or data warehouse).
With the rise of more sophisticated data loading mechanisms built to address larger and larger data sets, data loading tools became the L in ETL (Extract, Transform, Load) and ELT (Extract, Load, Transform). With these processes, data is collected from multiple sources, cleaned and formatted, and then loaded into the storage system (with ELT, the last two steps are naturally reversed).
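The difference in step order between ETL and ELT can be shown with a minimal sketch. The record format and helper names below are hypothetical, chosen only to illustrate the two orderings:

```python
# Minimal sketch contrasting ETL and ELT step order.
# All data and helper names are hypothetical.

def extract():
    # Pretend these rows came from an application or source file.
    return [{"name": " Ada ", "amount": "10"}, {"name": "Grace", "amount": "25"}]

def transform(rows):
    # Clean and format: trim names, cast amounts to integers.
    return [{"name": r["name"].strip(), "amount": int(r["amount"])} for r in rows]

def load(rows, store):
    # Copy the rows into the target storage (a list stands in for a warehouse).
    store.extend(rows)

# ETL: transform before loading.
warehouse_etl = []
load(transform(extract()), warehouse_etl)

# ELT: load the raw data first, transform later at the destination.
warehouse_elt = []
load(extract(), warehouse_elt)
warehouse_elt = transform(warehouse_elt)

# Both orderings end with the same cleaned data in the warehouse.
assert warehouse_etl == warehouse_elt
```

Either way, the same cleaned rows end up in storage; the two models differ in where and when the transformation work happens.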
Increasingly, many ETL solutions are cloud-based, which boosts performance and scalability. On-premises data loading via ETL has traditionally relied on scripts to collect and load data, which is both slow and error-prone.
As data volumes continue to grow, driven by enormous amounts of online, IoT, and other unstructured and semi-structured data, continuous data ingestion has become essential to companies that require near real-time analytics to run their businesses. With these requirements, large-batch ETL is not always up to the task. Modern data pipelines built on the ELT model are designed to extract and load the data first and then transform the data once it reaches its intended destination. The most modern ELT systems move transformation workloads to the cloud, enabling much greater scalability and elasticity.
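The load-first, transform-at-destination pattern described above can be sketched in a few lines. Here Python's built-in sqlite3 stands in for a cloud warehouse, and the table, column, and payload names are invented for illustration:

```python
import json
import sqlite3

# sqlite3 stands in for a cloud data warehouse; all names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE raw_events (payload TEXT)")

# Continuous ingestion: each raw record is loaded as it arrives, untransformed.
incoming = [
    '{"device": "sensor-1", "temp_f": 68.0}',
    '{"device": "sensor-2", "temp_f": 77.0}',
]
for record in incoming:
    conn.execute("INSERT INTO raw_events (payload) VALUES (?)", (record,))

# Transformation happens at the destination, after loading (the T in ELT):
# parse each JSON payload and convert Fahrenheit to Celsius in SQL.
conn.create_function("device", 1, lambda p: json.loads(p)["device"])
conn.create_function(
    "temp_c", 1, lambda p: round((json.loads(p)["temp_f"] - 32) * 5 / 9, 1)
)
rows = conn.execute("SELECT device(payload), temp_c(payload) FROM raw_events").fetchall()
print(rows)  # [('sensor-1', 20.0), ('sensor-2', 25.0)]
```

Because the raw payloads are stored as-is, the transformation logic can be changed and re-run later without re-ingesting the source data, which is one of the practical advantages of ELT.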
Data Loading and Snowflake
The Snowflake Data Cloud includes flexible, scalable data pipeline capabilities, including ELT. Users can continuously ingest raw data directly into Snowflake without first transforming it into a different format; the transformations are performed inside Snowflake after loading, significantly reducing storage and compute costs.