ETL

What is ETL?

ETL (extract, transform, load) is three combined processes that are used to pull data from one database and move it to another database. It is a common function in data warehousing.

The Extract, Transform, Load Process

The process comprises three distinct phrases:

Extract: data is extracted from homogeneous or heterogeneous or homogeneous data sets

For extraction, most data warehouses combine data from disparate sources into a single data set. Then data validation is performed. If validation fails, some or all data may be removed from the extraction process

Transform: data is transformed into the proper formats for storage

During data transformation, rules are applied to the data before it is loaded. Transformation is also the stage where data cleansing is performed to eliminate sending bad or inappropriate data to the data warehouse or other end destination.

Load: data is finally loaded into the destination database, data mart, or data warehouse

When a load enters the data warehouse, it may also be subject to any schema constraints or limitations of the receiving database, which can impact end data quality.

Snowflake and ETL

As highlighted above, there are several potential points of failure during any ETL process. Snowflake eliminates the need for lengthy, risky and often labor intensive extract, transform. load processes by making data easily accessible for internal and external partners via data sharing and Snowflake for Collaboration.