Building a Scalable Data Ingestion Framework

A well-designed data ingestion framework forms the foundation of scalable data architecture, enabling businesses to collect, move, and prepare data for analysis at speed and scale.

Overview

Today, data is streaming into businesses from a variety of sources: applications, SaaS solutions, social channels, mobile devices, IoT devices, and more. The big data revolution has created astonishing increases in the volume, velocity, and variety of data. When that data is easily accessible to analytics and data science teams, organizations can extract more value from it by making better data-driven decisions. To handle these increases and make data accessible, organizations require a modern data ingestion framework. If you’re new to data ingestion, read on to learn the types of ingestion and how data ingestion relates to data integration.

What Is a Data Ingestion Framework?

A data ingestion framework is a process for transporting data from various sources to a storage repository or data processing tool. While there are several ways to design a framework based on different models and architectures, data ingestion is done in one of two ways: batch or streaming. How you ingest data will depend on your data source(s) and how quickly you need the data for analysis.

Batch data ingestion

Batch ingestion was the method used for all data before the rise of big data, and it continues to be widely used. Batch processing groups data and transports it into a data platform or tool periodically, in batches. Because it uses fewer computing resources, batch processing is usually cheaper, but it can be slow when you’re working with large volumes of data. If you need real-time or near real-time access to data, it’s best to ingest it with a streaming process.
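
As a rough sketch of the pattern, the following Python example loads everything waiting in a landing directory in a single scheduled run. The directory, the file layout (event_id and payload columns), and the SQLite destination are all hypothetical stand-ins for a real source and data platform:

```python
import csv
import sqlite3
from pathlib import Path

LANDING_DIR = Path("landing")   # hypothetical drop zone for source extracts
DB_PATH = "warehouse.db"        # SQLite stands in for a real data platform

def ingest_batch() -> int:
    """Load every file waiting in the landing zone in one scheduled run."""
    LANDING_DIR.mkdir(exist_ok=True)
    conn = sqlite3.connect(DB_PATH)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS raw_events (event_id TEXT, payload TEXT, loaded_at TEXT)"
    )
    loaded = 0
    with conn:  # one transaction per batch: all files commit together
        for path in sorted(LANDING_DIR.glob("*.csv")):
            with path.open(newline="") as f:
                for row in csv.DictReader(f):  # assumes event_id,payload header columns
                    conn.execute(
                        "INSERT INTO raw_events VALUES (?, ?, datetime('now'))",
                        (row["event_id"], row["payload"]),
                    )
                    loaded += 1
            path.rename(path.with_name(path.name + ".done"))  # mark file as processed
    conn.close()
    return loaded

if __name__ == "__main__":
    # In production this would run on a schedule (e.g., hourly via cron).
    print(f"Loaded {ingest_batch()} rows in this batch")
```

Note that nothing arrives in the destination between runs; that latency is the trade-off for the cheaper, simpler processing model.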

Streaming data ingestion

Streaming data ingestion transports data into a data platform continuously, as soon as the data is created (or identified by the system). It’s ideal for business intelligence that requires up-to-the-minute data, ensuring accuracy and fast problem-solving.
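
To contrast with the batch sketch above, here’s a minimal streaming loop in Python. The generator stands in for a real event source such as a message queue or change-data-capture feed, and the table name is invented for illustration:

```python
import sqlite3
import time
from typing import Iterator

def event_stream() -> Iterator[dict]:
    """Stand-in for a real stream (e.g., a message queue or CDC feed)."""
    for i in range(5):
        yield {"event_id": i, "payload": f"click-{i}"}
        time.sleep(0.1)  # events arrive continuously, not on a schedule

def ingest_streaming() -> None:
    conn = sqlite3.connect("warehouse.db")  # hypothetical destination
    conn.execute("CREATE TABLE IF NOT EXISTS live_events (event_id INTEGER, payload TEXT)")
    for event in event_stream():
        # Each event is written and committed the moment it arrives,
        # so downstream queries see it with minimal latency.
        with conn:
            conn.execute(
                "INSERT INTO live_events VALUES (?, ?)",
                (event["event_id"], event["payload"]),
            )
    conn.close()

if __name__ == "__main__":
    ingest_streaming()
```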

In some cases, the line between batch processing and streaming is blurring. Some tools marketed as streaming actually use batch processing under the hood: they ingest small groups of records at short intervals, which makes the process fast enough to feel continuous. This approach is sometimes called micro-batching.
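
A micro-batching loop might look something like the sketch below; the batch size and flush interval are illustrative knobs, and sink stands in for whatever bulk-load call your platform provides:

```python
import time

BATCH_SIZE = 100       # flush when the buffer reaches this many events...
FLUSH_INTERVAL = 1.0   # ...or when this many seconds have passed

def micro_batch(stream, sink) -> None:
    """Group a continuous stream into tiny batches before loading."""
    buffer, last_flush = [], time.monotonic()
    for event in stream:
        buffer.append(event)
        if len(buffer) >= BATCH_SIZE or time.monotonic() - last_flush >= FLUSH_INTERVAL:
            sink(buffer)  # one bulk write per micro-batch
            buffer, last_flush = [], time.monotonic()
    if buffer:
        sink(buffer)      # flush whatever is left when the stream ends

if __name__ == "__main__":
    micro_batch(iter(range(250)), lambda batch: print(f"flushed {len(batch)} events"))
```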

Data Ingestion Versus Data Integration

Although data integration is related to data ingestion, the two are not the same. Integration usually includes ingestion, but it involves additional processes that ensure the data is compatible with the repository and with existing data. Another way of thinking about it: data ingestion focuses on transporting data into a repository or tool, while data integration works further with the data sets, combining them into an accurate single source of truth.

ETL and ELT

There are two main data integration methods: extract, transform, and load (ETL) and extract, load, and transform (ELT). The difference between the two lies in the order in which the transformation and loading steps occur.

ETL: This method collects data from various sources, transforms it (cleansing, merging, and validating it), and then loads it into a data platform or tool. All data is transformed before it enters the destination.

ELT: As compute and storage technology developed, the transformation process gained speed and became more flexible, and ELT was born. ELT allows raw data to be loaded into a database or data platform. The transformation process then happens ad hoc when a user is ready to conduct an analysis. This approach allows organizations to efficiently collect substantial data sets from many different sources for use in daily decision-making.
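
To make the difference in sequencing concrete, here’s a small Python sketch that runs the same toy cleansing step both ways, with SQLite standing in for the data platform (table and column names are invented for illustration):

```python
import sqlite3

RAW = [{"email": " Ada@Example.COM "}, {"email": "grace@example.com"}]  # toy source rows

def transform(record: dict) -> dict:
    """Cleanse step: normalize the email address."""
    return {"email": record["email"].strip().lower()}

def etl(conn: sqlite3.Connection) -> None:
    # ETL: transform in the pipeline, then load only clean rows.
    conn.execute("CREATE TABLE users_etl (email TEXT)")
    clean = [transform(r) for r in RAW]               # transform BEFORE loading
    conn.executemany("INSERT INTO users_etl VALUES (:email)", clean)

def elt(conn: sqlite3.Connection) -> None:
    # ELT: load the raw data first; transform later, inside the platform, with SQL.
    conn.execute("CREATE TABLE users_raw (email TEXT)")
    conn.executemany("INSERT INTO users_raw VALUES (:email)", RAW)  # load as-is
    conn.execute(                                     # transform AFTER loading
        "CREATE TABLE users_elt AS SELECT lower(trim(email)) AS email FROM users_raw"
    )

if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    etl(conn)
    elt(conn)
    print(conn.execute("SELECT email FROM users_elt").fetchall())
    conn.close()
```

Either way the rows end up clean; what changes is where the transformation runs and when the raw data becomes available in the platform.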

How To Know Which Integration Process To Use

A holistic approach to data ingestion and data integration considers not just how data is moved into the data platform, but also how it’s integrated and analyzed. While ETL is fine for some use cases, ELT ensures you have all the data ready when you need it. With a strong platform that supports ELT, you can transform data with all the performance, scalability, and concurrency you need, right where your data lives. 

Some data warehouses and data platforms (including Snowflake) have designed proprietary tools to ingest data. These capabilities take ingestion and integration to the next level, streamlining the process while optimizing resource usage.
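
As one hedged illustration of that pattern, the Snowflake Python connector can drive a bulk COPY INTO load followed by an in-platform transformation (classic ELT). The connection parameters, stage, and table names below are placeholders:

```python
import snowflake.connector  # pip install snowflake-connector-python

# Connection parameters and object names below are placeholders.
conn = snowflake.connector.connect(
    account="my_account", user="my_user", password="...",
    warehouse="my_wh", database="my_db", schema="public",
)
cur = conn.cursor()

# Ingest: bulk-load staged files into a raw table. (Snowpipe can automate
# this continuously; COPY INTO is the on-demand equivalent.)
cur.execute("""
    COPY INTO raw_events
    FROM @my_stage/events/
    FILE_FORMAT = (TYPE = 'CSV' SKIP_HEADER = 1)
""")

# Transform in place (the "T" of ELT), using the platform's own compute.
cur.execute("""
    CREATE OR REPLACE TABLE clean_events AS
    SELECT event_id, LOWER(TRIM(email)) AS email, event_ts
    FROM raw_events
""")

conn.close()
```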
