Data Engineering Guide
Articles about important cloud data engineering topics, including ETL, data integration, and JSON.
What is ETL?
ETL stands for 'extract, transform, load', the three processes that move data from one database, multiple databases, or other sources to a unified repository.
Data integration is the process of combining data from multiple sources into a unified view to provide users with valuable, actionable information.
What is Spark TensorFlow?
Spark TensorFlow: TensorFlow is an open-source machine learning library from Google.
What is Machine Learning?
Machine learning is an application of artificial intelligence (AI) that enables systems to learn automatically and improve through experience without the assistance of explicit programming.
Data Science Pipeline
A data science pipeline is the set of processes that convert raw data into actionable answers to business questions.
A data pipeline is a means of moving data from one place (the source) to a destination (such as a data warehouse). Along the way, data is transformed and optimized, arriving in a state that can be analyzed and used to develop business insights.
The Pitfalls of ETL Processing
ETL processing has been around since the 1970s, but 40 years later many organizations continue to suffer from the constraints of traditional batch processing.
Master Data Management (MDM)
Master Data Management (MDM), which goes hand in hand with data governance, has moved to the forefront of data-driven organizations.
Big Data Engineering
Big data engineering is the process of managing the ingestion and transformation of high-volume data sets from disparate sources.
Snowflake Workloads Overview