DATA ENGINEERING
18 Results

Apache Iceberg™ 1.10 Is Here: Features, Fixes and Future-Ready Spec Work
Apache Iceberg 1.10 is here! Discover the key features, bug fixes, and spec updates in this new release, including support for Flink 2.0 and Spark 4.0, BigQuery integration, and core performance improvements.
SEP 11, 2025|4 min read

Building a Spark Connect Engine on Snowflake: An Engineering Deep Dive
Learn how Snowflake built Snowpark Connect for Apache Spark, including technical insights, best practices, and engineering lessons from our team.
SEP 09, 2025|10 min read

Real-Time Change Data Capture at Scale: Engineering Openflow's Database Replication Architecture
Openflow’s CDC architecture streams real-time changes from operational databases into Snowflake for faster analytics and AI—no fragile ETL pipelines required.
JUL 31, 2025|7 min read

Supercharging DML Performance with Snowflake Gen2 Warehouses
Accelerate data modification workloads with Snowflake Gen2 Warehouses—automatically and without code changes—enabling faster pipelines and better performance.
JUL 24, 2025|6 min read

Apache Polaris™ (Incubating) 1.0 Released: A New Era for the Open Source Catalog
Apache Polaris™ (Incubating) 1.0 is here—bringing a REST-based open source catalog for modern data lakehouses with Helm chart support and new features.
JUL 10, 2025|4 min read

Centralize Your Data Lake: Apache Polaris Supports Apache Iceberg and Now Delta Lake
Learn how Apache Polaris's new Delta Lake support enables unified data catalog management, simplifying access control and interoperability across Iceberg and Delta table formats.
JUN 09, 2025|6 min read

Metadata That Works: How Snowflake Is Raising the Bar for Iceberg Performance
Discover how Snowflake boosts Apache Iceberg performance using format-agnostic metadata optimization, including NDV estimation for faster, efficient queries.
MAY 07, 2025|4 min read

Conquer Complex Updates: Mastering SQL Server's UPDATE FROM in Your Snowflake Migration
Learn how to migrate complex SQL Server UPDATE FROM statements to Snowflake. Discover best practices for handling joins, aliases, and multi-table updates.
APR 23, 2025|4 min read

Pruning for Iceberg: 90% of an Iceberg Is Underwater
Learn how Snowflake optimizes Apache Iceberg table performance through advanced data pruning techniques across Parquet file layers.
MAR 26, 2025|7 min read

How Snowflake Optimizes Apache Iceberg Queries with Adaptive Execution
Snowflake’s Adaptive Scan optimizes I/O parallelism and memory allocation. In production, it was observed to boost Apache Iceberg query performance by up to 70% for scan-heavy queries that access small regions of micro-partitions.
MAR 20, 2025|5 min read

Snowflake’s Data Architecture: Enabling AI Apps, Next-Gen Lakehouse Analytics And More
Snowflake builds a cloud for data used in different types of workloads, such as AI apps, analytics & BI, transactional processing, streaming & data engineering.
FEB 25, 2025|15 min read

Profiling Python Stored Procedures to Identify Inefficiencies
Optimize Snowpark Python stored procedures with the new built-in profiler, helping users identify inefficiencies, track execution time, and enhance performance.
FEB 19, 2025|5 min read