DATA ENGINEERING

Do You Need to Tune Postgres Vacuum?
Learn why Postgres vacuuming matters, how autovacuum works, and when to tune it to prevent bloat, improve performance, and avoid transaction ID wraparound issues.
MAR 31, 2026|5 min read

Solving the Identifier Mismatch Challenge in Multi-Engine Lakehouses
This post explores the identifier case-sensitivity challenge in multi-engine lakehouses and explains how Snowflake ensures seamless interoperability.
MAR 25, 2026|10 min read

Full bidirectional interoperability for Iceberg tables now in Snowflake Horizon Catalog
Today, we are thrilled to announce the final piece of the puzzle: the ability to securely write to Snowflake-managed Iceberg tables from external engines via Snowflake Horizon Catalog’s implementation of open APIs.
MAR 17, 2026|5 min read

Stop Moving Data. Start Talking To It.
Learn how Snowflake's Catalog-Linked Databases (CLD) and Cortex Code CLI turn the open lakehouse into a native experience, allowing you to bridge external Iceberg catalogs to Snowflake in seconds for AI-powered insights.
MAR 11, 2026|6 min read

Building a High-Performance Postgres Time Series Stack with Iceberg
Learn how to build a vendor-agnostic, open source time series stack using Postgres extensions like pg_lake, pg_partman, and pg_incremental to offload cold data to Apache Iceberg on S3.
MAR 06, 2026|4 min read

Beyond toPandas(): Streaming PySpark Data to DuckDB via Apache Arrow
Bypass Spark’s toPandas() memory bottleneck by streaming PySpark DataFrames to DuckDB via Apache Arrow, using the PyCapsule interface for efficient, zero-copy data transfer.
FEB 20, 2026|3 min read

Snowflake Egress IPs: The Missing Piece for Predictable Outbound Connectivity
Simplify firewall configurations and securely integrate with third-party APIs using stable, Snowflake-managed IP ranges.
FEB 13, 2026|5 min read

Engineering Enterprise-Grade Reliability for Writes to Any Iceberg REST Catalog
Discover the engineering behind Snowflake’s robust write path for external Iceberg tables. We detail design choices for concurrency control (with enhanced locking) and commit resilience to eliminate wasted compute and ensure data consistency.
DEC 11, 2025|7 min read

Bringing the Snowflake Platform to Data Lakes
A practical guide to bridging data lake-native engines and the warehouse model: Iceberg catalogs, external catalogs vs. platform catalogs, pros and cons, and when to use each.
DEC 10, 2025|8 min read

How Snowflake’s Storage Subsystem Handles Credentials Vended by Apache Iceberg Catalogs
Learn how Snowflake’s storage subsystem consumes and secures credentials vended by Apache Iceberg catalogs, comparing external-volume and catalog-managed models with setup tips.
DEC 09, 2025|6 min read

Snowflake Managed Iceberg Tables: Industry Leading Interop Performance
Snowflake's new file size and partitioning features for Iceberg tables provide hassle-free tuning for high performance on Snowflake and external engines.
NOV 10, 2025|6 min read

The Apache Iceberg™ Variant Type: Flexible Semistructured Data, Reimagined
Explore the new Variant type in Apache Iceberg v3, a powerful feature for storing and querying semi-structured data like logs, telemetry, and IoT data.
NOV 07, 2025|8 min read