DATA ENGINEERING
32 Results

Beyond toPandas(): Streaming PySpark Data to DuckDB via Apache Arrow
Bypass Spark’s toPandas() memory bottleneck by streaming PySpark DataFrames to DuckDB via Apache Arrow, using the PyCapsule interface for efficient, zero-copy data transfer.
FEB 20, 2026|3 min read

Snowflake Egress IPs: The Missing Piece for Predictable Outbound Connectivity
Simplify firewall configurations and securely integrate with third-party APIs using stable, Snowflake-managed IP ranges.
FEB 13, 2026|5 min read

Engineering Enterprise-Grade Reliability for Writes to Any Iceberg REST Catalog
Discover the engineering behind Snowflake’s robust write path for external Iceberg tables. We detail design choices for concurrency control (with enhanced locking) and commit resilience to eliminate wasted compute and ensure data consistency.
DEC 11, 2025|7 min read

Bringing the Snowflake Platform to Data Lakes
A practical guide to bridging data lake-native engines and the warehouse model: Iceberg catalogs, external catalogs vs. platform catalogs, pros and cons, and when to use each.
DEC 10, 2025|8 min read

How Snowflake’s Storage Subsystem Handles Credentials Vended by Apache Iceberg Catalogs
Learn how Snowflake’s storage subsystem consumes and secures credentials vended by Apache Iceberg catalogs, comparing external-volume and catalog-managed models with setup tips.
DEC 09, 2025|6 min read

Snowflake Managed Iceberg Tables: Industry Leading Interop Performance
Snowflake's new file size and partitioning features for Iceberg tables provide hassle-free tuning for high performance on Snowflake and external engines.
NOV 10, 2025|6 min read

The Apache Iceberg™ Variant Type: Flexible Semistructured Data, Reimagined
Explore the new Variant type in Apache Iceberg v3, a powerful feature for storing and querying semistructured data such as logs, telemetry, and IoT data.
NOV 07, 2025|8 min read

Introducing pg_lake: Integrate Your Data Lakehouse with Postgres
Introducing pg_lake, a set of open-source PostgreSQL extensions from Snowflake that allow you to query, manage, and write to Iceberg tables in your data lakehouse.
NOV 04, 2025|6 min read

Apache Polaris (incubating) 1.2 Released, Featuring Enhanced Governance, Connectivity and Usability
Apache Polaris 1.2 is now available, featuring enhanced enterprise data governance, improved connectivity, and schema evolution tolerance for seamless upgrades.
OCT 29, 2025|4 min read

Scaling Streaming at Snowflake: Introducing Our Next-Gen Snowpipe Streaming Architecture
Snowpipe Streaming’s new performance architecture is now Generally Available in AWS, enabling organizations to act on data at an unprecedented scale and speed.
OCT 09, 2025|12 min read

Building an Apache Spark™ Connect Server Powered by Snowflake
Learn how Snowflake built Snowpark Connect for Apache Spark, including technical insights, best practices, and engineering lessons from our team.
OCT 06, 2025|9 min read

Meet the Team Behind Snowflake Postgres
Learn about the Crunchy Data team, the Postgres experts acquired by Snowflake to build a unified platform for transactional and analytical workloads.
SEP 24, 2025|4 min read