DATA ENGINEERING

32 Results

Open Source

Beyond toPandas(): Streaming PySpark Data to DuckDB via Apache Arrow

Bypass Spark’s toPandas() memory bottleneck by streaming PySpark DataFrames to DuckDB via Apache Arrow, using the PyCapsule interface for efficient, zero-copy data transfer.
FEB 20, 2026|3 min read
Data Engineering

Snowflake Egress IPs: The Missing Piece for Predictable Outbound Connectivity

Simplify firewall configurations and securely integrate with third-party APIs using stable, Snowflake-managed IP ranges.
FEB 13, 2026|5 min read
Data Engineering

Engineering Enterprise-Grade Reliability for Writes to Any Iceberg REST Catalog

Discover the engineering behind Snowflake’s robust write path for external Iceberg tables. We detail design choices for concurrency control (with enhanced locking) and commit resilience to eliminate wasted compute and ensure data consistency.
DEC 11, 2025|7 min read
Data Engineering

Bringing the Snowflake Platform to Data Lakes

A practical guide to bridging data lake-native engines and the warehouse model: Iceberg catalogs, external vs. platform catalogs, their pros and cons, and when to use each.
DEC 10, 2025|8 min read
Data Engineering

How Snowflake’s Storage Subsystem Handles Credentials Vended by Apache Iceberg Catalogs

Learn how Snowflake’s storage subsystem consumes and secures credentials vended by Apache Iceberg catalogs, comparing external-volume and catalog-managed models with setup tips.
DEC 09, 2025|6 min read
Data Engineering

Snowflake Managed Iceberg Tables: Industry Leading Interop Performance

Snowflake's new file size and partitioning features for Iceberg tables provide hassle-free tuning for high performance on Snowflake and external engines.
NOV 10, 2025|6 min read
Open Source

The Apache Iceberg™ Variant Type: Flexible Semistructured Data, Reimagined

Explore the new Variant type in Apache Iceberg v3, a powerful feature for storing and querying semistructured data such as logs, telemetry, and IoT data.
NOV 07, 2025|8 min read
Core Platform

Introducing pg_lake: Integrate Your Data Lakehouse with Postgres

Introducing pg_lake, a set of open-source PostgreSQL extensions from Snowflake that allow you to query, manage, and write to Iceberg tables in your data lakehouse.
NOV 04, 2025|6 min read
Open Source

Apache Polaris (incubating) 1.2 Released, Featuring Enhanced Governance, Connectivity and Usability

Apache Polaris 1.2 is now available, featuring enhanced enterprise data governance, improved connectivity, and schema evolution tolerance for seamless upgrades.
OCT 29, 2025|4 min read
Data Engineering

Scaling Streaming at Snowflake: Introducing Our Next-Gen Snowpipe Streaming Architecture

Snowpipe Streaming’s new high-performance architecture is now generally available on AWS, enabling organizations to act on data at greater scale and speed.
OCT 09, 2025|12 min read
Data Engineering

Building an Apache Spark™ Connect Server Powered by Snowflake

Learn how Snowflake built Snowpark Connect for Apache Spark, including technical insights, best practices, and engineering lessons from our team.
OCT 06, 2025|9 min read
Core Platform

Meet the Team Behind Snowflake Postgres

Learn about the Crunchy Data team, the Postgres experts acquired by Snowflake to build a unified platform for transactional and analytical workloads.
SEP 24, 2025|4 min read
