DATA ENGINEERING

Apache Polaris™ (Incubating) 1.0 Released: A New Era for the Open Source Catalog
Apache Polaris™ (Incubating) 1.0 is here—bringing a REST-based open source catalog for modern data lakehouses with Helm chart support and new features.
JUL 10, 2025|4 min read

Centralize Your Data Lake: Apache Polaris Supports Apache Iceberg and Now Delta Lake
Learn how Apache Polaris's new Delta Lake support enables unified data catalog management, simplifying access control and interoperability across Iceberg and Delta table formats.
JUN 09, 2025|6 min read

Metadata That Works: How Snowflake Is Raising the Bar for Iceberg Performance
Discover how Snowflake boosts Apache Iceberg performance using format-agnostic metadata optimization, including NDV estimation for faster, efficient queries.
MAY 07, 2025|4 min read

Conquer Complex Updates: Mastering SQL Server's UPDATE FROM in Your Snowflake Migration
Learn how to migrate complex SQL Server UPDATE FROM statements to Snowflake. Discover best practices for handling joins, aliases, and multi-table updates.
APR 23, 2025|4 min read

Pruning for Iceberg: 90% of an Iceberg Is Underwater
Learn how Snowflake optimizes Apache Iceberg table performance through advanced data pruning techniques across Parquet file layers.
MAR 26, 2025|7 min read

How Snowflake Optimizes Apache Iceberg Queries with Adaptive Execution
By optimizing I/O parallelism and memory allocation for peak efficiency, Snowflake’s Adaptive Scan was observed to boost Apache Iceberg query performance by up to 70% for scan-heavy queries accessing small regions from micro-partitions in production.
MAR 20, 2025|5 min read

Snowflake’s Data Architecture: Enabling AI Apps, Next-Gen Lakehouse Analytics And More
Snowflake builds a cloud for data used in different types of workloads, such as AI apps, analytics & BI, transactional processing, streaming & data engineering.
FEB 25, 2025|15 min read

Profiling Python Stored Procedures to Identify Inefficiencies
Optimize Snowpark Python stored procedures with the new built-in profiler, helping users identify inefficiencies, track execution time, and enhance performance.
FEB 19, 2025|5 min read

Creating Automated Optimizations for Python User-Defined Functions with Snowpark's Parallel Execution
Snowpark’s parallel execution optimizes Python UDF performance, addressing the Python global interpreter lock (GIL) and data skew with advanced redistribution techniques.
JAN 30, 2025|4 min read

Creating a Secure and Flexible File Write Approach from Snowpark User-Defined Functions (UDFs)
NOV 07, 2024|4 min read

Snowpark Observability: Deep Dive into Functions and Procedures Optimizations
OCT 29, 2024|5 min read

Optimize Your Data Pipelines by Augmenting Network Concurrency with Snowpark External Access
SEP 11, 2024|8 min read