Apache Iceberg™ 1.10 Is Here: Features, Fixes and Future-Ready Spec Work

The Apache Iceberg™ community has officially released version 1.10! This milestone brings a host of new capabilities, meaningful refinements and thoughtful fixes that continue to make Iceberg one of the most robust open table formats for modern data lakes and lakehouses.

So let’s dive in and take a closer look at what’s new in Iceberg 1.10, and why it matters for you.

Core changes and new capabilities

First up, Apache Flink® and Apache Spark™ users get an immediate win in Iceberg 1.10 with support for both Flink 2.0 and Spark 4.0. This allows you to continue to pair Iceberg with the latest streaming and batch engines without missing a beat. 

For Flink users, the Flink IcebergSink now supports a dynamic sink. With it, schema evolution no longer requires manual intervention: changes in the input data schema are propagated to the Iceberg tables automatically. The dynamic sink also supports table fan-out and auto-creation of new tables when new data types appear in the data stream; a sketch of the wiring follows.
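
Here's a minimal sketch of what that wiring can look like. The builder comes from the new org.apache.iceberg.flink.sink.dynamic package, but the exact generator signature is an assumption here, and toDynamicRecord is a hypothetical helper that maps an event to its target table, schema and row, so check the 1.10 Flink docs before copying.

```java
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.iceberg.flink.CatalogLoader;
import org.apache.iceberg.flink.sink.dynamic.DynamicIcebergSink;
import org.apache.iceberg.flink.sink.dynamic.DynamicRecord;

// MyEvent, hadoopConf and catalogProps are hypothetical stand-ins.
DataStream<MyEvent> events = buildEventStream(env);

DynamicIcebergSink.forInput(events)
    // The generator turns each element into a DynamicRecord that names
    // its target table and carries its own schema, letting the sink fan
    // out across tables, evolve schemas and create tables on the fly.
    .generator((event, out) -> out.collect(toDynamicRecord(event)))
    .catalogLoader(CatalogLoader.hive("my_catalog", hadoopConf, catalogProps))
    .append();
```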

It’s also worth noting that, despite the addition of the IcebergSource and IcebergSink connectors, the existing FlinkSource and FlinkSink interfaces are still available for the time being; expect more community discussion about when they’ll be fully phased out.

On the Spark side, users have access to a new Spark action for computing partition statistics, made even better by the fact that stats can now be calculated incrementally. Instead of recalculating everything from scratch, Iceberg reuses existing stats files, saving time and compute.
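
Invoking it should look much like the other SparkActions. Here's a minimal sketch, assuming an existing SparkSession named spark, a placeholder table name, and that the action is exposed as computePartitionStats (check the 1.10 Javadoc for the exact name):

```java
import org.apache.iceberg.Table;
import org.apache.iceberg.spark.Spark3Util;
import org.apache.iceberg.spark.actions.SparkActions;

// db.events is a placeholder table name.
Table table = Spark3Util.loadIcebergTable(spark, "db.events");

// Computes partition-level statistics; in 1.10, reruns can reuse the
// stats files from an earlier snapshot instead of starting from scratch.
SparkActions.get(spark)
    .computePartitionStats(table)
    .execute();
```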

Another standout feature in 1.10 is a new catalog plugin that adds support for the BigQuery Metastore Catalog in Iceberg, giving you more flexibility to plug Iceberg into the BigQuery data ecosystem. The BigQuery Metastore catalog has also been added to the Iceberg Kafka Connect runtime bundle.
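
Pointing a Spark session at the new catalog might look like the sketch below. The catalog-impl class name and the gcp_project/gcp_location property keys are assumptions based on the new BigQuery module, so verify them against the 1.10 docs:

```java
import org.apache.spark.sql.SparkSession;

SparkSession spark = SparkSession.builder()
    .appName("iceberg-bigquery")
    .config("spark.sql.catalog.bq", "org.apache.iceberg.spark.SparkCatalog")
    // Assumed class name for the new BigQuery Metastore catalog plugin.
    .config("spark.sql.catalog.bq.catalog-impl",
        "org.apache.iceberg.gcp.bigquery.BigQueryMetastoreCatalog")
    // Placeholder project, location and warehouse values.
    .config("spark.sql.catalog.bq.gcp_project", "my-gcp-project")
    .config("spark.sql.catalog.bq.gcp_location", "us-central1")
    .config("spark.sql.catalog.bq.warehouse", "gs://my-bucket/warehouse")
    .getOrCreate();
```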

The final notable change in this release affects Iceberg's core compaction logic. The compaction code for both Flink and Spark has undergone a thoughtful refactor so that both implementations share the same underlying code and configuration for compaction operations. On top of this, users also get more control over compaction tasks with a new max-files-to-rewrite setting that limits the scope of a compaction job.
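
In Spark, the new setting should slot in as an option on the existing rewrite action. A minimal sketch, assuming a SparkSession named spark and a placeholder table (the option key comes from the release notes; the value is illustrative):

```java
import org.apache.iceberg.Table;
import org.apache.iceberg.spark.Spark3Util;
import org.apache.iceberg.spark.actions.SparkActions;

Table table = Spark3Util.loadIcebergTable(spark, "db.events");

// Cap how many files a single compaction run may rewrite, so a heavily
// backlogged table can be compacted in bounded, predictable chunks.
SparkActions.get(spark)
    .rewriteDataFiles(table)
    .option("max-files-to-rewrite", "500")
    .execute();
```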

Key bug fixes and improvements

This release also tackles a few persistent bugs that the community flagged.

First, for Spark Structured Streaming, there was an issue where streams could stall if streaming-max-rows-per-micro-batch was set too low. The problem? Iceberg's connector streams entire data files at a time, so if a single file's row count exceeded the limit, the stream couldn't make progress. In 1.10, this limit is now a soft cap, fixing the stalling issue for Spark 3.4, 3.5 and 4.0 users.
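
The option itself is unchanged; here's how it's set on an Iceberg streaming read, assuming a SparkSession named spark and a placeholder table:

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;

// With 1.10 the limit is a soft cap: each micro-batch targets roughly
// this many rows, but always makes progress even when one data file
// alone holds more rows than the configured limit.
Dataset<Row> stream = spark.readStream()
    .format("iceberg")
    .option("streaming-max-rows-per-micro-batch", "10000")
    .load("db.events");
```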

Another fix improves the table import process in Spark. Previously, if a table import encountered a missing partition directory (often due to manual cleanup or automated housekeeping), the entire migration could fail. Now, a new ignoreMissingFiles flag lets you skip these missing partitions and keep moving forward.
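The release notes name the flag ignoreMissingFiles but don't pin down every surface it appears on. As one loudly hedged sketch, assuming it lands as an argument on the add_files import procedure (parameter name assumed), usage could look like this:

```java
// Hypothetical usage: the ignore_missing_files parameter name and its
// placement on add_files are assumptions; only the flag itself is
// confirmed by the release notes. Table and path names are placeholders.
spark.sql(
    "CALL my_catalog.system.add_files(" +
    "  table => 'db.events'," +
    "  source_table => '`parquet`.`gs://my-bucket/events`'," +
    "  ignore_missing_files => true)");
```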

Finally, a critical fix for the REST Client resolves an edge case where retrying commits after a 5xx server error could cause conflicting operations and potential table corruption. This issue has been around since version 1.5 of the Java SDK, so if you’re running older versions, this upgrade is especially important.

Spec progress: What’s new in v3

The Iceberg community has officially closed the v3 table spec, but that doesn't mean the work has stopped. In 1.10, you'll see a few final tweaks: clearer guidance for table encryption, better handling of NaN values for the geometry types, updated rules for deletion vectors and some cleanup around default values for complex data types. There are even small refinements to metadata naming conventions to keep things tidy.

Some parts of the v3 spec are already ready to use in Spark today, like row lineage, deletion vectors and the variant and unknown types. For variant, Spark supports full read capability (both basic and shredded variants) but only basic write capability (no shredded writes yet). Be sure to check the release notes for details on what's live and what's next.
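
Trying these features requires a table on format version 3. A minimal sketch, assuming a SparkSession named spark and placeholder table names (note that upgrading the format version is one-way):

```java
// Upgrade an existing table to the v3 spec to enable features such as
// deletion vectors and row lineage (older readers can't read v3 tables).
spark.sql("ALTER TABLE my_catalog.db.events " +
    "SET TBLPROPERTIES ('format-version' = '3')");

// Or create a new v3 table directly, including a variant column.
spark.sql("CREATE TABLE my_catalog.db.semi_structured (" +
    "id BIGINT, payload VARIANT) " +
    "USING iceberg TBLPROPERTIES ('format-version' = '3')");
```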

Looking ahead

Apache Iceberg 1.10 is another great step forward for a project that thrives on community collaboration. This release wouldn’t be possible without the many contributors, reviewers and testers who made it happen. If you’d like to get involved, check out the Iceberg GitHub repo, browse the open issues, and join the conversation shaping the next generation of open table formats.

If you haven’t already, give Iceberg 1.10 a spin, try out the new features, and share your feedback. We can’t wait to see what you build with it.
