Industry Benchmarks and Competing with Integrity

When we founded Snowflake, we set out to build an innovative platform. We had the opportunity to take into account what had worked well and what hadn’t in prior architectures and implementations. We saw how we could leverage the cloud to rethink the limits of what was possible. We also focused on ease of use and building a system that “just worked.” We knew there were many opportunities to improve upon prior implementations and innovate to lead on performance and scale, simplicity of administration, and data-driven collaboration.

In the same way that we had clarity about many things we wanted to do, we also had conviction about what we didn’t want to do. One such thing was engaging in benchmarking wars and making competitive performance claims divorced from real-world experiences. This practice is simply inconsistent with our core value of putting customers first.

Twenty years ago, the game of leapfrogging benchmark results every few months was a priority for the database industry and both of us were on the front line fighting the benchmark war. Posted results kept getting better and new world records were being set on a regular basis. Most in the industry started adding configuration knobs, special settings, and very specific optimizations that would improve a benchmark by a fraction of a percent. Unfortunately, many such changes translated into additional complexity for customers and, worse, most of them had little or even negative impact on customers’ day-to-day workloads. The negative results compound: Development teams are distracted from focusing on what really matters to customers, and users are left underserved with more complex technology. Anyone who has been in the industry long enough can likely attest to the reality that the benchmark race became a distraction from building great products for customers. There is a reason why all the relevant players in the database industry, those that are running the majority of customer workloads, have largely stopped publishing new results.

Since founding Snowflake, we have focused on our customers and their workloads, and not on synthetic benchmarks. We’re reiterating this philosophy today due to a recent benchmark published by Databricks that included comparisons to Snowflake. Though Databricks’ results are under audit as part of the TPC submission process, Databricks has turned the communication of a technical accomplishment into a marketing stunt lacking integrity in its comparisons with Snowflake. The Snowflake results that it published were not transparent, audited, or reproducible. Those results are also wildly incongruent with our internal benchmarks and our customers’ experiences.

We are therefore sharing our own results but, more importantly, we want to provide guidance on how simple it is to reproduce Snowflake results. We encourage anyone interested in the comparison to do their own assessment, and validate the inaccuracy of the Databricks blog post. The “Trust, but Verify” section later in this blog post shows how this can be achieved with Snowflake, using only a few mouse clicks.

This week, we ran the TPC-DS power run in our AWS-US-WEST cloud region. The entire power run consists of running 99 queries against the 100 TB scale TPC-DS database. Out of the box, all the queries execute on a 4XL warehouse in 3,760s, using the best elapsed time of two successive runs.¹ This is more than two times faster than what Databricks has reported as the Snowflake result, while using a 4XL warehouse, which is only half the size of what Databricks indicated it used for its own power run.

Using an even bigger 5XL warehouse, which is currently in Public Preview, further improves our performance and lowers the total elapsed time to 2,597s. However, because Snowflake already runs these queries so fast on a 4XL, fixed costs such as query startup time and synchronization overhead become more pronounced on a 5XL, reducing the speedup we get with that configuration. So for Snowflake, the warehouse size sweet spot at 100 TB is 4XL and not 5XL. The 5XL warehouse size would be useful for higher data volumes, such as 300 TB or 1 PB. In the rest of this blog post, all numbers are reported using a 4XL configuration. Finally, note that a Snowflake 4XL warehouse performs similarly to the 3,527s TPC-DS power run that Databricks audited on a much more powerful hardware configuration similar to a 5XL warehouse. Our 5XL in its current form significantly beats Databricks in total elapsed time (2,597s versus 3,527s), and we expect material improvements when it reaches General Availability.
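To make the diminishing speedup concrete, here is a minimal sketch that compares the two elapsed times quoted above. The elapsed times come from this post; the factor of two between a 4XL and a 5XL is an assumption made for illustration, and the function names are hypothetical.

```python
# Elapsed times quoted in this post, in seconds.
ELAPSED_4XL_S = 3760
ELAPSED_5XL_S = 2597

# Assumption for illustration: a 5XL provides twice the compute of a 4XL.
RESOURCE_RATIO = 2.0


def speedup(baseline_s: float, faster_s: float) -> float:
    """How many times faster the second configuration finished."""
    return baseline_s / faster_s


def scaling_efficiency(baseline_s: float, faster_s: float, resource_ratio: float) -> float:
    """Observed speedup as a fraction of the ideal (linear) speedup."""
    return speedup(baseline_s, faster_s) / resource_ratio


s = speedup(ELAPSED_4XL_S, ELAPSED_5XL_S)
e = scaling_efficiency(ELAPSED_4XL_S, ELAPSED_5XL_S, RESOURCE_RATIO)
print(f"speedup: {s:.2f}x, scaling efficiency: {e:.0%}")
```

Under that assumption, doubling the warehouse yields roughly a 1.45x speedup rather than 2x, which is consistent with fixed per-query overheads taking a larger share of the run at this data volume.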

The most misleading aspect of the Databricks blog post is how Databricks derives our price/performance. Our Standard Edition on-demand price for a 4XL warehouse run in the AWS-US-WEST cloud region is $256 for an hour. Since Snowflake has per-second billing, the 3,760s power run costs $267, which is Snowflake’s price/performance for the entire run, versus the $1,791 Databricks reported on our behalf. Note that there is no need to use our Enterprise Edition to run TPC-DS since materialized views and multi-cluster warehouses are not needed for the benchmark.
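The $267 figure follows directly from the hourly rate and the elapsed time. The sketch below reproduces the arithmetic; the function name is hypothetical, and it assumes compute is the only charge and that billing is strictly proportional to elapsed seconds.

```python
def per_second_cost(hourly_rate_usd: float, elapsed_s: float) -> float:
    """Cost of a run billed per second at the given hourly rate."""
    return hourly_rate_usd * elapsed_s / 3600


# Figures quoted in this post.
HOURLY_RATE_4XL_USD = 256  # Standard Edition, on-demand, 4XL warehouse
ELAPSED_4XL_S = 3760       # TPC-DS 100 TB power run on a 4XL

cost = per_second_cost(HOURLY_RATE_4XL_USD, ELAPSED_4XL_S)
print(f"${cost:.2f}")  # $267.38
```

The same function can be used to price your own run: substitute the elapsed time you measure and the hourly rate for your edition and region.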

Using Standard Edition list price, Snowflake matches Databricks on price/performance: $267 versus $275 for the on-demand price of the Databricks configuration used for the 3,527s power run that was submitted to TPC.²

The numbers above show that even without focusing time and energy on tuning our system to set records for marketing purposes, Snowflake delivers exceptional results. Our price performance is a key reason that our customers are migrating more and more workloads to Snowflake, and we are continuously improving the performance of Snowflake to make real customer workloads run faster, directly translating into overall price/performance improvements for them. In particular, we have many customer workload performance enhancements that are in the process of being released, and we have many more in development. We will continue to stay focused on customer outcomes, not competitive games.

Trust, but Verify

Anyone who is curious about where we stand regarding TPC-DS should run the benchmark themselves. With Snowflake, it literally only takes a few mouse clicks, about an hour from start to finish, and a few hundred dollars to run the full TPC-DS power test @ 100 TB. Anyone can do this without any prior experience. This is how simple Snowflake is, and a big part of our value proposition beyond just stellar performance.

Follow the three simple steps below to run the TPC-DS power run @ 100TB on Snowflake:

1. If you are not yet a Snowflake customer, create a new trial account by visiting signup.snowflake.com/. Choose the cloud and region you want for your first Snowflake account; for example, AWS in the US-West Oregon region. Since TPC-DS does not use materialized views or multi-cluster warehouses, simply use our Standard Edition. In a few seconds, you’ll receive an email allowing you to log in to your new Snowflake account.
2. Once logged in, you will land on the Snowflake worksheet tab. From there, open the TPC-DS 100 TB tutorial by clicking on the down arrow located on the right side of the worksheet tab named New Worksheet (top left of the screen), then select Open Tutorials, and select Tutorial 4: TPC-DS 100 TB Complete Query Test. This will open a new worksheet tab with the script you can use to run the TPC-DS power run @ 100 TB scale.
3. Run the script that has been loaded by selecting All queries and then clicking on the Run button. Wait for the entire script to execute; this should take around an hour.

If you look at the script, it first creates a 4XL warehouse to run the benchmark, and uses the TPC-DS 100 TB schema from the SNOWFLAKE_SAMPLE_DATA database. This database illustrates the power of data sharing in Snowflake. It was created out of the box by Snowflake and shared with all accounts in all Snowflake cloud regions. No need to pay for its storage and for loading data!

The creation of the warehouse should complete in under a second, and then the queries will start executing against the shared database. At the end of the script, the overall elapsed time and the geometric mean for all the queries are computed directly by querying the history view of all TPC-DS statements that have executed on the warehouse. You should see both metrics reported. Note that the results reported in the previous section were measured on a warm warehouse: we ran the full query set twice and took the second run into account. A warm run is generally about 5% faster than a cold run.
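If you export the per-query elapsed times and want to check the two summary metrics yourself, the calculation is short. The sketch below is only an illustration in Python, not the tutorial script, which computes these values from the query history. The function names and the sample durations are hypothetical.

```python
import math


def summarize(durations_s: list[float]) -> tuple[float, float]:
    """Return (total elapsed time, geometric mean) for per-query durations."""
    total = sum(durations_s)
    # Geometric mean: exponential of the average of the logarithms.
    geo_mean = math.exp(sum(math.log(d) for d in durations_s) / len(durations_s))
    return total, geo_mean


def best_run(runs: list[list[float]]) -> list[float]:
    """Pick the run with the lowest total elapsed time."""
    return min(runs, key=sum)


# Hypothetical durations in seconds for a three-query workload, run twice.
cold_run = [2.0, 8.0, 32.0]
warm_run = [1.0, 4.0, 16.0]

total, geo_mean = summarize(best_run([cold_run, warm_run]))
print(f"total: {total:.1f}s, geometric mean: {geo_mean:.1f}s")
```

The total elapsed time is dominated by the longest queries, while the geometric mean weighs every query equally in relative terms, which is why both are worth looking at.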

This illustrates the sheer simplicity and power of Snowflake: After logging into your new Snowflake account, running the entire benchmark only takes four mouse clicks!

What’s Next

Earlier this week, Databricks also announced a change to its service terms enabling competitive benchmarking and publishing of results with clear disclosures, and encouraging others in the industry to do so as well. While we will not be diverting our customer focus to chase synthetic benchmarks, we do agree with Databricks that anyone who publishes benchmarks should do them in a fair, transparent, and replicable manner. We also agree with the exclusion of preview capabilities from being benchmarked. In the end, we want customers to be able to make informed decisions about the capabilities of our platform. As such, we have updated our Acceptable Use Policy to reflect this philosophy. It’s disingenuous that in its representation of Snowflake’s performance Databricks didn’t follow the principles of reproducibility, disclosure, and beta feature testing it’s advocating.

We hope this perspective is useful, not only for this specific conversation but, more importantly, for the many entrepreneurs out there who get tempted to hop on a benchmark competition that risks distracting from the most crucial goal—innovating for customers’ benefit.

We are strong believers in competition—the industry gets better and, most importantly, customers reap the technical and economic benefits. But we will focus our competitive efforts on the merit of our technology and the advantages to actual customer workloads. Therefore, we will not publish synthetic industry benchmarks as they typically do not translate to benefits for customers. We know our customers get tremendous value from the capabilities and performance we’ve delivered, and will be excited about all the additional innovation that is coming.

The Snowflake results reported in this blog are derived from the TPC-DS 100 TB power run and as such are not comparable to published TPC-DS 100 TB power run results.

¹ The difference between a cold and a warm run is about 5%.

² The results described in Databricks’ blog post do not match the results submitted to the TPC council.
