Partner & Customer Value

Automatic Query Optimization. No Tuning!

Automatic Query Optimization. No Tuning!

Hopefully you had a chance to read all our previous top 10 posts. We now conclude the series with a final deep dive into my #1 Cool Feature from Snowflake:

#1 Automatic Query Optimization. No Tuning!

As a long-time data architect, and not a DBA, this is my favorite part of Snowflake. I do not have to worry about query performance at all. It is all handled “auto-magically” via a dynamic query optimization engine in our cloud services layer. I just model, load, and query the data.

Cloud Services

Snowflake is an MPP, columnar store thus designed for high speed analytic queries by definition. Our cloud services layer does all the query planning and query optimization based on data profiles that are collected automatically as the data is loaded. It automatically collects and maintains the required statistics to determine how to most effectively distribute the data and queries across the available compute nodes.

So, no indexes, no need to figure out partitions and partition keys, no need to pre-shard any data for even distribution, and no need to remember to update statistics.



Another key point to remember is that not only does this dynamic optimization auto-tune queries that work against traditional structured data, it also works against semi-structured data (like JSON, Avro, or XML). When semi-structured data is loaded into our VARIANT data type, as mentioned in an earlier blog post, the data is profiled and columnarized behind the scenes. With this approach, querying the data (using our JSON or XML extensions) is optimized just like for structured relational data, without having to spend the time and effort to split it out into relational tables and columns. Because of this unique approach to handling semi-structured data, Snowflake not only supports joining structured and semi-structured data with SQL, it optimizes those joins too.

Let go Luke - Trust the Force!

As you can imagine, there are some folks out there who do not like change and want us to add more knobs so they can manually tune their queries.

Why? Well, they say that they know their workload better than Snowflake and therefore can optimize it better than Snowflake software. Others say that they've already done all the work to optimize another platform for their workload and don’t want to “waste” all that time and effort.

Since I come from a traditional data warehouse background, and know many expert data warehouse architects who are world class at tuning specific databases to do specific types of queries,  I definitely understand and can sympathize with this point of view.

But the world of data has changed! The way we build, deploy, and use data (of all kinds) is continually evolving and we need to evolve our approach to to be successful.

Because of that, there are a few things to consider with regard to tuning:

  1. Your workloads will change.
  2. The sheer number of different workloads is going to grow faster than your bandwidth to manually tune (so someone has to prioritize who wins).
  3. What happens if your expert query tuner decides to sell real estate instead?

Relying on yourself, and other expert DBA tuning staff, is ultimately going to result in a performance tuning bottleneck, which will in turn lead to increased contention for resources in the data warehouse. No matter how good you are, in the end you will not be able to keep up. So why not use a data warehouse service to handle that work for you so you can focus on getting the business people access to all this data in a form they can use to make better business decisions? In short, people are the hardest thing to scale in any data analytics solution. Thus any solution that allows you to scale without bottlenecks on people is a winner.

And if the query is still not fast enough, with Snowflake you always have the option to resize your virtual warehouse to add additional compute resources, as we described in the post about elasticity. Without the benefit of a dynamic optimization engine, you could not change the compute configuration on the fly and see immediate improvements. In a many other environments, if you changed the number of compute nodes you would have to redistribute the data, or do some other manual tuning, in order to get the improved performance. That would not be elastic at all.

For all these reasons, our patent-pending dynamic optimization is, to me, one of the most important features when it comes to making the Snowflake Elastic Data Warehouse a true SaaS offering and enabling agile, high-speed analytics at any scale. As part of our vision, we developed an offering that helps customers concentrate on understanding and deriving value from their data rather than spending time refining, managing , and optimizing the environment. Business users and data scientists can easily analyze their data by connecting natively to Snowflake through various tools, and not have to worry about tuning and query optimization.

As I said before, as an architect and data warehouse designer, I really like this model!

End of this trail...more to come

Well, that is the last of my Top 10 list of cool features in Snowflake.  I want to thank Saqib Mustafa for all his help collaborating on this series of posts.

There are of course many other cool features in Snowflake, and more to come in the future. So keep your eyes on this blog for future posts with more details on the Snowflake Elastic Data Warehouse, the cloud-native data warehouse as a service.

As always, keep an eye on our Snowflake-related Twitter feeds (@SnowflakeDB), (@kentgraziano), and (@cloudsommelier) for continuing updates on all the action and activities here at Snowflake Computing.

Additional Links

Share Article

Data Cloning | Snowflake Fast Clone | Snowflake Blog

Find out how Snowflake Elastic Data Warehouse enables users to clone tables, schemas, and entire databases almost instantly with no additional storaged

Snowflake Arctic Cookbook Series: Instruction-Tuning Arctic

Explore the Snowflake Arctic Cookbook Series and get insights from AI research team on fine-tuning MoE models and optimizing AI performance.

How Snowflake Delivers on the National Cybersecurity Strategy

Find out how the White House’s National Cybersecurity Strategy aligns with Snowflake’s own mission to protect data at all costs.

Data Vault Automation with erwin and Snowflake

Read on to find out how the combination of Snowflake and erwin provides an end-to-end solution for a governed Data Vault with powerful performance.

Snowflake Versus Query Engines

When evaluating the differences between Snowflake and query engines, consider the origins of the platforms. This provides an indication of how your data warehouse requirements are met today and how the platforms will evolve into the future.

Introducing Even Easier-to-Use Snowflake Adaptive Compute with Better Price/Performance

Discover how Snowflake Adaptive Compute delivers intelligent scaling and optimization for easier to use warehouses with better price/performance.

Automatic Encryption of Data

One of the biggest concerns about moving to the cloud is security. Snowflake provides automatic data encryption by default. Learn more.

Saving Time & Space: Simplifying DevOps with Fast Cloning

Snowflake Elastic Data Warehouse fast cloning allows you to have multiple copies of your data without the additional cost of storage usually associated with replicating data.

Fetching Query Results From Snowflake Just Got a Lot Faster With Apache Arrow

Fetching Query Results From Snowflake Just Got a Lot Faster With Apache Arrow

Subscribe to our blog newsletter

Get the best, coolest and latest delivered to your inbox each week

Where Data Does More

  • 30-day free trial
  • No credit card required
  • Cancel anytime