How Snowflake Automates Performance in a Modern Cloud Data Warehouse
Oct 18, 2019 | 3 Min Read
Author: Andrew Meyendorff
Database administrators spend up to 80% of their time performing manual administrative tasks. What if much of that work were automated, so they could instead focus on analytics that empowered the business to increase growth and profitability?
With Snowflake’s automation capabilities, as well as its patented multi-cluster, shared-data architecture, it’s easy to take advantage of immediately available resources in the cloud, address common performance challenges, and maximize performance to give users the best analytics experience possible.
Here are three capabilities of Snowflake’s modern cloud-built data warehouse that enable it to perform significantly better than traditional data warehouses, freeing DBAs to focus on creating value for the business.
- Concurrent workloads can be isolated on dedicated compute resources.
In traditional data warehouses, administrators can add nodes to clusters to increase scale, but the architecture cannot support infinite scaling. At some point, the concurrency and performance benefits of additional nodes reach a limit as issues such as data skew and CPU context switching become more pronounced. As a result, performance suffers, forcing users to wait for results.
Because compute resources are not fixed in Snowflake, you can create as many virtual warehouses (independent compute clusters) as you need in your Snowflake account, and you can size, resize, suspend, or resume each one independently of the others.
And because the storage layer is independent of the compute layer, all virtual warehouses can share the entire data set. Any individual, user group, application, or automated workload can take advantage of dedicated, isolated compute resources while operating on a single source of truth—with no impact on others’ query performance.
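As a rough illustration of independent virtual warehouses, the following Python sketch only builds the SQL text you might send to Snowflake to create, resize, and suspend warehouses. The warehouse names (etl_wh, bi_wh), the helper functions, the sizes, and the auto-suspend timeout are assumptions made for this example, and the size list is a subset. Nothing here connects to a Snowflake account.

```python
# Hypothetical helpers: they build Snowflake SQL strings and do not execute them.
VALID_SIZES = ("XSMALL", "SMALL", "MEDIUM", "LARGE", "XLARGE")  # subset of sizes


def _check_size(size: str) -> None:
    if size not in VALID_SIZES:
        raise ValueError(f"unsupported size: {size}")


def create_warehouse_sql(name: str, size: str = "XSMALL", auto_suspend_s: int = 60) -> str:
    """SQL to create a warehouse that suspends itself when idle."""
    _check_size(size)
    return (
        f"CREATE WAREHOUSE IF NOT EXISTS {name} "
        f"WITH WAREHOUSE_SIZE = '{size}' "
        f"AUTO_SUSPEND = {auto_suspend_s} AUTO_RESUME = TRUE"
    )


def resize_warehouse_sql(name: str, size: str) -> str:
    """SQL to resize one warehouse without touching any other."""
    _check_size(size)
    return f"ALTER WAREHOUSE {name} SET WAREHOUSE_SIZE = '{size}'"


def suspend_warehouse_sql(name: str) -> str:
    """SQL to turn a warehouse off."""
    return f"ALTER WAREHOUSE {name} SUSPEND"


# One warehouse per workload: data loading and BI dashboards stay isolated.
statements = [
    create_warehouse_sql("etl_wh", "LARGE"),
    create_warehouse_sql("bi_wh", "SMALL"),
    resize_warehouse_sql("bi_wh", "MEDIUM"),
    suspend_warehouse_sql("etl_wh"),
]

for statement in statements:
    print(statement)
```

Each statement names a single warehouse, so resizing bi_wh or suspending etl_wh leaves the other workload's compute untouched while both read the same stored data.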
- Data is automatically micro-partitioned.
In traditional databases, good performance requires some combination of indexes, partition keys, shard keys, sort keys, manual query optimization, optimizer hints, and so on. Time that DBAs spend on these activities comes at the expense of performing higher-value work, such as getting access to new data sets or performing new types of analysis.
In contrast, Snowflake automatically divides data into micro-partitions based on the natural ingestion order and stores them in a central storage layer that all compute nodes can access. Snowflake automatically collects statistics on these partitions and updates them with every executed DML statement. Any maintenance on the partitions runs as an automatic background process, using resources separate from the ones running your workloads.
During a query, Snowflake reads only the partitions the query needs and automatically picks the optimal method for distributing them across compute nodes, based on the current size of your virtual warehouse. This makes Snowflake inherently more flexible and adaptive than traditional systems, while reducing the risk of hotspots.
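The way per-partition statistics let a query touch "just the partitions needed" can be sketched with a toy model in Python. This is a simplified illustration, not Snowflake's actual implementation: the MicroPartition class, the tiny partition size, and the single integer column are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class MicroPartition:
    """Toy partition: one column's values plus min/max statistics."""
    rows: List[int]
    min_value: int
    max_value: int


def build_partitions(values: List[int], rows_per_partition: int) -> List[MicroPartition]:
    """Split values in ingestion order and record statistics for each chunk."""
    partitions = []
    for start in range(0, len(values), rows_per_partition):
        chunk = values[start:start + rows_per_partition]
        partitions.append(MicroPartition(chunk, min(chunk), max(chunk)))
    return partitions


def scan_range(partitions: List[MicroPartition], low: int, high: int) -> Tuple[List[int], int]:
    """Return matching values and how many partitions had to be read."""
    matches: List[int] = []
    scanned = 0
    for partition in partitions:
        # Skip any partition whose statistics prove it cannot hold a match.
        if partition.max_value < low or partition.min_value > high:
            continue
        scanned += 1
        matches.extend(v for v in partition.rows if low <= v <= high)
    return matches, scanned


# Hypothetical "order day" column, loaded roughly in date order.
order_days = [1, 1, 2, 3, 3, 4, 5, 5, 6, 7, 8, 8]
partitions = build_partitions(order_days, rows_per_partition=4)
matches, scanned = scan_range(partitions, low=5, high=6)

print(f"matches={matches}, scanned {scanned} of {len(partitions)} partitions")
```

Because the values arrive roughly in order, each partition covers a narrow range, so the range filter can rule out the first partition from its statistics alone without reading its rows.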
- Every layer of the system can self-tune and self-heal.
When different pieces of a traditional system start to break down, whether due to software or hardware defects or just an intensive workload, the static nature of the system prevents them from healing without manual intervention. DBAs must be prepared to analyze workloads and tweak any of the hundreds or thousands of controls that might be available.
Snowflake builds high availability and the ability to self-tune and self-heal into every layer of the system. For example, as data is written to a table in Snowflake, it is synchronously written to highly durable cloud storage in three different data centers.
If your compute cluster in one data center starts losing machines or if the entire data center goes down, Snowflake can instantly provision a cluster in another data center that still has access to all your data. This capability extends to every layer of the Snowflake service.
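The combination of synchronous writes to three data centers and compute that can move between them can be pictured with a small Python toy model. Everything in it is an assumption made for illustration (the DataCenter and ToyService classes, the data center names, the in-memory dictionaries); it is not a description of how the Snowflake service is built.

```python
class DataCenter:
    """Toy data center holding its own copy of the data."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.available = True
        self.storage = {}


class ToyService:
    """Toy service: data is copied to every data center, compute lives in one."""

    def __init__(self, centers) -> None:
        self.centers = centers
        self.compute_home = centers[0]

    def write(self, key: str, value: str) -> None:
        # Synchronous in this toy: write() returns only after every copy is stored.
        for center in self.centers:
            center.storage[key] = value

    def read(self, key: str) -> str:
        if not self.compute_home.available:
            # "Provision" compute in the first data center that is still up.
            # Raises StopIteration if none is available; the toy ignores that case.
            self.compute_home = next(c for c in self.centers if c.available)
        return self.compute_home.storage[key]


centers = [DataCenter("dc-a"), DataCenter("dc-b"), DataCenter("dc-c")]
service = ToyService(centers)
service.write("order-1", "shipped")

centers[0].available = False          # simulate losing the first data center
value = service.read("order-1")       # compute moves, data is still readable

print(value, "served from", service.compute_home.name)
```

The read succeeds after the failure only because the write had already reached all three copies, which is why the synchronous write and the relocatable compute work together.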
A commitment to performance
With the arrival of the cloud-built data warehouse, manual performance optimization largely becomes a challenge of the past. By using Snowflake, everyone at your organization can benefit from performance automation that requires very little manual effort or maintenance.
Best of all, we at Snowflake are available to help you with performance automation strategies to ensure every user receives all the benefits of exceptional performance, and we will continue to invest in making your life easier when it comes to analyzing your data.
To learn more about how Snowflake enables you to deliver significant performance improvements over traditional data warehouses, download our ebook, “How Snowflake Automates Performance in a Modern Cloud Data Warehouse.”