Snowflake’s Data Cloud is powered by a single engine. From day 1, we have been focusing on consistently evolving and improving this engine to allow existing workloads to run more efficiently and enable new workloads to run on Snowflake. The single engine approach translates into a single experience—from one consistent pricing model to an integrated approach combining performance, security, governance, and the foundation to seamlessly enable cross-region or cross-cloud scenarios. Ultimately, the result is the elimination of complexity. There’s no need to manage multiple engines, services, complex system architectures, or data flows. You won’t need to figure out different pricing implications when deploying data engineering, analytics, or data science workloads. You also get more platform innovation more frequently. Nearly every improvement we make applies across the board rather than to a single use case or workload. At Summit, we announced a number of new engine capabilities we are going to summarize in this blog post.
Transparent engine updates
For those of you running on AWS, you will get faster performance for all of your workloads. We’ve optimized Snowflake to take advantage of new hardware improvements offered by AWS, and we are seeing 10% faster compute on average in the regions already rolled out. No user intervention or choosing a particular configuration is required for this latest performance enhancement.
On the execution side, we are happy to announce that Join Eliminations will be in public preview soon. These will save substantial query execution time. Snowflake will now automatically detect and eliminate unnecessary joins from your query rather than relying on the application layer. For example, one of our government customers was able to simplify their migration from Teradata to Snowflake without the need to change their application stack or rewrite queries.
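To see why a join can be unnecessary in the first place, here is a minimal Python sketch. It is a toy illustration under stated assumptions, not Snowflake's optimizer: the `orders` and `customers` tables, the `left_join` and `project` helpers, and the column names are all hypothetical. It assumes a LEFT JOIN against a table whose join key is unique, with the query reading only columns from the left-hand table. Under those assumptions the join neither adds nor removes rows, so skipping it yields the same result.

```python
# Toy illustration only: this is NOT how Snowflake implements join elimination.
# All table and column names below are hypothetical.
orders = [
    {"order_id": 1, "customer_id": 10, "amount": 25.0},
    {"order_id": 2, "customer_id": 11, "amount": 40.0},
    {"order_id": 3, "customer_id": 10, "amount": 15.0},
    {"order_id": 4, "customer_id": None, "amount": 5.0},
]
# customer_id is unique here, so each order matches at most one customer.
customers = [
    {"customer_id": 10, "name": "Ada"},
    {"customer_id": 11, "name": "Grace"},
]


def left_join(left, right, key):
    """Naive LEFT JOIN: keep every left row; unmatched rows get None columns."""
    right_columns = [c for c in right[0] if c != key]
    joined = []
    for lrow in left:
        matches = [
            r for r in right
            if lrow[key] is not None and r[key] == lrow[key]
        ]
        if not matches:
            matches = [{c: None for c in right_columns}]
        for rrow in matches:
            joined.append({**lrow, **{c: rrow[c] for c in right_columns}})
    return joined


def project(rows, columns):
    """Keep only the requested columns, in order."""
    return [tuple(row[c] for c in columns) for row in rows]


# The query only asks for columns that live in `orders`.
wanted = ["order_id", "amount"]
with_join = project(left_join(orders, customers, "customer_id"), wanted)
without_join = project(orders, wanted)

# Same rows either way, so the join did no useful work.
join_is_redundant = with_join == without_join
print(join_is_redundant)
```

If the key in `customers` were not unique, or the query selected `name`, the two results would differ and the join could not be dropped.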
Also, we improved out-of-the-box substring searches, now running up to twice as fast. Business applications that rely on searching through large text documents for a word or a phrase will run faster without changing application code. This update is currently being rolled out across regions.
On the compilation side, common table expressions (CTEs) are now faster, with compilation time reduced by up to 30%. CTEs have become a convenient way to simplify and manage complex SQL queries, improving readability and productivity for our SQL users. You will now see faster compilation of those often complex queries.
We also made secure views, which are used for data sharing, faster; this improvement is now in private preview. A large financial services firm saw up to a 50% reduction and considers this improvement a must-have for its data collaboration use case across the globe.
Complex analytical queries typically involve joins with very large tables. We are excited to announce a 36% compilation time improvement, on average, for this class of complex queries, observed on Snowflake AWS US West. This has rolled out to all regions. For the data geeks, this was achieved by improving the performance of calculating the “number of distinct values (NDV),” which is a core step for such queries.
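For readers unfamiliar with NDV, the following Python sketch shows what the quantity is and one reason a planner cares about it. This is an assumption-laden illustration, not Snowflake's method: the post does not describe how NDV is computed internally, and the join-size formula below is the common textbook heuristic (rows on each side multiplied together, divided by the larger NDV). The function names and sample keys are hypothetical.

```python
def ndv(values):
    """Exact number of distinct non-null values in a column."""
    return len({v for v in values if v is not None})


def estimate_join_rows(left_keys, right_keys):
    """Textbook equi-join size estimate: |L| * |R| / max(NDV(L), NDV(R)).

    This is a generic heuristic, not a description of Snowflake's planner.
    """
    denominator = max(ndv(left_keys), ndv(right_keys))
    if denominator == 0:
        return 0
    return len(left_keys) * len(right_keys) // denominator


# Hypothetical join-key columns from two tables.
fact_keys = [1, 1, 2, 3, 3, 3]
dim_keys = [1, 2, 3, 4]

fact_ndv = ndv(fact_keys)
dim_ndv = ndv(dim_keys)
estimated_rows = estimate_join_rows(fact_keys, dim_keys)
print(fact_ndv, dim_ndv, estimated_rows)
```

Because an estimate like this is needed for every join the planner considers, a faster NDV calculation shortens compilation for queries with many large joins.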
Soon to be in private preview, data policy evaluations for your data governance use cases are getting faster through memoizable functions, which speed up complex policies that rely on multiple mapping tables. Many of our customers are protecting their tables today and are looking for zero or minimal performance overhead.
We’ve kept improving the high degree of concurrency that has been a critical component of our scalable platform. We’re happy to announce a number of new critical improvements.
You can now perform write-heavy DML workloads significantly faster. Users who have been running hundreds of DML operations per second saw a 10% reduction in latency on average, and some saw as much as a 40% reduction.
Last year, we announced that Snowflake can now power interactive dashboards and embedded analytic use cases. This was the result of the team making precise improvements: reducing latency and improving concurrent select query processing. And that was just the beginning. We’ve continued to find new areas to improve and further reduced the latency of very short 100ms queries by another 10%.
Compressed, self-organizing storage
Last year, we announced a major improvement to our compression algorithm. This saved customers 30% off their storage bill on average—specifically improving the way some data and file formats are handled. This change was fully managed with no user action or disruptions required.
Today, we are excited to announce that we have started to release another compression improvement for numeric data types, seeing a 7-10% reduction in storage cost. Again, this is fully transparent for you as the end user. No action is required to see these benefits this year.
Accelerating new workloads
Location analytics and working with geospatial data is one exciting area of growth for our customers. To make migrating geospatial use cases to Snowflake even easier, we are adding GEOMETRY support in public preview soon. We’re dedicated to making Snowflake the best place for location analytics, and this is just one way we’re delivering that.
As part of the Data Cloud, you get robust language support for working with geospatial data, plus the ability to tap into a rich set of data from the Marketplace to enrich your analysis and share insights with your organizations or your ecosystem. And we give you a number of flexible options to work with this data, including SQL-based Search and integrations with partners like Carto and ESRI.
Search Optimization Service is another area that we are continuously improving. Last year, we announced its general availability, letting you analyze billions of rows of data and find answers to specific narrow questions fast. Simply activate it on a table and Snowflake handles the rest—automatically tracking additions, updates, or deletions while preserving the optimal access structure underneath. At Summit, we highlighted a few improvements to our Search Optimization Service: support of more data types, including VARIANT, text, and geospatial, all in private preview. In particular for geospatial, this means searches on maps are now up to 5x faster.
We are also announcing the availability (in private preview) of a new syntax for configuring the Search Optimization Service, allowing customers to select the columns they want to use for search. This gives customers another lever to minimize their Search Optimization costs.
Moreover, since mid-June, customers who use the Search Optimization Service have been spending on average 25% less, and in many cases up to 50% less, on Search Optimization compute. This is because we optimized the background maintenance of Search Optimization so that these operations consume fewer resources, resulting in very significant savings for customers.
We are also happy to introduce our latest serverless feature, Query Acceleration Service, now in public preview for Enterprise edition and above. Query Acceleration Service enables more flexibility in balancing cost and performance. Whereas multi-cluster virtual warehouses allow for horizontal scaling to handle more concurrency, Query Acceleration Service enables vertical scaling to accelerate queries.
As we’ve been developing these capabilities, we’ve aimed to increase the predictability and transparency of Snowflake performance, and we are announcing today a series of new workload management capabilities.
Now available in private preview, we’re releasing programmatic access to query profile statistics. This will improve your ability to identify and troubleshoot challenging queries at scale. In addition, we’re introducing more transparency into the efficacy and benefits of two serverless compute services: Search Optimization Service and Automatic Clustering Service, which automate the maintenance and management of clustering. With this update, you’ll be able to see key metrics to understand the impact and benefits of Search Optimization and Automatic Clustering services as well as their impact on performance due to frequent table changes from DML operations.
In addition to these new workload management capabilities, we are introducing new cost governance capabilities. Available now in private preview is a new Budgets feature to monitor and get alerts on multiple compute services (e.g., warehouse and materialized views). We are also adding cost controls for automatic clustering, now in private preview. We’re also making it easier to use tagging for cost governance. You can add tags like Finance or Marketing to accounts and objects and see a consolidated cost tags view using organization usage. This is available in private preview.
My journey with Snowflake started over 7 years ago. Looking back on these years, Snowflake’s engine has evolved and now powers very diverse workloads, from traditional analytics to modern data-centric applications and experiences, and allows organizations large and small to collaborate effectively, including by sharing data and applications. This engine and platform enable Snowflake’s Data Cloud. But our core mission has not changed: One Engine, One Product, One Integrated Experience.
To learn more about these innovations, watch the Summit 2022 keynote featuring the core engine and platform, along with other selected sessions, on demand.
Forward Looking Statements
This post contains express and implied forward-looking statements, including statements regarding (i) Snowflake’s business strategy, (ii) Snowflake’s products, services, and technology offerings, including those that are under development or not generally available, (iii) market growth, trends, and competitive considerations, and (iv) the integration, interoperability, and availability of Snowflake’s products with and on third-party platforms. These forward-looking statements are subject to a number of risks, uncertainties and assumptions, including those described under the heading “Risk Factors” and elsewhere in the Quarterly Reports on Form 10-Q and the Annual Reports on Form 10-K that Snowflake files with the Securities and Exchange Commission. In light of these risks, uncertainties, and assumptions, actual results could differ materially and adversely from those anticipated or implied in the forward-looking statements. As a result, you should not rely on any forward-looking statements as predictions of future events.
© 2022 Snowflake Inc. All rights reserved. Snowflake, the Snowflake logo, and all other Snowflake product, feature and service names mentioned herein are registered trademarks or trademarks of Snowflake Inc. in the United States and other countries. All other brand names or logos mentioned or used herein are for identification purposes only and may be the trademarks of their respective holder(s). Snowflake may not be associated with, or be sponsored or endorsed by, any such holder(s).