Happy Holidata 2019!
Dec 18, 2019 | 2 Min Read
Author: Rob Smoot
At Snowflake, performance is the gift that keeps on giving the whole year. Performance is an area of continuous improvement, and our engineering team is regularly working on ways to make the superior performance and concurrency of Snowflake’s cloud data platform even better. As in years past, our gift to you this holiday season is a platform that continues to deliver on your needs and the promise of cloud scale, regardless of what you throw at it. And since one of Snowflake’s key benefits is that you pay only for what you use, these performance improvements lead to shorter compute cycles and lower cost of ownership.
Throughout 2019, we added several performance improvements to Snowflake, all without downtime or the need for maintenance windows. The performance benefits were impressive:
- New scan improvements benefit most queries with large scans and improve performance by up to 7x in many cases.
- Query compilation improvements reduced average compilation time across the system by over 40%.
- JOIN improvements increase performance for qualifying queries by up to 10x.
- Aggregation improvements also deliver gains of up to 10x for affected queries.
In case you missed them, here’s a list of key capabilities we released that were on your performance wish list:
- Automatic Clustering seamlessly and continually manages all needed reclustering of clustered tables.
- Materialized Views (MVs) improve query performance for workloads composed of common, repeated query patterns and always reflect what is in the source table.
- MVs on external tables, which reference data stored in files in an external stage such as Amazon Simple Storage Service (Amazon S3), Microsoft Azure Blob Storage, or Google Cloud Storage, improve Snowflake query performance over data kept in cloud storage or external data lakes.
- Persisted query result reuse improvements allow a broader set of queries to avoid regenerating results when nothing has changed.
- Apache Arrow as a result format improves query fetch performance by avoiding the overhead of serializing and deserializing data structures, which brings big performance gains with JDBC and Python.
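To make the idea of reusing results "when nothing has changed" concrete, here is a minimal Python sketch. It is not Snowflake's implementation: the `ResultCache` class, the per-table version counter, and the whitespace normalization are all assumptions made for illustration. It only shows the general technique of keying a stored result on the query text plus the state of the tables it reads.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


def normalize(query: str) -> str:
    """Collapse runs of whitespace so trivially different texts share a key.
    (Hypothetical rule; a real system may match queries differently.)"""
    return " ".join(query.split())


@dataclass
class ResultCache:
    """Toy result cache: a result is reused only while every table the
    query reads is still at the version it had when the result was stored."""
    table_versions: Dict[str, int] = field(default_factory=dict)
    results: Dict[tuple, List[tuple]] = field(default_factory=dict)
    executions: int = 0  # how many times a query actually had to run

    def record_change(self, table: str) -> None:
        # Any write to a table bumps its version, which changes the cache
        # key for every query that reads from it.
        self.table_versions[table] = self.table_versions.get(table, 0) + 1

    def run(self, query: str, tables: List[str],
            execute: Callable[[], List[tuple]]) -> List[tuple]:
        versions = tuple((t, self.table_versions.get(t, 0))
                         for t in sorted(tables))
        key = (normalize(query), versions)
        if key not in self.results:
            # Cache miss: pay for the computation once and store the result.
            self.executions += 1
            self.results[key] = execute()
        return self.results[key]
```

In this sketch, running the same query twice against an unchanged table executes it once, while calling `record_change` on that table forces the next run to recompute. The compute saved on the repeated run is the part that maps to the pay-for-what-you-use benefit described earlier.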
The best part? All of these performance features were added at no additional cost!
So what does 2020 hold in store? Suffice it to say, performance will remain a top priority for Snowflake next year. Exactly what we will deliver, only Santa (aka Snowflake engineering) knows, but if the past is any indication of the future, you can expect us to share a festive list of new performance features again this time next year.