How to Build Successful Data Applications on Snowflake
Jan 16, 2020 | 3 Min Read
Author: Snowflake Staff
There has never been a better time to build SaaS data applications. International Data Corporation (IDC) says that big data and business analytics (BDA) solutions will experience double-digit annual growth over the next three years, and worldwide BDA revenue will reach $274.3 billion by 2022. But to be successful, the apps must be able to ingest and analyze large volumes of data quickly and easily.
A cloud-built data platform provides the development infrastructure for data apps to handle modern customer demands. Many data apps today are built on traditional data stacks, including legacy on-premises and “cloud-washed” data warehouses. These lack the attributes that make modern apps successful.
Snowflake Cloud Data Platform provides the stack you need to develop and scale modern data applications. Snowflake is built on and for the cloud, which provides fundamental benefits that become clear when you examine its architecture, deployment, and operations.
If you’re looking to build your data applications on Snowflake Cloud Data Platform, here are three best practices that will help you deliver remarkable customer experiences while guaranteeing the right framework and support for your own organic growth.
- SELECT STRATEGIC VIRTUAL WAREHOUSE SIZES BY SERVICE OR FEATURE
When you set up and customize your Snowflake solution, dedicate separate Snowflake virtual warehouses (compute clusters) to your workloads, based on usage needs. This practice helps lower compute usage by allocating right-sized compute resources to specific services, features, or workloads.
For example, rather than use a single large virtual warehouse (eight credits per hour), you may discover that a medium-size virtual warehouse (four credits per hour) and a small-size virtual warehouse (two credits per hour) match your application’s needs better. Together they consume six credits per hour, so this strategy saves two credits per hour without any sacrifice in performance.
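The arithmetic behind that comparison can be sketched in a few lines of Python. This is only an illustration: the credit rates are the ones quoted above, while the `hourly_credits` helper and the assumption that both smaller warehouses run for the full hour are made up for the example.

```python
# Credits per hour for the sizes named in the text.
CREDITS_PER_HOUR = {"small": 2, "medium": 4, "large": 8}


def hourly_credits(warehouses):
    """Sum the credits per hour for warehouses that run at the same time.

    Hypothetical helper: assumes every listed warehouse runs for the full hour.
    """
    return sum(CREDITS_PER_HOUR[size] for size in warehouses)


single_large = hourly_credits(["large"])
split_by_workload = hourly_credits(["medium", "small"])
savings = single_large - split_by_workload

print(single_large, split_by_workload, savings)  # 8 6 2
```

If either of the smaller warehouses suspends while idle, the real savings would be larger than this simple sum suggests.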
And, for those times when a heavy one-time analysis is needed, you can run queries on a separate right-sized warehouse that doesn’t impact other queries. If there’s a fixed amount of work, it often makes sense to use the biggest warehouse size. Query performance tends to scale linearly, so a large warehouse will end up delivering faster analysis at the same cost as a smaller warehouse that takes more time.
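To see why a fixed amount of work can cost the same on any size, here is a minimal sketch that assumes perfectly linear scaling, which real queries only approximate. The `run_cost` function and the four-hour baseline are hypothetical and chosen only for illustration.

```python
# Credits per hour for the sizes named in the text.
CREDITS_PER_HOUR = {"small": 2, "medium": 4, "large": 8}


def run_cost(size, small_hours):
    """Return (hours, credits) for a job that needs `small_hours` on a small warehouse.

    Assumption: speed scales in proportion to credits per hour (ideal linear scaling).
    """
    speedup = CREDITS_PER_HOUR[size] / CREDITS_PER_HOUR["small"]
    hours = small_hours / speedup
    return hours, hours * CREDITS_PER_HOUR[size]


# Hypothetical job that takes four hours on a small warehouse.
results = {size: run_cost(size, small_hours=4.0) for size in CREDITS_PER_HOUR}
for size, (hours, credits) in results.items():
    print(f"{size}: {hours} hours, {credits} credits")
```

Under this assumption every size consumes the same number of credits, and only the elapsed time changes. Queries that do not parallelize well would break the assumption and make the larger warehouse more expensive for the same work.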
- ADJUST MINIMUM AND MAXIMUM CLUSTER NUMBERS TO MATCH EXPECTED WORKLOADS
Select a warehouse size (small, medium, large) that provides adequate performance for each individual query that runs. Keep in mind that a given warehouse size can run individual queries twice as fast as the size below it, while each additional cluster allows the warehouse to run more queries in parallel to increase concurrency. Then, to maximize performance and minimize costs, adjust the virtual warehouse’s minimum and maximum number of clusters based on the concurrent throughput you expect for the workload. As workloads subside, clusters shut off one at a time, so you pay only for the resources needed at any given moment. This strategy provides consistent performance regardless of the number of queries.
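One way to reason about the minimum and maximum settings is a simple capacity model. The sketch below is not Snowflake’s actual scaling algorithm: the `clusters_running` function and the figure of eight concurrent queries per cluster are assumptions made for illustration.

```python
import math


def clusters_running(concurrent_queries, queries_per_cluster, min_clusters, max_clusters):
    """Estimate how many clusters are active for a given level of concurrency.

    Hypothetical model: each cluster serves `queries_per_cluster` queries at once,
    and the result is clamped to the configured minimum and maximum.
    """
    needed = math.ceil(concurrent_queries / queries_per_cluster)
    return max(min_clusters, min(max_clusters, needed))


# Assumed settings: 1 to 4 clusters, 8 concurrent queries per cluster.
for load in (0, 5, 20, 100):
    active = clusters_running(load, queries_per_cluster=8, min_clusters=1, max_clusters=4)
    print(f"{load} concurrent queries -> {active} cluster(s)")
```

In this model the minimum sets the baseline cost when the workload is quiet, and the maximum caps spend during peaks at the price of queuing any queries beyond the capped capacity.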
- TARGET WORKLOADS TO THE RIGHT SERVICES
If you’re ingesting multiple types of data from multiple sources, it’s important to recognize what your data needs and set up your architecture to support separate workloads that are targeted at the technologies that make the most sense. For example, you might want to process some streaming data in near real time and take action on it, while other data types might not need immediate attention and instead should be sent straight to storage for future complex analytical segmentation.
By building these capabilities into your architecture early on, you accelerate the ability to manage your data and derive fast insights exactly where they are needed. With Snowflake, every piece of data can be sent to two places, which makes it easy to both process it immediately and store it for later analysis. It’s also easy to handle unstructured data without forcing schemas on customers.
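The idea of sending data to two places can be sketched as a small fan-out step. This is a toy in-memory model, not a Snowflake API: the `Pipeline` class, the `kind` field, and the `"streaming"` label are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Pipeline:
    """Toy router: stores every record and also processes the urgent ones."""

    storage: list = field(default_factory=list)    # stand-in for long-term storage
    processed: list = field(default_factory=list)  # stand-in for near-real-time handling

    def ingest(self, record):
        # Every record is kept for future analytical work.
        self.storage.append(record)
        # Only records marked as streaming are also acted on right away.
        if record.get("kind") == "streaming":
            self.processed.append(record)


pipeline = Pipeline()
pipeline.ingest({"kind": "streaming", "value": 1})
pipeline.ingest({"kind": "batch", "value": 2})

print(len(pipeline.storage), len(pipeline.processed))  # 2 1
```

Keeping the routing decision in one place makes it easier to change which data types get immediate attention without touching the storage path.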
These are just a few of the best practices that will enable you to develop modern data applications on Snowflake Cloud Data Platform. For more best practices, download our ebook, 7 Best Practices for Building Data Applications on Snowflake.