Startup Spotlight: Making Snowflake Queries Smarter and Cheaper with Sundeck
Welcome to Snowflake’s Startup Spotlight, where we highlight the people and companies building businesses on Snowflake. In this Q&A series, Jacques Nadeau, Co-Founder and CEO of Sundeck and co-creator of Apache Arrow, talks about what inspires him to make powerful data tools available to all, how Sundeck’s query engineering platform can help Snowflake users, and why they “eat, sleep, and drink” Snowflake every day at Sundeck.
Tell us about yourself.
I’m Jacques Nadeau, Co-Founder and CEO of Sundeck. I’ve been working in both the commercial and open source data worlds for the last 25+ years. Among other things, I co-created Apache Arrow (used inside Snowflake, with over 50 million downloads per month), and I’m also Co-Founder and former CTO of Dremio.
What inspires you as a founder?
Empowerment. We all want to feel empowered, in both our personal and professional spheres. As a founder (and OSS creator), I’ve generally looked for new ways to empower people in their daily “data” life. A key theme has been democratizing database internals: the belief that powerful data tools should be available to the everyday data worker, not just those with PhDs in the field. Apache Arrow came out of the realization that in-memory processing and transfer should be something everyone can access. Apache Calcite, another project I’ve been involved in for the last decade, is focused on democratizing query optimization and transformation. Substrait, a newer open source project I recently co-created, focuses on standardizing query plans and relational algebra.
How does this translate into what Sundeck does?
I’ve had several chances to build systems that influence how queries are expressed, interpreted, and executed through sophisticated query flows. These flows help organizations block query anti-patterns, mitigate security and privacy risks, route queries to optimize performance and/or cost, and enrich user interactions to make data consumption easier and less error-prone. The systems that enable this “query engineering” fundamentally change the breadth of tools data engineers and analysts have to get their jobs done.
Unfortunately, this level of power has historically been something only the most resource-rich data organizations could build and maintain. Sundeck aims to put this kind of powerful platform in the hands of all data workers, in much the same way Apache Arrow helped data scientists and data engineers access high-performance in-memory computing.
How can a query engineering platform help Snowflake users?
Sundeck is a query engineering platform that allows analysts, data engineers, and DBAs to influence where, how, and what queries run on Snowflake.
Sundeck allows any Snowflake user to create a new query flow, which can then be configured to execute additional operations before and/or after a query runs in Snowflake. This simple concept is very powerful in practice. For example:
- Hate it when a new user mistakenly runs a query against a large fact table without limiting the results it returns? Just define a Sundeck reject pre-hook that uses our QLIKE query-matching technology to identify specific suboptimal query patterns.
- Want to shut down warehouses immediately during off-hours, as soon as the last running query completes? Define a Sundeck SQL post-hook that examines current load and time of day to suspend idle warehouses (see the first sketch after this list).
- Want to implement tighter cost controls so each user can run at most $50 worth of queries per day? Implement a Sundeck SQL post-hook that collects query activity and records it in a Snowflake table, which a SQL pre-hook then consults to reject excessive consumers unless a manager overrides (see the second sketch below).
- Want to automatically monitor dbt model-processing telemetry and reroute individual model executions to avoid excessive spilling, slow completion times, or warehouse inefficiency? Create a Sundeck SQL pre-hook that uses the model being created to look up historical trends of target warehouse size and the current warehouse load to route your operation intelligently.
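To make the off-hours example concrete, here is a minimal sketch of the kind of logic such a post-hook could run, written in plain Snowflake Scripting. The Sundeck hook wiring itself is not shown, and the warehouse name ANALYTICS_WH and the 7pm-to-7am window are hypothetical placeholders; the table function and statements are standard Snowflake features.

```sql
-- Hedged sketch: suspend a (hypothetical) warehouse off-hours once nothing is running on it.
EXECUTE IMMEDIATE $$
DECLARE
  current_hour   INTEGER;
  active_queries INTEGER;
BEGIN
  -- One query pulls the current hour and the number of queries still running on the
  -- warehouse (passing CURRENT_TIMESTAMP() as the range end so in-flight queries are included).
  SELECT HOUR(CURRENT_TIMESTAMP()),
         COUNT_IF(EXECUTION_STATUS = 'RUNNING')
    INTO :current_hour, :active_queries
    FROM TABLE(INFORMATION_SCHEMA.QUERY_HISTORY_BY_WAREHOUSE(
           WAREHOUSE_NAME     => 'ANALYTICS_WH',
           END_TIME_RANGE_END => CURRENT_TIMESTAMP()));

  -- Suspend only outside business hours (here, 7pm to 7am) and only once the
  -- last running query has completed.
  IF ((current_hour >= 19 OR current_hour < 7) AND active_queries = 0) THEN
    ALTER WAREHOUSE ANALYTICS_WH SUSPEND;
  END IF;
END;
$$;
```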
All of these patterns are hard to achieve without a query engineering platform, but they become trivial once you have one. It lets all Snowflake users get more from their investment and create query flows that were previously impossible.
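And here is a similarly hedged sketch of the bookkeeping behind the per-user budget example: the table name, columns, and $50 threshold are all hypothetical, and the Sundeck pre-/post-hook wiring that would run these statements is again not shown.

```sql
-- Hypothetical table a post-hook would append to after each query completes.
CREATE TABLE IF NOT EXISTS ops.query_spend (
  user_name    STRING,
  query_id     STRING,
  query_day    DATE,
  est_cost_usd NUMBER(10, 2)
);

-- Check a pre-hook could run: has the current user already burned through today's
-- $50 budget? A TRUE result would trigger the reject (absent a manager override).
SELECT COALESCE(SUM(est_cost_usd), 0) >= 50 AS over_budget
FROM ops.query_spend
WHERE user_name = CURRENT_USER()
  AND query_day = CURRENT_DATE();
```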
It sounds like Snowflake is a core technical component for Sundeck. Is that the case?
Absolutely. We eat, drink, and sleep Snowflake each and every day. As a company trying to provide seamless extensibility to Snowflake, the Snowflake Data Cloud gave us substantial initial acceleration, given all the things we would otherwise have needed to build ourselves. Chief among these is Snowflake’s strategy of being more than just a data warehouse and broadening what is possible. Additionally, Snowflake has a wide array of extension points, and we’ve been able to take advantage of most of them, including Snowflake Scripting, Snowpark, external functions, data sharing, OAuth support, and more.
What Snowflake architecture pattern have you implemented?
We try to do everything we can inside Snowflake. All of our data processing happens in Snowflake, as does much of the business logic around that data. Much of that work runs in our own account, and we use Snowflake data sharing to share per-account data back to our customers. We do run some systems outside Snowflake, given our need for more advanced patterns than Snowpark yet supports. (This will only get more exciting with the Snowflake Native App Framework.) Of Snowflake’s three main architecture patterns (hybrid, connected, and managed), we use the hybrid architecture.
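For readers unfamiliar with the sharing piece, this is the general shape of Snowflake secure data sharing we lean on. The database, schema, view, and account identifiers below are placeholders rather than our actual objects, so treat it as an illustration of the mechanism, not our production setup.

```sql
-- Provider side: create a share and expose a (placeholder) secure view through it.
CREATE SHARE IF NOT EXISTS sundeck_telemetry_share;
GRANT USAGE ON DATABASE telemetry_db TO SHARE sundeck_telemetry_share;
GRANT USAGE ON SCHEMA telemetry_db.per_account TO SHARE sundeck_telemetry_share;
GRANT SELECT ON VIEW telemetry_db.per_account.query_stats TO SHARE sundeck_telemetry_share;

-- Make the share visible to a specific consumer account (placeholder identifier).
ALTER SHARE sundeck_telemetry_share ADD ACCOUNTS = myorg.customer_account;

-- Consumer side: the share is mounted as a read-only database, e.g.
-- CREATE DATABASE sundeck_telemetry FROM SHARE <provider_account>.sundeck_telemetry_share;
```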
How has adopting Snowflake influenced your company?
Two pillars of the Snowflake ethos that we’ve fully embraced in our product are Snowflake’s fanaticism about security and its drive to make everything available via SQL. Both permeate our work and our product. Everything inside Sundeck can be managed by our customers using SQL, and we have some groundbreaking tech to make sure data always stays safe and in the hands of Snowflake and our customers.
What are you personally excited about next?
I’m super excited to be making Sundeck publicly (and freely) available for Snowflake users today. We’ve found that the patterns it introduces both enrich and expand the ways Snowflake can be used, and we hope other Snowflake users find the same.
I’m also very excited about Snowflake’s continued push into Data Cloud-managed computing. Snowpark was a great first step. Snowflake Native Apps and Streamlit are exciting but still just scratching the surface of the expressiveness we have in the broader domain of software development. With data as the backbone of many modern applications, Snowflake has an opportunity to fundamentally invert the way we build applications, allowing organizations to retain control of their operational and reporting data while leveraging modern technologies in a secure way. We’ve been saying “bring the processing to the data” for years. With the right tooling, we can achieve that at the application level and celebrate the demise of both ETL and reverse ETL!
Long term, the shift the entire industry is going through as we figure out new ways to apply powerful LLMs is exciting. It’s already drastically changing how we work with data. We’re still in our infancy in applying these tools and figuring out how best to leverage them. Chat is a great generic killer app, but merging our thinking with the acceleration of modern machine learning in a more streamlined and iterative way is going to fundamentally reshape our world. That’s daunting and awesome.