Database Fun — as a Service!
September 14, 2018
Author: Marcin Zukowski
Engineering, Snowflake News
When talking about Snowflake, we typically focus on the amazing capabilities that Cloud and the SaaS model enable, such as simplicity, elasticity, scalability, HA, and more.
However, Cloud and SaaS also dramatically change how we develop our software.
I’ve worked on the internals of six different databases (PostgreSQL extensions, MonetDB, SQL Server, a project at Google, Vectorwise, and now Snowflake). Of these, Snowflake is the one delivered as a pure service solution. I expected that things would be different, but they turned out to be even more different than I expected.
So what is different for developers? A lot! I tried to list some of the most exciting things below. I know many of them are “obvious” to people working in such environments, but coming from a packaged software world, SaaS is full of revelations.
Only one version to manage
Although the companies I worked at had a relatively small number of versions, I can sympathize when my colleagues tell war stories from their previous companies about the pains of trying to reproduce and fix a problem in a really old product version or a problem unique to a specific supported OS. The time spent on developing, testing and maintaining all these different versions was clearly painful and frustrating to them.
In a SaaS offering like Snowflake, there is only one platform and only one software version to manage. It saves a huge amount of time — and avoids developer frustration. 🙂
When working on a big software project, it often takes years for code you’re writing today to be released. And then it often takes years for that code to be adopted by a critical mass of your users. As a result, you get real customer feedback on it long after you have already forgotten about it.
For us it couldn’t be more different. Putting in place strong test and release automation allows us to rapidly do a huge amount of testing to ensure that the new code is solid. Combine that with non-disruptive online upgrades (something almost impossible in packaged software) and we can release code whenever it is ready – no reason to wait. That has the added benefit that each release is far smaller and far less complex. With packaged software, the release model forces a lot of changes to be released simultaneously, dramatically increasing the complexity of testing and deployment.
It’s amazingly satisfying to see the new features and improvements you worked on available to customers so quickly. Our new engineers in particular love to see their features out in a matter of days instead of years.
Oh, and the customers also love it! They get fixes faster, new features sooner, and all without a painful, complex and risky upgrade process.
Test exactly what will happen in production
For a developer it would be ideal to test new features in a real production environment. With packaged software that’s practically impossible. Customers would never want to test new features in their production environment because of the risk. And no matter how hard you try (even if you’re able to spend the money to duplicate your production infrastructure), you can never exactly replicate the stresses the production environment will put on the software.
We’re able to eliminate that dilemma. Because of the cloud, it’s easy for us to exactly replicate the production environment. We can test at huge scale when needed without buying huge amounts of hardware, we can replicate the exact software environment in production, and more.
We can even safely do tests in a production environment. To make that possible (and safe) we did two things: we allow more than one version of our software to run in our production environment at once, and we make sure that every feature can be turned on or off by a simple parameter that we can change on the fly for individual accounts in Snowflake.
That means that we can “bake” our new code into the production environment for additional testing without changing what customers are running on. What’s also cool is that we can transparently run actual customer queries on the new code and old code at the same time to verify results.
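To make the idea of running a query on old and new code side by side a bit more concrete, here is a minimal Python sketch. It is not Snowflake’s actual implementation. The names shadow_compare, current and candidate are hypothetical, and each "engine" is assumed to be a plain function that takes SQL text and returns rows as tuples. The customer is always served by the current code, and the candidate’s result is used only for verification.

```python
from __future__ import annotations

from collections import Counter
from typing import Callable, Iterable

Row = tuple
# Hypothetical engine interface: SQL text in, rows (as tuples) out.
Engine = Callable[[str], Iterable[Row]]


def shadow_compare(query: str, current: Engine, candidate: Engine) -> tuple[list[Row], bool]:
    """Run `query` on both engines.

    Returns the rows from the current engine (what the customer sees)
    and a flag saying whether the candidate produced the same result.
    """
    served = list(current(query))
    try:
        shadow = list(candidate(query))
    except Exception:
        # A failure in the candidate must never affect the customer.
        return served, False
    # Compare as multisets: without ORDER BY, row order is not guaranteed.
    return served, Counter(served) == Counter(shadow)


# Usage with stand-in engines (assumed, for illustration only).
old_engine = lambda q: [(1, "a"), (2, "b")]
new_engine = lambda q: [(2, "b"), (1, "a")]
rows, matched = shadow_compare("SELECT id, name FROM t", old_engine, new_engine)
```

Comparing results as multisets rather than lists is a deliberate choice in this sketch: two correct engines may legitimately return the same rows in a different order.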
In many cases we can also turn on (and off) new features customer by customer. We’ve used that capability several times to roll out new features carefully, and even to immediately disable functionality when a customer sees a problem because of it.
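The per-account switches described above can be sketched in a few lines of Python. This is only an illustration under my own assumptions, not how Snowflake stores its parameters: the FeatureFlags class, its method names, and the feature name "new_join_algorithm" are all hypothetical. The point is simply that a feature has a global default, and an individual account can override it on the fly.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class FeatureFlags:
    """Hypothetical in-memory store of feature parameters."""

    # Global default per feature name; unknown features count as off.
    defaults: dict[str, bool] = field(default_factory=dict)
    # Per-account overrides keyed by (account, feature).
    overrides: dict[tuple[str, str], bool] = field(default_factory=dict)

    def set_for_account(self, account: str, feature: str, enabled: bool) -> None:
        """Turn a feature on or off for one account only."""
        self.overrides[(account, feature)] = enabled

    def clear_for_account(self, account: str, feature: str) -> None:
        """Remove the override so the account follows the default again."""
        self.overrides.pop((account, feature), None)

    def is_enabled(self, account: str, feature: str) -> bool:
        """An account override wins; otherwise fall back to the default."""
        default = self.defaults.get(feature, False)
        return self.overrides.get((account, feature), default)


# Usage: roll a feature out to one account, then disable it again.
flags = FeatureFlags(defaults={"new_join_algorithm": False})
flags.set_for_account("acme", "new_join_algorithm", True)
flags.set_for_account("acme", "new_join_algorithm", False)
```

Keeping the override separate from the default means disabling a feature for one customer never touches what any other customer is running.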
Insight into how our product is actually being used
With SaaS, we have instant insight into how our product is being used. That lets us notice problems, identify improvement opportunities, and validate that everything works as expected.
To do it well, however, we had to design our system from the beginning with that in mind. On one side, things like metrics gathering and logging are built into the product foundations. And on the other side, we needed to create user interfaces to present this information to our development and operations teams. None of these can be afterthoughts.
Of course, all that comes with an important grain of salt: by design, Snowflake does not have access to our customers’ data. For privacy and security reasons, that’s a critical design requirement.
Even though we have amazing engineers, we do make mistakes (really! :)) and problems do occur. From previous experience I remember endless struggles reproducing a problem. You often needed users to compile all the detailed information about queries, schemas, data, configuration, operating system and hardware to have any chance of reproducing it. At Snowflake, when a problem occurs the software automatically gathers all the important information and files an issue in our tracking system. In some cases that has let us go from finding an issue to having a fix in production in a matter of hours, sometimes even before the customer noticed a problem.
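As a rough illustration of what "gather the information and file an issue automatically" can look like, here is a small Python sketch. Everything in it is an assumption made for the example: the function run_with_incident_report, the file_issue callback standing in for a tracking system, and the context keys. Note that the report carries only metadata about the failure, never customer data.

```python
import platform
import traceback
from typing import Any, Callable, Mapping


def run_with_incident_report(
    task: Callable[[], Any],
    context: Mapping[str, Any],
    file_issue: Callable[[dict], None],
) -> Any:
    """Run `task`; if it fails, file a report and re-raise the error.

    `context` holds metadata the caller already knows (for example a
    query id or software version). `file_issue` is a hypothetical hook
    into an issue tracker.
    """
    try:
        return task()
    except Exception as exc:
        report = {
            "error": f"{type(exc).__name__}: {exc}",
            "traceback": traceback.format_exc(),
            "python_version": platform.python_version(),
            "context": dict(context),
        }
        file_issue(report)
        raise  # The failure is still reported to the caller as usual.


# Usage: collect filed reports in a list instead of a real tracker.
filed = []


def failing_task():
    raise ValueError("bad plan")


try:
    run_with_incident_report(failing_task, {"query_id": "q-123"}, filed.append)
except ValueError:
    pass
```

Because the report is assembled at the moment of failure, nobody has to ask the user afterwards which query, version, or configuration was involved.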
And we can fix bugs that customers hit but never noticed or did not report!
What it adds up to: faster innovation
These differences make us, as developers, dramatically more efficient. We focus on things that matter and avoid the obstacles found with traditional models, which lets us develop features in a fraction of the time. The quick feedback loop lets us see users’ reactions right away and adapt quickly.
All this allows us to innovate much faster than in traditional environments, and also gives us confidence that what we’re building is what our users really need. This is why we were able to build something as complex and powerful as Snowflake in such a short time.
For all of this to work, amazing people are also needed. If what we do sounds like fun to you, make sure to visit our careers page!