When it comes to data, state of the art is an ever-moving target. There’s also a lot of hype around “cutting edge” that isn’t always grounded in reality. At Snowflake, we’re able to see how cutting-edge companies are actually working with data on our platform. In our inaugural Data Trends report, we examined actual usage data from those companies, rather than simply what leaders feel about their data operations, and surfaced four trends redefining the way modern companies succeed:
1. Companies are connecting data everywhere they can. Yet in some ways it is growing harder to connect your data, despite years of talk about knocking down data silos. The number of SaaS applications keeps increasing, each a potential silo, and modern companies stretch their cloud resources across multiple providers. The number of organizations with data across all three major public clouds grew by 207% over the year*. Rising data complexity undermines the drive toward more AI/ML use cases. These advanced tools require more and more computational workloads to be built, and if you don’t connect all your data to a single, all-encompassing data source, you’re bound to fall behind.
2. State-of-the-art companies are bringing their work to the data—not vice versa. Like many data platforms, Snowflake advocates storing all your data in one place. But that alone still creates challenges if you have to pull out and prep discrete data sets for each kind of job you want to do. On our platform, we’re seeing users embrace the next step: doing meaningful work with all that data together, rather than extracting and uploading the data to each new application. Companies are able to do more for less by eliminating siloed infrastructure stacks. Tasks that used to take a team upwards of six weeks can be done in mere days. A Snowflake-specific example comes to mind: Our CEO, Frank Slootman, called me on a Thursday. He wanted a generative AI solution that would make it easier to navigate all our sales data. Frank wanted to be able to pose any sales question with natural language, not code, and get an answer. In just two days we developed an app with a simple interface powered by Streamlit that could answer high-level executive questions a CEO may want to review on a daily basis. This element of speed and self-service is a game changer.
3. Governance matters more. Every single data trend has a governance undercurrent. If your data lives in five different places, chances are there are five conflicting governance policies. LLMs are the topic of the hour and we should lean in, but, as always, in a way that ensures data is protected. LLMs have recently proven they can make both developers and business users more productive through models trained on the internet’s data. A huge opportunity lies ahead for organizations to enhance LLMs with their own data. But the most advanced LLMs are hosted in external services, which creates a risk of exposing proprietary data. Organizations need to begin defining a strategy for bringing LLMs, both open-source and commercial, to the data and not the other way around. To connect data, work where the data lives, and automate at scale, consistent governance is key. Ideally, a single platform with built-in governance capabilities powers classification, role-based access control (RBAC), object tagging, data quality, and observability. In a previous blog post, My 2023 Predictions for Chief Data Officers, I stressed that it’s ill-advised to endure the cost and complexity of using disparate tools. This is the future: it will become standard to have everything in one place.
4. Companies are embracing automation and expect a fully managed platform. A fully managed data platform is necessary not only to access new insights but also to act on them without the lag of human review. Responding to a security event or managing cloud resources with real-time efficiency is standard operating procedure in a modern business. In the case of the latter, we saw a dramatic uptick last year in automated warehouse resize events, a form of automation that helps customers use immensely scalable cloud resources efficiently. At Snowflake, we applied the same principle to software application licenses. We created an automated management tool that revokes an employee’s software license if they haven’t used the application for a set period of time. In the tool’s first year, we saved $5.5M in unnecessary SaaS costs.
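To make the license example concrete, here is a minimal sketch of the idea in Python. It is not Snowflake’s actual tool: the `LicenseRecord` type, the `licenses_to_revoke` function, and the 90-day idle threshold are all hypothetical, chosen only to illustrate flagging licenses that have gone unused for a set period.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class LicenseRecord:
    """Hypothetical record of one employee's license for one application."""
    employee: str
    application: str
    last_used: date


def licenses_to_revoke(records, today, idle_days=90):
    """Return the records whose last use is more than `idle_days` before `today`.

    The 90-day default is an assumption for illustration, not a figure from the report.
    """
    cutoff = today - timedelta(days=idle_days)
    return [record for record in records if record.last_used < cutoff]


# Example data (made up): one idle license and one recently used license.
records = [
    LicenseRecord("ana", "design_tool", date(2023, 1, 1)),
    LicenseRecord("ben", "design_tool", date(2023, 5, 20)),
]
stale = licenses_to_revoke(records, today=date(2023, 6, 1))
```

In a real deployment the usage dates would come from each application’s activity logs, and the flagged records would feed a revocation step rather than simply being returned.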
The broader implications
These trends point to the bigger picture—without data unification, you’re limiting your company, your insights, your potential, and the opportunity for monetization. Every company is looking to realize value from its data, whether by creating an entirely new digital product or whipping up an internal tool over a long weekend. Everyone wants to be in the race toward generative AI but not everyone is equipped with enough data to get quality insights. By accounting for these four trends in your data strategy, you put yourself in the race.
*The data in the Data Trends Report 2023 covers the 12-month period ending Jan. 31, 2023 (referenced throughout as “this year” or “the year”), to align with Snowflake’s 2023 fiscal year. We examined data usage of roughly 7,800 Snowflake customers, some of them longtime Snowflake users, and others only recently joining the Data Cloud. Note that Snowflake’s customer base grew 31% in the 2023 fiscal year, which provides a baseline of comparison for statistics identifying trends that outpaced this overall growth.