Modern Data Infrastructure is Like Good Parenting: Freedom to Explore with Guidelines

“In the future we will be designing our world, not just studying,” predicted Robert Dijkgraaf, a theoretical physicist at the Institute for Advanced Study at Princeton, a few weeks ago at DLD Summer in Munich.1 Information gathered from our current world will fuel those designs. Artificial intelligence will guide decisions and help us navigate this new world. The theme of the event—“It’s Only the Beginning…”—reflected the journey to realizing these predictions.

The panel “Doing Better with Data and AI” addressed new frontiers for using AI to augment human intelligence and the modern data infrastructure required to make it happen.2 Zayd Enam, CEO and Founder of Cresta, argued that the current attitude toward artificial intelligence is “lazy.”3 The common trope is that AI is here to take our jobs by automating them. That’s too easy, and he countered that lazy view with a more creative one. Humans are creative. We may not be the fastest or the strongest species, but we build tools to compensate. The spear, the wheel, the telegraph, the phone, the computer: these have all been building blocks to unlock productivity. We don’t question our ability to pick up the phone and call family, friends, or colleagues on the other side of the world, or to hop on a bicycle or into a car to get somewhere faster. Steve Jobs once said, “The computer is the bicycle of the mind.”4 It accelerates a journey. So Enam questioned why we view AI so negatively. Like these other tools, AI is a building block to unlock new possibilities.

Cresta has built an expertise engine whose first use case is customer service. The model learns from transcripts of customer engagements and uses what it learns to improve future engagements. The tool helps customer service reps understand what makes them effective by learning what separates a positive outcome from a negative one. Cresta is not about having a conversation with a chatbot but rather about having a more intelligent conversation with a real person. Cresta improves human interaction rather than replacing it.

The conversation then shifted to how to implement these new tools. AI expands the realm of the possible, and it pushes companies to test these new tools across different contexts. Training these new models requires vast amounts of data from a wide range of sources. In this fast-paced world of experimentation and innovation, what kind of data infrastructure does it take to support these new initiatives?

A Modern Data Infrastructure Enables Experimentation

Modern data infrastructure and the best practices for deploying it are a lot like parenting, I argue. It’s about enabling freedom to explore and learn within guidelines that prevent catastrophic outcomes, providing a safe environment to explore the art of the possible. Snowflake offers that yin and yang in a modern data infrastructure.

The Snowflake Data Cloud enables experimentation and innovation; its multi-cluster, shared data architecture provides high performance and elasticity. What does that really mean? Well, the performance side of things is like teaching kids to pat their heads and rub their stomachs: doing things at the same time, and doing them as fast as possible. OK, maybe it’s about more than just that trivial exercise. It’s about juggling work and life, learning how to get it all done but still enjoy it. That was an important lesson that I learned and that I hope I passed on. In the data world, that means running a lot of queries and analytics workloads while also ingesting and processing data. But it also means experimenting and learning to improve a skill without fearing failure: getting back on the bike when you fall off, for example. Again, in the data and analytics world, that means testing new models quickly to see if they work. Having a high-performance data infrastructure helps.

On the elasticity front, it means being there when needed, which is also an aspect of parenting. During the recent pandemic, we’ve seen companies have to scale up or down. Think of Instacart or other online grocery services that saw a sharp uptick in demand, or airlines and airports that have contracted with the collapse of travel. A modern data infrastructure is cloud-based and allows that kind of flexibility. It’s there when you need it but can go away when you don’t. And Snowflake’s consumption-based billing model means that you don’t pay for what you’re not using.

The Data Cloud facilitates data sharing, breaking down data silos whether they are internal across business units and functional departments or external across partners and customers. Snowflake Data Marketplace facilitates access to hundreds of data sets from third-party providers of consumer data, firmographic data, economic indicators, weather, location, or health information and other “alternative” data to help navigate complex and dynamic business environments. Is this like parenting? You bet! This is the fun stuff. The sleepovers. The family reunions. Getting friends and family together.

Data Governance Provides the Guidelines

But we all grew up with rules. We lived in governed worlds. In my house, we had tags like “inside voices” and “outside voices.” And we knew what we could and couldn’t do. Sometimes it was explicitly defined as a “no-no.” But as we matured, the rules were more nuanced—from the early rules of “Come home when the streetlights go on” in the summer months to “Be home by midnight” as an early teen. These rules gave us freedom to explore but with guidelines we needed to keep us out of trouble. 

The Data Cloud also provides a governed environment. Data governance establishes the guidelines that ensure we don’t get into trouble while exploring, sharing, and collaborating with data. Data governance isn’t just about security and privacy. Yes, it is about protecting data, but it’s also about knowing data—think of the “inside voices”—and about unlocking data—for example, eventually getting the keys to the car. 

Data governance also requires education. Organizations must ensure that people recognize data, understand the value it brings to the organization, and accept responsibility for their role—whether that role is capturing data, protecting it, or using it. This type of training or data literacy is critical to creating a data-driven culture, and it should be an element of employee onboarding.  

In our fast-moving world of AI innovation, we need both the yin and yang of this modern data infrastructure. We are only just beginning to understand the power of data and AI, and we need to be able to explore. However, as the headlines often remind us, there are risks out there and we need some guidelines, some rules, to follow. Today’s data governance is tomorrow’s AI ethics—or is it already tomorrow? I guess that’s what happens when we stay out past midnight.