I’m on a journey to help make machine learning (ML) and artificial intelligence (AI) more accessible to everyone.
I became fascinated by AI more than 20 years ago when I implemented strategies for playing games such as Hex and Qubic (3D tic-tac-toe) in some of my undergraduate computer science courses. I dabbled in AI planning during my first year of grad school, and later focused on ML after seeing ML in practice during my internship at Google in 2003. I then joined Google in 2004 and spent the next 17 years developing ML technologies that reached all of Alphabet and the rest of the world through open source software and cloud offerings.
At Google, I had the opportunity to work with many of the world’s best technologists spanning ML and large-scale systems. I researched and implemented new ML algorithms with Yoram Singer, a world famous ML researcher. I developed an appreciation for Explainable AI and the challenges of incorporating domain knowledge into ML while working on ranking in Google Search. I built some of the largest scale and most comprehensive end-to-end ML systems with Tushar Chandra, a leader in distributed systems. I worked with leaders across Ads, Search, YouTube, Cloud, Android, Waymo, and other parts of Alphabet in developing and deploying ML-based solutions.
With that said, ML is my second province. My first calling is chess.
A coincidental introduction to the game in 1985, at the age of 7, resulted in an international career that culminated in becoming a Grandmaster and winning the 1997 World Junior Chess Championship in Zagan, Poland. Since my chess accomplishments preceded my career in technology, people often try to find a relationship between the two that uses my past experiences in chess to explain my progression as an engineer.
However, I believe that’s a case of mistaking correlation for causation, which also happens to be a challenge with many ML algorithms. Because I’m wired to be a deep thinker who is always looking to improve and optimize for the longer term, I happen to be a good chess player. I like to understand systems, embrace complexity, and invest a lot of time thinking through difficult decisions to maximize value. In chess, it’s about deciding what moves to make during a game, combined with hard work and introspection to evolve myself into the best player that I can be. In technology, it’s about designing systems, prioritizing what to build next, and organizing projects and teams to maximize execution and impact.
Strategic, rational thinking also explains my recent move from Google to Snowflake. I see this transition as a logical next step that enables me to think deeply and act purposefully around ML, and influence what ML looks like in the future, for everyone.
Why I Made the Move to Snowflake
Up until 2015, I focused on developing ML technologies to solve problems within Google. Around that time, the people at Google Cloud started thinking about what to do in the ML space. I began working with Greg Czajkowski at Google, who is now Snowflake’s SVP of Engineering, to explore how to apply our learnings to Google Cloud. I was particularly excited about the opportunity to help design new ML services and capabilities to make ML accessible to many more people.
After working with Google Cloud for a couple of years, I saw history repeat itself, this time in the cloud and enterprise settings. It took me 10+ years to learn that although ML can produce magical experiences, ML by itself is not magic. Many companies working in the cloud and enterprise space were overly focused on treating ML as a black box, searching for a silver bullet that would enable them to create an AI technology that would leapfrog ahead of all the competition.
In reality, ML is hard to use. It often takes many people and many teams to build a single ML-based solution. The resulting solutions are usually fragile because of the unpredictable behavior of ML systems. The mistakes that ML systems make typically look different from mistakes that a human would make, and therefore such errors are difficult to explain, debug, and improve. Although we’ve made progress in all of these areas, I believe that most of the opportunity with ML still lies ahead of us.
Over the last year I wondered what I could do to have the largest impact. After looking at close to 10 different opportunities, I was most excited by the potential I saw at Snowflake. Here’s why.
Data Gravity and Machine Learning
Snowflake is extremely well-positioned to transform ML and AI for one simple reason: data is the most important part of any ML system. We all know that better data leads to better models, but using data to train models covers only a small fraction of the role that data plays in an end-to-end ML system. For example, ML systems rely on streams of data for continuous training and real-time inference. They produce many versions of interrelated artifacts such as models and predictions that participate in complex business logic and consumer-facing products, all of which consume and produce large amounts of data. Providing a great end-to-end ML experience requires a holistic approach to organizing and processing all of this data.
“Data gravity” is the observation that a growing mass of data attracts services and applications, rather than vice versa, since it is harder to move large amounts of data. As we all know, we’re generating more data now than ever before. Therefore, data will accumulate even more “mass” over time resulting in more gravitational pull. Since data is the most important part of any ML system, and data gravity means that services and applications will move closer to the data, it’s clear Snowflake has an amazing opportunity to innovate and shape how many companies will leverage ML in the future.
Data Management and Data Sharing
Before joining Snowflake, I spent two years in Google Ads where I was responsible for many ML efforts focused on optimizing long-term value for users, advertisers, and Google. This was an incredibly complex space where more data invariably led to better results. The simplest cases involved one part of my team sharing data with another part of my team. More complex cases involved sharing data across different product areas such as Search, YouTube, and Ads in order to better understand users’ short-term behaviors and long-term interests. Some of the most complex cases involved sharing data across Google and other companies in order to optimize for conversions and understand users even better. Sharing data across teams, organizations, and companies enabled many ML systems to generate significantly more value for the overall Ads ecosystem.
However, making data sharing easy and doing it in a way that followed all of the companies’ policies and privacy regulations required sophisticated techniques for data governance and data processing. A large amount of resources in Ads went towards working through these complications, but that only addressed a part of the challenge since partner companies needed their own solutions as well. Previously, I had taken data management systems for granted. Then, my experience in Ads and working with partner companies helped me understand how much ML potential was blocked by the difficulty of managing and sharing data at scale. When I first heard about Snowflake in 2019, I filed it in the back of my mind as another data management company. When I took a closer look in 2021, read about Snowflake’s revolutionary data sharing technology, and learned how quickly the Data Cloud was growing, I put everything together and realized that there is a huge and unique opportunity to build ML capabilities and services on top of the Data Cloud.
A Tech-First and Customer-First Culture
At its core, Snowflake is a technology company. The Snowflake culture reflects the tech-first and customer-first attitude of its founders. They took a really hard technology problem and spent a lot of time architecting an innovative solution, which led to the world’s most scalable multi- and cross-cloud data platform. This laid the foundation for enabling the Data Cloud, where anyone can share data and build services on top of that data for everyone else to use.
Before joining Snowflake, I read “The Rocket Behind Snowflake’s Rocketship,” which talks about Snowflake’s engineering culture. So far, everything that I read in that article is consistent with what I’ve observed in my first few months. The founders are actively engaged in all of the most important technical and product questions, and there is a strong focus on product and engineering excellence. Rather than letting a sense of urgency lead to shortcuts and technical debt, teams are deliberate and take the time to have healthy debates before they converge, commit, and work towards building the best features and solutions. People are open-minded, transparent, and willing to incorporate new data and change their opinions, which suits my personal style.
In addition, there’s a strong focus on the mission, which is typical in smaller companies and remains here today, despite Snowflake’s rapid growth. There’s a palpable energy that we are building something revolutionary and changing the world. We celebrate the deals that we win, and when we lose we reflect on what we should do better in the future. Sometimes it feels as if we’re one big sports team, working our hardest to innovate as quickly as possible while many other companies rush to build new technologies in the rapidly growing cloud.
Snowflake may not have a long history in ML, but I like going from zero to one. I experienced that at Google and I’m looking forward to doing so again—this time at Snowflake, with a whole new set of challenges and opportunities.