Earlier this year at Snowflake Summit 2021, we announced Snowpark Accelerated, a new program for partners who integrate with Snowpark. It provides them with access to technical experts and additional exposure to Snowflake customers. It’s been incredibly exciting to watch what our partners have been building with the help of our new developer experience, which brings deeply integrated, DataFrame-style programming to the languages developers like to use.
With the Snowpark API and Java user-defined functions (UDFs), both currently in public preview, Snowflake and our partners can now enable data engineers, data scientists, and developers to take advantage of Snowflake’s powerful platform capabilities and the benefits of Snowflake’s Data Cloud using their development languages and frameworks of choice.
As of August 1st, we have more than 50 partners enrolled in Snowpark Accelerated. Below, we highlight a handful of them to showcase some of the new innovations and capabilities they have built. These include scaling model scoring more quickly, performing complex machine learning tasks more easily, and meeting data privacy and security requirements more efficiently, all directly inside Snowflake.
Here are some of the amazing ways our Snowpark Accelerated partners have been democratizing data in the areas of data science, data engineering, and data governance and security:
DATA SCIENCE
DataRobot, Dataiku, and H2O are leveraging Snowpark to scale model scoring without the need to set up and manage a separate Spark cluster. With Snowpark, it can all happen in Snowflake.
All DataRobot non-time series models that can produce scoring code will now be available for deployment directly inside of a Snowflake Java UDF. This deployment option provides enhanced scoring speed beyond traditional model-scoring methods that previously ran outside of Snowflake.
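To give a feel for what scoring inside a Java UDF can look like, here is a minimal sketch. It is not DataRobot's generated scoring code: the class name, method name, input features, and coefficients are all hypothetical, and the model is a plain logistic regression chosen only for illustration. In Snowflake, a class like this would be registered with a `CREATE FUNCTION ... LANGUAGE JAVA` statement that names the class and method as the handler.

```java
// Hypothetical handler class for a Snowflake Java UDF.
// The features and coefficients below are made up for illustration
// and do not come from any real model or partner product.
class LoanDefaultScorer {
    private static final double INTERCEPT = -1.0;
    private static final double W_DEBT_TO_INCOME = 2.0;
    private static final double W_LATE_PAYMENTS = 0.5;

    /** Returns a default probability in (0, 1) from a logistic regression. */
    public static double score(double debtToIncome, double latePayments) {
        double z = INTERCEPT
                + W_DEBT_TO_INCOME * debtToIncome
                + W_LATE_PAYMENTS * latePayments;
        return 1.0 / (1.0 + Math.exp(-z));
    }

    // Local smoke test so the class can be run outside Snowflake.
    public static void main(String[] args) {
        System.out.println(score(0.5, 0.0)); // z = 0, so this prints 0.5
    }
}
```

Because the handler is an ordinary static method with primitive inputs and outputs, it can be unit tested locally before it is deployed next to the data.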
Before Snowpark and Java UDFs, several data preparation functions and predictive model scoring had to happen in Dataiku or other engines because they could not be expressed in SQL. This meant data movement in and out of Snowflake was required, impacting performance. Now, customers can take full advantage of the performance and governance benefits of Snowflake by fully operating their Dataiku pipelines in Snowflake.
Snowpark makes the data in Snowflake available as a DataFrame through Scala and Java code that executes within the Snowflake environment. Data scientists can use H2O.ai tools and models with the power and scaling of the Snowflake platform. In one use case, H2O set out to predict loan defaults with LendingClub data and combined its data set with a demographic data set from Snowflake Data Marketplace to improve its model’s accuracy and score at scale within the Snowflake environment. Once H2O had trained its model on its own data and the third-party data from Snowflake Data Marketplace, it was able to import the model to where the data lives for an easier path to deployment.
DATA ENGINEERING
Rivery, LTI, and phData use Snowpark to help their customers unlock new data insights, provide a better customer experience, and perform machine learning tasks with lower costs and complexity.
Using Snowpark, customers can unlock new dimensions of their data by running SQL, Scala, and Java UDFs on Snowflake—all in parallel, directly inside Rivery. This gives customers control over their data and empowers them to extract more valuable insights. Rivery’s Snowpark integration also simplifies data operations by centralizing data workflows and eliminating superfluous data systems. Customers can also embed Snowpark’s functionality in automated data workflows. In one use case, a customer analyzed its customer experience data and scheduled alerts and notifications to be sent to customer service teams in Slack to take action on issues.
With Snowpark and Java UDFs, LTI Mosaic can simplify and amplify DataOps (the application of DevOps principles to data), MLOps (machine learning operations), and ModelOps (a holistic method for rapidly and iteratively moving models through the analytics lifecycle) capabilities for Snowflake. By leveraging Snowpark and Java UDFs, Mosaic customers can deploy machine learning models, push model inference, and write code that can be pushed down to Snowflake for processing.
phData customers want to perform machine learning tasks such as NLP or image recognition on their Snowflake data. Traditionally this has been a complex process that couldn’t be done with SQL alone. Using Snowpark and Java UDFs, customers can now manage this complex pipeline all in Snowflake, reducing cost and complexity significantly.
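As a rough illustration of the kind of text processing that is awkward in SQL but simple in a Java UDF, here is a small sketch. It is not phData's pipeline: the class name, method name, and keyword list are hypothetical, and counting keywords stands in for a real NLP model.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical handler class for a Snowflake Java UDF that flags
// complaint-like text. The keyword list is made up for illustration.
class ComplaintTagger {
    private static final Set<String> KEYWORDS =
            new HashSet<>(Arrays.asList("refund", "broken", "late"));

    /** Counts how many tokens in the text appear in the keyword list. */
    public static int countKeywords(String text) {
        if (text == null || text.isEmpty()) {
            return 0;
        }
        int hits = 0;
        // Lowercase the text, then split on anything that is not a letter a-z.
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
            if (KEYWORDS.contains(token)) {
                hits++;
            }
        }
        return hits;
    }

    // Local smoke test so the class can be run outside Snowflake.
    public static void main(String[] args) {
        System.out.println(countKeywords("My order arrived LATE and broken!")); // prints 2
    }
}
```

Once registered as a UDF, a function like this could be called from a query against a text column, so the rows never have to leave Snowflake to be processed.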
DATA GOVERNANCE AND SECURITY
With Snowflake, Talend and Protegrity help customers meet data quality, privacy, and security requirements more easily and efficiently.
Snowflake and Talend customers can now use Talend Trust Assessor to perform an instant health check on all their Snowflake data with a simple click. With Snowpark, the health check happens directly within Snowflake—no sampling, no data moving, and in compliance with privacy and sovereignty requirements.
Tarik Dwiek is Head of Technology Alliances at Snowflake.