Spark to Snowpark for Data Engineering
Migrate Spark pipelines with minimal code changes and reduce operational overhead with an elastic processing engine that natively supports Python, Java, Scala, and SQL.
Code Like PySpark.
Execute Faster.
Develop data transformations and custom business logic from your integrated development environment or notebook of choice using the Snowpark Client Library. Push down operations to Snowflake's engine for elastic, performant, and governed processing.
Snowpark Overview
DataFrame API
Build queries using Spark-like DataFrames from your integrated development environment of choice and push down processing to Snowflake’s elastic processing engine.
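As a rough illustration of the DataFrame style described above, here is a minimal sketch using the Snowpark Python client. The table name `ORDERS`, its columns, and the `connection_parameters` dictionary are hypothetical placeholders, not names from this page. The DataFrame is built lazily on the client, and the work is expected to run inside Snowflake only when an action such as `show()` is called.

```python
def top_customers(session, limit=10):
    """Build a lazy Snowpark DataFrame; no query runs until an action is called."""
    # Imported lazily so this module loads even where Snowpark is not installed.
    from snowflake.snowpark.functions import col, sum as sum_

    return (
        session.table("ORDERS")                      # hypothetical table name
        .filter(col("STATUS") == "SHIPPED")          # hypothetical columns
        .group_by("CUSTOMER_ID")
        .agg(sum_(col("AMOUNT")).alias("TOTAL_AMOUNT"))
        .sort(col("TOTAL_AMOUNT").desc())
        .limit(limit)
    )


def main(connection_parameters):
    """Open a session, run the query in Snowflake, and print the result."""
    from snowflake.snowpark import Session

    # connection_parameters is an assumed dict (account, user, role, warehouse, ...).
    session = Session.builder.configs(connection_parameters).create()
    try:
        top_customers(session).show()  # action: triggers execution in Snowflake
    finally:
        session.close()
```

The method chain is intentionally close to what a PySpark user would write (`filter`, `agg`, `limit`), with `group_by` and `sort` as the Snowpark spellings used here.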
User Defined Functions
Write custom logic in Python or Java and run it directly in Snowflake as User Defined Functions (UDFs), which can be migrated from Spark with minimal code changes.
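To make the UDF idea concrete, the sketch below registers an ordinary Python function as a Snowpark UDF. The function `normalize_sku` and the UDF name `NORMALIZE_SKU` are hypothetical examples. The business logic stays plain Python, which is roughly why logic written for a Spark UDF can often be carried over with few changes.

```python
from typing import Optional


def normalize_sku(raw: Optional[str]) -> Optional[str]:
    """Plain Python logic: trim, upper-case, and drop dashes from a SKU."""
    if raw is None:
        return None
    return raw.strip().upper().replace("-", "")


def register_normalize_sku(session):
    """Register normalize_sku as a session-scoped (temporary) Snowpark UDF."""
    # Imported lazily so the pure-Python function above can be tested locally.
    from snowflake.snowpark.types import StringType

    return session.udf.register(
        normalize_sku,
        return_type=StringType(),
        input_types=[StringType()],
        name="NORMALIZE_SKU",   # hypothetical UDF name
        replace=True,
    )
```

Once registered, the returned object can be applied to DataFrame columns, and the same function can be called by name from a SQL statement in that session, for example `SELECT NORMALIZE_SKU(sku) FROM products` (table and column assumed).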
Stored Procedures
Operationalize and orchestrate your pipelines and custom logic directly inside Snowflake and make them accessible to your SQL users.
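One way this could look in practice is a small pipeline step registered as a Python stored procedure. In the sketch below, the procedure name `REFRESH_DAILY_TOTALS` and the `ORDER_DATE` and `AMOUNT` columns are hypothetical. The handler receives a Snowpark session as its first argument and uses the same DataFrame API inside Snowflake.

```python
def refresh_daily_totals(session, source_table: str, target_table: str) -> str:
    """Handler: aggregate the source table by day and overwrite the target table."""
    from snowflake.snowpark.functions import col, sum as sum_

    totals = (
        session.table(source_table)
        .group_by("ORDER_DATE")                          # hypothetical column
        .agg(sum_(col("AMOUNT")).alias("TOTAL_AMOUNT"))  # hypothetical column
    )
    totals.write.mode("overwrite").save_as_table(target_table)
    return f"Refreshed {target_table} from {source_table}"


def register_refresh(session):
    """Register the handler as a session-scoped (temporary) stored procedure."""
    from snowflake.snowpark.types import StringType

    return session.sproc.register(
        refresh_daily_totals,
        return_type=StringType(),
        input_types=[StringType(), StringType()],
        name="REFRESH_DAILY_TOTALS",            # hypothetical procedure name
        replace=True,
        packages=["snowflake-snowpark-python"],
    )
```

After registration, the procedure can be invoked from SQL with `CALL REFRESH_DAILY_TOTALS('ORDERS', 'DAILY_TOTALS')` or from Python with `session.call(...)`. This sketch registers a temporary procedure; sharing it with other SQL users would typically mean registering it as permanent (`is_permanent=True` with a `stage_location`).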
Snowflake Platform for Multiple Languages
Snowflake’s unique multi-cluster shared data architecture powers the performance, elasticity, and governance of Snowpark.
Hear from Snowpark Developers
Customers are using familiar programming languages in Snowpark to build scalable and governed data pipelines.
LANGUAGE OF CHOICE
“Snowpark enables us to accelerate development while reducing costs associated with data movement and running separate environments for SQL and Python.”
—Head of Data Engineering and ML, HyperFinity
STREAMLINED ARCHITECTURE
“UDFs bring simplicity, because a lot of processing that was previously in Spark is now able to be coded to a UDF and can be easily made accessible for execution as part of a SQL statement.”
—Sr. Director Clinical Data Analytics, IQVIA
ELASTIC SCALABILITY
“With our previous Spark-based platforms, there came a point where it would be difficult to scale, and we were missing our load SLAs. With Snowflake, the split between compute and storage makes it much easier. We haven’t missed an SLA since migrating.”
—Senior Manager of Data Platforms, EDF
Get Started