Authors: Ripu Jain, Anders Swanson
October 7, 2022

Upgrade to the Modern Analytics Stack: Doing More with Snowpark, dbt, and Python

Many organizations already run Snowflake and dbt, the open-source data transformation workflow maintained by dbt Labs, together in production. Python is the latest frontier in our collaboration. This article describes some of what dbt and Snowpark for Python (in public preview) make possible.

The problem at hand

Building data applications (visualizations, machine learning (ML) apps, internal and external business apps, and monetizable data assets) has traditionally required teams to export data out of their analytical store, either because of a language or tooling preference or because of the limitations of SQL. Users gain access to bespoke tooling, improved productivity, and well-documented design patterns for individual personas and scenarios, but exporting data also creates:

  • More data silos introduced by different tools and processes in the mix
  • Increased maintenance and operating costs because of complicated architecture
  • Increased security risks introduced by data movement

But what if this wasn’t the case? What if there was a way to address the challenges without sacrificing the advantages?

Introducing our players

To solve this problem, we need to look at the key players it affects most:

  • An analytics engineer who occasionally reaches for the Python wrench, for example to use a popular fuzzy string matching library rather than rolling their own implementation in SQL (keep reading, demo below).
  • A data scientist or ML engineer who prefers Python, deploys ML capabilities (featurization, training, scoring), and is expected to have SQL skills in order to access the enriched, transformed, trusted data in Snowflake.

These players haven’t been properly equipped in the past. When our analytics engineer tries to use Python, they have to manage two separate data processes. On the flip side, the data scientist and ML engineer don’t easily collaborate with the analytics engineer, who has already transformed the data in the Data Cloud, so existing work gets duplicated.

But what if I told you that Snowflake and dbt Labs can help with this conundrum, and deliver data products with improved productivity, without the issues we described earlier?

Enter Snowpark!

Snowpark is a data programmability framework for exploring and transforming your organization’s data. It uses Snowflake for data processing, so you keep the benefits that come with it, such as enterprise-grade governance and security, near-zero infrastructure maintenance, and monetization opportunities.

Snowpark for Python recently became available for public preview. It enables a wide range of use cases, especially for data scientists and ML engineers, from feature engineering to training to serving batch inference.

But what exactly comprises Snowpark for Python? In short, three parts:

  • A client-side API that lets users write Spark-like Python code
  • Support for custom Python functions and objects, which can run Python libraries available through the Anaconda integration
  • Stored procedure support, which provides additional capabilities for compute pushdown
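
To make the first of these parts more concrete, here is a minimal sketch of what Spark-like client code can look like. The table name FRUIT_PURCHASES, the column FRUIT_NAME, and the helper top_fruits are hypothetical and chosen only for illustration. The chained calls (table, group_by, count, sort, limit) are Snowpark DataFrame methods; the query is built lazily and runs inside Snowflake only when an action such as collect() is called.

```python
def top_fruits(session, table_name="FRUIT_PURCHASES", limit=3):
    """Build a lazy query for the most frequently purchased fruits.

    `session` is expected to be a snowflake.snowpark.Session. The table
    and column names are assumptions made for this sketch.
    """
    return (
        session.table(table_name)        # reference the table; no data is fetched yet
        .group_by("FRUIT_NAME")          # one group per fruit
        .count()                         # row count per group (column expected to be named COUNT)
        .sort("COUNT", ascending=False)  # most purchased first
        .limit(limit)
    )


# Usage, assuming snowflake-snowpark-python is installed and
# `connection_parameters` is a dict with your account credentials:
#
#   from snowflake.snowpark import Session
#   session = Session.builder.configs(connection_parameters).create()
#   rows = top_fruits(session).collect()  # the query executes in Snowflake here
```

Because the function only describes the query, the data never has to leave Snowflake until the final, small result is collected.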

Organizations across industries are putting their data to use, leveraging Snowpark for Python for data science and ML workloads, and solving a number of unique business use cases.

As useful as Snowpark for Python is on its own, it becomes more useful when partners like dbt Labs build on it so that data teams can unify their pipelines for both analytics and ML use cases. In fact, support for Python models in the DAG was dbt Core’s most requested feature.

Enter dbt Python models

As dbt Labs CEO Tristan Handy notes in his recent post, Polyglot Pipelines: Why dbt and Python Were Always Meant to Be: 

[In July 2017] I wrote ‘we’re excited to support languages beyond SQL once they meet the same bar for user experience that SQL provides today.’ And over the past five years, that’s happened.

The dbt-labs/new-python-wrench-demo repo illustrates that Python has arrived with a great user experience. Its made-up data, from a fictional “fruit purchasing” app, shows a case where fuzzy string matching is useful to an analytics engineer. Below are two video walkthroughs of the background, business problem, and code. If you already have a Snowflake account and a dbt project, you can also run this code today. Be sure to open an issue on the repo if you run into trouble.

  • Python wrench I: Intro & background
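
As a rough idea of what a dbt Python model for this kind of problem can look like, here is a minimal sketch. It is not the code from the demo repo: it uses difflib from the Python standard library as a stand-in for a dedicated fuzzy matching package, and the model name stg_fruit_purchases, the column names, and the KNOWN_FRUITS list are all assumptions made for illustration.

```python
import difflib

# Hypothetical canonical fruit names; the demo's real data is not reproduced here.
KNOWN_FRUITS = ["apple", "banana", "strawberry", "blueberry"]


def closest_fruit(raw_name, choices=KNOWN_FRUITS, cutoff=0.6):
    """Return the best fuzzy match for raw_name, or None if nothing is close enough."""
    cleaned = raw_name.strip().lower()
    matches = difflib.get_close_matches(cleaned, choices, n=1, cutoff=cutoff)
    return matches[0] if matches else None


def model(dbt, session):
    # dbt calls model(dbt, session) and materializes the DataFrame it returns.
    dbt.config(materialized="table")

    # "stg_fruit_purchases" and both column names are assumed for this sketch.
    purchases = dbt.ref("stg_fruit_purchases").to_pandas()

    # Map each free-text entry to its closest known fruit name.
    purchases["FRUIT_NAME_CLEAN"] = purchases["FRUIT_NAME_RAW"].apply(closest_fruit)
    return purchases
```

In a dbt project, a file like this would sit in the models directory alongside the SQL models, and dbt would build it in DAG order like any other model, so downstream SQL models can reference its output.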

Taking the next step

If you’re interested in diving deeper into how to get the most out of dbt and Snowpark, you won’t want to miss dbt’s Coalesce Conference 2022, starting October 17 in New Orleans (and virtually). Expect to see talks such as the one in which Eda and Venkatesh, Snowflake Partner Sales Engineers, explore how Snowpark further enhances the dbt + Snowflake development experience by supporting new workloads.

If you’d prefer to sink your teeth into something immediately, the “Getting Started with Snowpark Python” hands-on guide and Eda’s blog post taking a first look at dbt Python models on Snowpark are fantastic resources.

All of this just scratches the surface of the value created by dbt and Snowpark with Python. Where does it ultimately lead? Toward a future with fewer silos between the people working on analytics workflows and the people working on data science workflows, and we couldn’t be more excited for it.
