Empowering Data Engineering Today for Tomorrow’s Challenges

Data engineering has never been more vital than it is today. Given the progress in AI, advanced analytics and data-driven applications, data engineers have become indispensable to their organizations as they get ready to harness these technologies. These previously unsung heroes are now finding themselves in the spotlight, building the mission-critical data pipelines that will set their organizations up for future success. However, this opportunity doesn't come without its challenges. Fragmented tech stacks, performance bottlenecks and the high price of specialized talent have become the norm — hindering innovation, driving up hidden costs and stalling progress.

At Snowflake, we believe there's a better way. We are committed to empowering data engineers with the tools and platform to navigate the complexities of the modern data landscape — whether it’s streamlining the data pipeline creation process or unifying unstructured and structured data within the same infrastructure. We want our customers to feel confident in their ability to lead the charge, thanks to innovations that streamline processes, foster collaboration and unlock data’s true potential. That means reducing the time spent on tedious tuning and the mundane maintenance work involved with outdated data engineering systems; instead, data engineers can work with freedom to discover new use cases and explore the uncharted territory ahead. 

Our vision for the future of data engineering is one that simplifies the complex, democratizes insights and is more connected than ever before. Now, your data does more — for you.

Today, we are thrilled to announce a slate of new capabilities and product updates that are built for that future. In this blog post, we’ll go into detail about these features and the value they deliver, as you build efficient data pipelines, accelerate your open lakehouse initiatives and integrate AI and unstructured data into your workflows with surprising ease. With Snowflake, you can focus on delivering value and driving innovation, leaving the complexities of data infrastructure behind. 

Announcing Snowflake's latest innovations for data engineering

Openflow: Revolutionizing data movement

Snowflake Openflow is an open, extensible, managed multimodal data integration service that makes data movement effortless between data sources and destinations. Supporting all data types — including structured and unstructured, batch and streaming — Openflow revolutionizes data movement directly within Snowflake, key for enabling seamless extract, transform, load (ETL) processing for AI. All data integration is unified in one platform, with limitless extensibility and interoperability to connect to any data source. Facilitating any data architecture, Openflow allows companies to confidently scale their integration needs with enterprise-grade reliability and governance. Hundreds of ready-to-use connectors and processors simplify and rapidly accelerate data integration from a broad range of data sources, including connectors from strategic partnerships. For example, Snowflake is partnering with Oracle on a high-performance, scalable and cost-effective solution for replicating Change Data Capture (CDC) from Oracle databases to Snowflake. 

[Figure: Snowflake Openflow diagram]

With Snowflake Openflow, you can:

  • Free up data movement with any connector your business demands.

  • Unlock ETL pipelines to empower AI agents to make decisions at machine speed.

  • Build scalable, enterprise-ready integration with flexible deployment, data observability and governance.

dbt Projects on Snowflake

As the core building blocks of any effective data strategy, transformations are crucial for constructing robust and scalable data pipelines. Today, we're eager to announce another exciting product advancement to build and orchestrate data pipelines: dbt Projects on Snowflake (coming soon to public preview). 

A favorite among data teams, dbt brings software engineering best practices and improved efficiency to SQL and Snowpark data transformation workflows right in Snowflake. With this new native option, data teams can now build, run and monitor dbt Projects directly in the Snowsight UI, reducing context switching, simplifying setup and accelerating the development lifecycle of data pipelines.

With dbt Projects support, you can:

  • Enable new teams to build and deploy pipelines with uniform governance: Accelerate onboarding and empower new teams to create the pipelines they need through an intuitive interface with a uniform governance and security model for data and pipelines.

  • Consolidate systems to reduce administration and improve debugging: Run dbt natively on Snowflake and streamline the pipeline development lifecycle to improve developer productivity and quickly spot and address problems.

These capabilities are just the start; more exciting updates to further streamline and enhance your workflows are coming soon.

SQL and Python pipeline enhancements

With recent enhancements to some of our most popular features, we are simplifying complex workflows throughout the data engineering landscape — impacting everything from collaborative SQL workflows to intricate Python pipelines. These improvements aim to streamline processes and boost efficiency for data engineers working across diverse tools and technologies.

Dynamic Tables offers a declarative framework for both batch and streaming pipelines, simplifying setup with automatic orchestration and continuous processing. Notable updates include full support for Apache Iceberg, reduced latency for near real-time pipelines (around 15 seconds, in private preview) and performance enhancements for various SQL operations. Furthermore, new SQL extensions (generally available soon) provide greater control over pipeline semantics by preventing updates or deletions and enabling data backfilling.
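The declarative setup described above can be sketched in SQL. This is an illustrative fragment, not a production pipeline: the table, warehouse and column names are hypothetical, and the 15-second target lag reflects the near real-time latency currently in private preview.

```sql
-- Illustrative sketch only: daily_revenue, transform_wh and raw_orders
-- are hypothetical names. Snowflake automatically orchestrates refreshes
-- to keep the result within the declared target lag.
CREATE OR REPLACE DYNAMIC TABLE daily_revenue
  TARGET_LAG = '15 seconds'
  WAREHOUSE = transform_wh
  AS
    SELECT order_date, SUM(amount) AS revenue
    FROM raw_orders
    GROUP BY order_date;
```

Note that there is no imperative scheduling here: you declare the query and the freshness target, and the platform handles orchestration and incremental processing.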

Another new update, pandas on Snowflake with hybrid execution (private preview), supports pandas pipelines across all data scales. This feature intelligently executes queries either by pushing down to Snowflake for large data sets or locally with standard pandas for smaller ones, enabling consistent and efficient performance throughout the development lifecycle.
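To make the routing idea concrete, here is a minimal sketch in plain Python of how a size-based hybrid executor can choose a backend. Everything in it — the threshold, the function names and the stand-in backends — is an assumption for illustration, not the actual pandas on Snowflake implementation.

```python
# Conceptual sketch of hybrid execution routing (illustrative only).
# pandas on Snowflake decides per query whether to push work down to
# Snowflake or run it locally with standard pandas; here that decision
# is modeled with a hypothetical row-count cutoff.

LOCAL_ROW_LIMIT = 100_000  # hypothetical cutoff for local execution

def run_local(rows):
    """Stand-in for executing with standard pandas in-process."""
    return ("local", sum(r["amount"] for r in rows))

def run_pushdown(rows):
    """Stand-in for pushing the query down to Snowflake's engine."""
    return ("snowflake", sum(r["amount"] for r in rows))

def hybrid_sum(rows):
    """Route to a backend based on data size, as hybrid execution does."""
    backend = run_local if len(rows) <= LOCAL_ROW_LIMIT else run_pushdown
    return backend(rows)

small = [{"amount": i} for i in range(10)]
print(hybrid_sum(small))  # → ('local', 45): small data stays in-process
```

The design point is that the pipeline author writes one pandas-style program; the runtime, not the developer, picks the execution engine per query.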

Open lakehouse

Break free from data fragmentation and accelerate your time to insight and AI with Snowflake's comprehensive Apache Iceberg™ table support. Now, data engineers can fundamentally redefine how they build open and connected lakehouses. By automatically centralizing and activating nearly your entire Iceberg ecosystem on a single pane of glass, we simplify your data lifecycle, eliminating the need for complicated processes for data discovery and access. Continued support for transforming Parquet files into Iceberg tables, and newly available optimizations for file size and partitions, ensures your open lakehouse incorporates more of your data while increasing performance.

With Snowflake’s growing lakehouse capabilities, you can:

  • Discover and activate data from nearly anywhere: Build a single connected view of your open lakehouse by connecting Snowflake Open Catalog, or any other Iceberg REST-compatible catalog, to Snowflake with Catalog Linked Databases — coming to public preview soon. Automatically discover, refresh and activate the underlying tables with the Snowflake AI Data Cloud’s unified compute and price-performant engine.

  • Transform with unified governance and achieve optimal performance: With recently launched support for write on externally managed Iceberg tables, in public preview soon, perform seamless data transformation across your Iceberg ecosystem within Snowflake, and enjoy built-in comprehensive governance and security powered by Snowflake Horizon Catalog. Take control of performance with Snowflake by defining file sizes and partitions on nearly any Iceberg table, coming to public preview soon. Leverage Table Optimizations (public preview soon) to automate garbage collection, compaction and more. Spend less time managing infrastructure and more time delivering impact.

  • Build declarative pipelines for Iceberg tables: Simplify your pipelines with Dynamic Iceberg tables, a fully managed orchestration solution that continuously and incrementally transforms your data stored in Iceberg tables while maintaining full interoperability. Support for Snowflake-managed Iceberg tables is now generally available, with support for externally managed Iceberg tables becoming generally available soon.

  • Deliver advanced analytics on more data: Unleash the value of your semi-structured data with VARIANT support, now in Iceberg tables. Seamlessly integrate geospatial and geometry data types, in private preview soon, to unlock deeper, location-aware insights. Support for Merge on Read, now in private preview, means you can now activate more of your Iceberg ecosystem within Snowflake.

  • Access Delta Lake data as Iceberg tables without data migration: Bring more of your data into your open and connected lakehouse by converting Delta table metadata into Iceberg tables, without ingesting or moving the underlying Parquet files.

Enhanced integration and enterprise-grade security for Open Catalog

Unlock the full potential of your Iceberg tables in Snowflake and enjoy comprehensive security and governance. By automatically syncing your Open Catalog-managed Iceberg tables, you gain unified read/write access and consistent, integrated governance powered by Horizon Catalog — all within the Snowflake environment. Enjoy clear separation of governance, with Horizon managing Snowflake queries and Open Catalog handling external multiengine access, eliminating ambiguity in your security posture. Plus, enterprise-grade security features are now available in Open Catalog, providing secure user access and private data connections for the leading interoperable and vendor-neutral catalog.

Thanks to Snowflake’s Open Catalog enhancements, users can:

  • Unlock secure enterprise-grade user access: Enjoy seamless UI access through single sign-on (SSO) with SAML 2.0, and enable a secure programmatic integration across engines and services through either OAuth with your preferred identity provider or Snowflake's native key-pair authentication solution.

  • Activate bidirectional private connectivity for metadata access: Leverage Private Link, a unified security framework, to establish protected connections among your data, Snowflake, engines, tools and Snowflake Open Catalog, helping to ensure your data remains private and compliant throughout its lifecycle. 

  • Access your entire Iceberg ecosystem: Unlock seamless access to virtually all of your Iceberg tables with Catalog Federation in Apache Polaris (incubating). By creating a single view of all linked catalogs, Federation streamlines data discovery and enables activation on any engine that supports Iceberg REST catalog integrations. Federation is coming to Open Catalog soon in private preview.

  • Simplify your Delta table management: Centralize both Iceberg and Delta tables on Snowflake Open Catalog. Create, update, delete and govern access across Delta and Iceberg tables from a single pane of glass. Automatically discover both formats within Snowflake, allowing you to query Delta tables with unified visibility and control over your lakehouse assets. Delta table support in Open Catalog, a managed service for Apache Polaris (incubating), is coming to private preview soon.

Modern DevOps experience

Reduce time to impact with developer productivity improvements that allow you to focus on high-value work rather than just keeping the lights on. The DevOps functionality in Snowflake allows you to streamline and automate the software development lifecycle for your Snowflake environments with an emphasis on best practices in CI/CD, code development and infrastructure management. Coupled with modern DevOps tools and AI support in Snowflake, you get smooth integration between development and operational tasks, leading to a more productive and efficient workflow.

In line with our commitment to continuously improve your DevOps experience with Snowflake, we are announcing some new updates.

  • Snowflake Workspaces: Provides a modern UI for all Snowflake development tasks. Starting with dbt Projects and SQL support, builders will be able to utilize a single common IDE with rich developer features, including native Git integration, side-by-side visual differencing and in-line AI Copilot code assistance, when working with files in Snowflake. Additional object support will be delivered in the future.

  • Snowflake infrastructure management using Terraform: The Snowflake Terraform provider enables a consistent workflow for managing Snowflake resources — including warehouses, databases, schemas, tables, roles, grants and more — using HashiCorp Terraform to manage your Infrastructure as Code (IaC).

  • More ways to connect to your Git repos: Now you can use custom URLs to connect to your Git repos (rather than being limited just to repos that belong to well-known domains), providing you more flexibility in how you configure your Git environment. 

  • Python 3.9 runtime support: You can now use Python 3.9 with your Snowflake Warehouse Notebooks.
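The Terraform workflow mentioned above can be sketched as a minimal configuration. Treat this as an illustrative fragment: the provider source shown here reflects the provider's current registry namespace, and all resource names and values are hypothetical.

```hcl
# Illustrative IaC sketch; names and values are hypothetical.
terraform {
  required_providers {
    snowflake = {
      source = "snowflakedb/snowflake"  # Snowflake Terraform provider
    }
  }
}

# A database and a warehouse managed declaratively alongside the rest
# of your infrastructure, reviewed and versioned like any other code.
resource "snowflake_database" "analytics" {
  name = "ANALYTICS"
}

resource "snowflake_warehouse" "transform_wh" {
  name           = "TRANSFORM_WH"
  warehouse_size = "XSMALL"
  auto_suspend   = 60  # seconds of inactivity before suspending
}
```

With resources expressed this way, `terraform plan` and `terraform apply` give you a reviewable, repeatable path for environment changes instead of ad hoc console edits.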

The future is now

Snowflake's latest innovations are designed to tackle the biggest challenges in data engineering head-on. Let's explore how these advancements can revolutionize your data strategy.

Build better pipelines

Modern data engineering thrives on streamlined collaboration and scalability. By expanding our native capabilities across ingestion and transformation with capabilities such as Openflow and dbt Projects, we're empowering your teams to work together seamlessly within Snowflake's secure environment. We are also supporting the flexibility of open standards and popular open source software (OSS) like dbt and Iceberg, integrating them effortlessly into your existing workflows.

Free your team from the burden of managing complex infrastructure and instead focus on high-value tasks. Our serverless transformations and orchestration options eliminate the need for hosting and managing compute clusters, all while delivering exceptional performance. And to top it off, automation is at the heart of our platform, streamlining your development lifecycle through CI/CD, deployment automation and robust infrastructure management.

Accelerate your open lakehouse

Your open lakehouse should run like a well-oiled machine, capable of handling all your data formats seamlessly, regardless of where your data is stored. Snowflake empowers you to connect, transform and activate all your data with ease. Security and governance are paramount — our platform provides robust data protection, granular access controls and comprehensive governance practices, including data masking and audit access. With Snowflake, you can confidently maintain data quality, accuracy and reliability across your entire data ecosystem. We're committed to fostering a data environment that promotes innovation and productivity through optimized tooling and standards, all while ensuring your architecture can scale effortlessly as your business evolves.

Harness your data for AI

Unleash the power of AI with Snowflake's ability to unify your unstructured, semi-structured and structured data. Seamlessly combine text, documents, images and other unstructured data formats with your existing structured data, creating a comprehensive foundation for AI models. Leverage features like Openflow (built with Snowflake Cortex AI processors available) and Document AI to harness the power of LLMs and AI directly in your pipeline. Use Snowpark's powerful capabilities to process and transform unstructured data at scale using Python and other familiar languages.

Snowflake also enables you to build and deploy cutting-edge generative AI applications by harnessing the power of top-tier LLMs, state-of-the-art retrieval-augmented generation (RAG) and other advanced gen AI services through Cortex AI. Connect your entire enterprise data landscape to AI with near real-time, bidirectional data flows using Openflow and its support for diverse data structures and requirements. Simplify the complexity of data pipelines for AI, eliminating the need to juggle disparate tools across multiple teams. With Snowflake's unified security, governance and observability, you can confidently deliver AI solutions to production, adding trust and reliability every step of the way.

To learn more about these data engineering advancements and more, register for Snowflake’s next Data Engineering Connect event on July 29, 2025. 
