Snowflake continued expanding its platform capabilities at the start of the new year, adding updates to data sharing, Snowsight, and data pipelines that help customers and partners access, mobilize, and share their data for better data-driven outcomes. Here’s a brief rundown of some of the exciting announcements from January 2021.

Data Sharing

The Snowflake Data Cloud makes it easy for organizations to securely share live, granular data across their business ecosystems, reducing the risks, costs, and challenges found in traditional sharing methods. In January, Snowflake announced additional data sharing capabilities, including extending Snowflake Data Marketplace to GCP and support for data exchanges on GCP.

Snowflake Data Marketplace extended to GCP

Snowflake Data Marketplace gives data science, business intelligence, and analytics professionals—and everyone else who desires data-driven decision-making—more than 375 live and ready-to-query data sets, from more than 125 third-party data providers and data service providers. In January, Snowflake announced the public preview of Snowflake Data Marketplace for Snowflake accounts hosted on GCP. All Snowflake users who have accounts hosted on GCP can go to Snowflake Data Marketplace to browse, consume, and request available data sets with no ETL, moving, or copying data, unlocking new insights for driving innovation. Insurers, for example, can grow their subscriber base by incorporating always-fresh third-party demographic data to better personalize new offers, and retail customers can align their supply chains with customer demand by using point-of-sale, weather, and consumer-mobility information. All of this is available as live, governed data on Snowflake Data Marketplace.

Data exchange support on GCP

In addition, Snowflake customers and partners on GCP can now securely share data among designated groups within their business ecosystems, thanks to Snowflake’s recently announced support for data exchanges hosted on GCP. Data exchanges enable Snowflake users to efficiently discover and access data to augment their own data sets, share data sets with suppliers and partners, and more. Data exchange support on GCP is now in public preview.

Core Platform

The Snowflake platform was designed from the outset to take full advantage of the unique attributes of the cloud. It is capable of massive scale, unwavering performance, superior economics, and world-class data governance and security. Among Snowflake’s recent updates to the core platform are expanded availability of Snowsight, simplified account administration, and GA support for external tables.

Snowsight available in GCP regions

The Snowsight interface enables entire organizations to move more quickly from data to insights via ad hoc analytics, collaboration, and rich visualizations. For example, customers can easily create usage and billing dashboards to monitor Snowflake compute resources. Previously, Snowsight was available only in regions on the AWS or Microsoft Azure cloud platforms. In January, Snowflake announced that Snowsight is available in public preview for GCP regions as well.

Simplified administration with new organizations object and format

Released in private preview in January, organizations are first-class Snowflake objects that link all the accounts owned by your entity. Organizations help Snowflake customers simplify account management and billing, database replication and failover/failback, data sharing, and many other account administration tasks. The Organizations feature introduces a new role, ORGADMIN, which enables administrators to create new accounts across most Snowflake-supported regions and cloud platforms, and to view and manage all of the accounts in an entity.

To simplify management of account names for new and existing accounts, Snowflake also introduced in private preview a new format for uniquely identifying accounts. In the URLs and other places where an account identifier is required, the current format uses the account locator (formerly referred to as the account name), along with additional region and cloud platform information (if required). For example, the current format of the URL for accessing the Snowflake web interface is:

https://<account_locator>.<additional_info>.snowflakecomputing.com

The new format introduced in this preview combines the name of your entity with the unique name assigned to each account:

<entity_name>-<account_name>

With this new format, no additional region or platform information is required. For example, the new format of the URL would be:

https://<entity_name>-<account_name>.snowflakecomputing.com
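To make the difference between the two formats concrete, here is a minimal Python sketch that assembles both kinds of URL from their parts. The function names and the sample values (`xy12345`, `us-east-2.aws`, `acme`, `analytics`) are hypothetical and used only for illustration.

```python
from typing import Optional

DOMAIN = "snowflakecomputing.com"


def locator_url(account_locator: str, additional_info: Optional[str] = None) -> str:
    """Current format: account locator plus optional region/cloud information."""
    host = account_locator if not additional_info else f"{account_locator}.{additional_info}"
    return f"https://{host}.{DOMAIN}"


def entity_url(entity_name: str, account_name: str) -> str:
    """New format: entity name and account name, with no region or platform segment."""
    return f"https://{entity_name}-{account_name}.{DOMAIN}"


# Hypothetical values, for illustration only.
print(locator_url("xy12345", "us-east-2.aws"))
print(entity_url("acme", "analytics"))
```

Note that the second function takes no region or cloud argument at all, which is the simplification the new format is meant to provide.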

Data Lake

Snowflake provides a flexible solution for organizations that want to enable or enhance their data lake strategies. Snowflake delivers fast query performance, extensible data pipelines, and secure collaboration to help organizations unleash the full potential of their data. In January, Snowflake continued to extend its capabilities with GA support for external tables. 

External tables improve data lake queries

For customers looking to augment an existing data lake, external tables make it possible to query data in the lake without ingesting it into Snowflake. Customers can also create materialized views on external tables to speed up query performance. In January 2021, Snowflake announced that external table support is generally available, including support for Microsoft Azure Data Lake Storage (ADLS) Gen2. This release also includes the ability to create streams on external tables. Streams track new file registrations for external tables, so that actions can be taken on files newly added to the data lake.
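As a rough illustration of how these three pieces fit together, the following Python sketch builds the SQL statements for an external table, a materialized view on top of it, and a stream that tracks newly registered files. The table name, stage path, Parquet file format, and the `event_type` field are all assumptions made for the example, not details from the announcement.

```python
from typing import List


def external_table_statements(table: str, stage_path: str) -> List[str]:
    """Return SQL for an external table, a materialized view on it, and a stream.

    `table` and `stage_path` are hypothetical names supplied by the caller;
    the stage is assumed to exist already and to point at the data lake.
    """
    return [
        # External table: the files stay in the data lake and are queried in place.
        f"CREATE EXTERNAL TABLE {table}\n"
        f"  WITH LOCATION = @{stage_path}\n"
        f"  FILE_FORMAT = (TYPE = PARQUET)",
        # Materialized view: precomputes a projection to speed up repeated queries.
        # 'event_type' is an assumed field inside each file's VALUE column.
        f"CREATE MATERIALIZED VIEW {table}_mv AS\n"
        f"  SELECT value:event_type::STRING AS event_type FROM {table}",
        # Stream: records new file registrations; external tables use insert-only streams.
        f"CREATE STREAM {table}_stream ON EXTERNAL TABLE {table} INSERT_ONLY = TRUE",
    ]


for statement in external_table_statements("ext_events", "lake_stage/events/"):
    print(statement + ";\n")
```

Each statement could then be passed to a cursor's `execute` method or run in a worksheet.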

Data Pipelines

Data pipelines automate many of the manual steps involved in transforming and optimizing continuous data loads, enabling a smooth, automated flow of data from one step to another. Data pipelines in Snowflake can be batch or continuous, and processing can happen directly within Snowflake itself. In January, Snowflake released the new External Functions feature, which provides support for extending data pipelines.

External functions for AWS and Azure APIs

External functions enable teams to apply services or business logic that live outside of Snowflake to data within Snowflake, by sending the data to where the computation occurs. Snowflake first announced this feature for calling regional endpoints via an AWS gateway in June 2020. As of early February 2021, the External Functions feature is generally available for AWS API Gateway and Microsoft Azure API Management. With external functions, you can easily extend your data pipelines by calling out to external services, third-party libraries, or even your own custom logic, enabling exciting new use cases such as external tokenization, geocoding, scoring data using pretrained machine learning models, and much more. With the newly announced support for Azure API Management, organizations with Snowflake deployments and resources running on Azure can now ensure their gateway is running on the same cloud and region, avoiding cross-cloud calls.
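For a sense of what the remote side of an external function can look like, here is a minimal Python sketch of a tokenization handler. It assumes the JSON batch layout used by external functions, where the request body holds a `data` array of rows and each row starts with its row number, and it assumes a function that takes a single string argument. The `tokenize` helper is a hypothetical stand-in (a truncated SHA-256 digest), not a production tokenization scheme.

```python
import hashlib
import json


def tokenize(value: str) -> str:
    """Hypothetical tokenizer: a truncated SHA-256 digest of the input."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()[:16]


def handle_batch(request_body: str) -> str:
    """Process one batch of rows and return the response body.

    Assumes each row is [row_number, value]. The row number is echoed back
    so that each result can be matched to the row it came from.
    """
    rows = json.loads(request_body)["data"]
    results = [[row_number, tokenize(value)] for row_number, value in rows]
    return json.dumps({"data": results})


# Hypothetical batch of two rows.
body = json.dumps({"data": [[0, "alice@example.com"], [1, "bob@example.com"]]})
print(handle_batch(body))
```

In a real deployment this function would sit behind the API gateway (for example, inside a serverless function), and the gateway would pass the HTTP request body to it.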

In December 2020, Snowflake also announced GA of asynchronous external functions. This means that users do not need to worry about queries timing out during a heavy workload or with large data volumes. Following the async protocol, the remote services can process the data asynchronously (without blocking the initial request) and return the results in a future response.
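The asynchronous pattern can be sketched in plain Python without any network code. The sketch below assumes a common shape for such protocols: the initial request is acknowledged with HTTP status 202, the caller polls with the same batch identifier, and the service answers 200 with the results once the work is done. The class, method names, and batch ID are hypothetical, and the slow work is simulated with a short sleep.

```python
import json
import time
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Dict, Optional, Tuple


class AsyncRemoteService:
    """In-memory stand-in for a remote service that works asynchronously."""

    def __init__(self) -> None:
        self._pool = ThreadPoolExecutor(max_workers=4)
        self._jobs: Dict[str, Future] = {}

    def _process(self, body: str) -> str:
        time.sleep(0.05)  # simulated slow work
        rows = json.loads(body)["data"]
        return json.dumps({"data": [[n, str(v).upper()] for n, v in rows]})

    def post(self, batch_id: str, body: str) -> Tuple[int, Optional[str]]:
        """Accept the batch, start work in the background, and return at once."""
        if batch_id not in self._jobs:
            self._jobs[batch_id] = self._pool.submit(self._process, body)
        return 202, None

    def get(self, batch_id: str) -> Tuple[int, Optional[str]]:
        """Poll: 202 while the job is still running, 200 with the results when done."""
        job = self._jobs[batch_id]
        if not job.done():
            return 202, None
        return 200, job.result()


def call_and_poll(service: AsyncRemoteService, batch_id: str, body: str,
                  interval: float = 0.01, timeout: float = 5.0) -> Optional[str]:
    """Caller side: send the batch, then poll until results arrive or time runs out."""
    status, payload = service.post(batch_id, body)
    deadline = time.monotonic() + timeout
    while status == 202:
        if time.monotonic() > deadline:
            raise TimeoutError(f"batch {batch_id} did not finish in time")
        time.sleep(interval)
        status, payload = service.get(batch_id)
    return payload


request = json.dumps({"data": [[0, "a"], [1, "b"]]})
print(call_and_poll(AsyncRemoteService(), "batch-1", request))
```

Because the initial request returns immediately, no single HTTP call has to stay open for the full duration of the work, which is what removes the timeout concern for heavy workloads.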

Ecosystem

Snowflake works with a wide array of industry-leading tools and technologies, enabling customers to access the platform through an extensive ecosystem of connectors, drivers, programming languages, and tools. In January, Snowflake announced GA support for Microsoft ADLS Gen2, asynchronous query support for Snowflake Connector for Python, and updates to its PHP PDO, ODBC, and JDBC drivers.

Support for ADLS Gen2

Support for public cloud providers helps Snowflake customers seamlessly integrate their choice of cloud platform with Snowflake, enabling them to make their existing architecture more efficient. In January, Snowflake announced the GA of support for Microsoft ADLS Gen2. Customers now can load and query their data more easily. Specifically, they can bulk-load data from ADLS Gen2 (using COPY INTO <table>), load data continuously from ADLS Gen2 (by calling the Snowpipe REST API or leveraging event notifications), and query files in an ADLS Gen2 data lake by using external tables.

Asynchronous query support for Snowflake Connector for Python

Version 2.3.7 of the Snowflake Connector for Python adds support for asynchronous queries, so users can start a query and then use polling to determine when the query has completed. This support, now generally available, allows the client program to run multiple queries in parallel without using multithreading, improving the program's efficiency.
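The submit-then-poll pattern might look like the following sketch. It relies on the connector's `execute_async`, `sfqid`, `get_query_status`, `is_still_running`, and `get_results_from_sfqid` members; the helper function itself, its name, and its parameters are hypothetical. The connection is passed in rather than created here, so the sketch does not assume any particular credentials.

```python
import time
from typing import Any, Dict, List


def run_queries_async(conn: Any, statements: List[str],
                      poll_seconds: float = 1.0) -> Dict[str, list]:
    """Submit every statement without waiting, then poll until each one finishes.

    `conn` is expected to be an open Snowflake Connector for Python connection.
    Returns a mapping of query ID to fetched rows.
    """
    # Phase 1: start all queries; execute_async returns without waiting for results.
    query_ids = []
    for sql in statements:
        cur = conn.cursor()
        cur.execute_async(sql)
        query_ids.append(cur.sfqid)

    # Phase 2: poll each query's status, then fetch its results by query ID.
    results: Dict[str, list] = {}
    for query_id in query_ids:
        while conn.is_still_running(conn.get_query_status(query_id)):
            time.sleep(poll_seconds)
        cur = conn.cursor()
        cur.get_results_from_sfqid(query_id)
        results[query_id] = cur.fetchall()
    return results
```

With a connection obtained from `snowflake.connector.connect(...)`, a call such as `run_queries_async(conn, ["SELECT 1", "SELECT 2"])` would have both queries running on Snowflake at the same time while the client uses a single thread.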

Improved support for the PHP PDO driver for Snowflake

In January, Snowflake also announced GA of the PHP PDO driver for Snowflake. Customers developing PHP applications can now easily connect to Snowflake and perform all of the standard database operations. You can find installation and usage instructions on GitHub.

Boosted performance for ODBC and JDBC drivers

In addition, Snowflake announced updates to its ODBC and JDBC drivers with a new feature, bulk array binding with streaming PUTs. When a user inserts a large number of values, this feature improves driver performance by streaming the data to a temporary stage for ingestion, without creating files on the local machine. Users can insert multiple rows in a single batch by binding the parameters of an INSERT statement to array variables, and both the ODBC and JDBC drivers will automatically stream the data when the number of values exceeds a threshold.

Until Next Month

These are just a few of the new features and updates available in Snowflake. For the full list, see the Release Notes. And as always, watch for additional updates as Snowflake continuously adds new features and functions that help customers in every industry go further and faster with their data.