New Snowflake Features Released in July 2020

Author: Snowflake

Snowflake News

In keeping with its continued focus on building the most powerful and flexible cloud data platform for data-driven companies, Snowflake released a host of new features in July. The following roundup briefly describes several of them.

Column-Level Security

Security has always been baked into Snowflake’s cloud data platform, allowing organizations to easily and safely share data with internal consumers and external customers while adhering to relevant data governance, security, and regulatory policies.

Two column-level security features Snowflake introduced in public preview last month allow users to further safeguard data and make it easier to govern by applying a masking policy to a column in a table or view. Dynamic Data Masking is a column-level security feature that uses masking policies to selectively mask plain-text data in table and view columns at query time, and External Tokenization allows organizations to tokenize sensitive data before loading that data into Snowflake and dynamically detokenize data at query runtime by using masking policies with external functions. 

With these features, policy administrators can hide details such as customers’ personally identifiable information (PII) and protected health information (PHI) data from specific users and roles while allowing others to see the data, allowing very specific masking policies for users and groups. And, because these policies are also universal, you can implement them with confidence.

Both features require Enterprise Edition or higher.

Also included for Enterprise Edition or above is a new POLICY NAME column in the DESCRIBE TABLE, now in GA, output that indicates whether a column-level security masking policy is applied to a column, so administrators can quickly and easily see how and where policies are applied.

New SQL Features and Functions

With SQL remaining the most-used query language for analyzing data, Snowflake supports the most common standardized version of SQL (ANSI) for powerful relational database querying and extends it with new functionality based on customer use cases and demand. Snowflake offers a wide range of SQL functions for use in data loading, data analysis, system administration, security, and more. In July, Snowflake added the following new SQL functions:

Geospatial: Snowflake recently released a new GEOGRAPHY data type that uses a spherical-earth coordinate system to store geospatial data. Users simply create a GEOGRAPHY column and load data from one of multiple supported formats, and Snowflake stores an optimized representation for fast queries. 

Snowflake already supports more than 40 functions used in geospatial analysis and continues to add more based on customer demand. A new SQL function, ST_ENVELOPE, now in public preview, returns the minimum bounding box (a rectangular “envelope”) that encloses a specified GEOGRAPHY object. And a new ST_HAUSDORFFDISTANCE function, also in preview, measures the similarity between two geospatial objects, aiding in location matching and finding similar objects. 

Semi-structured data: A new OBJECT_PICK capability now in general availability (GA) returns a new OBJECT containing only the specified subset of the key-value pairs from an existing object that the user needs.

Context: New SQL functions INVOKER_ROLE, INVOKER_SHARE, IS_GRANTED_TO_INVOKER_ROLE, and IS_ROLE_IN_SESSION, also now in GA, provide context about the user running a query, allowing for more expressive column-level security policies.  

Aggregation: A new COUNT_IF function now in GA counts only the records that meet the input condition, simplifying the SQL for a common query pattern.  

Conditional expressions: NULLIFZERO returns NULL if the argument evaluates to 0; otherwise, it returns the argument. This functionality, now in GA, can be handy if you want to avoid division by zero or avoid printing zeros in reports, for example. 

System: SYSTEM$GET_SNOWFLAKE_PLATFORM_INFO, also in GA, returns the ID of the Snowflake virtual network in which your account is located: Amazon Virtual Private Cloud or Azure Virtual Network. This ID can be specified in the policies for your cloud storage to allow Snowflake to connect to it while blocking requests that originate from outside the virtual network.

Ecosystem Updates 

The Snowflake partner ecosystem is an important driver in ensuring you get the most from your Snowflake investment. We are continually identifying ways to make the ecosystem easier to interact with, offer you a broader range of partner tools to integrate with, and deliver support for the latest technologies. The past few months saw several ecosystem updates, including:

Spark 3.0 support: Version 2.8.0 of the Snowflake Connector for Spark supports Spark 3.0. With this new version of the connector, now in GA, Snowflake continues to support the latest three Spark versions: Spark 2.3, 2.4, and 3.0.

Snowflake JDBC Driver v3.12.8: Version 3.12.8 of the Snowflake JDBC Driver, now in GA, supports asynchronous queries (queries that return control to the user before the query completes). Users can start a query, and then use polling to determine when the query has completed. After the query completes, the user can read the result set. This feature allows a client program to run multiple queries in parallel without the client program itself using multithreading.

Snowflake JDBC Driver v3.12.8 and Python Connector v2.2.8: Starting with version 3.12.8 of the Snowflake JDBC driver and version 2.2.8 of the Python connector, connection caching with browser-based single sign-on (SSO) is now available for MacOS and Windows environments. This feature, now in GA, enables an application to use Snowflake’s browser-based SSO feature while minimizing the number of authentication pop-ups when connecting to Snowflake through the JDBC driver or the Python connector.

Snowflake ODBC Driver v2.21.5: Version 2.21.5 of the Snowflake ODBC driver includes support for performing the following two tasks for Snowflake accounts hosted on Google Cloud Platform: using the PUT command to upload local data files on a client machine to an internal (in other words, Snowflake) stage, and using the GET command to download data files from an internal (Snowflake) stage to a client machine. (Note that the ODBC driver already supported the PUT/GET commands for Snowflake accounts hosted on Amazon Web Services and Microsoft Azure cloud platforms.)

Google Cloud Platform integrations: New connectors from Google Data Fusion and Google Dataflow are now in public preview, letting users load and integrate data easily into Snowflake.

Snowflake Partner Connect: Snowflake also brought on board a wide ecosystem of partners to Snowflake Partner Connect, making it easy for you to get started directly in Snowflake and connect to partners including Hunters, Datalytyx, Striim, Domo, SqlDBM, Matillion, Dataiku, and DataRobot.

Continuing Innovation

As always, a big thank you to our customers for the feedback that helps us prioritize Snowflake product development. Stay tuned for next month’s update to learn about further Snowflake innovations that help break the barriers of existing architecture and software offerings.