Building on the announcements made at this year’s Summit, Snowflake has released a number of new enhancements, especially in the areas of data programmability, global governance, and data sharing. Read on to learn more. For additional details and to see some of these new capabilities in action, be sure to check out the on-demand sessions from Summit. 

DATA PROGRAMMABILITY

Say Hello to Data Programmability in Snowflake with Snowpark and Java UDFs 

At Snowflake Summit 2021, we announced that Snowpark and Java user-defined functions (UDFs) were starting to roll out to customers. These features are now available in preview to all customers on AWS and represent a major advancement in data programmability, making it easier to do more of your work directly on Snowflake’s platform.

  • Snowpark is a new developer experience Snowflake uses to bring deeply integrated, DataFrame-style programming to the languages developers like to use, starting with Scala. Snowpark is designed to make building complex data pipelines a breeze and to allow developers to interact with Snowflake directly without moving data.
  • With Java UDFs, customers can run their Java code right inside of Snowflake for better performance, greatly expanding the transformation capabilities and reducing the management complexity of hosting external services.

Snowpark and Java UDFs open up the data programmability of Snowflake to non-SQL developers, so they can use their preferred languages and tooling and execute with Snowflake to get the benefits of the Data Cloud’s performance, ease of use, scalability, and much more. Try Snowpark and Java UDFs yourself with this step-by-step guide.

Extensibility with External Functions for Google Cloud API Gateway 

Modern analytical workloads often involve complex transformations or augmentations that depend on custom code or third-party services. However, external services and libraries can complicate data pipelines. To simplify the use of remote services, Snowflake created the External Functions feature, which enables users to invoke external APIs and custom code from within Snowflake and blend the results into their query results. External functions are now available for Google API Gateway, which makes them generally available for all three major cloud providers, including support for virtual private endpoints (VPCs) on AWS. Try external functions yourself by following this step-by-step guide.
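To make the idea of invoking custom code from Snowflake more concrete, here is a minimal sketch of the remote side of an external function. It assumes the batch layout in which Snowflake sends a JSON body of the form {"data": [[row_number, arg, ...], ...]} and expects one result per row number in return. The function name handle_batch and the upper-casing transformation are hypothetical, and the API gateway or serverless wiring around it is omitted.

```python
import json


def handle_batch(request_body: str) -> str:
    """Process one batch of rows sent to a remote service.

    Assumed input layout:  {"data": [[row_number, arg1, ...], ...]}
    Assumed output layout: {"data": [[row_number, result], ...]}
    """
    rows = json.loads(request_body)["data"]
    results = []
    for row_number, *args in rows:
        # Hypothetical transformation: upper-case the first argument.
        # The row number is echoed back so results can be matched to rows.
        results.append([row_number, str(args[0]).upper()])
    return json.dumps({"data": results})


if __name__ == "__main__":
    sample = json.dumps({"data": [[0, "alpha"], [1, "beta"]]})
    print(handle_batch(sample))
```

In a real deployment this function would sit behind the cloud provider's API gateway. Check the exact request and response contract against the external functions documentation before relying on it.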

SQL REST API and API Playground 

The all-new SQL REST API was announced at Snowflake Summit 2021 and is currently available in public preview to all customers. This feature enables developers to submit and execute SQL statements directly over a REST interface and is useful for resource-constrained environments, custom integrations and plugins, and developing custom drivers for Snowflake. Developers can interact with this API in a playground environment at api.developers.snowflake.com.
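As a rough illustration of submitting a SQL statement over REST, the sketch below builds an HTTP request with the Python standard library and stops short of sending it. The endpoint path /api/v2/statements, the body fields "statement" and "timeout", and the bearer-token header are assumptions to verify against the API reference. The account URL and token are placeholders.

```python
import json
import urllib.request


def build_statement_request(
    account_url: str,
    token: str,
    sql: str,
    path: str = "/api/v2/statements",  # assumed endpoint path
) -> urllib.request.Request:
    """Build (but do not send) a POST request that submits one SQL statement."""
    body = {"statement": sql, "timeout": 60}  # assumed field names
    return urllib.request.Request(
        url=f"{account_url}{path}",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json",
            "Authorization": f"Bearer {token}",  # placeholder credential
        },
        method="POST",
    )


if __name__ == "__main__":
    req = build_statement_request(
        "https://example.invalid", "placeholder-token", "SELECT 1"
    )
    print(req.get_method(), req.full_url)
```

Passing the resulting request to urllib.request.urlopen would perform the call. Because no client library is needed, this style suits the resource-constrained environments and custom integrations mentioned above.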

GLOBAL GOVERNANCE

Sensitive Data Protection with Row Access Policies

Snowflake customers can now use the Row Access Policies feature to determine which rows to return in query results. With the ability to create a policy once and apply it to multiple tables, row access policies simplify the protection of sensitive data across organizations. A row access policy can be as simple as allowing one particular role to view particular rows, or it can be more customized through the use of a mapping table. This feature is now generally available. Learn more.
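The mapping-table variant can be pictured with a small conceptual model. The sketch below is plain Python, not Snowflake SQL, and it only imitates the decision a row access policy makes: given the caller's role and a value from the row, should the row be returned? The role names, the region column, and the mapping contents are all hypothetical.

```python
from typing import Dict, List, Set

# Hypothetical mapping table: role name -> regions that role may see.
ROLE_TO_REGIONS: Dict[str, Set[str]] = {
    "SALES_EMEA": {"EMEA"},
    "SALES_GLOBAL": {"EMEA", "AMER", "APAC"},
}


def row_is_visible(current_role: str, row_region: str) -> bool:
    """Policy body: True means the row is returned to the caller."""
    return row_region in ROLE_TO_REGIONS.get(current_role, set())


def apply_policy(rows: List[dict], current_role: str) -> List[dict]:
    """Filter rows the way a query would see them under the policy."""
    return [r for r in rows if row_is_visible(current_role, r["region"])]


if __name__ == "__main__":
    table = [{"id": 1, "region": "EMEA"}, {"id": 2, "region": "AMER"}]
    print(apply_policy(table, "SALES_EMEA"))
```

Because the access rule lives in one function and one mapping, the same policy can be attached to many tables. That is the simplification the feature aims for.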

DATA SHARING AND DATA MARKETPLACE

Discover and Be Discovered in the Data Cloud  

Snowflake customers can discover and access helpful third-party data and services from more than 140 providers across 16 categories (as of April 30, 2021) as well as market their own products across the Snowflake Data Cloud. Snowflake Data Marketplace is generally available globally to all non-VPS Snowflake accounts hosted on Amazon Web Services (AWS), Microsoft Azure (with the exception of Microsoft Azure Government), and Google Cloud Platform. Support for Microsoft Azure Government is planned. Tap into third-party data.

Access Telemetry and Usage Metrics for Data Listings 

Snowflake Data Marketplace and Data Exchange providers can now access telemetry data such as click-through rates on listings and usage data such as the count of queries per consumer. Data listing telemetry and usage metrics give data providers insight into customer interest in and usage of data products in Snowflake Data Marketplace and Data Exchanges. This capability is now generally available. Explore telemetry and usage metrics.

Snowflake Data Providers Now Can Set Service Terms at Listing Level 

New functionality now in public preview allows data providers on Snowflake Data Marketplace and Data Exchanges to provide terms specific to a given standard listing. This allows data providers to offer unique terms of service for each standard data set they provide. For personalized listings, the provider can now choose to leave this area blank in order to handle terms of service offline. Learn more.

ECOSYSTEM UPDATES

Updates to Snowflake Connector for Python and Go Snowflake Driver 

The Snowflake Connector for Python offers two settings that prevent Snowflake from prompting users to log in again after a period of inactivity during a session: the session parameter CLIENT_SESSION_KEEP_ALIVE (common to the Python, JDBC, ODBC, and Node.js connectors) and client_session_keep_alive (a setting specific to the Snowflake Connector for Python). Version 2.4.6, now generally available, simplifies the logic by changing the default value of client_session_keep_alive from False to None. Existing behavior does not change under the new default: when client_session_keep_alive is None, it is ignored and the value of the session parameter CLIENT_SESSION_KEEP_ALIVE is used. Passing client_session_keep_alive=False or client_session_keep_alive=True, however, now overrides the session parameter, which was not the case in version 2.4.5 and earlier. Version 2.4.6 also adds support for retrieving metadata about the columns in a result set without having to execute a query. In addition, the Go Snowflake Driver, also generally available, now supports the Bulk Array Binding feature, which can improve performance when loading large volumes of data from a Golang client. Learn more about Snowflake Connector for Python and Go Snowflake Driver.
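The keep-alive precedence described for version 2.4.6 can be summarized in a few lines. This is a model of the rule as stated above, not the connector's actual source code. The function and argument names are hypothetical.

```python
from typing import Optional


def effective_keep_alive(connection_arg: Optional[bool], session_param: bool) -> bool:
    """Return the keep-alive value that takes effect.

    connection_arg: value passed as client_session_keep_alive (None by default).
    session_param:  value of the session parameter CLIENT_SESSION_KEEP_ALIVE.
    """
    if connection_arg is None:
        # Default: the connection setting is ignored, the session parameter wins.
        return session_param
    # An explicit True or False overrides the session parameter.
    return connection_arg


if __name__ == "__main__":
    for arg in (None, True, False):
        for param in (True, False):
            print(arg, param, effective_keep_alive(arg, param))
```

In practice, leaving client_session_keep_alive unset preserves whatever the account or session already specifies, while passing True or False explicitly is now a reliable way to override it for one connection.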

PLATFORM OPTIMIZATION

Expanded Location Intelligence Capabilities with Geospatial Data in Snowflake 

Support for geospatial data in Snowflake is now generally available, providing a suite of functions for constructing, formatting, measuring, and computing relationships between geospatial objects. Since the launch of this capability, we’ve added more functions and improved the performance of queries that use geospatial joins. Additionally, Tableau, CARTO, and Safe Software have enabled better visualization and location intelligence capabilities in their software with native connectors, data enrichment, rapid exploration, and spatial analysis and visualizations. Further, location-based data sets in Snowflake Data Marketplace are now available: CARTO shares demographic data, SafeGraph shares foot traffic data, Weather Source shares weather and forecast data, Airlines Reporting Corporation shares travel sales data, and CoreLogic shares parcel boundaries. Learn more. 

View Query History and Get All Your Administration Tasks Done in the New Web Interface 

The new Snowflake web interface, now in preview and accessible under “preview app,” got a major update in June that makes it possible for users to access query history and to add and manage warehouses and users. Additional new functionality, including visualizations for role hierarchy and account usage, helps users understand who has access and how data is being used in the organization. Learn more.

Larger Warehouse Sizes for Your Compute-Intensive Workloads

In June, Snowflake announced the addition of two new virtual warehouse sizes, 5XL and 6XL, giving users the ability to add more compute power to their workloads and enable faster data loading, transformations, and querying. These sizes, now in public preview, are currently available only on AWS, with support for Microsoft Azure to follow. Previously, customers who needed to support compute-intensive workloads for data processing had to do batch processing and use multiple 4XL warehouses to accomplish their tasks. The new 5XL and 6XL virtual warehouse sizes give users the ability to run larger compute-intensive workloads in a performant fashion without any batching. Learn more.

Create and Manage Multiple Accounts Within Your Organization 

Snowflake Organizations is a first-class Snowflake object that enables administrators to view, create, and manage all of their accounts across different regions and cloud platforms. This feature, now generally available, simplifies account management and billing, enables self-service account creation, increases data availability and durability with data replication and failover, and enables seamless data sharing across regions. Learn more.

REGIONAL EXPANSION

New Region Alert: Central U.S. (Iowa) on Azure 

Snowflake has expanded its availability to the Central U.S. (Iowa) region on Microsoft Azure. With this added region, Snowflake now offers 11 geographical locations in North America across all three supported cloud platforms (AWS, Google Cloud Platform, and Azure).

For a full list of supported regions, check out this guide. 

RECENTLY ADDED SNOWFLAKE DATA MARKETPLACE PROVIDERS 

Atheon Analytics

Atheon Analytics provides data and analytics products to the U.K. grocery sector. The organization’s SKUtrak customers can access this data, which includes two years of daily trading data across U.K. grocery retailers, on Snowflake Data Marketplace. Learn more.

Atlas Technology Group

Atlas Technology Group provides retail analytics to empower brands. Its sample data set includes standard weekly and daily metrics by product and store. Learn more.

BDEX

Combining hundreds of data sources in real time into one unique data infrastructure, BDEX offers what it claims is the most accurate fully validated identity graph available in the U.S. and across the globe. Learn more.

Compile

Compile is a unified, fully linked system of record for the U.S. healthcare market. Compile’s data offers an unprecedented level of visibility into the treatment activity of patients and networks of prescribers and providers in the healthcare domain. Learn more.

ContentEngine

ContentEngine aggregates, produces, and syndicates the largest library of news and information content covering Mexico and Central and South America, offering as many as 9,000 stories daily from more than 500 individual titles. Learn more.

Data n Dashboards

Data n Dashboards is a one-stop solution for organizations’ internal and external data and dashboard needs. Its Stats NZ Census 2018 data set comprises more than 205 tables providing information about Aotearoa New Zealand demography. Learn more.

Edvisors Network

Edvisors Network, a higher education media marketing company, provides more than 13 million records for U.S.-based members of Generation Z, millennials, college students/graduates, and alumni via Snowflake Data Marketplace. Learn more.

Equilar

Equilar is the leading provider of corporate leadership data solutions that help business leaders, investors, and advisors assess leadership through a social and governance lens. Its Snowflake Data Marketplace listings include data sets for executive and director relationships, equity and compensation, and people business intelligence. Learn more.

eyos

eyos provides transaction-level POS data directly from the points of sale of 2,000 digitally connected, independently owned grocery retail stores in Indonesia. The global retail data automation platform company helps retailers around the world identify shoppers, automate in-store marketing, and leverage insights and predictions. Learn more.

Facteus

Whether you’re looking for macro-, micro-, or company-specific trends, Facteus’ U.S. Consumer Payments data set gives you a distinct advantage: The panel’s breadth and depth of coverage of U.S. consumer spending includes data from more than 20 million active payment cards and delivers actionable intelligence and insight into the drivers behind company performance or industry trends. Learn more.

Facts and Dimensions Ltd

Facts and Dimensions provides the widest available single source of U.K. health statistics and reference data. This data is in use by most of the NHS, including NHS England, and many other organizations. Its coronavirus data set provides a comprehensive catalog of U.K.-based coronavirus data along with worldwide data sets for comparison usage. Learn more.

Gretel.ai

Gretel.ai is an advanced synthetic data platform featuring simple APIs and an open-source AI-based core. Its U.S. Census Income Reduced Bias data set uses synthetic data to boost underrepresented race, gender, and income-bracket classes and is commonly used to predict whether income exceeds $50K/year based on census data. Learn more.

Jobvite

Jobvite is leading the next wave of talent acquisition innovation with a candidate-centric recruiting model that helps companies engage candidates with meaningful experiences at the right time, in the right way. Its Talent Acquisition Platform data set provides granular data that helps organizations track and measure hiring success. Learn more.

Pollen Analytics

Pollen Analytics reprocesses projections for hundreds of climate variables from the scientific community into an easy-to-use data set that you can query within Snowflake. Tables include historical and projected climate data for variables such as temperature, humidity, and snow coverage. Learn more.

RIMES

The RIMES Global ETF Data Samples data set includes validated daily exchange-traded fund (ETF) composition and reference data. RIMES normalizes, validates, and enriches ETF data sets by linking compositions with their underlying constituents, enabling customers to perform portfolio, risk, compliance, and performance analysis according to their custom specifications. Learn more.

Rockerbox

Rockerbox is a leading attribution provider for digital brands, providing a single source of truth across all of your organization’s marketing efforts so you can quickly uncover your marketing effectiveness. You can easily join the data from Rockerbox’s data set with your internal data to develop custom analytics. Learn more.

Quantfy

Quantfy provides compliant and seamless access to normalized low-latency market data and AI-based insights across cryptocurrency, foreign exchange, and stock market trades. Its Crypto OHLCV Feed data set provides near real-time information about trade activity across several crypto exchanges. Learn more.

Vantage Point Consulting

Vantage Point Consulting specializes in higher education data, technology, and user experience design. Its U.S. Salaries by Occupation and ZIP Code data set on Snowflake Data Marketplace provides salary data for 10th, 25th, 50th, 75th, and 90th percentiles by ZIP code, latitude, longitude, and O*NET codes and contains more than 40 million rows. Learn more.

Viscacha Data

Viscacha Data tracks real-time sales, inventory, and price data from big box and specialty retailers to provide a direct line of sight into which products and brands are being sold. Among its listings, the “Target Sales, Inventory & Prices” data set available on Snowflake Data Marketplace monitors sales at Target (TGT) on an hourly basis for ecommerce sales and on a daily basis for in-store sales. SKU-level sales are tracked and reported as soon as 15 minutes after they occur. Learn more.

Windsor.ai

Windsor.ai helps you to connect all your marketing, analytics, sales, and CRM platforms and stream data into Snowflake so it can be modified, joined, and visualized in your BI platform of choice. Its Multitouch Attribution data set on Snowflake Data Marketplace provides more than 50 connectors—including connectors for Adobe Analytics, Google Ads, HubSpot, Twitter Ads, and Salesforce—to help answer questions such as which channels are working the best across the entire customer journey, return on ad spend, and cost per acquisition. Learn more.