Data engineers love to use SQL to solve all kinds of data problems. For this and more, Snowflake is a perfect partner. Snowflake’s support for standard SQL and several SQL variations, combined with JavaScript stored procedures, has helped me solve complex data challenges. But sometimes you might have the need for custom code. 

As an example, a few weeks ago, a client reached out wanting to know driving distances and times between its top customers and their warehouses. While Snowflake has HAVERSINE calculation and rich spatial data types, driving distance isn’t a simple geometric calculation: it requires a complex analysis of the road network and expected speeds along each path, and is usually provided by a sophisticated third-party service. I couldn’t use simple Snowflake JavaScript user-defined functions (UDFs) because those can’t call unsafe operations such as network calls. 

This is where Snowflake’s External Functions came in handy. External Functions is a new capability that enables you to write and call your own custom applications or call third-party applications that reside outside of Snowflake. These applications, referred to as remote services, can be written using any HTTP server stack, including cloud serverless compute services, such as AWS Lambda, with any programming language.  

Although the steps to set up an external function are well documented, they can be time consuming because you need to create roles, functions, and relations between your cloud platform and Snowflake instance. You also have to make sure only Snowflake can call your functions, and that they do so in a secure and auditable way. Hence, to make things simpler, I created a Serverless plugin to deploy the required resources for external functions to AWS and Snowflake.

Serverless to the Rescue

Serverless Framework is an open source cloud automation tool for deploying Lambda functions. It helps developers and data engineers focus on the code while hiding the deployment complexities. The following section describes a Serverless plugin that automates deployment of external functions to AWS and Snowflake. This plugin is available for free on github

The high-level process to deploy a Snowflake external function includes four steps: 

1. Create a new Serverless project from a Snowflake template. 

2. Modify the handler code. 

3. Add AWS and Snowflake configuration (such as credentials or region). 

4. Deploy the external function. 

Everything is automated. There’s no need to log into the AWS Management Console or Snowflake UI. 

For example, follow these steps to create a simple hello world function:

1. Create your project template: 

serverless create --template-url https://github.com/starschema/snowflake-aws-external-function -p hello-function rncd hello-function rnnpm install serverless-snowflake-external-function-plugin 

2. Set up the Snowflake and AWS connections. For example, you could change the following lines in Serverless.yml to your actual account information: 

The actual code for the function is located in handler.js:

'use strict';rnrnmodule.exports.hello = async event => {rn  const body = JSON.parse(event.body);rnrn  return {rn    statusCode: 200,rn    body: JSON.stringify(rn      {rn        data: body.data.map((row) => [row[0], 'hello from lambda'])rn      }rn    )rn  };rnrn  // Use this code if you don't use the http event with the LAMBDA-PROXY integrationrn  // return { message: 'Go Serverless v1.0! Your function executed successfully!', event };rn};

AWS access and secret keys are kept as environment variables (AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY). 

3. Deploy the function to AWS and Snowflake with an sls deploy command. 

If all went well, you should be able to call your function from the Snowflake console: 

Adding the Mapbox API 

This CALCDISTANCE function example uses the Mapbox Directions API, Mapbox’s turn-by-turn navigation REST API. Follow these steps:

1. Register to get an access token. This enables you to retrieve driving distance with a simple node/javascript application. 

2. Add the following JavaScript REST API call to your handler.js and modify the function signature in serverless.yml. This sample Mapbox implementation is just a web call:

To map this handler to the external function, change the function signature in serverless.yml as shown here: 

functions:rn  calc_distance:rn    handler: handler.calcDistancernrn    snowflake:rn      argument_signature: (with_profile varchar,coorda varchar,coordb varchar)rn      data_type: variantrnrn    events:rn      - http:rn          path: calc_distancern          method: post rn          authorizer: aws_iam

This definition creates a function variant CALC_DISTANCE (with_profile varchar,coorda varchar,coordb varchar), executing the handler.calcDistance code. Note that while JavaScript UDFs cannot access external web services, external functions can securely call remote endpoints such as web APIs or other external applications. To test your newly deployed function, log into Snowflake and test the function with a simple SELECT statement to determine if driving or cycling is faster.

The UDF in the example works as expected, returning results with the driving distance and duration in meters and seconds respectively, showing that the bicycle route is shorter but takes a bit more time. These results come from pure SQL, directly from Snowflake.

The full source code of this Mapbox Snowflake function is available on github, here:
https://github.com/tfoldi/snowflake-mapbox-functions

Summary

External Functions is a functionality in Snowflake that enables you to write and call custom applications as user-defined functions from within Snowflake. Unlike traditional UDFs, however, external functions are executed on your own cloud infrastructure, while enabling you to securely access your Snowflake data. By using Serverless’s plugin to deploy your external functions to AWS and Snowflake, you can drastically simplify the deployment process and take advantage of the ability to call custom code and third-party APIs outside of Snowflake.