SAS to Snowflake Migration

Snowflake for Developers/Guides/SAS to Snowflake Migration

SAS to Snowflake Migration

Migrations

Ivan Baranov

1. Introduction

This T1A Alchemist Quickstart Guide will demonstrate how to leverage Alchemist to accelerate the migration of your legacy SAS code to Snowflake. We'll be covering the 3 most frequently occurring types of SAS assets:

SAS Enterprise Guide projects
SAS Data Integration Studio jobs
SAS BASE code files

Prerequisites

Snowflake account (a free 30-day trial works perfectly)
T1A Alchemist license (steps to obtain are outlined below)
Basic familiarity with SQL and SAS concepts
(Optional) Access to SAS tools:
- SAS Enterprise Guide to preview the sample EG project and SAS BASE code
- SAS Data Integration Studio to explore the sample job
- SAS Studio to explore SAS BASE code

What You'll Learn

By the end of this guide, you will know how to:

Assess the scope and complexity of your SAS estate
Convert SAS Enterprise Guide projects and SAS BASE code files to Snowflake-ready code
Convert SAS Data Integration Studio jobs using a metadata-driven approach

2. Migration Overview

Why migrate?

SAS has been at the core of enterprise data environments for decades. While it started as a statistical analysis platform, many organizations have come to rely on it far beyond analytics. It's common to find SAS powering full-scale data warehouses, complex ETL pipelines, and regulatory reporting workflows — often through tools like SAS Data Integration Studio (SAS DI) and SAS Enterprise Guide (SAS EG). Over the years these environments have grown into mission-critical systems with hundreds or even thousands of jobs running daily.

However, maintaining a legacy SAS estate comes at a growing cost. Licensing is expensive, the talent pool is shrinking, and the platform was not designed for the modern cloud-native world. Snowflake offers a compelling alternative: near-unlimited scalability, separation of storage and compute, and all of that at pay-per-use pricing. Migrating to Snowflake lets your team consolidate data engineering, analytics, and data science workloads onto a single modern platform, while significantly reducing infrastructure overhead and operational complexity.

Migration challenges: what to expect

Migrating from SAS to Snowflake might sound straightforward on the surface, given that SAS is just code and data flows. In reality, most organizations significantly underestimate the scope. Years of organic growth leave behind a tangled mess where the actual code is just the tip of the iceberg.

Beneath the surface lie hidden dependencies between jobs, undocumented macros, duplicate assets scattered across teams, and compliance constraints. Without full visibility into these layers, migration projects quickly spiral into missed deadlines and budget overruns. The bulk of the complexity of a migration project comes from uncovering the right scope for migration and coming up with a robust migration plan that takes all the different aspects into account.

The biggest risk in SAS migration isn't the code you can see — it's everything you can't. The right tools turn hidden complexity into a clear plan.

Most of this hidden complexity can be uncovered and reduced with the right tooling. Rather than attempting to migrate everything at once, a structured approach lets you filter out what's no longer needed, understand what remains, and convert only what matters.

This is exactly the workflow that our toolkit is built around. SAS Logs Analyzer and Metascope help you see the full picture. Alchemist Code Analyzer tells you what you're actually dealing with. And Alchemist Converter turns it into production-ready Snowflake code.

Why not just use an AI?

With the rise of AI, it may be tempting to paste your SAS code into an LLM and ask for a translation. While LLMs are great for explaining code or quick prototyping, they often introduce subtle logic differences — a changed join order, unexpected null handling, slightly different rounding. In regulated industries, even minor deviations can mean a compliance violation. Additionally, with not a lot of SAS code available publicly, LLMs were not trained to understand it on a deep enough level - which leads to conversion mistakes.

LLMs can rewrite your code, but they can't guarantee consistency and compliance. Production migration demands repeatability and precision that pure AI alone cannot deliver.

This is why Alchemist uses a combinaion of deterministic conversion logic and AI in a broader, structured conversion pipeline — not AI as the sole engine.

3. Prepare Your Environment

Sign Up for Snowflake Trial

If you haven't already, register for a Snowflake free 30-day trial. The rest of the sections in this lab assume you are using a new Snowflake account created by registering for a trial.

After your first login, go to Projects, click + Add new, and choose SQL file.

Paste and run the following to create the database all converted notebooks will use:

CREATE DATABASE IF NOT EXISTS LDEMO;
USE DATABASE LDEMO;
USE SCHEMA PUBLIC;

All three SAS libraries referenced in this lab (LDEMO, LDEMODI, LMART) map to LDEMO.PUBLIC — a single Snowflake database and schema is all you need.

Sign Up for Alchemist Trial

Register at app.getalchemist.io to get access to the portal. The trial signup is free and gives you the credentials needed to download the installer.

Install Alchemist

This guide requires Alchemist version 2025.2 or later. Sign in to the portal, open the Downloads page, and pick the installer for your operating system (macOS Apple Silicon or Windows).

Alchemist Portal Downloads page showing the 2025.2 macOS and Windows installers

Run the installer.

On macOS, add Alchemist to your PATH after installing:

export PATH="/Applications/Alchemist:$PATH"

To make this permanent, add the line above to your ~/.zshrc or ~/.bash_profile.

On Windows, the installer should register the Alchemist install directory on your Path automatically — open a fresh Command Prompt or PowerShell window after installation to pick up the change.

Verify the installation:

alchemist --version

If alchemist --version returns "command not found" in a fresh shell, add the install directory to your user Path manually:

$env:Path = "C:\Program Files (x86)\Alchemist\Alchemist;$env:Path"

The line above is for the current session only. To make it permanent, add C:\Program Files (x86)\Alchemist\Alchemist to your user Path via System Properties → Environment Variables, then restart the shell.

The license file you'll need is bundled with the lab assets you'll clone in the next sections — license activation is covered at the end of step 5.

4. Explore SAS Artifacts

This tutorial uses three sample SAS artifacts that represent a typical insurance analytics estate.

insurance.egp (EG Project) — end-to-end insurance workflow: risk tiering, premium ranking, claim enrichment, and adjuster reporting. Covers Sort, Query Builder, Rank, Transpose, Split, Append, and custom PROC SQL/DATA step tasks.
quarterly_portfolio_review.sas (BASE Code) — parameterized quarterly review producing a multi-section ODS PDF: headline KPIs, policy mix, risk-band cross-tabs, and adjuster workload. Exercises PROC FORMAT, PROC SQL, PROC MEANS, PROC FREQ, PROC TABULATE, PROC SGPLOT, and ODS styling.
di.spk (DI Job) — metadata-driven ETL that joins three operational tables into MART.CLAIMS_ENRICHED. Flow: Data Validation → SQL Join × 2 → Sort → Table Loader (Replace).

5. Set Up the Lab

Pull just this guide's lab folder using a shallow, sparse clone. The same commands work on macOS, Linux, Git Bash on Windows, and Windows PowerShell — git is identical across platforms.

git clone --depth 1 --filter=blob:none --sparse --branch sas-to-snowflake-migration https://github.com/ibaranov91/sfquickstarts.git
cd sfquickstarts
git sparse-checkout set site/sfguides/src/sas-to-snowflake-migration/assets
cd site/sfguides/src/sas-to-snowflake-migration/assets

Note: the git clone line is long but intentionally on a single line — it works without modification in macOS Terminal, Linux, Git Bash, and Windows PowerShell. Bash backslash continuations (\) would break in PowerShell.

You are now in the lab directory. It contains everything you need:

.
├── alc_config.yml
├── source_lock.json
├── .gitattributes
├── sas_artifacts/
│   ├── enterprise_guide/insurance.egp
│   ├── data_integration/di.spk
│   └── base_code/quarterly_portfolio_review.sas
├── data/
│   ├── insurance_customers.csv
│   ├── insurance_claims.csv
│   └── insurance_adjusters.csv
├── snowflake/00_setup.sql
└── output/
    ├── base_code/
    ├── enterprise_guide/
    └── data_integration/

A few pieces of scaffolding worth understanding:

alc_config.yml — picked up automatically by alchemist when run from this directory. It tells the converter to emit Jupyter notebooks (output_format: ipynb) instead of plain .py files, and maps the SAS librefs used in the samples (LDEMO, LDEMODI, LMART) all to LDEMO.PUBLIC in Snowflake.
source_lock.json — the demo license file. You'll activate it at the end of this step.
.gitattributes — pins SAS source line endings to LF so the source-locked license validates identically on macOS, Linux, and Windows. Don't remove this file.
output/ — three empty subfolders the convert commands write into. alchemist convert -o does not create the output directory itself, so they ship pre-created in the repo.

Note: Alchemist performs static code analysis and does not require or process the CSV data — it only needs the SAS source files. The CSVs are loaded into Snowflake separately in step 7.1.

Activate the demo license

The lab uses a source-locked demo license: the source_lock.json file in the lab folder is paired with a license key listed on the Alchemist Portal Get Started page. Scroll to the bottom of that page, find the SAS to Snowflake Migration — Self-Service Quickstart Demo section, and copy the Demo-locked license required key.

Alchemist Portal Get Started page showing the demo license key for this quickstart

From the lab directory you cd'd into above, run:

alchemist license -l source_lock.json

When prompted, paste the license key you copied from the portal.

Note: If you previously activated a different Alchemist license on this machine, reset it first so the demo license takes effect:
alchemist license reset
alchemist license -l source_lock.json

6. Analyze SAS Artifacts

Before converting, run Alchemist Analyzer to get a breakdown of your SAS estate: file count, token volume, workload type mix, and the constructs that will need attention during conversion.

In this video, you can see the capabilities of the Analyzer

A single analyzer run covers all three sample asset types at once:

alchemist analyze src ./sas_artifacts

The dashboard opens automatically at http://127.0.0.1:10888. The Main tab shows a project overview across every source file — token count, executable-type breakdown (SAS Programs, EG Flows, DI Jobs), and workload-type mix:

Alchemist Analyzer dashboard showing all three SAS asset types in one run

The EG, DI, and Code tabs alongside Main provide more details about each specific asset type.

Note: Alchemist Analyzer can include at most one .spk file per run, but that single .spk can be analyzed alongside any number of .egp projects and .sas files. If your estate has multiple DI jobs, run the analyzer once per .spk and combine the others freely with each run.

7. SAS BASE Code

7.1 Import Source Data

The converted code reads from three tables — INSURANCE_CUSTOMERS, INSURANCE_CLAIMS, and INSURANCE_ADJUSTERS. Load them into Snowflake using the Add Data UI before running the converted code.

The CSV files are in the data/ folder of the cloned repository.

Repeat the following steps for each of the three CSV files:

In Snowsight under Ingestion select Add Data
Choose Load data into a Table.
Upload the CSV file from your local data/ folder.
Set the database to LDEMO and the schema to PUBLIC. The converted code references tables as LDEMO.INSURANCE_* so these names must match exactly.
Use the filename as the table name: insurance_customers, insurance_claims, insurance_adjusters.

Load Data into Table dialog showing LDEMO.PUBLIC schema and insurance_adjusters table name

On the next tab click "View options" and select "First line contains header" in the Header dropdown
Let Snowflake infer the schema from the file, then confirm and Load

Configure data load options to parse headers

After loading all 3 files your LDEMO.PUBLIC schema should look like that

7.2 Convert

From your lab directory, run:

alchemist convert \
  -p insurance_review \
  -t spark \
  -o ./output/base_code \
  ./sas_artifacts/base_code

Open the converted notebook. Alchemist preserves the source-tree shape under your output directory, so the full path is:

output/base_code/insurance_review_converted_files/Programs/sas_artifacts/base_code/quarterly_portfolio_review.ipynb

Subsequent steps abbreviate this as output/base_code/.../quarterly_portfolio_review.ipynb.

Any construct Alchemist couldn't convert automatically is marked # TODO:. Reporting-only statements (ODS, TITLE, FOOTNOTE) can be deleted — they have no notebook equivalent.

7.3 Deploy

In Snowsight go to Projects → Workspaces.

Click + Add new → Notebook → Upload .ipynb file. Select output/base_code/.../quarterly_portfolio_review.ipynb, name it, and click Create.
Click Connect → + Create new service, set Idle timeout to 30 minutes, and click Create and connect.

Create new compute service dialog with Idle timeout set to 30 minutes

Connect notebook to the newly created compute service

Add a SQL cell at the very top of the notebook:

USE DATABASE LDEMO;
USE SCHEMA PUBLIC;

In cell "1. Imports", swap SparkSession for snowpark_connect:

# Before
from pyspark.sql import SparkSession
from pyspark.sql.functions import desc, expr, when

# After
from snowflake import snowpark_connect
from pyspark.sql.functions import desc, expr, when

In cell "2. Setup Spark session", replace the builder:

# Before
spark = SparkSession.builder.config("spark.sql.ansi.enabled", True).getOrCreate()

# After
spark = snowpark_connect.server.init_spark_session()

Click Run all.

8. EG Project

8.1 Convert

The sample project contains 15 task nodes across two parallel branches — a customer risk path and a claims enrichment path — that merge into a final reporting stage:

SAS Enterprise Guide Process Flow showing 15 task nodes

Run the converter:

alchemist convert \
  -p insurance_eg \
  -t spark \
  -o ./output/enterprise_guide \
  ./sas_artifacts/enterprise_guide

Open output/enterprise_guide/.../insurance_Process_Flow.ipynb. Delete the four cosmetic TODOs:

# TODO: Unsupported node type <SASLibnameDef>     ← delete
# TODO: Unsupported node type <SASRunStatement>   ← delete
# TODO: Unsupported statement: TITLE              ← delete
# TODO: Unsupported statement: FOOTNOTE           ← delete

The Adjuster report task ends with # TODO: Unsupported procedure PRINT. Replace it with:

df_adjuster_performance.show(20)

8.2 Deploy

The three source tables are already loaded in LDEMO.PUBLIC from step 7.1 — no additional data loading needed.

In Snowsight go to Projects → Workspaces → Add new → Notebook → Upload .ipynb file. Select output/enterprise_guide/.../insurance_Process_Flow.ipynb, name it insurance_eg_flow, and click Create.
Click Connect and select the compute service you already created.
Add a SQL cell at the very top:

USE DATABASE LDEMO;
USE SCHEMA PUBLIC;

In cell "1. Imports", swap SparkSession for snowpark_connect:

# Before
from pyspark.sql import SparkSession, Window
from pyspark.sql.functions import asc, col, desc, expr, row_number, when

# After
from snowflake import snowpark_connect
from pyspark.sql.functions import asc, col, desc, expr, row_number, when
from pyspark.sql import Window

In cell "2. Setup Spark session", replace the builder:

# Before
spark = SparkSession.builder.config("spark.sql.ansi.enabled", True).getOrCreate()

# After
spark = snowpark_connect.server.init_spark_session()

Click Run all.

9. DI Job

9.1 Convert

The sample DI job (LOAD_CUSTOMER_RISK_MART) reads three operational tables and produces the CLAIMS_ENRICHED mart through six numbered transforms — Extract, Data Validation, two Joins, Sort, and a Table Loader:

SAS Data Integration Studio job LOAD_CUSTOMER_RISK_MART showing the six-transform flow from claims/customers/adjusters into CLAIMS_ENRICHED

Run the converter:

alchemist convert \
  -p insurance_di \
  -t spark \
  -o ./output/data_integration \
  ./sas_artifacts/data_integration

Open output/data_integration/.../LOAD_CUSTOMER_RISK_MART.ipynb. Find the step2_Data_Validation stub and replace it — the notebook will fail without this:

# Before (stub — will crash at runtime)
def step2_Data_Validation(spark, df_input):
    # TODO: Unsupported Data/Data Validation transform <Data Validation> (node.sas_meta_id='A5JCJVA0.BW005NFI'). Convert manually or via a template.
    return (df_output1, df_output2)

# After
def step2_Data_Validation(spark, df_input):
    from pyspark.sql.functions import col
    df_output1 = df_input.filter(
        (col("CLAIM_AMOUNT") > 0) & col("ADJUSTER_ID").isNotNull()
    )
    df_output2 = df_input.subtract(df_output1)
    return (df_output1, df_output2)

9.2 Deploy

In Snowsight go to Projects → Workspaces → + Add new → Notebook → Upload .ipynb file. Select output/data_integration/.../LOAD_CUSTOMER_RISK_MART.ipynb, name it load_customer_risk_mart, and click Create.
Click Connect and select the compute service you already created.
Add a SQL cell at the very top:

USE DATABASE LDEMO;
USE SCHEMA PUBLIC;

In cell "1. Imports", swap SparkSession for snowpark_connect:

# Before
from pyspark.sql import SparkSession
from pyspark.sql.functions import asc, col, desc

# After
from snowflake import snowpark_connect
from pyspark.sql.functions import asc, col, desc

In cell "2. Setup Spark session", replace the builder:

# Before
spark = SparkSession.builder.config("spark.sql.ansi.enabled", True).getOrCreate()

# After
spark = snowpark_connect.server.init_spark_session()

Click Run all. Once complete, open Database Explorer → LDEMO → PUBLIC → CLAIMS_ENRICHED to verify the output.

CLAIMS_ENRICHED table in Snowflake Database Explorer showing 2.5K enriched claim rows

10. Conclusion

Congratulations — you have completed the full SAS-to-Snowflake migration workflow using T1A Alchemist.

What you learned

Assess: how to use Alchemist Analyzer to build a complete, dependency-aware picture of a SAS estate before writing a single line of SQL.
Convert: how to migrate all three major SAS asset types (Enterprise Guide projects, Data Integration Studio jobs, and BASE code files) to Snowflake-ready SQL and Snowpark notebooks.

Validation of the converted code outside the scope of this guide, but T1A provides a dedicated validation framework that automatically compares the output of original SAS code against the converted Snowflake code — row by row, column by column — to confirm logical equivalence. This is especially important for regulated workloads where manual spot-checking is not sufficient.

Why Alchemist

Manual SAS migration is slow, error-prone, and expensive. Alchemist shortens the cycle significantly by automating the parts that can be automated — syntax translation, dependency mapping, and notebook scaffolding — while keeping a human in the loop for the logic that requires review. Key advantages:

Repeatability: the same pipeline produces consistent output across hundreds or thousands of assets.
Auditability: every conversion decision is traceable, so compliance and QA teams can verify the migration rather than trust it.
AI with guardrails: LLMs are used as one layer inside a structured pipeline, not as a free-form translator. This prevents the logic errors that plague purely generative approaches on regulated workloads.

Next steps

Explore the T1A Alchemist documentation for advanced converter templates and batch-conversion workflows.
Run Alchemist Analyzer against your own SAS estate to get a complexity and effort estimate before committing to a full migration project.
Reach out to T1A if you need help scoping or accelerating a production migration.

Updated 2026-05-28

This content is provided as is, and is not maintained on an ongoing basis. It may be out of date with current Snowflake instances