Distributed Medical Image Processing with MONAI on Snowflake
Overview
Medical image registration is a critical task in healthcare AI, enabling the alignment of CT scans taken at different times or breathing phases. This guide demonstrates how to build a production-ready distributed training and inference pipeline using MONAI (Medical Open Network for AI) on Snowflake's Container Runtime with GPU acceleration.
In this Guide, you will build a complete medical image registration system that:
- Downloads and processes lung CT scans from public datasets
- Trains a LocalNet deep learning model using distributed Ray computing
- Registers the trained model in Snowflake's Model Registry
- Runs distributed inference across multiple GPUs
What You Will Build
- Data ingestion pipeline for NIfTI medical images
- Distributed training system using Ray and MONAI
- Model registration workflow with Snowflake Model Registry
- Distributed inference pipeline with parallel processing
- Results table with registration quality metrics
What You Will Learn
- How to use Snowflake Container Runtime Notebooks with GPU compute pools
- How to integrate MONAI medical imaging framework with Snowflake
- How to leverage Ray for distributed deep learning workloads
- How to use Snowflake Model Registry for ML model management
- How to store and retrieve medical images from Snowflake stages
Prerequisites
- Familiarity with Python and deep learning concepts
- Familiarity with medical imaging (helpful but not required)
- A Snowflake account with access to Container Runtime and GPU compute pools
- Go to the Snowflake sign-up page and register for a free account
Architecture Overview
Solution Architecture
The MONAI medical image processing solution consists of three notebooks running on Snowflake Container Runtime with GPU acceleration:

- Data Ingestion (01_ingest_data.ipynb)
  - Downloads paired lung CT scans from a public dataset
  - Uploads NIfTI files to encrypted Snowflake stages
- Model Training (02_model_training.ipynb)
  - Initializes a Ray cluster for distributed computing
  - Trains the LocalNet registration model
  - Saves checkpoints to Snowflake stages
  - Registers the model in the Snowflake Model Registry
- Model Inference (03_model_inference.ipynb)
  - Loads the model from the Model Registry
  - Runs distributed inference across GPUs
  - Saves registered images and metrics to Snowflake
Key Technologies
| Technology | Purpose |
|---|---|
| MONAI | Medical imaging transforms, networks, and losses |
| Ray | Distributed computing across GPU nodes |
| PyTorch | Deep learning framework |
| Snowflake Container Runtime | GPU-accelerated notebook execution |
| Snowflake Model Registry | Model versioning and deployment |
Setup Snowflake Environment
In this step, you'll create all the Snowflake objects needed for the MONAI solution.
Step 1: Create Database Objects
- In Snowsight, click Projects, then Workspaces in the left navigation
- Click + Add new to create a new Workspace
- Click SQL File to create a new SQL file
- Copy the setup script from setup.sql and paste it into your SQL file
Step 2: Run Infrastructure Setup (Sections 1-7)
Run the first part of the setup script to create the following objects (a short verification sketch follows this list):
- Role: MONAI_DATA_SCIENTIST with appropriate privileges
- Warehouse: MONAI_WH (SMALL size)
- Database: MONAI_DB with UTILS and RESULTS schemas
- Stages: NOTEBOOK, MONAI_MEDICAL_IMAGES_STG, RESULTS_STG
- Network Rule + External Access Integration: for pip installs
- GPU Compute Pool: MONAI_GPU_ML_M_POOL (GPU_NV_M instances)
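You can optionally confirm that these objects exist from a Snowpark session, for example inside any of the notebooks you import in the next step. A minimal sketch, assuming an active session and the object names from the setup script (adjust the schema if your setup places the stages elsewhere):

```python
from snowflake.snowpark.context import get_active_session

# Container Runtime notebooks provide an active Snowpark session automatically
session = get_active_session()

# Confirm the objects created by setup.sql are visible to your role
print(session.sql("SHOW COMPUTE POOLS LIKE 'MONAI_GPU_ML_M_POOL'").collect())
print(session.sql("SHOW WAREHOUSES LIKE 'MONAI_WH'").collect())
print(session.sql("SHOW STAGES IN SCHEMA MONAI_DB.UTILS").collect())
```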
Step 3: Import Notebooks
Download each notebook from GitHub:
Then import each notebook into Snowflake:
- In Snowsight, navigate to Projects → Notebooks
- Click the dropdown arrow on + Notebook and select Import .ipynb file
- Upload a notebook file and configure:
  - Name: Keep the default (e.g., 01_ingest_data)
  - Notebook location: MONAI_DB/UTILS
  - Runtime: Select Run on container
  - Runtime version: Select a GPU runtime
  - Compute pool: MONAI_GPU_ML_M_POOL
  - Query warehouse: MONAI_WH
- Click Create

- After the notebook opens, click the ⋮ menu → Notebook settings
- Click the External access tab
- Toggle ON the MONAI_ALLOW_ALL_EAI integration and click Save

- Repeat steps 2-7 for all 3 notebooks
Run Data Ingestion Notebook
Step 1: Open the Notebook
- In Snowsight, navigate to Projects → Notebooks
- Find MONAI_01_INGEST_DATA in the MONAI_DB.UTILS schema
- Click to open the notebook
Step 2: Start Container Runtime
- Click the Start button in the top-right corner
- Wait for the Container Runtime to initialize (this may take 2-3 minutes on first run)
- You should see the notebook kernel become active
Step 3: Install Dependencies and Restart Kernel
- Run the install_monai cell (!pip install monai) to install the MONAI library
- A "Kernel restart may be needed" message appears with a Show me how button; click it
- A dropdown menu opens from the top (next to the Active button)
- Click Restart kernel to load the new packages (a quick import check, sketched after this list, confirms the install)
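This check is not part of the original notebook, but a single cell like the following is a simple way to confirm MONAI is importable and the GPU is visible after the restart:

```python
import torch
import monai

# Verify the freshly installed MONAI package and confirm the GPU is visible
print("MONAI version:", monai.__version__)
print("PyTorch version:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
```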
Step 4: Run Remaining Cells
After the kernel restarts:
- Click Run all at the top of the notebook to execute all cells
This will execute the remaining cells (the upload step is sketched after this list):
- Initialize Session: Connects to Snowflake and sets query tags
- Download Data: Downloads paired lung CT scans from Zenodo (~266MB)
- Upload to Stages: Uploads NIfTI files to Snowflake stages
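The upload step relies on Snowpark's file API. A minimal sketch of the general pattern, assuming an active session; the local data directory is a placeholder and the notebook's exact code may differ:

```python
from pathlib import Path
from snowflake.snowpark.context import get_active_session

session = get_active_session()
stage = "@MONAI_MEDICAL_IMAGES_STG"

# Upload each NIfTI file into a stage folder, keeping the inspiratory/expiratory layout
for folder in ["scansInsp", "scansExp", "lungMasksInsp", "lungMasksExp"]:
    for nifti_file in Path(f"./data/{folder}").glob("*.nii.gz"):   # placeholder local path
        session.file.put(
            str(nifti_file),
            f"{stage}/{folder}",
            auto_compress=False,   # NIfTI .nii.gz files are already compressed
            overwrite=True,
        )
```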

Expected Output
After successful execution, you should see:
- 20 paired lung CT scan cases uploaded
- Files organized in lungMasksExp, lungMasksInsp, scansExp, scansInsp folders
- All files stored with Snowflake Server-Side Encryption
Step 5: Exit and Proceed to Next Notebook
- Click the ← back arrow in the top-left corner
- In the "End session?" dialog, click End session
- Proceed to the next notebook (02_model_training)
Run Model Training Notebook
Step 1: Open and Run the Training Notebook
- Navigate to Projects → Notebooks
- Open your imported 02_model_training notebook
- Click Start to initialize Container Runtime
- Once active, click Run all to execute all cells
Step 2: Understand the Training Pipeline
The notebook executes these key steps (a condensed sketch of the model and losses follows the list):
- Ray Cluster Setup: Initializes distributed computing with 4 worker nodes
- Dependency Installation: Installs MONAI, PyTorch, nibabel on all nodes
- Data Loading: Reads paired CT scan paths from stages
- Model Definition: Creates LocalNet registration network
- Training Loop: Trains with Mutual Information + Bending Energy loss
- Model Registry: Saves best model to Snowflake Model Registry
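For orientation, here is a heavily condensed sketch of the core training pieces: MONAI's LocalNet predicting a displacement field, a mutual-information similarity term plus bending-energy regularization, and Ray attached to the Container Runtime cluster. Hyperparameters and loop structure here are illustrative, not the notebook's exact code:

```python
import ray
import torch
from monai.networks.nets import LocalNet
from monai.networks.blocks import Warp
from monai.losses import GlobalMutualInformationLoss, BendingEnergyLoss

# Attach to the Ray cluster exposed by Container Runtime
ray.init(address="auto", ignore_reinit_error=True)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# LocalNet predicts a dense displacement field from the concatenated fixed/moving pair
model = LocalNet(
    spatial_dims=3,
    in_channels=2,            # fixed + moving image stacked on the channel axis
    out_channels=3,           # 3D displacement vector per voxel
    num_channel_initial=32,
    extract_levels=(0, 1, 2, 3),
).to(device)

warp = Warp().to(device)                       # resamples the moving image with the field
similarity = GlobalMutualInformationLoss()     # image-alignment term
regularization = BendingEnergyLoss()           # deformation-smoothness term
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

def train_step(fixed, moving, reg_weight=1.0):
    """One illustrative optimization step on a fixed/moving CT pair."""
    optimizer.zero_grad()
    ddf = model(torch.cat([fixed, moving], dim=1))   # dense displacement field
    warped = warp(moving, ddf)
    loss = similarity(warped, fixed) + reg_weight * regularization(ddf)
    loss.backward()
    optimizer.step()
    return loss.item()
```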
Step 3: Monitor Training Progress
The training loop displays:
- Epoch number and total loss
- Similarity loss (image alignment quality)
- Regularization loss (deformation smoothness)
- Validation Dice score (segmentation overlap)
The notebook also includes interactive CT scan visualization to inspect the training data:
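If you want to inspect a scan outside the notebook's built-in visualization, a minimal sketch using nibabel and matplotlib looks like the following (the file path is a placeholder, not the notebook's actual layout):

```python
import nibabel as nib
import matplotlib.pyplot as plt

# Load one inspiratory CT scan and show its middle axial slice
img = nib.load("data/scansInsp/case_001_insp.nii.gz")   # placeholder path
volume = img.get_fdata()
mid_slice = volume.shape[2] // 2

plt.imshow(volume[:, :, mid_slice].T, cmap="gray", origin="lower")
plt.title(f"Axial slice {mid_slice}")
plt.axis("off")
plt.show()
```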

Step 4: Verify Model Registration
After training completes, the notebook automatically registers the model in the Snowflake Model Registry. You should see LUNG_CT_REGISTRATION with version v1 in the output.
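Registration is done through the snowflake-ml-python registry API. A hedged sketch of the general pattern, assuming the Snowpark session and trained model from earlier cells; the database/schema, sample input variable, and dependency list are illustrative and may differ from the notebook:

```python
from snowflake.ml.registry import Registry

registry = Registry(session=session, database_name="MONAI_DB", schema_name="UTILS")

# Log the trained PyTorch model; a sample input lets the registry infer the signature
model_version = registry.log_model(
    model,                                    # the trained LocalNet (torch.nn.Module)
    model_name="LUNG_CT_REGISTRATION",
    version_name="v1",
    sample_input_data=[sample_fixed_moving],  # hypothetical representative input tensor
    pip_requirements=["monai", "nibabel"],
)
print(model_version.model_name, model_version.version_name)
```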

Step 5: Exit and Proceed to Next Notebook
- Click the ← back arrow in the top-left corner
- In the "End session?" dialog, click End session
- Proceed to the next notebook (03_model_inference)
Run Model Inference Notebook
Step 1: Open and Run the Inference Notebook
- Navigate to Projects → Notebooks
- Open your imported 03_model_inference notebook
- Click Start to initialize Container Runtime
- Once active, click Run all to execute all cells
Step 2: Review Inference Results
The notebook performs these steps (a condensed sketch follows the list):
- Loads Model: Retrieves trained model from Model Registry
- Configures Ray Workers: Sets up parallel inference actors
- Processes Images: Runs registration on all test cases
- Saves Results: Writes registered images to stages and metrics to table
The notebook displays results automatically and saves them to MONAI_DB.RESULTS.MONAI_PAIRED_LUNG_RESULTS.
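A condensed sketch of the retrieve-and-inspect pattern is below. The Ray actor setup is omitted; the model, table, and database names follow the guide, while the registry schema is an assumption:

```python
from snowflake.ml.registry import Registry
from snowflake.snowpark.context import get_active_session

session = get_active_session()
registry = Registry(session=session, database_name="MONAI_DB", schema_name="UTILS")

# Retrieve the registered model version and recover the underlying PyTorch object
mv = registry.get_model("LUNG_CT_REGISTRATION").version("v1")
local_model = mv.load()   # ready to distribute to Ray workers for registration

# After inference has run, inspect the per-case quality metrics
session.table("MONAI_DB.RESULTS.MONAI_PAIRED_LUNG_RESULTS").show()
```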

Cleanup
To remove all resources created by this guide:
- Stop any running notebooks in Snowsight
- Delete the notebooks manually in Snowsight (Projects → Notebooks → select → Delete)
- Open your setup.sql Workspace (the same one you used during setup)
- Scroll to the TEARDOWN SCRIPT section at the bottom
- Uncomment the DROP statements
- Run the teardown commands
The teardown script will remove all compute pools, integrations, database objects, warehouses, and roles created by this guide.
Conclusion and Resources
Congratulations! You have successfully built a distributed medical image registration pipeline using MONAI on Snowflake.
What You Learned
- How to configure Snowflake Container Runtime with GPU compute pools
- How to use MONAI for medical image processing tasks
- How to leverage Ray for distributed training and inference
- How to manage ML models with Snowflake Model Registry
- How to store and process medical images in Snowflake stages
Related Resources
Blog:
Snowflake Documentation:
MONAI Resources:
Ray Documentation:
This content is provided as is, and is not maintained on an ongoing basis. It may be out of date with current Snowflake instances