Snowflake AI Research Blogs and Publications

We believe in a thriving research community and are committed to sharing our insights as we advance AI research on the tools, systems, and algorithmic optimizations that make LLM training and inference performant yet cost-effective for everyone.

Results

LLM Deployment

SwiftKV: Fast Prefill-Optimized Inference with Knowledge-Preserving Model Transformation

LLM Deployment

Efficiently Serving LLM Reasoning Programs with Certaindex

Agentic System

ReFoRCE: A Text-to-SQL Agent with Self-Refinement, Format Restriction, and Column Exploration

Pretraining – System & Efficiency

TurboMoE: Enhancing MoE Model Training with Smart Kernel-Fusion and Data Transformation

LLM Deployment

SuffixDecoding: A Model-Free Approach to Speeding Up Large Language Model Inference

LLM Evaluation

Unchecked and Overlooked: Addressing the Checkbox Blind Spot in Large Language Models with CheckboxQA

LLM Deployment

Query and Conquer: Execution-Guided SQL Generation

LLM Training

ExCoT: Optimizing Reasoning for Text-to-SQL with Execution Feedback

Pretraining – System & Efficiency

SSDTrain: An Activation Offloading Framework to SSDs for Faster Large Language Model Training

LLM Evaluation

In Case You Missed It: ARC ‘Challenge’ Is Not That Challenging

Retrieval

CORD: Balancing COnsistency and Rank Distillation for Robust Retrieval-Augmented Generation

Retrieval

Embedding

Arctic-Embed 2.0: Multilingual Retrieval Without Compromise
