Snowflake AI Research

We are a team with extensive experience building systems and technology that have significantly reduced the cost of LLM training and inference. Much of our work has been open sourced to give the AI community more accessible and cost-effective LLMs. The team includes many specialists in natural language processing and search. Together with thousands of Snowflake engineers worldwide, our cutting-edge technology powers enterprise AI products such as Cortex AI. Check out what we're working on: https://www.snowflake.com/en/product/ai/ai-research/

MORE POSTS FROM Snowflake AI Research

Gen AI

SuffixDecoding at Production Scale with Arctic Inference and vLLM

SuffixDecoding now delivers 1.96x–3.12x end-to-end speedups in vLLM and Arctic Inference with major CPU optimizations for fast, production-ready LLM serving.
DEC 02, 2025|9 min read
Gen AI

Accelerating PyTorch Innovation at Scale: Snowflake at PyTorch Conference 2025

How Snowflake tackles four core AI challenges: scaling deep learning, training thousands of models, accelerating inference, and balancing multilingual performance.
NOV 19, 2025|7 min read
Gen AI

What’s Your Agent’s GPA? A Framework for Evaluating AI Agent Reliability

Learn how the Snowflake AI Research team's Agent GPA framework achieved 95% error detection and 86% localization accuracy to accurately measure and debug agent performance.
NOV 04, 2025|8 min read
Gen AI

Optimizing Query Execution in Cortex AISQL

How we made AISQL up to 8x faster and 70x cheaper through AI-aware planning, adaptive model cascading, and semantic join rewriting.
NOV 04, 2025|8 min read
Gen AI

Smarter, Faster and Snowflake-Native: Real-Time Text2SQL Behind Snowflake Intelligence

Discover Arctic-Text2SQL-R1.5, Snowflake's new, native Text-to-SQL model built to overcome LLM latency and deliver higher accuracy for real-time conversational analytics.
NOV 04, 2025|7 min read
Gen AI

Fast Reasoning on GPT-OSS with Speculative Decoding and Arctic Inference

Snowflake AI Research boosted GPT-OSS reasoning speed by 1.7–1.8x using Arctic Inference with speculative decoding, enabling faster agentic AI.
AUG 25, 2025|4 min read
Gen AI

Bridging the Gap Between LLMs and Real-World Challenges: Snowflake at ACL 2025

Snowflake AI Research shares 4 breakthrough papers accepted at ACL 2025, advancing model efficiency, document AI, text-to-SQL reliability, and LLM evaluation.
JUL 30, 2025|7 min read
Gen AI

Arctic Long Sequence Training (ALST): Scalable And Efficient Training For Multi-Million Token Sequences

Snowflake's ALST enables scalable training of long-context models with up to 15 million tokens using Hugging Face and DeepSpeed, all without custom modeling code.
JUN 24, 2025|10 min read
Gen AI

Inside Snowflake Intelligence: Five Pillars of Enterprise-Grade Agentic AI

Explore the underlying architecture, orchestration, and system-level optimizations behind Snowflake Intelligence, a production-grade agentic AI system built for enterprise reasoning.
JUN 03, 2025|13 min read